
Yinyang (Yin-yang)

Yinyang (yin-yang) is one of the dominant concepts shared by different schools throughout the history of Chinese philosophy. Just as with many other Chinese philosophical notions, the influences of yinyang are easy to observe, but its conceptual meanings are hard to define. Despite the differences in the interpretation, application, and appropriation of yinyang, three basic themes underlie nearly all deployments of the concept in Chinese philosophy: (1) yinyang as the coherent fabric of nature and mind, exhibited in all existence, (2) yinyang as jiao (interaction) between the waxing and waning of the cosmic and human realms, and (3) yinyang as a process of harmonization ensuring a constant, dynamic balance of all things. As the Zhuangzi (Chuang-tzu) claims, “Yin in its highest form is freezing while yang in its highest form is boiling. The chilliness comes from heaven while the warmness comes from the earth. The interaction of these two establishes he (harmony), so it gives birth to things. Perhaps this is the law of everything yet there is no form being seen” (Zhuangzi, Chapter 21). In none of these conceptions of yinyang is there a valuational hierarchy, as if yin could be abstracted from yang (or vice versa), regarded as superior or considered metaphysically separated and distinct. Instead, yinyang is emblematic of valuational equality rooted in the unified, dynamic, and harmonized structure of the cosmos. As such, it has served as a heuristic mechanism for formulating a coherent view of the world throughout Chinese intellectual and religious history.

  1. Origins of the Terms Yin and Yang
  2. The Yinyang School
  3. Yinyang as Qi (Vital Energy)
  4. Yinyang as Xingzhi (Concrete Substance)
  5. The Yinyang Symbol
  6. References and Further Reading

1. Origins of the Terms Yin and Yang

The earliest Chinese characters for yin and yang are found in inscriptions made on “oracle bones” (skeletal remains of various animals used in ancient Chinese divination practices at least as early as the 14th century B.C.E.). In these inscriptions, yin and yang simply are descriptions of natural phenomena such as weather conditions, especially the movement of the sun. There is sunlight during the day (yang) and a lack of sunlight at night (yin). According to the earliest comprehensive dictionary of Chinese characters (ca. 100 CE), Xu Shen’s Shuowen jiezi (Explaining Single-component Graphs and Analyzing Compound Characters), yin refers to “a closed door, darkness and the south bank of a river and the north side of a mountain.” Yang refers to “height, brightness and the south side of a mountain.” These meanings of yin and yang originated in the daily life experience of the early Chinese. Peasants depended on sunlight for lighting and their daily life routines. When the sun came out, they would go to the field to work; when the sun went down, they would return home to rest. This sun-based daily pattern evidently led to a conceptual claim: yang is movement (dong) and yin is rest (jing). In their earliest usages, yin and yang existed independently and were not connected. The first written record of using these two characters together appears in a verse from the Shijing (Book of Songs): “Viewing the scenery at a hill, looking for yinyang.” This indicates that yang is the sunny side and yin is the shady side of a hill. Both effects of the sun are present on the hill at the same time.

2. The Yinyang School

According to Sima Tan (Ssu-ma Tan, c. 110 B.C.E.), there existed a school of teaching during the “Spring and Autumn” (770-481 B.C.E.) and “Warring States” (403-221 B.C.E.) periods that bore the name of yinyang. He lists this yinyang school alongside five others (Confucian, Mohist, Legalist, Fatalist, and Daoist) and defines its theory as “the investigation of the shu [art] of yin and yang.” According to him, this school focused on omens of luck and explored the patterns of the four seasons. In other words, the yinyang school was concerned with methods of divination or astronomy (disciplines that were not distinct from one another in early China, as elsewhere in the ancient world) and the calendrical arts (which entailed study of the four seasons, eight locations, twelve du [measures] and twenty-four shijie [time periods]). Just as the Confucians (rujia) arose from the ranks of rushi (“scholar-gentlemen”) who excelled at ritual and music, those of the yinyang school came from the fangshi (“recipe-gentlemen”) who specialized in various numerological disciplines known as shushu (“number-arts”). These shushu included tianwen (astronomy), lipu (calendar-keeping), wuxing (“five phases” correlative theory), zhuguai (tortoise-shell divination), zazha (fortune-telling) and xingfa (face-reading). The Han dynasty chronicle Shiji (Records of the Historian) lists Zou Yan (305-240 B.C.E.) as a representative of the yinyang school who possessed a profound knowledge of the theory of yinyang and wrote about a hundred thousand words on it. However, none of his works have survived.

By the Han dynasty (202 B.C.E.-220 C.E.), yinyang was associated with wuxing (“five phases”) correlative cosmology. According to the “Great Plan” chapter of the Shujing (Classic of Documents), wuxing refers to material substances that have certain functional attributes: water is said to soak and descend; fire is said to blaze and ascend; wood is said to curve or be straight; metal is said to obey and change; earth is said to take seeds and give crops. Wuxing is used as a set of numerological classifiers and explains the configuration of change on various scales. The so-called yinyang wuxing teaching – an “early Chinese attempt in the direction of working out metaphysics and a cosmology” (Chan 1963: 245) – was a fusion of these two conceptual schemes applied to astronomy and the mantic arts.

3. Yinyang as Qi (Vital Energy)

The most enduring interpretation of yinyang in Chinese thought is related to the concept of qi (ch’i, vital energy). According to this interpretation, yin and yang are seen as qi (in both yin and yang forms) operating in the universe. In the “Duke Shao” chapter of the Zuozhuan (The Book of History), yin and yang are first defined as two of six heavenly qi:

There are six heavenly influences [qi] which descend and produce the five tastes, go forth in the five colours, and are verified in the five notes; but when they are in excess, they produce the six diseases. Those six influences are denominated the yin, the yang, wind, rain, obscurity, and brightness. In their separation, they form the four seasons; in their order, they form the five (elementary) terms. When any of them is in excess, they ensure calamity. An excess of the yin leads to diseases of cold; of the yang, to diseases of heat. (Legge 1994: 580).

Here, yin and yang are the qi of the universe. These qi flow within the natural as well as the human worlds. They are the basic fabric of existence:

Heaven and earth have their regular ways, and men like these for their pattern, imitating the brilliant bodies of Heaven, and according with the natural diversities of the Earth. (Heaven and Earth) produce the six atmospheric conditions [qi], and make use of the five material elements. Those conditions (and elements) become the five tastes, are manifested in the five colours, and displayed in the five notes. When they are in excess, there ensue obscurity and confusion, and people lose their (proper) nature… There were mildness and gentleness, kindness and harmony, in imitation of the producing and nourishing action of Heaven. There are love and hatred, pleasure and anger, grief and joy, produced by the six atmospheric conditions [qi]. Therefore (the sage kings) carefully imitated these relations and analogies (in forming ceremonies), to regulate those six impulses… When there is no failure in the joy and grief, we have a state in harmony with the nature of Heaven and Earth, which consequently can endure long. (Legge 1994: 708).

Thus qi, a force arising from the interplay between yin and yang, becomes a context in which yinyang is seated and functions. Yinyang as qi provides an explanation of the beginning of the universe and serves as a building block of the Chinese intellectual tradition. In many earlier texts, one may observe how yinyang generates a philosophical perspective on heaven, earth and human beings. Chapter 42 of the Laozi says that “everything is embedded in yin and embraces yang; through chong qi [vital energy] it reaches he [harmony].” It is through yinyang’s function as qi and the interaction between them that everything comes into existence. Zhuangzi also speaks about the “qi of yin and yang”: “When the qi of yin and yang are not in harmony, and cold and heat come in untimely ways, all things will be harmed.” (Zhuangzi ch. 31) On the other hand, “when the two have successful intercourse and achieve harmony, all things will be produced.” (Zhuangzi ch. 21)

The interpretation of yinyang as qi conceives yinyang as a dynamic and natural form of flowing energy, a complementarity within the primordial potency of the universe. The Huainanzi offers a more detailed explanation of the cosmological process of yin and yang:

When heaven and earth were formed, they divided into yin and yang. Yang is generated [sheng] from yin and yin is generated from yang. Yin and yang mutually alternate which makes four fields [wei, “celestial circles”] penetrate. Sometimes there is life, sometimes there is death, that brings the myriad things to completion. (ch. 2)

This process also explains the beginning of human life. When qi moved, the clear and light rose to be heaven and the muddy and heavy fell to become earth. When these two qi interacted and attained the stage of harmony (he), human life began. This shows that everything is made from the same materials and difference relies on the interaction.

Qi also takes on various forms and is convertible from one form to another with order and pattern. The concept of yinyang supplies a unitary vision of heaven, earth and human beings and makes the world intelligible in terms of a resonance between human beings and the universe. The Guoyu (Discourses of the States) describes how earthquakes took place at the confluence of the Jing, Wei, and Luo rivers during the second year of Duke You of the western Zhou dynasty. A certain Boyang Fu claims that the Zhou empire is doomed to collapse, explaining that

The qi of heaven and earth can’t lose its order. If its order vanishes people will be disoriented. Yang was stuck and could not get out, yin was suppressed and could not evaporate, so an earthquake was inevitable. Now the earthquakes around the three rivers are due to yang losing its place and yin being pressed down. Yang is forsaken under yin so the source of rivers has been blocked. If the foundation of rivers is blocked the country will definitely collapse. This is because of the fact that the flowing water and flourishing land are necessities for the people’s lives. If the water and land cannot sustain the people’s living conditions, the country will inevitably fall. (Discourse of the States 1994: 22).

Not only does this yinyang-flavored explanation claim to illuminate natural phenomena, it also implies that there is an intrinsic relationship between natural events and political systems. Human beings, especially political leaders, must align their virtuous actions with the morally-oriented universe. If they follow and harmonize with (shun) the order and patterns of the universe, they will be rewarded with prosperity and flourishing, but if they go against and conflict with (ni) it, they will be punished with disasters and destruction. Whether one engages in shun or ni depends upon whether yin and yang are in a state of balance. Thus, yinyang provides a heuristic outlook for human understanding as well as ethical guidance for achieving harmony in action. As chapter 8 of the Huainanzi claims:

Yinyang embodies the harmony of heaven and earth, manifests the forms of myriad things, contains qi to transform the things and completes various kinds of things; yinyang extends and penetrates to the deepest level; begins in emptiness then becomes full and moves in boundless lands.

4. Yinyang as Xingzhi (Concrete Substance)

Yinyang also has been understood as some concrete substance (xingzhi), according to which yinxing and yangxing define everything in the universe. In the Yijing (I-Ching, The Book of Changes), yinyang is presented as xingzhi. Yang was identified with the sun and yin with the moon:

Heaven and earth correlate with vast and profound; four seasons correlate with change and continuity [biantong]; the significance of yin and yang correlate with sun and moon; the highest excellence [zhide] correlates the goodness of easy and simple. (Sishu wujing 1990: 197)

The Guanzi, an important work of the Huang-Lao school, discusses this view along the same lines: “The sun is in charge of yang, the moon is in charge of yin, the stars are in charge of harmony [he].” (Guanzi 2000: 151). This xingzhi interpretation materializes the concept of yinyang in some concrete contexts and shows that the universe is orderly, moral and gendered. The pattern of the world is written in a gendered language. Yinyang is something one can see, feel, and grasp through the senses. For example, in the Liji (Book of Ritual), music represents the he (harmony) of heaven and earth, while li (ritual) represents the order of heaven and earth: “Music is coming from yang, ritual is coming from yin. The harmony of yinyang receives the myriad things.” (Sishu wujing 1990: 525) In the human world, male as yang should be cultivated, otherwise the day will suffer; female as yin should be cultivated too, otherwise the moon will be affected.

According to Dong Zhongshu (195-115 B.C.E.), both Tian (heaven) and human beings have yinyang. Therefore, there is an intrinsic connection between tian and human beings through the movement of yin and yang. Yinyang is an essential vehicle for interactions between heaven and human beings: “The qi of yinyang moves heaven above as well as in human beings. When it is among human beings it is displayed itself as like, dislike, happy and mad, when it is in heaven it is seen as warm, chilly, cold and hot.” (Dong Zhongshu 1996: 436) In Dong’s cosmological vision, the whole universe is a giant yinyang. One of many examples of this vision is Dong’s proposal to control floods and prevent droughts by proper human interaction. In chapter 74 (“Seeking the Rain”) of his Luxuriant Gems of the Spring and Autumn, Dong asserts that a spring drought indicates too much yang and not enough yin. So one should “open yin and close yang” (1996: 432). He suggests that the government should have the south gate closed, which is in the direction of yang. Men, embodying yang, should remain in seclusion. Women, embodying yin, should appear in public. He even requests all married couples to copulate (ouchu) to secure more yinyang intercourse. It is also important during this time to make women happy (1996: 436). In chapter 75 (“Stopping the Rain”), Dong alleges that the flood proves there is too much yin so one should “open yang and close yin” (1996: 438). The north gate, the direction of yin, should be wide open. Women should go into concealment and men should be visible. Officers in the city should send their wives to the countryside in order to make sure that yin will not conquer yang. Derk Bodde defines this practice as a “sexual sympathetic magic.” (Bodde 1981: 373)

Finally, yinyang also plays a pivotal role in traditional Chinese thought about health and the human body. The early medical text known as the Huangdi neijing (The Yellow Emperor’s Classic of Internal Medicine) provides a detailed account of physiological functions and pathological changes in the body and guidance for diagnosis and treatment in terms of yinyang. Five zang (organs) — the kidneys, liver, heart, spleen and lungs — are classified as yin. They control the storage of vital substance and qi. Six fu (organs) — the gallbladder, stomach, small and large intestines, urinary bladder and triple burner (referring to three parts of the body cavity: the upper burner, which houses the heart and lungs; the middle burner, which houses the spleen and stomach; and the lower burner, which houses the kidney, urinary bladder and small and large intestines) — are yang and control the transport and digestion of food. The storage is a yin function, and the transport and transformation of substance is a yang function. But the zang and fu organs can be further subdivided into yin and yang. The activity or function of each organ is its yang aspect, while its substance is its yin aspect. Yin should flow smoothly and yang should vivify steadily. They regulate themselves so as to maintain equilibrium. Yin and yang do not exist in isolation but are in a dynamic state in which they interact and fashion the complicated and intricate system of the human body.

5. The Yinyang Symbol

There is no clear and definite way to determine the exact date of origin or the person who created the popular yinyang symbol. No one has ever claimed specific ownership of this popular image. However, there is a rich textual and visual history leading to its creation. Inspired by a primeval vision of cosmic harmony, Chinese thinkers have sought to codify this order in various intellectual constructions. Whether to formulate this underlying pattern through words and concepts or numbers and visual images has been debated since the Han dynasty. The question first surfaced in the interpretation of the Yijing. The Yijing is constructed around sixty-four hexagrams (gua), each of which is made of six parallel broken or unbroken line segments (yao). Each of the sixty-four hexagrams has a unique designation; its image (xiang) refers to a particular natural object and conveys the meaning of human events and activities. The Yijing thus has generated a special way to decipher the universe. It mainly incorporates three elements: xiang (images), shu (numbers), and li (meanings). They act as the mediators between heavenly cosmic phenomena and earthly human everyday life. From the Han dynasty through the Ming and Qing dynasties (1368-1912 CE), there was a consistent tension between two schools of thought: the school of xiangshu (images and numbers) and the school of yili (meanings and reasoning). At issue between them is how best to interpret the classics, particularly the Yijing. The question often was posed as: “Am I interpreting the six classics or are the six classics interpreting me?”

For the school of Xiangshu, the way to interpret the classics is to produce a figurative and numerological representation of the universe through xiang (images) and shu (numbers). It held that xiangshu are indispensable structures expressing the Way of heaven, earth and human beings. Thus the school of Xiangshu takes the position that “I interpret the classics” by means of the images and numbers. The emphasis is on the appreciation of the classics. The school of Yili, on the other hand, focuses on an exploration of the meanings of the classics on the basis of one’s own reconstruction. In other words, the school of Yili treats all classics as supporting evidence for its own ideas and theories. The emphasis is more on idiosyncratic new theories than on the explanation of the classics. In what follows, our inquiry focuses on the legacy of the Xiangshu school.

The most common effort of the Xiangshu school was to draw tu (diagrams). Generations of intellectuals labored on the formulation and creation of numerous tu. Tu often delineate structure, place, and numbers through black and white lines. They are not aesthetic objects but rather serve as a means of articulating the fundamental patterns that govern phenomena in the universe. Tu are universes in microcosm and demonstrate obedience to definite norms or rules. During the Song dynasty (960-1279 CE), the Daoist monk Chen Tuan (906-989 CE) made an important contribution to this tradition by drawing a few tu in order to elucidate the Yijing. Though none of his tu were directly passed down, he is considered the forerunner of the school of tushu (diagrams and writings). It is said that he left behind three tu; since his death, attempting to discover these tu has become a popular scholarly pursuit. After Chen Tuan, three trends in making tu emerged, exemplified by the work of three Neo-Confucian thinkers: the Hetu (Diagram of River) and Luoshu (Chart of Luo) ascribed to Liu Mu (1011-1064 CE), the Xiantian tu (Diagram of Preceding Heaven) credited to Shao Yong (1011-1077 CE), and the Taijitu (Diagram of the Great Ultimate) attributed to Zhou Dunyi (1017-1073 CE). These three trends eventually led to the creation of the first yinyang symbol by Zhao Huiqian (1351-1395 CE), entitled Tiandi Zhiran Hetu (Heaven and Earth’s Natural Diagram of the River) and pictured above at the head of this entry.

6. References and Further Reading

  • Bennett, Steven J. “Patterns of the Sky and the Earth: A Chinese Science of Applied Cosmology.” Chinese Science (March 1978) 3: 1-26.
  • Chan, Wing-tsit, ed. A Source Book in Chinese Philosophy. Princeton, NJ: Princeton University Press, 1963.
  • Bodde, Derk. Essays on Chinese Civilization. Princeton, NJ: Princeton University Press, 1981.
  • Dong, Zhongshu. Luxuriant Gems of the Spring and Autumn. Ed. Su Xing. Beijing: Chinese Press, 1996.
  • Fung, Yu-lan. A Short History of Chinese Philosophy. Trans. Derk Bodde. New York: The Free Press, 1997.
  • Graham, A.C. Yin-Yang and the Nature of Correlative Thinking. Singapore: The Institute of East Asian Philosophies, 1986.
  • Guanzi. Ed. Guan Bo. Beijing: Hua Xia Press, 2000.
  • Guoyu (Discourse of the States). Eds. Wu Guoyi, Hu Guowen and Li Xiaolu. Shanghai: Guji Press, 1994.
  • Henderson, John B. The Development and Decline of Chinese Cosmology. New York: Columbia University Press, 1984.
  • Huainanzi. Ed. Liu An. Xi’an: Sanqing Press, 1998.
  • Inoue, Satoshi. Xianqin Yinyang Wuxing (Pre-Qin Yinyang and Five Phases). Hubei: Education Press, 1997.
  • Kohn, Livia. “Yin and Yang: The Natural Dimension of Evil.” In Philosophies of Nature: The Human Dimension, eds. Robert S. Cohen and Alfred I. Tauber (New York: Kluwer Academic Publishers, 1997), 91-106.
  • Legge, James. The Chinese Classics: The Ch’un Ts’ew, with Tso Chuen. Taipei: SMC Publishing Inc., 1994.
  • Li, Shen and Guo Yu, eds. The Complete Selection of Diagrams of Zhouyi. Shanghai: China Eastern Normal University Press, 2004.
  • Makeham, John. Transmitters and Creators: Chinese Commentators and Commentaries on the Analects. Harvard East Asian Monographs, no. 228. Cambridge, MA: Harvard University Press, 2003.
  • Needham, Joseph. Science and Civilization in China. Cambridge: Cambridge University Press, 1956.
  • Porkert, Manfred. The Theoretical Foundations of Chinese Medicine: Systems of Correspondence. Cambridge, MA: MIT Press, 1974.
  • Puett, Michael J. To Become a God: Cosmology, Sacrifice and Self-Divinization in Early China. Cambridge: Harvard University Press, 2002.
  • Roth, Harold D. Original Tao: Inward Training (Nei-yeh) and the Foundations of Taoist Mysticism. New York: Columbia University Press, 1999.
  • Rubin, Vitaly A. “The Concepts of Wu-Hsing and Yin-Yang,” Journal of Chinese Philosophy 9 (1982): 131-157.
  • Sishu wujing (Four Books and Five Classics). China: Yuling Press, 1990.
  • Yabuuti, Kiyosi. “Chinese Astronomy: Development and Limiting Factors.” In Chinese Science: Explorations of an Ancient Tradition, eds. Shigeru Nakayama and Nathan Sivin (Cambridge, MA: MIT Press, 1973), 91-103.
  • Yang, Xuepeng. Yinyang Qi yu Bianliang (Yinyang Qi and Changes). Beijing: Chinese Science Press, 1993.
  • Yates, Robin D.S. Five Lost Classics: Tao, Huang-Lao, and Yin-yang in Han China. New York: Ballantine Books, 1997.
  • Zhuangzi. Ed. by Chen Guying. Beijing: Chinese Press, 1983.

Author Information

Robin R. Wang
Email: rwang@lmu.edu
Loyola Marymount University
U. S. A.

Evolutionary Epistemology

Evolutionary Epistemology (EE) is a naturalistic approach to epistemology and so is part of philosophy of science. Other naturalistic approaches include sociological, historical and anthropological explanations of knowledge. What makes EE specific is that it subscribes to the idea that cognition is to be understood primarily as a product of biological evolution. What does this mean exactly? Biological evolution is regarded as the precondition of the variety of cognitive, cultural, and social behavior that an organism, group or species can portray. In other words, biological evolution precedes (socio-)cultural (co-)evolution. Conversely, (socio-)cultural (co-)evolution originates as a result of biological evolution. Therefore:

  • EE studies the origin, evolution and current mechanisms of all cognitive capacities of all biological organisms from within biological (evolutionary) theory. Here cognition is broadly conceived, ranging from the echolocation of bats, to human-specific symbolic thinking;
  • Besides studying the cognitive capacities themselves, EE investigates the ways in which biological evolutionary models can be used to study the products of these cognitive capacities. The cognitive products studied include, for example, the typical spatiotemporal perception of objects of all mammals, or more human-specific cognitive products such as science, culture and language. These evolutionary models are at minimum applied on a descriptive level, but can also be used as explanations for the behavior under study. In other words, the cognitive mechanisms and their products are understood to be either comparable to, or the result of, biological evolution;
  • Within EE it is sometimes assumed that biological evolution itself is a cognitive process.

Table of Contents

  1. Overview
  2. Context of Use
    1. EE and Selection Theory
    2. The EEM and EET Program
  3. EE and Naturalized Epistemology
  4. Different EEs: The Units and Levels of Selection Debate
  5. The Environment, the Adaptationist Program and Traditional EE
    1. The Adaptationist Program
    2. Traditional EE
      1. Karl Popper
      2. Konrad Lorenz
      3. Donald Campbell
      4. Stephen Toulmin
      5. Peter Munz
  6. Evolution from the Point of View of the Organism
    1. The Constructivist Approach
    2. The Non-Adaptationist Approach within EE
  7. Evolution from the Point of View of Genes
  8. Universal Selection Mechanisms Repeated and Extended
    1. Lewontin’s “Logical Skeleton” of Natural Selection
    2. Universal Darwinism
    3. Blind Variation and Selective Retention
    4. Universal Selectionism
    5. Replication, Variation and Environmental Interaction
    6. Generate-Test-Regenerate / Replicator-Interactor-Lineage
    7. Universal Symbiogenesis
  9. References and Further Reading

1. Overview

A general account of the meaning and history of the term “evolutionary epistemology” is given in sections 2 and 3 below. It is important to understand in advance that different kinds of evolutionary epistemology (EE) can be distinguished, but all forms share the following assumption: that cognition, to a greater or lesser extent, needs to be studied from within evolutionary theory. Disagreements arise about:

  • where to draw the line between the cognitive and the non-cognitive,
  • which aspects of cognition should be studied from within evolutionary theory, and
  • which aspects of evolutionary theory should apply to the study of cognition.

Evolutionary theory itself is far from synonymous with the theory of evolution by natural selection. Rather, heterogeneous views on evolution arise when one takes the units and levels of selection debate (sections 4 through 7) as a point of departure. Different perspectives on evolution emerge when one looks at evolution from the point of view of the environment (section 5), the organism (section 6), and genes (section 7). The development of different EEs parallels this perspectivism: each viewpoint has given rise to its own kind of EE. The adaptationist approach to evolution is the basis of traditional EE. Non-adaptationist approaches to EE have been based on the constructivist approach to evolution. The “gene’s eye view” of evolution has resulted in a quest for universal evolutionary epistemological mechanisms.

2. Context of Use

The concept “evolutionary epistemology” was first introduced by Donald T. Campbell (1974). However, he repeatedly refused to be called the founding father of EE since he saw himself as denoting “… something that has sprung up all over for a hundred years or more” (Campbell in Callebaut, 1993: 289).

If EE were to have a motto, it might come from Michael Ruse’s (1988) famous book title Taking Darwin Seriously. This means that when one adheres to an evolutionary view of life, one needs to understand all biological processes not only as the outcome of evolution, but also as something that can only be investigated adequately by making use of evolutionary theory.

Evolutionary epistemology understands epistemology to be a product of biological evolution. Therefore, epistemology is studied from within evolutionary biology. Cognition is no longer understood to be linguistic (propositional) or a human-bounded characteristic. Rather, all organisms can show behavior that is cognitively based.

Hence, the first major quest of evolutionary epistemologists is distinguishing between the different cognitive processes that biological organisms from all major kingdoms of life can display.

Second, they investigate how these cognitive capacities evolved from unicellular organisms onwards.

Third, the products of cognition (the perception of light or color on the one hand; science, culture and language on the other) are understood from within an evolutionary approach.

The use of biological theories and mechanisms to comprehend cognition is either meant to be descriptive or explanatory. In this context, Ruse (1988: 32) differentiates between an “analogy-as-heuristic” and an “analogy-as-justification.” The former term refers to using metaphors and analogies from evolutionary theory to describe, for example, the evolution of science loosely and to discover new approaches to research. The latter research strategy involves applying evolutionary analogies to justify and thus to validate such things as the evolution of science.

In sum, the underlying view of EE is that there is a universal evolutionary mechanism that first led to the evolution of life in general, and that this same mechanism is also at work within the evolution of cognition and within the products of cognition, such as language, science and culture.

Some evolutionary epistemologists, such as Campbell (1974), therefore also assume that this evolutionary mechanism in its own workings portrays an evolutionary mechanism. This concept will be discussed later.

The concept “EE” today is commonly used as a synonym for selection theory on the one hand, and, on the other hand, as part of the EEM and EET program.

a. EE and Selection Theory

EE has strong affinities with selection theory (Campbell, 1997). The latter is a theory that adheres to the view that all and only selectionist (as opposed to instructionist, or behaviorist) explanations of an organism’s traits, including cognitive ones, are valid. Behaviorist explanations state that it suffices to describe the visible, external behavior that an organism portrays in order to develop adequate explanations of that behavior. Selectionist accounts, by contrast, also examine internal elements that underlie a certain trait (such as genes, for example) and the evolutionary emergence of that trait. The term selection theory was first introduced by Simmel and Baldwin in the 19th century (Campbell, 1997). Today, however, a wide range of biologists, neurologists, and evolutionary epistemologists are selectionists (for an example, see Cziko 1995), but these scholars do not recognize or accept any direct influence of Simmel and Baldwin’s selection theory.

Throughout this article, the more general term EE is maintained. There are three reasons for this. First, not all topics that are investigated by selectionists are relevant for the study of cognition. Second, not every evolutionary epistemologist defends a solely selectionist account of cognition. Rather, other evolutionary principles such as self-organization, for example, are also included to comprehend (the products of) cognition (as will be discussed in section 5). Third, analogies are not only drawn between evolutionary theory and the evolution of science and knowledge. Culture, language, economics etc. can also be interpreted from within these evolutionary epistemological frameworks.

b. The EEM and EET Program

A useful distinction within EE is made by Bradie (1986). Two different programs are identified, the EEM and the EET program. Within the Evolution of Epistemological Mechanisms (EEM) program, the evolution of cognition and cognitive knowledge mechanisms is investigated from within the Modern Synthesis. The Modern Synthesis is the standard paradigm within evolutionary biology on how evolution occurs. It is based on the principle of evolution by natural selection as first introduced by Charles Darwin.

Furthermore, the products of cognitive evolution, such as language, science, or culture, are also understood to be the result of biological evolution, and it is assumed that in their emergence or structure an evolutionary pattern can also be found. The following example can illustrate this: the evolution of language or culture is at least partly the result of biological evolution. Hence, the same evolutionary mechanisms that are used to describe the evolution of cognition are also applicable to the products of cognition, such as language or culture. The EET program (Bradie, 1986) was introduced specifically for epistemological or scientific theories. Within the Evolution of Epistemological Theories (EET) program, the ways in which analogies are drawn between the evolution of science on the one hand and natural selection on the other are investigated.

Different evolutionary epistemologists are active within the various fields mentioned above and within extra-philosophical scientific fields, which makes it difficult to pinpoint the common assertions made by all evolutionary epistemologists. Adherents of an EEM position, for example, can object to the widely held idea that science also needs to be explained from within evolutionary epistemology, as adherents of the EET program state. What binds evolutionary epistemologists is the idea that evolutionary theory, to some extent, can explain aspects of cognition.

3. EE and Naturalized Epistemology

What is so different about EE that it can be distinguished from all other epistemological endeavors? To answer this question, we need first to situate and then to evaluate EE in relation to other philosophical frameworks.

EE is part of the naturalistic turn. The naturalistic turn itself is a larger movement that emphasizes the importance of a sociology of knowledge, anthropology of knowledge, and the historical study of knowledge. Evolutionary Epistemology in turn emphasizes the importance of the biology of knowledge. More specifically, the study of biological evolution is the precondition of all investigations into cognition (Wuketits, 1984: 2-19). Therefore, it explains evolution itself as a cognitive process.

Furthermore, within EE, knowledge and cognition are no longer conceived of as necessarily proposition-like or language-like or human-bounded. As such, EE stands opposed to traditional philosophical approaches to cognition (such as empiricist and rationalist ones that understand knowledge to be language-like), and it also goes beyond Quine’s Naturalized Epistemology. In order to understand this, first naturalized epistemology is briefly discussed and then the difference with EE is explained.

Naturalized epistemology was first introduced by Quine (1969), who stressed that the study of science and scientific thinking should revolve around how knowledge is processed, rather than what knowledge is in itself. Therefore, he emphasized that we should reject the idea of a first philosophy. Within a first philosophy, it is assumed that philosophy can make claims about science without using the sciences. If one would make use of the sciences, this would be understood as circular. Quine, however, stressed that we should investigate epistemology from within the natural sciences, more specifically, psychology:

The stimulation of his sensory receptors is all the evidence anybody has had to go on, ultimately, in arriving at his picture of the world. Why not see how this construction really proceeds? Why not settle for psychology? (Quine, 1969: 269-70) […] [A]t this point it may be more useful to say that epistemology still goes on, though in a new setting and a clarified status. Epistemology, or something like it, simply falls into place as a chapter of psychology and hence of natural science. (Quine 1969: 273-4)

Epistemology is defined as that discipline which studies exactly how our sense organs construct a picture of the world. The study of knowledge involves the investigation into (1) the relation between neural input and observational sentences, and (2) an investigation into the relation between theoretical and observational sentences. Hence, according to Quine, knowledge, or more specifically, cognition, is still understood to be language-like: it is assumed that somehow our neural input is transformed into verbal output. A rather behaviorist position is taken by Quine, because the study of how our neurological abilities relate to language is not assessed. Somehow the relation between sensory input and language is assumed to be direct.

Neurology today, however, has repeatedly shown that at the neurological or cognitive level, there is no direct, and certainly no necessary, relation between our categorizations and our language (Changeux 1985; Gazzaniga 1994, 2000; Damasio 1996 and 1999; LeDoux 1998).

Furthermore, because of the rise of ethology and ecology (the study of the external behavior of animals in relation to their natural settings), cognition as a scientific concept has been broadened to include non-linguistic behavior as well.

It is here that evolutionary epistemology makes its entrée. Konrad Lorenz (1958), for example, was one of the founding fathers (together with Nikolaas Tinbergen) of ethology. Lorenz stressed the importance of a cognitivist approach to behavior, thereby also including internal behavior.

In contrast to Naturalized Epistemology, EE does not only examine the relation between human, language-like knowledge and the world. Any type of relation that an organism engages in with its environment is understood as a knowledge relation, irrespective of whether or not these organisms have language.

Munz (2001: 9) points out that what makes EE unique is that knowledge is comprehended as a cognitive relation between an organism and its environment. Empiricists, for example, understand knowledge to be a relation between a knower and something knowable by induction, while rationalists define knowledge as a relation between a knower and something known because of deduction. Even within the sociology of knowledge movement, knowledge is not understood as a relation between an organism and its environment; rather, it is comprehended as a relation between different knowers.

What makes EE different from all other naturalistic approaches within philosophy is that it does not regard epistemology as a mere study of how a human knower comes to know what is knowable. Rather, EE studies how knowledge about the environment is gained across different species, and what knowledge-gaining mechanisms arise in biological organisms through time, enabling these organisms to cope with their environment. This means that within EE not only human cognition but all sorts of behavior that organisms at all levels in biological evolution display (ranging from chemotaxis, that is to say, the movement of cells in response to chemical stimuli, to instinctive and cultural behavior) are regarded as devices that are put to use to gain knowledge. And equally important, these mechanisms themselves are also comprehended as knowledge in and of themselves.

Within EE, contrary to behaviorism, internal factors that determine behavior and cognition are also included. Because biological evolution led to the rise and acquisition of different cognitive/knowledge processes, this evolution itself is explained as a knowledge process.

4. Different EEs: The Units and Levels of Selection Debate

The units and levels of selection debate is taken as the point of departure to distinguish between different types of EE. EEs draw on evolutionary theory to explain epistemology or cognition. However, there are disagreements on what evolution in general is. Therefore different, sometimes complementary evolutionary theories are put forward by evolutionary biologists. It is only logical that this results in different evolutionary epistemologies. Three different perspectives are described to understand evolution and the different EEs that arise when using these perspectives:

  1. Evolution from the point of view of the environment, which leads to traditional, adaptationist approaches to EE;
  2. Evolution from the point of view of the organism, which leads to non-adaptationist, constructivist approaches; and
  3. Evolution from the point of view of genes, which opens the quest for universal selection formulas.

How did the units and levels of selection debate get started?

The Modern Synthesis (Ayala, 1978; Maynard-Smith, 1993; Mayr, 1978), which is the standard paradigm on how biological evolution occurs, states very strictly that the phenotype (the visible organism) is the unit of selection. This phenotype either is selected at the level of the environment, if this visible organism is adapted to that environment, or the organism dies out, if it is maladaptive.

With the rise of Postneodarwinian theory on the one hand, and Systems Theory on the other, the debate over the units and levels of selection was introduced first in biology, and later within evolutionary epistemology. In this discussion the primary question asked is whether there are units and levels of selection other than the phenotype and the environment. The concept “units of selection” was coined by the biologist Richard Lewontin in his famous 1970 article of the same name. The concept “levels of selection” was introduced by Robert Brandon in his 1982 article by the same name. However, the discussion dates back to scientific debates concerning the possibility of group selection in the 1960s between William Hamilton (1964) and George C. Williams (1966, chapter 4), and still further back in time to the 19th century, when Herbert Spencer introduced and applied the “survival of the fittest” idea to human populations and society.

5. The Environment, the Adaptationist Program and Traditional EE

a. The Adaptationist Program

The concept “adaptationist program” was first introduced by Gould and Lewontin (1979), although these authors do not subscribe to it themselves. The adaptationist program regards “[…] natural selection as so powerful and the constraints upon it so few that direct production of adaptation through its operation becomes the primary cause of nearly all organic form, function, and behavior” (Gould and Lewontin, 1979: 584-5).

To understand this, the distinction between ontogeny (the development of an organism from conception until death) and phylogeny (the evolution of species) is in order. Within Lamarckian theory, no strict separation between ontogenetic and phylogenetic processes is adhered to. Within this paradigm, also known as the inheritance of acquired characteristics, traits acquired during the lifetime of an individual can be passed on immediately to the next generation.

With the introduction of Darwin’s principle of natural selection, for the first time in history it was possible to distinguish between ontogenetic and phylogenetic processes, because of the distinction that is made between the inner and the outer world of the organism (Lewontin, 2000: 42-3). The inner milieu of the organism is, according to Darwin, subjected to, amongst other things, developmental growth processes that are not themselves subjected to evolution by natural selection. The outer environment, by contrast, is the sole scene where evolution by natural selection occurs. Here the environment either does or does not select an organism. Regarding the inner milieu of the organism, Darwin himself quite often made use of Lamarck’s theory. He used it as an explanation for how novel individual variation arises. Natural selection was never interpreted by Darwin as being the cause of the variation; in fact, he did not know how variation occurred. Therefore, he invoked Lamarck’s principle of the inheritance of acquired characteristics. Natural selection only selected amongst the given variation.

These ideas were later incorporated into the Modern Synthesis. Organisms vary. This variety is the result of, on the one hand, the specific combinations of genetic material that an organism carries, and on the other hand, possible random mutations that occur within these genes. One acquires the genetic material that one carries through birth; thus no child can choose its specific genetic code. And the genetic mutations that sometimes occur do so randomly; they are blind. That is to say, mutations are random errors that occur during the copying of this genetic material. The genetic material that one carries can be neutral, adaptive or maladaptive for the carrier in the “struggle for existence.” The point, however, is that from this perspective, the organism itself cannot by any means whatsoever influence the genetic material that it carries. Eventually, it is the environment that indirectly selects adaptive organisms through the elimination of the unfit. Thus, the Modern Synthesis views this selection process as taking place between the phenotype and the environment. And the selection process itself is said to occur only externally: the “level of selection” is the external environment, and the selection of the “unit of selection,” the organism, occurs independently of internal processes such as developmental growth.

Figure 1. The adaptationist approach focuses on the external relation between the environment and the organism.

Thus, within the adaptationist approach the organism and the environment are conceived as two separate entities that only interact during the selection process but develop independently from one another (fig. 1).

Adaptation is literally the process of fitting an object to a pre-existing demand… Organisms adapt to the environment because the external world had acquired its properties independently of the organism, which must adapt or die. (Lewontin, 2000: 43)

In other words, Neodarwinian theory adheres to a strict dualistic viewpoint (Gontier, 2006) between organism and environment: the organism is passively selected, or not, by an active environment. The organism cannot influence its chances of survival or fitness. For this reason, according to Lewontin (1978), one can defend the position that because of the emphasis these scholars lay on adaptation, Neodarwinians explain evolution from the point of view of the environment. Hence, they actually give a description of the environment through the organism, rather than describing the organism itself.

b. Traditional EE

It is the latter position that has been one of the basic tenets of traditional EE, namely, that one is able to gain knowledge about the environment by studying the organisms that live in it, because organisms literally “re-present” the outer world.

What does this mean? Logical empiricism failed in providing a non-arbitrary relation between the world and human language. However, the search for such a non-arbitrary relation between the outer world and the organisms that inhabit that world was continued from within the adaptationist approach. In this position it is assumed that there is an unchangeable outer world to which organisms adapt. If it is true that organisms are adapted to the outer world, and that all and only the fit survive and reproduce in the long run, then these adaptive organisms can tell us something about that environment. An ant, for example, can tell us something about the soil.

This section provides an overview of the major traditional evolutionary epistemologies and how they developed out of the adaptationist view of evolution.

i. Karl Popper

Beginning with Sir Karl Popper’s (1963) ideas concerning conjectures and refutations (also called trials and errors), the following position is defended within traditional EE: there is a growth of (scientific) knowledge which is comparable with the succession of adaptations in evolution. The task of EE thus becomes explaining this growth.

Adhering to the strict distinction made between ontogeny and phylogeny, it is argued that at no stage during evolution does an organism receive knowledge from the outer world. Bold conjectures are made about the outer world and if these hypotheses are not falsified by experiments performed by the scientific community, they survive. In the long run, unfit theories are eliminated by the process of falsification, and there is a growth in knowledge. Theories that survive longer than others are understood to be tentatively corroborated. The analogy with biological evolution is clear: a selectionist account is preferred over an instructionist one. This means that at no point does an organism choose its genetic endowment. However, if this organism, with the genetic endowment that it is born with, stands the test of the environment, that is, if it survives long enough so that it can reproduce, then the organism’s genetic traits survive, and it is said that it is adapted to its environment. In the long run, only the fit survive; maladaptive organisms are not able to survive long enough to reproduce and spread their genes in the gene pool again, and therefore die out.

Thus, just as the Modern Synthesis stresses that an organism can by no means directly receive instructions from the environment, Popper (1963: 46) emphasizes that we force our interpretations upon the world prior to our observations: “Without waiting, passively, for repetitions to impress regularities upon us, we actively try to impose regularities upon the world.” These are the conjectures that are put forward for trial, to be selected or eliminated according to the test results. Scientific theories are thus not the result of observations, but of wild hypotheses. Although Popper himself is not part of the field of EE, his work on conjectures and refutations is often regarded as a first account of EE.
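The selectionist logic of conjectures and refutations can be illustrated with a deliberately simple sketch. Everything in it is an assumption made for illustration only: the function name `surviving_conjectures`, the three toy conjectures, and the two observations are hypothetical and do not come from Popper. Conjectures are proposed before any observation is consulted, and observations serve only to eliminate them.

```python
def surviving_conjectures(conjectures, observations):
    """Return the conjectures that no observation has falsified so far."""
    survivors = {}
    for name, predict in conjectures.items():
        # A single failed prediction is enough to eliminate a conjecture.
        if all(predict(x) == y for x, y in observations):
            survivors[name] = predict
    return survivors


# Hypothetical conjectures, put forward prior to any observation.
conjectures = {
    "doubling": lambda x: 2 * x,
    "squaring": lambda x: x * x,
    "add_two": lambda x: x + 2,
}

# Hypothetical test results as (input, observed outcome) pairs.
observations = [(2, 4), (3, 6)]

survivors = surviving_conjectures(conjectures, observations)
print(sorted(survivors))
```

The first observation eliminates nothing, since all three conjectures predict it; the second falsifies two of them. The remaining conjecture is not thereby shown to be true. It has merely not yet been refuted, and a further observation could still eliminate it.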

ii. Konrad Lorenz

Konrad Lorenz is also a representative of traditional EE, since he too worked within the adaptationist program. Lorenz (1941, 1985) is famous for reinterpreting Kant’s synthetic a priori claims. No longer are the inborn categories regarded as evidently true; rather, they are understood to be “ontogenetically a priori and phylogenetically a posteriori.” This means that an individual organism is born with innate dispositions. These innate dispositions are acquired phylogenetically, through the evolution of the species, by means of the mechanism of natural selection. Most importantly, these dispositions are fallible, because they are the result of selection, not instruction. That is, these dispositions are adaptations, and natural selection only weeds out maladaptive organisms, which results in the survival of the adaptive ones. According to the Modern Synthesis, at no time in evolution does natural selection actually cause or create the adaptive traits that are presented to the environment (again because of the strict distinction made between ontogeny, where natural selection does not work, and phylogeny, where it does).

According to Lorenz, and contrary to Kant, the thing in itself (Das Ding an Sich) is knowable through the categories of the knower, not the characteristics of the thing in itself, and selection results in a partial isomorphism through adaptation. Lorenz states that:

The central nervous apparatus does not prescribe the laws of nature any more than the hoof of the horse prescribes the form of the ground. Just as the hoof of the horse, this central nervous apparatus stumbles over unforeseen changes in its task. But just as the hoof of the horse is adapted to the ground of the steppe which it copes with, so our central nervous apparatus for organizing the image of the world is adapted to the real world with which man has to cope. (In Campbell, 1974: 447)

Thus, through adaptation, there is a correspondence between our images of the world and the world in itself, or between organism and environment, or between theories and the world. This is of course not a 1-to-1 correspondence; our image of a tree is not like a real tree, but because our cognitive apparatus is adapted to the world, there is a partial isomorphism between the two. Adaptations thus become a description of the world in a biological language (Lorenz, 1977).

The reinterpretation of Kant’s synthetic a priori claims is not solely the work of Lorenz; rather it dates as far back as Herbert Spencer. For the most complete overview of authors who have reinterpreted Kant’s ideas in this way, see Campbell (1974).

iii. Donald Campbell

Donald T. Campbell (1974) goes one step further than Lorenz because he rethought the distinction between ontogeny and phylogeny. No longer is natural selection something that solely works on the level of the environment; natural selection is internalized as well. Furthermore, the mechanism of natural selection, in its own workings, is said to work selectively as well.

Campbell’s (1959: 153-5) main goal was to develop an empirical science of induction (not to be confused with behaviorist instruction; see section 1). This empirical science consisted of a comparative study of the psychology of knowledge, a biological science of cognition, a sociology of knowledge, and a science of history. In other words, he wanted to build a science of science, which Campbell (1974) termed EE. This discipline had to be compatible with evolutionary biology and social evolution (Campbell, 1974: 413). In his 1959 paper he characterized biology as the study of “progressive adaptation.” Therefore, he made an abstraction of the mechanism of natural selection by introducing the blind-variation-and-selective-survival mechanism (Campbell, 1959). Later he would call it the blind-variation-and-selective-retention scheme (Campbell, 1960).

Campbell’s (1959: 156-8) EE is based upon six philosophical assumptions:

  1. Hypothetical realism: EE acknowledges as a hypothesis the existence of an external world where entities exist and processes occur. This differs from Popper’s critical realism in that the existence of the world in itself also needs to be proven through observation.
  2. No first philosophy: EE rejects the idea of a first philosophy, subscribing rather to the view that knowledge needs to be explained using scientific knowledge.
  3. No distinction between human beings and animals is adhered to. On the contrary, it is fully acknowledged that human beings are animals.
  4. EE is an “epistemology of the other one” as Campbell (1974: 448) calls it. This means that EE raises the question of how organisms come to know, not how a knower acquires knowledge. That is to say, it studies the relationship between an organism’s cognitive capacities and the environment that it is selected to cognize.
  5. Epistemological dualism: there is a difference between what is knowable and what is known. Knowledge always constitutes indirect and fallible constructions that never completely correspond with the thing in itself.
  6. Perspectivism: each of the different hypotheses that are formed provides another perspective. These can partially overlap, but also differ from one another. In the latter case, different positions can be regarded as equal.

According to Campbell, science was only one aspect of a general knowledge process and this process was hierarchical in nature. Knowledge is no longer merely language-like and human-bounded. On the contrary, different biological and social layers can be distinguished, each of which encompasses a different aspect of knowledge. And here too, the focus lies on the acquisition and growth of knowledge.

In his 1959 article, Campbell distinguishes between 12 knowledge processes. These include machines on the one hand, but also bisexuality, heterozygosis, and meiotic cell division, on the other. In his 1960 article Campbell discusses creative thinking as a separate learning process.

Finally, in his 1974 article he distinguishes ten different levels that are applicable to biological and social evolution. This is the last and most canonized hierarchy that Campbell (1974: 422-435) introduced and it is these ten levels that are now discussed.

(i) Non-mnemonic problem solving

Organisms that engage in non-mnemonic problem solving do not have a memory. Bacteria, for example, are such organisms. They blindly search for food until they find it: they cannot remember previous food sources, and they cannot voluntarily go to one. They are just swept away by the wind.

(ii) Vicarious locomotor devices

Examples are the echolocation of bats, or a blind man’s cane. They replace the blind exploration of the surrounding space by trial and error movements.

(iii) Habit and (iv) Instinct

Habit, instinct, and visual diagnosis are all closely related to each other, according to Campbell. Both instincts and habits are mostly founded upon visual stimuli that trigger a learned or innate response. Innate knowledge does not represent innate ideas; rather it corresponds to expectations or hypotheses that have no prior validity. Therefore, the distinction between “primitive instincts” and “learned habits” is false: all instincts are fine-tuned by learning processes and all learning makes use of inborn knowledge mechanisms. And both are hypotheses that need to be tested. Furthermore, Campbell introduces the popular habit-to-instinct view of his time, namely that by means of natural selection, habits will become instincts (without explaining how this takes place).

(v) Visually supported thought

This can be thought of as insightful problem-solving. Organisms endowed with this knowledge process are able to perform insightful behavior when they can visually perceive their surrounding environment. Campbell offers as an example the Köhler experiments, where primates are capable of showing some kind of “aha” experience.

(vi) Mnemonically supported thought

Organisms with memory capacities can re-present the environment, thereby replacing the need for a constant visually perceivable environment. Because one can imagine the environment, one can also have creative and intelligent thoughts, of unseen or unexperienced things (such as a mermaid).

(vii) Socially vicarious exploration: observational learning and imitation

Trial and error exploration by one member of the community can replace the trial and error exploration by all the other members of society. This is because certain organisms are able to learn by observing others. Imitating others’ behavior reduces the need for each individual to invent a certain behavior on its own. This implies that we live in a shared world; a solipsistic view is impossible. Campbell also stresses that learned behavior cannot jump from brain to brain; rather it needs to be learned in turn by trial and error. So a memetic position is not feasible in Campbell’s view.

(viii) Language

Language overlaps with (vi) and (vii) and is broadly conceived as including human language but also other communication systems such as bee language and pheromones. With language, the environment is represented by words that are contingently chosen (they don’t necessarily correspond with the world; the relation is indirect). Language acquisition, too, does not merely encompass the direct passing on of words to children. Children, through trial and error, learn to correctly use the words they hear to describe certain objects and/or events, which again implies a strictly behavioristic model.

(ix) Cultural transmission

Changes in technology and culture also represent a blind variation and selective retention scheme. Complete social organizations are either selected or not and their respective leaders replace the behavior of the members of the community.

(x) Science

Science is part of cultural evolution. Science, too, reveals a trial and error pattern.

Many of the above mentioned knowledge mechanisms that Campbell introduced are today further divided or re-defined. Nevertheless it was Campbell who for the first time in history so clearly distinguished between different knowledge processes. Thus he showed that knowledge is not to be understood in a uniform manner.

Campbell’s more general blind-variation-and-selective-retention scheme, which is supposed to run through all levels of the hierarchy, is still applied today.

All increases in knowledge or adaptivity are an inductive process, and adaptivity is also comprehended as knowledge (Campbell, 1960). This differs from an instructionist process, because at no time is the organism a blank slate that is written upon by the environment. While natural selection does not cause blind variation, in a way it does cause indirect selective retention, through the elimination of the unfit. “At no stage has there been any transfusion of knowledge from the outside, not of mechanisms of knowing, nor of fundamental certainties” (Campbell, 1974: 413). Therefore, according to Campbell (1960: 380-381):

  1. All knowledge-gaining-processes or inductive achievements are the result of a blind-variation-and-selective-retention scheme. The latter is thus a universal schema or heuristic that can account for the evolution of these different processes.
  2. Furthermore, within the course of evolution, one can distinguish between many later-evolved processes that shortcut full blind-variation-and-selective-retention processes. Vision, for example, shortcuts blind trial and error locomotion. Such new mechanisms are also inductively achieved (by natural selection). The process by which these inductively achieved mechanisms shortcut and accelerate earlier mechanisms is called vicarious selection. This concept is derived from the Christian vicar, because such shortcuts substitute earlier mechanisms in a way that a vicar substitutes God. What is important is that knowledge mechanisms that are acquired later are (again because they are inductively achieved) not necessarily more accurate; they are only more efficient (Campbell, 1959: 162). These shortcuts themselves evolved through a process of blind-variation-and-selective-retention. And later stages partly determine earlier stages of knowledge processes which Campbell (1974) termed downward causation.
  3. Finally, these shortcuts have not only evolved by blind-variation-and-selective-retention. In the operation of these shortcuts, a blind-variation-and-selective-retention process can also be detected. Thus it is Campbell who is the first to state clearly that not only does a selection process lie at the basis of evolution, but also that this selection process itself adheres to such a selection process.
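The general shape of a blind-variation-and-selective-retention process can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not Campbell's own formalization: the function name, the numeric "trait", the parameter values, and the toy `environment` function are all hypothetical. The point it tries to show is that variants are generated without any information about the environment, and that the environment acts only afterwards, by eliminating the less fit.

```python
import random


def blind_variation_selective_retention(fitness, start, generations=200,
                                        variants=20, step=1.0, seed=0):
    """Toy BVSR loop over a single numeric trait (illustrative only)."""
    rng = random.Random(seed)
    retained = start
    for _ in range(generations):
        # Blind variation: candidates are produced without consulting fitness.
        candidates = [retained + rng.uniform(-step, step)
                      for _ in range(variants)]
        # Selection: the environment eliminates all but the fittest variant.
        best = max(candidates, key=fitness)
        # Retention: the survivor replaces the current trait only if it
        # does at least as well; otherwise the old trait is kept.
        if fitness(best) >= fitness(retained):
            retained = best
    return retained


def environment(trait):
    """Hypothetical environment whose demands are best met at trait = 3.0."""
    return -(trait - 3.0) ** 2


result = blind_variation_selective_retention(environment, start=0.0)
print(round(result, 2))
```

Nothing in the loop instructs the trait which way to change. The fit between the retained value and the environment arises only through elimination, which is the sense in which no knowledge is transferred "from the outside" in the passage quoted above.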

In his 1995 article (published posthumously in 1997 by Heyes and Frankel), Campbell rejected his earlier ideas about treating adaptations as knowledge and he restricted knowledge to those vicarious selectors. In fact, the whole adaptationist approach became more and more problematic to Campbell (1987: 140) in his later writings and he started to emphasize that Panglossian adaptationism needs to be avoided at all times within EE. Retention is just as important as variation and selection, especially where science is concerned.

iv. Stephen Toulmin

Specifically regarding scientific thinking, especially in the works of Stephen Toulmin (1972), a strong analogy is drawn with natural selection. Ideas and concepts are the results of scientific thinking and these are, by analogy with the gene pool, introduced into the pool of scientists through science journals, conferences, books etc., leading to the rise of competition between different ideas. Only the fittest ideas survive while the less fit die out. However, this “fitness” is not solely the result of the scientific value of the idea; other factors enter into the equation. For example, sociological reasons are included as causal factors for why an idea is or is not rejected.

v. Peter Munz

Peter Munz, another author working within the adaptationist program, calls his version of EE “Philosophical Darwinism” (2001). Contrary to the previous authors discussed, Munz states that even variation, which is normally conceived of as being blind (the result of random mutations and genetic recombinations), is the result of a selective process. Inspired by the works of Popper, he goes so far as to state that organisms are “embodied theories,” and theories are “disembodied organisms.”

According to Munz (2001: 151-160), every organism is a theory about its environment. That is, an organism primarily gives knowledge about the environment. Moreover, an organism can be regarded as a definition of that environment. An organism mirrors its environment because of selective adaptation. Therefore, an organism literally becomes a not yet falsified theory of a certain aspect of the environment, its Umwelt/niche, and thus it becomes a provisionally true hypothesis. A theory/organism — the two are synonymous in Munz’s view — has certain expectancies about its environment, and if these are met, then the organism/theory survives; if not, the organism/theory is falsified. The longer an organism/theory survives, the more truth is approximated.

The behavior of a fish and the functioning of a theory of water are exactly identical. The fish represents water by its structure and its functioning. Both features define an initial condition (for example, the degree of viscosity of water) which, when spotted or sensed, trigger off a prognosis or behavioral response which, in case of a fish, fails to be falsified. By contrast, a bird does not represent water (Munz, 2001: 155).

Thus, an organism is an embodied theory about its environment. An organism re-presents that part of the world that it is adapted to and this representation is thus no longer verbal or conscious. Embodied theories, according to Munz, are also no longer expressed in language, but in anatomical structures or reflex responses, etc.

Besides regarding organisms as embodied theories, theories become disembodied organisms in Munz’s view. A human being is both because it possesses linguistic knowledge. Linguistically expressed theories, according to Munz (2001: 160-8), are also the result of a process of variation and selective retention. Here too, linguistically expressed theories are literally organisms. In the wake of Popper, Munz stresses that theories should be reified. Linguistic theories are built up from language, and there exists no causal link between this language and the causal impact that the world has upon the non-linguistic body. Therefore, language and consciousness create uncertainty: expressions can only be hypothetical. In addition, at first language appears to be maladaptive, since it delays non-linguistic, embodied responses. Nevertheless, such expressions are adaptive as well, because they enable variation. Selection can only work when there is variation which it can select from, and therefore, for Munz, the growth of scientific linguistic knowledge is possible.

In contrast to previous adaptationist EEs, according to Munz, this variation is also the result of selectionist processes. Eventually, Munz (2001: 184) stresses that his theory results in an anthropic principle. With the origin and evolution of life, the world represents itself, onto itself, through disembodied organisms and embodied theories. Contrary to physics, it is biology that can give us a valid picture of how the world is.

In summary, within traditional evolutionary epistemological accounts, the strict distinction between phenotype and environment, as put forward by adherents of the Modern Synthesis, is adhered to. This leads to the possibility that one can gain knowledge about the environment by studying organisms that are adaptive to that environment. Thus, within this tradition it is assumed that organisms can provide a non-arbitrary relation, not between language and the outer world, but between whole organisms (their bodies) and the outer world. This position however encounters problems when one takes an organismic point of view, a position that will be discussed in the next section.

6. Evolution from the Point of View of the Organism

When evolution is regarded from within an organismic point of view, a constructivist account emerges which in turn leads to the non-adaptationist approach within EE. Therefore, the constructivist approach is examined first. Then the elements that are drawn from this approach for the development of the non-adaptationist approach to EE are outlined.

a. The Constructivist Approach

Following Lewontin and Gould’s critical review of the adaptationist program, evolutionary theory was interrogated from less adaptationist perspectives as well. Opposed to the strict adaptationist account, the systems theoretical approach defends the following constructivist position.

…[T]he claim that the environment of an organism is causally independent of the organism, and the changes in the environment are autonomous and independent of changes in the species itself, is clearly wrong. It is bad biology, and every ecologist and evolutionary biologist knows that it is bad biology. The metaphor of adaptation, while once an important heuristic for building evolutionary theory, is now an impediment to a real understanding of the evolutionary process that needs to be replaced by another. Although all metaphors are dangerous, the actual process of evolution seems best captured by the process of construction. (Lewontin, 2000: 48)

Instead of portraying organisms as passive elements that are subjected to selection, Lewontin (2000: 51-64) introduces a more constructivist approach to evolution in which five different aspects of the organism-environment relation are distinguishable.

  1. Organisms partly determine by themselves which elements from the external environment belong to their environment or niche, and they determine to a large extent how these different elements relate to one another. A shrub, for example, can be part of the habitat of a butterfly, while a tree is not.
  2. Organisms not only largely choose what is part of their environment; they also literally construct the environment that surrounds them. This process is called niche construction. Beavers, for example, build their own dams.
  3. Furthermore, organisms constantly change their environment in an active manner; every act of consumption is an act of production. The first photosynthetic organisms, for example, changed earth dramatically from an oxygen-low to an oxygen-rich planet.
  4. Through time, organisms learn to anticipate the external conditions that the environment provides. For instance, according to certain environmental conditions, certain chordates are able to switch from a sexual to an asexual form. Other organisms hoard food for the winter.
  5. Finally, according to Lewontin, organisms modify signals that are coming from their surroundings by their biological build-up. That is to say, they modify external signals into internal signals to which their bodies are able to react. For example, if the external temperature rises, the molecules that form the organisms do not start to tremble. Rather, an internal signal in the brain will lead to the release of certain hormones that cool the body down so that it does not get overheated.

Hence, from within the systems theoretical approach, the relation between an organism and its environment is understood from within a dialectical point of view (Callebaut & Pinxten, 1987: 41; Gontier, 2006).

Figure 2. Within systems theory, the focus lies not only on the mutual relation between the organism and its environment; internal processes specific to the organism and/or the environment are also taken into account.

An organism is not only determined by the external environment; it can also, to a certain extent, determine its environment by constructing and reconstructing it in an active manner (fig. 2). Therefore, the concept “environment” is also broadened to include the inner environment where inner homeostatic, self-regulating processes are responsible for an organism’s survival (points 4 and 5 above). Because of this, it is said that the constructivist approach explains evolution from the organismic point of view (Gutmann and Weingarten 1990; Wuketits, 2006).

b. The Non-Adaptationist Approach within EE

The non-adaptationist approach to EE was first introduced by Franz Wuketits (1989). All adaptationist approaches to EE adhere to the view that it is possible, to an extent, to develop a correspondence theory. A correspondence theory states that there is a 1-to-1 correspondence between the environment and the organisms that live in it, or between theories and the world. For instance, the ant can tell us something about the soil. In order to make this claim feasible, natural selection needs to be reduced to, or at a minimum the emphasis should rest heavily on, the mechanism of adaptation. It is only through the mechanism of adaptation that such correspondence can be obtained.

In the wake of Ludwig von Bertalanffy, one of the founders of systems theory, the importance of the study of the whole organism is stressed, next to the study of the (adaptive) relation between the organism and the environment. Within systems theory, organisms are conceived of as partly open, partly closed systems. That is to say, organisms constantly take matter and energy from, and give matter and energy to, their environment, while they themselves maintain a “steady state” (Wuketits, 2002: 193). Later on, Prigogine (1996) would introduce the concept of “dissipative structures.” A whirlpool, for example, maintains its form while the water of which it is composed constantly changes. But once the water flow stops, the whirlpool no longer exists. Organisms are more than such dissipative structures. They are homeostatic systems, because not only can they self-regulate and self-organize, they can also maintain themselves to a certain extent. That is why it is said that organisms are partly open, partly closed systems: they receive matter and energy from, and give matter and energy to, their environment. They also distinguish themselves from that environment and are able to construct their environment as well.

Developmental systems theory (DST) (Maturana and Varela 1980; Oyama 2000a and b; Dupré 2001) grew out of systems theory and, as the concept suggests, it focuses on developmental processes. It understands organisms to be autocatalytic systems, systems which are able to self-organize and self-maintain, not so much because they are adapted to the environment they live in, but because they are able to self-maintain, sometimes even despite the environment, due to the inner mechanisms they develop in order to survive. Therefore, these inner mechanisms of self-organization and self-regulation are comprehended as causal factors that need to be part of the explanation of why organisms behave in a certain manner.

Within the non-adaptationist tradition of EE, being adapted does not mean that there is a one-to-one correspondence with the environment. Instead, being adapted implies having the ability to change the environment to make it livable for the organism, and thus to enhance survival. Adaptation thus becomes only one aspect that needs to be studied, together with non-adaptationist approaches. Wuketits (2006: 38-9):

… a nonadaptationist view of cognition and knowledge and a nonadaptationist version of evolutionary epistemology (…) is mainly based on the following assumptions: (1) Cognition is the function of active bio-systems and not of blind machines that just respond to the outer world. (2) Cognition is not a reaction to the outer world but results from complex interactions between the organism and its surroundings. (3) Cognition is not a linear process of step-by-step accumulation of information but a complex process of continuous error elimination.

In sum, an EE based upon systems theoretical evolutionary theory is not anti-adaptationist (Wuketits 1995: 359-60). It is non-adaptationist because the world constantly changes because of the organisms that inhabit it. This makes it difficult to approximate a one-to-one correspondence.

Instead of adhering to such a correspondence theory, the non-adaptationist approach puts forward a coherence theory. Because of these processes of inner self-organization, self-regulation and the possibility for an organism to partially (re)construct its environment, an organism is partly capable of creating its own habitat. Different organisms develop different habitats because they have evolved differently and have different inner mechanisms which enable them to cope with, and interact with, the outer world. Here, according to Wuketits (2006), it is not useful to ask which habitat is more real or more in correspondence with the world in itself (an sich), because every organism capable of surviving has proven that it is adequate. Therefore a coherence theory adheres to a functional notion of reality. What an organism, according to its own inner mechanisms of perception, perceives as real, is real for that organism in its struggle for existence. If that organism is able to survive because of the way it perceives things, it is able to reproduce and reintroduce its genes into the gene pool. Wuketits (2006: 43):

First, organisms do not simply get a picture of (parts of) reality, but develop, as was already hinted at, a particular scheme of reaction. … Second, the notion of a world-in-itself becomes obsolete or at least redundant. What counts for any organism is that it copes with its own world properly.

7. Evolution from the Point of View of Genes

Thus far we have examined the “organismic point of view” towards evolution defended by the systems theoretical approach, and the description of evolution from the “point of view of the environment” as is the case with the Modern Synthesis. A third and final alternative for describing evolution is the “gene’s eye view.” The gene’s eye view was introduced by Richard Dawkins (1976), following Williams (1966).

This approach opened the discussion concerning universal Darwinism (section 8) and introduced the important concept of a “replicator,” a concept that is often used within universal selectionism.

According to Dawkins (1982: 162), the unit of selection is not the phenotype but the replicator: “… any entity in the universe of which copies are made.” This replicator, in contrast to the vehicles that temporarily house it, “…is potentially immortal… the rationale is that an entity must have a low rate of spontaneous, endogenous change, if the selective advantage of its phenotypic effects over those of rival (‘allelic’) entities is to have any significant effect” (Dawkins, 1982: 164).

A replicator carries information that can be copied. An example par excellence is genetic material that, according to the specific sequence of nucleotides (the building blocks of genes), codes for certain characteristics. Organisms, according to Dawkins, are mere vehicles that temporarily accommodate such information-carrying replicators. In the long run, because of their longevity, fecundity and copying-fidelity, these “selfish genes” outlive their temporary housing. Therefore, the emphasis for Dawkins should lie on the replicator, not the individual organism. That is not to say that, according to Dawkins, the environmental approach so characteristic of the Modern Synthesis is wrong; rather, it should be complemented with the gene’s point of view of evolution.

…[t]here are two ways in which we can characterize natural selection. Both are correct: they simply focus on different aspects of the same process. Evolution results from the differential survival of replicators. Genes are replicators; organisms and groups of organisms are not replicators, they are vehicles in which replicators travel about. Vehicle selection is the process by which some vehicles are more successful than other vehicles in ensuring the survival of their replicators. (Dawkins, 1982: 162)

It is the organism’s job to deliver its genes as quickly and faithfully as possible into the gene pool. “Vehicle selection is the differential success of vehicles in propagating the replicators that ride inside them” (Dawkins, 1982: 166). Every behavior an organism displays that is not reducible to the benefit of its genetic material is, from the point of view of the gene, futile and even unnecessarily costly. Organisms are only important in so far as they are able to propagate their genes. Therefore, although this view can complement the Modern Synthesis, it stands opposed to the “organismic point of view.”

8. Universal Selection Mechanisms Repeated and Extended

Thus far we have seen that the units and levels of selection debate that started within biology also set off an evolutionary epistemological debate concerning the different units and levels of selection in science.

One of the main goals set forward by many Evolutionary Epistemologists is the development of a normative and explanatory framework that is based upon, and is at the least analogical to, evolutionary thinking. The quest for universal selection formulas that was already launched as early as the nineteenth century was spurred again by this units and levels of selection debate. The goal of such a uniform universal formula is that it not only explains biological evolution, but also the evolution of science, culture, the brain, economics, etc.

Scientists and philosophers alike have introduced different formulas that generalize and universalize natural selection and other evolutionary theories. Discussions in the field revolve around the question of whether there exists one universal selection formula which can be utilized to interpret all other kinds of evolutionary processes (including the evolution of culture, psychology, immunology, language, etc.), or whether such formulas can only help at a descriptive, and therefore, merely analogical, level. In what follows, different evolutionary frameworks are briefly touched upon so that the interested reader has an idea of where to look for different applications of these schemas.

a. Lewontin’s “Logical Skeleton” of Natural Selection

Lewontin (1970: 1) was the first to make an abstraction of natural selection. He argued that “the logical skeleton” of Darwin’s theory is “a powerful predictive system for changes at all levels of biological organization.” Lewontin distinguishes between three principles: phenotypic variation, differential fitness (because of different environments) and the heritability of that fitness. Lewontin (1970: 1) introduced this logical skeleton to pinpoint “different units of Mendelian, cytoplasmic, or cultural inheritance.” He distinguished between the selection of molecules (regarding the origin of life), cell organelles (regarding cytoplasmic evolution), cellular selection (different cell types divide at different rates, comparable with what today is called epigenetics), gametic selection, individual selection, kin selection and population selection.
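
Lewontin's three principles can be made concrete with a small deterministic sketch. It is a toy model under stated assumptions, not Lewontin's own formulation: the function name next_generation, the variant labels, and the single heritability parameter (the fraction of the selected shift that is transmitted to offspring) are hypothetical. The point is only that change across generations requires all three conditions at once.

```python
def next_generation(freqs, fitness, heritability=1.0):
    """One generation of selection on variant frequencies.

    freqs:        dict mapping variant -> frequency (frequencies sum to 1).
    fitness:      dict mapping variant -> relative reproductive success.
    heritability: toy parameter in [0, 1]; the share of the shift caused by
                  selection that is passed on to the offspring generation.
    """
    mean_fitness = sum(freqs[v] * fitness[v] for v in freqs)
    # Frequencies among the parents that actually reproduce.
    after_selection = {v: freqs[v] * fitness[v] / mean_fitness for v in freqs}
    # Only the heritable part of the shift shows up in the next generation.
    return {v: freqs[v] + heritability * (after_selection[v] - freqs[v])
            for v in freqs}

start = {"A": 0.5, "B": 0.5}

# All three principles present: variation, differential fitness, heritability.
print(next_generation(start, {"A": 2.0, "B": 1.0}))

# No differential fitness: frequencies stay where they are.
print(next_generation(start, {"A": 1.0, "B": 1.0}))

# No heritability: selection acts on parents but leaves no trace in offspring.
print(next_generation(start, {"A": 2.0, "B": 1.0}, heritability=0.0))

# No variation: a single variant cannot change in frequency.
print(next_generation({"A": 1.0}, {"A": 2.0}))
```

Nothing in the function refers to genes or organisms, which is why the same skeleton can be applied to molecules, cells, individuals or populations.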

b. Universal Darwinism

Dawkins (1983: 15) states that wherever life originates, that life can only be explained by using Darwin’s theory of natural selection. According to Dawkins, the most important property of life is that it is adapted to its environment, and adaptation requires a Darwinist explanation. Dawkins (1983: 16) states: “I agree with Maynard Smith […] that ‘The main task of evolution is to explain complexity, that is, to explain the same set of facts which Paley used as evidence of a Creator.’”

Organisms are “adaptively complex” (Dawkins, 1983: 17). This means that a complex structure like the eye, for example, evolved by natural selection for vision. Organisms or organismal traits are adapted to the environment and also evolved to enable adaptation towards that environment. Thus, through adaptation, an organism possesses information about that environment (Dawkins, 1983: 21). Selection refers to “…the non-random selection of randomly varying replicating entities by reason of their ‘phenotypic’ effects” (Dawkins, 1983: 32). It can be further divided into “one-off selection” and “cumulative selection.” The former relates to the selection of a stable configuration, a universally occurring process. The latter enables complex adaptation, because the next generation builds upon earlier generations through such things as the passing on of genes, but not solely by this mechanism.

Most importantly, for Dawkins, it is replicators that are selected. The reason that he introduces the concept “replicator” is twofold. First, he wants to extend the Modern Synthesis by introducing the gene’s eye view. Second, he introduces the term replicator, instead of gene, because he wants to universalize the principle of natural selection. The unit of selection, according to Dawkins, is the replicator, but replicator is a generic term: not only genes (individual genes or whole chunks of the chromosome) but also memes, which he defines as “… brain structures whose ‘phenotypic’ manifestation as behavior or artifact is the basis of their [cultural] selection,” are replicators (Dawkins, 1982: 164). The idea of memetics was later expanded by Blackmore (1999).

c. Blind Variation and Selective Retention

Campbell’s scheme is a formula that can be universalized. Every relationship that an organism engages in with its environment is a knowledge relation. Variation is blind, either because of random mutations and genetic recombinations, or, in the case of the development of scientific theories, blind trials result in blind variation.

Selection does not only occur at the level of the interaction between phenotype and environment, for selection is also internalized by the process of vicarious selection (see above). And trial and error learning has always been somewhat synonymous with blind-variation-and-selective-retention, according to Campbell.

In his earlier writings, Campbell (1959, 1960) emphasized the notion of variation, because only when there is sufficient variation will there be competition and selection. Later, he emphasized the selective retention part of his theory: those traits that are already adaptive also need to be retained by the current generation in order to keep being adaptive. In science as well, existing theories must be retained and passed on to the next generation through learning, or this information dies out. Hence tradition within culture or science, for example, also became a more important element in Campbell’s later writings (1987).

d. Universal Selectionism

The concept “universal selectionism” was first introduced by Gary Cziko (1995) and roughly corresponds with Campbell’s blind-variation-and-selective-retention scheme, although Cziko prefers the term selectionism. In his 1995 book, Cziko explains this scheme as being applicable not only to biological evolution, but also to the evolution and growth of knowledge, immunology, and the development of the brain, thinking and culture. Selectionism is the only theory that, according to Cziko (1995: 13-26), can explain the fit of an organism with its environment. Throughout history, providentialism and instructionism have also been assumed to explain this fit, but only selectionism can explain the mechanism of adaptation.

e. Replication, Variation and Environmental Interaction

The replication, variation, and environmental interaction scheme was first introduced by David Hull (1980) as a critique of Dawkins’s notion of replicators and vehicles. In Dawkins’s view, organisms are mere vehicles that temporarily accommodate the selfish genes that ride inside them and an organism can actually be equated with the workings of its genes. Hull’s theory differs from Dawkins’s, because the former states that organisms can display behavior that is not reducible to their genes. On a more general level, Hull introduced the notion of an interactor to complement Dawkins’s view (1980). Thus, he basically re-introduced the common assumption held by the Modern Synthesis that what interacts with the environment are organisms, not genes. But the notion of interaction can also be universalized. The most recent account of this formula is given in Hull, Langman and Glenn (2001).

For selection to occur, three conditions need to be met: replication, variation, and environmental interaction. Replication is dependent on the interaction between the organism and its environment (Hull, Langman and Glenn, 2001: 511). The formula they propose should be equally applicable to biology, immunology and operant behavior, although it should not be identical to biological selection theory. All three sorts of evolution share certain properties but also have their own peculiarities. Changes in operant behavior, for example, are not transmitted immediately to the next generation.

In contrast to Campbell and Plotkin, Hull, Langman, and Glenn (2001: 513) define selection as “[The] repeated cycles of replication, variation, and environmental interaction so structured that environmental interaction causes replication to be differential. The net effect is the evolution of the lineages produced by this process.”

Within postneodarwinian theory, variation is either perceived as part of the selection process, or as a precondition for selection to occur. If variation occurs, this results either from mutations that occur in the sex cells at the biological level, or from different behavioral patterns that in their own right are the result of environmental interaction. Replication, according to these authors (Hull, Langman and Glenn, 2001: 514-6), concerns the repetition/copying of “information.”

Finally, environmental interaction is characterized as causing replication to differ because certain replicators are more frequently selected than others, which in turn has nothing to do with the introduction of new variation. Only at the level of interaction between the organism and the environment does selection occur.

Hull’s scheme is one of the few schemes that has already been implemented in extra-philosophical and extra-biological fields. William Croft (2000, 2002) for instance uses it for the study of language change.

f. Generate-Test-Regenerate / Replicator-Interactor-Lineage

Plotkin prefers the notion of “universal Darwinism” over universal selectionism (1995, chapter 3). He distinguishes between two universal formulas. The first, the generate-test/selection-regenerate formula, is more general. It does not a priori say anything about the mechanisms or units that cause this generating and testing. This formula is again very close to Campbell’s scheme, as well as Lewontin’s (Plotkin, 1995: 84). A second formula does specify the units and mechanisms: replication, interaction and lineages. The reason Plotkin distinguishes between the two is that he wants to avoid having to pinpoint a priori a replicator in cultural evolution.

Selection processes, according to Plotkin, always take place in three steps. First, there is the generation of variation; the nature of the varying units does not in itself need to be specified (genes, phenotypes, theories, etc. can all vary). This phase is always followed by a test phase, in which selection operates on the test results; natural selection is the prototypical case. Finally, there is regeneration of old and newly evolved varieties (Plotkin, 1995: 84). While it is obvious that Plotkin mainly has the selection of genetic material in mind here, he also considers his formula appropriate for explaining learning and intelligence. How information is transmitted is not determined a priori; what matters is that old variations are regenerated throughout time.
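
The generate-test-regenerate cycle can be sketched in a few lines. This is an illustrative reading, not Plotkin's own specification: the function generate_test_regenerate, the helpers vary_int and score_int, and the target value 42 are hypothetical. In keeping with the formula's generality, the loop makes no assumption about what the varying units are; the variation operator and the test are passed in from outside.

```python
import random

def generate_test_regenerate(population, vary, score, generations, rng):
    """Run repeated generate-test-regenerate cycles on a list of units.

    population:  list of units of any kind (numbers, strings, theories...).
    vary:        function (unit, rng) -> new variant of that unit.
    score:       function unit -> number; higher means the test is passed better.
    Returns (final population, best score recorded after each cycle).
    """
    keep = len(population)
    best_scores = []
    for _ in range(generations):
        # Generate: produce new variants of the current units.
        variants = [vary(unit, rng) for unit in population]
        # Test: rank old and new varieties together.
        ranked = sorted(population + variants, key=score, reverse=True)
        # Regenerate: old and new varieties that pass the test seed the next cycle.
        population = ranked[:keep]
        best_scores.append(score(population[0]))
    return population, best_scores

# Hypothetical units: integers that vary by one step and are tested against 42.
def vary_int(unit, rng):
    return unit + rng.choice([-1, 1])

def score_int(unit):
    return -abs(unit - 42)

final_pop, best_scores = generate_test_regenerate(
    [0, 10, 90, 100], vary_int, score_int, generations=60, rng=random.Random(1)
)
print(final_pop)
print(best_scores[0], best_scores[-1])
```

Because old varieties are carried over alongside the new ones, the best score can never decrease from one cycle to the next. Swapping in different vary and score functions changes the units without touching the loop, which is the sense in which the formula leaves units and mechanisms unspecified.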

The replicator-interactor-lineage formula is first an elaboration and specialization of Plotkin’s first formula since it combines Dawkins’s notion of a replicator with Hull’s notions of an interactor and lineage, the latter term referring to “… entities that can change indefinitely through time as a result of replication and interaction” (Plotkin, 1995: 97). Hull himself defines lineages as “… spatiotemporal sequences of entities that causally produce one another. Entities in the sequence are in some sense ‘descended’ from those earlier in the sequence” (1981: 146).

According to Plotkin (1995: xv), adaptation and knowledge are related in two ways: first the capacity to acquire knowledge is in itself an adaptation, and secondly, adaptations are also a form of knowledge. Adaptations are “in-formed” by the environment. Therefore, adaptation is knowledge (Plotkin, 1995: 116) and there can be a tentative growth of knowledge.

g. Universal Symbiogenesis

SET, the Serial Endo-symbiogenetic Theory of Lynn Margulis and Dorion Sagan (2002), is a theory that describes the origin of the five kingdoms. In brief, different bacteria merged and evolved into multi-cellular life. What is interesting here is that different bacteria literally merged and thus that evolution does not exclusively occur according to speciation models. The physicist Freeman Dyson (1992) therefore introduces the principle of universal symbiogenesis, where symbiotic mergings and speciation models intertwine. Throughout the evolution of life, as in the evolution of the universe, there is an increase in diversification on the one hand and symbiogenesis on the other. Different structures originate and then later merge to form new structures. Within the evolution of life, there was the origin of the first microbial organisms, which then merged again and evolved into multi-cellular organisms.

Dyson defines universal symbiogenesis as “the reattachment of two structures, after they have been detached from each other and have evolved along separate paths for a long time, so as to form a combined structure with behavior not seen in the separate components” (Dyson, 1998: 121).

In conclusion, it can be said that the specific theory of evolution that one adheres to also partly determines what kind of evolutionary epistemology can be adhered to. Since evolutionary epistemology bases itself first on the sciences, no attempt is made by different evolutionary epistemologists to put forward one all-encompassing theory or program that all evolutionary epistemologists should adhere to. On the contrary, the diversity of evolutionary epistemologies is championed by scholars working in the field.

9. References and Further Reading

  • Ayala, Francisco J. 1978. “The Mechanisms of Evolution.” Scientific American 239 (3): 48-61.
  • Blackmore, Susan. 1999. The Meme Machine – with a foreword of Richard Dawkins. Oxford: Oxford University Press.
  • Bradie, Michael. 1986. “Assessing Evolutionary Epistemology.” Biology & Philosophy, 1, 401-459.
  • Brandon, Robert N. 1982. “The Levels of Selection.” In: Brandon, Robert N.; and Burian, Richard M. (eds). 1984. Genes, Organisms, Populations: Controversies over the units of selection 133-9. Cambridge: Massachusetts Institute of Technology.
  • Brandon, Robert N.; and Burian, Richard M. (eds). 1984. Genes, Organisms, Populations: Controversies over the Units of Selection. Cambridge: Massachusetts Institute of Technology.
  • Callebaut, Werner; and Pinxten, Rik. 1987. “Evolutionary Epistemology Today: Converging Views from Philosophy, the Natural and Social Sciences.” In: Callebaut, Werner; and Pinxten, Rik, (eds.). 1987. Evolutionary Epistemology: A Multiparadigm Program With a Complete Evolutionary Epistemology Bibliography 3-55. Dordrecht: Reidel.
  • Callebaut, Werner. 1993. Taking The Naturalistic Turn or How Real Philosophy of Science Is Done. Chicago IL: The University of Chicago Press.
  • Campbell, Donald T. 1959. “Methodological Suggestions from a Comparative Psychology of Knowledge Processes.” Inquiry 2 (3): 152-83.
  • Campbell, Donald T. 1960. “Blind Variation and Selective Retention in Creative Thought as in Other Knowledge Processes.” Psychological Review 67(6): 380-400.
  • Campbell, Donald T. 1974. “Evolutionary Epistemology.” In: Schilpp, Paul A. (ed.), The Philosophy of Karl Popper Vol. I, 413-459. Illinois: La Salle.
  • Campbell, Donald T. 1987. “Selection Theory and the Sociology of Scientific Validity.” In: Callebaut, Werner; and Pinxten, Rik (eds), Evolutionary Epistemology 139-58. Dordrecht: D. Reidel Publishing Company.
  • Campbell, Donald T. 1997. “From Evolutionary Epistemology Via Selection Theory to a Sociology of Scientific Validity: Edited by Cecilia Heyes and Barbara Frankel” Evolution and Cognition, 3 (1), 5-38.
  • Changeux, Jean-Pierre. 1985. Neuronal Man: The Biology of Mind. New York: Oxford University Press.
  • Croft, William. 2000. Explaining Language Change: An Evolutionary Approach. Essex: Pearson.
  • Croft, William. 2002. “The Darwinization of Linguistics.” Selection 3(1): 75-91.
  • Cziko, Gary. 1995. Without Miracles: Universal Selection Theory and the Second Darwinian Revolution. Cambridge: Massachusetts Institute of Technology.
  • Damasio, Antonio R. 1996 (1994). Descartes’s Error: Emotion, Reason and the Human Brain. London: Papermac [First published by New York: Grosset/Putnam].
  • Damasio, Antonio R. 1999. The Feeling of What Happens: Body and Emotion in the Making of Consciousness. New York: Harcourt Brace & Company.
  • Dawkins, Richard. 1976. The Selfish Gene. New York: Oxford University Press.
  • Dawkins, Richard. 1983. “Universal Darwinism.” In: Hull, David Lee; and Ruse, Michael (eds), The Philosophy of Biology 15-35. Oxford: Oxford University Press [First published in: Bendall, D. S. (ed.), 1998. Evolution from Molecules to Man 403-25. Cambridge: Cambridge University Press].
  • Dawkins, Richard. 1982. “Replicators and Vehicles.” In: Brandon, N. R.; and Burian, R. M. (eds) 1984, Genes, Organisms, Populations 161-79. Cambridge: Massachusetts Institute of Technology Press.
  • Dupré, John. 2001. Human Nature and the Limits of Science. Oxford: Clarendon Press.
  • Dyson, Freeman. 1998. “The Evolution of Science.” In: Fabian, Andrew C. (ed.), Evolution: Society, Science and the Universe 118-35. Cambridge: Cambridge University Press.
  • Gazzaniga, Michael S. 1994. Nature’s Mind: The Biological Roots of Thinking, Emotions, Sexuality, Language, and Intelligence. New York: Basic Books.
  • Gazzaniga, Michael S. 2000. The Mind’s Past. California: University of California Press.
  • Gontier, Nathalie. 2006. “Introduction to Evolutionary Epistemology, Language and Culture.” In: Gontier, Nathalie, Van Bendegem, Jean Paul and Aerts, Diederik (eds), Evolutionary Epistemology, Language and Culture – A Non-Adaptationist Systems Theoretical Approach 1-29. Dordrecht: Springer.
  • Gould, Stephen J. and Lewontin, R. C. 1979. “The Spandrels of San Marco and the Panglossian Paradigm: A Critique of the Adaptationist Program.” Proc. R. Soc. London, B 205, 581-589.
  • Guttmann, Wolfgang F.; and Weingarten, Michael. 1990. “Die biotheoretischen Mängel der evolutionären Erkenntnistheorie.” Journal for General Philosophy of Science 21: 309-328.
  • Hamilton, William. D. 1964. “The Genetical Evolution of Social Behavior, I and II.” Journal of theoretical biology, 7, 1-52.
  • Heyes, Cecilia; and Hull, David (eds). 2001. Selection Theory and Social Construction – The Evolutionary Naturalistic Epistemology of Donald T. Campbell. New York: State University of New York Press.
  • Hull, David L. 1980. “Individuality and Selection.” Annual Review of Ecology and Systematics, II: 311-32.
  • Hull, David L. 1981. “Units of Evolution.” In: Brandon, N. R.; and Burian, R. M. (eds) 1984, Genes, Organisms, Populations 142-159. Cambridge: Massachusetts Institute of Technology Press.
  • Hull, David L. 1988. Science as a Process: An Evolutionary Account of the Social and Conceptual Development of Science. Chicago: The University of Chicago Press.
  • Hull, David L.; Langman, Rodney E.; and Glenn, Sigrid S. 2001. “A General Account of Selection: Biology, Immunology, and Behavior.” Behavioral and Brain Sciences 24: 511-573.
  • Ledoux, Joseph. 1998 (1996). The Emotional Brain: The Mysterious Underpinnings of Emotional Life. New York: Touchstone Edition [First published by New York: Simon & Schuster incorporate].
  • Lewontin, Richard. 1970. “The levels of Selection.” Annu. Rev. Ecol. Syst. 1-18.
  • Lewontin, Richard. 1978. “Adaptation.” Scientific American 239 (3): 157-69.
  • Lewontin, Richard. 2000. The Triple Helix: Gene, Organisms and Environment. Cambridge: Harvard University Press.
  • Lorenz, Konrad. 1941. “Kant’s Lehre vom Apriorischen im Lichte gegenwärtiger Biologie.” Blätter für Deutsche Philosophie 15: 94-125. (English translation in Plotkin, Henry C., op. cit. 121-143.)
  • Lorenz, Konrad. 1958. “The Evolution of Behavior.” Scientific American 199 (6), 67-78.
  • Lorenz, Konrad. 1977. Behind the Mirror. London: Methuen.
  • Lorenz, Konrad. 1985. “Wege zur Evolutionären Erkenntnistheorie.” In: Ott, Jörg A., Wagner, G. P. and Wuketits, F. (eds), Evolution, Ordnung und Erkenntnis. Berlin: Verlag, 13-20.
  • Margulis, Lynn; and Sagan, Dorion. 2002. Acquiring Genomes: A Theory of the Origin of Species. New York: Basic Books.
  • Maturana, Humberto; and Varela, Francisco. 1980. Autopoiesis and cognition. Dordrecht: D. Reidel Publishing Company.
  • Maynard Smith, John. 1993 (1958). The Theory of Evolution. Cambridge: Canto [First published by Cambridge: Cambridge University Press].
  • Mayr, Ernst. 1978. “Evolution.” Scientific American 239 (3): 39-47.
  • Munz, Peter. 2001 (1993). Philosophical Darwinism: on the origin of knowledge by means of natural selection. London: Routledge.
  • Oyama, Susan. 2000a. The Ontogeny of Information: Developmental Systems and Evolution. Durham: Duke University Press.
  • Oyama, Susan. 2000b. Evolution’s Eye: A Systems View of the Biology-Culture Divide. Durham: Duke University Press.
  • Plotkin, Henry. 1995 (1994). Darwin Machines and the Nature of Knowledge: Concerning Adaptations, Instinct and the Evolution of Intelligence. London: Penguin Books.
  • Popper, Karl. 1963. Conjectures and Refutations. London: Routledge & Kegan Paul.
  • Prigogine, Ilya. 1996. La fin des certitudes: temps, chaos et les lois de la nature. Paris: Odile Jacob.
  • Quine, William V. 1969. “Epistemology Naturalized.” In: Bernecker, Sven; and Dretske, Fred (eds), 2000. Knowledge: Readings in Contemporary Epistemology 266-78. Oxford: Oxford University Press [First published in Quine, William V., 1969, Ontological Relativity and Other Essays 69-90. New York: Columbia University Press. Original title: “Naturalized Epistemology.”].
  • Ruse, Michael. 1988. Taking Darwin Seriously. Oxford: Blackwell publishers.
  • Toulmin, Stephen. 1972. Human Understanding: The Collective Use and Evolution of Concepts. Princeton: Princeton University Press.
  • Williams, George C. 1966. Adaptation and Natural Selection. Princeton: Princeton University Press.
  • Wuketits, Franz (ed.) 1984. Concepts and Approaches in Evolutionary Epistemology. Dordrecht: D. Reidel Publishing Company.
  • Wuketits, Franz M. 1985. “Die Systemtheoretische Innovation der Evolutionslehre.” In: Ott, Jörg A., Wagner, G. P. and Wuketits, F. (eds), Evolution, Ordnung und Erkenntnis. Berlin: Verlag, 68-81.
  • Wuketits, Franz M. 1989. “Cognition: A Non-adaptationist View.” La Nuova Critica 9-10, 5-15.
  • Wuketits, Franz M. 1990. Evolutionary Epistemology and its Implications for Humankind. New York: State University of New York Press.
  • Wuketits, Franz M. 1992. “Adaptation, Representation, Construction: An Issue in Evolutionary Epistemology.” Evolution and Cognition 2: 151-162.
  • Wuketits, Franz M. 1995. “A Comment on Some Recent Arguments in Evolutionary Epistemology: and Some Counterarguments.” Biology and Philosophy 10: 357-363.
  • Wuketits, Franz M. 2001. “The Philosophy of Donald T. Campbell: A Short Review and Critical Appraisal.” Biology and Philosophy 16: 171-88.
  • Wuketits, Franz M. 2002. “Ludwig von Bertalanffy (1901-1972) und die theoretische Biologie heute.” Naturwissenschaftliche Rundschau, 55 (4): 190-194.
  • Wuketits, Franz M. 2006. “Evolutionary Epistemology – The Non-Adaptationist Approach.” In: Gontier, Nathalie, Van Bendegem, Jean Paul and Aerts, Diederik (eds), Evolutionary Epistemology, Language and Culture – A Non-Adaptationist Systems Theoretical Approach 33-46. Dordrecht: Springer.

Research for this article was supported by the Fund for Scientific Research – Flanders (F.W.O.-Vlaanderen) and the Centre for Logic and Philosophy of Science, where the author is a Research Assistant.

Author Information

Nathalie Gontier
Email: Nathalie.Gontier@vub.ac.be
Vrije Universiteit Brussel
Belgium

Cicero: Academic Skepticism

Cicero (106 to 43 B.C.E.) adopted the philosophical view of the Academic skeptics as a young man sometime in the 80s B.C.E. In 89/8, Philo of Larissa, the head of Plato’s Academy, fled from Athens to Rome for political reasons. While Philo was at Rome, Cicero attended his public lectures and began to study philosophy with him. Cicero also studied with the most prominent representatives of other Hellenistic philosophical schools: Posidonius (a Stoic), Zeno of Sidon and Phaedrus (Epicureans), and Cratippus (a Peripatetic). Although the Academy probably ceased to exist as an institution after Philo’s death in 84, Cicero continued to champion its methodology in his philosophical dialogues.

The Academic position appealed to Cicero for a variety of reasons (Section 1). The Academics argued on both sides of every issue in order to undermine the dogmatic confidence of their interlocutors. Cicero’s teacher Philo also applied this method in order to determine which position enjoyed the most rational support. Given his rhetorical and forensic skills, Cicero likely found this method attractive. It was also ideal for his project of inducing the Roman ruling class to take up the practice of philosophy. Rather than present his personal views, Cicero laid out in dialogue form the strongest arguments he could mine from other philosophical texts. The idea was to encourage the reader to come to his own conclusion, but even more importantly, to adopt the Academic method of inquiry. Perhaps the most attractive feature of Academic philosophy for Cicero was the intellectual freedom guaranteed by the method. The Academic, as an Academic, is bound to no particular doctrine. He is bound only to accept the verdict of his best rational assessment of the arguments pro and con.

Cicero asserts that the reasons for his Academic allegiance are set out fully in his Academica (De Natura Deorum 1.11). Although these Academic books are fragmentary, they nonetheless provide a detailed account of the dispute between the Academics and Stoics on the possibility of knowledge (Sections 2 and 3) along with Philo’s explanation for how we can manage quite well without knowledge (Section 4).

Table of Contents

  1. The Skeptical Academy and its Appeal to Cicero
  2. Arguments For and Against Stoic Epistemology in the Academica
  3. Indirect Arguments in Support of Stoic Epistemology in the Academica
  4. The Positive Fallibilism of the Philonian Academy
  5. References and Further Reading
    1. Latin Texts and Translations
    2. Select Bibliography of Secondary Literature

1. The Skeptical Academy and its Appeal to Cicero

There were some important variations among the Academics during the Academy’s skeptical period (c. 268/7 B.C.E. to 1st century B.C.E.), but there is also a unifying feature. They all focused squarely, if not exclusively, on refutation. Inspired by Socrates (as he appears in some of Plato’s dialogues), they sought to combat the overly confident attitude of the dogmatists. Since the Stoics were the most influential dogmatists of the time, the skeptical Academics devoted much of their energy to combating them in particular. (Dogmatism in the Hellenistic period is simply a matter of positively affirming that one knows the truth of some systematically related philosophical propositions—it need not have the pejorative connotation currently associated with it.)

In order to refute their opponents, the Academics argued dialectically. Rather than assert a position themselves, they would reveal to the interlocutor that his beliefs are mutually inconsistent and thus that he is not able to justify his claim to knowledge. For example, suppose I claim to know that justice is whatever the strong say it is. Then, in response to a skeptic’s questioning, I am led on the basis of my own premises to conclude that justice is not whatever the strong say it is. It follows that I did not really know what I claimed to know. The operative assumption is that if I had known what justice is, I would have been able to show why my belief is true. If I contradict myself or run out of plausible reasons, then I do not know what justice is after all—even if my belief turns out to be true, I do not know why it is true.

Later Academics also began arguing on both sides of every issue, pro and con. Some apparently sought to show that nothing whatsoever could be known about the issue in question. To accomplish this end, they showed others that there are equally strong arguments for and against, and thus no compelling reason to accept any position. Others employed the same method in order to discover which side of an issue could be most plausibly defended. But all of the Academics agreed that the Stoics had failed to adequately defend their epistemology; that is, they had not shown that knowledge is possible (much more on this below in sections 2 and 3).

Cicero found the later Academic position appealing for a variety of reasons. The method of arguing pro and con was a natural fit with his tremendous oratorical and forensic skill. As a lawyer and orator he was pleased by the Academy’s insistence on teaching rhetoric along with philosophy on the grounds that the two disciplines were mutually supportive. He also found the position ideally suited to his philosophical project of inspiring the Roman ruling class to take up the practice of philosophy. In his dialogues he employs the Academic method with the intention of encouraging his readers to think for themselves rather than to rely on authority.

He was perhaps most attracted by the Academics’ intellectual freedom. In his earliest statement of Academic allegiance, Cicero remarks that he will gladly change his opinion if someone points out his error. For it is not shameful to have insufficiently understood something. It is shameful, however, to have persevered foolishly and for a long time with insufficient understanding. The reason for this is that insufficient understanding is due to the common weakness of mankind. It is, to some extent, inevitable, or at least excusable. Foolish perseverance, however, can be avoided, and hence is shameful and blameworthy. (De Inventione 2.9) Cicero describes such perseverance as the stubborn adherence to one’s position because one has come to feel some affection for it. The Academic, by contrast, is supposed to have no extra-rational motives in defending his view or in persevering, when or if he does.

Part of the rationale for this way of proceeding is that we cannot fully appreciate the relative strengths and weaknesses of the available philosophical positions unless we have thoroughly explored what can be said for and against them. To align oneself to a philosophical position prior to this is premature. As we start out we lack the knowledge or wisdom we seek, and thus we are not in a position to adequately judge which system or which philosopher to follow. Once one undertakes the Academic project, he or she may find, as Cicero did, that one lifetime is not sufficient for completing the project and taking a final stand.

This freedom to change one’s position in accordance with a new assessment of the arguments may appear to dispense with any concern for consistency. Suppose for example that I no longer believe that the arguments in favor of going to war with Carthage are compelling. While I previously believed Rome was justified in going to war, I now believe the opposite. As an Academic I am free to change my position as often as I like. I am not bound by any doctrinal constraints due to my philosophical allegiance. And I am not bound by what I formerly believed. Remaining consistent with my former beliefs is never as important as accepting the verdict of my current assessment of the arguments.

Academic freedom is not an end in itself, however; it is a means to arriving at the most rationally defensible position. This is why Cicero characterizes the Academic’s method as aimed at drawing out and articulating that view which can be maintained most consistently (Academica 2.9) and as aimed at revealing what is true or at least the closest approximation to the truth (Academica 2.7, 2.65-66, De Officiis 2.8, Tusculan Disputations 1.8). The consistency sought is an accord with the rational evidence and not with one’s previous beliefs.

Cicero frequently singles out this freedom as the most definitive and attractive feature of the Academics’ philosophical practice (for example, De Legibus 1.36, Academica 2.134, Tusculan Disputations 4.83, 5.33, 5.82, De Officiis 2.7, 3.20). They alone are free to accept whatever strikes them as most plausible at that moment (see Section 4 below for more on Academic probabilism).

2. Arguments For and Against Stoic Epistemology in the Academica

During his final encyclopedic burst of dialogues (46-44 B.C.E.), Cicero wrote his epistemological work, the Academica. The original version contained two books named after the principal interlocutor in each, Catulus and Lucullus. The latter of the two is extant, and generally referred to as Ac. 2 or Lucullus (= Luc.). Cicero revised these original two books, dividing them into four, and replaced Lucullus with Varro as principal interlocutor throughout. Only about the first fourth of the revised version is extant. It is generally referred to as Ac. 1.

In these books Cicero presents arguments for and against the Stoic theory of knowledge as well as the Academics’ own positive, fallibilistic alternative. It should be noted that ethics and epistemology are inextricably connected in the Academic books. Cicero remarks on several occasions that what they are investigating is the sage—that is, an ideal of the perfectly wise human being. Ultimately, the question about the possibility of knowledge on the Stoic account, and in Hellenistic philosophy in general, is a question about the possibility of wisdom. The Hellenistic philosophers followed Plato’s Socrates in taking their primary task to be the discovery of the best human life.

In order to meet this challenge, Zeno of Citium, the founder of Stoicism, developed an account of how the knowledge that Socrates sought—that is, the knowledge that guides one in living the best possible human life—could in fact be attained. That it could be attained he established on the grounds that the universe is providentially arranged. From the providential arrangement it follows that human beings must be equipped to satisfy their desire for knowledge, for Nature would not have acted so capriciously as to give us such an important desire without also providing the means to fulfill it.

If one developed his natural abilities in accordance with Nature, he would eventually learn to distinguish infallibly what is true from what is false, at least with regard to matters pertaining to happiness. The sage is not omniscient, but he is infallible. His knowledge guarantees that he will always live in accordance with nature, which is identical to being virtuous and happy.

All of the sage’s beliefs are true, and grasped in such a way that no experience and no argument is able to dislodge him from his conviction. This irrefutability depends crucially on the fact that all of the sage’s beliefs are true and firmly grasped as such. If he were to hold even one false belief he might be persuaded to rely on it in abandoning true beliefs. So we can see that the sage’s knowledge is systematic in that each of his true beliefs is supported by the others.

The foundation of this account of knowledge is a type of impression about which we cannot be mistaken. This type of impression provides us with a criterion of truth, that is, a measuring stick one can use to determine what is and what is not the case. If the Academics could succeed in showing that there are no such impressions, they would effectively undermine the possibility of attaining the knowledge built upon them.

Thus, the central issue in Ac. 2 is whether or not an impression can be apprehended or grasped in such a way as to guarantee its truth. Zeno described such an impression as cognitive, or mentally graspable (katalêpton), and defined it as one that

(1) comes from what is the case, that is, some existent state of affairs
(2) accurately conveys all the relevant features of what it comes from, and
(3) cannot be exactly like an impression that comes from what is not the case (Ac. 2.77, cp. Sextus’ account at Adversus Mathematicos [= M] 7.248).

Katalêpsis occurs when one assents to a cognitive impression, thereby firmly grasping its truth. So whenever one assents to a cognitive impression one necessarily forms a true belief.

The pressing question is whether one can learn to distinguish cognitive from non-cognitive impressions. It seems that one can never know whether (1) and (2) have been satisfied except by inspecting the perceptual content of the impression. If so, this opens the way for the Academics’ main objection. (It should be noted that the Stoics did not think all cognitive impressions are perceptual. We may have cognitive impressions of evaluative states of affairs—for example, that it is good for us to help our friend. However, most of the evidence regarding the possibility of such impressions is limited to perceptual cases, and so the following discussion will be similarly limited.)

The Academics demanded that the Stoics produce an instance of this cognitive grasping that is immune to skeptical counterexamples, that is, immune to scenarios in which a true impression provides the same sensory evidence as its false imitator. Apparently there is a plentiful supply of such counterexamples, and the Academics spent a great deal of effort developing them. (Ac. 2.42) One type illustrates cases of misidentification: for example, identical twins, eggs, statues, or imprints in wax made by the same ring. (Ac. 2.84-87) Another type involves cases of illusion, dreams and madness. (Ac. 2.88-91)

So it seems that any example of an allegedly cognitive impression offered by the Stoics can be countered by the Academics’ doppelganger or by a scenario in which some mental defect, and not the object, is responsible for the perceptual content of the impression. In either case, the Academics challenge the third characteristic of cognitive impressions listed above. This challenge is evident in Cicero’s report. (Ac. 2.83, cf. 2.40) The Academics agree with the Stoics that some impressions are true and some are false, and that false impressions are never cognitive. They also agree that if there were no differences between two impressions, then these impressions must either both be cognitive or both fail to be cognitive; that is, the perceptual content of both either guarantees their truth or fails to guarantee it. In other words, if there were no differences between the two impressions, it cannot be the case that one is cognitive and the other is not. The crucial premise, and the crux of the debate, is the Academics’ claim, contrary to (3) above, that

(4) for every true impression there may exist a false one that is identical (that is, qualitatively, not numerically).

If we grant (4), then there can be no impression whose perceptual content guarantees its truth; that is, there can be no cognitive impressions. Imagine that you have received an exceedingly clear and distinct impression of an orange. No matter how much you seek to corroborate the truth of this impression, or acquire an even clearer and more distinct impression, it may still turn out to be false. Based on the way it appears, you can never know whether it is a true impression or a false one that is qualitatively identical to the true one. The possibility of error inevitably enters if we must recognize an impression as cognitive for it to play its intended epistemic role.
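The structure of this argument can be set out schematically. What follows is only a sketch under simplifying assumptions: the symbols T, C, and ∼ are introduced here purely for illustration and appear nowhere in Cicero, and the "may exist" of (4) is read as quantifying over all possible impressions rather than only actual ones.

```latex
% Illustrative notation (an assumption of this sketch, not Cicero's own):
%   T(i)     : impression i is true
%   C(i)     : impression i is cognitive
%   i \sim j : i and j are qualitatively identical in perceptual content
\begin{align*}
\text{(P1)}\quad & \forall i\,\bigl(C(i) \rightarrow T(i)\bigr)
  && \text{false impressions are never cognitive}\\
\text{(P2)}\quad & \forall i\,\forall j\,\bigl(i \sim j \rightarrow (C(i) \leftrightarrow C(j))\bigr)
  && \text{identical impressions share cognitive status}\\
\text{(4)}\quad  & \forall i\,\bigl(T(i) \rightarrow \exists j\,(\neg T(j) \wedge i \sim j)\bigr)
  && \text{every true impression has a false twin}\\
\therefore\quad  & \forall i\,\neg C(i)
  && \text{there are no cognitive impressions}
\end{align*}
```

On this reading the conclusion follows by reductio. Suppose some impression i were cognitive. By (P1) it is true, so by (4) it has a false twin j. By (P2) j is cognitive as well, and so by (P1) j is true, which contradicts its being false. Since (P1) and (P2) are common ground between the two schools, the whole dispute turns on (4).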

In response to (4), the Stoics insisted that no two impressions can be identical (Ac. 2.50). So even though two impressions may seem identical, there will still be distinguishable features. In these sorts of cases we must sharpen our skills and refrain from assenting in the meantime (Ac. 2.56-57). But Cicero replies that it makes no difference whether the impressions are strictly identical or only indistinguishable to us (Ac. 2.85). The issue, as he understands it, is whether we are ever actually in a position to accurately identify an impression as cognitive on the basis of its perceptual content.

This interpretation may be unfair to the Stoics, however. At one point Lucullus, the spokesman for the Stoics in the Academica, compares assenting to cognitive impressions to the sinking of a scale’s balance when weight is put on it. The mind necessarily yields and cannot refrain from giving its approval to what is perspicuous. (Ac. 2.38) Sextus also attributes this view to later Stoics: when the cognitive impression lacks any obstacles it lays hold of us by the hair and practically drags us to assent. (M 7.257) In the end, assent must still be voluntary. But what these passages suggest is some sort of natural fit between cognitive impressions and our rational faculty such that cognitive impressions are, at least potentially, compelling in a way that false impressions cannot be. According to this view, cognitive impressions affect the properly trained mind in a way that is quite different from the way false impressions affect the same mind. If there is this natural fit between cognitive impressions and our rational faculty, then perhaps it is possible after all to acquire the necessary level of discernment to guarantee that one will only assent to cognitive impressions.

Even so, Cicero was apparently satisfied that the Stoics had not succeeded in showing that cognitive impressions provide us with a criterion of truth in practice. He was more convinced by the seemingly limitless supply of false impressions that we cannot currently distinguish from true ones than by the remote possibility of developing our powers of discernment to overcome such possible deceptions.

3. Indirect Arguments in Support of Stoic Epistemology in the Academica

Lucullus also presents some indirect arguments. He assumes the truth of Cicero’s Academic position (akatalêpsia, that is, the denial of the possibility of katalêpsis) and derives unacceptable consequences. There are two types of such argument: first that akatalêpsia is self-refuting or inconsistent (Ac. 2.33, 44, 58, cf. also 111), and second that akatalêpsia removes the possibility of certain sorts of successful action, especially virtuous action (Ac. 2.19-27, 32-39). These are versions of the two most often repeated arguments against virtually every ancient skeptic. In this context, however, they are specifically tailored as responses to the rejection of the Stoic criterion.

First, consider the charge that akatalêpsia is self-refuting. Lucullus remarks that the Academics’ crucial premise (4) tells us that there are (or at least may be) no differences between any given true impression and a false one. And yet the Academics also claim that some impressions are true and some are false, and this implies that there is some difference between them. (Ac. 2.44) Thus in rejecting katalêpsis, the Academics inconsistently argue that there is and there is not a difference between any given true impression and a false one.

There is an easy rejoinder available. Cicero need only claim that there are no perceptual differences between any given true impression and a false one. This is consistent with saying that there are causal differences, specifically that true impressions come from what is the case and false ones do not. Cicero does not deny that truth exists, but rather that we can grasp it with certainty. (Ac. 2.111) So the problem lies not with the world, but rather with our inability to develop our powers of discernment to the level required by the Stoic theory. No matter how much practice we may have at distinguishing eggs, there may always be a pair of eggs whose similarities exceed our ability to tell them apart.

But Lucullus’ objection is not merely that akatalêpsia entails the impossibility of correctly identifying which of my impressions are true. His objection also includes the claim that akatalêpsia entails the eradication of any adequate conception of truth. (Ac. 2.33) If we have no adequate conception of truth, however, we cannot consistently assert that some impressions are true and some are false. In other words, we should not accept that there is a real distinction between truth and falsity, right and wrong, or any other pair, unless we are confident that our corresponding conceptions of each accurately reveal this distinction. Granting this point, the difficulty for the Stoics lies in explaining why akatalêpsia entails the eradication of any adequate conception of truth in the first place.

Unfortunately, Lucullus does not elaborate on this point. But the explanation must have something to do with the Stoic view of oikeiôsis, the providential process by which Nature guides the moral and intellectual development of all human beings. In sketching the Stoic view of oikeiôsis, we will also arrive at the second sort of objection mentioned above, namely that akatalêpsia removes the possibility of certain sorts of successful action, especially virtuous action.

The Stoics believe that Nature implants in each of us a love of ourselves that is expressed in our primary and earliest drive towards self-preservation. We are naturally disposed to choose what is in accordance with our nature and reject what is opposed or harmful to it. As a result of this innate tendency, we all inevitably develop accurate conceptions (prolêpseis) of what is helpful and what is harmful with respect to self-preservation. This explains, among other things, the instinctive drive of newborns to nurse: the breast is perceived as beneficial.

These naturally developed conceptions must be veridical in keeping with the providence of nature. If they were misleading it would threaten our existence as a species, and it would be impossible to develop such faulty conceptions further into the organized bodies of knowledge exhibited in skillful activity. Nature does not guarantee that we will develop our naturally acquired conceptions into systematic bodies of knowledge and ultimately into virtuous dispositions; neither does Nature guarantee that all acorns will grow into magnificent oaks. But the raw material is provided in both cases.

Assenting to cognitive impressions is essential to the process by which we develop our naturally developed conceptions (prolêpseis) into the more precise conceptions (ennoiai) that regulate our rational judgments. For example, in De Finibus 3, Cicero’s Stoic spokesman Cato describes the process by which our natural disposition towards self-preservation is transformed into a true conception of the good. Our drive for self-preservation leads us to accurate conceptions of what is valuable or beneficial. Then, if we reason correctly about the nature of this value, we gradually discern what is genuinely valuable, the good itself. (De Finibus 3.16 ff.) But again it would not be possible to arrive at a true conception of the good if the raw material were somehow misleading.

Lucullus remarks that the mind “seizes some impressions [presumably cognitive ones] in order to make immediate use of them, others, which are the source of memory, it stores away so to speak, while all the rest it arranges by their likenesses, and thereby conceptions of things are produced…” (Ac. 2.30, tr. Long and Sedley [= LS] 40N) So we arrive at our conceptions in general by performing mental operations on sensory experience. (cf. Diogenes Laertius 7.53) If we cannot rely on the accuracy of sensory experience, that is if we deny the possibility of katalêpsis, then it will be impossible to form an accurate conception of truth, or anything else. This in turn undermines our ability to distinguish the true from the false in general.

Cognitive impressions are thus part of a natural fit between the world and our rational faculties—they indicate a basic or immediate way in which the world is intelligible to us. By denying the existence of cognitive impressions, Lucullus claims the Academics obliterate this crucial link and render the world ultimately unintelligible. They “tear out the very tools or equipment of life, or rather they actually ruin the foundations of the whole of life and rob the living being itself of the mind which gives it life…” (Ac. 2.31, tr. LS 40N) And he asks, if the conceptions that we form on the basis of our experience “were false or imprinted by the kind of impressions which were indiscernible from false ones, how on earth could we make use of them?” (Ac. 2.22, tr. LS 40M, cf. Ac. 2.19-20) Lucullus must mean “how could we successfully make use of them?”—otherwise, we could simply say “poorly and unreliably.” His question presupposes the apparent success we have had in organizing sensory experience into the systematic bodies of knowledge that are employed in skillful activities. To account for this success he thinks we must acknowledge that some impressions are cognitive.

The denial of katalêpsis also eliminates the possibility of virtue or wisdom. If we cannot form an accurate conception of the good, then we can never be sure that any of our particular actions are in fact good. Personifying wisdom, Lucullus remarks that she cannot possibly be wisdom if she is doubtful and in ignorance regarding the ultimate good which provides the measure against which we evaluate everything. (Ac. 2.24) For example, suppose I assent to the proposition that it is good for me to teach my students about skepticism. The Stoics believe that if my conception of the good is incorrect, or even if I do not know whether it is correct, the resulting action is not virtuous. It may be the right thing to do, but virtue requires that I know it is right, and that my conviction is unshakeable by any argument. Katalêpsis provides the basis for such certainty. The denial of katalêpsis thus removes the possibility of virtue.

The most obvious weakness of these objections is the extent to which they presuppose controversial elements of the Stoic system. Unless the skeptical opponent accepts these elements, the objections have no force. But Cicero does respond to these objections, perhaps because he accepts much of the Stoic system, though in the provisional way characteristic of an Academic. In his defense of the Academic position he shows how successful and skillful action and even virtue are possible without katalêpsis.

4. The Positive Fallibilism of the Philonian Academy

The development of a positive alternative to Stoic katalêpsis is generally thought to be the result of a misinterpretation of the earlier Academics’ more radical skepticism, especially Carneades’ skepticism. The radical variety makes no provisions for acquiring beliefs; having successfully refuted every available (if not possible) position, the skeptic’s only option is to suspend judgment and believe nothing. The moderate variety, by contrast, aims at acquiring the most rationally defensible position with the full awareness of one’s fallibility.

Cicero insists that Academics do not deny the existence of true impressions; they deny only the possibility of an infallible grasp of them. He offers no explicit defense for the claim that true impressions exist, but he does recognize the existence of technical expertise; the general accuracy of our impressions would then provide the best explanation for this fact. Thus far he is in agreement with Lucullus: there could be no technical expertise if there were absolutely no distinction between true and false impressions. Technical expertise seems to presuppose that most of the impressions we rely on are in fact true.

Such reliability, however, is completely independent of our ability to infallibly differentiate true from false. As long as we make a responsible and cautious use of our impressions, always allowing for the possibility of error, the occasional deception is no serious cause for alarm.

In response to the Stoic objections that akatalêpsia would lead to inaction, the Academics did suggest that we may get along very well by relying on what appears to be subjectively plausible: Arcesilaus refers to this as what is reasonable (to eulogon), and Carneades as what is plausible (to pithanon). Cicero translates these Greek terms with one of his most important philosophical coinages, probabilitas. Regardless of what his predecessors intended by their skeptical alternatives, Cicero clearly intends that probabilitas is somehow like the truth. He frequently uses probabile and veri simile interchangeably (Ac. 2.7-9, 32, 99, Tusculan Disputations 1.17, 2.5).

Furthermore, he acknowledges that probabilitas is useful both “in the conduct of life and in philosophical investigation and discussion” (Ac. 2.32). So it seems that Cicero is not concerned exclusively with explaining relatively mundane successes like our ability to navigate, or even the more noteworthy successes of science, but also the possibility of making progress philosophically. Indeed, he maintains, both in the Academic books and elsewhere, that virtue is possible without Stoic katalêpsis. This is evident in the character of the “Academic sage.”

The Academic sage “is not afraid lest he may appear to throw everything into confusion and make everything uncertain. For if a question be put to him about duty or about a number of other matters in which practice has made him an expert, he would not reply in the same way as he would if questioned as to whether the number of stars is even or odd, and say that he did not know; for in things uncertain there is nothing probable, but in things where there is probability the wise man will not be at a loss either what to do or what to answer” (Ac. 2.110, tr. by H. Rackham). Guided solely by probabilitas, the sage will plan out his entire life (Ac. 2.99).

Cicero is much less forthcoming with regard to the details of how the sage employs probabilitas in adjudicating competing philosophical claims. But that the sage does employ probabilitas in this way is evident from the fact that he accepts the denial of the possibility of katalêpsis as probable. (Ac. 2.110) Such a decision indicates that the sage has weighed both sides of the debate and arrived at his probable judgment as a result.

It is likely that Cicero is following Philo’s adaptation of Carneades’ account of how we should test our sensory impressions when in doubt. (This is most extensively reported by Sextus Empiricus, M 7.166-189; see also Ac. 2.78.) In matters of relatively little importance, or when we don’t have time for a more thorough examination, we rely on whatever seems immediately plausible. Even though unexamined, such impressions may strike us with varying degrees of force or vividness. But since every individual impression is accompanied by a host of other related impressions, we should examine these as well, time permitting. When none of these concurrent impressions seems false, or inconsistent with the impression in question, our confidence in the belief is greater. In matters of the greatest importance, especially those pertaining to our happiness, we should go a step further and examine each of the concurrent impressions individually, cross-questioning each of them on the testimony of the others. (M 7.184)

Impressions that survive this scrutiny are most credible. But the degrees of credibility have no upper limit since cross-questioning may proceed indefinitely. What the higher levels of scrutiny have in common is that they are aimed primarily at disconfirmation (M 7.189). In the end, what reveals itself as most credible is what has survived the most extensive attempts at “refutation.”

Given that Cicero sees himself as engaged in the same philosophical practice as Carneades, it is likely that disconfirmation plays the same central role in the philosophical application of probabilitas as in the empirical application of Carneades’ criterion. So to employ this fallible criterion in philosophical investigation would require a serious and sustained effort to refute the view in question. If it survives such critical scrutiny, it will appear to be like the truth. Since we are dealing with degrees of justification, approximation to the truth most likely refers to the extent to which the view in question has been rationally defended. The further assumption underlying this is that the truth cannot be refuted. Surviving serious attempts at refutation would then provide inductive evidence of the truth of that view, and the more it survives the more it will appear to be like the truth.

Unlike the empirical cases, philosophical issues typically do not force a judgment. We may reflect indefinitely on whether justice is whatever the strong say it is, whereas life-and-death, fight-or-flight judgments cannot wait. This open-endedness is reflected in Cicero’s own consideration of the dispute between the Stoics and Peripatetics on the sufficiency of virtue for a happy [eudaimôn] life. Sometimes he was swayed by the Stoics’ position that virtue can guarantee a happy life with or without external goods like health and wealth. And sometimes he was swayed by the Peripatetic view that virtue requires at least some of those external goods to secure a happy life. The fact that Cicero continued to the end of his life to struggle with this issue does not mean that he failed as an Academic. Arriving once and for all at the philosophical view that can be most consistently maintained is not required; continuing to search for it is.

5. References and Further Reading

a. Latin Texts and Translations

  • Brittain, C., tr. 2006. Cicero: On Academic Scepticism, Indianapolis: Hackett Publishing.
  • Long and Sedley, tr. 1987. The Hellenistic Philosophers, Volumes 1 and 2, Cambridge: Cambridge University Press.
    • Long and Sedley provide translations of and commentary on a good portion of the Academica. Their volumes are indispensable to the study of Hellenistic philosophy in general, and the commentary on the selections from the Academica is extremely helpful.
  • Rackham, H., tr. 1933/1994. Cicero: De Natura Deorum, Academica, Cambridge, MA: Harvard University Press.
    • This volume in the Loeb Classical Library contains the Latin text with English translation on facing pages. It is currently the only English translation available of the Academic books in their entirety (as we have them).
  • Reid, J. S. 1885. M. Tulli Ciceronis, Academica, London: Macmillan.
    • For textual analysis and philosophical commentary, Reid’s edition is still valuable.

b. Select Bibliography of Secondary Literature

  • Brittain, C. 2001. Philo of Larissa: the Last of the Academic Sceptics, Oxford: Oxford University Press.
    • Though Brittain does not deal specifically with Cicero as a philosopher, he makes extensive use of the Academic books in reconstructing the positions held by Philo as well as the history of the Academy in general. This is a very carefully researched and comprehensive book. In addition to presenting a stimulating reconstruction of Philo’s views, there is a very useful appendix containing all the testimonia on Philo along with translations.
  • Glucker, J. 1978. Antiochus and the Late Academy, Hypomnemata 56, Göttingen.
  • Glucker, J. 1988. “Cicero’s Philosophical Affiliations,” in Dillon, J.M. and A.A. Long, eds., The Question of “Eclecticism”: Studies in Later Greek Philosophy, Berkeley.
  • Glucker, J. 1995. “Probabile, Veri Simile, and Related Terms,” in Powell, ed.
  • Görler, W. 1995. “Silencing the troublemaker: De Legibus 1.39 and the Continuity of Cicero’s Scepticism,” in Powell, ed.
    • This is a response to an earlier article by Glucker which argues that Cicero changed his affiliation twice, once from a youthful adherence to the skeptical Academy to the more dogmatic position of Antiochus, and then later in life back again.
  • Mansfeld, J. and B. Inwood, eds. 1997. Assent and Argument: Studies in Cicero’s Academic Books, Leiden: Brill.
    • This and the following volume are highly recommended as a starting point for further study in Cicero’s skepticism and the late Academy in general.
  • Powell, J.G.F., ed. 1995. Cicero the Philosopher, Oxford.
  • Tarrant, H. 1985. Scepticism or Platonism, Cambridge: Cambridge University Press.

Author Information

Harald Thorsrud
Email: hthorsrud@agnesscott.edu
New Mexico State University
U. S. A.

Divine Command Theory

Philosophers both past and present have sought to defend theories of ethics that are grounded in a theistic framework. Roughly, Divine Command Theory is the view that morality is somehow dependent upon God, and that moral obligation consists in obedience to God’s commands. Divine Command Theory includes the claim that morality is ultimately based on the commands or character of God, and that the morally right action is the one that God commands or requires. The specific content of these divine commands varies according to the particular religion and the particular views of the individual divine command theorist, but all versions of the theory hold in common the claim that morality and moral obligations ultimately depend on God.

Divine Command Theory has been and continues to be highly controversial. It has been criticized by numerous philosophers, including Plato, Kai Nielsen, and J. L. Mackie. The theory also has many defenders, both classic and contemporary, such as Thomas Aquinas, Robert Adams, and Philip Quinn. The question of the possible connections between religion and ethics is of interest to moral philosophers as well as philosophers of religion, but it also leads us to consider the role of religion in society as well as the nature of moral deliberation. Given this, the arguments offered for and against Divine Command Theory have both theoretical and practical importance.

Table of Contents

  1. Modern Moral Philosophy
  2. Some Possible Advantages of Divine Command Theory
  3. A Persistent Problem: The Euthyphro Dilemma
  4. Responses to the Euthyphro Dilemma
    1. Bite the Bullet
    2. Human Nature
    3. Alston’s Advice
    4. Modified Divine Command Theory
  5. Speech Acts and Obligations to Act
  6. Ethics Without God
  7. Other Objections to Divine Command Theory
    1. The Omnipotence Objection
    2. The Omnibenevolence Objection
    3. The Autonomy Objection
    4. The Pluralism Objection
  8. Conclusion: Religion, Morality, and the Good Life
  9. References and Further Reading

1. Modern Moral Philosophy

In her influential paper, “Modern Moral Philosophy,” Elizabeth Anscombe (1958) argues that moral terms such as “should” and “ought” acquired a legalistic sense (that is, being bound by law) because of Christianity’s far-reaching historical influence and its legalistic conception of ethics. For example, use of the term “ought” seems to suggest a verdict on an action, and this in turn suggests a judge. On a law conception of ethics, conformity with the virtues requires obeying the divine law. A divine law requires the existence of God, as the divine lawgiver. Anscombe claims that since we have given up belief in God, we should also give up the moral terms and the moral understanding that derive from a theistic worldview, and engage in moral philosophy without using such terms. For Anscombe, this meant that we should abandon talk of morality as law, and instead focus on morality as virtue.

Alan Donagan (1977) argues against these conclusions. Donagan’s view is that Anscombe was mistaken on two counts. First, he rejects her claim that we can only treat morality as a system of law if we also presuppose the existence of a divine lawgiver. Second, Donagan contends that we need not abandon law-based conceptions of morality for an Aristotelian virtue ethic. The reason for this, according to Donagan, is that a divine command must express God’s reason in order for it to be expressive of a divine law. Given this, if we assume that human reason is at least in principle adequate for directing our lives, then the substance of divine law that is relevant to human life can be appreciated with human reason, apart from any reference to a divine being. Moreover, according to Donagan, even if we conceive of morality as Aristotle did, namely, as a matter of virtue, it is quite natural to think that each virtue has as its counterpart some moral rule or precept. For example, ‘to act in manner x is to be just’ has as its counterpart ‘to act in manner x is morally right’. And if we can apprehend the relevant moral virtue via human reason, then we can also apprehend the relevant moral law by that same reason. Given the foregoing points raised by Anscombe and Donagan, a divine command theorist might opt for a conception of morality as virtue, as law, or both.

Before looking at some possible advantages of Divine Command Theory, it will be helpful to clarify further the content of the view. Edward Wierenga (1989) points out that there are many ways to conceive of the connection between God and morality. A strong version of Divine Command Theory includes the claim that moral statements (x is obligatory) are defined in terms of theological statements (x is commanded by God). At the other end of the spectrum is the view that the commands of God are coextensive with the demands of morality. God’s commands do not determine morality, but rather inform us about its content. Wierenga opts for a view that lies between these strong and weak versions of Divine Command Theory. In what follows, I will, following Wierenga, take Divine Command Theory to include the following claims: (i) God in some sense determines what is moral; (ii) moral obligations are derived from God’s commands, where these commands are understood as statements of the revealed divine will.

2. Some Possible Advantages of Divine Command Theory

In his Critique of Practical Reason, Immanuel Kant, who has traditionally not been seen as an advocate of Divine Command Theory (for an opposing view see Nuyen, 1998), claims that morality requires faith in God and an afterlife. According to Kant, we must believe that God exists because the requirements of morality are too much for us to bear. We must believe that there is a God who will help us satisfy the demands of the moral law. With such a belief, we have the hope that we will be able to live moral lives. Moreover, Kant argues that “there is not the slightest ground in the moral law for a necessary connection between the morality and proportionate happiness of a being who belongs to the world as one of its parts and is thus dependent on it” (p. 131). However, if there is a God and an afterlife where the righteous are rewarded with happiness and justice obtains, this problem goes away. That is, being moral does not guarantee happiness, so we must believe in a God who will reward the morally righteous with happiness. Kant does not employ the concept of moral faith as an argument for Divine Command Theory, but a contemporary advocate could argue along Kantian lines that these advantages do accrue to this view of morality.

Another possible advantage of Divine Command Theory is that it provides an objective metaphysical foundation for morality. For those committed to the existence of objective moral truths, such truths seem to fit well within a theistic framework. That is, if the origin of the universe is a personal moral being, then objective moral truths are at home, so to speak, in the universe. By contrast, if the origin of the universe is non-moral, then the existence of such truths becomes philosophically perplexing, because it is unclear how moral properties can come into existence via non-moral origins. Given the metaphysical insight that ex nihilo nihil fit (out of nothing, nothing comes), the resulting claim is that out of the non-moral, nothing moral comes. Objective moral properties would fit awkwardly in an entirely naturalistic universe. This perspective assumes that objective moral properties exist, which is of course highly controversial.

Not only does Divine Command Theory provide a metaphysical basis for morality, but according to many it also gives us a good answer to the question, why be moral? William Lane Craig argues that this is an advantage of a view of ethics that is grounded in God. On theism, we are held accountable for our actions by God. Those who do evil will be punished, and those who live morally upstanding lives will be vindicated and even rewarded. Good, in the end, triumphs over evil. Justice will win out. Moreover, on a theistic view of ethics, we have a reason to act in ways that run counter to our self-interest, because such actions of self-sacrifice have deep significance and merit within a theistic framework. On Divine Command Theory it is therefore rational to sacrifice my own well-being for the well-being of my children, my friends, and even complete strangers, because God approves of and even commands such acts of self-sacrifice.

An important objection to the foregoing points is that there is something inadequate about a punishment and reward orientation of moral motivation. That is, one might argue that if the motive for being moral on Divine Command Theory is to merely avoid punishment and perhaps gain eternal bliss, then this is less than ideal as an account of moral motivation, because it is a mark of moral immaturity. Should we not instead seek to live moral lives in community with others because we value them and desire their happiness? In response to this, advocates of Divine Command Theory may offer different accounts of moral motivation, agreeing that a moral motivation based solely on reward and punishment is inadequate. For example, perhaps the reason to be moral is that God designed human beings to be constituted in such a way that being moral is a necessary condition for human flourishing. Some might object that this is overly egoistic, but at any rate it seems less objectionable than the motivation to be moral provided by the mere desire to avoid punishment. Augustine (see Kent, 2001) develops a view along these lines. Augustine begins with the notion that ethics is the pursuit of the supreme good, which provides the happiness that all humans seek. He then claims that the way to obtain this happiness is to love the right objects, that is, those that are worthy of our love, in the right way. In order to do this, we must love God, and then we will be able to love our friends, physical objects, and everything else in the right way and in the right amount. On Augustine’s view, love of God helps us to orient our other loves in the proper way, proportional to their value. However, even if these points in defense of Divine Command Theory are thought to be satisfactory, there is another problem looming for the view that was famously discussed by Plato over two thousand years ago.

3. A Persistent Problem: The Euthyphro Dilemma

The dialogue between Socrates and Euthyphro is nearly omnipresent in philosophical discussions of the relationship between God and ethics. In this dialogue, written by Plato (1981), who was a student of Socrates, Euthyphro and Socrates encounter each other in the king’s court. Charges have been brought against Socrates by Meletus, who claims that Socrates is guilty of corrupting the youth of Athens by leading them away from belief in the proper gods. In the course of their conversation, Socrates is surprised to discover that Euthyphro is prosecuting his own father for the murder of a servant. Euthyphro’s family is upset with him because of this, and they believe that what he is doing, prosecuting his own father, is impious. Euthyphro maintains that his family fails to understand the divine attitude to his action. This then sets the stage for a discussion of the nature of piety between Socrates and Euthyphro. In this discussion, Socrates asks Euthyphro the now philosophically famous question that he and any divine command theorist must consider: “Is the pious loved by the gods because it is pious, or is it pious because it is loved by the gods?” (p. 14).

For our purposes, it will be useful to rephrase Socrates’ question. Socrates can be understood as asking “Does God command this particular action because it is morally right, or is it morally right because God commands it?” It is in answering this question that the divine command theorist encounters a difficulty. A defender of Divine Command Theory might respond that an action is morally right because God commands it. However, the implication of this response is that if God commanded that we inflict suffering on others for fun, then doing so would be morally right. We would be obligated to do so, because God commanded it. This is because, on Divine Command Theory, the reason that inflicting such suffering is wrong is that God commands us not to do it. However, if God commanded us to inflict such suffering, doing so would become the morally right thing to do. The problem for this response to Socrates’ question, then, is that God’s commands and therefore the foundations of morality become arbitrary, which then allows for morally reprehensible actions to become morally obligatory.

Most advocates of Divine Command Theory do not want to be stuck with the implication that cruelty could possibly be morally right, nor do they want to accept the implication that the foundations of morality are arbitrary. So, a divine command theorist might avoid this problem of arbitrariness by opting for a different answer to Socrates’ question, and say that for any particular action that God commands, He commands it because it is morally right. By taking this route, the divine command theorist avoids having to accept that inflicting suffering on others for fun could be a morally right action. More generally, she avoids the arbitrariness that plagues any Divine Command Theory which includes the claim that an action is right solely because God commands it. However, two new problems now arise. If God commands a particular action because it is morally right, then ethics no longer depends on God in the way that divine command theorists maintain. God is no longer the author of ethics, but rather a mere recognizer of right and wrong. As such, God no longer serves as the foundation of ethics. Moreover, it now seems that God has become subject to an external moral law, and is no longer sovereign. John Arthur (2005) puts the point this way: “If God approves kindness because it is a virtue and hates the Nazis because they were evil, then it seems that God discovers morality rather than inventing it” (p. 20, emphasis added). God is no longer sovereign over the entire universe, but rather is subject to a moral law external to himself. The notion that God is subject to an external moral law is also a problem for theists who hold that in the great chain of being, God is at the top. Here, there is a moral law external to and higher than God, and this is a consequence that many divine command theorists would want to reject.

Hence, the advocate of a Divine Command Theory of ethics faces a dilemma: morality either rests on arbitrary foundations, or God is not the source of ethics and is subject to an external moral law, both of which allegedly compromise his supreme moral and metaphysical status.

4. Responses to the Euthyphro Dilemma

a. Bite the Bullet

One possible response to the Euthyphro Dilemma is to simply accept that if God does command cruelty, then inflicting it upon others would be morally obligatory. In Super 4 Libros Sententiarum, William of Ockham states that the actions which we call “theft” and “adultery” would be obligatory for us if God commanded us to do them. Most people find this to be an unacceptable view of moral obligation, on the grounds that any theory of ethics that leaves open the possibility that such actions are morally praiseworthy is fatally flawed. However, as Robert Adams (1987) points out, a full understanding of Ockham’s view here would emphasize that it is a mere logical possibility that God could command adultery or cruelty, and not a real possibility. That is, even if it is logically possible that God could command cruelty, it is not something that God will do, given his character in the actual world. Given this, Ockham himself was surely not prepared to inflict suffering on others if God commanded it. Even with this proviso, however, many reject this type of response to the Euthyphro Dilemma.

b. Human Nature

Another response to the Euthyphro Dilemma which is intended to avoid the problem of arbitrariness is discussed by Clark and Poortenga (2003), drawing upon the moral theory of Thomas Aquinas. If we conceive of the good life for human beings as consisting in activities and character qualities that fulfill us, then the good life will depend upon our nature, as human beings. Given human nature, some activities and character traits will fulfill us, and some will not. For example, neither drinking gasoline nor lying nor committing adultery will help us to function properly and so be fulfilled, as human beings. God created us with a certain nature. Once he has done this, he cannot arbitrarily decide what is good or bad for us, what will help or hinder us from functioning properly. God could have created us differently. That is, it is possible that he could have made us to thrive and be fulfilled by ingesting gasoline, lying, and committing adultery. But, according to Aquinas, he did no such thing. We must live lives marked by a love for God and other people, if we want to be fulfilled as human beings. The defender of this type of response to the Euthyphro Dilemma, to avoid the charge of arbitrariness, should explain why God created us with the nature that we possess, rather than some other nature. What grounded this decision? A satisfactory answer will include the claim that there is something valuable about human beings and the nature that we possess that grounded God’s decision, but it is incumbent upon the proponent of this response to defend this claim.

c. Alston’s Advice

In his “Some Suggestions for Divine Command Theorists”, William Alston (1990) offers some advice to advocates of Divine Command Theory, which Alston believes will make the view as philosophically strong as it can be. Alston formulates the Euthyphro dilemma as a question regarding which of the two following statements a divine command theorist should accept:

1. We ought to love one another because God commands us to do so.

or

2. God commands us to love one another because that is what we ought to do.

Alston’s argument is that if we interpret these statements correctly, a theist can in fact grasp both horns of this putative dilemma. One problem with opting for number 1 in the above dilemma is that it becomes difficult if not impossible to conceive of God as morally good, because if the standards of moral goodness are set by God’s commands, then the claim “God is morally good” is equivalent to “God obeys His own commands”. But this trivialization is not what we mean when we assert that God is morally good. Alston argues that a divine command theorist can avoid this problem by conceiving of God’s moral goodness as something distinct from conformity to moral obligations, and so as something distinct from conformity to divine commands. Alston summarizes his argument for this claim as follows:

…a necessary condition of the truth that ‘S ought to do A’ is at least the metaphysical possibility that S does not do A. On this view, moral obligations attach to all human beings, even those so saintly as to totally lack any tendency, in the ordinary sense of that term, to do other than what it is morally good to do. And no moral obligations attach to God, assuming, as we are here, that God is essentially perfectly good. Thus divine commands can be constitutive of moral obligations for those beings who have them without it being the case that God’s goodness consists in His obeying His own commands, or, indeed, consists in any relation whatsoever of God to His commands (p. 315).

Alston concludes that Divine Command Theory survives the first horn of the dilemma. However, in so doing, perhaps the theory is delivered a fatal blow by the dilemma’s second horn. If the divine command theorist holds that “God commands us to love our neighbor because it is morally good that we should do so,” then moral goodness is independent of God’s will and moral facts stand over God, so to speak, insofar as God is now subject to such facts. Hence, God is no longer absolutely sovereign. One response is to say that God is subject to moral principles in the same way that he is subject to logical principles, which nearly all agree does not compromise his sovereignty (See The Omnipotence Objection below). Alston prefers a different option, however, and argues that we can think of God himself as the supreme standard of goodness. God does not consult some independent Platonic realm where the objective principles of goodness exist, but rather God just acts according to his necessarily good character. But is not arbitrariness still present, insofar as it seems that it is arbitrary to take a particular individual as the standard of goodness, without reference to the individual’s conformity to general principles of goodness? In response, Alston points out that there must be a stopping point for any explanation. That is, sooner or later, when we are seeking an answer to the question “By virtue of what does good supervene on these characteristics?” we ultimately reach either a general principle or an individual paradigm. And Alston’s view is that it is no more arbitrary to invoke God as the supreme moral standard than it is to invoke some supreme moral principle. That is, the claim that good supervenes on God is no more arbitrary than the claim that it supervenes on some Platonic principle.

d. Modified Divine Command Theory

Robert Adams (1987) has offered a modified version of the Divine Command Theory, which a defender of the theory can appropriate in response to the Euthyphro Dilemma. Adams argues that a modified divine command theorist “wants to say…that an act is wrong if and only if it is contrary to God’s will or commands (assuming God loves us)” (121). Moreover, Adams claims that the following is a necessary truth: “Any action is ethically wrong if and only if it is contrary to the commands of a loving God” (132). On this modification of Divine Command Theory, actions, and perhaps intentions and individuals, possess the property of ethical wrongness, and this property is an objective property. That is, an action such as torturing someone for fun is ethically wrong, irrespective of whether anyone actually believes that it is wrong, and it is wrong because it is contrary to the commands of a loving God.

One could agree with this modification of Divine Command Theory, but disagree with the claim that it is a necessary truth that any action is ethically wrong if and only if it is contrary to the commands of a loving God. One might hold that this claim is a contingent truth, that is, that in the actual world, being contrary to the commands of a loving God is what constitutes ethical wrongness, but that there are other possible worlds in which ethical wrongness is not identified with being contrary to the commands of a loving God. It should be pointed out that for the theist who wants to argue from the existence of objective moral properties back to the existence of God, Adams’ stronger claim, namely, that an action is wrong if and only if it goes against the commands of a loving God, should be taken as a necessary truth, rather than a contingent one.

At any rate, whichever option a modified divine command theorist chooses, the modification at issue is aimed at avoiding both horns of the Euthyphro Dilemma. The first horn of the dilemma posed by Socrates to Euthyphro is that if an act is morally right because God commands it, then morality becomes arbitrary. Given this, we could be morally obligated to inflict cruelty upon others. The Modified Divine Command Theory avoids this problem, because morality is not based on the mere commands of God, but is rooted in the unchanging omnibenevolent nature of God. Hence, morality is not arbitrary nor would God command cruelty for its own sake, because God’s nature is fixed and unchanging, and to do so would violate it. It is not possible for a loving God to command cruelty for its own sake. The Modified Divine Command Theory is also thought to avoid the second horn of the Euthyphro Dilemma. God is the source of morality, because morality is grounded in the character of God. Moreover, God is not subject to a moral law that exists external to him. On the Modified Divine Command Theory, the moral law is a feature of God’s nature. Given that the moral law exists internal to God, in this sense, God is not subject to an external moral law, but rather is that moral law. God therefore retains his supreme moral and metaphysical status. Morality, for the modified divine command theorist, is ultimately grounded in the perfect nature of God.

5. Speech Acts and Obligations to Act

Philip Quinn (1978, 1998) offers the following two statements, which he takes to be equivalent:

  1. The moral law imposes the obligation that p.
  2. God commands that p.

For Quinn, then, an agent is obliged to p just in case God commands that p. God is the source of moral obligation. Quinn illustrates and expands on this claim by examining scriptural stories in which God commands some action that apparently violates a previous divine command. Consider God’s command to the Israelites to plunder the Egyptians reported in Exodus 11:2. This seems to go against God’s previous command, contained within the Ten Commandments, against theft. One response to this offered by Quinn is to claim that since theft involves taking what is not due one, and God commanded the Israelites to plunder the Egyptians, their plunder of the Egyptians does not count as theft. The divine command makes obligatory an action that would have been wrong apart from that command. Such moral power is not available to human beings, because only God has such moral authority by virtue of the divine nature.

Elsewhere, Quinn (1979) considers a different relationship between divine commands and moral obligations. Rather than equivalence, Quinn offers a causal theory in which our moral obligations are created by divine commands or acts of will: “…a sufficient causal condition that it is obligatory that p is that God commands that p, and a necessary causal condition that it is obligatory that p is that God commands that p” (312).

Quinn’s accounts lead us to the question of the relationship between speech acts and obligations to act, discussed by philosophers such as Rawls (1999) and Searle (1969). Consider the act of making a promise. If S promises R to do a, is this sufficient for S incurring an obligation to do a? On the account offered by Rawls, under certain conditions, the answer is yes. Just as rules govern games, there is a public system of rules that governs the institution of promising, such that when S promises R to do a, the rule is that S ought to do a, unless certain conditions obtain which excuse S from this obligation. If S is to make a genuine promise that is morally binding, S must be fully conscious, rational, aware of the meaning and use of the relevant words, and free from coercion. For Rawls, promising allows us to enter into stable cooperative agreements that are mutually advantageous. If the institution of promise making is just, then Rawls argues that the principle of fairness applies. For Rawls, the principle of fairness states that “a person is required to do his part as defined by the rules of an institution when two conditions are met: first, the institution is just (or fair)…and second, one has voluntarily accepted the benefits of the arrangement or taken advantage of the opportunities it offers to further one’s interests” (96). If these conditions are met, then S does incur an obligation to do a by virtue of S’s promise to R.

What implications does the above have for Divine Command Theory? Speech acts can entail obligations, as we have seen with respect to the institution of promise making. However, the case of divine commands is asymmetrical to the case of promising. That is, rather than incurring obligations by our own speech acts, Divine Command Theory tells us that we incur obligations by the communicative acts of another, namely, God. How might this work?

An advocate of Divine Command Theory might argue that some of Rawls’ points apply to the obligations created by the communicative acts of God. For example, our divine command theorist might claim that if God commands S to do a, S must do a if S meets Rawls’ demands of full consciousness, rationality, awareness of the meaning and use of the relevant words, and freedom from coercion. The principle of fairness applies and its demands are satisfied, according to our divine command theorist, because she holds that the institution of obedience to God’s commands is just and fair, given God’s nature, and because S has voluntarily accepted the benefits of this arrangement with God or taken advantage of the opportunities afforded by the arrangement to further her own interests. So, if S has consented to be a follower of a particular religion, and if the requirements of that religion are just and fair, and if S benefits from this arrangement, then S can incur obligations via divine commands. The upshot is not that the foregoing religious and metaphysical claims are true, but rather that by applying some of Rawls’ claims about promise making, we are able to recognize a possible connection between divine commands and the obligation to perform an action. In the next section, Kai Nielsen challenges the truth of these claims, as well as the overall plausibility of Divine Command Theory.

6. Ethics Without God

In his Ethics Without God, Kai Nielsen (1973) argues against the Divine Command Theory and espouses the view that morality cannot be dependent on the will of God. Nielsen advances an argument for the claim that religion and morality are logically independent. Nielsen admits that it may certainly be prudent to obey the commands of any powerful person, including God. However, it does not follow that such obedience is morally obligatory. For a command of God’s to be relevant to our moral obligations in any particular instance, God must be good. And while the religious believer does maintain that God is good, Nielsen wants to know the basis for such a belief. In response, a believer might claim that she knows God is good because the Bible teaches this, or because Jesus embodied and displayed God’s goodness, or that the world contains evidence in support of the claim that God is good. However, these responses show that the believer herself has some logically prior criterion of goodness based on something apart from the mere fact that God exists or that God created the universe. Otherwise, how does she know that her other beliefs about the Bible, Jesus, or the state of the world support her belief that God is good? Alternatively, the religious believer might simply assert that the statement “God is good” is analytic, that is, that it is a truth of language. The idea here is that we are logically prohibited from calling any entity “God” if that entity is not good in the relevant sense. In this way, the claim “God is good” is similar to the claim “Bachelors are unmarried males.” But now another problem arises for the religious believer, according to Nielsen. In order to properly refer to some entity as “God,” we must already have an understanding of what it is for something to be good. We must already possess a criterion for making judgments of moral goodness, apart from the will of God. Put another way, when we say that we know God is good we must use some independent moral criterion to ground this judgment. So, morality is not based on God because we need a criterion of goodness that is not derived from God’s nature. It follows that God and morality are independent.

Nielsen considers another possibility that remains open to the divine command theorist: she might concede that ethics does not necessarily depend on God, but maintain that God is required for the existence of an adequate morality, that is, one that satisfies our most persistent moral demands. If we take happiness to be the ultimate aim of all human activity, then the ultimate aim of all of our moral activity is also happiness. The divine command theorist can then claim that the mistake of Nielsen and other secular moralists is that they fail to see that only in God can we as human beings find ultimate and lasting happiness. God gives purpose to our lives, and we are fulfilled in loving God. Given this fact of human nature, the divine command theorist can argue that only by faith in God can we find purpose in life. Goodness may not be identical with the will of God, but loving God is the reason we exist. On this account, we need God to be fulfilled and truly happy. We are secure in the knowledge that the universe is not against us, ultimately, but rather that God will guide us, protect us, and care for us. This frees us from anxiety, and enables us to direct our lives towards genuine happiness by living according to the will of God in friendship with God. While from a secular perspective it may seem irrational to live according to an other-regarding ethic, from the viewpoint of the religious believer it is rational because it fulfills our human nature and makes us genuinely happy.

In response to this, Nielsen argues that we simply do not have evidence for the existence of God. Without such evidence, the religious believer’s claim that human nature is truly fulfilled in relationship to God is groundless (for more on the issues Nielsen raises, see Moreland and Nielsen, 1990). Moreover, people can live, have lived, and do live purposeful lives apart from belief in God. Religious faith is not necessary for having a life of purpose. Nielsen adds the skeptical doubt that human beings have any ultimate function that we must fulfill to be truly happy. We were not made for anything. This realization need not lead us to nihilism, however. For Nielsen, the notion that in order to have a purpose for our lives there must be a God trades on a confusion. Nielsen argues that even if there is no purpose of life, there can still be a purpose in life. While there may not be a purpose for humans qua humans, we can still have purpose in another sense. That is, we can have purpose in life because we have goals, intentions, and motives. Life is purposeless in the larger sense, but in this more restricted sense it is not, and so things matter to us, even if God does not exist. Life has no Purpose, but our lives can still have purpose. A divine command theorist would likely challenge Nielsen’s view that purpose in the latter sense is sufficient for human flourishing.

7. Other Objections to Divine Command Theory

a. The Omnipotence Objection

An implication of the Modified Divine Command Theory is that God would not, and indeed cannot, command cruelty for its own sake. Some would argue that this implication is inconsistent with the belief that God is omnipotent. How could there be anything that an all-powerful being cannot do?

In his discussion of the omnipotence of God, Thomas Aquinas responds to this understanding of omnipotence, and argues that it is misguided. Aquinas argues that we must consider “the precise meaning of ‘all’ when we say that God can do all things” (First Part, Question 25, Article 3). For Aquinas, to say that God can do all things is to say that he can do all things that are possible, and not those that are impossible. For example, God cannot make a round corner, because this is absolutely impossible. Since “a round corner” is a contradiction in terms, it is better to say that making a round corner cannot be done, rather than that God cannot make such a thing. This response, however, is insufficient for the issue at hand, namely, that on a Modified Divine Command Theory, God would not and cannot command cruelty for its own sake. There is no logical contradiction in terms here, as there is in the case of the round corner. Aquinas offers a further response to this sort of challenge to God’s omnipotence. His view is that “to sin is to fall short of a perfect action; which is repugnant to omnipotence” (Ibid.). For Aquinas, there is something about the nature of sin (a category in which commanding cruelty for its own sake would fall) that is contrary to omnipotence. Hence, that God cannot do immoral actions is not a limit on his power, but rather it is entailed by his omnipotence. Aquinas’ view is that God cannot command cruelty because he is omnipotent.

b. The Omnibenevolence Objection

On Divine Command Theory, it problematically appears that God’s goodness consists in God doing whatever he wills to do. This problem has been given voice by Leibniz (1951), and has recently been discussed by Quinn (1978), Wierenga (1989), Alston (1989), and Wainwright (2005). The problem is this: if what it means for an action to be morally required is that it be commanded by God, then God’s doing what he is obligated to do is equivalent to his doing what he commands himself to do. This, however, is incoherent. While it makes sense to conceive of God as forming an intention to do an action, or judging that it would be good to do an action, the notion that he commands himself to do an action is incoherent. Moreover, on Divine Command Theory, God could not be seen as possessing moral virtues, because a moral virtue would be a disposition to do an action that God commands. This is also incoherent.

In response, divine command theorists have argued that they can still make sense of God’s goodness, by pointing out that he possesses traits which are good as distinguished from being morally obligatory. For example, God may be disposed to love human beings, treat them with compassion, and deal with them fairly. These dispositions are good, even if they are not grounded in a disposition to obey God. And if we take these dispositions to be essential to God’s nature, that is, if they are possessed by God in every possible world in which God exists, then, as Wierenga (1989) points out, while it is still the case that whatever God does is good, “the range of ‘whatever God were to do’ includes no actions for which God would not be praiseworthy” (p. 222). Wainwright (2005) explains further that while it is true that the moral obligatoriness of truth telling could not have been God’s reason for commanding it, the claim that God does not have moral reasons for commanding it does not follow. This is because the moral goodness of truth telling is a sufficient reason for God to command it. Once God does command it, truth telling is not only morally good, but it also becomes morally obligatory, on Divine Command Theory.

c. The Autonomy Objection

The idea that to be morally mature, one must freely decide which moral principles will govern one’s life serves as an objection to Divine Command Theory, because on the theory it is not our own wills that govern our moral lives, but the will of God. We are no longer self-legislating beings in the moral realm, but instead followers of a moral law imposed on us from the outside. In this sense, autonomy is incompatible with Divine Command Theory, insofar as on the theory we do not impose the moral law upon ourselves. However, Adams (1999) argues that Divine Command Theory and moral responsibility are compatible, because we are responsible for obeying or not obeying God’s commands, correctly understanding and applying them, and adopting a self-critical stance with respect to what God has commanded us to do. Given this, we are autonomous because we must rely on our own independent judgments about God’s goodness and what moral laws are consistent with God’s commands. Additionally, it seems that a divine command theorist can still say that we impose the moral law on ourselves by our agreeing to subject ourselves to it once we come to understand it, even if it ultimately is grounded in God’s commands.

d. The Pluralism Objection

The last objection to note is that given the variety and number of religions in the world, how does the divine command theorist know which (putatively) divine commands to follow? The religions of the world often give conflicting accounts of the nature and content of the commands of God. Moreover, even if such a person believes that her religion is correct, there remains a plurality of understandings within religious traditions with respect to what God commands us to do. In response, some of the issues raised above regarding autonomy are relevant. A divine command theorist must decide for herself, based on the available evidence, which understanding of the divine to adopt and which understanding of divine commands within her particular tradition she finds to be the most compelling. This is similar to the activity and deliberation of a secular moralist who must also decide for herself, among a plurality of moral traditions and interpretations within those traditions, which moral principles to adopt and allow to govern her life. This takes us into another problem for divine command theory, namely, that it is only those who follow the correct religion, and the correct interpretation of that religion, that are moral, which seems highly problematic. However, Divine Command Theory is consistent with the belief that numerous religions contain moral truth, and that we can come to know our moral obligations apart from revelation, tradition, and religious practice. For example, a divine command theorist could grant that a philosophical naturalist may come to see that beneficence is intrinsically good through a rational insight into the necessary character of reality (see Austin, 2003). It is consistent with Divine Command Theory that we can come to see our obligations in this and many other ways, and not merely through a religious text, religious experience, or religious tradition.

8. Conclusion: Religion, Morality, and the Good Life

In his A Just Society (2004), Michael Boylan argues that we must engage in self-analysis for the purpose of both constructing and implementing a personal plan of life that is coherent, comprehensive, and good. In this activity, we must recognize that there are many types of values by which we live, including but not limited to religious, ethical, and aesthetic values. Of particular interest in this context is Boylan’s discussion of God’s command to Abraham to kill Isaac. Here we have a conflict between the religious and the ethical. Boylan notes that in the story, Abraham does not kill Isaac, but if he had, his community would have had to judge him to be a murderer. The reason for this is that Abraham’s community does not know whether the command to kill Isaac was a legitimate divine command, or some delusion of Abraham’s. So, this community must depend upon the ethical prohibition against murder when evaluating Abraham’s actions. Boylan’s position contrasts with Kierkegaard’s, who is generally interpreted as believing that Abraham’s action is justified by a suspension of the ethical, so that in this case the religious trumps the ethical. However, in such disputes, Boylan argues that when the commands of religion (or the values of aesthetics) clash with the demands of morality, in a just society morality should win the day.

Regardless of what one makes of this, when evaluating the philosophical merits and drawbacks of Divine Command Theory, one should take a broad perspective and consider the possible connections between the theory and other religious and moral issues, as well as the relevant aesthetic, epistemic, and metaphysical questions, in order to develop a personal plan of life that is coherent, comprehensive, and good.

9. References and Further Reading

  • Adams, Robert M. 1987. The Virtue of Faith and Other Essays in Philosophical Theology. New York: Oxford University Press.
  • Adams, Robert M. 1999. Finite and Infinite Goods. New York: Oxford University Press.
  • Alston, William. 1989. Divine Nature and Human Language: Essays in Philosophical Theology. Ithaca, N.Y.: Cornell University Press.
  • Alston, William. 1990. “Some Suggestions for Divine Command Theorists.” In Christian Theism and the Problems of Philosophy. Edited by Michael Beaty. Notre Dame, Ind.: University of Notre Dame Press: 303-326.
  • Anscombe, G. E. M. 1958. “Modern Moral Philosophy.” Philosophy 33: 1-19.
  • Arthur, John. 2005. “Morality, Religion, and Conscience.” In Morality and Moral Controversies: Readings in Moral, Social, and Political Philosophy. Edited by John Arthur. Seventh edition. Upper Saddle River, N.J.: Pearson Prentice Hall: 15-23.
  • Audi, Robert and William Wainwright. 1986. Rationality, Religious Belief, and Moral Commitment. Ithaca, N.Y.: Cornell University Press.
  • Austin, Michael W. 2003. “On the Alleged Irrationality of Ethical Intuitionism: Are Ethical Intuitions Epistemically Suspect?” Southwest Philosophy Review 19: 205-213.
  • Beaty, Michael, ed. 1990. Christian Theism and the Problems of Philosophy. Notre Dame, Ind.: University of Notre Dame Press.
  • Beaty, Michael, Carlton Fisher, and Mark Nelson, eds. 1998. Christian Theism and Moral Philosophy. Macon, Geo.: Mercer University Press.
  • Boylan, Michael. 2004. A Just Society. Lanham, Md.: Rowman and Littlefield.
  • Clark, Kelly James and Anne Poortenga. 2003. The Story of Ethics: Fulfilling Our Human Nature. Upper Saddle River, N.J.: Prentice Hall.
  • Copan, Paul. 2003. “Morality and Meaning Without God: Another Failed Attempt.” Philosophia Christi Series 2, 6: 295-304.
  • Donagan, Alan. 1977. The Theory of Morality. Chicago: The University of Chicago Press.
  • Hare, John. 1997. The Moral Gap: Kantian Ethics, Human Limits, and God’s Assistance. New York: Oxford University Press.
  • Hare, John. 2000. “Naturalism and Morality.” In Naturalism: A Critical Analysis. Edited by William Lane Craig and J. P. Moreland. New York: Routledge: 189-212.
  • Kant, Immanuel. 1993. Critique of Practical Reason. Third Edition. Translated by Lewis White Beck. Upper Saddle River, N.J.: Prentice Hall.
  • Kent, Bonnie. 2001. “Augustine’s Ethics.” In The Cambridge Companion to Augustine. Edited by Eleonore Stump and Norman Kretzmann. New York: Cambridge University Press: 205-233.
  • Kierkegaard, Søren. 1985. Fear and Trembling. Translated by Alastair Hannay. New York: Penguin.
  • Kretzmann, Norman. 1983. “Abraham, Isaac, and Euthyphro: God and the Basis of Morality.” In Hamartia: The Concept of Error in the Western Tradition: Essays in Honor of John M. Crossett. Edited by D.V. Stump, E. Stump, J.A. Arieti, and L. Gerson. New York: Edwin Mellen Press.
  • Leibniz, Gottfried Wilhelm. 1951. Theodicy. London: Routledge and Kegan Paul.
  • Mackie, J. L. 1977. Ethics: Inventing Right and Wrong. New York: Penguin Books.
  • Moreland, J. P. and Kai Nielsen. 1990. Does God Exist?: The Great Debate. Nashville: Thomas Nelson.
  • Morris, Thomas V. 1987. “Duty and Divine Goodness.” American Philosophical Quarterly 21.
  • Morris, Thomas V. 1991. Our Idea of God: An Introduction to Philosophical Theology. Notre Dame, Ind.: University of Notre Dame Press.
  • Morriston, Wes. 2001. “Must There Be a Standard of Moral Goodness Apart from God?” Philosophia Christi Series 2, 3: 127-138.
  • Murphy, Mark. 1998. “Divine Command, Divine Will, and Moral Obligation.” Faith and Philosophy 15: 3-27.
  • Nielsen, Kai. 1973. Ethics Without God. Buffalo, N.Y.: Prometheus Books.
  • Nuyen, R. T. 1998. “Is Kant a Divine Command Theorist?” History of Philosophy Quarterly 15: 441-453.
  • Plato. 1981. Five Dialogues: Euthyphro, Apology, Crito, Meno, Phaedo. Translated by G. M. A. Grube. Indianapolis, Ind.: Hackett Publishing Company.
  • Quinn, Philip L. 1978. Divine Commands and Moral Requirements. Oxford: Clarendon Press.
  • Quinn, Philip L. 1979. “Divine Command Ethics: A Causal Theory.” In Divine Command Morality: Historical and Contemporary Readings. Edited by Janine Idziak. New York: Edwin Mellen Press: 305-325.
  • Quinn, Philip. 1992. “The Primacy of God’s Will in Christian Ethics.” Philosophical Perspectives 6: 493-513.
  • Stump, Eleonore, and Norman Kretzmann. 1985. “Absolute Simplicity.” Faith and Philosophy 2: 353-382.
  • Stump, Eleonore. 2001. “Evil and the Nature of Faith.” In Seeking Understanding: The Stob Lectures 1986-1998. Grand Rapids, Mich.: Eerdmans: 530-550.
  • Thomas Aquinas, Saint. 1947. The Summa Theologica. Translated by the Fathers of the English Dominican Province.
  • Wainwright, William J. 2005. Religion and Morality. Burlington, Verm.: Ashgate.
  • Wierenga, Edward. 1983. “A Defensible Divine Command Theory.” Noûs 17: 387-407.
  • Wierenga, Edward. 1989. The Nature of God: An Inquiry into Divine Attributes. Ithaca, N.Y.: Cornell University Press.
  • William of Ockham. Super 4 Libros Sententiarum II, 19.
  • Zagzebski, Linda. 2004. Divine Motivation Theory. New York: Cambridge University Press.

Author Information

Michael W. Austin
Email: mike.austin@eku.edu
Eastern Kentucky University
U. S. A.

Contextualism in Epistemology

In very general terms, epistemological contextualism maintains that whether one knows is somehow relative to context. Certain features of contexts—features such as the intentions and presuppositions of the members of a conversational context—shape the standards that one must meet in order for one’s beliefs to count as knowledge. This allows for the possibility that different contexts set different epistemic standards, and contextualists invariably maintain that the standards do in fact vary from context to context. In some contexts, the epistemic standards are unusually high, and it is difficult, if not impossible, for our beliefs to count as knowledge in such contexts. In most contexts, however, the epistemic standards are comparatively low, and our beliefs can and often do count as knowledge in these contexts. The primary arguments for epistemological contextualism claim that contextualism best explains our epistemic judgments—it explains why we judge in most contexts that we have knowledge and why we judge in some contexts that we don’t—and that contextualism provides the best solution to puzzles generated by skeptical arguments.

Table of Contents

  1. Introduction
  2. Subjunctive Conditionals Contextualism
  3. Relevant Alternatives Contextualism and Rejecting Closure
    1. Dretske’s Relevant Alternatives Theory of Knowledge
    2. Relevant Alternatives Contextualisms that Reject Closure
  4. Relevant Alternatives Contextualism and Accepting Closure
  5. Contextualism and Epistemic Rationality
  6. Other Forms of Epistemological Contextualism
    1. Explanatory Contextualism
    2. Evidential Contextualism
    3. Contextualism as a Theory of Knowledge
  7. Objections to Contextualism
  8. Alternatives to Contextualism
  9. Conclusion
  10. References and Further Reading

1. Introduction

Epistemological contextualism has evolved primarily as a response to views that maintain that we have no knowledge of the world around us. Taking quite seriously the problems presented by skepticism, contextualists seek to resolve the apparent conflict between claims like the following:

  1. I know that I have hands.
  2. But I don’t know that I have hands if I don’t know that I’m not a brain-in-a-vat (that is, a bodiless brain that is floating in a vat of nutrients and that is electrochemically stimulated in a way that generates perceptual experiences that are exactly similar to those that I am now having in what I take to be normal circumstances).
  3. I don’t know that I’m not a brain-in-a-vat (henceforth, a BIV).

These claims, when taken together, present a puzzle. (1), (2), and (3) are independently plausible yet mutually inconsistent. That (1) is plausible seems to require no explanation. (3) is plausible because it seems that in order to know that I’m not a BIV, I must rule out the possibility that I am a BIV. Yet the BIV and I have perceptual experiences that are exactly similar—it seems to the BIV, just as it seems to me, that he has hands, that he is sitting at his desk and in front of his computer, and so on. Accordingly, my perceptual experiences give me no reason to favor the belief that I am not a BIV over the belief that I am. Thus, since I have only my perceptual experiences to go on, I cannot rule out the possibility that I’m a BIV. Considerations like these contribute to (3)’s plausibility.

Moreover, it seems that I can’t know that I have hands (and, in general, that I can’t know that I have any body at all) if I can’t rule out the possibility that I’m a bodiless BIV. This, then, contributes to the plausibility of (2). It seems in addition that (2) always retains its plausibility, no matter how high or low we set the standards for knowledge. Keith DeRose (1999a) defends this claim by noting that it is always a comparative fact that my epistemic position with respect to the claim that I’m not a BIV is at least as strong as my epistemic position with respect to the claim that I have hands. If this is correct, then (2) is true across contexts, no matter what the epistemic standards.

Yet in spite of the fact that they are independently plausible, (1), (2), and (3) are mutually inconsistent; they cannot all be true. It seems, therefore, that we must give up one of these claims. But which one should we give up, and why?
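The inconsistency can be checked mechanically. What follows is a minimal sketch in Python, assuming a purely propositional reading of the three claims. The names knows_hands and knows_not_biv are illustrative labels introduced here, not terms from the contextualist literature.

```python
from itertools import product

def claim_1(knows_hands, knows_not_biv):
    # (1) I know that I have hands.
    return knows_hands

def claim_2(knows_hands, knows_not_biv):
    # (2) If I don't know that I'm not a BIV, then I don't know that I have hands.
    return knows_not_biv or not knows_hands

def claim_3(knows_hands, knows_not_biv):
    # (3) I don't know that I'm not a BIV.
    return not knows_not_biv

def jointly_satisfiable(claims):
    # Try every assignment of truth values to the two knowledge facts.
    return any(
        all(claim(kh, knb) for claim in claims)
        for kh, knb in product([True, False], repeat=2)
    )

all_three = jointly_satisfiable([claim_1, claim_2, claim_3])
any_two = [
    jointly_satisfiable(pair)
    for pair in ([claim_1, claim_2], [claim_1, claim_3], [claim_2, claim_3])
]
```

On this reading no assignment makes all three claims true, although each pair of claims is satisfiable. That is why giving up any one of the three would remove the conflict, and why the question of which one to give up is pressing.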

In trying to answer these questions, contextualists maintain that ‘know’ either is or functions very much like an indexical, that is, an expression whose semantic content (or meaning) depends on the context of its use. For example, the word ‘here’ is an indexical. I say, “Jaime is here,” and what I mean depends on where I am when I say it. If I’m in the conference room, then I mean, all other things being equal, that Jaime is in the conference room. ‘I’ is also an indexical—its meaning depends on the context of its use and, in particular, on who is using it. When Jaime says, “I am in the conference room,” then he means, all other things being equal, that Jaime is in the conference room. Yet when Julie uses ‘I’, she means something different; Julie’s ‘I’ means Julie.

If ‘know’ is an indexical, its semantic content (or meaning) will depend on the context in which it is used. Furthermore, since context will affect the semantic content of ‘know’, context will have an effect on the semantic content of complex lexical items in which ‘know’ appears, for example, on the semantic content of knowledge attributions like ‘Jaime knows that he’s in the conference room’. Contextualists have put the point this way:

the truth-conditions of knowledge ascribing and knowledge denying sentences (sentences of the form ‘S knows that P’ and ‘S doesn’t know that P’ and related variants of such sentences) vary in certain ways according to the contexts in which they are uttered. What so varies is the epistemic standards that S must meet (or, in the case of a denial of knowledge, fail to meet) in order for such a statement to be true. (DeRose 1999a, p. 187)

Given this, contextualists maintain that (1), (2), and (3) do not in fact conflict, even though it seems that they do. They suggest, first of all, that some contexts set very high epistemic standards, standards according to which knowledge requires a great deal. Contexts in which these high standards are in play are typically those in which we are considering and taking seriously certain skeptical hypotheses. For example, in order to know anything at all about the world around us, these high standards might require us to rule out the possibility that we are BIVs, or the possibility that we are now dreaming, or the possibility that we are now being deceived by an omnipotent but malevolent demon. Yet our perceptual experiences afford us no evidence that would allow us to rule out these skeptical possibilities, for if we were BIVs, for example, we would be having exactly the same perceptual experiences that we’re now having. Thus, we fail to meet these high epistemic standards with respect both to the belief that I have hands and to the belief that I’m not a BIV. (1) is therefore false in these high-standards contexts while (3) is true. According to contextualists, then, we should reject (1) in high-standards contexts. When we do so, we are no longer faced with a conflict, for the conflict presents itself only when we insist on the truth of each of the three mutually inconsistent claims. Moreover, in rejecting (1) in high-standards contexts, contextualism gives the skeptic his due, and takes seriously the compelling nature of skeptical arguments.

Nevertheless, contextualists maintain that in most contexts, the epistemic standards are comparatively low. Typically, these are ordinary contexts in which we are considering no skeptical hypotheses. In such contexts, we can have knowledge of the world around us without eliminating skeptical possibilities like the BIV possibility. In order to know that I have a hand, for example, I need eliminate only possibilities like those in which I have no hands, or in which I have paws or claws instead of hands. Moreover, the evidence provided by my perceptual experiences—the evidence that I obtain by looking at my hands, or by hearing the sounds made when I clap them together—does allow me to eliminate these possibilities. Thus, we can meet the epistemic standards that are in place in low-standards contexts. (1) is therefore true in these contexts while (3) is false. According to contextualists, then, we should reject (3) in low-standards contexts. And here again, in rejecting (3), we keep the conflict between (1), (2), and (3) from presenting itself. Moreover, in rejecting (3) in low-standards contexts, contextualism allows us to retain our ordinary knowledge—it allows us to know the things we ordinarily take ourselves to know.

Yet if we are never actually faced with a conflict between (1), (2), and (3), why does it seem as if we are? Contextualists respond in this way: Since we most often find ourselves in low-standards contexts, we tend to evaluate knowledge attributions according to the epistemic standards that are in place in those contexts. Thus, we tend to reckon (1) true. However, since (3) makes explicit reference to BIVs, our evaluation of that claim tends to lead us to entertain the BIV skeptical scenario. Doing this can raise the epistemic standards—it can push us into a context in which the epistemic standards are quite high—and so we tend to reckon (3) true. And so it seems that we are faced with a conflict between (1), (2), and (3). Yet it merely seems as if we are faced with such a conflict. For, as we have seen, when the epistemic standards are high, (1) is false while (3) is true. But when the standards are lower, (1) is true while (3) is false.
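One way to picture this proposal is to treat each context as fixing a threshold that the strength of one’s epistemic position must meet. The sketch below is a toy model only, not the official semantics of any contextualist. The numeric strengths and thresholds are invented for illustration, and the model simply assumes the comparative fact discussed above, namely that my position with respect to not being a BIV is no weaker than my position with respect to having hands.

```python
# Hypothetical strengths of epistemic position on an arbitrary 0-to-1 scale.
STRENGTH = {"I have hands": 0.9, "I am not a BIV": 0.9}

# Hypothetical standards set by two kinds of conversational context.
STANDARD = {"ordinary": 0.5, "skeptical": 0.99}

def knows_is_true(proposition, context):
    """'S knows that <proposition>' is true in a context iff the strength of
    S's epistemic position meets that context's standard."""
    return STRENGTH[proposition] >= STANDARD[context]

def evaluate_claims(context):
    k_hands = knows_is_true("I have hands", context)
    k_not_biv = knows_is_true("I am not a BIV", context)
    return {
        1: k_hands,                   # (1) I know that I have hands.
        2: k_not_biv or not k_hands,  # (2) not-K(not-BIV) implies not-K(hands)
        3: not k_not_biv,             # (3) I don't know that I'm not a BIV.
    }

ordinary = evaluate_claims("ordinary")
skeptical = evaluate_claims("skeptical")
```

In this model (2) comes out true in both contexts, (1) is true and (3) false under the ordinary standard, and (1) is false and (3) true under the skeptical standard. At no single standard are all three claims true together.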

Contextualism also allows us to explain why it seems in certain contexts that we don’t know that we have hands (for example). We make these epistemic judgments at least partly because it’s true in such contexts that we don’t know that we have hands. And we judge in other contexts that we know that we have hands at least partly because such claims are true in those other contexts. Thus, contextualism not only helps us to see our way out of apparent conflicts like those between (1), (2), and (3), but it also helps us to explain why we make the epistemic judgments that we do.

The most prominent forms of epistemological contextualism are based either on Robert Nozick’s subjunctive conditionals account of knowledge or on the relevant alternatives theory of knowledge that is associated with Fred Dretske and Alvin Goldman. The primary difference between these two forms of contextualism is in how they characterize epistemic standards. As we will see, the former characterizes the standards in terms of subjunctive conditionals, while the latter characterizes them in terms of relevant alternatives. We will consider subjunctive conditionals contextualism in Section 2 and relevant alternatives contextualism in Sections 3 and 4. Some forms of contextualism, however, are based on neither of these theories. One such view is the version of contextualism that Stewart Cohen advocates most recently, and we will consider this view in Section 5. Let us turn now, though, to subjunctive conditionals contextualism.

2. Subjunctive Conditionals Contextualism

Keith DeRose provides an influential brand of epistemological contextualism. It is intended to solve the puzzles generated by groups of statements like the following:

  1. I know that I have hands.
  2. But I don’t know that I have hands if I don’t know that I’m not a BIV.
  3. I don’t know that I’m not a BIV.

DeRose claims that in contexts in which the standards for knowledge are unusually high, we should reject (1) and that the skeptic can truthfully say in such contexts that I don’t know that I have hands. In other contexts, however, the epistemic standards are more relaxed and we can both reject (3) and correctly say that I do know that I have hands.

DeRose’s contextualist solution seeks to explain the plausibility of (3) by utilizing resources provided by Robert Nozick. Specifically, DeRose’s solution appeals to the Subjunctive Conditionals Account (SCA) of the plausibility of (3). According to SCA, “we have a very strong general, though not exceptionless, inclination to think that we don’t know that P when we think that our belief that P is a belief we would hold even if P were false” (DeRose 1999a, p. 193). DeRose calls the belief that P insensitive if it is one that we would hold even if P were false. SCA’s generalization thus becomes: We are inclined to think that S doesn’t know that P if we think that S’s belief that P is insensitive.

DeRose claims that even though this generalization does not represent our ordinary standard for knowledge, there are contexts in which the skeptic puts it into place as the standard (for example, by mentioning skeptical possibilities like the possibility that you are now a BIV). The standard in such contexts is the skeptical standard, according to which my beliefs must be sensitive if they are to count as knowledge. When this standard is in place, as it is in skeptical contexts, I fail to know that I’m not a BIV. For my belief that I’m not a BIV is not sensitive: I would believe that I wasn’t a BIV even if I were a BIV. Moreover, since (2) is true in all contexts, it follows that I don’t know in skeptical contexts that I have hands. In this way, DeRose’s contextualism explains the plausibility of (3) and gives the skeptic his due by arguing that there are contexts in which we should reject (1).

But DeRose wants to avoid the boldly skeptical conclusion that I never know that I have hands, and he does this by arguing that in ordinary contexts of knowledge attribution—contexts in which the skeptical standard is not in place and in which the epistemic standards are comparatively low—we can reject (3). In these contexts, the skeptical standard is not in place, and our beliefs need not be sensitive in order to count as knowledge. Thus, we can truthfully assert in ordinary contexts that I do know that I have hands. And, since (2) is true in all contexts, it follows that I know in ordinary contexts that I’m not a BIV. In this way, DeRose’s contextualism explains the plausibility of rejecting (3) and allows us to retain the knowledge that we ordinarily take ourselves to have.

According to DeRose, the relevant difference between these contexts is that the standards for knowledge are quite high in skeptical contexts but comparatively low in ordinary ones. But what accounts for this difference? DeRose recognizes that he must “explain how the standards for knowledge are raised [by the skeptic]” (DeRose 1999a, p. 206) if his solution is to be adequate. Essential to this explanation is DeRose’s Rule of Sensitivity:

When someone asserts that S knows (or does not know) that P, the standards for knowledge tend to be raised, if need be, to a level such that S’s belief that P must be sensitive if it is to count as knowledge. (DeRose 1999a, p. 206)

He then provides the following explanation of how the skeptic raises the standards.

In utilizing [puzzles like those generated by (1)-(3)] to attack our putative knowledge of O [where O is a proposition that we ordinarily take ourselves to know], the skeptic instinctively chooses her skeptical hypothesis, H, so that it will have these two features: (1) We will be in at least as strong a position to know that not-H as we’re in to know that O, but (2) Any belief we might have to the effect that not-H will be an insensitive belief…. Given feature (2), the skeptic’s assertion that we don’t know that not-H, by the Rule of Sensitivity, drives the standards for knowledge up to such a point as to make that assertion true. …And since we’re in no stronger an epistemic position with respect to O than we’re in with respect to not-H (feature (1)), then, at the high standards put in place by the skeptic’s assertion of [(3)], we also fail to know that O. (DeRose 1999a, pp. 206-7)

DeRose maintains, then, that the skeptic’s assertion is the mechanism she uses to raise the standards for knowledge. When the skeptic asserts that I don’t know that I’m not a BIV, the Rule of Sensitivity is invoked, and the standards for knowledge are raised to such a level that my beliefs must be sensitive if they are to count as knowledge. And since my belief that I’m not a BIV is not sensitive (that is, since I would believe that I wasn’t a BIV even if I were a BIV), I do not know in skeptical contexts that I’m not a BIV. Thus, given the truth of (2), I do not know in skeptical contexts that I have hands (or, for that matter, anything that I ordinarily take myself to know).

Nevertheless, when no one mentions a skeptical hypothesis, the Rule of Sensitivity is not invoked, and the epistemic standards allow beliefs to count as knowledge even though they are not sensitive. This means that in ordinary contexts, we are still in a position to know the things we ordinarily take ourselves to know.
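The notions of sensitivity and of standard-raising can be illustrated with a small possible-worlds model. The sketch below rests on assumptions that go beyond the text: it represents epistemic standards as a radius of worlds within which a belief must match the facts, and the three worlds, their distances, and the function names are all hypothetical.

```python
# Each hypothetical world: (distance from actuality, facts, what I believe there).
WORLDS = [
    (0,   {"hands": True,  "not_biv": True},  {"hands": True,  "not_biv": True}),   # actual world
    (1,   {"hands": False, "not_biv": True},  {"hands": False, "not_biv": True}),   # I lost my hands
    (100, {"hands": False, "not_biv": False}, {"hands": True,  "not_biv": True}),   # I am envatted
]

def closest_world_where_false(p):
    return min((w for w in WORLDS if not w[1][p]), key=lambda w: w[0])

def is_sensitive(p):
    """Sensitive: in the closest world where p is false, I do not believe p."""
    _, _, beliefs = closest_world_where_false(p)
    return not beliefs[p]

def raise_standards(radius, asserted_p):
    """Toy Rule of Sensitivity: raise the radius, if need be, so that it
    reaches the closest world in which asserted_p is false."""
    return max(radius, closest_world_where_false(asserted_p)[0])

def counts_as_knowledge(p, radius):
    """Assumed standard: my belief about p matches the facts in every
    world within the radius."""
    return all(beliefs[p] == facts[p]
               for dist, facts, beliefs in WORLDS if dist <= radius)

ordinary_radius = 1
skeptical_radius = raise_standards(ordinary_radius, "not_biv")
```

Under these assumptions my belief that I have hands is sensitive while my belief that I’m not a BIV is not. At the ordinary radius both beliefs count as knowledge, the insensitive one included. Once the skeptic’s assertion pushes the radius out to the envatted world, neither does.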

3. Relevant Alternatives Contextualism and Rejecting Closure

Perhaps the main motivation for epistemological contextualism is now the relevant alternatives theory of knowledge. There are two kinds of relevant alternatives contextualism. One kind rejects the closure principle, according to which knowledge is closed under known implication:

If S knows that p, and knows that p implies q, then S knows that q.

The closure principle is both plausible and explanatorily valuable. For one thing, it helps to explain how we come to know things via deduction. I know, for example, that tomorrow is Saturday. I know this because I know that today is Friday and that if today is Friday then tomorrow is Saturday. The closure principle helps to account for this knowledge, and the fact that I come to know things via deduction—and in accordance with the closure principle—renders that principle both plausible and desirable.

A second kind of relevant alternatives contextualism accepts the closure principle.

In Section 3b, we will consider Mark Heller’s relevant alternatives contextualism, which represents accounts that reject the closure principle. Before examining Heller’s contextualism, however, we should consider the theory that motivates it.

a. Dretske’s Relevant Alternatives Theory of Knowledge

Fred Dretske proposes “to think of knowledge as an evidential state in which all relevant alternatives (to what is known) are eliminated” (Dretske 2000b, p. 52). This is the relevant alternatives theory of knowledge, or RA. But this leaves several questions unanswered.

First, what is an alternative to p? A proposition q is an alternative to p if and only if it cannot be true both that q and that p. Thus, the proposition that this animal is a Siberian grebe is an alternative to the proposition that it’s a Gadwall duck. For the animal cannot be both a Siberian grebe and a Gadwall duck.

Second, what is a relevant alternative to p? Dretske says that a relevant alternative is an alternative “that a person must be in a[n] evidential position to exclude (when he knows that P)” (Dretske 2000b, p. 57). But this doesn’t help very much at all. What is it about the alternatives that S must exclude that makes them such that she must exclude them? Unfortunately, there is no widely accepted response to this question. The vote seems to be split between two candidates. Some, including Dretske, say that an alternative q is relevant only if there is an objective possibility that q. But others say that q can be a relevant alternative simply because we regard q as a possibility.

Third, what does it mean to eliminate a relevant alternative? Here, too, there is disagreement. One view about elimination is the strongest view, according to which S can eliminate a relevant alternative q only if her evidence for believing not-q is strong enough to allow her to know that not-q. A proponent of RA might instead adopt the strong view, according to which S can eliminate q if her evidence for thinking that not-q is either strong enough to allow her to know that not-q or strong enough to allow her to have very good reason to believe that not-q. A proponent of RA might also adopt the weak view, according to which S can eliminate a relevant alternative q by meeting one of the following three conditions: (i) her evidence for not-q is strong enough to allow her to know that not-q, (ii) her evidence for not-q is strong enough to allow her to have very good reason to believe that not-q, or (iii) S’s belief that not-q is epistemically non-evidentially rational, where this is “a way in which it can be rational (or reasonable) [for S] to believe [that not-q] without possessing evidence for the belief” (Cohen 1988, p. 112). Some RA contextualists make it clear that they have something like the weak view in mind (see Cohen 1988 and Stine 1976), but most fail to make it clear which of the three views they adopt.

Dretske argues that I can know that p without eliminating the irrelevant alternatives to p. Moreover, he maintains that my knowing that p entails nothing whatsoever about whether I know that not-q, where q is an irrelevant alternative to p, even though not-q is a necessary consequence of p. This amounts to a denial of the closure principle. Suppose that the alternative that this is a Siberian grebe is irrelevant to my knowing that it is a Gadwall duck. Notice too that the negation of the former proposition is a necessary consequence of the latter proposition: if this is a Gadwall duck, then it is not a Siberian grebe. Dretske claims that I can know that this is a Gadwall duck even though I don’t know that it’s not a Siberian grebe. Thus, Dretske holds that the closure principle is false.

This verdict is quite controversial, however, and there is disagreement over this matter even among proponents of RA. I see the lines of this disagreement as boundaries between different kinds of RA theories, and we can classify RA theories according to whether they accept or reject closure. We might choose to do this partly because RA contextualists, as well as RA theorists in general, tend to make it clear whether they accept closure, while they do not always make it clear where they stand on other issues (e.g., on the issue of relevance and on the issue of elimination). Primarily, though, we should distinguish between RA contextualists who accept closure and those who reject it because their views about closure crucially influence how they respond to skepticism. As we shall shortly see, those who reject closure deny one of the conflicting claims, namely, (2), the claim that I don’t know that I have hands if I don’t know that I’m not a BIV. So, according to RA contextualists who reject closure, there really is no conflict at all between claims (1) and (3). But according to those who accept closure, there is such a conflict. For, by the closure principle, in contexts in which I don’t know that certain skeptical alternatives do not obtain, I also fail to know certain things about the external world.

In Section 4, we will see how RA contextualists who accept closure respond to skepticism. In the following section, however, we will examine the response provided by RA contextualists who reject closure.

b. Relevant Alternatives Contextualisms that Reject Closure

Consider the puzzle that is generated by the following argument:

  4. I don’t know that I’m not a BIV in a treeless world (that is, a BIVT).
  5. If I know that there is a tree before me (call the proposition that there is a tree before me T), and I know that T implies my not being a BIVT, then I know that I’m not a BIVT.
  6. So, I don’t know that T (given that I know that T implies my not being a BIVT).

In “Relevant Alternatives and Closure,” Mark Heller follows Dretske’s lead and argues that we can solve this skeptical puzzle by rejecting the closure principle, of which (5) is an instance.

To show why we should give up (5) (and hence the closure principle), Heller argues for a particular interpretation of RA. He claims that (5) is false if his interpretation of RA is true. He calls his interpretation Expanded Relevant Alternatives, or ERA.

(ERA) S knows that p only if S does not believe p in any of the closest not-p worlds or any more distant not-p worlds that are still close enough.

ERA accounts for our inclination to think, for example, that if I know that T, I will not believe that T in any of the closest worlds in which it’s not the case that T. In addition, ERA accounts for our inclination to think that something else is sometimes needed if I am to know that T. Imagine that “the actual world is cluttered with papier mâché tree facsimiles which S is unable to distinguish from real trees” (Heller 1999b, p. 200). In this case, we are inclined to say that S doesn’t know that T even if she doesn’t believe that T in any of the closest not-T worlds. Here, even though worlds that are cluttered with papier mâché tree facsimiles are not among the closest not-T worlds, they are close enough to the actual world to count as relevant. So Heller claims that in at least some cases, if S is to know that p, she must not believe that p in any of the close enough not-p worlds.

ERA provides the foundation for a relevant alternatives contextualism, for it allows us to see different contexts as setting different epistemic standards. Which not-p worlds count as epistemically relevant—that is, which not-p worlds count as being close enough to the actual world—will vary from context to context. And since ERA characterizes epistemic standards in terms of relevant alternatives (that is, in terms of relevant not-p worlds), it allows for the context-sensitivity of epistemic standards.

In light of this, Heller maintains, we may solve the skeptical puzzle by concluding that (5) is false. Note first of all that there are no contexts in which I know that I’m not a BIVT. Given ERA, if I am to know that I’m not a BIVT, I must not believe that I’m not a BIVT in any of the closest BIVT worlds. Thus, since I do believe that I’m not a BIVT in the closest BIVT worlds, I don’t know that I’m not a BIVT.

Nevertheless, there are contexts in which I do know that T. This is true because we use “different worlds as relevant alternatives when considering whether [I know that T] from those used when considering whether [I know that I’m not a BIVT]” (Heller 1999b, p. 197). Let C be such a context. According to ERA, I know in C that T because I don’t believe that T in any of the not-T worlds that are close enough to the actual world. (And we need consider only the close enough not-T worlds because those worlds include the closest not-T worlds.) So given that ERA is true, (5) is false: I can know that there is a tree before me (and hence evade the skeptic’s snare) even though I don’t know that I’m not a BIVT. We can therefore solve the skeptical puzzle by giving up the closure principle.

Any solution to the skeptical puzzle that denies the truth of (5) must explain why it seems to us that (5) is true. In providing this explanation, Heller argues that (5) seems true because some contexts conform to the demands of the closure principle. For example, there are contexts in which astonishingly distant not-T worlds—for example, worlds in which I am a BIVT—are close enough to the actual world to count as epistemically relevant. In those contexts, I know neither that T nor that I’m not a BIVT. For, in BIVT worlds, I believe both that T and that I’m not a BIVT. The fact that there are contexts such as these, contexts that conform to the demands of the closure principle, can make it seem that (5) is true.
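Heller’s reasoning can be traced in a small possible-worlds sketch. Everything specific in it is an assumption made for illustration: the three worlds, their distances, and the numeric close_enough threshold that stands in for a context. ERA states only a necessary condition on knowledge, so the function below reports whether that condition is met, and for simplicity nothing further is modeled.

```python
# Hypothetical worlds: name -> (distance from actuality, facts, my beliefs).
WORLDS = {
    "actual":   (0,   {"T": True,  "not_BIVT": True},  {"T": True,  "not_BIVT": True}),
    "no_tree":  (2,   {"T": False, "not_BIVT": True},  {"T": False, "not_BIVT": True}),
    "envatted": (100, {"T": False, "not_BIVT": False}, {"T": True,  "not_BIVT": True}),
}

def relevant_not_p_worlds(p, close_enough):
    """The closest not-p worlds, plus any more distant not-p worlds that
    the context still counts as close enough."""
    not_p = [w for w in WORLDS.values() if not w[1][p]]
    nearest = min(dist for dist, _, _ in not_p)
    return [w for w in not_p if w[0] == nearest or w[0] <= close_enough]

def era_condition_met(p, close_enough):
    """ERA's necessary condition: S believes p in none of the relevant not-p worlds."""
    return not any(beliefs[p]
                   for _, _, beliefs in relevant_not_p_worlds(p, close_enough))

ORDINARY = 10     # hypothetical context: only nearby worlds are close enough
SKEPTICAL = 100   # hypothetical context: even the BIVT world is close enough

tree_ordinary = era_condition_met("T", ORDINARY)
not_bivt_ordinary = era_condition_met("not_BIVT", ORDINARY)
tree_skeptical = era_condition_met("T", SKEPTICAL)
not_bivt_skeptical = era_condition_met("not_BIVT", SKEPTICAL)
```

In the ordinary context the condition is met for T but not for my not being a BIVT, because the closest worlds in which I am a BIVT are always relevant to the latter. That asymmetry is what makes the closure instance at issue fail. In the skeptical context the condition fails for both, which matches the contexts that conform to the closure principle.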

4. Relevant Alternatives Contextualism and Accepting Closure

Some relevant alternatives contextualisms accept the closure principle. In this section, we will examine the contextualist theory espoused by Stewart Cohen in his influential article “How to be a Fallibilist.” Cohen’s theory is perhaps the most prominent relevant alternatives contextualism and should be counted among the most notable of all contextualisms.

Cohen’s contextualism, like others, is intended to solve certain skeptical puzzles. The puzzle with which Cohen is concerned is familiar—it consists of three independently plausible but mutually inconsistent propositions.

  1. I know that I have hands.
  2. If I don’t know that I’m not a BIV, then I don’t know that I have hands.
  3. I don’t know that I’m not a BIV.

To solve this paradox, Cohen relies on a relevant alternatives contextualism, one that accepts the plausibility—and indeed the truth—of proposition (2), which follows from the closure principle (given that I know that my having hands implies my not being a BIV). Cohen claims that in skeptical contexts, contexts in which the BIV alternative is relevant, we should accept propositions (2) and (3) but deny proposition (1). However, in ordinary contexts, contexts in which the BIV alternative is not relevant, we should accept (1) and (2) but deny (3).

Let’s look at the details of Cohen’s account. For Cohen,

an alternative (to [some proposition] q) h is relevant (for [some person] S) = df S’s epistemic position with respect to h precludes S from knowing q. (Cohen 1988, p. 101)

Cohen also claims that there are criteria of relevance and that these criteria ought to reflect our intuitions about the conditions under which S knows that q. He says that our intuitions are influenced both by conditions that are internal and by conditions that are external to a person’s evidence. Accordingly, he offers two criteria of relevance. First, there is the external criterion.

An alternative (to p) h is relevant if the probability of h conditional on reason r and certain features of the circumstances is sufficiently high (where the level of probability that is sufficient is determined by context). (Cohen 1988, p. 102)

By this criterion, the fact that there are a number of cleverly painted mules in the zoo, whether or not I have any evidence for this fact, can be sufficient to make relevant the alternative that this is a cleverly painted mule. Presumably, if there are a number of cleverly painted mules in the zoo, it is probable to some determinate degree d that this is a cleverly painted mule rather than, say, a zebra. And according to Cohen, the context determines, for example, that probabilities of degree d* and higher are sufficiently high to render an alternative relevant. Thus, according to the external criterion, if d is greater than or equal to d*, the alternative that this is a cleverly painted mule will be relevant in this context.

Second, there is the internal criterion.

An alternative (to q) h is relevant if S lacks sufficient evidence (reason) to deny h, i.e., to believe not-h (Cohen 1988, p. 103),

where the amount of evidence that is sufficient is presumably determined by context. By this criterion, the amount of evidence that S has for her belief that this is not a cleverly painted mule can be sufficiently low to make relevant the alternative that it is a cleverly painted mule. We may again presume that S has a determinate amount of evidence a for her belief that this is not a cleverly painted mule. Here, the context determines, say, that amounts of evidence a* and lower are sufficiently low to render an alternative relevant. So if a is less than or equal to a*, the alternative that this is a cleverly painted mule will be relevant in this context.
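The two criteria can be written as simple threshold tests. The sketch below assumes that probabilities and amounts of evidence can be placed on a common numeric scale, which Cohen does not claim. The function names and the sample values are hypothetical, while d_star and a_star stand for the context-determined levels d* and a* described above.

```python
def externally_relevant(probability_of_h, d_star):
    """External criterion: h is relevant if its probability (conditional on
    S's reasons and features of the circumstances) is at least d*."""
    return probability_of_h >= d_star

def internally_relevant(evidence_against_h, a_star):
    """Internal criterion: h is relevant if S's evidence for not-h is at most a*."""
    return evidence_against_h <= a_star

def is_relevant(probability_of_h, evidence_against_h, d_star, a_star):
    # Each criterion is stated as a sufficient condition, so either one suffices.
    return (externally_relevant(probability_of_h, d_star)
            or internally_relevant(evidence_against_h, a_star))

# Hypothetical values for the alternative that this is a cleverly painted mule.
mule_probability = 0.01
evidence_not_mule = 0.8

relevant_in_ordinary_context = is_relevant(
    mule_probability, evidence_not_mule, d_star=0.2, a_star=0.5)
relevant_in_demanding_context = is_relevant(
    mule_probability, evidence_not_mule, d_star=0.2, a_star=0.95)
```

With the facts about the zoo and about S’s evidence held fixed, the painted-mule alternative is irrelevant in the first context and relevant in the second. The only thing that changes between the two calls is the threshold a_star set by the context.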

Both the internal criterion and the external criterion are sensitive to context. According to Cohen, then,

there will be no general specification of what constitutes sufficient evidence to deny an alternative in order for it not to be relevant, and as such, no general specification of what constitutes sufficient evidence to know q. Rather, this will depend on the context in which the attribution of knowledge occurs. (Cohen 1988, p. 103)

But how do the standards of relevance shift? Cohen recognizes that he must explain how this shift occurs if his contextualist solution to the skeptical paradox is to work. Because Cohen thinks of reasons as statistical in nature, he thinks that they advertise both the chance that we believe correctly on their basis and the chance that we believe erroneously on their basis. When the chances for error are highlighted, those chances become salient, and the standards for relevance shift. Thus, highlighting the chances for error allows certain alternatives to become relevant.

For example, suppose that I have reasons to believe that this is a zebra. It looks for all the world like a zebra; it is in an area of the zoo that is clearly marked “zebras”; I believe with good reason that zookeepers put only zebras in areas marked “zebras”; and so on. But perhaps someone underscores the fact that all of these reasons are compatible with this animal’s being a cleverly painted mule. Such mules look for all the world like zebras, and in a pinch even the most conscientious zookeeper might put such creatures in an area marked “zebras.” Underscoring these facts makes salient the chance that I believe erroneously on the basis of my reasons, and it makes relevant the alternative that this is a cleverly painted mule.

This suggests that, for Cohen, the standards of relevance shift whenever someone underscores the statistical nature of our reasons, whenever someone points out that there is a chance that we believe erroneously on the basis of those reasons. So, in ordinary contexts, contexts in which no one underscores the chance that I believe erroneously, that chance will not be salient, and I can know on the basis of my reasons that this is a zebra. However, in skeptical contexts, contexts in which someone does underscore the chance that I believe erroneously, that chance will be salient. In these contexts, my attention will have been focused on the chance that I am wrong, and the alternative that this is a cleverly painted mule will be relevant. Since I cannot eliminate that alternative, I do not know that this is a zebra.

Cohen suggests that his relevant alternatives contextualism allows us to solve skeptical puzzles like those that focus on zebras and cleverly painted mules. This is because his version of the relevant alternatives theory is formulated in terms of evidence, and such puzzles involve beliefs for which we can have evidence. But Cohen suggests that radical skeptical paradoxes involve beliefs for which we can have no evidence: “radical skeptical hypotheses are immune to rejection on the basis of any evidence” (Cohen 1988, p. 111). As it is, then, Cohen’s relevant alternatives contextualism seems ill equipped to resolve radical skeptical paradoxes.

To overcome this difficulty, Cohen adjusts his version of the relevant alternatives theory so that it takes into account beliefs for which I can have no evidence. He claims that for some such beliefs it is epistemically rational for me to hold them even though I possess no evidence for them. He calls beliefs of this sort intrinsically rational beliefs. Among the intrinsically rational beliefs is my belief that I’m not a BIV. According to Cohen, it is rational for me to believe that I’m not a BIV even though I have no evidence for that belief.

Taking into account intrinsically rational beliefs, Cohen amends the internal criterion of relevance. First, he says that

it is reasonable for a subject S to believe a proposition q just in case S possesses sufficient evidence in support of q, or q is intrinsically rational. (Cohen 1988, p. 113)

He then provides the following amended version of the internal criterion, or ICa:

(ICa) An alternative (to p) h is relevant if it is not sufficiently reasonable for S to deny h (to believe not-h), where, presumably, the degree of reasonableness that is sufficient is determined by context.

Cohen now notes that according to ICa, the alternative that I am a BIV is not ordinarily relevant. For my belief that I’m not a BIV is intrinsically rational. This means that the alternative that I am a BIV does not preclude me from knowing, on the basis of my reasons, that I have hands. Thus, I can know in ordinary contexts that I have hands (given both that my reasons are sufficient for my knowing that I have hands and that all relevant alternatives are eliminated). Furthermore, Cohen claims that since the standards are comparatively low in ordinary contexts, I can also know in those contexts that I’m not a BIV.

However, there are contexts in which the skeptic underscores the fact that I can have no evidence for my belief that I’m not a BIV. By doing this, the skeptic focuses my attention on the chance of error. According to Cohen, this makes relevant the alternative that I am a BIV, and I cannot eliminate that alternative. So, by the standards that apply in these skeptical contexts, I know neither that I’m not a BIV nor that I have hands. In this way, then, Cohen solves the radical skeptical puzzle while maintaining that closure holds.

5. Contextualism and Epistemic Rationality

Certain objections have led Cohen to abandon the relevant alternatives contextualism that he presents in “How to be a Fallibilist” and to revise his contextualist solution to radical skeptical paradoxes. He is most troubled by two objections. First, he is troubled by the idea that I can have evidence for my belief that I’m not a BIV. Second, he is troubled by the idea that his account commits him to the view that I can have a priori knowledge of some contingent facts, in particular, of the fact that I’m not a BIV. On the view that he presents in “How to be a Fallibilist,” I can know that I’m not a BIV solely on the basis of the intrinsic rationality of denying that I am a BIV. According to Cohen (see Cohen 1999, p. 69), this means that I can know a priori that I’m not a BIV and hence that I can have a priori knowledge of some contingent facts. These two objections have led Cohen away from his earlier relevant alternatives contextualism.

Even though Cohen now admits that I can have evidence for my belief that I’m not a BIV, he still thinks that there are beliefs for which I can never have evidence. He formulates a new radical skeptical paradox in terms of such beliefs. Cohen asks us to imagine a creature that is a BIV but will never have evidence that it is. Call such a creature a BIV*. Now, my belief that I’m not a BIV* is a belief for which I will never have evidence. We can formulate the following new paradox in terms of that belief.

  1. I know that I have hands.
  7. If I don’t know that I’m not a BIV*, then I don’t know that I have hands.
  8. I don’t know that I’m not a BIV*.

Since this paradox involves a skeptical hypothesis for which I can never have evidence, the idea that I can have evidence for my belief that I’m not a BIV* should not trouble Cohen’s solution to this new paradox.

But given that Cohen has abandoned the relevant alternatives framework, just what is his solution to the BIV* paradox? He notes first of all that my belief that I’m not a BIV* can be intrinsically rational, or what he now calls non-evidentially rational. Once again, S’s belief that p is non-evidentially rational if it is epistemically rational for S to believe that p even though S has no evidence for that belief. Furthermore, Cohen now suggests that

S knows that p if and only if her belief that p is epistemically rational to some degree d, where epistemic rationality has both an evidential and a non-evidential component, and where d is determined by context. (see Cohen 1999, pp. 63-69, 76-77)

Suppose, then, that I have a certain amount of evidence for my belief that I have hands, and that my belief that I have hands is therefore evidentially rational to degree de. Suppose too that my belief that I’m not a BIV* is non-evidentially rational to some degree dne. Cohen claims that “the non-evidential rationality [of my belief that I’m not a BIV*] is a component of the overall rationality or justification for any empirical proposition” (Cohen 1999, p. 86, fn. 36). So we may suppose that my belief that I have hands is epistemically rational to degree d*, where d* equals de plus dne.

Cohen now says that the degree to which a belief must be epistemically rational if it is to count as knowledge is “determined by some complicated function of speaker intentions, listener expectations, presuppositions of the conversation, salience relations, etc.” (Cohen 1999, p. 61). He suggests that the listeners’ cooperation is an essential part of this function. He also claims that in ordinary contexts this complicated function specifies that a belief is sufficiently epistemically rational if it is epistemically rational to degree do. And d*—the degree to which my belief that I have hands is epistemically rational—is greater than do. This means that I can know in ordinary contexts that I have hands. “And since my having a hand entails my not being a brain-in-a-vat [and a fortiori a BIV*], in those same [ordinary] contexts, my belief that I am not a brain-in-a-vat is sufficiently rational for me to know I am not a brain-in-a-vat” (Cohen 1999, p. 77). This allows him to overcome the objection that I know a priori that I’m not a BIV, for “my knowledge that I am not a brain-in-a-vat is based, in part, on my empirical evidence (the evidence that I have a hand), and so is not a priori” (Cohen 1999, p. 76). In ordinary contexts, then, we accept propositions (1) and (7) of the new radical skeptical paradox, but deny proposition (8).

But in skeptical contexts the complicated function specifies that a belief is sufficiently epistemically rational only if it is epistemically rational to degree ds. And d* is less than ds. This means that in skeptical contexts “my belief that I have a hand is not sufficiently rational for me to know I have a hand. In those same [skeptical] contexts, I have no basis for knowing I am not a brain-in-a-vat” (Cohen 1999, p. 77). In skeptical contexts, we accept propositions (7) and (8) but deny proposition (1). In this way, then, Cohen solves the BIV* paradox while maintaining that closure holds.
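The threshold structure of this account can be made concrete with a small sketch. The numerical values below are purely illustrative assumptions (Cohen assigns no numbers), and the names `overall_rationality` and `counts_as_knowledge` are hypothetical. The only features taken from the text are that d* is the sum of de and dne, that d* exceeds do, and that d* falls short of ds.

```python
# Toy model of the contextual threshold for knowledge described above.
# All numbers are illustrative assumptions, not values given by Cohen.


def overall_rationality(d_e: float, d_ne: float) -> float:
    """d* = d_e + d_ne: evidential plus non-evidential rationality."""
    return d_e + d_ne


def counts_as_knowledge(d_star: float, threshold: float) -> bool:
    """A belief counts as knowledge in a context if its overall
    rationality meets the degree that the context demands."""
    return d_star >= threshold


D_E = 0.6    # assumed evidential rationality of "I have hands"
D_NE = 0.2   # assumed non-evidential rationality of "I'm not a BIV*"
D_O = 0.7    # assumed threshold in ordinary contexts (d_o)
D_S = 0.95   # assumed threshold in skeptical contexts (d_s)

d_star = overall_rationality(D_E, D_NE)
knows_in_ordinary = counts_as_knowledge(d_star, D_O)
knows_in_skeptical = counts_as_knowledge(d_star, D_S)

print(knows_in_ordinary, knows_in_skeptical)
```

On these assumed values the same belief, with the same degree of rationality, passes the ordinary threshold and fails the skeptical one. Nothing about the believer changes between the two verdicts; only the threshold set by the context does.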

6. Other Forms of Epistemological Contextualism

Besides those already discussed, a few other forms of epistemological contextualism warrant mention. We begin with the form that belongs to Steven Rieber, which is most similar to those already considered.

a. Explanatory Contextualism

In “Skepticism and Contrastive Explanation,” Steven Rieber provides a contextualist solution to the skeptical puzzle generated when (1), (2), and (3) are considered together. He first proposes the following analysis of knowledge:

S knows that P … iff: the fact that P explains why S believes that P. (Rieber 1998, p. 194)

He next claims that his analysis of knowledge “generates the sort of context-sensitivity needed to solve the skeptical puzzle” (Rieber 1998, p. 195). He says that “what counts as an explanation is highly context-dependent. In particular, as recent work on contrastive explanation has made clear, it can depend on an implied contrast” (Rieber 1998, p. 195). For example, only those who have syphilis contract paresis, but most of those who have syphilis never get paresis. Suppose that Smith has both syphilis and paresis. We might ask

(S) Does the fact that Smith has syphilis explain why he contracted paresis?

According to Rieber, the answer to this question can depend on what is being implicitly contrasted with Smith. If there is an implied contrast with Jones, who has neither syphilis nor paresis, then we understand (S) to be asking

(J) Does the fact that Smith has syphilis explain why he rather than Jones contracted paresis?

And the answer to (J) might well be yes. However, if there is an implied contrast with Brown, who has syphilis but did not contract paresis, then we understand (S) to be asking

(B) Does the fact that Smith has syphilis explain why he rather than Brown contracted paresis?

And the answer to (B) might well be no. So it seems that whether one thing explains another can depend on context. Thus, given Rieber’s explanatory analysis of knowledge, knowledge too will be context-sensitive.
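The contrast-dependence in the paresis example can be sketched with a deliberately crude test. The sketch assumes that a factor explains why one person rather than another has an outcome only if the first person has the factor and the second lacks it. That test is a simplifying assumption made here for illustration, not Rieber's own formulation, and the function name `explains_contrast` is hypothetical.

```python
# Toy "difference" test for contrastive explanation (a simplifying
# assumption of this sketch, not Rieber's own formulation).


def explains_contrast(factor: str, subject: set, foil: set) -> bool:
    """The factor explains why the subject rather than the foil has the
    outcome only if the subject has the factor and the foil lacks it."""
    return factor in subject and factor not in foil


smith = {"syphilis", "paresis"}
jones = set()              # neither syphilis nor paresis
brown = {"syphilis"}       # syphilis but no paresis

answer_j = explains_contrast("syphilis", smith, jones)  # question (J)
answer_b = explains_contrast("syphilis", smith, brown)  # question (B)

print(answer_j, answer_b)
```

The same fact about Smith yields different answers depending on the foil, which is the kind of context-sensitivity that Rieber's analysis carries over to knowledge.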

Rieber’s analysis of knowledge seems to him to be well suited to solve the skeptical puzzle. He suggests that on his analysis of knowledge, to ask

(9) Do I know that I have hands?

is to ask

(9a) Does the fact that I have hands explain why I believe that I have hands?

Rieber claims that in ordinary contexts the answer to (9a) is clearly yes, and so I know in such contexts that I have hands. Presumably, I also know in those contexts that I’m not a BIV.

But a consideration of the BIV skeptical possibility can make salient a contrast with that possibility. When this contrast is salient, we understand (9) to be asking

(9b) Does the fact that I have hands rather than being a handless BIV explain why I believe that I have hands rather than that I am a handless BIV?

The answer to (9b) is no, for all of the evidence that I have for my belief that I have hands is compatible with my being a handless BIV. And whenever the answer to (9b) is no, so is the answer to (9). Thus, in skeptical contexts, contexts in which a contrast with the BIV possibility is salient, we should accept (3) but deny (1). The skeptic can truthfully say in such contexts that I know neither that I’m not a BIV nor that I have hands.

Rieber’s explanatory contextualism thus solves our skeptical puzzle. In ordinary contexts, we accept (1) and (2) but deny (3). I know in such contexts both that I have hands and that I’m not a BIV. However, when we consider certain skeptical possibilities, certain contrasts become salient. In these contexts, I know neither that I have hands nor that I’m not a BIV.

b. Evidential Contextualism

In “Contextualism and the Problem of the External World,” Ram Neta argues that the standards for knowledge are invariant, and therefore that we should not see the skeptic as being able to raise those standards. We ought instead to understand the skeptic to be restricting what can count as evidence. The skeptic does this, according to Neta, by exploiting the context-sensitivity of our attributions of evidence. When she brings up the BIV skeptical hypothesis, for example, the skeptic restricts what I can truthfully regard as my evidence to just those mental states that are available to me whether or not I am a BIV. That is, she prevents any of my current mental states from counting as evidence for my beliefs about the external world, thereby creating an unbridgeable (in this context, at least) epistemic gap between my evidence and my beliefs. In these contexts, my beliefs fail to meet the epistemic standard and therefore fail to count as knowledge. Still, in contexts in which I am considering no skeptical hypotheses, I can have plenty of evidence for my beliefs about the external world. In such contexts, my beliefs can meet the epistemic standards and can therefore count as knowledge. In this way, Neta’s version of contextualism, like the other versions we’ve considered, is meant to resolve familiar conflicts and to explain why we judge in most contexts that we have knowledge but why we judge in other contexts that we don’t.

c. Contextualism as a Theory of Knowledge

The last two forms of epistemological contextualism, those belonging to Michael Williams and to David Annis, have few similarities with the forms we’ve considered so far.

In his recent work, Williams argues for contextualism, which is, for him, the view that “independently of all [situational, disciplinary and other contextually variable factors], a proposition has no epistemic status whatsoever. There is no fact of the matter as to what kind of justification it either admits of or requires” (Williams 1996a, p. 119). His arguments for contextualism also count as arguments against epistemological realism, which is the view that even independently of contextual factors, there is a fact of the matter as to what kind of justification a belief requires. In particular, epistemological realism maintains the truth of the doctrine of epistemic priority (or DEP). According to DEP, our beliefs about the external world must be justified by sensory experience if they are to amount to knowledge. Williams argues that epistemological realism in general and DEP in particular are “contentious and possibly dispensable theoretical ideas about knowledge and justification” (Williams 1999b, p. 144). He also argues that skepticism depends essentially on these contentious ideas, and that, being theoretical, they are not forced on us by our ordinary ways of epistemic thinking. This suggests that skepticism is unnatural and thus that the burden of proof belongs to the skeptic. Yet since the skeptic cannot carry this burden, we have, according to Williams, no reason to take skepticism seriously.

Annis’ contextualism is meant to be an alternative both to foundationalism and to coherentism. Annis complains that both foundationalism and coherentism ignore the social nature of justification. According to his version of contextualism, then, S is justified in believing that p only if she can meet certain objections that express real doubts. These objections can include, but are not necessarily limited to, those according to which S is not in a position to know that p and those according to which p is false. We might object, for example, that since S is not reliable in situations like this, she is not in a position to know that the book on yonder shelf is brown. Thus, if S is to be justified in believing that the book is brown, she must be able to meet that objection. The justification of S’s belief that p also depends, according to Annis, on who offers certain objections and on the importance of S’s being right about p. It matters, for example, that it is S’s flight instructors, rather than her teasing friends, who object that she is unreliable when it comes to distinguishing the colors of fairly distant objects. A theory of justification that includes contextual parameters like these, Annis argues, fares better than either foundationalism or coherentism, both of which overlook the social nature of justification.

7. Objections to Contextualism

In this section, we will discuss two leading objections to epistemological contextualism. These are by no means the only criticisms that have been leveled against contextualism, but they introduce themes that have motivated additional objections as well as alternatives to contextualism. A discussion of these objections, then, should provide a starting point for exploring other objections to contextualism.

Palle Yourgrau (1983) argues that, because contextualism claims that the standards for knowledge shift from context to context, it allows for dialogues such as the following:

A: Is that a zebra?
B: Yes, it is a zebra.
A: But can you rule out its merely being a cleverly painted mule?
B: No, I can’t.
A: So you admit you didn’t know it was a zebra.
B: No, I did know then that it was a zebra. But after your question, I no longer knew.

This dialogue strikes Yourgrau as absurd, for it seems that nothing changes during the course of the conversation that would account for a change in B’s epistemic state: B is in just as good an epistemic position at the beginning of the conversation as she is at the end of the conversation, and so it seems that if B knows at the beginning, she should also know at the end. This suggests that, contrary to epistemological contextualism, we cannot effect shifts in the standards for knowledge simply by mentioning certain skeptical possibilities.

Contextualists (see DeRose 1992) have replied to this sort of objection by saying that once A introduces a skeptical possibility and thereby raises the standards for knowledge, B can no longer truly say, “I did know then that it was a zebra.” Once the standards for knowledge have been raised, the truth of any attribution of knowledge, including an attribution that is meant to apply only at some time in the past, must be judged according to those higher standards. Once the standards have been raised, B cannot both attribute knowledge to herself in the past and deny knowledge to herself in the present. She should now only deny herself knowledge; once the standards have been raised, neither B’s past self nor her present self knows that this is a zebra.

Stephen Schiffer has leveled a different sort of criticism at epistemological contextualism. Again, contextualism maintains that we attribute knowledge relative to standards that shift from context to context. This is to say, in effect, that when we say that B knows that this is a zebra, we mean that she knows relative to such-and-such an epistemic standard that this is a zebra. Putting this another way, contextualism maintains that our knowledge attributions are implicitly relative. Yet the contextualist’s response to Yourgrau’s objection suggests that B—or anyone else, for that matter—might fail to realize that our knowledge attributions are implicitly relative to an epistemic standard that shifts from context to context. Schiffer argues, however, that it is a general linguistic truth that speakers do realize that certain attributions are implicitly relative. For example, anyone who utters, “It’s raining,” in order to say that it’s raining in London knows full well that she’s asserting that it’s raining in London. Yet, according to Schiffer, when we utter, “B knows that it’s a zebra,” we typically do not take ourselves to be asserting that B knows relative to any standard. All this suggests, Schiffer argues, that the contextualist is wrong to think that our knowledge attributions are implicitly relative, and hence wrong to think that the standards for knowledge can shift from context to context.

8. Alternatives to Contextualism

Objections like these push people away from epistemological contextualism and toward theories that envisage epistemic standards that remain invariant from context to context. Two such theories present themselves as alternatives to contextualism. The first is skepticism, and the second is Mooreanism. Both skeptics and Mooreans maintain that the standards for knowledge do not shift. Yet while the skeptic claims that they are invariantly quite high, the Moorean claims that the standards are invariantly comparatively low.

The skeptic contends not only that there are no contexts in which we know that we’re not BIVs, but also that there are no contexts in which we know that we have hands (see, for example, Unger 1975 and Stone 2000). This response strikes some as implausible, however, since it does not accord with the thought that there are many contexts in which we can and do know things about the world around us.

The Moorean contends that there are never any insurmountable obstacles to our knowing both that we have hands and that we’re not BIVs.

Ernest Sosa’s Moorean response begins with the rejection of Nozick’s idea that knowledge requires sensitivity (see Section 2). He argues instead that knowledge requires safety, according to which S would believe that p only if it were the case that p (see Sosa 1999, p. 142). Moreover, both my belief that I have hands and my belief that I’m not a BIV are safe. Hence, both beliefs can always count as knowledge. Sosa says that

after all, not easily would one believe that [one was not radically deceived] without it being true … . In the actual world, and for quite a distance away from the actual world, up to quite remote possible worlds, our belief that we are not radically deceived matches the fact as to whether we are or are not radically deceived. (Sosa 1999, p. 147)

Yet if I can know across contexts that I’m not a BIV, why is it that it sometimes seems as if I don’t know that I’m not a BIV? Sosa maintains that since we can easily mistake safety for sensitivity, and since the belief that we’re not BIVs is not sensitive, it can sometimes seem to us that our belief that we’re not BIVs is not safe and thus that we don’t know that we’re not BIVs. Nevertheless, this is, according to Sosa, a mere appearance. For, since our belief is safe, we can know across contexts that we’re not BIVs and thus adopt a Moorean response to our skeptical puzzles.
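The difference between safety and sensitivity can be illustrated with a toy possible-worlds model. Everything in it is an assumption made for illustration: the numerical distances, the `radius` that marks which worlds count as nearby, and the hypothetical names `World`, `is_sensitive`, and `is_safe`. Sensitivity is modeled as the requirement that S not believe p in the closest worlds where p is false; safety is modeled as the requirement that p be true in every nearby world where S believes p.

```python
from dataclasses import dataclass


@dataclass
class World:
    distance: float   # assumed distance from the actual world
    p_true: bool      # whether p holds in this world
    believes_p: bool  # whether S believes p in this world


def is_sensitive(worlds):
    """In the closest worlds where p is false, S does not believe p."""
    not_p = [w for w in worlds if not w.p_true]
    if not not_p:
        return True  # vacuously sensitive: the model has no not-p worlds
    nearest = min(w.distance for w in not_p)
    return all(not w.believes_p for w in not_p if w.distance == nearest)


def is_safe(worlds, radius):
    """In every world within `radius` where S believes p, p is true."""
    return all(w.p_true for w in worlds
               if w.distance <= radius and w.believes_p)


# p = "I'm not a BIV": true in the actual and nearby worlds, false only
# in a remote BIV world, where S would still believe it.
not_a_biv = [
    World(0, True, True),    # actual world
    World(1, True, True),
    World(2, True, True),
    World(10, False, True),  # remote BIV world
]

sensitive = is_sensitive(not_a_biv)
safe = is_safe(not_a_biv, radius=5)

print(sensitive, safe)
```

In this model the belief that I'm not a BIV fails the sensitivity test, because I would still hold it in the BIV world, yet it passes the safety test, because that world lies far outside the assumed radius. This is the combination of verdicts on which Sosa's diagnosis relies.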

Tim Black also provides a Moorean response to these puzzles. Employing Nozick’s sensitivity requirement for knowledge, Black argues in “A Moorean Response to Brain-in-a-Vat Scepticism” that the only worlds that are relevant to whether or not S knows that p are those in which S’s belief is produced by the method that actually produces it. This means that BIV worlds (possible worlds in which S is a BIV) are not relevant to whether S knows that she’s not a BIV. For BIV worlds are worlds in which her belief is produced by a method other than the one that actually produces it. Thus, since BIV worlds are not relevant to whether S knows things about the external world, S can know both that she has hands and that she’s not a BIV. This, too, suggests a Moorean response to our skeptical puzzles.

9. Conclusion

We have now characterized epistemological contextualism in a way that allows several different theories to count as versions of that position. We have seen in particular that epistemological contextualists maintain that certain features of conversational contexts shape the standards that one must meet in order for one’s beliefs to count as knowledge. Understood in this way, a fairly wide range of views will count as versions of epistemological contextualism. Different versions will disagree over which features of conversational contexts can shape the epistemic standards, and over how the relevant contextual features help to shape those standards. Yet in spite of the differences between versions of epistemological contextualism, each seeks to achieve the valuable ends of explaining our epistemic judgments and solving the puzzles that are generated by skeptical arguments.

10. References and Further Reading

  • Annis, David. (1978) “A Contextual Theory of Epistemic Justification.” American Philosophical Quarterly 15: 213-219.
  • Austin, J. L. (1979) “Other Minds.” In Philosophical Papers, 3rd ed. New York: Oxford University Press.
  • Black, Tim. (2002a) “A Moorean Response to Brain-in-a-Vat Scepticism.” Australasian Journal of Philosophy 80: 148-163.
  • Black, Tim. (2002b) “Relevant Alternatives and the Shifting Standards for Knowledge.” Southwest Philosophy Review 18: 23-32.
  • Brueckner, Anthony. (1994) “The Shifting Content of Knowledge Attributions.” Philosophy and Phenomenological Research 54: 123-126.
  • Cohen, Stewart. (1986) “Knowledge and Context.” Journal of Philosophy 83: 574-583.
  • Cohen, Stewart. (1987) “Knowledge, Context, and Social Standards.” Synthese 73: 3-26.
  • Cohen, Stewart. (1988) “How to be a Fallibilist.” Philosophical Perspectives 2, Epistemology: 91-123.
  • Cohen, Stewart. (1998a) “Contextualist Solutions to Epistemological Problems: Scepticism, Gettier, and the Lottery.” Australasian Journal of Philosophy 76: 289-306.
  • Cohen, Stewart. (1998b) “Two Kinds of Skeptical Argument.” Philosophy and Phenomenological Research 58: 143-159.
  • Cohen, Stewart. (1999) “Contextualism, Skepticism, and the Structure of Reasons.” Philosophical Perspectives 13, Epistemology: 57-89.
  • Cohen, Stewart. (2000a) “Contextualism and Skepticism.” Philosophical Issues 10, Skepticism: 94-107.
  • Cohen, Stewart. (2000b) “Replies [to Klein, Hawthorne, and Prades].” Philosophical Issues 10, Skepticism: 132-139.
  • Cohen, Stewart. (2001) “Contextualism Defended: Comments on Richard Feldman’s ‘Skeptical Problems, Contextualist Solutions’.” Philosophical Studies 103: 87-98.
  • DeRose, Keith. (1992) “Contextualism and Knowledge Attributions.” Philosophy and Phenomenological Research 52: 913-929.
  • DeRose, Keith. (1996a) “Knowledge, Assertion and Lotteries.” Australasian Journal of Philosophy 74: 568-580.
  • DeRose, Keith. (1996b) “Relevant Alternatives and the Content of Knowledge Attributions.” Philosophy and Phenomenological Research 56: 193-197.
  • DeRose, Keith. (1999a) “Solving the Skeptical Problem.” Reprinted in Keith DeRose and Ted A. Warfield, eds., Skepticism: A Contemporary Reader. Oxford: Oxford University Press.
  • DeRose, Keith. (1999b) “Contextualism: An Explanation and Defense.” In John Greco and Ernest Sosa, eds., The Blackwell Guide to Epistemology. Oxford: Blackwell.
  • DeRose, Keith. (1999c) “Introduction: Responding to Skepticism.” In Keith DeRose and Ted A. Warfield, eds., Skepticism: A Contemporary Reader. Oxford: Oxford University Press.
  • DeRose, Keith. (2000a) “How Can We Know that We’re Not Brains In Vats?” Southern Journal of Philosophy 38 (Spindel Conference Supplement): 121-148.
  • DeRose, Keith. (2000b) “Now You Know It, Now You Don’t.” Proceedings of the Twentieth World Congress of Philosophy: Volume V, Epistemology: 91-106.
  • DeRose, Keith. (2002) “Assertion, Knowledge, and Context.” Philosophical Review 111 (2): 167-203.
  • DeRose, Keith. (2004a) “Single Scoreboard Semantics.” Philosophical Studies 119 (1-2): 1-21.
  • DeRose, Keith. (2004b) “Sosa, Safety, Sensitivity, and Skeptical Hypotheses.” In John Greco, ed., Sosa and His Critics. Oxford: Blackwell.
  • Dretske, Fred I. (1981) Knowledge and the Flow of Information. Cambridge, MA: MIT Press.
  • Dretske, Fred I. (2000a) “Epistemic Operators.” In Perception, Knowledge and Belief: Selected Essays. Cambridge: Cambridge University Press.
  • Dretske, Fred I. (2000b) “The Pragmatic Dimension of Knowledge.” In Perception, Knowledge and Belief: Selected Essays. Cambridge: Cambridge University Press.
  • Feldman, Richard. (1999) “Contextualism and Skepticism.” Philosophical Perspectives 13, Epistemology: 91-114.
  • Feldman, Richard. (2001) “Skeptical Problems, Contextualist Solutions.” Philosophical Studies 103: 61-85.
  • Fogelin, Robert J. (1999) “The Sceptic’s Burden.” International Journal of Philosophical Studies 7: 159-172.
  • Garfinkel, Alan. (1981) Forms of Explanation. New Haven: Yale University Press.
  • Goldman, Alvin I. (1976) “Discrimination and Perceptual Knowledge.” Journal of Philosophy 73: 771-791.
  • Hambourger, Robert. (1987) “Justified Assertion and the Relativity of Knowledge.” Philosophical Studies 51: 241-269.
  • Hawthorne, John. (2002) “Lewis, the Lottery and the Preface.” Analysis 62: 242-251.
  • Heller, Mark. (1989) “Relevant Alternatives.” Philosophical Studies 55: 23-40.
  • Heller, Mark. (1999a) “The Proper Role for Contextualism in an Anti-Luck Epistemology.” Philosophical Perspectives 13, Epistemology: 115-129.
  • Heller, Mark. (1999b) “Relevant Alternatives and Closure.” Australasian Journal of Philosophy 77: 196-208.
  • Hofweber, Thomas. (1999) “Contextualism and the Meaning-Intention Problem.” In Kepa Korta, Ernest Sosa, Xabier Arrazola, eds., Cognition, Agency and Rationality. Dordrecht: Kluwer.
  • Jacobson, Stephen. (2001) “Contextualism and Global Doubts about the World.” Synthese 129: 381-404.
  • Johnsen, Bredo C. (2001) “Contextualist Swords, Skeptical Plowshares.” Philosophy and Phenomenological Research 62: 385-406.
  • Klein, Peter. (2000) “Contextualism and the Real Nature of Academic Skepticism.” Philosophical Issues 10, Skepticism: 108-116.
  • Kornblith, Hilary. (2000) “The Contextualist Evasion of Epistemology.” Philosophical Issues 10, Skepticism: 24-32.
  • Lewis, David. (1979) “Scorekeeping in a Language Game.” Journal of Philosophical Logic 8: 339-359.
  • Lewis, David. (1986) “Causal Explanation.” In Philosophical Papers, Volume II. New York: Oxford University Press.
  • Lewis, David. (1996) “Elusive Knowledge.” Australasian Journal of Philosophy 74: 549-567.
  • Lipton, Peter. (1990) “Contrastive Explanation.” In Dudley Knowles, ed., Explanation and its Limits. Cambridge: Cambridge University Press.
  • Lipton, Peter. (1991) Inference to the Best Explanation. London: Routledge.
  • Neta, Ram. (2002) “S knows that p.” Noûs 36: 663-681.
  • Neta, Ram. (2003a) “Contextualism and the Problem of the External World.” Philosophy and Phenomenological Research 66: 1-31.
  • Neta, Ram. (2003b) “Skepticism, Contextualism, and Semantic Self-Knowledge.” Philosophy and Phenomenological Research 67 (2): 396-411.
  • Nozick, Robert. (1981) Philosophical Explanations. Cambridge, MA: Harvard University Press.
  • Oakley, I.T. (2001) “A Skeptic’s Reply to Lewisian Contextualism.” Canadian Journal of Philosophy 31: 309-332.
  • Pritchard, Duncan. (2000) “Closure and Context.” Australasian Journal of Philosophy 78: 275-280.
  • Pritchard, Duncan. (2001) “Contextualism, Scepticism, and the Problem of Epistemic Descent.” Dialectica 55: 327-349.
  • Pritchard, Duncan. (2002) “Recent Work on Radical Skepticism.” American Philosophical Quarterly 39: 215-257.
  • Rieber, Steven. (1998) “Skepticism and Contrastive Explanation.” Noûs 32: 189-204.
  • Rysiew, Patrick. (2001) “The Context-Sensitivity of Knowledge Attributions.” Noûs 35: 477-514.
  • Schaffer, Jonathan. (2001) “Knowledge, Relevant Alternatives and Missed Clues.” Analysis 61: 202-208.
  • Schaffer, Jonathan. (2004a) “From Contextualism to Contrastivism.” Philosophical Studies 119 (1-2): 73-104.
  • Schaffer, Jonathan. (2004b) “Skepticism, Contextualism, and Discrimination.” Philosophy and Phenomenological Research 69 (1): 138-155.
  • Schiffer, Stephen. (1996) “Contextualist Solutions to Scepticism.” Proceedings of the Aristotelian Society 96: 317-333.
  • Shatz, David. (1981) “Reliability and Relevant Alternatives.” Philosophical Studies 39: 393-408.
  • Shuger, Scott. (1983) “Knowledge and its Consequences.” American Philosophical Quarterly 20: 217-225.
  • Sosa, Ernest. (1999) “How to Defeat Opposition to Moore.” Philosophical Perspectives 13, Epistemology: 141-153.
  • Sosa, Ernest. (2000a) “Skepticism and Contextualism.” Philosophical Issues 10, Skepticism: 1-18.
  • Sosa, Ernest. (2000b) “Replies [to Tomberlin, Kornblith, and Lehrer].” Philosophical Issues 10, Skepticism: 38-41.
  • Sosa, Ernest. (2004) “Relevant Alternatives, Contextualism Included.” Philosophical Studies 119 (1-2): 35-65.
  • Stanley, Jason. (2000) “Context and Logical Form.” Linguistics and Philosophy 23: 391-434.
  • Stanley, Jason. (2004) “On the Linguistic Basis for Contextualism.” Philosophical Studies 119 (1-2): 119-146.
  • Stine, Gail. (1976) “Skepticism, Relevant Alternatives, and Deductive Closure.” Philosophical Studies 29: 249-261.
  • Stone, Jim. (2000) “Skepticism as a Theory of Knowledge.” Philosophy and Phenomenological Research 60: 527-545.
  • Stroud, Barry. (1984) The Significance of Philosophical Scepticism. New York: Oxford University Press.
  • Unger, Peter. (1975) Ignorance: A Case for Scepticism. Oxford: Oxford University Press.
  • Unger, Peter. (1984) Philosophical Relativity. Minneapolis: University of Minnesota Press.
  • Unger, Peter. (1986) “The Cone Model of Knowledge.” Philosophical Topics 14: 125-178.
  • Vogel, Jonathan. (1987) “Tracking, Closure, and Inductive Knowledge.” In Steven Luper-Foy, ed., The Possibility of Knowledge: Nozick and His Critics. Totowa, NJ: Rowman and Littlefield.
  • Vogel, Jonathan. (1990) “Are There Counterexamples to the Closure Principle?” In Glenn Ross and Michael D. Roth, eds., Doubting: Contemporary Perspectives on Skepticism. Dordrecht: Reidel Publishing Company.
  • Vogel, Jonathan. (1997) “Skepticism and Foundationalism: A Reply to Michael Williams.” Journal of Philosophical Research 22: 11-28.
  • Vogel, Jonathan. (1999) “The New Relevant Alternatives Theory.” Philosophical Perspectives 13, Epistemology: 155-180.
  • Williams, Michael. (1988) “Epistemological Realism and the Basis of Scepticism.” Mind 97: 415-439.
  • Williams, Michael. (1996a) Unnatural Doubts: Epistemological Realism and the Basis of Scepticism. Princeton, NJ: Princeton University Press.
  • Williams, Michael. (1996b) “Understanding Human Knowledge Philosophically.” Philosophy and Phenomenological Research 56: 359-378.
  • Williams, Michael. (1997) “Still Unnatural: A Reply to Vogel and Rorty.” Journal of Philosophical Research 22: 29-39.
  • Williams, Michael. (1999a) Groundless Belief: An Essay on the Possibility of Epistemology, 2nd ed. Princeton, NJ: Princeton University Press.
  • Williams, Michael. (1999b) “Fogelin’s Neo-Pyrrhonism.” International Journal of Philosophical Studies 7: 141-158.
  • Williams, Michael. (2001) “Contextualism, Externalism and Epistemic Standards.” Philosophical Studies 103: 1-23.
  • Williamson, Timothy. (2001) “Comments on Michael Williams’ ‘Contextualism, Externalism and Epistemic Standards’.” Philosophical Studies 103: 25-33.
  • Yourgrau, Palle. (1983) “Knowledge and Relevant Alternatives.” Synthese 55: 175-190.

Author Information

Tim Black
Email: tim.black@csun.edu
California State University, Northridge
U. S. A.

Edward Caird (1835—1908)

A Scottish philosopher of the latter half of the nineteenth century, Edward Caird was one of the key figures of the idealist movement that dominated British philosophy from 1870 until the mid-1920s. Best known for his studies of Kant and Hegel, he argued that “Kantian philosophy is only a first stage, though of course a necessary stage, in the transition of philosophy to higher forms of Idealism.” Caird exercised a strong influence on the second generation of idealists, such as John Watson and Bernard Bosanquet. During his long and productive life, Caird was active in university and local politics and in educational and social reform. In his two series of Gifford lectures, he developed an important evolutionary account of religious conceptions (the idea of the good, the soul, God, and the relation of God to humanity).

Table of Contents

  1. Biography
  2. Critique of Kant and Hegel
  3. Philosophical Style
  4. Evolution and Religion
  5. References and Further Reading

1. Biography

Edward Caird was born in Greenock, Scotland, on March 23, 1835. A younger brother of the theologian John Caird (1820-1898), Edward began his studies at the University of Glasgow (which he briefly abandoned due to ill health), later moving to Balliol College, Oxford, from which he graduated in 1863. Following his graduation, he became Tutor at Merton College, Oxford (1864-1866), but soon left for the Professorship of Moral Philosophy at Glasgow (1866-1893). There, in addition to carrying out his academic duties, Caird was active in university and local politics, and was responsible for establishing the study of political sciences at the University. Following the death of Benjamin Jowett (1817-1893), Caird returned to Oxford, where he served as Master of Balliol College until 1907. He was a founding fellow of the British Academy (1902), a corresponding member of the French Academy, and held honorary doctorates from the Universities of St Andrews (1883), Oxford (1891), Cambridge (1898) and Wales (1902).

Like many of the British idealists, Caird had a strong interest in classical literature. In his two volumes of Essays on Literature and Philosophy (1892), he brought together critical essays on Goethe, Rousseau, Carlyle, Dante and Wordsworth, with a discussion (in Volume II) of Cartesianism (Descartes, Malebranche and Spinoza) and metaphysics.

Caird’s politics were generally liberal and progressive. He supported the education of women, opposed the Anglo-Boer War (1899-1902) and, like Green, was involved in the ‘university settlement’ programs–particularly in Glasgow and in London–where recent university graduates and professionals attempted to narrow the gap between social classes by living and working among and with the poor.

In 1907, Caird resigned his position as Master of Balliol, and died the following year on November 1. He is buried in St Sepulchre’s Cemetery, Oxford, alongside Jowett and Green.

2. Critique of Kant and Hegel

Along with T.H. Green (1836-1882), Caird was one of the first generation of British idealists, whose philosophical work was largely a reaction to the then-dominant empiricist and associationist views of Alexander Bain (1818-1903) and J.S. Mill. Caird had, however, a gift for literary expression which Green did not possess; he was also more inclined to discuss questions by tracing the historical development of the ideas involved. And while Green died at the early age of 45, Caird enjoyed a relatively long and productive life. It is, in part, for this reason that he exercised such a strong influence (particularly on the relation of philosophy and religion) on later idealists such as John Watson (1847-1939) and Bernard Bosanquet (1848-1923). Though often considered to be Hegelian, Caird was arguably more profoundly influenced by Kant, although he was far from an uncritical reader.

Caird’s first major work was A Critical Account of the Philosophy of Kant (1877), focusing on the Critique of Pure Reason and the Prolegomena to any Future Metaphysics. It was superseded in 1889 by The Critical Philosophy of Immanuel Kant (two volumes), in which Caird wished to show the relation of the three Critiques and the continuity in the movement of Kant’s thought. In general, Caird was convinced that, though Kant had inaugurated a new era in philosophy with his attempt to integrate the a priori and the a posteriori, he failed to carry out this task fully. It was here that Caird’s idealism took over. In these volumes on Kant, Caird sought “to display in the very argument of the great metaphysician, who was supposed to have cut the world in two with a hatchet, an almost involuntary but continuous and inevitable regression towards objective organic unity.” Thus, he argued that “Kantian philosophy is only a first stage, though of course a necessary stage, in the transition of philosophy to higher forms of Idealism” (1877, p. 667).

A sympathetic exposition of Hegel’s philosophy is contained in his monograph on Hegel (1883) and, in 1885, his Social Philosophy and Religion of Comte (based on a collection of articles that had been previously published in the magazine, Contemporary Review) appeared. In these two works, Caird critically interprets these authors on lines of his own. Concerning Comte, for example, Caird writes that there cannot be a ‘religion of Humanity’ that is not, at the same time, a religion of God. In his treatment of Hegel, as of Kant, Caird’s purpose was to show that there is a center of unity to which the mind must come back out of all differences, however varied and alien in appearance. The analysis was preliminary to reconstruction.

3. Philosophical Style

Caird’s way of philosophizing differed from that of many of his contemporaries. It was consistently and even obtrusively constructive. According to Caird, “the true manner of honoring a thinker is to force oneself to understand him from his own point of view,” and only then “to submit his ideas to as objective an examination as possible.” Thus, he seized on the truths contained in the authors with whom he dealt, and was only incidentally concerned with their errors. One result, however, is that Caird’s own views are often to be found only indirectly–that is, in his exposition of, and commentary on, the views of others.

4. Evolution and Religion

Like many other idealists, such as D.G. Ritchie (1853-1903), Caird was concerned to show the relation of evolutionary theory to the development of thought and culture. His first set of Gifford lectures, The Evolution of Religion (2 volumes, 1893), deals less than his other works with an exposition of the views of other philosophers. These lectures focussed on the possibility of a science of religion and the nature of religion from Greek times, but were especially centered on the development of the Christian faith through to the Reformation. Caird shows the spiritual sense of humanity as at first dominated by the object, but constrained by its own abstractions to swing around so as to fall under the sway of the subject.

In 1904 Caird’s second set of Gifford lectures, The Evolution of Theology in the Greek Philosophers, appeared. Here, he provides again an evolutionary account of religious conceptions (e.g., the idea of the good, the soul, God, and the relation of God to humanity) toward a ‘reflective religion’ or theology. The story of Greek philosophy, which Caird considered mainly (but not exclusively) in its relation to theology, was carried from Plato through Aristotle, the Stoics, and Philo, to Plotinus and–in the final lecture–to Christian theology and St. Augustine.

In general, Caird’s views on religion were closely related to his understanding of ethics. Caird borrows from Hegel (and Goethe) the ethical idea of self-sacrifice, or “dying to live,” which was to have an important role in the work of Bosanquet. He consistently emphasized the importance of religion, and held that a genuine metaphysics must be able to provide an account of it.

5. References and Further Reading

  • The Collected Works of Edward Caird, 12 Volumes, Ed. and Introd. Colin Tyler, Bristol, UK: Thoemmes Press, 1999.
  • A Critical Account of the Philosophy of Kant, with an Historical Introduction. Glasgow: J. Maclehose, 1877.
  • The Problem of Philosophy at the Present Time: an Introductory Address Delivered to the Philosophical Society of the University of Edinburgh. Glasgow, James Maclehose & sons, 1881. (43 p.)
  • Hegel, Philadelphia: J. B. Lippincott and co.; Edinburgh: W. Blackwood and sons, 1883.
  • The Social Philosophy and Religion of Comte. Glasgow: J. Maclehose and sons, 1885. New York, Macmillan, 1885.
  • The Moral Aspect of the Economical Problem; Presidential Address to the Ethical Society. London, Swan Sonnenschein, Lowrey & Co., 1888. (18 p.)
  • The Critical Philosophy of Immanuel Kant, Glasgow: J. Maclehose & sons, 1889; New York: Macmillan, 1889. 2 v.
  • Essays on Literature and Philosophy, Glasgow, J. Maclehose and sons, 1892. 2 v. [v. 1. Dante in his relation to the theology and ethics of the Middle Ages. Goethe and philosophy. Rousseau. Wordsworth. The problem of philosophy at the present time. The genius of Carlyle; v. 2. Cartesianism. Metaphysic.]
  • The Evolution of Religion. 2 v., Glasgow: James Maclehose, 1893; New York: Macmillan, 1893. [Gifford lectures; 1890/1891-1891/1892]
  • Address on Plato’s Republic as the Earliest Educational Treatise, Delivered by Edward Caird at the Closing Ceremony of the Session 1893-94. Bangor: Jarvis & Foster, 1894 (22 p.)
  • The Evolution of Theology in the Greek Philosophers. 2 v., Glasgow: J. Maclehose and sons, 1904. [Gifford lectures, Glasgow; 1900/1901 and 1901/1902].
  • Idealism and the Theory of Knowledge. London: Henry Frowde, 1903 (14 p.)
  • Lay Sermons and Addresses : Delivered in the Hall of Balliol College, Oxford. Glasgow : J. Maclehose; New York: Macmillan, 1907.

The standard assessment of Caird’s work is:

  • The Life and Philosophy of Edward Caird by Sir Henry Jones and John Henry Muirhead. Glasgow: Maclehose, Jackson and co., 1921.

Author Information

Revised by William Sweet
U. S. A.

Transcendental Arguments

Transcendental arguments are partly non-empirical, often anti-skeptical arguments that focus on the necessary enabling conditions either of coherent experience or of the possession or employment of some kind of knowledge or cognitive ability. In such arguments the opponent is not in a position to question the fact of this experience, knowledge, or cognitive ability, and the revealed preconditions include what the opponent questions. Such arguments take as a premise some obvious fact about our mental life (such as some aspect of our knowledge, our experience, our beliefs, or our cognitive abilities) and add a claim that some other state of affairs is a necessary condition of the first one. Transcendental arguments most commonly have been deployed against a position denying the knowability of some extra-mental proposition, such as the existence of other minds or a material world. Thus these arguments characteristically center on a claim that, for some extra-mental proposition P, the indisputable truth of some general proposition Q about our mental life requires that P. The eighteenth-century Prussian philosopher Immanuel Kant is usually credited with introducing the systematic use of the transcendental argument. His use of it included arguments aimed at refuting epistemic skepticism, as well as arguments with the more fundamental purpose of showing the legitimacy of the application of certain concepts (in particular those of substance and cause) to experience. Later scholars have developed a variety of general objections to the transcendental argument strategy. In response, some recent and contemporary philosophers have offered updated strategies similar in form to transcendental arguments, but with less controversial premises and/or more modest goals.

Table of Contents

  1. Transcendental Reasoning and Skepticism
  2. The Modal Objection
  3. The Analytic/Criteriological Approach
  4. The Verificationism/Idealism Objection
  5. The Uniqueness-of-Conceptual-Framework Objection
  6. Modest Transcendental Arguments
  7. Objections to Modest Transcendental Arguments
  8. A More Modest Project for Kant
  9. Prospects for Strong Transcendental Arguments
  10. References and Further Reading

1. Transcendental Reasoning and Skepticism

“Transcendental” reasoning, for Kant, is reasoning pertaining to the necessary conditions of experience. Though he did coin the term “transcendental argument” in a different context, Kant actually did not use it to refer to transcendental arguments as they are understood today. He did sometimes use the term “transcendental deduction” for a range of arguments concerning the necessary conditions of coherent experience. Early uses of the term “transcendental argument” for arguments of this type have been noted in Charles Peirce and J. L. Austin. Often, the purpose of a transcendental argument is to answer a variety of epistemic skepticism by showing that the skeptical position itself (or its expression) implies or presupposes the possibility of the very knowledge in question. In this way, as Kant puts it in his Critique of Pure Reason, “the game played by idealism [is] turned against itself.” The skeptic is shown to presuppose the very facts he or she calls into question. (Kant also had a more modest use for transcendental arguments pertaining merely to establishing the applicability of certain fundamental concepts; see Section 8, below.)

Kant’s anti-skeptical arguments were inspired by a number of figures, but his primary concern was with what he saw as the empiricist skepticism of David Hume. In his Treatise of Human Nature, Hume argues that all ideas are derived from simple sense-impressions, simple impressions of reflection, and reflection on the mind’s operations. He goes on to argue that complex ideas of material objects are not fully grounded in the data of the senses, but are based in part on psychological propensities to pass from one idea to another. Our senses do not present us with the characteristics of mind-independence and perdurance; rather, our experience consists in sequences of impressions, some of which exhibit a resembling constancy with each other over time. To this picture, Hume argues, we must add an imaginative propensity to identify, and thus attribute continued existence to, impressions exhibiting constancy and coherency. Since the distinctness of these impressions conflicts with our propensity to identify them, we posit enduring and independent items that are responsible for various subjective impressions. One natural conclusion from this line of reasoning is that, whatever compulsion we might feel to acknowledge external, material things, neither reason nor the senses can be said to yield knowledge of such items.

Kant addresses skepticism about the material world most directly with his “Refutation of Idealism” in the second edition Critique of Pure Reason. There he argues that the possibility of recognizing the time-order of one’s own perceptions depends on the application of the concept of alteration to one’s own mental states. And in order for us to possess and apply the concept of alteration, it must be exhibited in the sensory experience of objective alteration. This experience cannot be based on patterns or regularities in experience (including its constancy and coherence), since the recognition of any such pattern depends on the organization of one’s experiences in time. The possibility of the organization of one’s own experiences in time (and even recognizing that one’s own states have a determinate time-order at all) requires relating changes in those experiences to objective alterations. Since we do make judgments about the time-order of our own experiences, we must have experienced objective alteration.

Kant’s answer to the skeptic thus takes roughly the following form:

(1) I make judgments about the temporal order of my own mental states.
(2) I could not make judgments about the temporal order of my own mental states without having experienced enduring substances independent of me undergoing alteration.
(3) Hence independent, enduring substances exist.

He thus establishes a claim to knowledge of the existence of enduring, independent objects by showing that the skeptic is committed to something (in this case, consciousness of one’s own perceptions as ordered in time) that is impossible without the existence of such objects. The skeptic thus is either committed to the existence of such things by virtue of accepting the obvious fact that we are conscious of our own perceptions as ordered in time, or presumes the existence of such things in the very attempt to raise doubt about it. This result would license the conclusion that we have knowledge of material objects, or at least that skepticism about the very existence of such items is incoherent.

Kant’s refutation of skepticism matches the template for a common understanding of the classical form of a transcendental argument:

(1) Some proposition Q about our mental life, the truth of which is immediately apparent or presumed by the skeptic’s position.
(2) The truth of some extra-mental proposition P, our knowledge of which is questioned by the skeptic, is a necessary condition of Q.
(3) Therefore P.
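
The template can also be written schematically. What follows is only an illustrative sketch under stated assumptions: the box operator is a stand-in for whatever kind of necessity the argument invokes, and the notation is introduced here for illustration rather than drawn from Kant or from the literature discussed in this article.

```latex
\begin{align*}
&(1)\quad Q && \text{premise: a fact about our mental life}\\
&(2)\quad \Box\,(Q \rightarrow P) && \text{premise: $P$ is a necessary condition of $Q$}\\
&(3)\quad Q \rightarrow P && \text{from (2): what holds necessarily holds in fact}\\
&(4)\quad P && \text{from (1) and (3), by modus ponens}
\end{align*}
```

Read this way, the inference itself is plain modus ponens, so any dispute must concern the status of premise (2).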

Transcendental arguments are further distinguished by the fact that the necessity they draw on is, characteristically, neither empirical nor analytic necessity. Rather, claims like those found in the second premise imply some claim to synthetic a priori knowledge—knowledge of substantive facts about the world derived by a priori metaphysical reasoning. If such claims were based on empirical observation, they would beg the question against most relevant forms of skepticism; if these claims were merely analytic, then it is unlikely any substantive conclusion could be derived from them.

Transcendental arguments can be characterized as demonstrations that the skeptic’s articulation of her own position is self-defeating in some way. These arguments imply that the skeptic cannot even coherently articulate a given position. Epicurus is reported to have argued that, without free choice, one assents to propositions only because one is determined to do so. Without free choice, then, it would be impossible to rationally assent to any proposition—that is, to assent to it because one has good reasons to think it is true, rather than because one must. The proposition that one has no free choice is thus self-stultifying, in that, if true, it cannot be warranted. This reasoning implies the following argument:

(1) I am able to rationally assent to the proposition that there is no free choice.
(2) I could not rationally assent to any proposition if there were no free choice.
(3) Hence, there is free choice.

Hilary Putnam (1981), drawing on his concept of content-externalism, holds that we cannot refer to brains and vats if we are brains in vats who have never actually experienced such things. If we have never had contact with external objects, our language is “Vat-English,” rather than English. Since reference, in his view, is partly determined by its context and causal history, it would be impossible for a permanent brain-in-a-vat to raise doubts about whether she is a brain in a vat. Given this theory of reference, the proposition that all persons are and have always been brains in vats is self-defeating, in that it is either false or not affirmable by anyone. Insofar as the skeptic supposes that the issue is a legitimate one to raise, she presupposes that the relevant concern is moot:

(1) I am able to raise the question as to whether all persons have always been brains in vats.
(2) I could not refer to brains in vats unless some person (that is, myself) were acquainted with such things.
(3) Hence, it is not the case that all persons have always been brains in vats.

Finally, it is an implication of Kant’s reasoning in the Refutation of Idealism that the proposition that no one has had any contact with material objects would be literally unthinkable without contact with material objects to give one a sense of an objective system of temporal relations (in turn enabling inner time-determination). If Kant is right, then such a proposition is performatively self-falsifying in the strongest sense: the possibility of the skeptic articulating her own position would prove its falsity.

2. The Modal Objection

One general objection commonly raised against transcendental arguments concerns the very type of necessity transcendental arguments rely upon. Transcendental arguments characteristically center on a claim to synthetic a priori knowledge. Take, for example, Kant’s claim that the experience of enduring objects undergoing alteration is a precondition of subjective time-consciousness. This claim is neither grounded in experience nor derivable from the meanings of the terms involved. Kant does provide some (often rather obscure) reasoning to support this claim, but that support, again, typically involves claims to synthetic a priori knowledge. Such claims have been portrayed as ultimately relying on a mysterious faculty of philosophical intuition, of insight into the natures of things not grounded in observation or experiment, the legitimacy of which is at least as doubtful as that of sensory perception or empirical inference.

3. The Analytic/Criteriological Approach

Partly in response to concerns about the modality of Kantian transcendental arguments, and in response to allied concerns about claims to synthetic a priori knowledge, Peter Strawson, Jonathan Bennett, and others have promoted a strategy structurally similar to Kant’s, but which is intended to avoid such problematic claims. Their strategy is analytic, in that it concerns relationships between beliefs or concepts and the conceptual frameworks needed to give those beliefs or concepts their content.

In Individuals, Strawson (1959) offers a transcendental argument purporting to demonstrate the existence of other minds. He argues that, to employ the concept of one’s own mind in the self-ascription of mental states, one must be able to distinguish between one’s own mental states and the mental states of others. This requires the ability to predicate mental states of both oneself and others. But, he continues, in order to employ (or understand) any general concept one needs criteria for its application. In order to ascribe mental states to oneself, then, one must be in possession of logically adequate criteria (that is to say, behavioral criteria) for ascribing mental states to others.

Strawson’s (1966) approach in The Bounds of Sense to reconstructing Kant’s Refutation of Idealism argument works similarly. His reconstruction states that, to give content to the idea of one’s being in some particular conscious state at some particular time, one needs “the idea of a system of temporal relations which comprehends more than those experiences themselves.” One’s experiences thus must be taken as experiences of things independent of oneself with their own temporal order. The idea of temporal order, he argues, cannot be gleaned from one’s own case alone; the application of the concept of temporal ordering depends on the possession and application of a concept of objectivity. But does the requirement that one have and apply the concept of an objective order guarantee that there really exists such an order? Is it not sufficient that we think there is one? Similarly, is it not sufficient for the self-ascription of mental states that we think there are other minds? Strawson’s reply rests on his “principle of significance,” which states that “there can be no legitimate, or even meaningful, employment of ideas or concepts which does not relate them to empirical or experiential conditions of their application.” One’s assessment of the analytic/criteriological approach depends on one’s assessment of this verificationism-inspired principle.

4. The Verificationism/Idealism Objection

In a much-cited essay, Barry Stroud (1968) argues that, to any claim that the truth of some proposition is a necessary condition of some fact about our mental life, the skeptic can always reply that it would be enough for it merely to appear to be true, or for us merely to believe that it is true. Transcendental arguments, he claims, at best demonstrate how things must appear, or what we must believe, rather than how things must be. Anti-skeptical transcendental arguments of familiar sorts are thus left with a gap to fill. Stroud’s contention—which is now widely accepted—is that such arguments, when aimed at refuting epistemic skepticism, can only close that gap by adverting either to a sort of verificationism or to idealism. In the case of Strawson’s arguments above, even supposing that we must be in possession of some criteria for applying concepts of other minds and/or an objective world, this fact only has anti-skeptical consequences if we also accept that there is no meaningful concept-application without experiential criteria sufficient for knowing whether the concept is instantiated. As Stroud points out, such a principle is implausible. Further, if we accepted such a principle, other aspects of transcendental arguments would be superfluous. All we would have to show is that we meaningfully employ external-world concepts; it would be impossible for any form of skepticism to be meaningful or intelligible.

As Stroud goes on to point out, another way of closing the gap between it being necessary that things appear a certain way and things being that way, would be to embrace an idealism that reduces how things are to how things appear, or must appear, to us. Kant did not rely on any verificationist principle in making the case against skepticism, but according to many scholars his “transcendental idealism” made possible the jump from how things must be experienced by us to how things must be by reducing objects of experience to mere mental representations. But such idealism is unacceptable to most: embracing idealism to answer the epistemic skeptic results in a Pyrrhic victory at best.
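
The gap Stroud identifies can be displayed schematically. This is a sketch only: B is an assumed operator standing for "we believe that" (or "it appears to us that"), and the bridge principle marked with an asterisk is a deliberately simplified stand-in for the verificationist or idealist commitments described above, not a formulation taken from Stroud.

```latex
\begin{align*}
&(1)\quad Q && \text{a fact about our mental life}\\
&(2')\quad \Box\,(Q \rightarrow B\,P) && \text{what the argument can establish, on Stroud's view}\\
&(3')\quad B\,P && \text{from (1) and (2')}\\
&(\ast)\quad B\,P \rightarrow P && \text{bridge principle (verificationism or idealism)}\\
&(4)\quad P && \text{from (3') and $(\ast)$}
\end{align*}
```

Without the bridge principle the argument stops at (3'), which is a claim about how things must appear or be believed rather than about how they are.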

Despite Stroud’s blanket assertion, it should be noted that the verificationism/idealism objection only applies on a case-by-case basis. Some arguments that take the form of transcendental arguments may have other deficiencies, but do not rely on either verificationism or idealism. A few scholars have observed that Descartes’s “Cogito, ergo sum” argument can be re-conceived as a transcendental argument:

(1) I think.
(2) In order to think “I think,” it is necessary to exist.
(3) Hence, I exist.

This argument meets the criteria for a transcendental argument: it takes a fact about one’s mental life as a premise, adds that some extra-mental fact is a necessary condition of the truth of that premise, and concludes that the extra-mental fact holds. This argument would turn on the claim that the statement, “I do not exist” (or better, the proposition that no one exists) is performatively self-defeating in the sense that the fact of its performance counts as conclusive evidence against its truth. That is what connects the mental fact (I am thinking about whether I exist) to the relevant extra-mental fact (I exist). Regardless of how this argument might fail in some other respect, it presupposes neither verificationism nor idealism in closing the gap between the internal and the external.

5. The Uniqueness-of-Conceptual-Framework Objection

Another important general objection to transcendental arguments concerns the hidden assumption requiring the uniqueness of the conceptual scheme that is held to be a precondition of experience in any given transcendental argument. Kant, for example, argues that experience is only possible if certain concepts are applied a priori in its organization, such as the concepts of substance and cause. Strawson similarly argues that experience is only possible via the application of the concept of an objective system of temporal relations. Stephan Körner (1974), however, famously characterized arguments resting on such claims as hopeless, because there is no way to establish the uniqueness of the relevant conceptual precondition. His concern is that other conceptual schemes and principles—perhaps unimaginable to us—might suffice as well. But if such schemes cannot be ruled out, then the validity of any such argument cannot be decisively established. All transcendental arguments make some claim about necessary enabling conditions. Given that the sense of necessity in question is not logical, how can the uniqueness of the enabling conditions ever be shown?

6. Modest Transcendental Arguments

In response to some of these concerns Stroud has proposed that we keep transcendental arguments, but moderate the goal we hope to achieve with them (Stroud 1994 and 1999). The goal of a “modest” transcendental argument is just to show the indispensability of some belief, concept, or conceptual framework. The conclusion such arguments hope to draw is not a refutation of some variety of epistemic skepticism via a demonstration of the alternative, but rather a demonstration of the unintelligibility of the skeptical position. The idea is that, by showing that it is impossible consistently to maintain a given position, one also shows that it is legitimate to ignore it. Arguments of this sort seek to show that beliefs about, say, an external world or other minds are indispensable to coherent experience or the use of language.

The modest strategy in replying to external-world skepticism would be to concede that one cannot prove transcendentally that there is an external world, but to show that one must believe in such a world, or presuppose such a world as part of one’s interpretive framework, as a precondition of coherent experience. This, Stroud argues, would be sufficient to entitle one to ignore external-world skepticism. We are entitled to hold a belief, according to this line of thought, if that belief can be shown to be incorrigible or invulnerable to correction.

One major advantage to modest transcendental arguments is that they are not subject to the verificationism/idealism objection. All that such arguments seek to show is that we must believe a certain way, not that the world must be a certain way. Thus there is no gap to be closed between showing that the world must appear a certain way and eliminating the possibility that the world really is not that way.

7. Objections to Modest Transcendental Arguments

Arguments relying on the relative necessity of some conceptual framework or set of beliefs, however, are subject to certain general objections. A version of Körner’s uniqueness objection still seems applicable. To provide some response to the epistemic skeptic, an indispensability argument would have to show that a given belief is indispensable as such, rather than just indispensable for us. And to do that is impossible; we can only argue for the uniqueness of a conceptual or doxastic framework on the basis of our own concepts and beliefs. As Stern (2000) puts it, if indispensability “is weaker than infallibility in so far as it leaves open the possibility that our belief that p is false, how can p be immune from doubt?; and if it is immune from doubt though possibly false, isn’t this a vice rather than a virtue?” If the “necessity” of some set of beliefs or conceptual framework just follows from our own inability to think outside that framework, then the discovery of this necessity is just a discovery about our own limitations, rather than a discovery about the world around us. Indispensability may indeed be all a modest transcendental argument needs to show that skepticism is inert (for us), but is this an interesting result if it stems just from our own incapacities?

This kind of concern is reflected in a challenge to the classical claim that radical skepticism about reason is self-defeating. How can we know that logical inference really is truth-preserving? How can we know that the principle of non-contradiction is true? It would seem that such a skeptical position is unanswerable, because any answer involves argument, which presupposes the validity of deductive inference. But, as Aristotle first suggested in his Metaphysics, when one makes a statement asserting the impossibility of rationally supporting any claim one makes, one presupposes the theoretical possibility of claims being rationally supported (cf. Meynell 1984). The framework under which we suppose that it is possible to rationally support claims is, in other words, indispensable, and the belief that it is possible to do so is invulnerable. This argument is, effectively, a modest transcendental argument.

But why can’t the skeptic make the same point while limiting herself to asking for proof of the universal and necessary validity of deductive inference? The skeptic need not on this approach make some claim to the effect that statements may not be rationally supportable (a claim, in other words, that itself calls for support). An inherent inconsistency in the affirmation of some such claim need not, then, be a concern (see Fowler 1987). In asking for proof, of course, the skeptic in some way implies that there is at least some prima facie doubt with regard to the operation of reason in finding truth. So in that way the skeptic must be implying at least a prima facie possibility that reason is inadequate to that task.

A modest transcendental argument establishing the indispensability of a conceptual framework has the effect of reducing the skeptic either to inconsistency or to raising doubts in the abstract. Since the alternative is inconceivable, the skeptic cannot consistently commit to the possibility of the alternative. Yet it seems too quick to go directly from showing that some conceptual framework is necessary for us to deny any relevance to questions about the truth of the framework. It is not clear, then, that any modest transcendental argument really renders its target skepticism inert. Even if the skeptic is shown to be unable consistently to raise a certain possibility, that possibility is not thereby taken out of contention. However abstract (or even inexpressible) the doubt may be that remains, the modest transcendental argument falls short of establishing epistemic entitlement.

8. A More Modest Project for Kant

There is another kind of modest application of transcendental arguments that is not subject to the above concerns, owing to its pursuit of a different kind of result. Part of Kant’s project is not so much concerned with responding to the epistemic skeptic as with responding to an opponent who questions the very conceptual legitimacy of external-world concepts like substance and cause. Despite an emphasis in contemporary philosophy on epistemic skepticism, for Kant conceptual legitimacy appears to be the primary or fundamental application of transcendental reasoning. This project is the major concern of his “Transcendental Deduction of the Categories” in the Critique of Pure Reason. He employs a legal metaphor at the beginning of his defense of our use of such concepts to distinguish between “what is lawful (quid juris) and that which concerns the fact (quid facti).” His avowed focus, then, is on the “lawfulness” of our application of external-world concepts. He is concerned, as a first goal at least, with the applicability (or “objective validity”) of these concepts quite independently of their instantiation. That this should be a primary goal for Kant makes a lot of sense in light of some of his major precursors. Though in other respects having very different views, Leibniz, Berkeley, and Hume each questioned the legitimacy of the application of concepts like substance and cause to experience. Leibniz denied not only the existence of material substance but its metaphysical possibility. Because matter is infinitely divisible, he argued, it cannot be a basic constituent of the universe. Only minds can be substances, so the concept of substance is not even appropriately applied to matter. Berkeley argued that all we can describe are our ideas, and there is no sense in saying that ideas resemble material objects or their qualities. Talk of material objects independent of the mind is incoherent. Finally, Hume argued that it is impossible to find a source for the concepts of substance and cause in perception sufficient to explain either the occurrence or even the content of such ideas.

Leibniz, Berkeley, and Hume all have in common, then, the position that external-world concepts like substance and cause are either incoherent or inapplicable to perceptual experience. In modern terms, they held that such application, if possible at all, is a category mistake. It is not difficult to see how at least part of Kant’s project in his transcendental deduction of these concepts is to refute this view, as distinguished from the project of proving that we veridically experience a world of causally-related substances. His strategy in doing so is notoriously hard to pin down, but the gist of it is that he claims that the concept of an objective world (which would include the concepts of substance and cause) is needed as an organizing principle—a rule or set of rules—for reproducing and synthesizing in judgment one’s various and otherwise inherently unconnected representations. For example, because all experience qua one’s subjective flow of perceptions is successive, the concept of cause is needed to distinguish between a succession of experiences representing the experience of an object (which could be experienced differently and yet be thought of as the same object) and a succession of experiences representing the experience of an event (the order of the stages of which determines the way it can be experienced). Because the thought of a causal relationship between event-stages is constitutive of the thought of an event, and because distinguishing between an accidental and externally-determined sequence of experiences is necessary to time-determination, the a priori possession of the concept of cause is a necessary condition of coherent experience.

The legitimacy of the concepts of substance and cause would also be a consequence of some of Kant’s more explicitly anti-skeptical arguments. A consequence of his reasoning in the “Refutation of Idealism,” for example, is that objective time-determination is implicated in subjective time-determination. The application of concepts relevant to determining an objective time-order (as the concepts of substance and cause are, he had explained earlier) is inseparable from subjective self-awareness. Since each of Kant’s precursors allows for an inner mental life, they cannot consistently deny the legitimacy of applying concepts like substance and cause to perceptual experience. This would not prove the existence of causally-related material substances, but it would accomplish quite a lot: it would demonstrate the inadequacy, in a certain respect, of Leibnizian metaphysics, Berkeleyan phenomenalism, and Humean empiricism.

9. Prospects for Strong Transcendental Arguments

Defenders of strong anti-skeptical transcendental arguments still exist. Kenneth Westphal (2003), for example, is more confident than most that some of Kant’s core transcendental arguments can be successful. He believes that the process by which Kant identifies our basic cognitive capacities and their enabling conditions (Westphal calls this “epistemic reflection”) has been confused with simple introspection, which is an empirical enterprise concerned with the contents of one’s consciousness. He argues that Kant does convincingly show that we legitimately apply certain concepts a priori as a necessary condition of coherent consciousness, and that there are, in fact, “perduring, perceptible, causally interacting physical objects.”

Despite Kant’s remaining defenders, however, few now believe that transcendental arguments can yield a direct refutation of epistemic skepticism. Most now agree that more modest goals are in order if such arguments are to remain relevant. Such modest variations on the transcendental argument form continue to appear in a variety of contexts.

10. References and Further Reading

  • Aristotle, Metaphysics, Book Γ.
  • Austin, J.L. (1939). “Are There A Priori Concepts?”, Proceedings of the Aristotelian Society 18.
  • Bardon, Adrian (forthcoming). “The Aristotelian Prescription: Skepticism, Retortion, and Transcendental Arguments,” International Philosophical Quarterly.
  • Bardon, Adrian (2004). “Kant’s Empiricism in His Refutation of Idealism,” Kantian Review 8.
  • Bardon, Adrian (2005). “Performative Transcendental Arguments,” Philosophia 33.
  • Bennett, Jonathan (1966). Kant’s Analytic (Cambridge: Cambridge University Press).
  • Berkeley, George (1979). Three Dialogues between Hylas and Philonous, ed. by Robert Adams (Indianapolis: Hackett).
  • Brueckner, Anthony (1983). “Transcendental Arguments I.” Noûs 17, pp. 551-76.
  • Brueckner, Anthony (1984). “Transcendental Arguments II.” Noûs 18, pp. 197-225.
  • Cassam, Quassim (1999). Self and World (Oxford: Oxford University Press).
  • Cassam, Quassim (1987). “Transcendental Arguments, Transcendental Synthesis, and Transcendental Idealism,” Philosophical Quarterly 37.
  • Davidson, Donald (1984). Inquiries into Truth and Interpretation (Oxford: Oxford University Press).
  • Fowler, Corbin (1987). “Scepticism Revisited,” Philosophy 62, pp. 385-88.
  • Genova, A.C. (1984). “Good Transcendental Arguments.” Kant-Studien 75, pp. 469-95.
  • Gram, Moltke (1975). “Why Must We Revisit Transcendental Arguments?” The Journal of Philosophy 72, pp. 624-6.
  • Guyer, Paul (1987). Kant and the Claims of Knowledge. (Cambridge: Cambridge University Press).
  • Hintikka, Jaakko (1962). “Cogito, Ergo Sum: Inference or Performance?”, The Philosophical Review 71.
  • Hume, David (1978). A Treatise of Human Nature, 2nd ed. (Oxford: The Clarendon Press).
  • Kant, Immanuel (1998). Critique of Pure Reason, ed. and trans. by Paul Guyer and Allen Wood (Cambridge: Cambridge University Press).
  • Körner, Stephan (1974). Categorial Frameworks (Oxford: Basil Blackwell).
  • Leibniz, G.W. (1998). Monadology, in G.W. Leibniz: Philosophical Texts, trans. by Richard Franks and R.S. Woolhouse (Oxford: Oxford University Press).
  • Lewis, C.I. (1946). An Analysis of Knowledge and Valuation. (La Salle: Open Court).
  • Lewis, C.I. (1969). Values and Imperatives, ed. by J. Lange (Stanford: Stanford University Press).
  • Lipson, Morris (1987). “Objective Experience.” Noûs 21, pp. 319-43.
  • Lonergan, Bernard (1970). Insight (New York: Philosophical Library).
  • Meynell, Hugo (1984). “Scepticism Reconsidered,” Philosophy 59, pp. 431-42.
  • Peirce, C.S. (1931 & 1958). Collected Papers of Charles Sanders Peirce, 8 vols., vols. i-iv ed. C. Hartshorne and P. Weiss (Cambridge, MA: Harvard University Press, 1931-5), vols. vii-viii ed. A. Burks (Cambridge, MA: Harvard University Press, 1958).
  • Putnam, Hilary (1981). Reason, Truth, and History (New York: Cambridge University Press).
  • Rosenberg, Jay F. (1975). “Transcendental Arguments Revisited.” The Journal of Philosophy 72, pp. 611-24.
  • Schaper, Eva (1972). “Arguing Transcendentally,” Kant-Studien 63, pp. 101-16.
  • Stern, Robert (2000). Transcendental Arguments and Skepticism (Oxford: Oxford University Press).
  • Strawson, P.F. (1966). The Bounds of Sense (London: Methuen & Co.).
  • Strawson, P.F. (1959). Individuals (New York: Methuen & Co.).
  • Strawson, P.F. (1985). Skepticism and Naturalism: Some Varieties (New York: Columbia University Press).
  • Stroud, Barry (1999). “The Goal of Transcendental Arguments,” in Robert Stern (ed.), Transcendental Arguments: Problems and Prospects (Oxford: Oxford University Press, 1999).
  • Stroud, Barry (1994). “Kantian Argument, Conceptual Capacities, and Invulnerability,” in Paolo Parrini (ed.), Kant and Contemporary Epistemology (Dordrecht: Kluwer Academic Publishers).
  • Stroud, Barry (1968). “Transcendental Arguments,” Journal of Philosophy 65 (1968).
  • Westphal, Kenneth (2003). “Epistemic Reflection and Transcendental Proof,” in Strawson and Kant, ed. by Hans-Johann Glock (Oxford: Oxford University Press).

Author Information

Adrian Bardon
Email: bardona@wfu.edu
Wake Forest University
U. S. A.

George Santayana (1863—1952)

George Santayana was an influential 20th century American thinker whose philosophy connected a rich diversity of historical perspectives, culminating in a unique and unrivaled form of materialism, one recommending a bold reconciliation of spirit and nature. Santayana was also a poet, and he wrote a work of fiction, The Last Puritan, that was a Book of the Month Club selection in 1936, the same year he adorned the cover of Time magazine. Though he spent his formative intellectual life in America and ultimately is best categorized philosophically in that tradition, Santayana spent the better part of his life and publishing career in Europe. He spent his early childhood in his birth-country of Spain and throughout his expansive travels and residencies never relinquished his native citizenship. Displaying in both composition and criticism a prodigious literary imagination, Santayana’s writings appealed to a wide audience, and he remains to this day one of the most quoted of twentieth century thinkers. Probably the most well-known sentence of Santayana’s is also one of the least accurately quoted: “Those who cannot remember the past are condemned to repeat it” (The Life of Reason: Reason in Common Sense. Scribner’s, 1905: 284). Scholarly interest in Santayana today remains modest but diverse. Santayana was a thinker of rare stature whose work deserves the highest compliment of all: it can and may well still be read millennia from now.

Table of Contents

  1. Life
  2. Writings
  3. Philosophy
    1. Ontology and Epiphenomenalism
    2. Realms and Terminology
    3. Realms Defined
  4. Naturalism in World Perspective
  5. Legacy
  6. References and Further Reading
    1. MIT Press Critical Editions
    2. Other Santayana Works
    3. Books About Santayana

1. Life

George Santayana was born on December 16, 1863 in Madrid, Spain. He lived his first eight years in Spain, his next forty years in Boston, and his last forty years in Europe. Accordingly, Santayana arranged his life in his autobiography, Persons and Places, in three parts: (1) “Background,” (2) “On Both Sides of the Atlantic,” and (3) “All on One Side.” The Background (1863-1886) encompassed his childhood in Ávila, Spain, through his undergraduate years at Harvard. The second period, during which Santayana traveled between the U.S. and Europe, covered his Harvard years (1886-1912), both as graduate student (Ph.D. 1889) and professor. The third period (1912-1952) was that of the retired professor writing and traveling in Europe, and eventually adopting Rome as his center of activity.

Santayana’s birth name was Jorge Agustín Nicolás Ruiz de Santayana. At the time of his birth Santayana’s father, Agustín Ruiz de Santayana, had only in the last few years met and married Josefina Borrás Sturgis, the recent widow of a Boston merchant named George Sturgis. While Agustín and Josefina united long enough to marry and produce young Jorge (the only child of their union), the two would ultimately part ways. Receiving financial support from her brother-in-law Robert (George Sturgis died leaving her little), Josefina decided to move herself and her surviving Sturgis children to Boston while for eight years young George and his father remained in Ávila. In 1872, father and son made the twelve-day sea voyage to Boston where Agustín briefly attempted to settle in with his wife and her Sturgis children, and, failing to do so, left young George with them to return to Spain in the spring of 1873. This early uprooting and estrangement from his father surely had a deep emotional impact on Santayana, and indeed in his autobiography he characterizes the move as a “moral disinheritance.”

Santayana had a rich early education, spending eight years at the Boston Latin School. He revealingly reflects on those early years (the fall of 1874 through 1882), in his autobiography: “…I know I was solitary and unhappy, out of humor with everything that surrounded me, and attached only to a persistent dream-life, fed on books of fiction, on architecture and on religion.” Besides Latin, students of the Boston Latin School studied Greek, Mathematics, History, French, English Composition, Literature, and Rhetoric. Through this exposure Santayana managed to develop a life-long appreciation for classical and medieval worlds and their cultural contributions, to a great extent preferring them to modern offerings. These appreciations would contribute a breadth of historical perspective to Santayana’s mature philosophical works that is unrivaled by his American contemporaries.

In his early education Santayana nurtured a love of poetry and even entertained seriously the possibility of becoming an architect. Entering Harvard upon graduation from the Latin School in 1882, Santayana took both his undergraduate and graduate degrees there (B.A. ’86, Ph.D. ’89), benefiting incalculably from the philosophical mentorship of his teachers, amongst whom were two of the most famous “golden age” Harvard philosophers: William James and Josiah Royce. Upon successful completion of his doctorate, Santayana, by now fully committed to the discipline, began teaching philosophy at Harvard in the fall of 1889. He would remain there until his departure at the zenith of academic success. In 1912 Santayana took advantage of a modest inheritance from the death of his mother to retire from Harvard, and left for Europe indefinitely.

As to his time in America, though he does offer the occasional fond or sympathetic reflection, Santayana largely hated academic life and commercialism and the dead Puritanism that he identified in his novel The Last Puritan. Probably referring obliquely to his own eventual feelings of exile in America, Santayana wrote: “It is natural for a man to like to live at home, and to live long elsewhere without a sense of exile is not good for his moral integrity” (Winds of Doctrine, Charles Scribner’s Sons, 1913, pg. 6).

He left the U.S. to live an intellectually free life in Oxford, Paris, and, after 1925, Rome. Unsuccessful in his efforts to leave Rome before World War II, on October 14, 1941 he entered the Clinica della Piccola Compagna di Maria, or “Convent of the Blue Nuns,” a hospital-clinic where he lived until his death in September of 1952. He is buried in the only Spanish plot in Rome’s Campo Verano Cemetery.

2. Writings

Next to Ralph Waldo Emerson, Santayana is arguably one of the best writers in the Classical American tradition. Most philosophers tend to read Santayana as a literary figure (which he is) rather than a serious philosopher (which he is also), part of which has to do with the fact that his publications strike in both directions simultaneously: an oddity from the perspective of a public that tends to quarantine the two areas of interest.

His philosophical works reflect two distinct periods, the early “humanistic” period in which he composed The Sense of Beauty (1896), Interpretations of Poetry and Religion (1900), and the five-volume The Life of Reason (1905-6); and the later “ontological” period which yielded Scepticism and Animal Faith (1923), and the four-volume ontology titled Realms of Being (between 1927 and 1940).

Santayana sometimes repudiated his earlier work, in part for its having the taint of academic life. He especially spoke down at times about the Life of Reason series for its association with the progressivism of the day, and it was later edited by Santayana and his late-life personal assistant and secretary, Daniel Cory, with the intent of removing some of its more humanistic overtones.

These authorial disparagements notwithstanding, The Life of Reason series holds up as one of the greatest philosophical works of the early half of the twentieth century. His peer and adversarial contemporary John Dewey praised the series in a review of 1907 as “the most adequate contribution America has yet made—always excepting Emerson—to moral philosophy” (John Dewey, in John Dewey: The Middle Works, Volume 4 [1907-1909], edited by Jo Ann Boydston, Southern Illinois University Press, 1977: 241). The series would have a lasting influence on naturalistic philosophy in the twentieth century.

In his budding writing career Santayana also published a volume of poetry (an 1894 collection titled Sonnets and Other Verses). Nevertheless his poetic muse would fade with the passing of years. Despite in his early years attracting a near-cult following of Harvard poets, and later maintaining the same mentorship through their Rome pilgrimages, letters, and solicitations of feedback, Santayana’s literary exertions would be restricted to fiction and philosophy.

Early in his career at Harvard, Santayana would feel the pressure to produce a work of philosophy. The Sense of Beauty (1896)—an exercise in aesthetic formalism—was culled from a series of lectures he gave between 1892 and 1893 as a newly appointed Harvard professor. The book contains the famous definition of beauty as “pleasure regarded as a quality of the thing.” To this day The Sense of Beauty is arguably the most widely read of Santayana’s philosophical corpus. This is most likely due to its restrictive scope in comparison to his other philosophical works, while there has been the tendency for Santayana’s more ambitious philosophy to be neglected. This neglect probably will subside with the ongoing MIT Press Critical Edition publications of The Works of Santayana, edited by William G. Holzberger and Herman J. Saatkamp, Jr.

After The Sense of Beauty, Santayana published Interpretations of Poetry and Religion in 1900, a work which famously provoked William James—Santayana’s then-recent colleague—to characterize his philosophy as a “perfection of rottenness.” The book also provoked a key recognition from the other of Santayana’s early influential mentors, and also dissertation advisor, Josiah Royce. Santayana relates that Royce told him around the time of Interpretations that “the gist of [his] philosophy [is] the separation of essence from existence” (“Apologia Pro Mente Sua” in The Library of Living Philosophers: The Philosophy of George Santayana, edited by Paul Arthur Schilpp, New York: Tudor Publishing, pg. 497). The ontological categories of “essence” and “matter” would become key components of Santayana’s mature philosophy. (See section 3c.)

Besides being a poet, philosopher, and novelist, Santayana was a hugely influential cultural critic. In a trenchant 1911 address before the Philosophical Union in California he coined the term “genteel tradition” and memorably provided the characterization of America as an “old wine in new bottles.” He wrote many similarly speculatively rich essays diagnosing the cultural character of the America of his time, some of which included penetrating philosophical criticisms of his contemporaries and former teachers, James and Royce. These diagnoses were early collected in the volume Character and Opinion in the United States (1920).

None of Santayana’s writings strays entirely from philosophical considerations, including his only novel. Santayana authored a single best-selling work of fiction titled The Last Puritan, published in 1936. He spent several of his post-Harvard years composing the book, and many of the main characters reflect personalities close to the author. The main theme of the novel (subtitled “A Memoir in the Form of a Novel”) is of interest for enhancing one’s understanding of Santayana’s view of America. It chronicles the tragic, sacrificial life of Oliver Alden, the title-subject, a romantic and pious youth whose inner religious sensibilities conflict with the pulsating natural life around him. Alden is from one standpoint a sympathetic character, one with whom the author himself admitted affinities. But from another standpoint the protagonist represented the tragic contemporary American as Santayana understood him, partly in reaction to troubled young poets and artists Santayana knew from his Harvard days.

Santayana’s broader cultural criticism can be found in such works as Winds of Doctrine (1913) and the beautiful and unforced Soliloquies in England (1922), remarkably written amidst the uncertain, violent times of World War I. The latter is an exemplary instance—of which two others include Dialogues in Limbo (1926) and Platonism and the Spiritual Life (1927)—where one finds the post-Harvard Santayana following inspirations as they come, allowing both his literary imagination and penetrating philosophical eye to take equal share in the interpretive task.

These shorter works undoubtedly provided opportunities of creative release for Santayana as the ambitious project of conceiving a system of philosophy began to assert itself. In 1923 Scepticism and Animal Faith (hereafter SAF), the introductory text to his four-volume system of philosophy, was published. SAF is one of the few Santayana works to have remained in print up to the present. The book introduces the terminology and critical background of his mature ontology, itself unfolded in four volumes over a period of thirteen years.

3. Philosophy

a. Ontology and Epiphenomenalism

Despite minor shifts in emphasis and Santayana’s own attitude towards his work, there is no radical break between the early humanistic Santayana, and the mature, ontological one. The same persistent distinction between ideals and natural grounds for those ideals—which he calls in his mature ontology “essence” and “matter”—holds throughout all of Santayana’s works; and the same abiding concern for reconciling moral with natural life remains intact.

As Royce had prophesied, an ontological distinction persisted throughout Santayana’s works: between “essence,” or the infinite realm of character embodiments that any existing thing must take on in order to be experienced by humans, and “existence,” or the groundless causal flux of nature that underlies any form whatsoever.

In the Life of Reason Santayana emphasizes the distinction between “perfections” or “ideals” and their “natural roots” which he sometimes calls a “natural ground” or “basis” for all action, thought and experience: “Every genuine ideal has a natural basis…Ideals are legitimate, and each initially envisages a genuine and innocent good; but they are not realizable together, nor even singly when they have no deep roots in the world.” Such ideals then are not Platonic forms, in that they have “roots” and bear the marks of their natural origins. Plato’s forms, on the contrary, are conceived as entirely foreign to natural origins.

But Santayana’s terminological shift from talking of ideals and natural grounds to talking of essence and matter perhaps did come at a certain cost. Throughout the evolution of his thinking Santayana holds to an increasingly, and to many interpreters troublingly, epiphenomenal view of consciousness. Briefly, epiphenomenalism is the view that mind is derivative, wholly caused, and has itself no causal power. Such strong epiphenomenalism comes out in the following passage from Realms of Being (hereafter RB): “…the realm of matter cannot admit mind into its progressive structure and movement; each trope or rhythm must be complete before sensation can arise; so that this sensation is intrinsically a result and not a cause, a comment and not an agent…” If mind and sensation appear on the scene only as after-effects, one has to wonder how human experience can be considered fulfilling, and more specifically how it can be anything but an ineffectual, spectator process.

There is however more than this to Santayana’s view of mind and accompanying story of human experience. To see this one needs a further understanding of the definitive concepts of his mature philosophy.

b. Realms and Terminology

The four realms of being Santayana identifies, in the order in which he published each RB volume, are essence, matter, truth, and spirit. The realms are said by Santayana to be “qualities of reality” (RB 183) (not themselves to be confused as parts of the cosmos), that are worth distinguishing to render human experience more fulfilling, intelligent, and edifying.

Santayana holds that the realms are irreducibly different and are for that reason worth distinguishing. The possibility that there are more realms is not something he dismisses; his only condition for an additional realm is that it be irreducibly distinct from the four he distinguishes.

As indicated, before introducing the realms individually Santayana set up their presentation through a penetrating and synthetic critical introduction, published in 1923 as Scepticism and Animal Faith. Understanding the project of SAF requires acquaintance with the meaning of key original concepts, amongst which are: “intuition,” “intent,” “psyche,” “animal faith,” and “skepticism.”

All belief, Santayana writes, is “a form of some faith in animal, material existence.” What Santayana calls “animal faith” is the instinctive (if you will) and unavoidable tendency for human actions to betray a deep belief in the existence of matter. On Santayana’s account, one cannot act without believing in matter. According to Santayana, the denial in speech or dialectical skepticism of the existence of matter is a solipsistic, momentary pose. So philosophers like Descartes and Berkeley are transcendental posers, inflexibly denying in theory what they unhesitatingly affirm in practice. Worse yet, however: these Moderns conflate functional orientations of the mind which Santayana respectively distinguishes as “intuition” and “intent.”

“Intuition” is for Santayana the contemplation or consciousness of an essence (more on these shortly) apart from belief in any particular existence. Santayana contrasts “intent” with intuition in order to capture the process of “taking” essences as existences. When we interact with, manipulate, engage, or otherwise encounter what we experience as physical objects, we are imbuing essences with intent, giving them a material existence they can never literally have. This process of intent is governed by the preferential makeup of what Santayana terms “psyche.”

The psyche is the material set of preferences that define individuality in organisms. The psyche is, very simply, the material manifestation of mind and as such it is imbued with, defined by, and stricken with belief. When one is believing, one is acting on behalf of one’s psyche. When one is intuiting essences without the addition of belief in their existence—be it a revery, daydream, or performative trance as in a locked moment of harmonious activity—one is communing spiritually with the realm of essence.

This raises the issue of skepticism: if we only ever have a symbolic grasp of material reality, and we can at any point imaginatively “escape” such symbolic play, what’s to keep us from relapsing into Cartesian (re)pose? The first ten chapters of SAF are an exercise in engaging Cartesianism, with the goal of pushing skepticism to its “ultimate” limits.

As a skeptic Descartes was half-hearted according to Santayana (as regards naturalism he also accused his contemporary John Dewey of this), in that he thought skepticism ceased with awareness of the self. For Santayana, nothing overcomes skepticism except pure intuition, the irony of which is the fact that pure intuition issues in the “discovery of essence,” which is itself a bankruptcy of knowledge (see “essence” below). So where Descartes had sought the most indubitable knowledge, and proceeded on the principle that such a thing could be achieved, Santayana tries to show in SAF that the principle of indubitable knowledge is itself a paradox; when knowledge is tested by way of a radical skepticism, and certainty is the ultimate goal, the paradox is that certainty is achieved only at the cost of knowledge itself. “Certainty,” for Santayana, is thus a transcendent vision of essence and as such has nothing to do with knowledge, much less with science.

So the goal of SAF is to bankrupt Cartesianism, and in doing so to suggest a new starting point for philosophy. That starting point is animal faith, the tacit acceptance of material reality as the source of understanding, knowledge, and common sense. Hence the title, “Scepticism and Animal Faith”: we need skepticism to intellectually clear the way for, and at the same time to lead us back to, natural intelligence, that is, to the realms themselves.

c. Realms Defined

Essence: The realm of essence should be understood to have a certain primacy since it is infinite and pertains to all of the forms or definite character embodiments that material objects and events may take on. Essence is what Santayana defines as the most radical sense in which anything is or has a character. Nothing—be it material objects, objects of thought, imaginings, flights of fancy, or objects of logical deduction—is experienced except through the mediation, or more accurately, “im-mediation” of essences. In his inimitable way, Santayana says of essences that they are “the only things people ever see and the last they notice.” Essences are said by Santayana to designate the realm of internal or intrinsic relations, and awareness of essences indicates a departure from what is called “knowledge,” which he defines as “faith mediated by symbols.” Awareness of essence is just that: awareness; it is direct and unmediated and as such entails no faith (belief in realities not given).

Matter: The catch however is that Santayana is a thoroughgoing materialist, in that he holds that no form can appear to human intuition without the previous establishment of material conditions for that form to arise. Matter is the primordial existential flux and is an unintelligible “surd.” This does not mean, however, that matter cannot be “known,” at least provisionally. Like Spinoza’s substance, existence or matter for Santayana has no purpose, but imposes external, natural limits to all activity. Those external limits define human life and mark off the boundaries between human understanding and the unfathomable depths of material existence. Santayana holds that humans know matter only at a remove, that is, (to repeat) symbolically. Matter is in fact referred to by Santayana as a “metaphor” only, producing one of the more provocative aspects of his philosophy: science is no less literary than poetry in representing matter in that it must express its truths at a remove, through the lens of human bias. In this sense Santayana’s materialism is, to use a contemporary term, “non-reductive.” Whatever scientists keep telling us of matter, while it is the hallmark of wisdom to defer everyday understanding to these experts (their findings do after all indicate a provisional advance upon previous understanding and serve contemporary sympathies very well), it is for Santayana only spiritual nearsightedness to deem such knowledge exhaustive of the cosmos.

Truth: As a fourth realm of being, truth wasn’t conceived by Santayana until after the first three (essence, matter and spirit) had been distinguished, and may therefore be justly supposed to have been introduced somewhat ad hoc. Whatever the reason, by 1913 (10 years before the publication of SAF) Santayana had conceived truth to round out his fourfold ontology. Truth is alleged by Santayana to be a subset of the infinite realm of essence. The realm of truth is the total inventory of essences instantiated by matter. The master metaphor for truth is given by Santayana in RB as: “Truth is the furrow which matter must plow upon the face of essence.” All events that take place entail concatenations of essences elected by matter for appearance in the course of human life, and their objective relations—factual arrangement, for example, that the terrorist attacks in America in 2001 took place on September 11th rather than the 12th—introduce the possibility of truth for human understanding.

Though there are similarities, Santayana’s view of truth differs in important respects from that of Classical pragmatists: truth for Santayana is fully objective and not necessarily presupposing of a cognizing agent; it is the necessary condition for the possibility of true opinions (Santayana appeals to the self-conscious act of lying as evidence of this fact); judgments are true if and only if they faithfully reproduce a portion of the descriptive properties of the process of the world coming, becoming, and going away into existence. These features of truth are guaranteed by the eternal status of the terms of its acknowledgement: essences.

Thus the pragmatist account of truth as what “works,” in the sense of what fits the current standard comprehensive description of the world is acceptable to Santayana so long as there is an understanding that the terms that make truth possible, namely, essences, are eternal, everlasting possibilities of experience that are not reducible to that experience. This is where Santayana especially departs from the pragmatist account of truth: it is not reducible to experience.

Spirit: Finally, Santayana distinguishes the realm of spirit, which is neither more nor less mysterious than one’s everyday understanding of consciousness. Santayana defines consciousness as the “total inner difference between being asleep and awake.” John Lachs has characterized Santayana’s spirit as that part of a life constituted by its series of intuitions. The native affinity of mind is, according to Santayana, to essence and not to fact. (This is an important outcome of his engagement with and overcoming of Cartesianism.) As such consciousness may play with appearances apart from the believing intent of the organic manifestation of mind (psyche); to the extent that it does so play, the spiritual life has been lived. Spirit is the ability of mind to turn natural events and experiences into appearances of themselves, and in so doing allow a healthy cosmic repose even as nature moves ceaselessly, beautifully, and sometimes destructively along.

In this way the core contribution of Santayana’s philosophy can be seen to culminate in a reconciliation of spirit and nature, two realities very much at odds in contemporary life. Santayana’s status as something of an “acquired taste” philosopher may plausibly be argued to be a function of his uncommon ability to uphold two sincere sympathies: on the one hand with Platonism and the spiritual life, and on the other with the life of reason which includes an openness to the advantages of three phases of moral life he called in that same-titled volume “pre-rational morality,” “rational ethics,” and “post-rational morality.”

4. Naturalism in World Perspective

As should not be surprising from what has been presented, Santayana consistently praises select philosophers and philosophies from history for what he considers their “naturalistic piety.” From the Ancient world, Santayana was deeply impressed with Lucretius, and also what he gleaned from Eastern Indian philosophy. Of the Modern philosophers, Santayana reserves his highest praise for Spinoza.

Backed by these historical allies, Santayana provides in a soliloquy a memorable (if partly irreverent) arrangement of world-philosophies:

…the progress of philosophy has not been of such a sort that the latest philosophers are the best: it is quite the other way…the later we come down in the history of philosophy the less important philosophy becomes, and the less true in fundamental matters.
Suppose I arrange the works of the essential philosophers—leaving out secondary and transitional systems—in a bookcase of four shelves; on the top shelf (out of reach since I can’t read the language) I will place the Indians; on the next the Greek naturalists; and to remedy the unfortunate paucity of their remains, I will add here those free inquirers of the renaissance, leading to Spinoza, who after two thousand years picked up the thread of scientific speculation; and besides, all modern science: so that this shelf will run over into a whole library of what is not ordinarily called philosophy. On the third shelf I will put Platonism, including Aristotle, the Fathers, the Scholastics, and all honestly Christian theology; and on the last, modern or subjective philosophy in its entirety. I will leave lying on the table, as of doubtful destination, the works of my contemporaries. There is much life in some of them. I like their water-colour sketches of self-consciousness, their rebellious egotisms, their fervid reforms of phraseology, their peep-holes through which some very small part of things may be seen very clearly: they have lively wits, but they seem to me like children playing blind-man’s-buff; they are keenly excited at not knowing where they are. (“The Progress of Philosophy,” in Soliloquies in England and Later Soliloquies, Charles Scribner’s Sons, 1922: 208-210)

Santayana recommends placing on the bottom, “inferior” shelves all the philosophy that is published, reprinted, and discussed in universities across the Western world today. This recommendation motivated one critic to characterize Santayana as a “defiant eclectic” (Charles Hartshorne, “Santayana’s Defiant Eclecticism” in The Journal of Philosophy, Vol. LXI. No. 1, 1964: 35-44), suggesting that his thinking amounts to a high-minded circumvention of the real problems of philosophy through the sublimation of a few eccentric doctrines. This point is still an issue among Santayana scholars. What is clear is that Santayana combined an indisputably rich reading of the history of philosophy with an unparalleled synoptic critical vision.

5. Legacy

Santayana’s philosophy has had a modest, unsettled legacy, one which nevertheless surprises in its continuing ability to attract sensibilities from across academic disciplines. While his thinking never has been, and likely never will be, given to indoctrination or discipleship, it is clear that Santayana never conceived of these as important and justifiably suspected that such things were bad rather than good indications that a philosophy is worthy of the world it struggles to understand.

Still, a glowing campfire of devotion to Santayana’s work persists, first through the institutional support of the MIT Press and the staff of the Santayana Edition at Indiana University-Purdue University Indianapolis (IUPUI); and second from the scholarly contributions made to the only Santayana journal, Overheard in Seville: Bulletin of the Santayana Society. The Bulletin is published annually and is edited by Angus Kerr-Lawson. The Santayana Society meets annually in December at the Eastern gathering of the American Philosophical Association and has recently been added to the proceedings of the annual meetings of the Society for the Advancement of American Philosophy. MIT Press is in the process of publishing a critical edition of The Works of George Santayana, several of which are currently released.

The future of Santayana studies, whatever their course, will depend upon genuine interest in a non-reductive philosophical naturalism that expresses deep respect for religious sensibilities and leads the charge for the return to a conception of philosophy as a way of life rather than as a critical profession with little relevance to inner experience.

6. References and Further Reading

a. MIT Press Critical Editions

All works by George Santayana are undergoing republication as critical editions through MIT Press, under the editorship of William G. Holzberger and Herman J. Saatkamp, Jr., and the editorial work of those affiliated with the Santayana Edition at Indiana University-Purdue University Indianapolis.

  • Persons and Places (1987).
  • The Sense of Beauty (1988).
  • Interpretations of Poetry and Religion (1990).
  • The Last Puritan (1994).
  • The Letters of George Santayana: Books I-VIII (2001-2008).

b. Other Santayana Works

  • Animal Faith and Spiritual Life. Edited by John Lachs. New York: Appleton-Century-Crofts, 1967.
  • The Birth of Reason and Other Essays. Daniel Cory, editor. New York and London: Columbia University Press, 1968.
  • Character and Opinion in the United States. New York, Charles Scribner’s Sons: 1921.
  • Dialogues in Limbo. The University of Michigan Press, 1948.
  • Dominations and Powers: Reflections on Liberty, Society, and Government. New York, Charles Scribner’s Sons: 1951.
  • Egotism in German Philosophy. Charles Scribner’s Sons, 1940.
  • Essays in Literary Criticism. Edited by Irving Singer. New York, Charles Scribner’s Sons: 1956.
  • The Genteel Tradition: Nine Essays by George Santayana. Lincoln and London: The University of Nebraska Press, 1967.
  • The Idea of Christ in the Gospels. New York, Charles Scribner’s Sons: 1946.
  • Life of Reason or The Phases of Human Progress, One Volume Edition. New York: Charles Scribner’s Sons, 1955.
  • Obiter Scripta. New York, Charles Scribner’s Sons: 1936.
  • The Philosophy of Santayana. Edited by Irwin Edman. The Modern Library, 1936.
  • Poems. New York, Charles Scribner’s Sons: 1923.
  • The Realms of Being. New York, Charles Scribner’s Sons: 1942.
  • Santayana on America: Essays, Notes, and Letters on American Life, Literature, and Philosophy. Edited by Richard Colton Lyon. New York: Harcourt, Brace & World, Inc., 1968.
  • Scepticism and Animal Faith. New York: Dover Publications, 1923, 1955.
  • Soliloquies in England and Later Soliloquies. New York, Charles Scribner’s Sons: 1922.
  • Some Turns of Thought in Modern Philosophy. New York, Charles Scribner’s Sons: 1933.
  • Winds of Doctrine: Studies in Contemporary Opinion. New York, Charles Scribner’s Sons: 1913.

c. Books About Santayana

  • Ames, Van Meter. Proust and Santayana: The Aesthetic Way of Life. New York: Willett, Clark & Company, 1937.
  • Arnett, Willard E. Santayana and the Sense of Beauty. Bloomington: Indiana University Press, 1957.
  • Butler, Richard. The Life and World of George Santayana. Chicago: A Gateway Edition, 1960.
  • Coleman, Martin; Santayana Edition (IUPUI). The Essential Santayana: Selected Writings. Compiled with an introduction by Martin Coleman and the Santayana Edition at IUPUI. Indiana University Press, 2009.
  • Cory, Daniel. The Letters of George Santayana. New York, Charles Scribner’s Sons: 1955.
  • Cory, Daniel. Santayana: The Later Years; A Portrait With Letters. New York: George Braziller, 1963.
  • Flamm, Matthew Caleb and Krzysztof Piotr Skowronski. Under Any Sky: Contemporary Readings of George Santayana. Newcastle: Cambridge Scholars Publishing, 2007.
  • Howgate, George W. George Santayana. New York: A.S. Barnes and Co., Inc., 1961.
  • Lachs, John. On Santayana. Wadsworth, 2000.
  • Lachs, John with Michael Hodges. Thinking in the Ruins: Wittgenstein and Santayana on Contingency. Vanderbilt University Press, 2000.
  • Levinson, Henry Samuel. Santayana, Pragmatism, and the Spiritual Life. Chapel Hill and London: The University of North Carolina Press, 1992.
  • Lamont, Corliss, editor. Dialogue on George Santayana. New York: Horizon Press, 1959.
  • Munson, Thomas N. The Essential Wisdom of George Santayana. New York: Columbia University Press, 1962.
  • Schilpp, Paul Arthur, editor. The Library of Living Philosophers: The Philosophy of George Santayana. New York: Tudor Publishing Company, 1951.
  • Singer, Irving. George Santayana, Literary Philosopher. Yale University Press, 2000.
  • Sprigge, Timothy. Santayana. London and New York: Routledge, 1995.
  • Woodward, Anthony. Living in the Eternal: A Study of George Santayana. Nashville: Vanderbilt University Press, 1988.

Author Information

Matthew Caleb Flamm
Email: mflamm@rockford.edu
Rockford College
U. S. A.

Huineng (Hui-neng) (638—713)

Huineng (Hui-neng) is a seminal figure in Buddhist history. He is the famous “Sixth Patriarch” of the Chan or meditation tradition (better known in Japanese as “Zen”). The focus of an immense body of lore that grew over the centuries, Huineng’s life mirrors the fortunes of Chan itself – a provincial Chinese version of Buddhism that rose to become a major religious and cultural force throughout East Asia. Tradition holds that Huineng was an uncouth “barbarian” youth who, because of his innate intuitive insight, surpassed his more cultured fellow monks to earn the official “dharma seal” certifying the authoritative transmission of Buddhist enlightenment, thereby earning a lasting place in history. He is intimately associated with the Platform Sutra of the Sixth Patriarch, one of the most influential texts in all of Chinese Buddhism. Alleged to be a sermon from the lips of Huineng himself, this text provides a gripping first person account of the Master’s life. Its cryptic, yet insightful, discussion of Chan practice lays out the central concerns of Chan cultivation. Huineng’s discussion of the themes of inherent enlightenment, sudden awakening, and the non-dual nature of wisdom (Sanskrit: prajna) and meditation (Sanskrit: dhyana) resounds through later generations of Chan teachers, and continues to pose difficult philosophical challenges to this day.

Table of Contents

  1. Chan Buddhism in Context
  2. Biography
  3. Historical Issues and Mythic Elements
  4. Central Teachings
    1. Major Themes
      1. Original/Inherent Enlightenment (ben jue)
      2. Non-duality
      3. No-thought (wu nian)
      4. Sudden Awakening (dun wu)
      5. The Centrality of Practice
    2. Teaching Style
  5. Influences
  6. Critical Issues
    1. The Role of Reason and Rationality
    2. Sudden vs. Gradual?
    3. The Role of Text (wen) in Life
    4. The Relation of Action (praxis) and Knowledge (theoria)
    5. The Centrality of Ritual (Li)
  7. Impact on Later Buddhist and Chinese Philosophical Traditions
  8. References and Further Reading

1. Chan Buddhism in Context

It is impossible to disentangle Huineng from the story of early Chan. Indeed, it is in sections 49-51 of the Platform Sutra that Huineng lays out the classic story of Chan’s origins. According to this account, Chan began with the historical Buddha, Sakyamuni, and his famous “Flower Sermon.” One day the Buddha took his seat before his assembled monks and, instead of speaking, remained silent while holding a single flower aloft in his hand. Of those assembled, only one disciple, Mahakashyapa (Sanskrit: “Great Kashyapa”), understood the meaning of the Buddha’s actions. The Buddha publicly recognized Mahakashyapa’s realization and he, in turn, passed the wordless teaching along to his disciples. Eventually the transmission passed to a certain Bodhidharma (c. 470-553 CE), the famed “First Patriarch,” who, it is said, brought Chan to southern China, crossing the Yangzi (Yangtze) River on a reed. Recent scholarship has established that a mysterious figure named Bodhidharma was indeed in southern China in the fifth century proclaiming teachings based on the Lankavatara Sutra as well as a simplified but powerful form of dhyana. After his death his disciples carried on his teachings, but most of them never founded lasting lineages. Eventually these teachings were transmitted to Hongren (600-674), the Fifth Patriarch, who taught at Dongshan. Hongren had a number of disciples who spread out through China, establishing their own schools where they taught their own versions of Chan. Some died out but a few flourished, going on to record their histories to establish their particular pedigrees.

Often dubbed “the meditation school,” Chan derives its name from the Chinese term channa, an attempted transliteration of the Sanskrit term dhyana (meditation, concentration). In Japan, it is known as Zen; in Korea, as Son; and in Vietnam, as Thien. In India, dhyana encompassed a wide variety of techniques for training the mind to attain the deep insight into reality necessary for awakening. When Buddhism began making inroads into China in the first and second centuries CE, missionaries brought these techniques with them. Dhyana study proved popular in some circles – in part because of its resemblance to Daoist meditation practices – but it was just one practice alongside of others, such as sutra study, devotional rituals and the performance of charitable works. Only later did Chan become a self-conscious movement with a firm institutional base.

By the sixth century, certain monasteries in the mountainous areas of central and southwestern China became known as places reserved for intense meditation training. The masters at these centers taught methods so powerful that it was rumored that those willing to persevere could awaken in this very life. As time went on several of these meditation masters gained loyal followings and tales of them spread as their disciples established their own monasteries. It was out of this context that Chan as a distinct school (zong, “lineage”) and the legend of its most famous master arose. Modern scholars now agree that many of the stories surrounding Huineng are “mythical” reconstructions and elaborations by later generations of Chan writers. Nonetheless, this mythology tells us a lot about how Chan came to conceive itself as a distinct tradition, at once radically innovative and deeply conservative. This Chan self-conception finds its best articulation in a poem attributed to Bodhidharma, according to which Chan is “a separate transmission outside the scriptures, not relying on words and phrases, directly transmitted from mind to mind.” Such transmission can only occur within the relationship between Master and student; hence, the Master, and the connection to him, is of paramount importance in all Chan schools.

2. Biography

As with many legendary figures, it is difficult to sort fact from fiction when it comes to Huineng. We have many sources of information on him but most were written long after his lifetime. Most scholars of Buddhism now consider the story of Huineng’s life and his role in establishing Chan as a direct line going back to Sakyamuni (the historical Buddha, ca. 6th to 5th centuries BCE) to be little more than pious fiction. While there may be a kernel of historical truth to them, all of the accounts of Huineng’s life (particularly as recorded in the Platform Sutra of the Sixth Patriarch) show evidence of later expansion and elaboration. In fact, scholars cannot even agree on the location of Dafan, the temple in which Huineng allegedly recited the Platform Sutra.

The earliest mention of Huineng comes from an inscription for a memorial pagoda in Faxing monastery dated 676. The pagoda was said to commemorate Huineng’s meeting with master Yinzong (627-713), a devotee of the Nirvana Sutra and a renowned master of monastic discipline (vinaya), and the ceremony in which Huineng underwent monastic tonsure, that is, shaving of part of the head. Unfortunately, the actual inscription has not been preserved and so many historians deem it unreliable. The only other record dating back to Huineng’s lifetime just lists him as a student of the Chan master Hongren (Hong-jen).

Later records, of which there are many, probably bear little resemblance to real historical events, and actually contradict each other on certain details. Later traditions concerning Huineng vary tremendously. He seems to go into hiding for several years only to reappear in Nanhai at a monastery presided over by Yinzong. One day after the Master had finished a lecture, Huineng overheard two monks arguing over whether the temple flag or the wind was moving. Huineng abruptly injected himself into this discussion, declaring that in fact it was mind that was moving. Hearing of this, Yinzong sent for Huineng and, bowing to him, asked to be taught the dharma of Hongren. It was Yinzong who oversaw the giving of the tonsure to Huineng, the incident memorialized in the inscription mentioned above. Eventually most accounts of Huineng’s life have him retiring to the Baolin temple. Some traditions speak of Huineng being summoned to the imperial capital by the emperor Zhongzong or possibly the empress Wu Zhao (ca. 625-706). In any case, Huineng declined, preferring to spend his days in the mountains and forests preaching the dharma. He did, however, give the imperial envoy a dharma talk that jolted the messenger into an intense sudden realization. Returning to the capital the envoy reported his experience to the emperor who issued an edict praising Huineng and bestowing special gifts upon him.

Our major source for information on Huineng is the autobiographical portion (sections 2-11) of the Platform Sutra of the Sixth Patriarch, an immensely complicated text that has undergone numerous revisions over the centuries. Purporting to be a series of sermons delivered by Huineng from a high seat in the lecture hall (the “platform” alluded to in the title) of Dafan Temple, this text remains the only Chinese Buddhist discourse to be accorded sutra (Sanskrit: “scriptural”) status. The earliest extant copy of this sutra, found in a cache of writings discovered in the Dunhuang (Tun-huang) caves in northwestern China, dates to around 850 but it is corrupt and full of errors – probably the result of being copied from an earlier version by a semiliterate scribe. The first section of the text names Fahai, a student of Huineng’s, as transcribing the sermon at the behest of the district governor. Elsewhere the text names Fahai as one of the Master’s ten disciples and “chief monk” of the community. However, Fahai does not appear anywhere else in Chan literature and his exact identity remains unknown. Some scholars suggest the sutra was actually written by a later Chan monk from a different school (possibly the Niutou or “Ox-head” school) around the year 780.

While most scholars do not put much stock in either the Platform Sutra or the other sources on Huineng’s life, we can still use them to piece together something of a biography for him. It seems his family name was Lu and his father had been a minor official who was banished to the provinces where he died when his son was only three. His mother took him to southern China and raised him in extreme poverty. Huineng worked throughout his childhood to support his family by cutting wood. One day when he was a young man, he overheard a man reciting a phrase from the Diamond Sutra and at once he experienced an initial awakening. With his mother’s permission he left home and devoted himself to religious life.

Huineng spent his next years wandering, ending up with a Buddhist nun who was devoted to the Nirvana Sutra. After reciting passages from it one day she asked him to take a turn reading it aloud only to find that he was illiterate. Incredulous, she asked how he intended to learn Buddha’s truth if he could not read the sutras. The youth replied that the nature of Buddha does not depend on words and letters so what need was there to read texts? Amazed at his insight, she suggested he take up monastic life. At this point he declined, but went on to train under a meditation master.

After three years of meditating in a mountain cave, Huineng went to Dongshan (East Mountain) monastery in Hubei, where he met Master Hongren, the “Fifth Patriarch.” Glaring at this supplicant, Hongren asked where he was from and why he was there. Huineng answered simply that he was from the south and had come to learn the dharma (Buddhist doctrine) from him. Hongren retorted that as a southerner, Huineng was a mere “barbarian,” adding, “How could you become Buddha?” Unfazed by the insult, Huineng replied, “Although my ‘barbarian’ body and yours differ, what difference is there in our buddha-nature?” Realizing at once the potential of this coarse youth, Hongren resolved to test him further. He took him in but assigned him to the threshing room, where he labored for nine months, treading the mill to separate the rice grains from their husks.

The most famous incident in Huineng’s story concerns a dharma contest. One day Hongren challenged his charges to each write a verse (gatha) distilling their understanding of their “original natures.” He promised to read them and award his robe (a symbol of dharma transmission; some versions of the story include Hongren’s begging bowl) and the title “Sixth Patriarch” to the student demonstrating true realization. The task quickly devolved onto the shoulders of the head monk, Shenxiu, who, it was assumed, would be the Master’s likely successor. Shenxiu, however, was full of doubt and spent a tortured night considering his options. Finally he stole out and wrote his verse anonymously on the wall of the new dharma hall:

The body is the bodhi tree.
The heart-mind is like a mirror.
Moment by moment wipe and polish it,
Not allowing dust to collect. (section 6)

Shenxiu hoped that this verse, a straightforward articulation of the necessity of diligent practice, would show the Master that his students had at least some understanding.

The next morning Hongren read the verse and praised it before the community. He burned incense before it and ordered them all to recite it before calling Shenxiu for an interview. In private he commended Shenxiu for his insight, stating that the verse showed he had reached the “gates of wisdom” but had yet to enter. He then suggested Shenxiu take a few more days to compose another verse worthy of being awarded the robe.

Meanwhile, Huineng was still working in the threshing room when a novice wandered by reciting Shenxiu’s verse. Immediately Huineng realized the author of the verse lacked full understanding. Venturing out to the dharma hall, he got someone to write his reply:

Bodhi originally has no tree.
The clear and bright mirror also has no support.
Buddha-nature is constantly purifying and clearing.
Where could there be dust? (section 8)

Very soon word of this new verse spread and eventually the news reached Hongren. The Master came to read it and immediately recognized it as the work of Huineng and that this unknown prodigy was truly enlightened. However, he knew that passing his robe to an uncouth peasant would upset the monastic hierarchy. Therefore he publicly dismissed it as “not complete understanding.” Later, under cover of darkness, Hongren summoned Huineng for a secret audience in which he gave him further teachings. Passing on his robe, the Master admonished him to flee for his life, predicting, however, that eventually he would transmit the teachings. With that, Huineng fled south. After some months, Huineng was traced to a mountain by a band of pursuers intent on killing him and stealing the robe. Most of the pursuers turned back after climbing only halfway but one, Huiming (a former general) reached him on the summit. There, rather than slay the young master, he received the teaching and became enlightened. Thus being recognized as a true Chan Master, Huineng dispatched his new disciple to the north to spread the dharma and convert the populace.

One of the most colorful episodes in Huineng lore concerns his confrontation with a dragon that lived in a pond in front of Baolin temple. The dragon was particularly large and fierce, emerging regularly from the watery depths to create havoc and instill fear in the populace. Fearlessly, the Master taunted the beast for its weakness in being able to appear only in a large as opposed to a smaller form. At once the dragon disappeared only to re-emerge in small form and so show the monk its powers. Unimpressed, the Master challenged the monster to show its courage by entering his bowl. When it did so, the Master quickly scooped the dragon up, took it into the Buddha Hall, and preached dharma to it until it shed its body and departed.

As with other great religious figures, the stories of Huineng’s death are particularly dramatic. The Platform Sutra gives a confused account that may combine several different versions. In essence, however, it records that as he neared his death, the Master called his disciples for a final teaching in the form of a “dharma verse.” All the disciples broke into tears over the imminent departure of their beloved teacher except for one, Shenhui, whom the Master praised for having attained the status of awakening. Chiding the others for the foolishness of their tears, Huineng told them, “All of you sit down. I shall give you a verse, the verse of the true-false moving-quiet. All of you recite it, and if you understand the meaning, you will be the same as I. If you practice with it, you will not lose the essence of the teaching.” (section 48) After this final lesson (during which he outlined the Chan lineage back to the Buddha) Huineng died at the stroke of midnight on August 28, 713. Other traditions, however, have Huineng dying in deep meditation after finishing his last meal. His passing was marked by all manner of cosmic signs: a strange perfume pervading the temple for days, mysterious bright lights, a miraculous rainbow in the sky, and so on. The Platform Sutra says, “Mountains crumbled, the earth trembled, and the forest trees turned white. The sun and moon ceased to shine and the wind and clouds lost their colors.” (section 54) An inscription by the poet Wang Wei (d. 759) adds, “the birds and monkeys cried in anguish.”

Several posthumous stories of Huineng attest to the powerful spell he cast on later generations. Some decades after his passing the emperor sent an envoy to ask for his robe and bowl so that the court might pay them homage. These were sent back with great ceremony a few years later by the succeeding emperor, who purportedly dreamt Huineng asked that they be returned. Later, in 816, Huineng was awarded the official title “Dhyana Master Dajian” (Great Mirror). To this day there is a mummy reputed to be Huineng in the Nanhua monastery located in Caoxi. For centuries it was the focus of intense devotion, and at times was brought to the nearby city of Shanzhou to promote prosperity or ward off plagues and droughts. The mummy was also threatened several times and at least one time was nearly decapitated by rival monks seeking to gain power through possession of the Sixth Patriarch’s head.

3. Historical Issues and Mythic Elements

Historical complexities aside, however, it is the mythic dimensions of Huineng’s story that most excite the imagination. Certainly the traditional account is replete with symbolism and allusion. As a boy Huineng is the quintessential simpleton (cf. the Daoist notion of pu, “simplicity” or “the uncarved block” spoken of in Daode jing 15, 19, 28, 32, 37, 57), an illiterate peasant who, pure and unspoiled by the sophistication of his more educated fellows, serves as the perfect vessel for receiving the sacred wisdom that, in turn, flows through him to posterity. Aside from the allusions to Daode jing just noted, Huineng epitomizes the ideal found in Daode jing 70, “The sage goes about with a coarse cloth on top yet carries jade in his bosom.” We find similar themes in stories of other Buddhist figures (for example, Dao’an, 312-385) as well as the Prophet Muhammad. The tradition of Huineng’s being orphaned and cared for by his mother echoes the biography of Mencius (ca. 385-312 BCE), one of the most revered and mystical of Confucian sages.

Huineng’s potential is recognized by the truly wise (for example, Hongren) but he must first be tested to prove his worth. His assignment to hard labor for nine months in seclusion suggests a type of spiritual gestation. Moreover, Huineng’s attaining official recognition under cover of darkness, symbolized in the passing on of Bodhidharma’s robe and bowl (sacred relics imbued with the Patriarch’s charisma), underscores the drama of this moment and the immense value of his precious wisdom. The tradition that these were buried with him indicates something else of importance: Huineng’s successors would no longer rely on India; Chan would henceforth be a homegrown Chinese tradition. Huineng’s turning down the imperial summons recalls the similar story involving Zhuangzi wherein the Daoist sage prefers to live as a turtle, “dragging his tail in the mud” (Zhuangzi, chapter 17). Finally, the accounts of Huineng’s death clearly echo the earthly passing (parinirvana) of Sakyamuni Buddha. Symbolically, Chan tradition, by drawing such a wide assortment of sacred figures into Huineng’s own story, has effectively absorbed these holy personages’ collective mana. As such, Chan is then empowered to project this “new” sacred aura down through its own lineage.

We can also understand the traditional story of Huineng’s life as an example of the apparently universal “Hero Myth.” He starts off as an unpromising youth living in obscurity who embarks on a great quest. Along the way he is aided by various helpers (the anonymous man who recited the Diamond Sutra, the nun devoted to the Nirvana Sutra, his first meditation teacher). After various adventures he meets a true mentor, the Wise Old Man (Hongren), who recognizes his worth and proceeds to train and test him until he is ready. Then the Wise Old Man passes on the secret knowledge he will need to face all obstacles. The climactic story of Huineng’s flight, pursuit, confrontation on the mountaintop, and victory fits in broad outline the structure of such tales the world over. His encounter with the dragon, of course, is the stereotypical battle with the monster (cf. St. George and the Dragon, Beowulf and Grendel) through which the Hero saves society from the threat of evil and chaos, while his refusal of imperial status demonstrates his humility and desire to avoid self-glorification. In this light, the master’s death marks his apotheosis and rise to divine status, for which he is revered by later generations.

When assessing the life of Huineng and his place in Chan lore, it is vital to bear in mind the centrality of lineage in Chinese culture. Lineage is a primary marker of group identity and solidarity, as well as social recognition. Chan, like other Chinese religious/philosophical traditions, is organized as a system of lineages in which teachings are passed down from Master (Patriarch) to disciple, much as family heritage passes down from father to son. The concern for lineage is most evident in sections 49-51 of the Platform Sutra, where Huineng traces the transmission of his teachings back through various masters to Bodhidharma. In Huineng’s Chan genealogy, Bodhidharma, in turn, received the teachings via a series of Indian masters going back to Sakyamuni. Such an impressive pedigree no doubt brought much prestige to those within the Chan line. The importance of lineage continued through the succeeding generations and was carried over when Chan went to Japan. To this day, Chan teachers trace their lineage back to Huineng. Essentially, Huineng has become the Primary Ancestor of the Chan line, receiving the reverence and devotion typical of ancestral cults throughout East Asia. Metaphorically speaking, Huineng is Chan, and remains so even today.

Such critical analysis of the Platform Sutra and the body of lore surrounding Huineng is not intended to dismiss Chan tradition (particularly in regard to the matter of lineage) as fraudulent. Rather, it helps us understand the concerns of early Chan and the vital role that a charismatic hero such as Huineng plays in rhetorically establishing a distinctive Chan identity. For an analogy we can look to the way in which the great Song scholar Zhu Xi (1130-1200) constructs a lineage for his school of Neo-Confucianism, with Confucius taking the place of Huineng and Master Zhu serving as the Confucian version of Shenhui.

4. Central Teachings

Although Huineng’s mythic biography is fascinating, the Platform Sutra mainly consists of an extended series of dharma talks offering what is at times some rather cryptic advice on Chan cultivation. Like most sermons, the Sutra is not a systematic presentation of defined doctrines and arguments but is an address to the faithful, exhorting them to see into their “original nature” and awaken here and now. Huineng explicitly says that his teachings do not originate with him but are, “handed down from the sages of the past” (section 12). Nonetheless, Huineng does introduce several important ideas and initiates the peculiar style of teaching that comes to be enshrined in later Chan tradition. These teachings tend to overlap and interlock with each other, thereby suggesting the unity-cum-diversity that is one of the hallmarks of Chan thought.

a. Major Themes

i. Original/Inherent Enlightenment (ben jue)

The teaching of “inherent” or “original” enlightenment is a major theme in Huineng’s sermon, and the theoretical basis for most of what he says regarding practice. Its roots go back to Indian teachings concerning the tathagata-garbha (“womb/embryo of Buddha”). Although a complex notion, essentially this teaching comes down to a positive articulation of basic Buddhist views on emptiness (shunyata) and the thoroughly interrelated nature of existence. According to tathagata-garbha teachings, although all beings are mired in ignorance and suffering, our true natures are always pure and luminous – defilements are merely adventitious. Awakening occurs when we pierce through the defilements and allow our original purity to shine forth. While at first glance, the assertion of a seemingly permanent “nature” would seem to contradict the fundamental Buddhist doctrine of anatman (“no [permanent] self”), in fact it does not. The tathagata-garbha is not a substantive essence but an indication of the innate positive tendency towards awakening that is always directly at hand.

Tathagata-garbha teachings had strong appeal for the Chinese, most likely due to their resonance with Confucian ideas of “propriety” (yi, the appropriate manner of acting in a given situation) and humanity’s innate “goodness,” as well as Daoist views of the Way (dao), in which each thing uniquely contributes to the all-encompassing system of the cosmos. These notions also dovetail with the traditional Chinese concern with one’s “nature” (xing, the inborn organic pattern guiding a thing’s development). Together such ideas sketch out a distinctive worldview of dynamic, interactive relationships that unfold in the natural course of things. In this perspective, one can obstruct one’s inherent tendencies or open conscientiously into a more free and responsive way of engagement. In general, the latter is the truer, more proper (or “natural”) way of being. Chinese Buddhists speak of this potential for realization as one’s “Buddha-nature” (fo xing). For Chinese Buddhists, awakening is the natural result of activating or “seeing into” this innate but hidden potential and manifesting it here and now.

Nearly everything Huineng says is predicated on the “Buddha-nature.” We see this clearly in his youthful exchanges with both the nameless Buddhist nun and Master Hongren. Huineng drives this point home in a number of places, often quite explicitly. As he proclaims, “Since Buddha is made by your own nature, do not look for him outside your body. If you are deluded in your own nature, Buddha is then a sentient being; if you are awakened in your own natures, sentient beings are then Buddhas.” (section 35) In this understanding of Buddhahood, one may have an initial awakening (Japanese satori), but this is only a hurried glimpse. Even so, it provides a vague understanding that spurs one on further – something we clearly see in Huineng’s own life with his first awakening at hearing a passage from the Diamond Sutra.

By rhetorically taking his stand on this inherent enlightenment, Huineng challenges his audience to understand this truth and realize their original natures where they are at this very moment. This is something they can and must do: “Despite heterodox views, passions, ignorance, and delusions, in your own physical bodies you have in yourselves the attributes of inherent enlightenment, so that with correct views you can be saved.” (section 21) It is on this basis that he speaks of such things as the unity of meditation (dhyana) and wisdom (prajna), and the “samadhi of oneness.” By realizing one’s “Buddha-nature” one naturally moves beyond habitual “selfish” actions and joins with things in an appropriate and compassionate way.

ii. Non-duality

Another important theme that Huineng preaches concerns the fundamentally “non-dual” nature of existence. This, too, is prone to be misunderstood. Huineng does not so much espouse a mushy notion that “All is One” as challenge the assumption that a person stands apart from her/his immediate situation. His target is the self-conscious sense of separation that tends to arise out of deliberative thinking and living. Thus, his focus is not so much theoretical as practical; one must not get caught up in speculative thought but realize (make real) Buddha, one’s true nature, and act accordingly. This fundamental unity comes through in the famous dharma verse through which he won Hongren’s robe. By countering Shenxiu’s verse and its assumptions of duality, Huineng graphically tells us that we must not think of our minds as something distinct that “we” must polish to reflect truth. Rather, we are truth, immediately and directly.

The vision Huineng seeks to impart is one of integrity within our larger context. It is an evocation of wholeness, interrelatedness and participation rather than separation and distinction. One of Huineng’s most provocative presentations of this idea comes in his discussion of meditation. For Huineng, meditation is not a separate “thing” from wisdom, nor do you attain the latter by way of the former. As he says, “Never under any circumstances say mistakenly that meditation and wisdom are different; they are a unity, not two things. Meditation itself is the substance of wisdom; wisdom itself is the function of meditation” (section 13). Later, the Patriarch explains their relationship through the analogy of a lamp and its light: just as the lamp and its illuminating are essentially one, so meditation and wisdom are one.

Huineng also challenges assumptions of separation by advocating the “samadhi of oneness,” or concentrated attention to the present situation: “The samadhi of oneness is straightforward mind at all times, walking, staying, sitting, and lying.” This constitutes an intriguing practice of mindful, meditative action performed with attentive detachment. There are obvious echoes between this practice and the Daoist notion of wei wuwei (“acting without acting”) as well as the path of karma yoga outlined by Krishna in the Bhagavad-Gita, and Chan communities to this day seek to instill such an approach to life throughout their daily regimen.

This fundamental unity of existence that one manifests by realizing one’s “Buddha-nature” also informs Huineng’s view of the Pure Land (the “Western Paradise”). Following the Vimalakirti Sutra (where the Buddha shows his disciples that this world is the Pure Land for those with Pure Mind), he refuses to allow us to conceive of the Pure Land as something separate from our current existence. It is, rather, the straightforward mind of the “samadhi of oneness.” In attaining this state of true purity, one finds no obstructions. Or, as Huineng puts it, “If inside and outside are clear, this will be no different from the Western Land” (section 35).

iii. No-thought (wu nian)

Huineng speaks from the standpoint of Ultimate Truth (the inherent “Buddha-nature”), the non-dual reality lying beyond our everyday unenlightened experience of separation and division. To awaken to this Truth, Huineng emphasizes “non-clinging” to any verbal teachings, which only present obstacles to True Awakening. Instead, Huineng stresses the perspective of “no-thought” (wu nian), an open, non-conceptual state of mind that allows one to experience reality directly, as it truly is. As he states, “No thought is not to think even when involved in thought. . . To be unstained in all environments is called no-thought. If on the basis of your own thoughts you separate from environment, then, in regard to things, thoughts are not produced. If you stop thinking of the myriad things, and cast aside all thoughts, as soon as one instant of thought is cut off, you will be reborn in another realm.” (section 13)

Note that Huineng explicitly says “no-thought” is not a state of insentiency, nor is it a way of valorizing irrational, “thoughtless” behavior. Rather, “no-thought” is a highly attentive yet unentangled way of being — seemingly the only genuine freedom available. Those who act from the perspective of “no-thought” respond compassionately in all situations, untouched by suffering, much the same way the Mahayana scriptures speak of bodhisattvas (enlightened beings who selflessly seek to aid others) who “course in the Perfection of Wisdom.”

iv. Sudden Awakening (dun wu)

Few ideas are as closely associated with Huineng’s Chan as “sudden awakening” (dun wu). Rooted in earlier Buddhist and Daoist teachings, the term primarily referred to statements of truth a sage made in relationship to specific audiences. Those that were direct and profound were given to those ready for such a “sudden” dose of reality whereas those that were more indirect and metaphorical were provided for those who needed to be led “gradually.” The difference, thus, lies in those who receive the teachings rather than the actual content of the teachings. Some are, as it were, closer to their “Buddha-nature.” According to later Chan tradition, Huineng advocated the (superior) way of “sudden awakening” in contrast to Shenxiu, whose dharma verse clearly points to the (inferior) way of “gradual awakening.”

This polemical distinction, however, does not capture Huineng’s full meaning. The term dun, typically translated as “sudden,” might better be rendered as “poised” or “ready” for some great undertaking. Those who experience such “sudden awakening” are those who are “keen” and “fast,” ready to awaken in action, poised to break through to fuller, wise and compassionate living. By contrast, those who are “dull” are “slow,” not quite as prepared or attentive to respond in so wise a fashion. Equally important, moreover, is Huineng’s insistence that from the standpoint of the “Buddha-nature,” there is no “sudden” or “gradual.” Thus he notes, “The dharma itself is the same, but in seeing it there is a slow way and a fast way. Seen slowly, it is the gradual; seen fast it is the sudden [teaching]. Dharma is without sudden or gradual, but some people are keen and others dull; hence the names ‘sudden’ and ‘gradual.’” (section 39)

v. The Centrality of Practice

In many respects the necessity of practice may be the single most important refrain in Huineng’s sermons. Huineng repeatedly emphasizes that Chan life, awakening, is not attained through study or careful deliberation but through live action. One of the best instances comes immediately after he explains what seated meditation (zuochan; Japanese zazen) is: “Good friends, see for yourselves the purity of your own natures, practice and accomplish for yourselves. Your own nature is the Dharmakaya [“Body of the Teaching,” the Ultimate Truth] and self-practice is the practice of Buddha; by self-accomplishment you may achieve the Buddha Way for yourselves.” (section 19)

To achieve Buddhahood one must be Buddha, that which, paradoxically, one always already is. Such awakened living cannot be adequately explained through words so much as demonstrated and acted upon. In this sense, one learns it directly by conforming to an already established pattern, internalizing it, and then acting this out in any given situation. An analogy might be learning to play a musical instrument or another activity such as riding a bicycle. Chan practice is Chan doing, something that can only be learned through careful imitation of a living example – one’s Master. It is this type of first-hand learning to which Bodhidharma refers in his famous verse: “A special transmission outside the scriptures; not dependent on words and letters.”

Ironically, despite his constant injunctions to wise action, Huineng provides little detail on the specifics of practice. As a result, scholars are unsure what sorts of actual practices were taught in early Chan communities. This silence on specifics, however, turned out to be a point in Huineng’s favor, as his injunctions could readily be applied to a wide variety of Chan styles through the ages.

b. Teaching Style

Huineng’s presentation in the Platform Sutra pioneered Chan’s distinct teaching style that makes use of paradox and cryptic statements aimed at jolting students out of their habitual discursive reasoning. By no means, of course, is Huineng the inventor of such discourse (it is very common in Buddhist and Daoist texts) but in the Platform Sutra Huineng uses it with uncanny skill. As such, it warrants close examination.

One of the most significant features of Huineng’s discourse is its overwhelmingly dialogical character. Although it has its share of lectures, this “sermon” is more often a series of exchanges between Huineng and various interlocutors. Such a literary form calls for one to shift perspective back and forth. Like normal conversation, a dialogue also tends to lead one beyond the immediate horizon, inviting listeners (and readers) to come along. Dialogue is a common form in Western philosophy (most notably in Plato’s dialogues) yet there is also ample precedent in both Buddhist and Chinese literature. The Perfection of Wisdom Sutras, the primary scriptures of Mahayana Buddhism, are all extended dialogues between the Buddha and his disciples, while most of the Analects and the Zhuangzi are dialogues as well. The dialogue is a powerful rhetorical form, dramatic and challenging, one that demands a response from its audience.

One of the more common rhetorical forms in Buddhism is paradox, and Huineng certainly makes use of this in his teaching. Thus, for instance, he admonishes his students, “Do not depart from deceptions and errors; for they of themselves are the nature of True Reality” (section 27). Later when on the point of death, he takes his closest disciples to task for their ignorance by saying, “All of you sit down. I shall give you a verse, the verse of the true-false moving-quiet.” (section 48) There is something very tricky in such sayings, as they are seemingly contradictory if not absurd. The point of a paradox, of course, is that such absurdity is only apparent for the paradox masks a higher truth that we must divine ourselves. As such, paradox is a highly suggestive form of rhetoric, one that presents us with a basic tension, leaving it for us to resolve.

Huineng also engages in a great deal of polemics in the Platform Sutra. For example, he continually contrasts the “wise” with the “deluded.” He also draws a sharp contrast between his teachings and those of the “Northern school” (secs. 37, 39, 48-49), criticizes a student whose “practice” consists of only reciting the Lotus Sutra (sec. 42), and even converts a “spy” who seems to have come to discredit him (secs. 40-41). While a polemical style may have negative connotations it also serves several rhetorical purposes. To begin, it sets the Master and his audience apart from others, thereby emphasizing that this teaching is different or special. It also underscores the challenging nature of the teaching, and no doubt directly counters various preconceived ideas in the audience. Indeed, it may even put his disciples and audience on the defensive, thus setting them up psychologically for a deeper breakthrough.

All in all, Huineng’s teaching style is quite challenging. At times it is highly provocative, even maddening. He does not lay his subjects out neatly so that his audience can absorb what he says with ease but jars his listeners to elicit a reaction from them. His words, thus, are inherently unstable and elusive, pouring forth quixotically. They resist final definition and closure, similar to Zhuangzi’s “goblet words” (zhi yan, cf. Zhuangzi chapter 27) or what the fifth century Buddhist thinker Sengzhao terms “wild words” (kuang yan, cf. his essay “Panruo Wuzhi”). Such stylistic considerations, in the end, suggest that the ultimate message of Huineng’s sermon may not be so much what he says as how he says it and how we take up his words in our response.

5. Influences

As noted above, Huineng himself claims that nothing in his teachings originates with him, much as Confucius does in Analects 15.28. Clearly, what he iterates in the Platform Sutra derives from earlier works and there are many times when he makes explicit references to other texts, notably the Diamond, Vimalakirti, and Lotus Sutras. In addition, we should also mention the Nirvana Sutra, a text promoting the universality of the “Buddha-nature” that had a profound influence on Chinese Buddhism as a whole. The influences, however, go far beyond this short list. Huineng demonstrates knowledge of the great body of Prajna-paramita (Perfection of Wisdom) literature (of which the Diamond Sutra is one rather late example), as well as the techniques of the Madhyamika school – notably in the negation of set positions, dialectic play between “conventional” and “Ultimate” truth, and the constant challenge to any attempts at a final articulation of Buddhist truth. In addition, at certain points he reveals a basic familiarity with Pure Land doctrine (sec. 35) and some rather technical aspects of Abhidharma and Yogacara philosophy (sec. 45).

Moreover, Huineng’s teachings and style of presentation also owe a great deal to indigenous Chinese sources. This is most obvious when it comes to Daoism, as Huineng’s character and actions so often epitomize teachings found in both the Daode jing and the Zhuangzi. As for Confucian tradition, Huineng makes an obvious bow to Confucius in presenting himself as a transmitter, while his adherence to the positive power of “Buddha nature” owes at least something to the Mencian idea of “inherent goodness” of human nature, a perennial theme in Chinese philosophy that finds one of its most popular articulations in the Zhongyong (“Doctrine of the Mean”). Other scholars have even suggested that portions of the Platform Sutra may have been compiled under the influence of the Yijing.

The fact that Huineng quotes passages from such a large body of works and that scholars can so easily discern other literary influences and allusions constitutes further proof that the tradition of Huineng’s illiteracy should not be taken literally. In the Platform Sutra Huineng proves rather erudite, if not bookish. His familiarity with so much of his Buddhist and Chinese heritage challenges stereotypes of Chan as denigrating and even ignoring written texts. Indeed, scholars of Buddhism often point out the ironic fact that Chan, so often known for its dismissal of texts, has the largest body of written work of any East Asian Buddhist tradition. Furthermore, many great Chan masters (for example, Dogen, 1200-1253) were brilliant scholars and original thinkers. This paradoxical aspect of Chan, rather than being the product of centuries of institutionalization as some might claim, seems to have been there from the very beginning.

6. Critical Issues

Although the Platform Sutra is most unusual for a “philosophical” text, both in terms of style and content it raises a number of issues that are of particular philosophic import.

a. The Role of Reason and Rationality

Chan has a reputation for irrationality, allegedly insisting that practitioners cut off thinking entirely. There is some basis for such views, and in Chan history we do find examples where this seems to have been encouraged, as, for example, in the case of the Baotang school of Chan that developed in Sichuan during the eighth century. Huineng and most Chan masters, however, do not advocate a disorderly or irrational lifestyle. Their concern, instead, seems to be with the predominance of ratio (deliberative, analytic thinking) and the discursive reasoning that severs aspects of reality into discrete bits, usually from an egocentric standpoint. From a Chan perspective, this mode of understanding is the result of a highly artificial process that cuts one off from full participation in one’s immediate context and inevitably leads to suffering. Such an approach cannot be countered with rational, objective arguments because such reasoning is itself a product of such a mode of understanding. By breaking the grip of such processes on humanity, Huineng and his later followers seek to free us for a fuller, more natural life, and hence a truer life.

Much of the difficulty surrounding this subject stems from Chan’s distinctive rhetorical style, of which Huineng is a true master. In particular the notion of “no-thought” seems to suggest a sort of mindless, purely instinctual response or perhaps even unconsciousness. Certainly, “no-thought” is not rational in the sense of bare objectivity. In fact, as we have seen, “no-thought” is not this at all but more like an attitude of careful attentiveness to the situation at hand. If “no-thought” is lacking in anything it would be the self-consciousness that typically arises out of the dualism inherent in subject-object thinking. Most assuredly “no-thought” should not be equated with becoming insentient, that is, an “object” among others.

Is there a place for reason in all this? Not in the ordinary sense. However, Chan would seem to be less “irrational” than “arational,” although such labels themselves are designations arising within discursive reasoning. In the end, it may be helpful to view Huineng as espousing a type of “philosophy as propaganda,” much like Nagarjuna or the later Wittgenstein. The aim is not to argue but to change one’s way of thinking in favor of a more immediate and direct way of being.

b. Sudden vs. Gradual?

Much has been made of this notion in Chan scholarship and, indeed, Chan tradition often presents it as a conflict of “Northern Chan Gradualism” and “Southern Chan Subitism” – an alleged conflict from which the latter emerged victorious. In reality it is not so simple, but the contrast points to an unstable dynamic that lies at the heart of Buddhism and perhaps all spiritual practice. If “sudden awakening” refers to an instantaneous experience of enlightenment at which point nothing more needs to be done, then why did someone like Huineng continue to sit in meditation through his later years and exhort his students to do the same even after his death (section 53)?

In fact, what Huineng says about the contrast between “sudden” and “gradual” is anything but clear: “Good friends, in the dharma there is no sudden or gradual, but among people some are keen and others dull. The deluded recommend the gradual method, the enlightened practice the sudden teaching. . . Once enlightened, there is from the outset no distinction between these two methods; those who are not enlightened will for long kalpas be caught in the cycle of transmigration” (section 16). In part it appears that the distinction between “sudden” and “gradual” is a provisional one made from the unawakened standpoint that applies to Chan practitioners rather than the actual event of awakening itself. Yet can one move from delusion to enlightenment, from gradual to sudden? It also seems that the difference between “sudden” and “gradual” cannot refer to a temporal distinction, for even a “sudden awakening” certainly cannot be attained easily or without much practice; Huineng had several “sudden awakenings” but devoted himself to a lifetime of Chan practice.

Later Chan thinkers such as Zongmi (a.k.a. Guifeng, 780-841) were deeply concerned about these notions and sought to clarify them by speaking of “sudden awakening followed by gradual cultivation.” While intriguing, such a solution essentially erases any ultimate meaning to the sudden/gradual distinction. It also implies that claims to “sudden awakening” by Huineng and his followers were rhetorical rather than genuine.

c. The Role of Text (wen) in Life

The reputation of Chan as eschewing textual study has long been a source of controversy and great appeal. We can see this even in the “Chan motto” attributed to Bodhidharma in which the dharma is said to be a “separate transmission outside the scriptures, not relying on words and letters.” There can be no arguing that Chan presents a basic distrust of scholasticism that seems to have characterized the Chinese doctrinal schools such as Tiantai and Huayan. But does this mean that texts have no place? This would hardly seem to be warranted given what we find in the Platform Sutra. In the autobiographical portions of the Sutra Huineng has his initial awakening from hearing a text (the Diamond Sutra), demonstrates his worth through his own “dharma verse,” and receives official dharma transmission through verbal teachings from Hongren. Moreover, Huineng’s sermon is full of instances in which he unfolds the various meanings in a number of Buddhist texts. In addition, there are several passages in which Huineng draws attention to the text of his sermon itself, stating “If others are able to encounter the Platform Sutra, it will be as if they received the teaching personally from me” (section 47). The text goes on to note that Huineng’s closest disciples received his teaching, made copies of the Platform Sutra and entrusted them to later generations, all of whom were led through it to see into their own true natures.

An important clue for our understanding can be found when Huineng is preparing to give his “death verse.” Before launching into his final teaching he tells his disciples, “if you understand its meaning, you will be the same as I” (section 48). Like Sakyamuni before his passing, Huineng promises that the master will remain with his students in the form of his teachings. These teachings, compiled in textual form, will have the power to lead hearers and readers to realization of their true natures once they grasp the teachings’ true import. In this reading, the Master’s role is to open up the teachings via his own words (or actions) and so manifest their meaning; the crucial point is that these are transmitted by the Master and taken up by the students – a process that can only happen “outside the scriptures” themselves. There is an interesting parallel here to the view of the Neo-Confucian master Zhu Xi, who, in outlining the regimen of study for his disciples, emphasizes the importance of texts as a coming into the very presence of the Sages themselves.

The conclusion seems to be that Huineng does not denigrate texts per se, for they were instrumental in his own awakening and play a central role in his sermons. Instead, he (and later Chan tradition) attacks the tendency to treat them objectively, as material to be mastered rather than dharma gates leading to awakening. Treating texts in this way only feeds the ego, cutting one off from full involvement in the world. Taking texts truly as “scripture,” however, is another matter. The words of dharma are Buddha in that they allow us to perceive truth. In this view, then, those passages in the Platform Sutra calling attention to the text itself emphasize its way of connecting one with Huineng’s wisdom offered for our awakening. What we see then is that through Huineng, Chan celebrates the centrality of text, but as “live word” internalized and acted upon rather than mere marks on the page. Such an existential engagement, however, is not typically found in the modern study of philosophy and so raises questions about what “philosophy” may actually be.

d. The Relation of Action (praxis) and Knowledge (theoria)

The centrality of practice is a major refrain in Huineng’s discourse. Despite his often-cryptic comments, the Master shares the decidedly practical focus that runs through much of Chinese philosophic culture. Time and time again, Huineng exhorts us to a life of Chan action and practice, a life of Buddhahood, rather than quietistic withdrawal. Although clearly there is some sort of “theory” informing Huineng (a sinified version of tathagata-garbha teachings), this never takes precedence over practical application. In fact, Huineng (and Chan in general) refuses to distinguish between these two concepts, arguing essentially that true knowing is practical action. Thus, from this perspective nothing can be “true in theory” if it is not borne out in practice.

The priority of praxis is underscored by the fact that Chan is often regarded first and foremost as a “practice school.” In contrast to the doctrinal concerns of the Tiantai and Huayan, Chan emphasizes practices such as “no-thought” while maintaining that getting tangled up in mistaken ideas inevitably leads one astray. Since we are already Buddha, we must realize this through Buddha living. Only then are we awakened to the truth of our original (Buddha) nature.

There are some interesting analogies to Huineng’s perspective that provide much food for thought. Socrates, for example, famously argues that “to know the good is to do the good,” implying that true understanding is always attested in actual life. In a different vein, there is also Martin Heidegger’s existential analysis of dasein in which he focuses on our unreflective “being-in-the-world” as demonstrating a prior unthematized Understanding, that is, our actual (as opposed to theoretical) knowledge of things. Perhaps the most obvious analogy, however, can be found in the work of Wang Yangming (Wang Shouren, 1472-1529). Among his teachings, Wang maintained that knowing and acting formed an essential original unity that people often separate through their own selfish desires. In fact, Wang explained to one of his greatest disciples, “There have never been people who know but do not act. Those who are supposed to know but do not act simply do not know.”

e. The Centrality of Ritual (Li)

This matter has received little attention until recently but is an outgrowth of the general Chinese focus on practice. We have already seen that in the Platform Sutra Huineng constantly preaches to his charges to act upon his teachings, putting them into practice. This preaching, of course, is itself a type of Chan practice and, in fact, occurs within a ritual context and in a temple setting. Giving and listening to a “dharma talk” are both highly ritualized activities that follow their own specified rules. Furthermore, Huineng repeatedly enjoins his followers to chant certain vows aloud and to take various types of precepts. Thus the entire discourse is pervaded by a strong sense of ritual, or li. There is a strong, albeit implicit message here that Huineng is calling for participation in specific activities from all those in his audience, that is, all who hear or read the Platform Sutra.

Adherence to li, of course, has been a primary focus of Chinese culture from the very earliest times, and philosophical discussion of li has played a central role in Chinese thought since at least the time of Confucius. Moreover, li by their very nature are a form of highly regulated activity that requires repeated engagement to learn. One learns the li by doing the li. Huineng and the text of the Platform Sutra thus underscore the highly ritualized nature of Chan life, a fact that several scholars have noted and which provides yet another strong contrast to popular (mis)understandings of Chan. Rather than being an incitement to egocentric spontaneity (which would result in utter chaos, and hence more delusion and suffering), the “sudden awakening” espoused by Huineng can only occur within a ritual context in which all parties are actively engaged. Those involved are not “doing their own thing” but participating in a shared activity in which all energies are marshaled in concert. It is just for this reason that Huineng stresses the “samadhi of oneness” and Chan monastic training involves meditation training not just during periods of actual physical sitting but throughout all daily activities.

7. Impact on Later Buddhist and Chinese Philosophical Traditions

Huineng’s impact on Chan is without parallel. Not only did he articulate the major themes that came to dominate Chan discourse and practice, he provided the model of the ideal Master. By the late eighth century, two main branches of Chan existed: the “Northern” and “Southern” schools. Claiming to have studied under Huineng, Shenhui (684-758) launched an attack on the legitimacy of “Northern” Chan, which enjoyed imperial patronage during the Tang dynasty (618-907) under the leadership of Master Shenxiu (ca. 606-706) and his heir, Puji (651-739). Alleging that his own teacher was the true recipient of dharma transmission and ridiculing Shenxiu’s “gradualist” approach to awakening, Shenhui insisted that Huineng was the real Sixth Patriarch and claimed the title of Seventh Patriarch for himself. Shenhui’s claims carried the day and by the ninth century, the “Southern” school with its teaching of “sudden awakening” was accepted as the official line. Ironically, both the “Northern” and “Southern” schools eventually died out as direct lineages. It was only later that, having survived the imperial persecutions of 841-845, other Chan schools reasserted their connection(s) to Huineng and so enshrined the tale of unilinear dharma transmission.

The Platform Sutra became wildly popular in China, perhaps because of its paradoxical “Daoist” air, and numerous copies circulated. The traditional version, printed some five hundred years after the oldest version, is almost twice the size of the original due to later additions and expansions. Huineng’s idiosyncratic way of discussing the sutras, less a strict exegesis and more a performance of their message, a practice known as tichang (Japanese teisho), set the standard for a Chan “dharma talk.” Stories of Huineng are scattered throughout the various gong’an (Japanese koan) collections. Perhaps the most famous of these allegedly comes from Huineng’s confrontation with Huiming, the fierce former general who came to kill him on the mountaintop. As Huiming approached, the Master asked, “Not thinking of good, not thinking of evil, just at this moment, what is your original face before your mother and father were born?” Huiming at once became enlightened. This koan is still one of the first given to beginning students in Japanese Zen monasteries.

By inaugurating a powerful new approach to the dharma, however, Huineng had impact far beyond Buddhism and Chan. Philosophically, the strongest effect was on Neo-Confucianism, a major response of Confucian tradition to the challenges offered by Buddhism, particularly Chan. Each of the “Five Great Masters” (Zhou Dunyi, Zhang Zai, Cheng Yi, Cheng Hao, Zhu Xi) studied Chan at some point in their youth, and the records of their discussions with students as well as the anecdotes concerning their lives (collected in such works as Reflections on Things at Hand) strongly resemble later Chan collections such as the Wumen guan (The Gateless Gate). Chan influence on Wang Yangming is so great as to scarcely need comment.

As for Daoism, the most obvious impact Chan had was on the formation of the Quanzhen (“Complete Perfection”) school, a monastic sect that originated in the twelfth century. The Quanzhen sect shows blatant Chan influence, from its code of regulations and meditation techniques to the layout of its monastic compounds. The school’s founder, Wang Chongyang (1112-1170), with his cryptic teaching style and insistence on diligent practice at all times, could even be one of Huineng’s disciples.

The portrait of Huineng emerging from Chan tradition and the Platform Sutra in particular is quite compelling. The Master is portrayed as brilliant despite (or because of) his humble beginnings and takes on a truly heroic stature through his trials and eventual triumph. In his statements, Huineng comes across as immensely charismatic. He is by turns insightful, iconoclastic and humorous. Throughout his discourse he challenges his audience to leave behind intellectual preconceptions while undercutting all attempts to grasp his meaning by rational means. Ironically, during this lengthy verbal discourse he proclaims, “the practice of self-awakening does not lie in verbal arguments.” (section 38) This despite offering long harangues against Chan practitioners who have “false views.” Huineng, thus, is the archetypal Chan Master, a model for all later Chan practitioners. We can even see traces of Huineng in the character of Yoda, the great Jedi master from the Star Wars film series. At one point in Episode V: The Empire Strikes Back, Yoda famously tells his disciple Luke Skywalker, “Do, or do not — there is no ‘try’!” — a line that could be straight from the Platform Sutra. Truly, Huineng lives on.

8. References and Further Reading

  • Dumoulin, Heinrich. Zen Buddhism: A History. Vol. 1, India and China. New York: Macmillan, 1988.
    • The first in a nearly exhaustive two-volume treatment of the history of Chan/Zen Buddhism (the second volume deals exclusively with Japan). Accessible, detailed, interesting, this is a fine scholarly overview that both beginners and experts will find useful.
  • Faure, Bernard. The Rhetoric of Immediacy: A Cultural Critique of Chan/Zen Buddhism. Princeton: Princeton University Press, 1995.
  • Faure, Bernard. The Will to Orthodoxy: A Critical Genealogy of Northern Chan Buddhism. Stanford: Stanford University Press, 1997.
    • Along with Faure’s Ch’an Insights and Oversights (1993), these two works exemplify the detailed, technical studies of Chan/Zen that have emerged during the past two decades. Faure draws heavily on Postmodern figures (Foucault, Derrida) in his powerful, wide-ranging yet insightful critical “unmasking” of traditional understandings of Chan and Zen.
  • Hershock, Peter D. Chan Buddhism. Honolulu: University of Hawaii Press, 2005.
    • Part of the “Dimensions of Asian Spirituality” series, this may be the finest one volume overview of Chan/Zen available in English. Hershock skillfully steers a “middle way” between critical-historical scholarship and insight into the spiritual meaning of Chan/Zen teachings and practice. An admitted practicing Buddhist for over 20 years, Hershock fleshes out his “Zen Bones” with profiles of Huineng as well as other Chan masters (Bodhidharma, Mazu, and Linji). In the end he presents Chan/Zen as a vital practice that has the potential to help us shed our ego boundaries and open ourselves to our fellow human beings.
  • Hershock, Peter D. Liberating Intimacy: Enlightenment and Social Virtuosity in Ch’an Buddhism. Albany: State University of New York Press, 1996.
    • Hershock’s first book on Chan, presenting a unique and insightful philosophical take stressing Chan as a tradition of practice in the world. As the title suggests, Hershock maintains that Chan is a way towards achieving “liberating intimacy” with other sentient beings. A masterful refutation of charges that Chan/Zen is mere self-indulgent “navel gazing” or that it encourages antinomian or immoral behavior.
  • Jorgenson, John. Inventing Hui-neng, the Sixth Patriarch: Hagiography and Biography in Early Ch’an. Leiden: E. J. Brill Academic Publishing, 2005.
    • A recent critical analysis of the Huineng legend and the saga of Early Chan. The author uses the life of Confucius as the model on which Huineng’s biography is based. Very good at showing the influence of Confucianism, politics etc. on early Chan. The cover photo of Huineng’s alleged “mummy” alone is startling.
  • McRae, John R. The Northern School and the Formation of Early Ch’an Buddhism. Honolulu: University of Hawaii Press, 1986.
    • A major scholarly work drawing heavily on critical Japanese scholarship. McRae was one of the first to truly take on the traditional Chan/Zen story of the “Northern” versus “Southern” school.
  • Price, A.F., and Wong Mou-lam, trans. The Diamond Sutra and the Sutra of Hui-Neng. Boston: Shambhala Publications, Inc., 1990.
    • One of the special “Shambhala Dragon Editions” series, this work presents two of the most important texts in early Chan, and does so from a Chan perspective. While not scholarly by any means (there are very few notes), they definitely capture the iconoclastic spirit of Chan. As if to underscore this, a famous 13th century black ink painting of Huineng tearing up a sutra graces its cover. Wong’s translation of the Platform Sutra was the first ever done into English (in the 1930’s), and for that reason alone it is significant. It includes some episodes not in the Dunhuang version translated by Yampolsky (see below).
  • Suzuki, Daisetz Teitaro. The Zen Doctrine of No-mind: the Significance of the Sutra of Hui-Neng (Wei-Lang). York Beach, ME: Weiser Books, 1972.
    • Originally published in 1969, this is a posthumous work by one of the foremost (and controversial) popularizers of Zen in the West. While perhaps marked by a sort of “weisho quality,” this book demonstrates Suzuki’s awareness of critical scholarship on Chan/Zen tradition and a real understanding of many of the issues involved in Huineng’s “biography” and Zen teachings. Although not a roshi himself, Suzuki was never as much of an “outsider” to the Zen establishment as some of his critics have made him out to be. His personal experience with Zen training sharpened Suzuki’s insights and his comparisons with Christianity are thought provoking at the very least.
  • Yampolsky, Philip B., trans. The Platform Sutra of the Sixth Patriarch. New York: Columbia University Press, 1967.
    • Still the definitive English translation, based upon the Dunhuang manuscript. All quotations in the above are taken from Yampolsky’s translation. Heavily annotated, it includes a lengthy introduction (over 100 pages), glossary, and a critical edition of the Chinese text at the very end. A must read for anyone seeking to understand Chan tradition and its most famous Patriarch.

Author Information

John M. Thompson
Email: john.thompson@cnu.edu
Christopher Newport University
U. S. A.

Gorgias (483—375 B.C.E.)

Gorgias was a Sicilian philosopher, orator, and rhetorician. He is considered by many scholars to be one of the founders of sophism, a movement traditionally associated with philosophy, that emphasizes the practical application of rhetoric toward civic and political life. The sophists were itinerant teachers who accepted fees in return for instruction in oratory and rhetoric, and many claimed they could teach anything and its opposite (thesis and antithesis). Another aspect of their method was the ability to make the weaker argument the stronger. The term sophist in classical Greek was a general appellation denoting a “wise man.” They were important figures in Greece in the 5th and 4th centuries B.C.E., and their social success was great. Plato was the first to use the term rhêtorikê, while the sophists termed their “art” logos. Nevertheless, Gorgias is commonly associated with the development of rhetoric in classical Greece. The democratic process in Athens supplied the need for instruction in both rhetoric and philosophy.

Despite efforts by G.W.F Hegel and George Grote toward rehabilitating the reputations of Gorgias and the other sophists in the 19th century, the sophists still had a foul reputation well into the 20th century (as evidenced by the pejorative term “sophistry”). In 1930, French philosopher Jacques Maritain remarked “[s]ophistry is not a system of ideas, but a vicious attitude of the mind;” the sophists “came to consider as the most desirable form of knowledge the art of refuting and disproving by skillful arguments” (32-33). In recent years, however, modernists and post-structuralists have found great value in the philosophy of Gorgias, especially his theories on truth and language.

Table of Contents

  1. Life and Works
  2. Philosophy
    1. Ontology & Epistemology
    2. Rhetorical Theory
  3. Critics
  4. References and Further Reading

1. Life and Works

Gorgias (483-375 B.C.E.) came to Greece from Leontini in Sicily. Little is known of his life before he arrived in Athens in 427 B.C.E. as a political ambassador seeking military assistance against Syracuse, a city-state in Sicily. He delivered a series of speeches that dazzled the Athenian audiences and won him fame and admiration. Upon completion of his mission, he traveled throughout Greece as a teacher of rhetoric and as an orator, and according to Aristotle, spoke at the Panhellenic festivals (Art of Rhetoric 1414b29). He was a student of Empedocles, and according to Quintilian and others, was the teacher of Isocrates. Plato identifies Meno (Meno 76Aff) among the students of Gorgias, and he may have been one of Aspasia’s instructors as well. Many of the sophists set up schools and charged fees in return for instruction in rhetoric, and Gorgias was no exception. Philostratus (Lives of the Sophists I 9, I) tells us that Gorgias began the practice of extemporaneous oratory, and that he had the boldness to say “‘suggest a subject’ …he was the first to proclaim himself willing to take the chance, showing apparently that he knew everything and would trust the moment to speak on any subject.” He died at the age of 108 at Larissa in Thessaly.

Four works are attributed to Gorgias: On the Nonexistent or On Nature, the Apology of Palamedes, the Encomium on Helen, and the Epitaphios or Athenian Funeral Oration. The original text of On Nature has been lost; its content survives only in two different paraphrases, one in Sextus Empiricus’ Against the Professors and another in an anonymous work entitled Melissus, Xenophanes, Gorgias. There are two different manuscripts of Palamedes and Helen (the Cripps and Palatine versions), one slightly different than the other. Legal historians consider the Defense of Palamedes to be an important contribution to dicanic [explanatory] argumentation, and some cultural historians believe the Epitaphios was used as a stylistic and genre source for Plato’s Menexenus (Cosigny 2). Gorgias’ rhyming style is highly poetic, and he viewed the orator as an individual leading a kind of group incantation. He employs metaphor and figurative expressions to illustrate his assertions, and even uses humor as one instrument of refutation. The term macrologia (using more words than necessary in an effort to appear eloquent) is sometimes used to describe his oratorical technique (Kennedy 63).

2. Philosophy

Any student of Gorgias must immediately mark the distinction between his philosophy as expressed by Plato in the dialogue Gorgias (see below) and his philosophy found within the three works: On the Nonexistent, the Apology of Palamedes, and the Encomium on Helen.

a. Ontology & Epistemology

Nowhere is Gorgias’ sophistical love of paradox more evident than in the short treatise On the Nonexistent or On Nature. The subject of this work is ontological (concerning the nature of being), but it also deals with language and epistemology (the study of the nature and limitations of knowledge). In addition to this, it can be understood as an exercise in sophistical rhetoric; Gorgias tackles an argument that is seemingly impossible to refute, namely that, after considering our world, we must come to the conclusion that “things exist.” His powerful argument to the contrary proves his abilities as a master of oratory, and some believe the text was used as an advertisement of his credentials.

Gorgias begins his argument by presenting a logical contradiction, “if the nonexistent exists, it will both exist and not exist at the same time” (B3.67) (a violation of the principle of non-contradiction). He then denies that existence (to on) itself exists, for if it exists, it is either eternal or generated. If it is eternal, it has no beginning, and is therefore without limit. If it is without limit, it is “nowhere” (B3.69), and hence does not exist. And if existence is generated, it must come from something, and that something is existence, which is another contradiction. Likewise, nonexistence (to mê on) cannot produce anything (B3.71). The sophist then explains that existence can be neither “one” (hen) nor “many” (polla), since if it were one, it would be divisible, and therefore not one. If it were many, it would be a “composite of separate entities” (B3.74) and no longer the thing known as existence.

Gorgias then turns his attention to what is knowable and comprehensible. He remarks, “if things considered [imagined or thought] in the mind are not existent, the existent is not considered” (B3.77), that is to say, existence is incomprehensible. This supposition is backed up by the fact that one can imagine chariots racing in the sea, but that does not make such a thing happen. The operation of the mind (intellection) is fundamentally distinct from what happens in the real world; “the existent is not an object of consideration and is not apprehended” (B3.82). It is helpful to think of apprehension here in Aristotelian terms, as simple apprehension, the first operation of reasoning (logic) in which the intellect “grasps” or “apprehends” something. Simple apprehension happens when the mind first forms a concept of something in the world, and is anterior to judgment.

Finally, Gorgias proclaims that even if existence could be apprehended, “it would be incapable of being conveyed to another” (B3.83). This is because what we reveal to another is not an external substance, but is merely logos (from the Greek verb lego, “to say”–see below). Logos is not “substances and existing things” (B3.84). External reality becomes the revealer of logos (B3.85); while we can know logos, we cannot apprehend things directly. The color white, for instance, goes from a property of a thing, to a mental representation, and the representation is different than the thing itself. In its summation, this nihilistic argument becomes a “trilemma”:

i. Nothing exists
ii. Even if existence exists, it cannot be known
iii. Even if it could be known, it cannot be communicated.

This argument has led some to label Gorgias as either an ontological skeptic or a nihilist (one who believes nothing exists, or that the world is incomprehensible, and that the concept of truth is fictitious). But it can also be interpreted as an assertion that it is logos and logos alone which is the proper object of our inquiries, since it is the only thing we can really know. On Nature is sometimes seen as a refutation of pre-Socratic essentialist philosophy (McComiskey 37).

b. Rhetorical Theory

Most of what we know concerning Gorgias’ views on rhetoric comes from the Encomium. This work can be understood as a sophistical effort to rehabilitate the reputation of Helen of Troy. In it, Gorgias attempts to take the weaker argument and make it the stronger one, by arguing for a position contrary to well-established opinion: in this case, the opinion that Helen was to blame for the Trojan War. Gorgias argues that Helen succumbed either to (a) physical force (Paris’ abduction), (b) love (eros), or (c) verbal persuasion (logos), and in any instance, she cannot be blamed for her actions. According to Gorgias, logos is a powerful force that can be used nefariously to convince people to do things against their own interests. It can take the form of poetry (metrical language), divine incantations, or oratory. Logos is described as a “powerful lord” (B11.8) and “[t]he effect of speech upon the condition of the soul is comparable to the power of drugs over the nurture of bodies” (B11.14). This should be contrasted with the view of Isocrates that logos is a “chief” or “commander” (Nicocles 5-9). The difference here is subtle, but Gorgias’ dynastic concept of logos clearly turns it into a despotic overlord, while Isocrates’ “commander” is a leader with delegated authority, an individual who fights alongside his troops.

Examples of persuasive speech, according to Gorgias, are the “conflicts among the philosophers’ arguments in which the swiftness of demonstration and judgment make the belief in any opinion changeable” (B11.13). This is similar to the assertion of Sextus Empiricus that equally convincing arguments can be formed against, or in favor of, any subject. Gorgias may have believed in a relative notion of truth that was contingent upon a particular kairos (an opportune moment or “opening”), that is to say, truth can only be found within a given moment. He seems to reject the idea of truth as a philosophically universal principle, and thus comes into conflict with Plato and Aristotle. Nevertheless, the rhetor (orator) is ethically obligated to avoid deception, and it is “the duty of the same man both to declare what he should rightly and to refute what has been spoken falsely” (B11.2). Ultimately, Gorgias’ opinion concerning truth is difficult to ascertain, but from his writings, we can conclude that he was more concerned with rhetorical argument than the truth of any given proposition or assertion.

In the epideictic speech Defense of Palamedes, Gorgias uses a mythical narrator (Palamedes) to further illustrate his rhetorical technique and philosophy. In the myth, Palamedes was responsible for revealing Odysseus’ “madness” as a fiction, an act for which the latter never forgave him. Ultimately, Palamedes was executed for treason, after Odysseus accused him of conspiring with the Trojans. Gorgias focuses on the invention of arguments (topoi) necessary to exonerate Palamedes within the setting of a fictional trial, all of which depend upon probability. Palamedes could not have committed treason with a foreign power since he speaks no language other than Greek (B11a.6-7), and no Greek desires social power among barbarians (B11a.13). In the second example, we see that topoi “embody the values of the community, in the sense that they comprise what the community considers important” (Cosigny 84). A fundamental difference between the topoi found within Aristotle’s Art of Rhetoric and Gorgias’ topoi is that Aristotle’s are “acontextual, while Gorgias places his in the narrative context of the Palamedes myth” (McComiskey 49). Therefore, there is a direct relationship between kairos and invention.

Gorgias rejects the use of pathos (emotional appeal) in his Defense, with the assertion that “among you, who are the foremost of the Greeks …there is no need to persuade such ones as you with the aid of friends and sorrowful prayers and lamentations” (B11a.33). He prefers to use ethos (ethical appeal, or arguments from character) and logos, as his instruments of persuasion.

3. Critics

Gorgias’ most famous critic is Plato. In the dialogue Gorgias, Plato (through his mentor Socrates) expresses his contempt for sophistical rhetoric; all rhetoric is “a phantom of a branch of statesmanship (463d) …a kind of flattery …that is contemptible,” because its aim is simply pleasure rather than the welfare of the public. Nor can rhetoric be considered an art (technê), since it is irrational (465a). The end result of rhetoric is a cosmetic alteration of language that conceals truth and falsity (465b). Furthermore, rhetoric is “designed to produce conviction, but not educate people, about matters of right or wrong” (455a). The character of Gorgias in the dialogue is forced to admit that his “art” deals with opinion (doxa) rather than knowledge (epistemê); that its intention is to persuade rather than to instruct, and that rhetoric deals with language without regard to content. Gorgias is portrayed as a man with an ambivalent attitude towards truth, a relativist, who boldly asserts that it does not matter if one truly has knowledge of any given subject, only that he is perceived by others to have knowledge, and that “[r]hetoric is the only area of expertise you need to learn. You can ignore all the rest and still get the better of the professionals!” (459c).

There are a number of explanations for Plato’s antipathy towards sophistic rhetoric. The first is simply philosophical; Plato was not a relativist, nor did he believe rhetoric had a pedagogical value. But there is also a political element to be considered. Bruce McComiskey points out that Plato believed in an “oligarchic government” for Athens, while many of the sophists “favored the Athenian Democracy the way it was” (20). It is important to point out that during Gorgias’ lifetime, both Leontini and Athens were democratic city states and a loose alliance existed between the two. On a more practical level, the Greek city states also served as a market for those who would sell instruction in rhetoric.

Aristotle dismisses Gorgias as a “frigid” stylist who indulges in excessive use of compound words such as “begging-poet-flatterers” and “foresworn and well-sworn” (Art of Rhetoric 1405b34). He also faults Gorgias for overly poetic language (1406b4), and we can see examples of this in Gorgias’ description of logos as a great dynast or lord (B11.8) and as a “drug” (B11.14). The sophist compares orators to “frogs croaking in water” (B3.30), and philosophers to the “suitors of Penelope” (B3.29).

Despite efforts by G.W.F. Hegel and George Grote toward rehabilitating the reputations of Gorgias and the other sophists in the 19th century, the sophists still had a foul reputation well into the 20th century (as evidenced by the pejorative term “sophistry”). In 1930, French philosopher Jacques Maritain remarked that “[s]ophistry is not a system of ideas, but a vicious attitude of the mind;” the sophists “came to consider as the most desirable form of knowledge the art of refuting and disproving by skillful arguments” (32-33). In recent years, however, modernists and post-structuralists have found great value in the philosophy of Gorgias, especially his theories on truth and language.

4. References and Further Reading

Note: the citations above regarding Gorgias’ statements follow the alpha-numeric system used by Sprague (see below) in the text The Older Sophists (B3=On Non-Being, B11=Encomium on Helen, B11a=Defense of Palamedes).

  • Aristotle. The Art of Rhetoric. Trans. John Henry Freese. London: WM Heinemann, 1967.
  • Barrett, Harold. The Sophists: Rhetoric, Democracy, and Plato’s Idea of Sophistry. Novato: Chandler & Sharp, 1987.
  • Consigny, Scott. Gorgias: Sophist and Artist. Columbia: University of South Carolina, 2001.
  • Freeman, Kathleen. Ancilla to the Pre-Socratic Philosophers. Cambridge: Harvard, 1948.
  • Gorgias. Encomium of Helen. Trans. Douglas MacDowell. Glasgow: Bristol Classics, 1982.
  • Isocrates. Isocrates. 3 vols. Trans. George Norlin and LaRue Van Hook. Cambridge: Harvard, 1968.
  • Jarratt, Susan. “The First Sophists and the Uses of History.” Rhetoric Review 6 (1987): 67-77.
  • Jarratt, Susan C. Rereading the Sophists: Classical Rhetoric Refigured. Carbondale and Edwardsville: Southern Illinois University Press, 1991.
  • Kennedy, George. The Art of Persuasion in Greece. Princeton N.J.: Princeton University, 1963.
  • Kerferd, G.B. “The First Greek Sophists.” Classical Review 64 (1950): 8-10.
  • Marias, Julian. History of Philosophy. New York: Dover, 1967.
  • Maritain, Jacques. Introduction to Philosophy. Westminster MD: Christian Classics, 1991.
  • McComiskey, Bruce. Gorgias and the New Sophistic Rhetoric. Carbondale: Southern Illinois, 2002.
  • Plato. Gorgias. Trans. Robin Waterfield. Oxford: Oxford, 1994.
  • Poulakos, John. Sophistical Rhetoric in Classical Greece. Columbia: University Of South Carolina, 1995.
  • Schiappa, Edward. “Sophistic Rhetoric: Oasis or Mirage?” Rhetoric Review 10 (1991): 5-18.
  • Sprague, Rosamund Kent, ed. The Older Sophists. Columbia: University of South Carolina, 1972.

Author Information

C. Francis Higgins
Email: colin@louisiana.edu
University of Louisiana Lafayette
U. S. A.

Fazang (Fa-tsang, 643—712 C.E.)

The Buddhist ideologue Fazang (Fa-tsang) stands as one of the foremost figures of medieval Chinese Buddhism. He lived at the very pinnacle of Chinese Buddhism among towering figures such as the legendary pilgrim and Yogacara (Faxiang) master Xuanzang (602-664), the Chan patriarch Shenxiu (d. 706) and the great chronicler Daoxuan (596-667). According to Song dynasty biographer Zanning, he was “mysterious and upright, by nature surpassingly clever and sagacious.” For the better part of his life, he worked in close proximity with the highest echelons of imperial power, deeply engaged in matters of court and country. For four decades, under a series of emperors, he served as a lecturer, a translator, a rhetorician, a propagandist, and a miracle-worker. Tirelessly, he lectured on the Flower Garland Sutra, translated Buddhist sutras from Sanskrit and Khotanese (a Middle Iranian language once spoken in what is now China’s Xinjiang province) into Chinese, and wrote meticulously crafted commentaries interpreting Buddhist scripture in a manner that served to exalt his imperial patron’s status. Shortly after his death, the emperor Ruizong (r. 684-690, 710-712) praised him effusively: “The late monk Fazang inherited his virtuous karma from the Heavens and his open intelligence accorded with principle. With his eloquence and outstanding understanding, he had his mind interfused with penetrating enlightenment.” He would become known as the third patriarch and systematizer of the Flower Garland (Huayan or Hua-yen) school of Buddhism.

Table of Contents

  1. Biography
  2. Thought
    1. Shunyata
    2. Bodhicitta
    3. Indra’s Net
    4. The Golden Lion
  3. Works
  4. References and Further Reading
    1. Secondary Sources
    2. Primary Sources

1. Biography

Fazang was a native of Sogdiana (in Chinese, Sute). This is an Iranian civilization that encompassed territories now incorporated into the modern states of Uzbekistan and Tajikistan in Central Asia. As a youth, he embraced Buddhism with fervent devotion; at sixteen, he burned off one of his fingers as an offering to the Buddha before the Aśokan reliquary in the famous Famen Temple in the Tang dynasty capital of Chang’an. Thereafter, he became a recluse on nearby Mount Taibai, where he encountered masters of the Flower Garland (Avatamsaka) Sutra. Returning to Chang’an to attend to his ailing parents, he encountered Zhiyan (602-668) and became his student and disciple. Fazang was constantly called upon to explicate the profound wonders contained in the Flower Garland Sutra, lecturing to clergy and rulers more than thirty times.

As with many eminent Buddhists, a mystical aura has grown around Fazang in subsequent hagiography. One must investigate with a careful and critical eye the many miracles and legends that surround his person. Some of the purported miracles were closely associated with his oratorical prowess. In 689, when he delivered his lecture on the Flower Garland Sutra in Luoyang, a piece of auspicious ice was discovered in which, it is said, an image of “twinned pagodas” appeared. When Śiksānanda and he were translating the Flower Garland Sutra in Luoyang, a hundred-petaled lotus flower blossomed in front of the translation hall. After lectures in 692 and 696, light allegedly issued from Fazang’s mouth, prompting the congregated faithful to marvel. On other occasions, following his lectures, it is said that flowers fell from the heavens and five-colored clouds accumulated in the skies.

Fazang appears to have been a practitioner of esoteric Buddhism, which many East Asian rulers believed commanded magical powers. In 697, the throne requested that he use Buddhist scriptural magic to help defeat the Khitan, a proto-Mongolian ethnic group that once dominated what is now Manchuria. Fazang performed a ritual cleansing, changed clothes, set an eleven-faced image of the bodhisattva (an enlightened being who selflessly seeks to aid others) Guanyin (Kuan-yin) on a ritual platform, and worked his magic. Heavenly drums echoed, the image of Guanyin appeared on high, surveying the countless divine troops who materialized to combat the raiders, inspiring the Zhou forces and plunging the Khitan into despair. This triumph prompted the empress Wu Zhao to exclaim, “This is the blessed aegis of Buddha force!” and change the reign era name to Shengong (“Divine Merit”).

He was also renowned as a conjurer, capable of summoning weather. On multiple occasions, his prayers and rites brought timely rain to alleviate drought. In 687, at the empress’ behest, he prayed for rain, fasting for seven days, until the skies fortuitously opened and drenched the parched ground. Again, in 696, his prayers proved effective in bringing salubrious rain to afflicted Yongzhou. In 702, Fazang invited another monk to pray at Wuzhen Temple in Lantian, which had no spring. After three dawns of reciting sutras, a freshet suddenly jetted forth at Maitreya Pavilion, bringing vernal bounty to the surrounding lands. Under the emperor Zhongzong, when drought struck Chang’an, Fazang prayed and performed Buddhist rites for seven days, finally bringing a downpour. The following year his prayers for rain were successful once again. Under the emperor Ruizong, he relieved a drought and snowless winter when his sincere prayers brought down a blizzard.

In spite of his impressive monastic, scholastic, and thaumaturgical credentials, Fazang was no detached ascetic who speculated on matters recondite and metaphysical. Under Wu Zhao (a.k.a. Empress Wu or Wu Zetian, 624-705, r. 690-705), the only female emperor in Chinese history, the Buddhist clergy was politicized as never before. Contending against a Confucian tradition that stridently opposed her assumption of power, Wu Zhao naturally sought validation for her sovereignty in Buddhism. She styled herself in Buddhist terms as a cakravartin (a universal wheel-turning monarch) and a living bodhisattva. A brilliant orator, lecturer, ideologue, rhetorician and translator, Fazang was one of many Buddhist ideologues who helped sanction her sovereignty. He differed from the vast majority of her other Buddhist supporters in that he was an independent-minded and profound thinker who lectured to Wu Zhao, rather than mustering rhetoric for her. The remarkable duration and depth of their mutual commitment also stands out. For better than three decades, beginning when he preached the Flower Garland Sutra on behalf of her recently deceased mother, he applied his abundant talents toward enhancing Wu Zhao’s reputation as a Buddhist ruler.

At a pivotal juncture of Wu Zhao’s political ascent, as part of a grand ceremony early in 689 that anticipated the inauguration of her Zhou dynasty by a single year, she ordered Fazang to convene a dharma assembly and, from an elevated seat, expound upon the Flower Garland Sutra to thousands of Buddhist monks and nuns congregated for the event. When Fazang delivered a lecture at Buddha’s Prophecy Temple in Luoyang in 700 (shortly after the completion of his new translation of the Flower Garland Sutra), the ground of the lecture hall and temple purportedly shook. Rather than interpreting this earthquake in Confucian fashion, as an inauspicious disharmony of the elements, Wu Zhao understood it as a wondrous event, praising Fazang:

Because he has extended the knowledge of the subtle and profound; disseminated wisdom on the mysterious and abstruse, on the first day of translation, I dreamed that sweet dew descended as an auspicious sign. On the morning of the lecture I felt the earth tremor, a miraculous sign. This, then, was the footfall of the Future Buddha, Maitreya, using the mandala as a lucky icon.

This marriage of ideology and power did not end happily. In Wu Zhao’s turn toward Daoist expiatory rites and longevity potions during her final years, Fazang felt a shift in his patron’s imperial favor. In early 705, Fazang transported the sacred finger-bone of the Buddha from Famen Temple to Luoyang, where Wu Zhao placed him in charge of the relic veneration ceremony, which she believed might ameliorate her declining health. In this official capacity, which provided him access to her person and to the Forbidden City, Fazang worked in tandem with conspirators from the court and betrayed his longstanding patron Wu Zhao, supporting the coup that removed her in 705. A political opportunist, he continued to promote Flower Garland Buddhism serving under emperors Zhongzong (r. 684, 705-710), Ruizong, and Xuanzong (r. 712-756). Curiously, his treachery, to no small extent, saved Buddhism from being identified as a rogue ideology used to validate one whom the Confucian establishment styled an illegitimate female usurper.

Fazang’s successful promotion and propagation of Flower Garland Buddhism under successive rulers played an important role in the subsequent spread, development and Sinification of the school. Over a period of three decades, Fazang played a leading role in cooperative efforts among the corps of Indian, Khotanese, Sogdian, Korean and Chinese monks writing translations and commentaries on Buddhist sutras. In Fazang’s epistolary correspondence with Korean Flower Garland monk Ŭisang, another disciple of his master Zhiyan, it is apparent that he attempted to propagate a worldwide state without barriers, an infinite realm linked by the Mahayana Buddhist faith. Fazang also taught another Korean monk, Shimsang, who helped transmit Chinese Flower Garland Buddhism to Japan. Ultimately, these contacts helped propagate Flower Garland Buddhism, linking it to a wider pan-Asian network.

2. Thought

a. Shunyata

At the very heart of Flower Garland Buddhism is the idea of what is known in Sanskrit as shunyata (“emptiness”): universal interconnectedness, all-inclusiveness, intercausality and interpenetration. Fazang did a great deal to elevate Flower Garland Buddhism over rival schools, acknowledging other Buddhist schools and sutras, but championing the Flower Garland Sutra as the central teaching of the Buddha. As the Buddha’s first sermon upon attaining enlightenment, the nearly incomprehensible Flower Garland Sutra was invested with a profundity and wisdom unequalled in the Buddha’s subsequent works. In this effort, Fazang gathered and classified the rather unsystematic and wide-ranging Buddhist teachings into five categories in order of ascending profundity and power. In ascending order: Hinayana, Initial Mahayana, Final Mahayana, Sudden Teaching of the One Vehicle (proto-Zen), and, at the pinnacle, the Comprehensive Teaching of the One Vehicle—in essence, the Flower Garland Sutra. The sense of universality allowed the Flower Garland School to be compatible with other sects, effectively encompassing their doctrine, while maintaining the overarching primacy of the Flower Garland teachings.

b. Bodhicitta

This doctrine of interdependence is also reflected in Fazang’s thoughts on bodhicitta (mental dedication to helping all sentient beings and attaining enlightenment). Following the logic that each element pervades all that exists and itself contains all other elements in the phenomenal world, “In practicing the virtues, when one is perfected, all are perfected,” he writes, “and when one first arouses the thought of enlightenment one also becomes perfectly enlightened” (trans. Wright). Fazang’s emphasis on the omniversal generative power of the tathagatagarbha, the “womb of Buddhahood,” while not unique, subsequently developed into an important concept in the East Asian Mahayana Buddhist tradition.

So that others might better comprehend the profound doctrine of the Flower Garland Sutra, Fazang used the metaphor of the Ten Mysteries (Ten Mysterious Gates) to explicate the interconnectedness and inter-causality in the Flower Garland universe. These Ten Mysteries illustrate how seemingly contradictory pairs—the hidden and the manifest, truth and falsehood, the infinite and the infinitesimal, the general and the specific–mutually complement each other and coexist without obstruction. Indra’s net (see below) is one of the Ten Mysteries.

Fazang’s ideas of an interconnected omniverse extended easily and effectively from the metaphysical realm to the political arena. Indeed, they allowed Wu Zhao to serve as the alpha link in a cosmic concatenation. Stanley Weinstein has commented, “Seeing herself as a universal monarch, she must have been attracted by the Flower Garland school with its well-ordered universe presided over by Vairocana Buddha, whose every act was reflected in countless worlds.” This integrated and totalistic vision of the cosmos was “analogous to the highly centralized imperial state that she ruled.” This ideology allowed Wu Zhao to portray herself as an absolute sovereign, all-pervasive and omnipresent. This central idea of the boundless reach of the Buddha’s power and compassion nicely paralleled and supported the idea of the infinite compass of the ruler’s authority and benevolence. Fazang’s creative presentation and flair for theater (see below) both enhanced the great aesthetic, intellectual and philosophical appeal of his ideas and made them more comprehensible. In Wu Zhao, he found a potential cakravartin to propagate the Buddhist faith; in Fazang’s profound thought, she, in turn, discovered powerful ideological justification for her authority.

c. Indra’s Net

When Fazang first lectured on the Flower Garland Sutra, the principles he expounded upon were so abstruse that the listeners were utterly dumbstruck. Therefore, to render the sutra comprehensible to his imperial patrons and to the masses of Buddhist faithful, he used metaphors such as Indra’s Net of Jewels and the Golden Lion. In the former, “In each of the jewels, the images of all the other jewels are reflected…the images multiply infinitely, and all these multiple images are bright and clear within a single jewel.” This concatenation, this mutual linking and inter-penetration, illustrates the harmonious interconnectedness of everything. Here, the objects in the causal net cannot be conceived of independently: the nature of each object is defined by its place with relation to all other objects. He also devised a Hall of Mirrors to illustrate the workings of Indra’s Net and the power of the Buddha by arranging ten mirrors (corresponding with the Ten Mysterious Gates), eight in an octagon, one above and one below, with a statue of the Buddha set in the middle, the focal point of origin and return. When he lit a torch to illumine the centerpiece, an endless web of reflected light crisscrossed, creating an infinite series of images within images, each containing the entire Buddha. This demonstration made manifest the meaning of the inexhaustible interconnectedness of the universe, hence the infinite power of the Buddha.

d. The Golden Lion

Fazang’s most famous device of performative metaphor was a lion made of gold. The lion represents the cosmos, the parts of the lion represent the various phenomena of the universe, and the gold represents emptiness. The lion had a mane, teeth, claws and eyes: parts that seemed distinct and unrelated. And yet the essential substance of the entire lion was the same: gold. Within each hair, paradoxically, there are infinite lions. The differences are all superficial. Such is the nature of the integrated, interconnected Flower Garland universe. After demonstrating this principle to Wu Zhao using the sculpture of a lion at the imperial palace gate around 700 (sources differ), Fazang wrote a one-chapter Essay on the Golden Lion.

In his Treatise on the Five Teachings, a house is used as a metaphor for the universe. The complex interplay between joists, uprights, roof, tenons and mortises (the sum total of structural relationships between all parts) is contained in a single rafter. The nature of the infinite can be seen in the infinitesimal. The role of the rafter, or of any other component, helps one understand the interdependence of all sentient beings. Certainly, Fazang’s flair for the theatrical and his ability to convey the message to his patrons through such brilliant demonstrations helped successfully propagate Flower Garland Buddhism.

3. Works

Much of Fazang’s energy was devoted to exegetical work on and demonstrations of the Flower Garland Sutra. He produced more than sixty original works, commentaries on a wide variety of Buddhist texts, and meditation manuals, and participated in many Buddhist translation projects. Collectively, Fazang’s works and translations must be looked at not only in terms of their metaphysical and ideological merit, but as political rhetoric consciously geared toward promoting the Flower Garland school and exalting the sovereignty of his imperial sponsors. Fazang’s Treatise on the Five Teachings detailed a hierarchy of Buddhist sects, placing, of course, Flower Garland at the apex and clarifying common ideological ground.

Fazang was a propagandist. His Huayanjing zhuanji, a commentary he wrote between 690 and 693, helped provide legitimacy for Wu Zhao’s claim to be a cakravartin. Making reference to her titles as “Sage Mother” and “Divine Sovereign,” Fazang remarked, “Both sage and divine, she makes the Six Supernatural Penetrations act without stopping; infinitely good and infinitely beautiful, she displays the Ten Goodnesses beyond all limits.”

For Wu Zhao, retranslating and reinterpreting the Flower Garland Sutra was an ongoing, high-priority political activity. Fazang played a pivotal role in this effort. The Flower Garland Sutra was at the heart of a deep-rooted and longstanding Khotanese tradition of Buddhist kingship, with a Chinese lineage going from ruler Shi Hu of the Eastern Jin in the 4th century to Liang Wudi to Sui Wendi and finally to Wu Zhao. She sent emissaries to Khotan to seek the Sanskrit version of the Flower Garland Sutra. In 679, the Indian monk Divākara presented newly recovered Sanskrit sutras at Gaozong’s court. In 684, with Divākara, Fazang worked on a translation of the Flower Garland Sutra at Western Taiyuan Temple. As preparatory work for the compilation of the new Flower Garland Sutra, Fazang compared these new texts to extant translations, noting disparities and incorporating omissions. Between 695 and 699, Wu Zhao recruited Khotanese monks such as Śiksānanda and Devaprajña to work in tandem with Fazang, completing a new, improved Flower Garland Sutra that was eighty chapters instead of sixty. This new Flower Garland Sutra superseded the version completed in the 680s and helped confirm Wu Zhao’s identification as a cakravartin and a bodhisattva.

4. References and Further Reading

a. Secondary Sources

  • Chan, Wing-tsit, ed. A Source Book in Chinese Philosophy. Princeton University Press, 1963.
    • Pages 406–424 include a brief survey of Flower Garland school thought and a full translation of the “Golden Lion Essay.”
  • Chen, Jinhua. Monks and Monarchs, Kinship and Kingship: Tanqian in Sui Buddhism and Politics. Italian School of East Asian Studies Essays Series, vol. 3. Kyoto: Scuola Italiana di Studi sull’Asia Orientale, 2002.
  • Chen, Jinhua. “More Than a Philosopher: Fazang (643-712) as a Politician and Miracle-worker.” History of Religions 42.4 (May 2003): 320-358.
  • Cook, Francis. Hua-yen Buddhism: The Jewel Net of Indra. Penn State University Press, 1977.
  • DeBary, Wm. Th., et al, eds. Sources of Chinese Tradition, Vol I., 2nd ed. Columbia University Press, 1999.
    • Pp. 471-476 include sections from the Flower Garland Sutra such as “The Tower of Vairocana” and “Indra’s Net.”
  • Fang, Litian. Huayan jin shizi zhang jiaoshi, Zhongguo Fojiao dianji xuankan. Zhonghua, 1996.
  • Forte, Antonino. A Jewel in Indra’s Net: The Letter Sent by Fazang in China to Ŭisang in Korea. Italian School of East Asian Studies Occasional Papers 8. Kyoto, 2000.
  • Forte, Antonino. Mingtang and Buddhist Utopias in the History of the Astronomical Clock: The Tower, the Statue and the Armillary Sphere Constructed by Empress Wu. Rome, 1988. See pp. 121-122.
  • Forte, Antonino. Political Propaganda and Ideology in China at the End of the Seventh Century. Naples, 1977.
  • Fox, Alan. “Fazang.” Great Thinkers of the Eastern World, ed. Ian P. McGreal (HarperCollins, 1995), 99-103.
  • Gu, Zhengmei. “Wu Zetian de Huayan jing: Fowang chuantong yu fowang xingxiang.” Guoxue yanjiu 7 (2000): 279-321.
  • Liu, Ming-Wood. “The Harmonious Universe of Fa-tsang and Leibniz.” Philosophy East and West 32 (1982): 61-76.
  • Rothschild, Norman H. Sub-chapter “Fazang” in “Rhetoric, Ritual and Support Constituencies in the Political Authority of Wu Zhao, Woman Emperor of China.” Ph.D. dissertation, Brown University, 2003.
  • Weinstein, Stanley. “Imperial Patronage in T’ang Buddhism.” Perspectives on the T’ang, eds. Arthur F. Wright and Denis C. Twitchett (Yale University Press, 1973), 265-306.
  • Weinstein, Stanley. Buddhism in T’ang China. Cambridge University Press, 1987.
  • Wright, Dale. “The ‘Thought of Enlightenment’ In Fa-tsang’s Hua-yen Buddhism.” The Eastern Buddhist (Fall 2001): 97-106.

b. Primary Sources

  • Ch’oe Ch’iwŏn (Cui Zhiyuan), Da Tang Jianfusi gu shu fanjing dade Fazang heshang zhuan (Taisho Tripitaka, vol. 50, no. 2054).
    • Biography.
  • Daoxuan, Xu Gaoseng zhuan (Biographies of Eminent Monks), Taisho Tripitaka, vol. 50, no. 2060.
    • Biography.
  • Fazang, Dasheng qixinlun yiji, Taisho Tripitaka, vol. 44, no. 1846.
  • Fazang, Fanwang jing pusa jieben shu, Taisho Tripitaka, vol. 40, no. 1813.
    • Commentary on Brahmajala sutra.
  • Fazang, Huayanjing tanxuan ji (Taisho Tripitaka, vol. 35, no. 1733).
    • Commentary on the profundities of the Flower Garland Sutra.
  • Fazang, Huayan jing wenyi gangmu, Taisho Tripitaka, vol. 35, no. 1734.
    • Explicates the ten mysterious gates (Ten Mysteries) from the Flower Garland Sutra.
  • Fazang, Huayanjing zhigui (Taisho Tripitaka, vol. 45, no. 1871).
    • Commentary on the Flower Garland Sutra.
  • Fazang, Huayanjing zhuanji (Taisho Tripitaka, vol. 51, no. 2073).
    • Propaganda supporting Wu Zhao’s sovereignty written between 690 and 693.
  • Fazang, Huayan Wujiao zhang (Treatise of the Five Teachings), Taisho Tripitaka, vol. 45, no. 1866.
    • Central work that classifies Buddhist teachings and situates the Flower Garland Sutra at the apex.
  • Fazang, Jin shizi zhang (Essay on the Golden Lion), Taisho Tripitaka, vol. 45, no. 1881.
  • Yan Chaoyin, “Da Tang Jianfusi gu dade Kangzang fashi zhi bei,” Taisho Tripitaka, vol. 50, no. 2054.
    • Funerary epitaph.
  • Zanning, Song Gaoseng zhuan, Taisho Tripitaka, vol. 50, no. 2061.
  • Zhipan, Fozu tongji, Taisho Tripitaka, vol. 49, no. 2035.
    • Biography is fascicle 29 of this Southern Song dynasty (1127-1279) work.

Author Information

Norman Harry Rothschild
Email: hrothsch@unf.edu
University of North Florida
U. S. A.

Epistemic Closure Principles

Epistemic closure principles state that the members of an epistemic set (such as propositions known by me) bear a given relation (such as known deductive entailment) only to other members of that epistemic set.  The principle of the closure of knowledge under known logical entailment is that one knows everything that one knows to be logically entailed by something else one knows.  For instance, if I know grass is green, and I know that grass is green deductively entails that grass is green or the sky is blue, then I know that grass is green or the sky is blue.  Epistemic closure principles are employed in philosophy in myriad ways, but some theorists reject such principles, and they remain controversial.

Some people see closure principles as capturing the idea that we can add to our store of knowledge by accepting propositions entailed by what we know; others claim that this is a misunderstanding, and that closure principles are silent as to how a piece of knowledge is, or can be, acquired.  For instance, the proposition I have a driver’s license issued by the state of North Carolina entails that North Carolina is not a mere figment of my imagination.  According to the principle that knowledge is closed under known entailment, if I know the former claim, and I know the entailment, I know the latter claim.  Some insist, however, that this must be distinguished from the (possibly) false claim that I could come to know the latter on the basis of my knowing the former, since my basis for knowing the former involves presupposing the latter (by taking my sense experience and memory at more or less face value, for instance).

Closure principles are employed in both skeptical and anti-skeptical arguments.  The skeptic points out that if one knows an ordinary common sense proposition (such as that one has hands) to be true, and knows that this proposition entails the falsity of a skeptical hypothesis (such as that one is a handless brain in a vat, all of whose experiences are hallucinatory), one could know the falsity of the skeptical hypothesis, in virtue of knowledge being closed under known entailment.  Since one cannot know the falsity of the skeptical hypothesis (or so the skeptic maintains), one also must not know the truth of the common sense claim that one has hands.  Alternatively, the anti-skeptic might insist that we do know the truth of the common sense proposition, and hence, in virtue of the closure principle, we can know that the skeptical hypothesis is false.  Although the closure principle is sometimes used by anti-skeptics, some view the rejection of closure as the key to refuting the skeptic.
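The skeptical and anti-skeptical arguments just described share the closure premise and differ only in which further premise they accept. The following is a minimal sketch, not a piece of epistemic logic: it treats three knowledge claims as plain truth-valued atoms (the names `kh`, `ke` and `kn` are labels assumed here for illustration) and checks by brute-force truth table which arguments are valid. It also sets aside the modal "could know" of the original wording.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b

def is_valid(premises, conclusion) -> bool:
    """Valid iff no assignment makes every premise true and the conclusion false."""
    for row in product([True, False], repeat=3):
        if all(p(*row) for p in premises) and not conclusion(*row):
            return False
    return True

# Atoms (illustrative labels, not from the article):
#   kh: S knows that she has hands
#   ke: S knows that having hands entails the falsity of the skeptical hypothesis
#   kn: S knows that the skeptical hypothesis is false
closure = lambda kh, ke, kn: implies(kh and ke, kn)
knows_entailment = lambda kh, ke, kn: ke
knows_hands = lambda kh, ke, kn: kh
knows_not_sk = lambda kh, ke, kn: kn
not_knows_hands = lambda kh, ke, kn: not kh
not_knows_not_sk = lambda kh, ke, kn: not kn

# Skeptic: closure, ke, not-kn, therefore not-kh.
skeptic_valid = is_valid([closure, knows_entailment, not_knows_not_sk], not_knows_hands)
# Anti-skeptic: closure, ke, kh, therefore kn.
anti_skeptic_valid = is_valid([closure, knows_entailment, knows_hands], knows_not_sk)
# Skeptic's argument with the closure premise removed.
skeptic_without_closure = is_valid([knows_entailment, not_knows_not_sk], not_knows_hands)

print(skeptic_valid, anti_skeptic_valid, skeptic_without_closure)  # True True False
```

Both arguments come out valid on this toy rendering, so the dispute turns on which premise to accept. Dropping the closure premise makes the skeptic's argument invalid, which is why some see rejecting closure as a reply to the skeptic.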

Table of Contents

  1. The Closure of Knowledge under Known Entailment
    1. The Closure of Knowledge Under Entailment
    2. The Closure of Knowledge Under Known Entailment
    3. Justification, Single-Premise and Multiple-Premise Closure
  2. Philosophical Uses of the Closure Principle
  3. Externalist Accounts of Knowledge and the Rejection of Closure
    1. Epistemic Externalism and Internalism
    2. Nozick’s Tracking Account of Knowledge and the Failure of Closure
    3. Dretske’s Externalist Account of Knowledge and Closure Failure
    4. “Abominable Conjunctions”
    5. Alternative Anti-Skeptical Strategies Need Not Reject Closure
    6. Some Skeptical Arguments do not Employ Closure
  4. Dogmatism and the Rejection of Closure
  5. The McKinsey Paradox, Closure, and Transmission Failure
    1. The McKinsey Paradox
    2. Davies, Wright, and the Closure/Transmission Distinction
  6. Ordinary Propositions, Lottery Propositions, and Closure
  7. References and Further Reading
    1. References
    2. Further Reading

1. The Closure of Knowledge under Known Entailment

a. The Closure of Knowledge Under Entailment

A set is closed under a particular relation if all the members of the set bear the relation only to other members of the set. The set of true propositions is closed under entailment because true propositions entail only other truths. Since false propositions sometimes entail truths, false propositions are not closed under entailment. Epistemic closure principles state that members of an epistemic set (such as my justified beliefs) are closed under a given relation (which may be a non-epistemic relation, like entailment, or an epistemic one, such as known entailment).
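The definition of closure can be illustrated with a small sketch. It assumes a toy universe of four propositions with stipulated truth values and a hand-written list of entailment pairs; the function name `is_closed_under` and the example propositions are hypothetical and serve only to illustrate the definition.

```python
from typing import Hashable, Iterable, Set, Tuple

def is_closed_under(members: Set[Hashable],
                    relation: Iterable[Tuple[Hashable, Hashable]]) -> bool:
    """True if every pair (a, b) in the relation with a in members also has b in members."""
    return all(b in members for a, b in relation if a in members)

# Stipulated truth values for a toy set of propositions (an assumption of this sketch).
truth = {
    "grass is green": True,
    "grass is green or the sky is blue": True,
    "grass is purple": False,
    "grass is purple or grass is green": True,
}

# Entailment pairs (premise, conclusion); each is an instance of disjunction introduction.
entails = [
    ("grass is green", "grass is green or the sky is blue"),
    ("grass is purple", "grass is purple or grass is green"),
]

true_props = {p for p, value in truth.items() if value}
false_props = {p for p, value in truth.items() if not value}

print(is_closed_under(true_props, entails))   # True: truths entail only truths here
print(is_closed_under(false_props, entails))  # False: a falsehood entails a truth
```

The second check fails because the false proposition "grass is purple" entails a true disjunction, which is the sense in which false propositions are not closed under entailment.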

A simple closure principle is the principle that knowledge is closed under entailment:

If a subject S knows that p, and p entails q, then S knows that q.

Less schematically, this says that if one knows one thing to be true and the known claim logically entails a second thing, then one knows the second thing to be true. This principle has obvious counter-examples. A complicated theorem of logic is entailed by anything (and hence by any proposition one knows), but one may not realize this and may thus fail to believe (or even grasp) the theorem. Since one must at least believe a proposition in order to know that it is true, we see that one may fail to know something entailed by something else that one knows. Additionally, even if a proposition is entailed by something one knows, if one comes to believe the proposition through some epistemically unjustified process, one will fail to know the proposition (since one’s belief of it will be unjustified). For instance, if a subject knows that she will start a new job today and then comes to believe, on the basis of her astrologer’s testimony, that she will either start a new job today or meet a handsome stranger, then perhaps she will fail to know the truth of the entailed disjunction.

b. The Closure of Knowledge Under Known Entailment

It is more plausible that knowledge is closed under known entailment:

If S knows that p, and knows that p entails q, then S knows that q.

As stated, however, the principle seems vulnerable to counter-examples similar to the ones just discussed. The subject might fail to put her knowledge that p together with her knowledge that p entails q and thus fail to infer q at all. One might know that she has ten fingers and that if she has ten fingers then the number of her fingers is not prime, but simply not bother to go on to deduce and form the belief that her number of fingers is not prime. Alternatively, although the subject could have come to believe q by inferring it correctly from something else that she knows (since she is aware of the entailment), she instead might have come to believe q through some other, epistemically unjustified, process.

How can we capture the idea that one can add to one’s store of knowledge by recognizing and assenting to what is entailed by what one already knows? This formulation seems suitably qualified:

If S knows that p, and comes to believe that q by correctly deducing it from her belief that p, then S knows that q.

Less formally, if I know one thing, correctly deduce another thing from it, and come to believe this second thing by so deducing it, then I know the second thing to be true. This principle eliminates counterexamples in which the subject fails to believe the entailed claim (and thus fails to know it) or comes to believe the entailed claim for bad reasons (and thus fails to know the claim). (Henceforth, uses in this article of the phrase “the principle of closure of knowledge under known entailment” should be regarded as referring to this preferred formulation of the principle).

So much is built into the antecedent of this principle that it might now seem trivial but, as we shall see, it has been disputed on various grounds.

c. Justification, Single-Premise and Multiple-Premise Closure

We would seem to have similar grounds for supposing that justified belief is closed under known entailment. One is epistemically justified in believing whatever one correctly deduces from one’s justified beliefs. This captures the idea that one way to add to one’s store of justified beliefs is to believe things entailed by one’s justified beliefs. When one reasons validly, the justification that one has for the premises carries over to the conclusion.

The mere fact that justification is (ordinarily taken to be) one of the necessary conditions for knowledge does not strictly entail that justification is closed under the same operations (such as known entailment) that knowledge is closed under. As Steven Hales (1995) has pointed out, to argue in this manner is to commit the fallacy of division: to infer from the fact that a whole thing has a particular quality, that each of its components must have this quality as well. For instance, it does not follow from the fact that the glee club is loud that each, or even any, of the individual singers in the glee club is loud. Knowledge might be closed under known entailment even if justified belief is not, if all the counterexamples to the closure of justification were examples in which the justified belief was missing at least one of the necessary conditions for knowledge. There seems to be no particular reason to believe that this is the case, however. (See Brueckner 2004 for more on this point).

The closure principles discussed thus far are instances of single premise closure. For instance, one’s knowledge that a given particular premise is true, when combined with a correct deduction from that premise of a conclusion, seems to guarantee that one knows the conclusion. There are also multiple premise closure principles. Here is an example:

If S knows that p and knows that q, and S comes to believe r by correctly deducing it from p and q, then S knows that r.

That is, if I know two things to be true and can deduce a third thing from the first two, then I know the third thing to be true. There is good reason to be dubious of multiple premise closure principles of justification, such as

If S is justified in believing that p and justified in believing that q, and S correctly deduces r from p and q, then S is justified in believing that r.

Lottery examples reveal the difficulty. Given that there are a million lottery tickets and that exactly one of them must win, it is plausible (though not obvious) that for any particular lottery ticket, I am justified in believing that it will lose. So I am justified in believing that ticket one will lose, that ticket two will lose, and so forth, for every ticket. But if I know that there are a million tickets, and I am justified in believing each of a million claims to the effect that ticket n will lose and I can correctly deduce from these claims that no ticket will win, then by closure I would be justified in concluding that no ticket will win, which by hypothesis is false. Justified belief is fallible, in that one can be justified in believing something even if there is a chance that one is mistaken; conjoin enough of the right sort of justified but fallible beliefs and the resulting conjunction will be unlikely to be true, and thus unjustified.

If knowledge, like justified belief, is fallible (say, only 99.9% certainty is required), then multiple premise closure principles for knowledge will fail as well. One could be sufficiently certain for knowledge about each of a thousand claims (“I will not die today”; “I will not die tomorrow”; …; “I will not die exactly 569 days from today”; etc.), but not sufficiently certain of the conjunction of these claims (“I will not die on any of the next thousand days”) in order to know it, even though it is jointly entailed by those thousand known claims (and thus true). The fallibility of knowledge is far more controversial than the fallibility of justified belief, however.
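The arithmetic behind this point can be sketched briefly. The Python below assumes, purely for illustration, that the thousand claims are probabilistically independent and that each is held with exactly 99.9% certainty; the names `THRESHOLD` and `conjunction_probability` are introduced here and are not part of any standard account.

```python
# Hypothetical bar: the level of certainty assumed (for illustration)
# to be required for knowledge.
THRESHOLD = 0.999


def conjunction_probability(p_each: float, n_claims: int) -> float:
    """Probability that n independent claims, each with probability p_each, are all true."""
    return p_each ** n_claims


p_each = 0.999  # each "I will not die on day k" claim just meets the bar
p_all = conjunction_probability(p_each, 1000)

each_meets_bar = p_each >= THRESHOLD        # every single claim qualifies
conjunction_meets_bar = p_all >= THRESHOLD  # the conjunction does not (p_all is about 0.37)
```

On these assumptions each conjunct clears the threshold while the conjunction falls far below it, which is the structure of the failure of multiple premise closure described above.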

Similarly, closure might be thought to hold for different types of knowledge, such as a priori knowledge (i.e. knowledge not gotten through sense experience, to oversimplify a bit). If one knows a priori that p, and knows a priori that p entails q, then one knows a priori that q. Intuitively, it seems that if one knows the premises of an argument a priori and is able to validly deduce a conclusion from those premises, one would know the conclusion a priori as well. This last point is on weaker ground, however, as discussed in Section 5b.

2. Philosophical Uses of the Closure Principle

The closure principle, now qualified to handle the straightforward counterexamples, has been employed in skeptical and anti-skeptical arguments, in support of a dogmatic refusal to pay attention to evidence that counts against what one knows, to generate a paradox about self-knowledge, and for many other philosophical ends. These uses are described in brief in this section, and in greater detail in later sections.

The skeptic may argue as follows:

  1. I do not know that I am not a handless, artificially stimulated brain in a vat.
  2. I do know that “I have hands” entails “I am not a handless, artificially stimulated brain in a vat.”
  3. If I know one thing, and I know that it entails a second thing, then I also know the second thing. (Closure)
  4. Thus, I do not know that I have hands. (From 2 and 3, if I knew I had hands I would know that I am not a brain in a vat, in contradiction with 1).

If one really knew the ordinary common sense claim to be true, one could deduce the falsity of the skeptical claim from it and come to know that the skeptical claim is false (by closure). The fact that one cannot know that the skeptical claim is false (as per the first premise) demonstrates that one does not in fact know that the common sense proposition is true either. (See also Contemporary Skepticism).

But one person’s modus tollens (the inference from if p then q and not-q to the conclusion not-p) is another person’s modus ponens (the inference from if p then q and p to the conclusion q), as we can see from an anti-skeptical argument of the sort associated with G.E. Moore. (See Moore 1959).

  1. I know that I have hands.
  2. I know that “I have hands” entails “I am not a handless, artificially stimulated brain in a vat.”
  3. If I know one thing, and I know that it entails a second thing, then I also know the second thing. (Closure)
  4. Thus, I know that I am not a handless, artificially stimulated brain in a vat.

From the fact that one knows that she has hands and this is incompatible with a skeptical hypothesis under which her hands are illusory, one can infer, and thus come to know (if closure is correct), the falsity of the skeptical hypothesis.
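The logical relationship between the skeptical and the Moorean arguments can be checked mechanically. The sketch below is a toy model, not a formalization drawn from the literature: it treats "I know that I have hands" and "I know that I am not a brain in a vat" as two Boolean variables (the names `knows_hands` and `knows_not_biv` are assumptions), takes the known entailment in premise 2 as given, and asks which combinations survive the relevant instance of closure.

```python
from itertools import product


def closure_holds(knows_hands: bool, knows_not_biv: bool) -> bool:
    """Closure instance: if I know I have hands, then I know I am not a brain in a vat."""
    return (not knows_hands) or knows_not_biv


# Every combination of the two knowledge claims that is compatible with closure.
consistent = [
    (knows_hands, knows_not_biv)
    for knows_hands, knows_not_biv in product([True, False], repeat=2)
    if closure_holds(knows_hands, knows_not_biv)
]

# The skeptic adds premise 1: I do not know that I am not a brain in a vat.
skeptic = [pair for pair in consistent if not pair[1]]

# The Moorean adds premise 1: I know that I have hands.
moorean = [pair for pair in consistent if pair[0]]
```

With closure in place, the skeptic's premise leaves only the case in which one does not know one has hands, and the Moorean's premise leaves only the case in which one knows one is not a brain in a vat. The combination of knowing one has hands while not knowing one is not a brain in a vat is excluded in both cases.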

The closure principle can be used even in defense of a dogmatic rejection of any recalcitrant evidence that counts against something that one takes oneself to know. The argument runs as follows (adapted from Harman 1973):

  1. I know my car is parked in Lot A. (Assume)
  2. I know that if my car is parked in Lot A, and there is evidence that my car is not parked in Lot A (say, testimony that the car has been towed), then the evidence is misleading. (Analytic, since evidence against a truth must be misleading)
  3. Thus, I know that any evidence that my car is not parked in Lot A is misleading. (Closure)
  4. I know that there is evidence that my car is not parked in Lot A. (Assume)
  5. Thus, I know that this evidence (testimony that my car was towed) is misleading. (Closure)
  6. If a piece of evidence is known by me to be misleading, then I ought to disregard it. (Analytic)
  7. Thus, I ought to disregard any evidence that my car is not parked in Lot A. (From 5 and 6)

This result seems paradoxical, however, as most would claim that it is epistemically irresponsible to ignore all the evidence against what one takes oneself to know, simply because it is evidence against what one takes oneself to know. It is plausible (though hardly obvious) that one takes oneself to know each thing that one believes (considered individually). If this is conjoined with the argument above, it entails that one ought to ignore any evidence against what one believes. This seems to be an even more ill-considered policy.

The closure principle also figures prominently in a paradox about self-knowledge and knowledge of the external world. It is now widely accepted that some thought contents are individuated externally. That is, there are some thought contents that one could not have unless one was in an environment or linguistic community that is a certain way. On this view, one could not think the thought that water is wet were one not in an environment with water, or at least with some causal connection to water. Given content externalism, it seems we may argue as follows (the argument is due to McKinsey 1991):

  1. I know that I have mental property M (say, the thought that water is wet). (Assume privileged access to one’s own thoughts)
  2. I know that if I have mental property M (the thought that water is wet), then I meet external conditions E (say, living in an environment containing water). (Externalism with respect to content)
  3. If I know one thing, and I know that it entails a second thing, then I know the second thing. (The principle of the closure of knowledge under known entailment).
  4. Thus, I know that I meet external conditions E (namely, that I live in environs containing water). (From 1, 2 and 3)

The conclusion follows from an application of the closure principle, but what makes this paradoxical is that it appears that the knowledge that is attributed in the premises depends on reflection alone (introspection plus a priori reasoning), whereas the knowledge attributed in the conclusion is empirical. If the premises are correct, and closure holds, I can know an empirical fact by reflection alone (since I know it on the basis of premises that can be known by reflection alone). Something seems to have gone wrong and it is unclear which premise, if any, is the culprit.

Closure principles figure in another philosophical puzzle about knowledge of “ordinary propositions,” those we ordinarily take ourselves to know, and “lottery propositions,” those that, although extremely likely, we do not ordinarily take ourselves to know. Suppose that one is struggling to get by on a pensioner’s income. It seems plausible to say that one knows one will not be able to afford a mansion on the French Riviera this year. However, that one will not be able to afford the mansion this year entails that one will not win the lottery. By the closure principle, since one knows that one will not be able to afford the mansion, and knows that this entails that one will not win the lottery, one must know that one will not win the lottery. However, very few are inclined to accept that one knows one will not win the lottery. After all, there’s a chance one could win.

3. Externalist Accounts of Knowledge and the Rejection of Closure

a. Epistemic Externalism and Internalism

To determine whether someone is epistemically justified in believing something, one must do so from a particular point of view. One may consider the point of view of the agent who holds the belief or of someone who possesses all the relevant information (which may be unavailable to the agent). To oversimplify, those who consider only the subject’s perspective when evaluating the subject’s epistemic justification are epistemic internalists, and those who adopt the point of view of one with all the relevant information are epistemic externalists. An account of epistemic justification is internalist if it requires that all the elements necessary for an agent’s belief to be epistemically justified are cognitively accessible to the agent; that is, these elements (say, evidence or reasons) must be internal to the agent’s perspective. Externalist theories of justification, on the other hand, allow that some of the elements necessary for epistemic justification (such as a belief’s being produced by a process that makes it objectively likely to be true) may be cognitively inaccessible to the agent and external to the agent’s perspective.

There are so many varieties of internalism and externalism that further generalization is perilous. Considering the theories’ respective treatments of the problem of induction illustrates the basic difference between them. Hume famously argued that although we rely on inductive inferences, we have access to no non-question-begging justification for doing so, as our only grounds for thinking that induction will continue to be reliable is that it always has been reliable. This is an inductive justification of the belief that induction is epistemically justified. If Hume is right, then a typical internalist will concede that beliefs based on inductive reasoning are not epistemically justified. An externalist, however, might insist that such beliefs are justified, provided that inductive reasoning as a matter of fact is a process that reliably produces mostly true beliefs, whether the agent who reasons inductively has access to that fact or not. On the other hand, an epistemic internalist might rate the beliefs of a brain in a vat or a victim of Cartesian evil demon deception as epistemically justified, provided that they were formed in a way that seems reasonable from the point of view of the agent (the brain in a vat), such as through the careful consideration of evidence (evidence, albeit, that is misleading). The epistemic externalist, however, likely would rate such an agent’s beliefs as unjustified, on the basis of evidence not accessible to the agent, such as that the belief-forming processes she relies on make her beliefs extremely likely to be false.

For the most part, internalist accounts of knowledge are those that appeal to an internalist conception of epistemic justification and externalist accounts of knowledge employ an externalist conception of justification. (Alternatively, one may be an internalist about justification and an externalist about knowledge, by rejecting the view that epistemic justification is one of the requirements for knowledge.) Perhaps the greatest challenge to closure principles for knowledge comes from externalist theories of knowledge, notably those of Robert Nozick and Fred Dretske.

b. Nozick’s Tracking Account of Knowledge and the Failure of Closure

It strikes many that some version of the closure principle must be true. The idea that no version of the principle is true is, according to one noted epistemologist, “one of the least plausible ideas to come down the philosophical pike in recent years.” (Feldman 1995) Nevertheless, philosophers have argued against the epistemic closure principle on many different grounds. One serious challenge to closure arose from those who proposed the “tracking” analysis of knowledge (notably Nozick 1981). According to the tracking theory, to know that p is to track the truth of p. That is, one’s true belief that p is knowledge if and only if the following two conditions hold: if p were not the case, one would not believe that p, and if p were the case, one would believe that p. For one’s belief that p to be knowledge, one’s belief must be sensitive to the truth or falsity of p; that sensitivity is captured by the two subjunctive conditions above. One knows that Albany is the capital of New York only if one would not believe it if it were false, and would believe it if it were true. (See also Robert Nozick’s epistemology).

This is an externalist theory of knowledge because whether or not an agent satisfies the subjunctive conditions for knowledge may not be cognitively accessible to the agent. To evaluate an agent’s belief, with respect to whether it meets those conditions, it may be necessary to adopt the point of view of someone with information not accessible to the agent.

Let’s illustrate this with an example similar to Nozick’s own (1981, 207). Let p be the belief that one is sitting in a chair in Jerusalem. Let q be the belief that one’s brain is not floating in a tank on Alpha Centauri, being artificially stimulated so as to make one believe one is sitting in a chair in Jerusalem. Suppose one has a true belief that p. In the “closest” counterfactual situations (to employ the terminology of one account of truth-conditions for subjunctives) in which p is false (say, one is standing in Jerusalem, or one is sitting in Tel Aviv), one will not believe p. In close counterfactual situations in which one is sitting in Jerusalem, one does believe that p. One’s belief of p tracks the truth of p and thus counts as knowledge.

Suppose, on the other hand, that one has a true belief that q. If one’s belief that q were false, however (and one really was in this predicament on Alpha Centauri), one would still believe (falsely) that q, namely that one was not in a tank on Alpha Centauri. One’s belief that q, while actually true, does not track the truth of q: a belief that tracks the truth is held when q is true but not when q is false, and this belief would be held either way. Hence, the belief that q does not count as knowledge.

How does this relate to the closure of knowledge? The proposition that one is sitting in Jerusalem (p) entails that one’s brain is not floating in a tank in Alpha Centauri, being stimulated so as to make one think that one is sitting in Jerusalem (q). We may suppose that one can correctly deduce q from p. Even so, since one’s belief that p tracks the truth of p and counts as knowledge and one’s belief that q does not do so, knowledge fails to be closed under known entailment. One may know that p, and know that p entails q (and come to believe the latter by correctly deducing it from the former), and yet fail to know that q.
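The Jerusalem example can be rendered as a toy possible-worlds model. Everything in the sketch below is an illustrative assumption rather than Nozick's own formalism: the list of worlds, the numerical "distance" standing in for closeness to the actual world, and the function names `sensitive`, `adherent`, and `tracks`. The subjunctive conditions are evaluated, in the simplified style of the closest-worlds account of counterfactuals, by looking only at the nearest worlds in which the antecedent holds.

```python
from dataclasses import dataclass


@dataclass
class World:
    name: str
    distance: int   # hypothetical closeness to the actual world (0 = actual)
    facts: dict     # proposition label -> whether it is true in this world
    beliefs: dict   # proposition label -> whether the agent believes it here


# p: one is sitting in a chair in Jerusalem.
# q: one is not a brain in a tank on Alpha Centauri fed Jerusalem experiences.
WORLDS = [
    World("actual: sitting in Jerusalem", 0,
          {"p": True, "q": True}, {"p": True, "q": True}),
    World("sitting in Jerusalem, another chair", 1,
          {"p": True, "q": True}, {"p": True, "q": True}),
    World("standing in Jerusalem", 1,
          {"p": False, "q": True}, {"p": False, "q": True}),
    World("sitting in Tel Aviv", 1,
          {"p": False, "q": True}, {"p": False, "q": True}),
    World("envatted on Alpha Centauri", 100,
          {"p": False, "q": False}, {"p": True, "q": True}),
]


def closest(worlds):
    """Return the worlds at minimal distance from the actual world."""
    if not worlds:
        return []
    nearest = min(w.distance for w in worlds)
    return [w for w in worlds if w.distance == nearest]


def sensitive(prop: str) -> bool:
    """If prop were false, the agent would not believe prop."""
    return all(not w.beliefs[prop]
               for w in closest([w for w in WORLDS if not w.facts[prop]]))


def adherent(prop: str) -> bool:
    """If prop were true (in nearby non-actual worlds), the agent would believe prop."""
    return all(w.beliefs[prop]
               for w in closest([w for w in WORLDS
                                 if w.facts[prop] and w.distance > 0]))


def tracks(prop: str) -> bool:
    """True belief that satisfies both subjunctive conditions."""
    actual = WORLDS[0]
    return (actual.facts[prop] and actual.beliefs[prop]
            and sensitive(prop) and adherent(prop))


knows_p = tracks("p")  # the belief that p tracks the truth
knows_q = tracks("q")  # the belief that q is insensitive, so it does not
```

In this model the belief that p counts as knowledge and the belief that q does not, even though p entails q, which is the pattern of closure failure described above.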

Nozick’s account has at least two virtues. One is that the tracking analysis of knowledge is plausible. The other is that the rejection of closure allows us to reconcile the following two claims, both of which seem plausible but had seemed incompatible: (1) we do know many common sense propositions, such as that I have hands, and (2) we do not know that skeptical hypotheses, such as that I am a handless, artificially stimulated brain in a vat, are false. One desideratum of a theory of knowledge is that it refutes skepticism while accounting for the plausibility and persuasiveness of the skeptic’s case against common sense knowledge claims. Both the skeptic and the Moorean anti-skeptic come up short here. The skeptic must deny our common sense knowledge claims and the Moorean must maintain that we can know the falsity of skeptical hypotheses. As long as we accept the closure principle, whether we are skeptics or anti-skeptics, we cannot maintain both that we know common sense propositions and that we do not know that the skeptical hypotheses are false, since we know that the common sense propositions entail the falsity of the skeptical propositions. Knowledge of the truth of the common sense claims would, if knowledge is closed under known entailment, guarantee our knowledge that skeptical hypotheses are false. Citing our failure to know that skeptical hypotheses are false, the skeptic applies modus tollens and infers that we must not know the common sense propositions. The rejection of closure blocks this move by the skeptic.

This is not to say that there are not plausible counterexamples to the tracking account of knowledge. I may know my mother is not the assassin since she was with me when the assassination took place. But counterfactually, if she were the assassin, I would still believe she was not, since after all I couldn’t believe such a thing of my mother. My belief that my mother is not the assassin fails to track the truth, since I would have believed it even if it were false, but it seems quite plausible that I do know she’s not the assassin, as my evidence for her innocence is quite overwhelming – my mother cannot be in two places at once. Tracking accounts like Nozick’s, which do not make reference to the reasons the agent has for the belief in question, seem vulnerable to such counterexamples.

c. Dretske’s Externalist Account of Knowledge and Closure Failure

Dretske’s account of knowledge is as follows: one’s true belief that p on the basis of reason R is knowledge that p if and only if (i) one’s belief that p is based on R and (ii) R would not hold if p were false. Less formally, we may put this as follows: one knows a given claim to be true only if one has a reason to believe that it is true, and one would not have this reason to believe it if it were not true. (See Dretske 1971). This is an externalist account because whether an agent meets conditions (i) and (ii) above may be inaccessible to the agent. One could believe a claim on the basis of a particular reason without being able to explain one’s reliance on that reason, and without knowing whether one would still have the reason if the claim were false. For instance, one might believe that one’s toes are curled on the basis of proprioceptive evidence (evidence that one would not have if one’s toes were not curled), without one having any idea what proprioception is, what sort of evidence one has for the claim that one’s toes are curled, or whether one would have such evidence even if one’s toes were uncurled.

Let’s illustrate Dretske’s account with his famous zebra example (Dretske 1970). Suppose one is in front of the zebra display at the zoo. One believes that one is seeing zebras on the basis of perceptual evidence. Furthermore, in the closest possible worlds in which one is not seeing zebras (where the display is of camels or tigers), one would not have that perceptual evidence. Consequently, one knows that one is now seeing zebras, on the basis of the perceptual evidence one is having. Consider, however, the belief that one is not now seeing mules cleverly disguised by zoo staff to resemble zebras. Whatever one’s reason for believing this claim (say, that it is just very unlikely that the zoo would deceive people in that fashion), one would still have this reason even if the belief were false (and one was seeing mules cleverly disguised to look like zebras). Hence, one would not know that one is not now seeing mules cleverly disguised to resemble zebras.

As with Nozick’s account, this provides a counterexample to the closure of knowledge. One can know that one is now seeing zebras, one can correctly deduce from this that one is not now seeing mules cleverly disguised to resemble zebras, and yet fail to know that one is not now seeing mules cleverly disguised to resemble zebras. Furthermore, Dretske’s account better handles the counterexample to Nozick’s theory. One believes (truly) that one’s mother is not the assassin, on the grounds that one was with one’s mother at the time the assassination happened (and that mother cannot be in two places at once) and one would not have this reason to think mother innocent if she were indeed the assassin. Thus, one knows that one’s mother is not the assassin, since the evidence is absolutely conclusive, despite the fact that if one’s mother were the assassin, one would still believe that she wasn’t, on the basis of a different, bad reason.
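The difference between the two accounts on the assassin case can be shown in a short sketch. The two-world model, the distance value, and the function names below are assumptions made for illustration; the only point is that Nozick's sensitivity condition looks at whether the belief would persist, while Dretske's condition looks at whether the reason would persist.

```python
from dataclasses import dataclass


@dataclass
class World:
    distance: int               # hypothetical closeness to the actual world
    mother_innocent: bool       # the proposition believed
    with_mother_at_time: bool   # reason R: one was with her when it happened
    believes_innocent: bool     # whether one believes she is innocent


WORLDS = [
    # Actual world: she is innocent, R holds, and one believes her innocent.
    World(0, True, True, True),
    # Closest world in which she is the assassin: R fails (one was not with
    # her), yet one still believes her innocent out of filial loyalty.
    World(5, False, False, True),
]


def closest_guilty_worlds():
    """Nearest worlds in which the believed proposition is false."""
    guilty = [w for w in WORLDS if not w.mother_innocent]
    nearest = min(w.distance for w in guilty)
    return [w for w in guilty if w.distance == nearest]


def nozick_sensitive() -> bool:
    """If she were guilty, one would not believe her innocent."""
    return all(not w.believes_innocent for w in closest_guilty_worlds())


def dretske_conclusive_reason() -> bool:
    """If she were guilty, reason R would not hold."""
    return all(not w.with_mother_at_time for w in closest_guilty_worlds())


nozick_verdict = nozick_sensitive()            # belief persists, so condition fails
dretske_verdict = dretske_conclusive_reason()  # reason would be absent, so condition holds
```

On these stipulations the belief fails Nozick's sensitivity test but satisfies Dretske's requirement that the reason would not hold if the belief were false.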

Even Dretske’s account is plausibly vulnerable to counterexample. Suppose that one believes correctly at noon on Tuesday that Jones is chair of one’s department, on the basis of the typical sort of evidence (say, recollection of Jones being installed in the position, the department’s website listing Jones as chair, and so forth). Suppose that at five minutes past noon on Tuesday, Jones is suddenly struck dead by a bolt of lightning (and is consequently no longer chair). Did one know at noon, five minutes prior to the death, that Jones was the chair? Since one would have had that same set of reasons to believe at noon that Jones was chair even in the closest possible worlds in which he was not chair at noon (that is, worlds in which he’d been struck dead by lightning five minutes before noon), one does not actually know at noon that Jones is the chair. Those who find this verdict implausible (that is, those who think one does know on the basis of the typical evidence that Jones is the chair, right up until the moment that Jones suddenly is struck dead and stops being the chair), may find Dretske’s account of knowledge wanting. (The example is adapted from Brueckner and Fiocco 2002).

A further justification Dretske offers for denying closure is that there are other sentential operators that are not closed under known entailment and behave in many respects like the knowledge operator. (See Dretske 1970). Dretske defines a sentential operator O to be fully penetrating when O(p) is closed under known entailment. That is, O is fully penetrating if and only if: O(p) entails O(q) if p is known to entail q. “It is true that” is a fully penetrating operator, since, if p is known to entail q, “it is true that p” must entail “it is true that q”. “It is surprising that” is non-penetrating; although it is surprising that tomatoes are growing on the apple tree, it is not surprising that something is growing on the apple tree. Some operators are semi-penetrating. An operator is semi-penetrating when it penetrates only to a certain subset of a given proposition’s entailments.

For instance, “R is an explanatory reason for” seems to be a semi-penetrating operator. Within a range of cases, if p is known to entail q, then R is an explanatory reason for p entails R is an explanatory reason for q. A reason that explains why Bill and Harold are invited to every party necessarily is a reason why Harold is invited to every party. Similarly, “knows that” seems to penetrate through similar entailments; if one knows that Bill and Harold are invited to every party, then one knows that Harold is invited to every party.

However, “R is an explanatory reason for my painting the walls green” need not entail “R is an explanatory reason for my painting the walls.” Depending on the context, a reason that explains why I painted my walls green may be a reason why I did something entailed by my painting the walls green, such as my not painting the walls red, but may not be a reason why I did something else entailed by my painting the walls green, such as my not wallpapering the walls green. The emphasis is crucial. A reason to paint the walls green is a reason not to paint them red, but may not be a reason to paint rather than wallpaper. A reason to paint the walls green may be a reason not to paint the floor green, but it might be neutral as to the color. Consideration of ordinary demands for reasons shows that emphasis, or other contextual factors, determines a certain range of reasons to be relevant and a certain range irrelevant. The same reason will not suffice to explain “I bought tomatoes” when the stress falls on “I,” when it falls on “bought,” and when it falls on “tomatoes,” even though these three sentences entail and are entailed by exactly the same claims, since they are logically equivalent. Dretske says that no fact is an island and that various contextual factors will determine, for each operator, its relevant alternatives (i.e. the negations of the consequents to which the operator penetrates). (See also Contextualism in Epistemology, Chapter 3, on Dretske and the denial of closure).

d. “Abominable Conjunctions”

On the other hand, some philosophers view the closure principle as so obviously true that, rather than reject it to accommodate a given theory of knowledge, they would reject the account of knowledge in order to keep closure. Dretske’s account of knowledge has been much discussed in the philosophical literature. One consequence of rejecting closure in favor of his account hardly seems felicitous: one could truly say, “I know that that animal is a zebra and I know that zebras are not mules, but I don’t know that that animal is not a cleverly disguised mule.” Or, “I know I have hands, and I know that if I have hands I am not handless, but I don’t know that I am not a handless brain in a vat.” Worse yet, “I know it is not a mule, but I don’t know it’s not a cleverly disguised mule.” These claims (“abominable conjunctions,” according to DeRose 1995) sound at best paradoxical and at worst absurd. This seems to point to the extreme plausibility of some form or another of the closure principle.

Dretske (2005a, 17-18) agrees that such statements sound absurd, but maintains that they are true. They may violate conventional conversational expectations and they may be met with incomprehension, but they are not self-contradictory. “Empty” and “flat” are often taken to be absolute concepts (since to be empty is to not contain anything at all and to be flat is to have no bumps), but also context-relative, in that whether a particular item counts as a thing or a bump depends on the context. It sounds a bit strange to say that the warehouse is empty, but has lots of dust, gas molecules, and empty crates in it. The utterance may violate conversational rules, but the utterance might, despite all that, be true, if the concepts of emptiness and flatness are as described. So too with the abominable conjunctions if the attendant conception of knowledge is correct. Philosophers may always appeal to Gricean conversational implicatures to blunt the objection that their view entails absurd claims. Truth and conversational propriety are not one and the same. (Paul Grice is the philosopher most closely associated with the view that communication is guided by various conversational maxims and that some utterances are conversationally inappropriate, even if true, because they invite misunderstanding. For instance, the utterance “Mary insulted her boss and she was fired,” is true even if Mary did not insult her boss until after she was fired, but it would be an inappropriate remark in most contexts, since the listener naturally would conclude that the insult preceded the dismissal. For more on this, see Grice 1989).

John Hawthorne (2005: 30-31) makes two points in reply. First, he says, it is unclear what sort of Gricean mechanism could make it true but conversationally inappropriate to utter “S knew that p and correctly deduced q from p, but did not know that q.” Second, an appeal of this sort can at best explain why we do not utter certain true propositions, but not why we actually believe their negations. Even if it is true that a man’s wife is his best friend, it would be inappropriate for him to introduce her to someone as his best friend. But the conversational mechanism at play here could hardly be an explanation for why he believed that his wife was not his best friend (even though she was). Why, if the denial of closure is true but conversationally infelicitous, do so many people not merely refrain from denying closure in conversation but in fact believe that the closure principle is true?

One might reply that many people, even philosophers, are apt in some situations to mistake what is conversationally appropriate for what is true (as with conditional claims that have false antecedents), so an explanation of why a true claim violates conversational norms might well explain why people believe the negation of the claim.

e. Alternative Anti-Skeptical Strategies Need Not Reject Closure

There are alternative strategies for refuting skepticism that seem to have many of the virtues of the tracking account of knowledge, but do not entail the falsity of closure principles. Contextualism, for example, says that knowledge attributions are sensitive to context, in that a subject S might know a proposition p relative to one context, but simultaneously fail to know that p relative to another context. The contextual factors to which knowledge attributions are taken to be sensitive include things like whether a particular doubt has been raised or acknowledged and the importance of the belief being correct.

In an ordinary context, where skeptical scenarios have not been raised, the standards for knowledge are quite low, but, in contexts in which skeptical doubts have been raised, such as an epistemology class, standards for knowledge have been raised to levels that typically cannot be met. One might know relative to the everyday context that she has hands, but fail to know this relative to the skeptic’s context, because a skeptical scenario has been raised and she cannot rule it out.

Or a true belief with a certain level of justification might count as knowledge as long as it is not terribly important that the belief be correct, but would no longer be knowledge if the stakes were raised. One might know that the bank will be open on Saturday after confirming that the bank has Saturday hours, even if one has not checked whether the bank has changed its hours in the past two weeks, as long as no great harm will befall one if it turns out one is wrong. But if financial ruin will befall one were a check not deposited before Monday, then one’s justification might need to be stronger before it would be correct to say that one knows the bank is open Saturday.

The contextualist then can reconcile the intuitions that it is sometimes correct to attribute to someone knowledge of everyday common sense propositions, despite her inability to rule out skeptical propositions, and that we are sometimes correct in refusing to attribute knowledge of the falsity of a skeptical scenario when the subject is unable to rule out such scenarios. But the contextualist can do this while accepting at least some version of closure. The contextualist says that epistemic closure holds within an epistemic context, but fails inter-contextually. For instance, in the everyday, low epistemic standards context, one knows that one has hands and anything that one can correctly deduce from this claim, such as that one is not a handless being deceived into thinking that one has hands. In the context with much higher epistemic standards, one knows neither that one is not a handless, artificially stimulated brain in a vat, nor (by an application of the closure of knowledge under known entailment) that one has hands. Closure will fail only when it extends across contexts. For instance, if one were to cite one’s knowledge that one has hands (in the ordinary context) as grounds for saying in the heightened context that one knows that the brain in a vat hypothesis is false (as the Moorean might), one would illegitimately apply the closure principle. The skeptic’s citing one’s failure to know the falsity of the skeptical hypothesis (in the heightened context) as entailing that one does not know the common sense proposition (in the ordinary context) would be a similar misuse of the closure principle.
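The contextualist treatment of closure can be put schematically. The notation below is an assumption introduced here for illustration and is not drawn from the literature discussed: K with a subscript c reads "the subject counts as knowing, by the standards of context c," with o for the ordinary context and s for the skeptical context; h abbreviates "I have hands" and b abbreviates "I am a handless brain in a vat."

```latex
% Illustrative sketch only; assumes the amsmath package.
% K_c : "counts as knowing by the standards of context c" (hypothetical notation)
\begin{align*}
&\text{Closure within a context (accepted):}
  && \bigl(K_c\,p \land K_c\,(p \rightarrow q)\bigr) \rightarrow K_c\,q \\
&\text{Ordinary context:}
  && K_o\,h,\ \text{so } K_o\,\neg b \\
&\text{Skeptical context:}
  && \neg K_s\,\neg b,\ \text{so } \neg K_s\,h \\
&\text{Moorean step across contexts (rejected):}
  && K_o\,h \ \not\Rightarrow\ K_s\,\neg b \\
&\text{Skeptical step across contexts (rejected):}
  && \neg K_s\,\neg b \ \not\Rightarrow\ \neg K_o\,h
\end{align*}
```

On this rendering, every legitimate application of closure keeps the context subscript fixed; the Moorean and the skeptic each go wrong only by switching subscripts mid-inference.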

If a theory of knowledge is independently plausible and can answer the skeptic without denying closure, then, everything else being equal, we ought to be reluctant to reject closure just so that we can accept the tracking account of knowledge. Contextualism, of course, is plagued with problems of its own. One such problem is as follows: since whether one knows a claim or not depends on how stringent the epistemic standards are in the context and the standards can be raised by a particular doubt occurring to someone in the context, contextualism seems to imply that it is easier to know things if one spends time with the stupid or incurious or if one is stupid or incurious.

The plausibility of the denial of closure may well depend not only on whether it is a way to avoid skepticism, but on whether it is the only way to do so. (Dretske does insist that the only plausible way to refute skepticism is by denying closure. See his 2005a and 2005b for a defense of this claim, trenchant criticisms of the contextualist theory, and responses to criticisms of the tracking theory.)

f. Some Skeptical Arguments Do Not Employ Closure

One of the strengths claimed for the tracking account of knowledge is that it blocks the standard skeptical argument, since it involves the rejection of closure. Not all skeptical arguments employ closure principles, however, so it is unclear how much anti-skeptical value would accrue from denying closure. Underdetermination arguments might be the best skeptical arguments and they do not depend (at least explicitly) on closure.

Underdetermination is a relation that holds between two or more theories, when the theories are incompatible, but empirically equivalent. Underdetermination skeptical arguments rely crucially on the premise that if two theories are incompatible but compatible with all the available (and perhaps possible) data, we cannot know that one is true and the other false. Compare, for example, the thesis that I have hands, which I perceive through sense perception, and the thesis that I am a handless brain in a vat, artificially stimulated so as to have misleading sense perceptions. These theses are incompatible, but they are empirically equivalent. Whichever thesis were true, I would have the same sort of experiences. Suppose we adopt the following principle: if two incompatible theses both entail (or predict) the same observational data, then that observational data does not support (or justify belief of) one of the theses over the other. With this principle and the premise that the two theses are incompatible but observationally equivalent, we can deduce that our apparent perception of our hands does not justify us in believing that we have hands.
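The outline of this argument can be set out schematically. The symbols are assumptions introduced for illustration: H is the thesis that I have hands, B is the brain-in-a-vat thesis, D is the observational data, and Supp(D, X, Y) reads "D supports X over Y."

```latex
% Illustrative sketch only; assumes the amsmath package.
% Supp(D, X, Y) : "data D supports thesis X over thesis Y" (hypothetical notation)
\begin{align*}
&\text{(U1) Incompatibility:}
  && \neg (H \land B) \\
&\text{(U2) Empirical equivalence:}
  && (H \rightarrow D) \land (B \rightarrow D) \\
&\text{(U3) Underdetermination principle:}
  && \bigl[\neg (X \land Y) \land (X \rightarrow D) \land (Y \rightarrow D)\bigr]
     \rightarrow \neg\,\mathrm{Supp}(D, X, Y) \\
&\text{(U4) Conclusion, from U1--U3:}
  && \neg\,\mathrm{Supp}(D, H, B)
\end{align*}
```

No premise in this sketch has the form of a closure principle: none of U1 through U3 says that knowledge or justification extends to what is known to be entailed.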

The argument is greatly oversimplified, but the outline of the skeptical argument from underdetermination now ought to be clear. The argument does not explicitly employ any closure premise, so the rejection of closure would seem not to undermine the argument in any straightforward way. One could always argue that the appeal of the argument from underdetermination implicitly relies on the closure principle or that the argument from underdetermination is objectionable on other grounds. Skeptical arguments from underdetermination, however, seem as plausible as other skeptical arguments and their plausibility seems not to depend on the plausibility of any of the closure principles.

Infinite regress arguments for skepticism also do not straightforwardly appeal to closure. A regress argument that no belief is epistemically justified (and hence that no belief counts as knowledge) runs as follows. We assume that all justification is inferential. That is, every justified belief is justified by appeal to some other justified belief. The basis for this claim might be the nature of argumentation. One is justified in believing a conclusion if one is justified in believing the premises that support the conclusion. If the conclusion is one of the premises, then the argument is question-begging, or circular, and not rationally persuasive. But if every justified belief can be justified only by inferring it from some further justified belief and there cannot be an infinite regress of justified beliefs, then it must be that no beliefs are justified. (A foundationalist about justification, on the other hand, while agreeing that an infinite regress of justified beliefs is impossible, insists that there are justified beliefs, and hence that some beliefs are justified non-inferentially, or in other words, that some justified beliefs are basic or foundational). The claim that no justified belief is self-justifying does not entail any closure principle of justification or knowledge, so the argument seems to be independent of closure and thus not vulnerable to arguments against closure principles. (See also Ancient Skepticism).

The proponent of the tracking account of knowledge need not answer all forms of the skeptical argument with the same tools, so even if some skeptical arguments do not depend on the closure principle, the tracking analysis might provide the resources for countering the skeptical arguments from underdetermination or regress.

4. Dogmatism and the Rejection of Closure

At least one philosopher (Audi 1988, 76-8; 1991, 77-84) has claimed that the justification of dogmatism, adapted from Harman (see section 2 of this article), is a reductio ad absurdum of the epistemic closure principle. If closure allows one to infer, and thus know, that any evidence against something one knows must be misleading and may be ignored, then closure must be rejected.

Audi’s example is of a man who adds up a series of numbers and thereby knows the sum of the numbers. But the man’s wife (whom he considers to be a better mathematician) says that he has added the numbers incorrectly and gotten the wrong sum. If the man knows that the sum is n, and knows that his wife says the sum is not n, then by closure he knows that his wife is wrong. (This is so, as “the sum is n and my wife says the sum is not n” entails that “my wife is wrong;” one knows the former claim and knows it entails the latter, so one knows the latter). Since he knows his wife is wrong, there is no need to recalculate the sum. (Similar examples appear in Dretske 1970 and Thalberg 1974). If one believes something only when one takes oneself to know it, as is plausible, then by this reasoning one has reason to dismiss any evidence against something that one believes.

Denying the closure principle to avoid the odd dogmatic conclusion has some initial appeal, but there are alternative solutions that do not require us to reject such a compelling principle. And, as Feldman says (1995, 493), there is a general reason not to resolve the paradox by denying closure. To say, “Yes, I know that p is true, and that p entails q, but I draw the line at q,” seems irrational. To refuse to accept what you know to be the consequences of your beliefs, he says, is to be “patently unreasonable.” Not only is it infelicitous to deny closure, but the dogmatist argument may be blocked without doing so.

For instance, one could take the dogmatism argument to be a reductio ad absurdum of the anti-skeptical position. This is the tack taken by Peter Unger (1975). If we deny that one could know that p (say, that the sum of the numbers is n), then even if we accept closure, we have no reason to suppose that one could know that all evidence against p was misleading.

Alternatively, Roy Sorensen (Sorensen 1988) argues that given that one knows that p, the conditional “If E is evidence against p, then E is misleading” is a junk conditional, in that although it may be known to be true, this knowledge cannot be expanded under modus ponens. That is to say, if “if p then q” is a junk conditional, the conditional can be known to be true, but it could not be the case that simultaneously the conditional is known and that knowledge of the antecedent p would justify one in believing the consequent q. Some conditionals are known to be true on the basis of the extreme unlikelihood of the antecedent, but are such that if one acquired evidence that supports the antecedent, one would not be justified in inferring the consequent because the probability of the antecedent is inversely proportional to the probability of the conditional. That is, if one were to learn that the antecedent of the conditional was true, one would no longer have reason to accept (and would no longer know) the conditional. “If this is a Cuban cigar, then I’m a monkey’s uncle!” is an example of such a conditional. This conditional can be known to be true, in virtue of the antecedent being known to be false, but if one were to find evidence that this is indeed a Cuban cigar, one should not infer that one is a monkey’s uncle. Rather, one should conclude that perhaps one did not know the conditional to be true after all, since one has evidence that its antecedent was true and its consequent false. In short, if a conditional is a junk conditional one cannot come to know the consequent q in virtue of one’s knowing the antecedent p and the conditional if p then q, because one’s knowledge of the conditional depends on the falsity of the antecedent.
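The inverse relation between the probability of the antecedent and the probability of the conditional can be made explicit on one simplifying assumption, which is not stated in the text above: that the conditional is read as a material conditional, so that “if p then q” is true whenever p is false or q is true. The following is a sketch under that assumption, with P a probability function over the relevant propositions.

```latex
% Illustrative sketch only; assumes the amsmath package and a
% material-conditional reading of "if p then q".
\begin{align*}
P(p \rightarrow q) &= P(\neg p \lor q) \\
                   &= P(\neg p) + P(p \land q) \\
                   &= 1 - P(p) + P(p \land q)
\end{align*}
% When the consequent is wildly improbable, P(p \land q) is close to 0, so
\[ P(p \rightarrow q) \approx 1 - P(p) \]
```

So when the consequent is something like “I’m a monkey’s uncle,” nearly all of the conditional’s probability comes from the improbability of its antecedent, and evidence that raises the probability of the antecedent lowers the probability of the conditional by about the same amount.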

Given that one knows that r (say, that one’s car is in parking lot A), one knows that the conditional “if there is any evidence against r, however strong, then it must be misleading” is true. Part of one’s basis for knowing that r might be that one has reason to believe that there is no strong evidence against r. But if one were to learn of strong evidence against r, such as testimony that one’s car had been towed, one ought, at least in some cases, to consider the possibility that one does not in fact know that r, rather than simply inferring that the testimony is misleading. Learning the truth of the antecedent – that there is strong evidence against r – may undermine the justification for believing the conditional itself, thus making the conditional resistant to modus ponens. Knowledge of the conditional depends on one’s knowing that the antecedent is false. Finding evidence in favor of the antecedent – even if in fact it is misleading – may weaken one’s justification for the conditional, such that one no longer knows the conditional to be true.

This blocking of the dogmatist argument does not involve denying closure, though. The reason the modus ponens inference fails to go through is that the conditional is a “junk” conditional; one can know the conditional to be true only if one does not know the antecedent to be true, and the closure principle applies only if one simultaneously knows both the conditional and its antecedent to be true.

Another explanation that does not require the denial of closure is due to Michael Veber (Veber 2004). He says that even if the dogmatist argument is sound, the principle “If a piece of evidence E is known by S to be misleading, S ought to disregard it,” ought not to be endorsed on grounds of human fallibility. We are frequently enough wrong in taking ourselves to know what we in fact do not know that following such a principle would lead one to disregard evidence that is not misleading. There is nothing wrong with the principle, provided it is correctly applied; but due to the difficulty or impossibility of correctly applying it, adopting such a policy is contraindicated.

5. The McKinsey Paradox, Closure, and Transmission Failure

a. The McKinsey Paradox

Michael McKinsey (1991) discovered a paradox about content externalism that has prompted some reconsideration of how knowledge is transmitted through deductive reasoning.

Content externalism (or anti-individualism) is, to greatly oversimplify, the thesis that we are only able to have thoughts with certain contents because we inhabit environments of certain sorts. (Putnam 1975 and Burge 1979 are the most notable defenses of this view). Molecule-for-molecule duplicates could differ in their contents due to differences in their environments. According to the externalist, my twin on Twin Earth might be an exact duplicate of me, but if Twin Earth contains a different but similar light metal used to make baseball bats, cans, and so forth instead of aluminum, then even if the denizens of Twin Earth call this metal “aluminum,” their thoughts are not thoughts about aluminum. This view is a repudiation of the Cartesian view of the mental, according to which the contents of our thoughts are what they are independent of the surrounding world.

Externalism has been defended and criticized on many different grounds, but the debate about externalism has pivoted largely on its implications for the thesis that we have privileged access to the contents of our own thoughts. How does one know that she is now thinking that some cans are made from aluminum, rather than the thought that some cans are made from twaluminum (as we may call it), which is what she would be thinking if she lived on Twin Earth? Incompatibilists about externalism and privileged access point out that the two thoughts are introspectively indiscriminable if externalism is true and argue that one could only know which of these thoughts one is now thinking through empirical investigation of one’s environment.

Compatibilists about externalism and self-knowledge often argue that if a subject has a mental state with a particular content (say, a belief that some cans are made of aluminum) in virtue of that subject bearing a certain relation to an external state of affairs (say, aluminum, rather than twaluminum, being present in one’s environs), then any mental state the subject has about that particular mental state of his, like his belief that he believes some cans are made of aluminum, will also stand in a similar relation to the same external state of affairs (aluminum, rather than twaluminum, being present). Hence, this second-order mental state (i.e. a mental state about a mental state) will involve the same content as the first-order belief (say, that some cans are made of aluminum). In short, one will believe that he believes cans are made of aluminum only if one in fact does believe that cans are made of aluminum, since both of these states bear a causal relation to aluminum, rather than twaluminum. (See Burge 1988 and Heil 1988). Whatever makes it the case that S thinks that p (instead of q) will also make it the case that S thinks I am thinking that p (instead of I am thinking that q). Coupled with a reliabilist theory of knowledge, these second-order beliefs count as knowledge since they cannot go wrong and the thesis of privileged access is reconciled with externalism.

Enter McKinsey’s Paradox. We assume that we know content externalism to be true and that it is compatible with a suitably robust thesis of privileged access to thought contents. We may now reason as follows:

  1. I know that I am in mental state M (say, the state of believing that water is wet). (Privileged Access)
  2. I know that if I am in mental state M, then I meet external conditions E (say, living in an environment that contains water). (Content Externalism, known through philosophical reflection)
  3. If I know one thing and I know that it entails a second thing, then I know the second thing. (Closure of knowledge under known entailment)
  4. Thus, I know that I meet external conditions E. (From 1-3)
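The structure of the argument can be displayed schematically. The operator K, read "I know that," is notation assumed here for illustration; M and E abbreviate "I am in mental state M" and "I meet external conditions E."

```latex
% Illustrative sketch only; assumes the amsmath package.
% K : "I know that" (hypothetical notation)
\begin{align*}
&\text{(1) Privileged access:}            && K\,M \\
&\text{(2) Content externalism, known:}   && K\,(M \rightarrow E) \\
&\text{(3) Closure:}                      && \bigl(K\,p \land K\,(p \rightarrow q)\bigr) \rightarrow K\,q \\
&\text{(4) From 1--3, with } p := M,\ q := E\text{:} && K\,E
\end{align*}
```

Since the argument is valid, anyone who rejects (4) must give up at least one of (1), (2), and (3).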

The knowledge attributed in the premises is a priori in the broad sense that includes knowledge gotten through introspection and/or philosophical reflection. That knowledge is not gained via empirical investigation of the external world. The conclusion follows by an application of the closure principle. What is paradoxical is that, given closure, it seems that one can know the truth of an empirical claim about the external world (say, that one’s environment contains water or that it contains aluminum rather than twaluminum) simply by inferring it from truths known by reflection or introspection. This argument bolsters the incompatibilist’s case: since it is only by investigation of the world that one can know that one meets a particular set of external conditions and since the premises (including closure) entail that this fact can be known on the basis of knowledge not dependent on investigation of the world, either the privileged access premise or the externalist thesis must be false (provided that the closure principle is correct).

b. Davies, Wright, and the Closure/Transmission Distinction

There are many responses to this argument. Some reject externalism, some (like McKinsey) deny privileged access, and some compatibilists (Brueckner 1992) argue that even if externalism is known to be true, nothing as specific as the second premise of the argument could be known a priori. But perhaps the most influential attempt to solve the paradox is due to Martin Davies (1998) and Crispin Wright (2000). They argue that even though arguments like McKinsey’s are valid and their premises are known to be true, this knowledge is not transmitted across the entailment to the conclusion. At first blush, it seems like Davies and Wright are rejecting closure, which is certainly one way to deal with the paradox. Davies and Wright accept closure, though, and only reject a related but stronger epistemological principle that says that knowledge is transmitted over known entailment.

Davies and Wright are distinguishing between the closure of knowledge under known entailment and what they take to be a common misreading of it. The closure principle says that if one knows that p and knows that p entails q, then one knows that q, but the principle is silent on what one’s basis or justification for q is and does not claim that the basis for q is the knowledge that p and that p entails q. The principle of the transmission of knowledge under known entailment, however, states that if one knows that p, and knows that p entails q, then one knows q on that basis – what enables one to know that p and that p entails q also enables one to know that q. Davies and Wright accept the closure principle but deny the transmission principle, arguing that it fails when the inference from p to q is, although valid, not cogent. Here cogency is understood as an argument’s aptness for producing rational conviction.
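The difference between the two principles can be brought out by indexing knowledge to its basis. The notation is an assumption introduced here for illustration and is not Davies's or Wright's own: K with a subscript b reads "knows on basis b," and the pair (b, b′) stands for the combined basis consisting of b, b′, and the deduction itself.

```latex
% Illustrative sketch only; assumes the amsmath package.
% K_b : "knows on basis b" (hypothetical notation)
\begin{align*}
&\text{Closure:}
  && \bigl(\exists b\; K_b\,p \;\land\; \exists b'\; K_{b'}(p \rightarrow q)\bigr)
     \rightarrow \exists b''\; K_{b''}\,q \\
&\text{Transmission:}
  && \bigl(K_b\,p \;\land\; K_{b'}(p \rightarrow q)\bigr)
     \rightarrow K_{(b,\,b')}\,q
\end{align*}
```

Closure, so written, requires only that some basis or other for q exists; transmission adds that the basis for q is the very basis on which p and the entailment are known. A case in which q is known only on an independent basis is therefore consistent with closure while being a counterexample to transmission.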

One way an argument could be valid but fail to be cogent is that the justification for the premises presupposes the truth of the conclusion. If I reason from the premise that I have a driver’s license issued by the state of North Carolina (based on visual inspection of my license and memory of having obtained it at the North Carolina Department of Motor Vehicles) to the conclusion that there exists an external world, including North Carolina, outside my mind, it is plausible that my justification for the premise (taking sense experience and memory at face value) presupposes the truth of the conclusion. If this is so, then it seems that the premise could not be my basis for knowing the conclusion. Anyone in doubt about the conclusion would not accept the premise, so although the premise entails the conclusion, the premise could not provide the basis for rational conviction that the conclusion is true. Such an argument is valid, but not cogent. It would not be a counterexample to closure, for anyone who knows the premise and the entailment also must know the conclusion, but it is a counterexample to the transmission principle, since the conclusion would not be known on the basis of the knowledge of the premise.

According to Davies and Wright, the McKinsey argument is valid but not cogent because knowledge of the conclusion is presupposed in one’s supposed introspective knowledge of the premises. Thus, it is a counterexample to transmission, but poses no threat to closure. The non-empirical access to the externally individuated thought contents is conditional on the assumption that certain external conditions obtain (such as that one’s environs include aluminum rather than twaluminum), which can only be confirmed empirically. Thus one may not reason from the non-empirical knowledge claimed in the premises to non-empirical knowledge of an empirical truth that enjoys presuppositional status with regard to the premises. That one has a thought about water may entail that one bears a causal relation to water in one’s environment (if externalism is correct) and one may know the former and the entailment only if one knows the latter, but one may not cogently reason from the premise to the conclusion, since the inference begs the question. Anyone who doubts the conclusion of the McKinsey argument in the first place would not (or at least should not — the presuppositions of our premises are not always recognized as such) be moved to accept the premises that entail it.

Consider then the following principle about a priori knowledge:

(APK) If a subject knows something a priori and correctly deduces (a priori) from it a second thing, then the subject knows a priori the second claim.

We can describe this principle in two equivalent ways. It is the principle of closure of a priori knowledge under correct a priori deduction and, alternatively, it is a specific instance of the principle of transmission of knowledge under known entailment, since it claims that the a priori basis for knowledge of the premise transmits to the conclusion, allowing it to be known a priori as well. If Davies and Wright are correct, the principle is false because counterexamples may be found in deductions that are valid but not cogent.

Davies and Wright apply this distinction between transmission and closure to Moore’s anti-skeptical argument as well. Although it is true that the negation of the brain-in-a-vat hypothesis is entailed by an ordinary proposition, such as that I have hands, the existence of the external world is presupposed in the justification for that premise and, therefore, may not be justifiably inferred from that premise. Moore’s argument is not cogent, so it is a counterexample to transmission, which we have reason to reject anyhow, and not a counterexample to closure (or so Davies and Wright argue).

This is plausibly another sort of conditional that is not expandable by modus ponens. Unlike the junk conditionals, which cannot be expanded because the conditional can be known to be true only when the antecedent of the conditional is not known to be true, conditionals in which the justification for the antecedent presupposes justification for the consequent – we may call them conditionals of presupposition – cannot be expanded because the relevant modus ponens inference would not be cogent. The inference would be question-begging.

The distinction that Davies and Wright argue for also applies to closure principles for justified belief. If they are correct, then justified belief could be closed under known entailment even if justification is not necessarily transmitted across known entailment. The counterexamples to the transmission principle for knowledge would also function as counterexamples for the transmissibility of justified belief.

Some have argued that the Davies-Wright line of argument fails to solve the McKinsey paradox. Whether they are right is beyond the scope of this entry. But the distinction Davies and Wright have drawn between transmission and closure is an important one. The fact that one must know that q, if one knows that p and has validly deduced q from p, tells us nothing about one’s basis for q. Quite often knowledge of p can and will provide the basis for knowledge of q, but in some instances it cannot, even though p entails q, because the justification for p presupposes q. In such cases one knows that q (on some independent basis), so there is no counterexample to closure, but q is not known on the basis of p, so the transmission principle is false.

Clarifying the closure principle as a principle about the distribution of knowledge across known entailment, rather than as a principle about the transmission or acquisition of knowledge, divorces the closure principle, to some extent, from the initial intuitive support for it, which is the idea that we can add to our store of knowledge (or justified belief) by accepting what we know to be entailed by propositions we know (or justifiably believe). On this understanding of closure, knowledge and justified belief are distributed across known entailment even when drawing the inference in question could not add to one’s store of knowledge or justified belief.

6. Ordinary Propositions, Lottery Propositions, and Closure

The closure principle also figures in a paradox about our knowledge of “ordinary propositions” and “lottery propositions.” Ordinary propositions are those that we ordinarily suppose ourselves to know. Lottery propositions are those with a high likelihood of being true, but which we are ordinarily disinclined to say that we know. Suppose that one lives on a fixed income and struggles to make ends meet. It seems that one knows one will not be able to afford a mansion on the French Riviera this year. One’s not being able to afford the mansion this year entails that one will not win the big lottery this year. By the closure principle, since one knows that one will not be able to afford the mansion and one knows that one’s not being able to afford the mansion entails that one will not win the lottery, one must know that one will not win the lottery. Most, however, are disinclined to say that one could know that one will not win the lottery. There’s always a chance, after all (provided that one buys a ticket).

This phenomenon is widespread. Ordinarily, one who keeps up with politics could be said to know that Dick Cheney is the U.S. Vice-President. That Cheney is the Vice-President entails that Cheney did not die of a heart attack thirty seconds ago. But it seems that one does not know that Cheney did not die of a heart attack in the last thirty seconds. How could one know such a thing? (The coining of the term “lottery proposition” and the discovery that this phenomenon is widespread are due to Jonathan Vogel).

The apparently inconsistent triad is (i) one knows the ordinary proposition, (ii) one fails to know the lottery proposition, and (iii) closure. One may eliminate the inconsistency by denying closure on the sort of grounds that Dretske and Nozick cite. Plausibly, one’s belief of so-called ordinary propositions tracks the truth, while one’s belief of lottery propositions does not. If Cheney were not Vice-President, one would not believe he was, but had Cheney died in the past thirty seconds, one still would believe he was Vice-President.
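The inconsistency can be displayed schematically. The notation is assumed here for illustration: K reads "one knows that," o is the ordinary proposition (that Cheney is Vice-President), and l is the lottery proposition (that Cheney did not die in the last thirty seconds). The sketch also makes explicit a premise the triad takes for granted, namely that the entailment from o to l is itself known.

```latex
% Illustrative sketch only; assumes the amsmath package.
% K : "one knows that" (hypothetical notation)
\begin{align*}
&\text{(i) Ordinary knowledge:}          && K\,o \\
&\text{(ii) Lottery ignorance:}          && \neg K\,l \\
&\text{(iii) Closure:}                   && \bigl(K\,p \land K\,(p \rightarrow q)\bigr) \rightarrow K\,q \\
&\text{Tacit premise (known entailment):} && K\,(o \rightarrow l) \\
&\text{From (i), (iii), and the tacit premise:} && K\,l \quad \text{(contradicting (ii))}
\end{align*}
```

Each of the responses surveyed in this section gives up, or reinterprets, exactly one of (i), (ii), and (iii).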

One might bite the skeptical bullet and insist that one really does not know that Cheney is Vice-President. One of a more anti-skeptical bent might maintain that one can really know the lottery propositions, such as that Cheney did not die in the last thirty seconds. Such a resolution has considerable costs, but denying closure is not among them.

Alternatively, one might argue for a contextualist handling of the problem that does not require the denial of closure or biting the skeptical or anti-skeptical bullet.

7. References and Further Reading

a. References

  • Audi, Robert (1988), Belief, Justification and Knowledge, Belmont: Wadsworth.
    • Argues against closure to avoid dogmatic conclusion.
  • Audi, Robert (1991), “Justification, Deductive Closure and Reasons to Believe,” Dialogue, 30: 77-84.
    • Argues against closure to avoid dogmatic conclusion.
  • Brueckner, Anthony (1992), “What an Anti-Individualist Knows A Priori,” Analysis 52: 111-118.
    • Solution to the McKinsey paradox that does not deny closure.
  • Brueckner, Anthony (2004), “Strategies for Refuting Closure,” Analysis 64: 333-35.
    • Reply to Warfield 2004 and Hales 1995.
  • Brueckner, Anthony; Fiocco, M. Oreste (2002), “Williamson’s Anti-Luminosity Argument,” Philosophical Studies, 110: 285-293.
    • Contains putative counterexample to Dretskean account of knowledge.
  • Burge, Tyler (1979), “Individualism and the Mental,” Midwest Studies in Philosophy, 4: 73-122.
    • Seminal defense of content externalism (or anti-individualism).
  • Burge, Tyler (1988), “Individualism and Self-Knowledge,” The Journal of Philosophy, 85: 649-663.
    • Influential reconciliation of content externalism and the privileged access theses.
  • Davies, Martin (1998), “Externalism, Architecturalism, and Epistemic Warrant,” in C. MacDonald, B. Smith and C. J. G. Wright (eds.), 321-361.
    • Argues that McKinsey paradox is a counterexample to transmission, not closure.
  • Dretske, Fred (1970), “Epistemic Operators,” The Journal of Philosophy, 67: 1007-1023.
    • Seminal paper arguing against the closure of knowledge.
  • Dretske, Fred (1971), “Conclusive Reasons,” Australasian Journal of Philosophy, 49: 1-22.
    • Contains Dretske’s account of knowledge.
  • Dretske, Fred (2005a), “The Case against Closure,” in Steup and Sosa (eds.), 13-26.
    • Argues that denying closure is only way to avoid skepticism.
  • Dretske, Fred (2005b), “Reply to Hawthorne,” in Steup and Sosa (eds.), 43-46.
    • Reply to Hawthorne 2005.
  • Feldman, Richard (1995), “In Defence of Closure,” The Philosophical Quarterly, 45: 487-494.
    • Defends closure against Audi’s arguments (Audi 1988, 1991).
  • Grice, Paul (1989), Studies in the Way of Words, Cambridge, MA: Harvard University Press.
    • Classic treatment of pragmatic/semantic distinction, and conversational maxims and implicatures. Relevant to discussion of the tracking theory of knowledge’s “abominable conjunctions.”
  • Gunderson, Keith (ed.) (1975), Language, Mind and Knowledge, Minnesota Studies in the Philosophy of Science, volume VII, Minneapolis: University of Minnesota Press.
    • Contains seminal Putnam 1975 article.
  • Hales, Steven (1995), “Epistemic Closure Principles,” The Southern Journal of Philosophy 33: 185-201.
    • Produces counterexamples to many different formulations of the closure principle, but points out that one cannot refute closure for knowledge by showing that some necessary condition for knowledge fails to be closed.
  • Harman, Gilbert (1973), Thought, Princeton: Princeton University Press.
    • Employs closure principle in formulating dogmatic argument.
  • Hawthorne, John (2004), Knowledge and Lotteries, Oxford: Clarendon Press.
    • Argues for quasi-contextualist solution to problem of lottery propositions, and defends closure.
  • Hawthorne, John (2005), “The Case for Closure,” in Steup and Sosa (eds.), 26-43.
    • Defends closure against Dretske’s 2005a arguments.
  • Heil, John (1988), “Privileged Access,” Mind 97: 238-251.
    • Influential reconciliation of content externalism and privileged access theses.
  • MacDonald, Cynthia; Smith, Barry; Wright, Crispin (eds.) (1998), Knowing Our Own Minds: Essays on Self-Knowledge, Oxford: Oxford University Press.
    • Contains the Davies 1998 article.
  • McKinsey, Michael (1991), “Anti-Individualism and Privileged Access,” Analysis 51: 9-16.
    • Formulation of the McKinsey paradox.
  • Moore, G.E. (1959), Philosophical Papers, London: George Allen and Unwin, Ltd.
    • Contains seminal anti-skeptical essays, such as “Proof of an External World,” and “A Defence of Common Sense.”
  • Nozick, Robert (1981), Philosophical Explanations, Cambridge, MA: Harvard University Press.
    • Influential tracking account of knowledge and consequent denial of closure.
  • Putnam, Hilary (1975), “The Meaning of ‘Meaning’,” in K. Gunderson (ed.), 131-193.
    • Seminal work defending content externalism.
  • Roth, Michael (ed.) (1990), Doubting: Contemporary Perspectives on Skepticism, Dordrecht: Kluwer.
    • Contains Vogel 1990.
  • Sorensen, Roy (1988), “Dogmatism, Junk Knowledge and Conditionals,” The Philosophical Quarterly, 38: 433-454.
    • Solves dogmatism puzzle without denying closure.
  • Steup, Matthias; Sosa, Ernest (eds.) (2005), Contemporary Debates in Epistemology, Malden, MA: Blackwell Publishing.
    • Contains Dretske-Hawthorne exchange on closure.
  • Thalberg, Irving (1974), “Is Justification Transmissible Through Deduction?” Philosophical Studies 25: 347-356.
    • Argues for counterexample to closure in dogmatism examples.
  • Unger, Peter (1975), Ignorance: A Case for Scepticism, Oxford: Oxford University Press.
    • Retains closure but offers skeptical resolution of the dogmatism puzzle.
  • Veber, Michael (2004), “What do you do with Misleading Evidence?” The Philosophical Quarterly 54: 557-569.
    • Reply to Sorensen (1988) and alternative solution to dogmatism puzzle.
  • Vogel, Jonathan (1990), “Are There Counterexamples to the Closure Principle?” in M. Roth (ed.).
    • Influential discussion of closure and lottery propositions.
  • Wright, Crispin (2000), “Cogency and Question-Begging: Some Reflections on McKinsey’s Paradox and Putnam’s Proof,” Philosophical Issues 10: 140-163.
    • On the distinction between closure and transmission, and McKinsey’s paradox.

b. Further Reading

  • Brueckner, Anthony (1985), “Transmission for Knowledge not Established,” The Philosophical Quarterly 35: 193-95.
    • Reply to Forbes 1984.
  • Brueckner, Anthony (2000), “Klein on Closure and Skepticism,” Philosophical Studies 98: 139-151.
    • Reply to Klein 1995.
  • DeRose, Keith (1995), “Solving the Skeptical Problem,” Philosophical Review 104: 1-52.
    • Influential defense of contextualist epistemology.
  • Forbes, Graeme (1984), “Nozick on Scepticism,” The Philosophical Quarterly 34: 43-52.
    • Argues that Nozick’s denial of closure cannot adequately handle cases of inferential knowledge.
  • Goldman, Alvin (1976), “Discrimination and Perceptual Knowledge,” Journal of Philosophy 73: 771-791.
    • Defends reliabilist account of knowledge that denies closure, and contains a helpful discussion of the notion of a relevant alternative.
  • Klein, Peter (1981), Certainty: A Refutation of Skepticism, Minneapolis: University of Minnesota Press.
    • Argues that defense of knowledge closure assumes internalism about justification, so the skeptic who uses the principle begs the question against the externalist anti-skeptic.
  • Klein, Peter (1995), “Skepticism and Closure: Why the Evil Genius Argument Fails,” Philosophical Topics 23: 213-236.
    • Offers a defense of closure for justification, which, whether the defense succeeds or fails, he says refutes the skeptic.
  • Luper(-Foy), Steven (1987), “The Causal Indicator Analysis of Knowledge,” Philosophy and Phenomenological Research 47: 563-587.
    • Argues for a tracking account of knowledge that retains closure.
  • Pritchard, Duncan (2002), “McKinsey Paradoxes, Radical Scepticism, and the Transmission of Knowledge Across Known Entailments,” Synthese 130: 279-302.
    • Reply to Davies and Wright on transmission and the McKinsey paradox.
  • Salmon, Nathan (1989), “Illogical Belief,” Philosophical Perspectives 3: 243-285.
    • Argues that his Millian account of names and belief produces counterexamples to closure principles of justification and knowledge.
  • Silins, Nicholas (2005), “Transmission Failure Failure,” Philosophical Studies 126: 71-102.
    • Argues against the Davies-Wright line on transmission failure.
  • Sosa, Ernest (1999), “How to Defeat Opposition to Moore,” Philosophical Perspectives 13: 141-152.
    • Adjustment of the tracking account of knowledge that allows it to sustain closure.
  • Stine, Gail (1971), “Dretske on Knowing the Logical Consequences,” Journal of Philosophy 68: 296-299.
    • Reply to Dretske 1970.
  • Warfield, Ted (2004), “When Epistemic Closure Does and Does not Fail: a Lesson from the History of Epistemology,” Analysis 64: 35-41.
    • Points out that one cannot refute closure for knowledge by showing that some necessary condition for knowledge fails to be closed.

Author Information

John M. Collins
Email: collinsjo@ecu.edu
East Carolina University
U. S. A.

Open Theism

Open Theism is the thesis that, because God loves us and desires that we freely choose to reciprocate His love, He has made His knowledge of, and plans for, the future conditional upon our actions. Though omniscient, God does not know what we will freely do in the future. Though omnipotent, He has chosen to invite us to freely collaborate with Him in governing and developing His creation, thereby also allowing us the freedom to thwart His hopes for us. God desires that each of us freely enter into a loving and dynamic personal relationship with Him, and He has therefore left it open to us to choose for or against His will.

While Open Theists affirm that God knows all the truths that can be known, they claim that there simply are not yet truths about what will occur in the “open,” undetermined future. Alternatively, there are such contingent truths, but these truths cannot be known by anyone, including God.

Even though God is all-powerful, allowing Him to do everything that can be done, He cannot create round squares or make 2 + 2 = 5 or do anything that is logically impossible. Omniscience is understood in a similar manner. God is all-knowing and can know all that can be known, but He cannot know the contingent future, since that, too, is impossible. God knows all the possible ways the world might go at any point in time, but He does not know the one way the world will go, so long as some part of what will happen in the future is contingent. So, Open Theists oppose the claim of the sixteenth-century Jesuit theologian, Luis de Molina, that God has “middle knowledge.”

Open Theists believe that Scripture teaches that God wanted to give us the freedom to choose to love or reject Him. In order for each of us to genuinely have a choice for which we are morally responsible, we must have the ability to do otherwise than we do. This is the distinctive necessary condition of what has come to be called libertarian freedom. God may intervene in the created world at any time, and He may determine that we act in ways of His choosing. But He cannot both respect our libertarian freedom and guarantee that we will do specific things freely. Thus, Open Theists believe that God has created a world in which He takes the risk that many of us will reject Him and act in ways opposed to Him, in order to give us the opportunity to freely choose to love and obey Him.

Table of Contents

  1. History of Open Theism
  2. The Biblical Witness
  3. Philosophical Considerations
  4. Theological Implications
  5. References and Further Reading
    1. For Open Theism
    2. Against Open Theism
    3. Multiple Views

1. History of Open Theism

Open Theism has been a significant topic in philosophy of religion and in evangelical Christian circles since the 1994 publication of The Openness of God: A Biblical Challenge to the Traditional Understanding of God by Clark Pinnock, Richard Rice, John Sanders, William Hasker, and David Basinger. Philosophers of religion such as A. N. Prior, J. R. Lucas, Peter Geach, Richard Swinburne, and Richard Purtill had advocated Open Theism in their writings prior to this date, though not under that name, and Rice had published a work initially entitled The Openness of God in 1980. (It was later republished as God’s Foreknowledge and Man’s Free Will.) But the 1994 book’s attempt to systematically explicate the relational view of God that its authors labeled the open view clearly marks the beginning of increased discussion and debate over Open Theism’s tenets.

Since the publication of The Openness of God, there has been significant debate about not only the philosophical and theological merits of Open Theism, but also its orthodoxy. In 2003, the Evangelical Theological Society considered whether to remove Clark Pinnock and John Sanders from its membership for implicitly disavowing the inerrancy of Scripture in their writings by suggesting that some Biblical passages traditionally understood to be prophecies have remained and may continue to remain unfulfilled. While Pinnock agreed to revise the most objectionable passage in his book Most Moved Mover, Sanders continued to maintain that God does not infallibly predict or prophesy what will contingently occur in the future. He held that although some Biblical passages may initially appear to attribute to God foreknowledge, or unconditional prophecy, of what will contingently occur, these passages must be interpreted differently (more below). The charges against Pinnock and Sanders were not sustained, though only narrowly in Sanders’s case.

Proponents of Open Theism allow that their view is at odds with the great majority of the Christian tradition in rejecting both meticulous providence and divine foreknowledge of what will contingently occur. However, they argue that the tradition, guided by neo-Platonic philosophy in its formation, had difficulty reconciling beliefs about the implications of God’s perfection with the Biblical witness to a God that cares deeply about His people and how they respond to Him. Many of the early Church Fathers affirmed elements of the Open Theists’ relational view of God, in tension with their beliefs in divine impassibility. Then Saint Augustine, whose Confessions tell us that his faith partially resulted from a careful study of neo-Platonism, forcefully argued for an emphasis on God’s perfection and otherness from His creation that precluded genuine responsiveness on God’s part to our actions. The (Western) Christian tradition subsequently became largely identified with an Augustinian understanding of providence. The early Church Fathers’ idea that God’s foreknowledge is conditioned by human actions did not receive significant consideration again until Jacob Arminius in the sixteenth century and John Wesley in the eighteenth. And it is only recently, in light of philosophical considerations of the nature of freedom, that the full reciprocal relationality of Open Theism has been affirmed, with its concordant denial that God knows what will contingently occur.

Open Theists suggest that when the testimony of Scripture is considered together with philosophical reflection on the conditions necessary for free and morally responsible action, the view that results is theirs. An emphasis on God’s conditioned relationship to His creation is clearly present in the early Church, in the Eastern Church, and in developments during and in response to the Protestant Reformation. This emphasis is largely absent from the theology of the Middle Ages, but the giants of theology from Augustine to Aquinas were clearly attempting to understand God and His relationship to the world in light of the best secular philosophy available to them. While Open Theists acknowledge that their view is in important respects at odds with the Christian tradition, they also maintain that their view is not as dissonant from that tradition as might be thought; it is just that the emphasis on God as a perfect being who does not change in any respect, which is neither clearly taught by Scripture nor obviously compatible with God’s loving relationality, must be rethought.

2. The Biblical Witness

Open Theists suggest that there is a strong Biblical case to be made for affirming a God who respects our moral responsibility while inviting us into a loving relationship with Him. They argue that the most plausible reading of the Bible reveals a personal God who genuinely interacts with human persons and accepts that His desires and projects are dependent on that interaction. As discussed below, Open Theists read the Bible as showing that God desires to be in relationship with the people He has created, that He sometimes changes His mind as a result of dialogue with His people, and that He seeks to accomplish His goals for the world in concert with human agents. They also point to passages that attribute to God the learning of information as evidence that God’s knowledge is not settled, and does not include foreknowledge of the occurrence of contingent events.

Critics of Open Theism offer alternative interpretations of the passages frequently cited by Open Theists, and bring forward their own proof texts that the Biblical God is one whose sovereignty over creation includes exhaustive foreknowledge and ultimate control over each and every aspect of His creation. In any consideration of how well Open Theism accords with the teachings of Scripture, it is important to note that one’s philosophical understandings of freedom and moral responsibility necessarily inform one’s hermeneutic. One cannot fully appreciate the Biblical cases made for or against Open Theism without also appreciating the philosophical considerations discussed in the subsequent section. Open Theism is most plausible if the dignity and responsibility of an agent require the freedom to do otherwise; if this is so, then texts that attribute responsibility to persons seem clearly to require that God does not also determine those persons’ actions. If foreknowledge is also incompatible with the ability to do otherwise, then neither can God know what we will do. But if our responsibility is consistent with either or both of divine foreknowledge and God’s sovereign determination, then the force of these passages is not nearly as great, and there is no need to seek a more nuanced reading of passages that on their face seem to attribute to God unconditioned knowledge of contingent events in the future.

Open Theists argue that the God revealed in the Bible clearly desires to be in relationship with the people He has created. From the beginning, we have been created in God’s image and given responsibility to care for His creation (Gen. 1:26). God’s relationship to His creation is clear throughout the narrative of the Old Testament. Both Abraham and Moses, among others, speak, and indeed argue, directly with God. Abraham questions God about how His promises will be fulfilled (Gen. 15), and prevails upon Him to spare Sodom if only ten righteous people can be found living there (Gen. 18). Immediately after Abraham shows himself faithful to God by his willingness to obey God even to the point of sacrificing his son Isaac, God states that it is because of Abraham’s obedience that He will maintain His promise to bless Abraham and his descendants (Gen. 22:15-18). Abraham questions God, dialogues with God, affects God’s decisions, and his actions of obedience are credited by God as at least partly responsible for His fulfilling the promise of blessings that He has revealed to Abraham. Moses speaks with God, and because he lacks confidence to speak to his fellow Israelites, God appoints Aaron to speak for him (Ex. 4:1-18). God reveals His law to Moses, and when the Israelites turn their backs on their Deliverer, Moses reminds God of His promises and asks Him to relent from His anger and spare His people (Ex. 32:9-14). It is clear throughout the Pentateuch that God speaks to chosen leaders of His chosen people, and that He not only commands them, but also listens to their concerns, often adjusting His original plans in light of His dialogue with them.

In both the Old and New Testaments, God presents Himself as working with human agents, and as being disappointed in His hopes for them, rather than as compelling them to act in prescribed ways. This is clear throughout the narrative of Israel, and in passages such as Is. 65:1-2, in which the Lord bemoans the stubbornness of those who will not call on Him, despite His many revelations to them. The Bible teaches us that we can thwart God’s desire that we freely return His love. This is suggested by passages such as Mark 6:5-6, in which we are told that Jesus could not perform many miracles in his hometown because of the lack of faith of its people, and it is explicit in Luke 7:30, in which we are told that the Pharisees rejected God’s purpose. God asks us to follow and obey Him; He does not compel obedience. Nor should every calamitous event be assumed to be divine punishment for disobedience (Job, Lk. 13:1-5, Jn. 9:1-3).

The above passages suggest that God desires to be in relationship with His created people in a manner that respects their freedom to respond to Him in various ways, and that He is genuinely responsive to our concerns. There are also passages in Scripture that more directly suggest that the future is open, and that not even God has foreknowledge of what will contingently happen. Genesis 22:12 records God as stating, “Now I know that you fear God, because you have not withheld from me your son, your only son.” The emphasis on “now” knowing “because” of Abraham’s action clearly points to this being a genuine test of Abraham’s faith, where even God could not be sure of Abraham’s response to the test. Jeremiah 3:7 and 19-20 quote God as saying that He thought Israel would return to faith in Him, but that she had not. Mark 6:6 emphasizes Jesus’ amazement at the lack of faith of those in His hometown, a reaction that only makes sense if He had had an expectation of greater faith. These passages suggest that God can genuinely learn new information.

Of course, the above is meant only to be suggestive of the kinds of considerations that Open Theists emphasize in reading the Bible. These several texts are among those that suggest that God desires to be in a relationship that respects our freedom to respond to God in a variety of ways, and that He has thus left the future open to determination through our actions, at least in part. But critics of Open Theism interpret the same data differently. For instance, Classical Theists may suggest that an incarnational theology’s emphasis on the revelation of God in Christ is misguided if it does not give sufficient weight to the idea that God veiled His glory in becoming human (see Jn. 17:5). And they cite other texts that are arguably more suggestive of the traditional view of God as providentially in control of all that happens, such as Isaiah 40-48, Romans 9, and Ephesians 1:11.

Any reading of the Bible must seek a consistent hermeneutic, and must acknowledge that certain texts must be given readings that are not initially obvious. “Prophetic” texts are read by Open Theists as either decrees of what God has decided to do, conditional predictions about what will happen if certain conditions (such as repentance) are not met, or forecasts based upon God’s exhaustive knowledge of the past and present. None of these interpretations requires God to have exhaustive foreknowledge of future events, but responsible readers of the Bible may well disagree about the plausibility of these interpretations as applied to specific passages. Open Theists also argue that plausible readings that accord with Open Theism can be given of “pancausality” texts such as those alluded to in the previous paragraph, and that this is preferable to dismissing as merely anthropomorphic the overwhelming sense of the Bible that God is in dynamic relationship with His creation.

3. Philosophical Considerations

Many theologians in the Christian tradition have maintained both that we are free to choose how we act, and that God foresees our choices. Many lay Christians likewise think that this is the obvious way to reconcile our freedom with God’s omniscience. So long as God does not pre-determine that we act in the ways that we do, but only “sees” what we do, what is the problem? Why does Open Theism insist that the future is open in such a way that God’s foreknowledge of contingent events must be denied?

There are two primary ways of understanding the nature of human freedom. The “compatibilist” view of freedom is that so long as one is acting in a manner that accords with one’s desires or can be otherwise identified with one’s character, one acts freely. Our freedom is compatible with our actions being determined, so long as we are acting in the way we want. We are free so long as, were we to desire otherwise, we could act otherwise, and this is so even if we could not desire otherwise. If this is the right view of our freedom, then God might predetermine all of our actions while they are yet free, so long as they are consistent with our character.

The alternative account of the nature of freedom is “libertarian.” This account maintains that unless one is genuinely able to do otherwise than one does, one is not free. So, if one’s character is formed in such a way that one will certainly act in a particular way, and if one has no control over one’s character, then one is not really free, since one cannot act in a manner otherwise than one does. Importantly, one may remain morally responsible for one’s action if one’s character has become thus through one’s earlier free decisions. (Alternatively, one might be said to be free in a derivative sense if one’s character was freely chosen in the past.) If as a result of our sinful nature we cannot choose to do good, then we are not genuinely free to do otherwise than sin. We must really be able to either accept God’s invitation to love Him or to reject it, if we are free with respect to this choice. And if we are not and have never been libertarianly free with respect to this choice, then we are not morally responsible for our choice of whether or not to love God.

Open Theists affirm a libertarian view of freedom. From almost the beginning of Western philosophy, philosophers have been concerned with whether such freedom is compatible with prior truths about what one will do. Aristotle famously argued in his De Interpretatione (chapter 9) that prior truth is incompatible with future contingency. His argument there may be represented as follows:

  1. It is true that it will be white.
  2. If it is true that it will be white, then it has always been true that it will be white.
  3. If it has always been true that it will be white, then it is impossible that it will not be white.
  4. If it is impossible that it will not be white, then it is necessary that it will be white.
  5. It is necessary that it will be white.

An obvious implication of this argument is that if it is now true that one will act in a particular way, then it is necessary that one will act in that way. But it is not immediately clear why one should accept premise 3. Why should one think that something’s always having been the case entails the impossibility of its ever being otherwise?
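
The logical shape of the five-step argument can be displayed in symbols. This is only a sketch under assumed notation, none of which is Aristotle’s or the source’s: W abbreviates “it will be white,” T and A stand for “it is true that” and “it has always been true that,” and the box and diamond are the usual necessity and possibility operators.

```latex
% Assumed notation (not in the source): W = "it will be white";
% T = "it is true that"; A = "it has always been true that";
% \Box, \Diamond = necessity, possibility (requires amsmath, amssymb).
\begin{align*}
1.\quad & T W                                   && \text{premise} \\
2.\quad & T W \rightarrow A W                   && \text{premise} \\
3.\quad & A W \rightarrow \neg\Diamond\neg W    && \text{premise} \\
4.\quad & \neg\Diamond\neg W \rightarrow \Box W && \text{premise} \\
5.\quad & \Box W                                && \text{from 1, 2, 3, 4 by modus ponens}
\end{align*}
```

On this rendering the argument is valid, and premise 4 is just an instance of the standard duality between necessity and possibility. The substantive step from a temporal claim to a modal one is carried by premise 3 alone, which is why the discussion focuses there.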

One plausible reason for thinking this is based on the idea that one cannot change the past. If a proposition was once true, can one now act in such a way that it is no longer true? If not, then the prior truth of a proposition about what one will do seems enough to rule out one’s doing otherwise, and thus rule out one’s being libertarianly free with respect to that action. The same type of consideration applies to God’s prior knowledge of what one will do. Consider the following argument given by William Hasker in The Openness of God:

  1. It is now true that Clarence will have a cheese omelet for breakfast tomorrow. (Premise)
  2. It is impossible that God should at any time believe what is false, or fail to believe anything that is true. (Premise: divine omniscience)
  3. God has always believed that Clarence will have a cheese omelet tomorrow. (From 1, 2)
  4. If God has always believed a certain thing, it is not in anyone’s power to bring it about that God has not always believed that thing. (Premise: the unalterability of the past)
  5. Therefore, it is not in Clarence’s power to bring it about that God has not always believed that he would have a cheese omelet for breakfast. (From 3, 4)
  6. It is not possible for it to be true both that God has always believed that Clarence would have a cheese omelet for breakfast, and that he does not in fact have one. (From 2)
  7. Therefore, it is not in Clarence’s power to refrain from having a cheese omelet for breakfast tomorrow. (From 5, 6) So Clarence’s eating the omelet tomorrow is not an act of free choice. (From the definition of free will.)

If premise 4 is true and if we have libertarian freedom, then it is not possible for God to know what we will freely do before we do it.
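
The role of premise 4 is easier to see when the later steps are written schematically. The following is a sketch only, not Hasker’s own formulation: O, G, and the power operator P_c are assumed abbreviations, and the bridging principle used to reach step 7 is a hypothetical reconstruction of the inference, since the argument as quoted does not state it.

```latex
% Assumed notation (not in the source):
% O = "Clarence has a cheese omelet for breakfast tomorrow";
% G = "God has always believed that O";
% P_c(\varphi) = "it is in Clarence's power to bring it about that \varphi".
\begin{align*}
3.\quad & G                              && \text{from 1, 2} \\
4.\quad & G \rightarrow \neg P_c(\neg G) && \text{instance of the unalterability premise} \\
5.\quad & \neg P_c(\neg G)               && \text{from 3, 4} \\
6.\quad & \neg\Diamond (G \land \neg O)  && \text{from 2} \\
7.\quad & \neg P_c(\neg O)               && \text{from 5, 6}
\end{align*}

% Hypothetical bridging principle assumed for the step to 7:
% what one cannot falsify, one cannot falsify by way of something that entails its falsity.
\[
\bigl( \neg P_c(\neg G) \land \Box(\neg O \rightarrow \neg G) \bigr)
  \rightarrow \neg P_c(\neg O)
\]
```

Step 6 is equivalent to the necessity of the conditional from ¬O to ¬G, so with the assumed bridging principle the conclusion follows from step 5. Rejecting step 4, as defenders of “soft” facts about the past do, blocks step 5 and with it the conclusion.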

Whether one finds Open Theism plausible largely depends on whether one finds the intuition underlying premise 4 plausible. Philosophers have debated whether all of the past consists of “hard” facts fixed in this way, or whether there are “soft” facts that might be conditional upon our future actions. Proponents of the compatibility of human libertarian freedom with divine foreknowledge have argued that facts about God’s prior knowledge of our future actions are conditional on our subsequent choices. To use Clarence as an example, were he to choose to have a bagel tomorrow, it always would have been true that God knew that he would so choose, rather than that he would choose to eat an omelet. Since there is no reason to think that Clarence’s choice is determined by prior causes, divine or otherwise, one may affirm that he is free to have an omelet or not even while maintaining that God knows he will have an omelet. Clarence has what has been termed “counterfactual power” over the past: the power to act in such a way that were he to so act, the past always would have been different from what it in fact is. Proponents of counterfactual power over the past can thus agree that Clarence does not have the power to change, or alter, the past, since were he to eat a bagel, it never would have been true that he would eat an omelet tomorrow.

Philosophers have not come to an agreement over whether one might have counterfactual power over the past, or whether the past is instead fixed in a manner that rules out this power. On this topic, basic intuitions about freedom and the fixity of the past differ from person to person, and largely determine how they view the compatibility of divine foreknowledge with human freedom, and thus how they view the plausibility of Open Theism.

It is important to note that even if foreknowledge and freedom are compatible, it is not clear that simple foreknowledge — foreknowledge that is not based on middle knowledge (see below) — could be of any aid to God in providentially ordering His creation. If God knows what will actually happen, He cannot also use this information to arrange for something else to happen, for then the contents of what He “knows” would not comprise knowledge. Foreknowledge is of the actual occurrence of future events; once the occurrence of these events is known, it is “too late” to prevent them (or to bring them about). Doing so is incompatible with their occurrence being infallibly known by God. Simple foreknowledge, if God has it, allows Him to know what will occur without having to wait for the future occurrence of events, as He must for contingent events according to Open Theism. But His knowledge is no less conditioned by the occurrence of the events; He has no greater control over their occurrence based on foreknowledge than He does if Open Theism is true.

Once it is realized that simple foreknowledge does not offer any providential advantage to God, one may wonder what reason there is to affirm it, aside from an assumption that it is more perfect for God to have such knowledge than not. One might think that foreknowledge would provide an explanation for the accuracy of prophecy. But it does not. If God has “at once” complete foreknowledge of all that happens, He “sees” what will happen including whether or not He instructs persons to prophesy that events will happen. Given knowledge of what will occur, God is not free to do otherwise than He foresees He will do. Perhaps God could “look” at a little bit of the future at a time, make decisions about how He will react to the events He foresees, and then “look” a little further to see how His creation reacts to these actions. But this would offer no greater help for predicting future events. Suppose that God foresees the course of the world until the end of 1935. Could He then decide to warn persons on January 1, 1936, that the Holocaust is about to occur? Not in any infallible way. For assuming that the Holocaust was still avoidable in 1935, and assuming that God has not yet “looked” beyond 1935, He does not yet know what will occur in the next ten years. He can decide to make probably accurate but possibly mistaken predictions on January 1, 1936, based on the tendencies present at that point, but this is no more than He can do given Open Theism.

Simple foreknowledge has no utility for God’s providential governance of the world, nor can it ground infallible predictions of future events. (It should also be reiterated that Open Theists believe that there are fewer instances of such predictions in the Bible than is thought by those who affirm a traditional meticulous view of providence.) If one wants to affirm that we have libertarian freedom and still maintain a traditional view of providence according to which God directs the course of the world rather than merely witnessing how it unfolds, then affirming foreknowledge is not enough.

The most plausible view of how human libertarian freedom might be compatible with a traditional view of providence, and thus the greatest competitor to Open Theism, is a view called “Molinism,” named after a sixteenth century Jesuit theologian, Luis de Molina. Molina predicated “middle knowledge” to God and explained God’s providential determination of what will occur in terms of this knowledge. Middle knowledge is knowledge that lies between (in an explanatory sense, not a temporal sense) God’s “natural” knowledge of all the possible ways the world might go and His “free” knowledge of the one way the world will go based upon His creative decree. Natural knowledge is pre-volitional knowledge of necessary truths, including all the possibilities for creation. Free knowledge is post-volitional knowledge of contingent truths, including all future contingent truths. And middle knowledge is pre-volitional knowledge of contingent subjunctive conditional truths of the form: if such and such were the case, then so and so would be the case. God’s middle knowledge includes all the facts about how the world would go given various antecedent conditions. These facts, because they are known before God wills anything, are outside of His control.

Through middle knowledge, God might have known that were He to place Adam and Eve in the Garden of Eden in just the way He did, then they would sin by eating of the tree of the knowledge of good and evil. And He might have known that if they did this and He subsequently expelled them from the garden, events would unfold in a certain way. God’s middle knowledge would include all the true subjunctive conditionals about how the persons He might create would act in the various circumstances in which He might place them. These subjunctive conditionals have come to be called “counterfactuals of creaturely freedom.” Based on this exhaustive middle knowledge, God would have known how events would unfold given any creative action He might decide to perform. And on the assumption that libertarian freedom is consistent with knowledge of how one would act in various circumstances, our freedom would remain intact. Molinism promises to uphold both our libertarian freedom and God’s ability to providentially decide exactly what occurs in His creation.

There are two primary objections to Molinism that Open Theists have advanced. If the argument that foreknowledge is incompatible with libertarian freedom is valid, then a similar argument can be made against the compatibility of middle knowledge with libertarian freedom. If it has always been true and known by God that I would act in such and such a way if I were in such and such circumstances, then do I have the power to bring it about that this fact has never been true, or never been known by God? Do I have counterfactual power over this past truth and God’s past knowledge of it? I must, in order to be libertarianly free. The same intuitions about the fixity of the past are brought into play. The other objection to Molinism given by Open Theists, termed the “grounding objection,” is based on the status of the counterfactuals of creaturely freedom. These are truths that, though contingent, are not under God’s control. God “finds Himself” faced with these truths, similarly to the manner in which He “finds Himself” faced with the fact that 2+2=4. But why are certain subjunctives true and certain ones not? The grounding objection is that there seems to be no reason that some particular counterfactuals of creaturely freedom are true rather than others. There is no ground for their truth or falsity. If one believes that all truths, or all contingent truths, must have some underlying ground or “truth-maker,” then one will reject the idea that there are counterfactuals of creaturely freedom available to God prior to creation.

The most important philosophical argument for Open Theism is based on the idea that God’s foreknowledge of one’s actions is incompatible with those actions being free because one does not have the power to bring it about that God has never known something that He does in fact know. But it is important to note that foreknowledge alone is of no help to God in providentially directing the course of His creation. The real competitor to Open Theism as an account of God’s providence is Molinism. Open Theists object to Molinism because they view as implausible the counterfactual power over the past that Molinism requires, and because they believe that there are insufficient grounds for the contingent truth of the counterfactuals of creaturely freedom that Molinists believe God knows via His middle knowledge.

4. Theological Implications

In considering any theology, it is important not only to evaluate the Scriptural and philosophical arguments for and against the view, but also to consider how it might be incorporated into one’s lived faith. So, this article ends with a consideration of the practical implications of Open Theism – for how one views evil, for prayer, and for how one understands the responsibility for salvation.

The traditional view of divine providence holds that each and every event occurs according to God’s will. The implication that the most horrendous evils are thus intended by God has troubled many persons. One of the advantages of Open Theism (and any other view that denies meticulous providence) is that the responsibility for evil is much more clearly removed from God and placed upon our free choices. Because God desires that we freely choose to love Him, He has given us the freedom to reject Him as well, and our acts of rejection take all kinds of horrible forms. The responsibility for the evil that we freely perform is fundamentally ours. While God gave us the ability to do evil things, He does not in any sense intend that we do them. Rather, He grieves with and comforts the victims of our sins.

If God’s will for the world is inviolable, then we must have faith that each instance of evil serves some greater good that God has purposed. On the other hand, if much of the evil in the world is due to our free choices, then there is significant gratuitous evil that serves no further purpose. To those who believe that much of the evil in the world is indeed gratuitous, Open Theism provides an understanding of God’s general project that explains why He allows us to exercise our freedom in ways that sadden Him. He does this because He must do so in order to also allow us the freedom to reciprocate His unfailing love for us.

Not everyone finds this kind of free will defense against the problem of evil comforting. If Open Theism is true, then there is no guarantee that everything will work out as God wants in the end. Open Theists may trust and hope in God’s wisdom and power, but they recognize that there are limitations on what God can effect if we stubbornly refuse to aid Him. Some persons find it easier to have faith in an inscrutable secret will of God that is furthered by the evil we witness. This response to evil also has the advantage of applying to natural evil as well as evil events that result from our actions. While Open Theists may point out that much of the “natural” evil in the world is exacerbated by our poor stewardship of the earth, they must also seek additional explanations for God’s allowance of the devastation and suffering brought about by natural disasters.

Just as one’s views of freedom and of whether the past is fixed in such a way as to rule out counterfactual power over it are good predictors of whether one finds Open Theism plausible, one’s reaction to evil is also a reliable indicator of how one thinks of Open Theism. If one cannot imagine that a good and loving God would intend that genocide, torture, rape, and other horrendous evils occur for some inscrutable good, then one is likely to find a free will theodicy, and Open Theism, comforting. If instead one cannot imagine that God would allow us to perform such horrible acts, or allow the massive suffering caused by natural disasters, without there being some very great good that they serve, then one is likely to put one’s faith in the mysterious but certain goodness of God’s meticulous governance of creation.

One of the advantages of Open Theism over any theology that affirms divine foreknowledge or foreordination is that prayer can genuinely influence God’s decisions. Because the future is open and not yet determined, we may pray that God will exercise His influence in ways we desire. We may ask that He will aid ourselves or others. We may easily make sense of James’ assertion that “You do not have, because you do not ask God” (Jas. 4:2b). In contrast, if God determines the occurrence of each and every event, then He also determines whether and how we pray. On a traditional view of God that affirms His meticulous sovereignty, our prayer is ultimately brought about by God; it cannot persuade God. And even if God merely foreknows our prayers as part of His exhaustive foreknowledge, rather than bringing those prayers about, He also foreknows His response to those prayers, so that there is no greater room for our prayers to influence God’s decisions. Only if the future is open does prayer that God will act in certain ways make sense. Since we often pray in this way, this is an important consideration in favor of Open Theism.

However, proponents of more traditional views of sovereignty can attempt to minimize the purported advantage that Open Theism has for understanding prayer by asking what essential role prayer plays in God’s decision-making, even if Open Theism is true. Since God knows everything about the past and present, and the probabilities of what might occur in the future, can prayer really inform God of anything? He already knows our every thought and desire, and whether our wants are likely to be good for us. Given this, should we think of God as waiting for us to pray to take whatever action seems best for those for whom we pray? Perhaps. It may be that the action of making a request is important – perhaps we do not really understand what it is we would ask, until we bring ourselves to ask it. It also may be that God sometimes grants requests that we make, even though He believes that they are ill-advised, because He believes that we will learn important lessons from pursuing the course of action we desire. Open Theists may respond to the above line of criticism in various ways, but it should be clear that the advantage that Open Theists have for understanding prayer as a means of influencing God is not as great as it initially appears.

The critical questions about how our prayers might influence the actions God chooses to take in the world do not apply in the same way to prayers for divine guidance. Here too, Open Theists have the advantage of a view that allows God to genuinely guide and advise His followers, because the future is not determinate. We may pray that God would guide us in important choices that we must make, trusting in His greater knowledge of the possible and probable effects of these choices. This too is an important kind of prayer that we often exercise, and so the advantage of being able to understand how God might genuinely guide us in response to prayers that He do so is an important benefit of affirming Open Theism. Molinists may say that God chooses to create a world in which He always knows that and how we will pray, in which He knows how He will respond to these prayers, and in which He knows how we will respond to His “guidance.” But assuming that Open Theists are right to deny counterfactual power over the past, God’s responses to prayer given Molinism cannot constitute advice that one may take or not, as they do given Open Theism, precisely because Molinists view the future as determinate and known by God once God has willed His initial creation.

Of course, God’s guidance is limited to His knowledge of how things will probably go if one thing is done rather than another. He cannot know what will happen as a result of our decision so long as the effects of that decision will be influenced by other free decisions. And the further into the future we consider, the less certain even God can be of what will occur. So while God’s advice about what to do is certainly much better than any other person’s, it is no guarantee that everything will in fact go well. Furthermore, the idea of praying for guidance is most easily understood on a dialogical model, in which we speak with and hear from God. If one does not feel that God usually communicates with us so directly, then it is harder to understand how He might guide us in any precise way. It is important to note that seeking “signs” of God’s will for us is not likely to be particularly reliable if those signs could also be brought about or blocked by other free agents.

In light of the above discussion, we may conclude that Open Theists can understand the efficacy of prayers that God will act in certain ways and prayers for divine guidance in decision-making. In contrast, those who affirm meticulous providence or exhaustive and settled foreknowledge of what will contingently occur plausibly cannot understand this efficacy, since there seems to be no room for our prayers to affect God or for His response to them to affect our decisions, if the decisions of both God and ourselves have always been foreknown, and perhaps foreordained. But we have also seen that what initially seems to be a clear advantage for Open Theism is tempered by questions about how exactly we might influence God, and about how exactly He might communicate His advice to us in response to prayers for guidance.

The final theological implication of Open Theism that requires discussion is the degree to which we have a greater responsibility for our salvation if Open Theism is true. Traditionally, Christians have emphasized that we are constrained by our sinful nature in such a way that we cannot respond favorably to God without additional grace given by Him. If this grace is both necessary and sufficient for a “salvific” faith, then the ultimate cause of whether one is saved or not is God’s giving or withholding of that grace, rather than any “choice” one makes. Open Theism claims that it is essential that the choice for or against God that determines our salvation be genuinely up to us. We must be free to choose to love or reject God, in order for our choice to love Him to be genuine, and giving us that genuine choice is the reason that God has given us libertarian freedom.

To what extent is God’s glory diminished by His giving us a greater role in our salvation, that of genuinely choosing whether or not to follow Him? While some opponents of Open Theism have argued that any attribution to human persons of an ability to determine a necessary condition of salvation impugns God’s sovereignty, it is not at all clear that this is so. If Open Theism is true, we are still dependent on God’s gracious and freely-given invitation to us to love Him and thereby be saved. Open Theists may even affirm a doctrine of sin that predicates to us an inability to respond favorably to God without further enabling grace. But they claim that God has extended this enabling grace to all persons through Jesus Christ and the Holy Spirit. The only thing that we do is decide whether or not to accept the greatest gift imaginable. There is no cause for pride on our part in making the right choice. If we truly appreciate God’s glorious sovereignty, rather than requiring that His sovereignty be understood in particular ways, then the only appropriate response to God’s invitation involves humility.

The debate over whether Open Theism correctly portrays God’s relationship to His creation involves a complicated web of Biblical data, philosophical arguments, and reflection on the practical theological implications of the view. Certain points of contention clearly divide those who might consider Open Theism from those who will not: a belief that libertarian freedom is essential to moral responsibility, a belief that the past is fixed in such a way that we do not have the ability to bring it about that it was always different, and a belief that evil should be attributed to our imperfect human decisions rather than to a secret inscrutable will of God. Of these three beliefs, it is the second that divides Open Theists from Molinists, who also affirm libertarian freedom but attempt to do so in concert with meticulous providence. Even if one affirms all three of these beliefs, however, there remains the hard work of slowly working through a detailed examination of Scripture and reflection on the Christian life. This is the case for any theology, and it is perhaps especially so for a relatively young theology such as Open Theism.

5. References and Further Reading

a. For Open Theism

  • David Basinger, The Case for Freewill Theism: A Philosophical Assessment (Downers Grove, IL: InterVarsity Press, 1996).
    • A brief consideration of freewill theism generally, and open theism specifically, especially as applied to the topics of omniscience, evil, and prayer.
  • Gregory A. Boyd, God of the Possible: A Biblical Introduction to the Open View of God (Grand Rapids, MI: Baker Books, 2000).
    • A brief and easy to read consideration of the Biblical case for Open Theism.
  • Terence Fretheim, The Suffering of God: An Old Testament Perspective, Overtures to Biblical Theology (Philadelphia: Fortress Press, 1984).
    • A study of the use of metaphors in describing God in the Old Testament, and a case for predicating suffering, and thus genuine responsiveness, to God.
  • William Hasker, “Foreknowledge and Necessity,” Faith and Philosophy 2, no. 2 (April 1985), 121-157.
    • An extended argument that foreknowledge is incompatible with libertarian freedom.
  • William Hasker, God, Time and Knowledge, Cornell Studies in the Philosophy of Religion (Ithaca, NY: Cornell University Press, 1998).
    • A book length exposition of the philosophical case for Open Theism. Also a good place to start to get a sense of the philosophical debate over the relationship of freedom and divine foreknowledge.
  • William Hasker, Providence, Evil, and the Openness of God, Routledge Studies in the Philosophy of Religion (New York: Routledge, 2004).
    • A consideration of the strengths of Open Theism in comparison with Calvinism, process theism, and Molinism, especially with regard to the problem of evil and the question of divine action within the world.
  • Clark H. Pinnock, Most Moved Mover: A Theology of God’s Openness (Grand Rapids, MI: Baker Books, 2001).
    • An exposition of Open Theism in terms of the controlling metaphor of God as love that treats in turn: the Scriptural foundations for Open Theism, the development of traditional Christianity influenced by Hellenic philosophy, the philosophical case for Open Theism, and Open Theism’s adequacy to the practical demands of living one’s faith.
  • Clark H. Pinnock, Richard Rice, John Sanders, William Hasker, and David Basinger, The Openness of God: A Biblical Challenge to the Traditional Understanding of God (Downers Grove, IL: InterVarsity Press, 1994).
    • The book that began the extensive debate over Open Theism. A series of five essays that consider Biblical and historical considerations in favor of Open Theism, what a systematic openness theology amounts to, the philosophical case for this view, and its practical implications. An appropriate starting point for anyone interested in learning about Open Theism.
  • Richard Rice, God’s Foreknowledge and Man’s Free Will (Eugene, OR: Wipf and Stock Publishers, 2004). Previously published as The Openness of God: The Relationship of Divine Foreknowledge and Human Free Will (Minneapolis: Bethany House, 1980).
    • An early argument for the present-knowledge or open view of God.
  • John Sanders, The God Who Risks: A Theology of Providence (Downers Grove, IL: InterVarsity Press, 1998).
    • The best exposition of Open Theism to date, especially with respect to the Biblical case for the view, and in systematically setting out openness theology. Also an excellent source of additional references to texts related to Open Theism.
  • Richard Swinburne, The Coherence of Theism, rev. ed. (New York: Oxford University Press, 1993).
    • A penetrating philosophical case for understanding theism in a manner that accords with Open Theism’s view, made prior to the widespread use of that term.

b. Against Open Theism

  • William Lane Craig, The Only Wise God: The Compatibility of Divine Foreknowledge and Human Freedom (Eugene, OR: Wipf and Stock Publishers, 2000).
    • An argument for the compatibility of divine foreknowledge and human libertarian freedom based on Molinism’s attribution to God of middle knowledge of subjunctive conditionals about what free agents will do in particular circumstances (counterfactuals of creaturely freedom).
  • Millard Erickson, What Does God Know and When Does He Know It?: The Current Controversy over Divine Foreknowledge (Grand Rapids, MI: Zondervan, 2003).
    • An extended argument against Open Theism that also calls for greater moderation and civility in the debate over the topic.
  • Thomas P. Flint, Divine Providence: The Molinist Account (Ithaca, NY: Cornell University Press, 1998).
    • The most thorough explication of Molinism, with critiques of both orthodox Thomistic and Open Theistic views of divine providence.
  • John Frame, No Other God: A Response to Open Theism (Phillipsburg, NJ: Presbyterian & Reformed, 2001).
    • A critique of Open Theism based on a Reformed reading of Scripture.
  • Norman L. Geisler and H. Wayne House, The Battle for God: Responding to the Challenge of Neotheism (Grand Rapids, MI: Kregel Publications, 2001).
    • Calling Open Theism “neotheism,” this work argues that Open Theism is dangerously far from traditional Christianity, and seeks to explicate the orthodox view of God’s attributes.
  • Paul Helm, The Providence of God, Contours of Christian Theology (Downers Grove, IL: InterVarsity Press, 1994).
    • A systematic explication of God’s providence as risk-free meticulous sovereignty.
  • Beyond the Bounds: Open Theism and the Undermining of Biblical Christianity, edited by John Piper, Justin Taylor, and Paul Helseth (Wheaton, IL: Crossway Books, 2003).
    • A series of essays arguing that Open Theism is unorthodox and not an acceptable form of Christianity.
  • Still Sovereign: Contemporary Perspectives on Election, Foreknowledge, and Grace, edited by Thomas R. Schreiner and Bruce A. Ware (Grand Rapids, MI: Baker Books, 2000).
    • A series of essays explicating and defending the classical view of divine sovereignty.
  • Bruce A. Ware, God’s Lesser Glory: The Diminished God of Open Theism (Wheaton, IL: Crossway Books, 2001).
    • An argument, primarily based on his reading of Scripture, that Open Theism is false and its consequences are dire.
  • R. K. McGregor Wright, No Place for Sovereignty: What’s Wrong with Freewill Theism (Downers Grove, IL: InterVarsity Press, 1996).
    • An attempt to show what’s wrong biblically, theologically, and philosophically with freewill theism, both in its contemporary (Open Theism) and historical forms (Arminianism).

c. Multiple Views

  • Predestination and Free Will: Four Views of Divine Sovereignty and Human Freedom, edited by David Basinger and Randall Basinger (Downers Grove, IL: InterVarsity Press, 1986).
    • Essays in favor of foreordination (John Feinberg), foreknowledge (Norman Geisler), God’s self-limited power (Bruce Reichenbach), and God’s self-limited knowledge (Clark Pinnock), with responses by each author to the other essays.
  • Divine Foreknowledge: Four Views, edited by James Beilby and Paul Eddy (Downers Grove, IL: InterVarsity Press, 2001).
    • Essays in favor of Open Theism (Gregory Boyd), simple foreknowledge (David Hunt), middle knowledge or Molinism (William Lane Craig), and the Augustinian-Calvinist view (Paul Helm), with responses by each author to the other essays.
  • God and Time: Four Views, edited by Gregory Ganssle (Downers Grove, IL: InterVarsity Press, 2001).
    • Essays on divine timeless eternity (Paul Helm), eternity as relative timelessness (Alan Padgett), timelessness and omnitemporality (William Lane Craig), and unqualified divine temporality (Nicholas Wolterstorff), with responses by each author to the other essays.
  • Christopher Hall and John Sanders, Does God Have a Future?: A Debate on Divine Providence (Grand Rapids, MI: Baker Books, 2003).
    • The product of a year’s dialogue via email between Hall, who affirms a classical theism, and Sanders, an Open Theist, about divine providence and foreknowledge.

Author Information

James Rissler
Email: amf@atlantamennonite.org
Oglethorpe University
U. S. A.

Color

Philosophy has long struggled to understand the nature of color. The central role color plays in our lives, in visual experience, in art, as a metaphor for emotions, has made it an obvious candidate for philosophical reflection. Understanding the nature of color, however, has proved a daunting task, despite the numerous fields that contribute to the project. Even knowing how to start can be difficult. Is color to be understood as an objective part of reality, a property of objects with a status similar to shape and size? Or is color more like pain, to be found only in experience and so somehow subjective? Or is color more like what some have said about time–that it seems real until we reflect enough, where we come ultimately to dismiss it as mere illusion? If color is more like shape and size, can we give a scientific account of it? Various strategies exist for this option–taking the color of an object to be just a complicated texture of that object, one that reflects certain wavelengths. Or perhaps color is merely a disposition to cause experiences in us, as salt has a disposition to dissolve. On the other hand, if color is more like pain, and found only in subjective experience, what is the nature of color experience? How, for instance, does an experience of red differ from an experience of blue, or from an experience of pain for that matter? Finally, if color is mere illusion, how do we continue to be so taken in by that illusion and how can something unreal seem so real and important to us?

These are just some of the questions that have been raised about color, ones we will address in this article. Of course, this is only a beginning, for it is not only the scientist or scientifically inclined philosopher who wonders about color. Accounts of color have been given by anthropologists, artists, philosophers interested in metaphysics, and many others. How their accounts go, and how they all fit together, makes for fascinating philosophy. This article will offer an introduction to philosophical issues of color, with an eye to exploring some of the answers that have been offered to some of the puzzles. As always in philosophy, the discussion has to begin somewhere, though it need not ever end.

Table of Contents

  1. Color, Philosophy, and Science
    1. Realism
      1. Non-Reductive Realism
      2. Reductive Realism
        1. Physicalism
        2. Dispositionalism
    2. Subjectivism
      1. Mentalism
      2. Eliminativism
  2. Color and Metaphysics
    1. Color Skepticism
    2. Color and Internal Relations
  3. Is Color Experience Universal?
    1. Linguistic Determinism
    2. Berlin and Kay
  4. References and Further Reading
    1. Overviews and General Discussions
    2. Specific Positions

1. Color, Philosophy, and Science

Many contemporary debates about color have their origin in the rise of modern science. The emerging scientific picture of the 16th and 17th centuries demoted color, sound, taste and other aesthetically interesting properties to second-class status, according them the pejorative title of “secondary qualities.” Primary qualities, such as shape, size, motion, and number, in contrast, seemed necessary and sufficient to explain the behavior of physical objects and were thereby countenanced by the new physics as the truly real. From the perspective of physics, secondary qualities such as color were deemed explanatorily idle, and thus at best were said to be present in bodies only as complex structures of primary qualities, and so not to resemble our ideas of them. At worst, color and the like were dismissed as mere illusory appearances. Color would no more be in objects than pain is. Either way, the world was seen as not colored–or at least, if there is color in reality, it bears little resemblance to the color we are so intimately aware of.

With this background, contemporary philosophers face a choice of sorts. Should color be assimilated, on the one hand, to shape and size, and thus accountable in a scientific manner, not requiring appeal to sensory experience? Or, on the other hand, are colors more like sensations of pain, and thus personal, subjective features of experience? These questions trigger different responses, and so determine numerous accounts of the nature of color. Early portions of this article will examine the interplay between common sense and science on the nature of color, with an eye to answering those questions.

But philosophical issues of color are not limited to these debates. Color plays such an important role in our lives, in so many different ways, that it is not surprising that other issues should arise. We will explore some of these as well. Like children then, philosophers are fascinated by color. Unlike children, we have sophisticated concepts and tools at our disposal to help us understand the mysteries of color.

To begin let us ask, “Are physical objects, independently of perceivers’ experiences, colored? Again, were we to discard what is found in experience, would it still be correct to say that objects are colored?”

Realism about color, as understood here, maintains that yes, objects are colored. In particular, Realism holds that objects are colored regardless of whether anyone is looking at them and regardless of whether the color is perceived. In so maintaining that objects are colored, we are saying that the essence of color is to be found in the nature of the objects that are colored, as opposed to being within the minds of perceivers. Subjectivism, on the other hand, holds that it is false to say that objects are colored. But even if objects are not colored, surely there are experiences of color. And in this way we can find a place for color, by including the perceivers and perception of color. Subjectivism gets its name because of the role of the subjects of experience, where color is now to be found. In saying that color exists within subjective experiences of color, however, we need not mean there is something arbitrary or illusory about color. Color could be something that really does exist within perceivers, which can be studied, measured, and explained.

As we articulate these positions more precisely, we will discover that there are various ways to claim that objects are colored, just as there are various ways to understand the claim that there are only experiences of color. Due to limitations of space, we can only hope to introduce the reader to some of the positions and complexities of the debate, and hope that this is enough both to satisfy one’s initial curiosity and to spur one to learn more.

a. Realism

Realism holds that objects are colored. So does common sense. Science, particularly physics, apparently threatens that view. For science tells us, in the first place, that ordinary objects (trees, houses, cars) are themselves just complexes of more basic items (atoms, protons, electrons, quarks, and so forth). And in the second place, these scientific objects are not colored. We thus seem on the verge of paradox as we consider the following three claims.

CS: (Ordinary) objects are colored.
CP: Ordinary objects are bundles of basic scientific objects.
PS: Basic scientific objects are not colored.

(Though CP is clearly relevant to this discussion, it will not be explored further.) What then should we say about CS, the claim that common sense objects are colored, given the hard-to-deny threat posed by PS, the claim that the physicist’s entities are not colored? Several strategies emerge.

i. Non-Reductive Realism

Non-Reductive Realism about color holds there to be no distinction between what are called the primary and secondary qualities of objects. Both types exist in the object just as they present themselves. A red ball looks to have primary qualities (the shape, size, mass, and so forth) and secondary qualities (the color, the smell, the warmth, and so forth) and on this view, the object truly does have both kinds of qualities. The color exists “cheek by jowl” with the shape. Using some technical terms, we might say that on this view, shape and color are both irreducible qualities; they are basic and appear as they really are. In contrast, as we will see, other versions of Realism will deny color exists as such a basic quality. Instead, such views will reduce color to something more basic.

The motivation for Non-Reductive Realism, otherwise known as Primitivism, is clear enough, namely to allow us to take seriously our common sense view of the world, in which color plays an obvious and significant role. But as we have said, the scientific view of reality threatens common sense. On many fronts, science tells us to be suspicious of our everyday, common beliefs. When it comes to color, science typically seeks to explain our experiences of color by invoking scientifically respectable properties, the ones that lend themselves to mathematization, namely the primary qualities. In schematic form, we are said to perceive red, for instance, because of the shape and texture of a given object, which in turn reflects certain wavelengths of light to our eyes, which then send electrical impulses to our brain, resulting in the experience of color. More generally, the thought is that we should attribute to physical objects only those properties necessary and sufficient to explain their physical behavior, and that this can be accomplished by reference solely to the so-called primary qualities (hence their status as “primary”.) Since the property of red, for instance, seems to play no causal role in our experience of red, it should not be included in the list of properties that characterize physical objects. What does the explaining instead is the texture of the object, the wavelengths of light that are reflected, and so forth. Worse still, even if objects were colored in the irreducible, or what we could call the occurrent sense, it is not clear how that would help our perception of red objects. For again, the mechanism used to explain the perception of red makes use only of light, surface texture and the like. Color is left as explanatorily idle and should not be said to be part of the physical world. So goes the threat from science, as we have said.

How might the Non-Reductive Realist reply? One strategy denies that CS and PS are truly incompatible. Each might be argued to be true in its own way, so that no problem arises. Why? Because 1) common sense and physics, and thus CS and PS respectively, operate at different levels of analysis, 2) there is no ultimately right level of analysis, and so 3) we are not forced to choose between them. Consider another area where we do not feel the need to choose one level of analysis over another. For instance, we accept explanations of people’s behavior by describing their beliefs and desires. Even though we suspect that those beliefs and desires could (eventually) be given a description at the level of brain processes, we do not think we must appeal to that level in order to genuinely describe and explain. So too a level of discourse that speaks of objects’ irreducible properties seems autonomous and respectable, even if there is another level according to which there are no such colors. The autonomy of this level then could withstand the encroaching scientific perspective, allowing us to maintain both, if we like.

Of course, someone who takes science’s dictates to be the ultimate word on what really exists, who holds that science is the measure of all that is, will not be swayed by these considerations. Such philosophers must now face the conflict between common sense and science. But again there is a possibility for reconciliation. This, however, requires a reinterpretation of the claim that objects are colored, one that makes use of the notion of reduction.

ii. Reductive Realism

Since the Modern era, scientifically-inclined philosophers have sought a way to reconcile common sense claims with the philosophic-scientific view that color plays no role in physical explanations, should not be countenanced as basic, and thus is not in the objects in a basic sense. Faced with the inadequacies of Non-Reductive Realism, and with the general sentiment that our ontology should be given by science (or at least not be inconsistent with our best scientific theory), we might seek a scientifically respectable account of red and the like.

The hope has been to give a scientific account of these qualities by showing them to be just complicated physical properties, that is, primary qualities. If we can show how color is really just a combination of say, complex, microphysical properties that characterize the surface of objects, ones that cause certain wavelengths to be reflected, we will have given an account of their nature comparable to what has been done with observable shape, size, weight, texture, motion and the like. Objects can be now said to be colored, where that color now is understood as really just a complex of physical, primary, properties. We will have reduced color to properties and relations that do not include occurrent or basic color.

Our original conflict, then, between:

CS: Objects are colored
PS: Basic scientific objects are not colored

disappears as CS is reinterpreted to mean that objects are colored in a reduced, non-occurrent sense. Just as scientists have shown sound to be nothing more than waves in a medium, and shown heat to be kinetic energy, a similar reduction has been proposed for color.

1) Physicalism

How exactly does this reduction go? One broad strategy, known as Physicalism, seeks to reduce color to those physical properties (primary qualities) sufficient to explain why we see objects as colored in the basic, self-presenting, occurrent sense. But saying we can give a reductionist account of color that appeals only to the physical properties of objects and light is far from actually doing it. And there are many obstacles to the actual reduction. Here is why, in part: There are many, many different physical causes which, when they impinge upon our highly sensitive visual system, yield the same experienced color. Consider the color blue, and the many places blue appears. It turns out there are drastically different physical causes for the blue of sapphire; the blue of lapis; that of turquoise; from blue dye to blue in the rainbow; the blue of water compared with the sky; the blue on tv, compared with the blue of a bluish star. In short, identity or even similarity in color of objects does not imply similarity in physical structure of object. (Making matters worse, similarity in physical structure does not even imply similar color appearances. The same reflected range of light, but at different angles of reflection, will make for different colors–this is part of the explanation of the phenomenon of iridescence).

For simplicity, let us ignore the differing physical mechanisms that explain the blue of the sky (dispersal of light), the blue of water (reflection), and the blue of a rainbow (refraction). Instead, just focus on the blue of ordinary objects. Can we give a reductive, physicalist account of this blue, one that allows us to say the object is blue, but in a non-basic way? Here is how one version of Physicalism goes. (We have referred to this as “Reductive Physicalism,” but as we are noting now, this is but one of various forms of that approach. We might think of the version about to be discussed as Disjunctive Reductive Physicalism.) A given color is defined by reference to the (micro)physical features that characterize the surfaces of objects, features which are then responsible for reflecting particular wavelengths to perceivers’ eyes. What is a color then? It is that complicated set of primary qualities which characterize the surface of an object. Some surfaces are structured to cause experience of red, some to cause blue, and so forth. The color itself, of an object, is that surface structure, which can be accounted for in physical terms; that is, it is describable by physics, chemistry and the like.

An immediate problem arises, even for this simplified phenomenon. This is the phenomenon known as metamerism, according to which different combinations of wavelengths (in the same conditions) give rise to identical color experiences. The reason metamers make things difficult is that two objects can have very different surface textures at the microphysical level, and thus can reflect very different wavelengths to perceivers. But these very different wavelengths can be experienced as exactly the same color. For instance, light that is 100% 577 nm (a nanometer is a billionth of a meter) will appear as pure yellow. But light that is composed of 50% 540 nm and 50% 670 nm will appear qualitatively indistinguishable. Since different physical structures can produce different wavelengths, all of which yield the same color experience, it appears we are left defining color as the structure of an object by saying:

Yellow = microstructure1 OR microstructure2 OR microstructure3 OR …

This is, in other words, a disjunction, and yellow looks to be definable as a disjunction only. There is apparently no single physical property of objects, of wavelengths, of reflections of light, and so forth, that all yellow objects have in common, let alone the yellow of non-ordinary objects like the sun, after-images, and so forth.
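To make the point concrete, here is a minimal sketch in Python of how metamers can arise. It assumes a simplified model in which a color experience is fixed by the summed responses of three receptor types. The sensitivity figures are invented so that the two lights match and are not physiological measurements. The names used are hypothetical, and only the two light compositions are taken from the example above.

```python
# Toy model of metamerism. The sensitivities below are invented,
# illustrative numbers in arbitrary units, NOT measured physiological data.
HYPOTHETICAL_CONE_SENSITIVITY = {
    # wavelength (nm): (long, medium, short) receptor response per unit of light
    540: (60, 80, 0),
    577: (50, 40, 0),
    670: (40, 0, 0),
}


def cone_response(spectrum):
    """Sum each receptor type's response over a spectrum {wavelength_nm: percent}."""
    totals = [0, 0, 0]
    for wavelength, percent in spectrum.items():
        sensitivity = HYPOTHETICAL_CONE_SENSITIVITY[wavelength]
        for i in range(3):
            totals[i] += percent * sensitivity[i]
    return tuple(totals)


# The two lights from the text: one pure wavelength, one 50/50 mixture.
single_wavelength = {577: 100}
mixture = {540: 50, 670: 50}

# The lights differ physically, yet produce the same receptor responses.
physically_different = single_wavelength != mixture
same_response = cone_response(single_wavelength) == cone_response(mixture)
```

On this toy model the two lights are physically different but indistinguishable to the perceiver, which is why a definition of yellow in terms of the light itself ends up listing each such case separately.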

With these scientific facts in hand we approach the matter now as philosophers. What should we say about the reduction of a property, in this case, a color, to a disjunction? Consider various problems raised. First, if the list of conditions that characterize yellow (or any color) is infinite, as it might be, then it hardly seems that we have reduced color. Even were it just a long finite list, as seems equally possible, we also might object to the claim that such disjunctive properties are real properties at all. Most troubling, however, is that there does not seem to be a unifying physical condition which explains why these all are instances of yellow. The only thing that explains why these various physical conditions are yellow is that they cause experiences of yellow. Thus our seemingly perceiver-independent account of color actually seems to require reference to perceivers. For without perceivers of color in the picture, we have no way to explain why some physical conditions are yellow and some are not. And that leaves us with the disturbing sense that our list of physical conditions is just a hodgepodge, a gerrymandered set of properties, not a genuine explanatorily useful reduction. And while there are other ways to develop such Physicalism, the problems we have outlined have sufficed to send philosophers looking elsewhere for an account.

2) Dispositionalism

Failing to find a single (micro)property that explains an experience of a certain color, while still hoping to reconcile the claim that objects are colored with the scientific claim that color is not basic, philosophers have hit upon another reductive strategy. John Locke is usually credited here as the originator of this Dispositionalism, as he writes,

“Such qualities, which in truth are nothing in the Objects themselves, but Powers to produce various Sensations in us by their primary qualities, that is, by the Bulk, Figure, Texture, and Motion of their insensible parts, as Colours, Sounds, Tastes, and so forth. These I call secondary qualities.” (Locke, An Essay Concerning Human Understanding. Bk.II, Chpt. VIII, §10.)

To appreciate this claim, recall that we are still looking for a reductive account of color, but as well, have rejected Physicalist attempts at reduction. With that in mind, we might step back and notice that the Physicalist account of color was given by focusing largely, if not completely, on the object itself, leaving aside our experience of color–what it is like and how it might play a role in understanding color. Perhaps the absence of even a reference to experience is the source of the trouble. For certainly our motivation to understand color itself comes from reflection on our experience of color–especially as we put that alongside an account of reality that tells us to be suspicious of our common sense experiences of the world. Maybe we will do better by approaching the nature of color with a role for the fact that color is an experienced quality. With this in mind, we might develop an account of color that brings out the extent to which the particular nature of color is linked with experiences of color, though the color itself is still said to be a property of objects.

To develop this account, philosophers draw attention to the following true biconditional:

(C): x is red if and only if x appears red under standard conditions.

Red objects, that is, appear red in standard conditions (to normal perceivers), and if an object appears red to a normal perceiver, in normal conditions, then that object is red. What explains this? Here it is claimed that C is true because of a deeper truth about color, namely, that the redness of an object just is the disposition of that object to appear red. Let us call this DC, and let it be the Dispositionalist’s definition of color.

(DC): x is red = x is disposed to appear red (to normal perceivers in standard conditions).

Of course, there are also corresponding biconditionals for shapes of objects. Examination of their different status will make clearer the goal and nature of Dispositionalism. Consider then,

(S): x is square iff x appears square under standard conditions (to normal perceivers)

This too is true, but does not entail a parallel treatment of square’s essence. For we will not accept,

(DS): x is square = x is disposed to appear square (to normal perceivers in standard conditions).

The reason we will not move from S to DS is instructive. For when it comes to such properties as being square, we believe that an account of its nature can be given by simple appeal to an object’s physical properties, without appeal to how it appears to perceivers. We have no temptation to give a dispositionalist account of the essence of square. In contrast, color can be thought of as a property of physical objects, but only in a thin sense, namely, the disposition to cause in us certain experiences. Which experience? The appearance of the very color in question.

The merits of this account are numerous. First, we have found a way to keep our common sense claim, CS from above, though with a reinterpretation of CS. Objects are colored, though not in a basic sense. Second, we now also have room to take seriously the dictates of science according to which the basic entities of reality are not colored. What we can say is that if those basic entities are put together in suitable ways, ordinary objects come to have certain powers or dispositions, namely in this case, to cause experience of colors such as red. This makes for another merit. Objects can be said to be red, or blue, and so forth, and we can distinguish veridical from non-veridical perceptions of color. One might experience a truly blue object as green, because either the viewing conditions are not standard (for instance, in certain kinds of light), or because something is amiss with the perceiver. In the second case, the perception was not veridical, for there is a way the object really is colored. This allows, in other words, for intersubjective agreement about the colors of objects, and thus keeps color from being purely subjective or relative. Finally, we can say that objects do have their colors even when not being observed, or even when they are in the dark. For even in the dark, objects do have the disposition to appear certain ways, and of course, that is what we are saying color really is. In this way color is said to be real, as we want when considering the matter from common sense. Yet in another sense, color is relative to a perceiver–for an object only has a disposition to appear red–and the experience of red, for instance, does require a perceiver, and an element of subjectivity. The total package then is a nice blend of objective and subjective elements, and for many is just what we should expect from a good explanation of color.

In sum, these features have made Dispositionalism a tempting and popular position. We now explore some objections to this view, leaving it to the reader to decide for themselves whether or not these objections are compelling.

It is often complained against Dispositionalism, for instance, that colors do not look like dispositions. They look like basic, occurrent properties, just like the shapes of objects. How then, it is questioned, could color really be a disposition, if it does not look like one at all? Here we might expect the Dispositionalist to ask us to specify exactly how we would expect a disposition to look in the first place. The Dispositionalist will then argue that once we actually figure out how we would expect a color, understood as a disposition, to appear, we discover that that is just how colors do appear. For example, if red were a disposition to appear red in standard conditions, then in standard conditions, a red object would look red. And is not that just what it does look like?

Perhaps more troubling, however, is that Dispositionalism seems circular. What is red? A disposition to appear a certain way. Which way? To appear as red, of course. Red, then, is a disposition to appear red. If “red” is being used the same way here, then we have explained “red” by reference to “appears red”. That seems straightforwardly circular, and thus problematic. Interestingly, some philosophers have taken this to be a serious problem, while others have suggested it is a harmless and even expected result. After all, they say, we have wanted an account of color that appeals to our experience of it. Thus the only way to explain what red is is to describe our experiences of red. In this case the circularity is not threatening, but simply an indication that our desired account of color required appeal to the experience of color to make sense of it in the first place. That, again, was what made explanation of red different from explanation of shape. On the other hand, circular accounts do not provide much information, and as such we might still wonder what we have really learned about the nature of red, if that is just a disposition to appear red.

Finally, some have worried that if color is a disposition, we are now incapable of explaining why we have experiences of color at all. Consider this parallel. We can taste the saltiness of a pretzel. Why? Because the pretzel was salty. And the salt has a disposition to dissolve and cause experiences of tasting salty. But it is not the disposition to dissolve that is responsible for the taste of salt. It is the non-dispositional properties of salt that both cause it to dissolve and which cause the taste of salt. Again, it is not salt’s dispositions that cause our experiences of salty taste. It is the non-dispositional properties that ground that disposition. In fact, we say that what is essential to salt is whatever properties explain those dispositions, and it is those more basic properties that do the causing. So too it might be said for color. Dispositions do not cause anything, but rather the ground of those dispositions does. Color as a disposition cannot cause a perception of color. Instead, it must be the non-dispositional ground that causes experiences of color. But that means we have located color in the wrong place. Instead of speaking of color as a disposition, it now seems we should be considering the ground of that disposition to be the heart of color. And that might take us away from Dispositionalism and back to Physicalism, with all of its problems. Or maybe not, as some philosophers have sought here a third way.

As noted, these discussions of different kinds of Realism have only skimmed the surface. The broad strategies we have outlined, of course, can be and have been developed in quite a number of different ways. Enough has been said, however, both to give a sense of these positions and to show the need some have felt for a completely different approach. We turn to that now, the broad strategy we have designated as Subjectivism.

b. Subjectivism

Recall that conflict between science and common sense over the status of color.

CS: Ordinary objects are colored.
PS: Basic scientific objects are not colored.

Our discussion of Realism has been an extended exploration of this conflict, with focus on preserving the truth of CS and common sense. Let us now cease attempting to reconcile these claims, and simply reject CS as false. Common sense is just wrong, we might claim. Objects are not colored in any sense, reduced or not; and thus we are free to embrace a scientific ontology which does not include color among the basic properties of its basic entities.

Common sense is wrong then, but it certainly does not seem wrong. The world presents itself as colored, after all, and if it really is not colored, we are owed at least an explanation of how we could have been so wrong. Here is where Subjectivism gets its name and appeal. For while the world itself has no color, there are undeniably experiences of color. And while we will need to give a philosophical account of those experiences, we can say for now that color is subjective in the sense of being perceiver dependent, just as pain is. Objects can be round or square, but they are not colored. Since it does not make sense to say objects have the properties of pain and pleasure, we say that pain and pleasure, instead, are merely types of subjective experience. Those experiences may be caused by physical objects, but the qualities of pain and pleasure are in us, not in the objects. So too we may say for color.

In thus locating color within perceptual experience, we make it perceiver dependent, and thus, in some sense, cease to view color as part of the objective world. How we choose to account for experience itself, however, will give us different versions of Subjectivism.

i. Mentalism

Let us call any position that posits color as a genuine property of subjective, personal experience, a version of Mentalism. The inspiration for this view is René Descartes, who thought that color and other secondary qualities were merely sensations, and as such, mere occurrences within a mental substance. The parallel again with pain is instructive here. Pain and color, then, occur in a substance that is also the locus of thinking. As occurrences in a mental realm, they fall outside the scope of the physical sciences that study material substance.

Contemporary philosophy, however, has had little sympathy for this kind of substance dualism, whereby two distinct types of substances exist side by side. Not only does this mental substance fall outside the scope of the physical sciences, but difficult questions also arise about the connection and interaction of these independent substances. As we will see next, some have left the letter of mental substance behind, while retaining the spirit in a related, but slightly less problematic metaphysics, one that comes in handy when accounting for the nature of color.

In the earlier parts of the twentieth century, philosophers made much use of a special class of entities dubbed sense-data. These are a class of particulars, or individuals, which have existence only in minds. They are often held to be private, special objects, to which each person has direct, infallible access and of which each has direct knowledge. Knowledge of sense-data in turn allegedly provided foundational knowledge on which all other knowledge rests. As for sense-data themselves, they were introduced to explain the appearance of perceptual qualities when there were in fact no such qualities in the physical objects one is perceiving. In a famous example, one could explain a perception of an elliptical coin, when presented with a coin that is really round, by claiming that the actual object of experience is an elliptically-shaped item (an elliptical sensum), which one experienced directly. Sense-data would be the bearers of properties we take physical objects to have, and so could explain the possibility of perceptual error.

With this metaphysics in hand, color can now be categorized as a property of such sense-data. Though the physical world may lack such properties as color, the world causes each of us to have experiences and present in such experiences would be special, private, mental entities that have the qualities in question. Presented then with an apple that really is not red or sweet, we have experiences of red sensa; sweet sensa, and so forth. We thereby account for the existence of such qualities–having them qualify these subjective, perceiver dependent entities, and we also explain our belief that the world is colored. We think there is color, because in fact there is, though we mistakenly believe the color of sense-data is really to be found in physical objects.

Sense-data themselves, however, have fallen on hard times, especially since the middle of the twentieth century as various philosophers objected both to their nature and the epistemological role they were to play. Though many are now reluctant to speak of sense-data as a class of particulars, some contemporary philosophers have preserved some of the functions of sense-data, and now speak of qualities that characterize our visual field, or perhaps that qualify our mental states or mental events. Color on this understanding is categorized as a “phenomenal property”, maintaining the Cartesian legacy that such properties are mind-dependent and subjective, but in a way that frees them of excessive ontological baggage.

ii. Eliminativism

In opposition to Mentalism, but still within what we have called Subjectivism, lies another popular position, Eliminativism. This view agrees that objects are not colored, but it does not wish to trade the color of objects for color as now an irreducible property of something inner or mental. Instead, it wishes to rob color of any ontological significance at all. We can still speak of our experiencing color, of course, but we are not to understand this as claiming that color does really exist, only now as a property of mental substance or of sense-data or of our visual field. Color experiences themselves, we could say, are to be reduced to non-color properties, just as Reductive Physicalism sought to reduce the color of objects to non-colored properties and relations. For Eliminativism, color experiences are to be reduced to properties and facts about our visual processing systems: facts about the behavior of rods and cones, about transmission of information along neural pathways, and the like. (We will explore some of the details below in our discussion of the universality of color experience.) In the end, nothing, anywhere, answers to our common sense description or account of color. That type of property just does not exist.

Put positively, Eliminativism can be understood as follows. Our experience of a seemingly colored world is the result of a systematic error. Simply put, we take features found in our visual experience and project them upon the world, mistakenly believing that color is “out there” when in fact color is but a subjective response to an achromatic reality. This Projectivism about color does not deny that this is an important projection, or that it might help us navigate the world more easily, or that we can continue to speak of the world as colored, but it does point out the fundamental error nevertheless. An analogy might help, and in fact much recent philosophy has involved discussion of the aptness of the following analogy.

In ordinary moral discourse, we are inclined to speak of an action as moral or immoral, right or wrong. We seem in these cases to be claiming that a particular act has (or lacks) a special, moral property or nature. Taken literally, though, such predication would commit us to the existence of rather strange properties, that is, rightness and/or wrongness, ones that are not easily described or explained. Wanting to avoid commitment to those properties, some have suggested a similar projectivist account. In this case, certain actions create in us feelings of pleasure or pain, approval or disapproval. We project these attitudes upon the world, taking the world to really have such properties, when in fact they are nothing but subjective responses. (Talk of “projection” in the psychoanalytic sense is another helpful parallel, where again, something “inner” is mistakenly claimed to be found “outside” us.)

Such Projectivism, as one way of developing Eliminativism, clears the road for a fully scientific account not only of objects, but now of perceivers as well. In particular, only properties that can do genuine explanatory work will be included, and color will be sorted into the group of properties that contribute nothing to our understanding of causal relations between objects and perceivers. There is a downside, however. Besides indicting common sense as systematically wrong, we are bound to be left with a nagging feeling that a most treasured property has completely disappeared. This has provoked some to reply along the following lines: “We started with a belief that objects are colored. Having reduced physical objects to items with only primary qualities, we were left to relocate color and similar qualities within perceivers. Now, however, we have made perceivers and their experiences also bereft of secondary qualities. Without color in the picture at all, we fail to explain how we thought there was color in the first place. How can we explain the appearance of color, our experience of color, now that color is nowhere to be found?”

This question might lead one to rethink the steps that led to this puzzling conclusion, and to raise the possibility that a mistake was made along the way. If so, where exactly did we go wrong, and what would be a better route? If not, how exactly then do we come to believe there is color, if it appears nowhere in our account of reality and perceivers? These difficult questions explain why philosophers continue to debate this interplay between what common sense says about color and what science would have us believe.

2. Color and Metaphysics

One should not conclude that the only philosophical questions about color involve science. The remaining portions of this article offer an introduction to other important and exciting issues. In particular, we turn to some questions of metaphysics, and then to questions about the universality of color experience, questions that get at the heart of the nature of color from other perspectives.

To begin, consider how much energy we have devoted to explaining the color of objects. Is the color of an object a basic property, a disposition, a combination of micro-primary qualities? Let us pause, however, and ask about color itself. What exactly is color in the first place? What is the essence of this quality that is capable of being a property of objects, or a property of sensations, and so forth? (We can also ask, of course, “What is a quality? And what is the difference between qualities that are colors and those that are sounds?”) Focusing our attention on a specific color seems to make things even harder. Consider the questions, “What is the essence of red?” and “What is the difference between red and blue?” How do we even go about answering them? Let us explore some attempts.

a. Color Skepticism

Faced with such a question as, “What is the essence of red?” one might respond by pointing to something red, or by looking for a metaphor, claiming that red is like a trumpet sound. The first does not tell us much though–in fact, pointing at a red ball does not suffice even to indicate the redness as opposed to the round shape. Similarly, though metaphors might help convey something about the experience of red, they tell us little about the nature of redness. Can we do better? Can we actually articulate the nature of individual colors? Can we even say what colors in general are, in a rich, philosophically satisfying manner?

One possible source of the apparent difficulty is that we tend to think that the red we experience is something essentially private and subjective. We are drawn to a picture whereby the essence of red, or blue, or yellow for that matter, is given in sense-experience, where the experience itself is something ineffable. Just as it is hard, if not impossible, to articulate what a pain feels like, we may think that the qualitative difference between blue and red is similarly inexpressible. Let “color skepticism” be the view that the essence of color is ineffable, and let us explore the merits of such skepticism.

One source of the supposed ineffability of color, as we have seen, lies in the belief that color’s nature is revealed only in private experience. The language of color, and language as a whole, however, is public in the sense of both being suitable for reporting public events and learnable by appeal to public objects. How then could the allegedly private, subjective nature of color be reconciled with the public, intersubjective nature of language? Color skepticism gains a foothold here, for it seems it cannot. As a result, we are tempted to conclude that our experiences of color are akin to pain in being private, personal and ineffable. It is no surprise that many have been led to wonder whether the qualitative experience they associate with, say, red, is the same for each person, or instead, whether it is possible that what I experience as red, you experience as green, though we both use the same public word, “red”. Such color skepticism leads to this familiar problem, the Inverted Spectrum. At its worst, we imagine that all of our color experiences might be systematically different from another’s, though we all use “red” to refer to the color of firetrucks, “yellow” for the color of bananas, and so forth. In this case, each of us is trapped within our minds, forever cut off from truly sharing our experiences of things that matter dearly to us.

How might we extricate ourselves from this depressing, solipsistic trap? One route is to rethink our starting point, namely that there is nothing more to say about red than pointing to red objects or reverting to metaphor. As an alternative, some have sought to articulate the metaphysical nature of color in a surprising direction–by understanding the intrinsic features of individual colors as a product of their relations to other colors. These relations are known as “internal relations” and to them we turn.

b. Color and Internal Relations

First we need to distinguish such internal relations from so-called external relations. External relations are ones in which the relation plays no role in making the relata the relata that they are. For instance, my glass of water is externally related to the table. The relation, “being on top of”, is external in that it is not part of the nature or essence of the glass or table to be in that relation. Were the glass and table to cease to be so related, neither would undergo a change in its nature. They would not cease to be the things they are. The relata here are external to each other in the sense of not depending on each other, or on the relation, for their identity.

In contrast we have internal relations. For internal relations, the relations are essential to the being and nature of the related items. Without that particular relation, an entity would not be the thing that it is. To say that colors are internally related to colors would mean that the natures of individual colors depend on the relations those colors have to other colors, to other members of the color array. Orange is related to red and yellow in a particular, unique way, for instance. That relation therefore helps make orange the color it is–that relation as well as the other ones that orange bears to other colors. No other color has those particular relations, and thus no other color is orange. Put differently, orange would cease to be orange were it not to have that relational structure to other colors. (Another example is numbers. Seven would not be the number it is, for instance, were it not between 6 and 8.)

To speak then of a particular color requires reference to its relational place within a color array. What is the nature of the relation between colors? Most abstractly, it is that relation which includes only colors. More specifically, we might say that it is the betweenness relation colors bear to one another. Orange, for instance, is between yellow and red, while green is between blue and yellow, and so forth. Such betweenness relations capture the essence of color. Taken as a whole, these complex betweenness relations can be modeled, allowing us to understand the logical structure of the entire color array. And though many models have been proposed, one particularly illuminating one captures these betweenness relations by modeling color’s structure on that of a double cone. (We can even now speak of the difference between different types of qualities by talking about their different spatial models–color is nicely modeled on a double cone, sound perhaps by a spiral staircase, with each octave recognized as another turn on the staircase.)

The following diagram helps illustrate the structure of color, making use of the HSL (hue, saturation, lightness) model. We can even use it to spell out in some detail a claim about a particular color’s nature and its betweenness relations.
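The betweenness claims can also be checked numerically on the hue circle of the HSL model. The following is only a minimal sketch, not part of the account itself: it assumes that "between" can be read as "lying on the shorter arc of the hue circle", the RGB triples chosen for each color are illustrative, and the helper names `hue_degrees` and `hue_between` are hypothetical. Only the standard-library `colorsys` module is used.

```python
import colorsys

# Illustrative RGB triples (components in 0..1); these choices are assumptions.
RED = (1.0, 0.0, 0.0)
ORANGE = (1.0, 0.5, 0.0)
YELLOW = (1.0, 1.0, 0.0)
GREEN = (0.0, 1.0, 0.0)
BLUE = (0.0, 0.0, 1.0)
MAGENTA = (1.0, 0.0, 1.0)


def hue_degrees(rgb):
    """Return the HSL hue of an RGB triple as an angle in degrees."""
    hue, _lightness, _saturation = colorsys.rgb_to_hls(*rgb)
    return round(hue * 360.0, 6)


def hue_between(a, b, c):
    """True if hue b lies strictly on the shorter arc joining hues a and c."""
    span = (c - a) % 360
    if span > 180:
        # The shorter arc runs from c to a instead, so swap the endpoints.
        a, c = c, a
        span = 360 - span
    offset = (b - a) % 360
    return 0 < offset < span


# Orange sits between red and yellow; green does not sit between red and blue.
print(hue_between(hue_degrees(RED), hue_degrees(ORANGE), hue_degrees(YELLOW)))
print(hue_between(hue_degrees(RED), hue_degrees(GREEN), hue_degrees(BLUE)))
```

On this reading each hue is picked out by the pattern of betweenness relations it enters into, which is the point of the internal-relations account. Hues that lie exactly opposite each other on the circle have two equally short arcs, so the sketch should not be relied on for such pairs.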

Relying as we have on internal relations might seem paradoxical. On the one hand, each color has its proper place within the color array because of the particular color it is. On the other, a color is the particular color it is because of that place within the color array. This suggests colors have their intrinsic properties because of their relations–as opposed to saying they have the relations they do because of their intrinsic properties. But what could be plainer than saying red is what it is because of its intrinsic properties? The intrinsic nature of color, we might object, is prior to any of its relations and it is that essence we should try to articulate. Have we not forgotten this important point? Have we not ignored the intrinsic nature of color, and thus what is most important about color in the first place? In reply, it is acknowledged that this account of internal relations does appeal to the relations a color has to other colors in order to individuate it. But, crucially, that does not make the relations conceptually or ontologically prior to colors’ intrinsic properties. For to make sense of the particular relations a color has we have to return to the relata, the color itself. A color has the particular relations it does because of the color it is, just as we want to say. The difference is that on this story, the relata and the relation are intimately and necessarily involved. The relationship and dependence go in both directions. We are talking about internal relations here, after all. As such, the relata and the relation figure as essential elements. Both balance each other, making both important, but neither prior. That is what is so special about internal relations. In conclusion, we can now say that we have still paid proper respect to the intrinsic nature of color.

With this account in place, perhaps we finally have an answer to the color skeptic. We now have something, in fact a lot, to say about each color. True, we need to speak of other colors to explain what a single color is, but we have gone well beyond mere pointing or metaphor. We say what a color is by talking about how it relates to other colors, about its color relationships, its intrinsic properties that make for those relations, and those relations that make for those properties. If that is not good enough to satisfy our skeptic, we might begin to wonder whether the skeptic is willing to be convinced.

3. Is Color Experience Universal?

A final issue we will discuss in this article concerns the universality of color experience. We have already seen one threat to the notion that we all experience color the same way, namely the possibility of an inverted spectrum. A deeper threat comes from another direction, this time born of wondering about the connection between language and perception. An important theme in the background of this threat lies in the rise and development of a view according to which our perceptual experience is mediated by our language. This has been an important strand in post-WWII philosophy, and as such draws on various themes that fall far outside the scope of this article. We can gain enough of an appreciation of the issue by considering for starters a relatively uncontroversial sense in which our familiarity with a concept influences what we see. To use a well-worn example, a physicist looking at a technical apparatus in a lab sees, in some sense, something different from what the layperson sees and experiences. In this way, different concepts can play some role in what is seen. We move from this innocuous example to tougher ones when we wonder whether different cultures that have completely different languages experience the same world. Or, instead, do the different linguistic resources they bring to experience give them experiences of quite different worlds? It is not hard to be swayed to a perspective from which we see such different languages as yielding very different worlds of experience. Now take these general questions and apply them to experiences of color. Would speakers of languages that have different color terms see the world differently, see different colors?

a. Linguistic Determinism

A particularly strong version of the view that language influences perception was advanced by the anthropologists Whorf and Sapir. On their view, language plays such an essential role in perception that cultures that use different languages can be said to inhabit quite different worlds. What we all see, what we take ourselves to touch, to conceive as real, is a function of language. Vary the language and you change the world experienced. Dubbed the thesis of Linguistic Determinism, this view clearly has interesting implications for color experience once it is realized that there is great diversity in color language across cultures. There are well-documented languages that have only two color terms, or three, or only four, and so forth.

What then would Linguistic Determinism have us expect for people who speak a language with only three color terms, for instance? Presumably, if that thesis of determinism is correct, those people would experience only three colors. We would expect these people simply not to be aware of the colors we have terms for; they would fail to make the color discriminations we make, and they would organize their color field in very different ways than we do. This hypothesis was put to the test in the 1960s by the researchers Berlin and Kay. Compiling data from a great number of languages, their results seem to contradict the Whorf-Sapir thesis and open a whole range of questions and interesting debates.

b. Berlin and Kay

To help appreciate the significance of their findings, we need to distinguish a color’s “foci” from its “boundary.” When presented with an array of color samples (such as ones found at a paint store) we can ask how many of those samples are properly called by a certain color term. We could ask, that is, how many of these samples are appropriately called “red”, and where do we draw the line between samples that are red and those that are not? To answer these questions is to speak of red’s boundary. We might also ask which is the best or most paradigmatic sample of red. This is to ask about red’s foci, or more generally, to look for focal samples.

Berlin and Kay found, quite interestingly, that though there are differences across cultures of color boundaries for shared color terms, there was significant consensus on what counted as a focal color–even across languages with very different numbers of color terms. So, in a culture that had only three basic color words, say ones for “white”, “black”, and “red”, people in that culture would point to the same samples as the foci for each of these colors as people with 11 basic color terms, such as English speakers. What they consider as truly red, or white, or black, would be nearly the same samples that we pick out, though we carve up the world with many different color terms. On the face of it, this suggests something quite other than Whorf-Sapir would have us expect. Something besides color vocabulary seems to be at work in our experience of color. Why else would we all gravitate to the same samples, when for some what is red would presumably include many more colors than it does for us? After all, with only three terms to cover the whole range of color, many more things would have to be called “white” or “black” or “red” in this example. Why would certain samples stand out, even when so many other things are conceived and experienced as red?

In addition, Berlin and Kay found that languages exhibit great similarities on which color terms they have; and great similarities in the relationships between differing numbers of color terms. The following graph summarizes their results, where movement from left to right indicates what color terms would be added as a language increases its number of terms.

Here we see that if a language has two color terms, the terms are “white” and “black”. If a language has a third term, it is “red”; if more than three, then either “green” or “yellow”; and next the other “green” or “yellow” term; and so on. This suggests, as they interpreted it, a development suitably conceived as evolutionary. Thus if a language evolves from two colors to three, the one it will add will be “red”, then “green” or “yellow”, and so forth.
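The ordering just described can be written out as a small rule for which inventories of color terms it allows. This is only an illustrative sketch of the stages mentioned in the text (two to five terms); the later stages covered by “and so on” are deliberately left out rather than guessed at, and the function names `possible_inventories` and `fits_sequence` are hypothetical.

```python
# Every language with at least two color terms is said to have these two.
BASE = frozenset({"white", "black"})


def possible_inventories(n_terms):
    """Return the term inventories the described sequence allows for n_terms."""
    if n_terms < 2 or n_terms > 5:
        raise ValueError("this sketch only covers two to five color terms")
    inventories = [BASE]
    if n_terms >= 3:
        # The third term is always "red".
        inventories = [inv | {"red"} for inv in inventories]
    if n_terms == 4:
        # The fourth term is either "green" or "yellow".
        inventories = [inv | {extra} for inv in inventories
                       for extra in ("green", "yellow")]
    if n_terms == 5:
        # With five terms, both "green" and "yellow" are present.
        inventories = [inv | {"green", "yellow"} for inv in inventories]
    return inventories


def fits_sequence(terms):
    """True if a set of two to five color terms matches the described order."""
    terms = frozenset(terms)
    return 2 <= len(terms) <= 5 and terms in possible_inventories(len(terms))


print(fits_sequence({"white", "black", "red"}))
print(fits_sequence({"white", "black", "green"}))
```

The second call shows the force of the claim: a three-term language with “green” but no “red” is exactly what the proposed sequence rules out.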

What is the philosophical significance of these findings, if true? Simply put, they again suggest that there is something other than language that determines what colors are seen. Berlin and Kay conclude that there are universal, non-trivial constraints on color terms. Color experience is not simply a function of a language’s terms and arbitrary conventions. Instead, there seems something about how the world is that causes different speakers to experience certain colors as best samples, to develop terms for “red” before terms for “brown”, for privileging “white” and “black” over “pink”, and so forth.

If not language, what would explain these findings? One answer comes from facts about the biology of color perception, facts about how our visual system processes certain kinds of electromagnetic radiation, that is, light. (These are the very facts our previously discussed “color-eliminativist” might offer to show that there really is no such thing as color, that is, might make use of to reduce color experience to facts, properties, and relations that make no mention of color at all. Thus what follows can be called upon to serve two functions–explain similarity of color experience across language, and also be used by a color eliminativist to reduce away color. Importantly, these issues are logically independent, and a solution to one problem need have no bearing on the truth of the other.)

Here is a quick summary of the proposed biological account. Our visual system includes rods and cones. Color vision depends on the cones, of which there are three different types. Two of these types operate according to what is known as opponent-processing. Each of these two types of cones has two cells: one whose rate of firing increases when hit by a certain range of light and decreases when another range of light hits it, and a second cell that operates in just the opposite way. For example, there is a cell that maximizes its output when hit by light around 610 nm and is at its lowest output around 500 nm. It sits alongside another cell that works just the opposite–its maximum is around 500 nm and its lowest is at 610 nm. (Call these our Y+B- and Y-B+ cells, respectively.) Thus when the cone with this cell package is hit by that 610 nm light, there will be a pure, highly stimulated response as the Y+B- cell will be at its highest, and the Y-B+ cell will be at its quietest. 610 nm happens to be the range of light we call yellow; and thus when this cone is hit by that light, it will give its purest, most intense output of energy. Yellow will be experienced, in other words, in a pure, intense manner. But when the received light is at around 440 nm, this Y+B- cell is at its lowest output, but its partner, the Y-B+ cell, is at its highest output. Blue then also can appear as a particularly strong, pure color. Other places where we get these pure peaks of cell stimulation occur at 520 nm and 660 nm–the very ranges that correspond to green and red respectively. Here we can speak of our R-G+ and R+G- cells. (White and black have their own cells, but these do not work in opposition to each other, so both the black cell and the white cell can be activated at the same time, yielding experiences of different shades of gray.)

This all suggests that any person with a normally operating visual system is going to experience certain ranges of light with intense neural stimulation, ranges which happen to correspond to the four basic colors: yellow, red, blue, and green (yes, green is a primary color when considering our visual system). And it also explains why no one seems to experience reddish greens–for when the “red” cell is active, the “green” cell is not. We can only have one or the other, and not both. Further, these facts might be able to explain why different speakers in different languages home in on the same color samples–because for everyone these samples trigger the same intense cell stimulation. Our shared judgments about focal colors, as well as why all people gravitate towards certain colors in a similar order, now seem explainable. And the explanation goes beyond what language creates, contrary to Whorf-Sapir.
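The logic of opponent-processing, and why it excludes reddish greens, can be illustrated with a toy calculation. This is a deliberately simplified sketch and not a physiological model: the baseline rate, the linear response, and the names `BASELINE` and `opponent_pair` are all assumptions introduced here for illustration.

```python
BASELINE = 0.5  # assumed resting firing rate, on an arbitrary 0..1 scale


def opponent_pair(excite, inhibit):
    """Return the firing rates of a (plus, minus) pair of opponent cells.

    `excite` and `inhibit` are stimulus strengths in 0..1 for the two light
    ranges the pair responds to (for example "red" and "green"). The plus
    cell rises with `excite` and falls with `inhibit`; the minus cell does
    the opposite.
    """
    net = max(-1.0, min(1.0, excite - inhibit))
    plus_cell = BASELINE + 0.5 * net
    minus_cell = BASELINE - 0.5 * net
    return plus_cell, minus_cell


# Pure "red" light: the R+G- cell is at its highest, the R-G+ cell at its lowest.
print(opponent_pair(1.0, 0.0))

# Equal "red" and "green" stimulation: the two cancel, and neither cell signals.
print(opponent_pair(0.7, 0.7))
```

Because the two cells in a pair always move in opposite directions, they can never both be above baseline at once, which mirrors the claim that we can have a “red” signal or a “green” signal but not both. The white and black cells, which are described above as not opposed, would not be subject to this constraint.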

To be sure, there are many questions left–such as why it is that red always is the first color to appear in languages after “white” and “black” even though other colors trigger similarly intense responses. So too have Berlin/Kay’s results been subjected to many criticisms and objections, from the philosophical to the methodological. What emerges then is a fascinating debate that ranges across numerous disciplines. In a way, that seems most proper and fitting. For color appeals to all who can see it, and it makes sense to suppose that we are still drawn to color, whatever our intellectual interests, just as we have been since we were kids.

4. References and Further Reading

a. Overviews and General Discussions

  • Berlin, B., & Kay, P. (1999). Basic color terms: their universality and evolution. Stanford, Calif: Center for the Study of Language and Information.
    • The landmark book that summarizes their cross-cultural findings of color terms, boundaries, and foci.
  • Byrne, A., & Hilbert, D. R. (1997). Readings on color. Cambridge, Mass: MIT Press.
    • This two-volume set contains a wide range of important articles on various issues concerning color. Volume 1 is on the philosophy of color, and the second volume on the science. Besides containing numerous landmark articles, there is a detailed bibliography and glossary of terms. A must-have set for those wishing to explore the various debates in more detail.
  • Kay, P., McDaniel, C. “The Linguistic Significance of the Meaning of Basic Color Terms”. Language, vol. 54, 1978, pp.610-46.
    • Provides a biologically based explanation for the anthropological findings in Berlin/Kay.
  • Harrison, Bernard. (1973). Form and Content. Oxford: Basil Blackwell.
    • An extended discussion of what we have called “color skepticism”, with a detailed account of color as a system of internal relations. Covers many issues in a careful, interesting manner.
  • Wittgenstein, L., & Anscombe, G. E. M. (1978). Remarks on colour. Oxford [Eng.]: B. Blackwell.
    • An interesting, but difficult, examination of a number of puzzles about color. Hard going but shows a brilliant mind struggling to make sense of difficult problems about color.

b. Specific Positions

  • Armstrong, D. M. (1987) “Smart and the secondary qualities.” In Metaphysics and Morality: Essays in Honour of J. J. C. Smart, ed. P. Pettit, R. Sylvan, and J. Norman. Oxford: Blackwell. (Reprinted as chapter 3 of Readings on Color, vol. 1.)
    • Classic statement of Physicalism.
  • Cornman, J. “Can Eddington’s `two tables’ be identical?”. Australasian Journal of Philosophy vol 52, 1974. pp. 22-38.
    • A defender of Non-Reductive Realism.
  • Hardin, C. L. (1988). Color for Philosophers: Unweaving the Rainbow. Indianapolis: Hackett Pub. Co.
    • Written by a philosopher who knows lots of the science of color perception, this book provides an excellent introduction to debates over the scientific status of color, and provides an extended argument for what we have called Color Eliminativism.
  • Jackson, F., and R. Pargetter. “An objectivist’s guide to subjectivism about colour.” Revue Internationale de Philosophie. vol. 41. 1987. pp.127-41. (Reprinted as chapter 6 of Readings on Color, vol. 1.)
    • An alternative to Physicalism about color.
  • Johnston, M. “How to speak of the colors”. Philosophical Studies, vol. 68, 2 1992. pp. 21-63.
    • Extended defense of Dispositionalism.
  • McDowell, J. “Values and Secondary Qualities”, in Ted Honderich, ed., (1985) Morality and Objectivity. Routledge & Kegan Paul.
    • Discusses the pros and cons of a Projectivist strategy that compares secondary qualities and moral properties.
  • Peacocke, C. “Colour concepts and colour experience”. Synthese vol. 58, 1984. pp. 365-82. (Reprinted as chapter 5 of Readings on Color, vol. 1.)
    • Another version of Dispositionalism.
  • Sellars, W. “Philosophy and the Scientific Image of Man” in Science, Perception and Reality. (1991) Ridgeview Publishing Company.
    • A difficult but interesting argument against Eliminativism, in favor of a different version of Subjectivism.
  • Shoemaker, S. “Phenomenal character.” Noûs. vol. 28, 1994. pp. 21-38. (Reprinted as chapter 12 of Readings on Color, vol. 1.)
    • From a defender of what we have called Phenomenal Subjectivism.

Author Information

Eric M. Rubenstein
Email: erubenst@iup.edu
Indiana University of Pennsylvania
U. S. A.

Personal Identity

What does being the person that you are, from one day to the next, necessarily consist in? This is the question of personal identity, and it is literally a question of life and death, as the correct answer to it determines which types of changes a person can undergo without ceasing to exist. Personal identity theory is the philosophical confrontation with the most ultimate questions of our own existence: who are we, and is there a life after death? In distinguishing those changes in a person that constitute survival from those changes in a person that constitute death, a criterion of personal identity through time is given. Such a criterion specifies, insofar as that is possible, the necessary and sufficient conditions for the survival of persons.

One popular criterion, associated with Plato, Descartes and a number of world religions, is that persons are immaterial souls or pure egos. On this view, persons have bodies only contingently, not necessarily; so they can live after bodily death. Even though this so-called Simple View satisfies certain religious or spiritual predilections, it faces metaphysical and epistemological obstacles, as we shall see.

Another intuitively appealing view, championed by John Locke, holds that personal identity is a matter of psychological continuity. According to this view, in order for a person X to survive a particular adventure, it is necessary and sufficient that there exists, at a time after the adventure, a person Y who psychologically evolved out of X. This idea is typically cashed out in terms of overlapping chains of direct psychological connections, such as those causal and cognitive connections between beliefs, desires, intentions, experiential memories, character traits, and so forth. This Lockean view is well suited for thought experiments conducted from first-person points of view, such as body swaps or tele-transportation, but it, too, faces obstacles. For example, on this view, it appears to be possible for two future persons to be psychologically continuous with a presently existing person. Can one really become two? In response to this problem, some commentators have suggested that, although our beliefs, memories, and intentions are of utmost importance to us, they are not necessary for our identity, our persistence through time.

A third criterion of personal identity is that we are our bodies, that is to say, that personal identity is constituted by some brute physical relation between, for example, different bodies or different life-sustaining systems at different times. Although this view is still somewhat unpopular, developments in personal identity theory in the 1990s promise an ideological change, as versions of the so-called somatic criterion, associated with Eric Olson and Paul Snowdon, attract a continuously growing number of adherents.

The aim of this article is to (1) add precision to the problem of personal identity, (2) state a number of theories of personal identity and give arguments for and against them, (3) formulate “the paradox of identity,” which proposes to show that posing the persistence question, in conjunction with a number of plausible assumptions, leads to a contradiction, and (4) explain how Derek Parfit’s theory of persons attempts to answer this paradox.

Table of Contents

  1. Understanding the Problem of Personal Identity
    1. Criteria and the Identity Relation
    2. Personhood
  2. Theories of Personal Identity
    1. The Simple View
    2. Reductionism (1): General Features
    3. Reductionism (2): Psychological Approaches
    4. Quasi-Psychology
    5. Reductionism (3): Physiological Approaches
  3. The Paradox of Personal Identity
    1. Fission
    2. The Paradox
  4. Parfit and the Unimportance of Personal Identity
  5. References and Further Reading

1. Understanding the Problem of Personal Identity

The persistence question, the question of what personal identity over time consists in, is literally a question of life and death: answers to it determine, insofar as that is possible, the conditions under which we survive, or cease to exist in the course of, certain adventures. These adventures do not have to be theoretically as fancy as the cases, to be discussed later, of human fission or brain swaps: a theory of personal identity tells us whether we can live through the acquisition of complex cognitive capacities in our development from fetus to person, or whether we have survived car accidents if we find ourselves in a persistent vegetative state. Furthermore, theories of personal identity have ethical and metaphysical implications of considerable magnitude: in conjunction with certain normative premises they may support the justification or condemnation of infanticide or euthanasia, or they could prove or falsify certain aspects of our religious outlook, in deciding the questions of how and whether we can be resurrected and whether we are possessors of souls whose existence conditions are identical with ours. It is not surprising, therefore, that most great philosophers have attempted to solve the problem of personal identity, or have committed themselves to metaphysical systems that have substantial implications with regards to the problem, and that most religious belief systems give explicit answers to the persistence question. Neither is it surprising that virtually everybody holds a pre-theoretical theory of personal identity, if only in the sense of having beliefs about afterlives and the meaning of death. The task of solving the metaphysical problem of personal identity essentially involves answering the question of how the phenomenon or principle in virtue of which “entities like us” persist through time is to be specified, under the widely but not universally accepted premises that there is such a phenomenon or principle and that it can be specified. 
We are concerned, in other words, with the truth-makers of personal identity statements: what makes it true that an entity X at time t1 and an entity Y at time t2 are identical, if X and Y are entities like us?

a. Criteria and the Identity Relation

Answers to the persistence question often provide a criterion of personal identity. A criterion is a set of non-trivial necessary and sufficient conditions that determines, insofar as that is possible, whether distinct temporally indexed person-stages are stages of one and the same continuant person. (A temporally indexed person-stage is a slice of a continuant person that extends in three spatial dimensions but has no temporal extension.) To say that C is a necessary condition for E is to say that if E is the case, then C is the case as well, and to say that C is a sufficient condition for E is to say that if C is the case, then E is the case as well. Consequently, to specify such a criterion is to give an account of what personal identity necessarily consists in.

Let us distinguish between numerical identity and qualitative identity (exact similarity): X and Y are numerically identical iff X and Y are one thing rather than two, while X and Y are qualitatively identical iff, for the set of non-relational properties F1…Fn of X, Y possesses all and only F1…Fn. (A property may be called “non-relational” if its being borne by a substance is independent of the relations in which property or substance stand to other properties or substances.) Personal identity is an instance of the relation of numerical identity; investigations into the nature of the former, therefore, must respect the formal properties that govern the latter. The concept of identity is uniquely defined by (a) the logical law of congruence: if X is identical with Y, then all non-relational properties borne by X are borne by Y, or formally ∀(x, y)[(x = y) → (Fx ↔ Fy)]; and (b) reflexivity: every X is identical with itself, or formally ∀x(x = x). (Note that congruence and reflexivity entail that identity is symmetric, ∀(x, y)[(x = y) → (y = x)], and transitive, ∀(x, y, z)[((x = y) & (y = z)) → (x = z)].) [Note: ∀(x, y) is an abbreviation of (∀x)(∀y).]
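The entailment noted in the parenthesis can be spelled out as a short derivation. The following is a standard textbook sketch rather than anything specific to this article, and it rests on one assumption that should be flagged: congruence must be allowed to apply to properties of the form “being identical with a”, written here as an instance of the schematic predicate F.

```latex
% Assumed premises:
%   Congruence:  \forall x \forall y\,[(x = y) \rightarrow (Fx \leftrightarrow Fy)]
%   Reflexivity: \forall x\,(x = x)
\begin{align*}
\intertext{Symmetry. Suppose $x = y$, and let $F\xi$ be the property $\xi = x$.}
  & Fx                      && \text{i.e. } x = x \text{, by reflexivity} \\
  & Fx \leftrightarrow Fy   && \text{by congruence, from } x = y \\
  & Fy                      && \text{i.e. } y = x
\intertext{Transitivity. Suppose $x = y$ and $y = z$, and let $F\xi$ be the property $\xi = z$.}
  & Fy                      && \text{i.e. } y = z \text{, by supposition} \\
  & Fx \leftrightarrow Fy   && \text{by congruence, from } x = y \\
  & Fx                      && \text{i.e. } x = z
\end{align*}
```

Whether “being identical with x” counts as a non-relational property in the sense defined above is a further question, so the derivation shows only that the entailment holds once congruence is read broadly enough to cover such properties.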

Grasp of the notion of numerical identity, to be sure, is essential to our ability to distinguish between the events of picking out one thing more than once and picking out more than one thing. Although exact similarity is, by congruence, a necessary condition for synchronic personal identity, it is neither necessary nor sufficient for diachronic personal identity, that is to say, the persistence of a person over time: two person-slices at different times could be qualitatively identical slices of different people or qualitatively distinct slices of the same person. This is not to say, however, that it is ruled out that lack of similarity over time may obliterate numerical personal identity: depending on what personal identity consists in, certain qualitative changes in a person’s psychology or physiology may kill the person. The question a criterion of personal identity answers is: what kind of changes does a person survive?

This gives a distinctive sense to the claim that a criterion of personal identity is to be constitutive, not merely evidential: in order for a relation R to be constitutive for personal identity, it must be the case that, necessarily, if some past or future Y stands in an R-relation to X, then X is identical with Y. Hence, many elements of our successful everyday reidentification practices, such as physical appearance, fingerprints, or signatures, are inadequate if considered as constituting ingredients of personal identity relations: for example, if the man in the crowd is wearing a Yankees jacket, this might be sufficient evidence for you to conclude that he is your friend Larry. However, wearing a Yankees jacket is not what it is for Larry to persist through time: neither did Larry come into existence when he wore the jacket for the first time nor does he die when he takes it off.

Does the logic of the concept of identity impose further restraints on the concept of personal identity? Some commentators believe that identity is an intrinsic relation, that is, that if two person-stages at different times are stages of one and the same person, that will be true only in virtue of the intrinsic relation between these two stages (cf. Noonan 1989; Wiggins 2001). Others hold identity to be necessarily determinate, that is, that it is necessarily false that sometimes there is no answer to the question of whether X is identical with Y. These commentators typically reason as follows: suppose that it is indeterminate whether X is identical with Y. Since it is determinate that X is identical with X, then, under the assumption that congruence and predicate logic apply, if X is identical with Y, X must be determinately identical with Y. Therefore, by modus tollens, if X is not determinately identical with Y, X is not identical with Y (cf. Evans 1985; Wiggins 2001). Consequently, the question does in fact have an answer, and the claim that identity is indeterminate is self-contradictory. This conclusion is strengthened, in the case of personal identity, by the widely shared intuition that even if the identity of some objects might be indeterminate, this could not be true of the identity of persons: one cannot, it seems, be a bit dead and a bit alive in the same way in which one cannot be a bit pregnant. As it turns out, however, there may be good reasons to deny both the intrinsicness and the determinacy of personal identity (cf. 3.a.; 3.b.).
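
The determinacy argument just reported can be laid out step by step. This is a hedged reconstruction rather than a quotation of any author's own formulation: the operator Δ ("it is determinate that") and the placeholder ξ are notational assumptions introduced here, and the sketch assumes that congruence may be applied to the property "being such that it is determinate that x is identical with it."

```latex
\begin{align*}
1.\quad & \Delta(x = x) && \text{self-identity is determinate} \\
2.\quad & (x = y) \rightarrow \bigl(\Delta(x = x) \leftrightarrow \Delta(x = y)\bigr) && \text{congruence, with } F\xi := \Delta(x = \xi) \\
3.\quad & (x = y) \rightarrow \Delta(x = y) && \text{from 1 and 2} \\
4.\quad & \neg\Delta(x = y) \rightarrow \neg(x = y) && \text{contraposition of 3}
\end{align*}
```

On this reading, the step most open to challenge is 2, since it treats a determinacy-involving predicate as expressing a genuine property to which congruence applies.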

b. Personhood

While the formal properties of the concept of identity are necessary constraints on our discussion, the truth of our identity judgments is subject to material conditions of correctness, which these formal properties cannot provide. These material conditions must be supplied by the nature of the relata judged to stand in an identity relation. The obvious suggestion is that, given that we are dealing with personal identity, these relata are person-stages located at different times. This proposal, however, violates the requirement that the persistence question ought to specify its relata without presupposing an answer: should we choose to accept a definition in the vicinity of Locke’s characterization of a person as a “thinking, intelligent being, that has reason and reflection, and can consider itself as itself, the same thinking thing in different times and places” (1689, II.xxvii.9), then those criteria of personal identity that sanction the identity of a person at one time with a non-person at another time are categorically ruled out. Fetuses, infants, or human beings in a persistent vegetative state, for example, plainly do not fulfill the criteria envisaged by Locke. As a result, since these beings do not possess cognitive capacities that qualitatively attain those of thinking beings, if they possess any at all, couching the persistence question in terms of persons entails that none of us has ever been a fetus or infant or ever will be a human vegetable (Olson 1997a; Mackie 1999). To be sure, these initially baffling claims could be true. However, since these are clearly substantial questions about our persistence, we should not consider ourselves justified in settling the matter by definition.
Consequently, we should prefer vagueness over chauvinism and pose the persistence question in terms of the wider notion of human being, postponing the question of whether and in what sense the notions of person and human being ought to be distinguished: for any person X and any human being Y at different times t1 and t2, if X at t1 is numerically identical with Y at t2, what makes this claim necessarily true?

2. Theories of Personal Identity

In order to discover what your pre-philosophical attitude towards this question is, ask yourself the following: what does a supernatural being have to do in order to resurrect you after you die? Collect a few possible answers and ask yourself whether the resulting being, the freshly created being that is now a candidate for being identical with you before you died, is in fact you. For example, do you believe that

  1. …the supernatural being could have given you a body which bears no physical continuity or causal relation to the one you possessed before your death, or that it could have resurrected you, in some sense or other, as a bodiless being?
  2. …it could have given a new form or content to your psychology, that is, that it is not necessary or sufficient for the “resurrected you” to remember your actions or experiences and that there do not have to be any causal connections between the actions and experiences of you before you died and the “resurrected you”?
  3. …the question of whether or not the resulting person is you depends on the existence, in the resurrected person, of something that one might call “a soul”?

If you believe any of these options, then you must also believe, respectively, that

  1. …a physiological criterion of personal identity is false.
  2. …a psychological criterion of personal identity is false.
  3. …the Simple View of personal identity is true.

Let us discuss these theories of personal identity in more detail.

a. The Simple View

Some commentators believe that there are no informative, non-trivial persistence conditions for people, that is, that personal persistence is an ultimate and unanalyzable fact (cf. Chisholm 1976; Lowe 1996; Merricks 1998; Shoemaker & Swinburne 1984). While psychological and physiological continuities are evidential criteria, these do not constitute necessary and/or sufficient conditions for personal identity. We must distinguish between two versions of this view. One version is that personal identity is non-reductive and wholly non-informative, denying that personal identity follows from anything other than itself. This makes the label Identity Mysticism (“IM”) most appropriate (cf. Zimmerman 1998):

IM: X at t1 is identical to Y at t2 iff X at t1 is identical to Y at t2.

Identity Mysticism plays only an indirect role in contemporary personal identity theory. Although it may be poorly understood, due to limitations of space this article will disregard the view. IM is to be distinguished from a more popular version of the simple view, according to which personal identity relations are weakly reductive (WR) and in independence non-informative (INI):

WR-INI: X at t1 is identical to Y at t2 iff there is some fact F1 about X at t1, and some fact F2 about Y at t2, and F1 and F2 are irreducible to facts about the subjects’ psychology or physiology, and X at t1 is identical with Y at t2 in virtue of the fact that the propositions stating F1 and F2 differ only insofar as that “X” and “t1” occur in the former where “Y” and “t2” occur in the latter.

WR-INI is weakly reductive in the sense that, while the identity relation in question can be reduced to a further domain, the further domain itself typically exhibits elements of non-reducibility and/or resistance to full physical explanation. In their most prominent variants, these elements are due to references to souls, Cartesian Egos or other spiritual or immaterial substances and/or properties. Initially the idea underlying this claim may appear prejudicial; ultimately it is based on a number of widespread but not universally accepted beliefs about the naturalness of the world and the nature, validity and theoretical implications of physicalism. According to this general stance, either both psychological and physiological continuity relations are fully reducible to a domain in which physical explanations are couched, perhaps in terms of the basic elements of a final and unified theory of physics, or they belong themselves to such a domain.

WR-INI may entail IM but does not so necessarily: it is conceivable that personal identity relations consist in something which is itself neither identical with nor reducible to a spiritual substance nor identical with nor reducible to aggregates or parts of psychologies and physiologies. In fact, Descartes’ own view that personal identity is determined by “vital union” relations between pure Egos and bodies, with the persistence of the Ego being regarded as sufficient for the persistence of the person but the person not being wholly identifiable with the Ego, could be a weakly reductive view of persons. It is merely weakly reductive, however, because the identity of the phenomenon that specifies the necessary and sufficient conditions for personal identity does not itself follow from anything other than itself. While a weakly reductive criterion of personal identity relations is explicable in terms of the identities of phenomena other than persons, the identities of these phenomena themselves are not explicable in other terms: their identity may be, as we would suppose “soul identity” to be, “strict and philosophical”, and not merely “loose and popular” (Butler 1736).

Nowadays, the Simple View is disparaged as a theory only maintained by thinkers whose religious or spiritual commitments outweigh the reasons that speak against their views on personal identity. This is because it is assumed that a theory of personal identity cannot be weakly reductive without involving appeal to discredited spiritual substances or committing itself either to the acknowledgment of yet unrecognized physical entities or to an Identity Mysticism on the level of persons. As a consequence, many philosophers think that the problems that beset dualism and Cartesian theories of the soul, such as the alleged impossibility of circumscribing the ontological status of souls and of explaining how a soul can interact with a body, render the Simple View equally problematic. Although the options mentioned are exceedingly difficult to defend, why should they have to be regarded as the only options available to the Simple Theorist? Arguably, many respectable philosophical ideologies, such as conceptualism or Neo-Kantianism, may issue in theories of personal identity along Simple lines without appeal to Cartesian Egos. (Note, however, that these ideologies, with regard to the problem of the persistence of people, may also be, and in fact have been, construed along physiological or psychological lines.) This suggests that we need not only a better understanding, and above all more promising articulations, of the Simple View, but also a new taxonomy of theories of personal identity: the traditional division of theories into Simple, Psychological and Physical, even if maintained here by the author of this entry, may not be the best way of viewing the matter.

b. Reductionism (1): General Features

Modern day personal identity theory takes place mainly within reductionist assumptions, concentrating on the relative merits of different criteria of identity and related methodological questions. Reductionist theories of personal identity share the contention that…

Reduction: Facts about personal identity stand in an adequate reduction-relation to sets of sub-personal facts SF1…SFn about psychological and/or physiological continuities in such a way as to issue in biconditionals of the form “X at t1 is identical to Y at t2 iff X at t1 and Y at t2 stand in a continuity-relation fully describable by SFx.”

Thus, any given set of sub-personal facts will impose demands, in the form of necessary and sufficient conditions, upon the kinds of adventures a subject can survive in persisting from t1 to t2. The sets of necessary and sufficient conditions determined by these sets of sub-personal facts constitute the various criteria of personal identity. It must be noted that the biconditionals in question need not be understood in such a way that circularity is an objection to them: provided that concepts other than “person” feature in the analysans, these biconditionals, by exhibiting connections with collateral and independently intelligible concepts, may be genuinely elucidatory even if the concept to be analyzed features on both sides of the equation (cf. McDowell 1997; Wittgenstein 1922, 3.263).

Only when the concepts “person” and “personal identity” become the target of what may be referred to as an authentic reduction do circularities become vicious. The need for the distinction between authentic and inauthentic reductions arises due to an equivocation that ought not to confuse the present discussion: reductionisms in personal identity theory often take forms, if regarded for example as sets of supervenience claims, that are deemed, in other areas of analytic philosophy, as distinctively non-reductionist. Let us speak of authentic reductions if the ontological status of members of the reduced category is, in a way to be made precise, diminished in favor of the allegedly “more fundamental” existence-status of members of the reducing category. The question of whether an authentic reductionism about persons must claim that it is not only able to give a criterion of personal identity without presupposing personal identity but also that facts about persons are describable without using the concept “person” is a matter of current controversy (cf. Behrendt 2003; Cassam 1989; 1992; Johnston 1997; McDowell 1997; Parfit 1984; 1999; forthcoming; cf. also 2.d.).

In a search for the necessary and sufficient conditions for the sustenance of personal identity relations between subjects, which type of continuity-relations could SF describe? There are two main contenders, physiological continuity-relations and psychological continuity-relations, which will be discussed in turn.

c. Reductionism (2): Psychological Approaches

Psychological Criteria of personal identity hold that psychological continuity relations, that is, overlapping chains of direct psychological connections, such as those causal and cognitive connections between beliefs, desires, intentions, experiential memories, character traits and so forth, constitute personal identity (cf. Locke 1689, II.xxvii.9-29; Parfit 1971a; 1984; Perry 1972; Shoemaker 1970; Shoemaker & Swinburne 1984).

Two apparently physiological theories of personal identity are at bottom psychological, namely (i) the Brain Criterion, which holds that the spatiotemporal continuity of a single functioning brain constitutes personal identity; and (ii) the Physical Criterion, which holds that, necessarily, the spatiotemporal continuity of that which sustains the continuous psychological life of a human being over time, which is, contingently, a sufficient part of the brain that must remain in order to be the brain of a living person, constitutes personal identity (cf. Nagel 1971). These approaches are at bottom psychological because they single out, as the constituting factors of personal identity, the psychological continuity of the subject. Consider a test case. Imagine there to be a tribe of beings who are in all respects like human beings, except for the fact that their brains and livers have swapped bodily functions: their brains regulate, synthesize, store, secrete, transform, and break down many different substances in the body, while their livers are responsible for their cognitive capacities, basic integrated postural and locomotor movement sequences, perception, instincts, emotions, thinking, and other integrative activities. Imagine the brain criterion to be true for human beings. Would we have sufficient reason to believe the brain criterion to be true for members of the tribe in question as well, if we were aware of all facts about their physiologies? No, precisely because the brain criterion is true for human beings, a liver criterion would have to be true for members of this tribe. There is nothing special about the 1.3 kilograms of grey mass that we carry around in our skulls, except for the fact that this mass is the seat of our cognitive capacities.

We can further distinguish between three versions of the psychological criterion: the Narrow version demands psychological continuity to be caused “normally,” the Wide version permits any reliable cause, and the Widest version allows any cause to be sufficient to secure psychological continuity (cf. Parfit 1984). The Narrow version, we may note, is logically equivalent to the Physical Criterion.

One might think that the brain criterion and the physical criterion, to varying degrees, combine the best of both worlds: both acknowledge the vital function psychological continuity plays in our identity judgments while at the same time admitting the importance of physiological instantiation. In fact, however, the opposite is the case: the appeal to physiology introduces an unacceptable element of contingency into the answers to the persistence question envisaged by defenders of these criteria. A criterion of personal identity tells us what our persistence necessarily consists in, which means that it must be able to deliver a verdict in possible scenarios that is consistent with its verdicts in ordinary cases. One scenario that has been widely debated is the following:

Teletransportation

At t1, X enters a teletransporter, which, before destroying X, creates an exact blueprint of X’s physical and psychological states. The information is sent to a replicator device on Mars, which at t2 creates a qualitatively identical duplicate, Y (cf. Parfit 1984). Our alleged intuition: since Y at t2 shares with X at t1 all memories, character traits, and other psychological characteristics, X and Y are identical. Alleged conclusion: should teletransportation be reliable, all proposed criteria but the Wide and Widest versions of the Psychological Criterion are false.

Should teletransportation be unreliable, all criteria of personal identity but the Widest version of the Psychological Criterion are false. Consequently, should appeal to such scenarios as Teletransportation be acceptable and should the intuition above be widely shared, the brain criterion and physical criterion are false.
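
The way the Narrow, Wide, and Widest versions sort the Teletransportation case can be sketched in a few lines of Python. This is an illustrative toy model only: the names `Cause`, `ACCEPTED_CAUSES`, and `versions_granting_identity` are hypothetical, and the sketch assumes that a normal cause also counts as reliable and that every cause falls into exactly one of three kinds.

```python
from enum import Enum


class Cause(Enum):
    """Kinds of cause of psychological continuity (a simplifying assumption)."""
    NORMAL = "normal"          # e.g. the ordinary functioning of one brain
    RELIABLE = "reliable"      # abnormal but dependable, e.g. a working teletransporter
    UNRELIABLE = "unreliable"  # any other cause


# Which kinds of cause each version of the Psychological Criterion accepts.
ACCEPTED_CAUSES = {
    "Narrow": {Cause.NORMAL},
    "Wide": {Cause.NORMAL, Cause.RELIABLE},
    "Widest": {Cause.NORMAL, Cause.RELIABLE, Cause.UNRELIABLE},
}


def versions_granting_identity(cause):
    """Return the versions on which continuity with this cause secures identity."""
    return [name for name, accepted in ACCEPTED_CAUSES.items() if cause in accepted]


# Reliable teletransportation: only the Wide and Widest versions identify X with Y.
print(versions_granting_identity(Cause.RELIABLE))    # ['Wide', 'Widest']
# Unreliable teletransportation: only the Widest version does.
print(versions_granting_identity(Cause.UNRELIABLE))  # ['Widest']
```

If the intuition that X is identical with Y is accepted, the versions missing from each list are the ones the scenario counts against.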

d. Quasi-Psychology

Many people regard the idea that our persistence is intrinsically related to our psychology as obvious. The problem of cashing out this conviction in theoretical terms, however, is notoriously difficult. Psychological continuity relations are to be understood in terms of overlapping chains of direct psychological connections, that is, those causal and cognitive connections between beliefs, desires, intentions, experiential memories, character traits and so forth. This statement avoids two obvious problems.

First, some attempts to cash out personal identity relations in psychological terms appeal exclusively to direct psychological connections. These accounts face the problem that identity is a transitive relation (see 1.a.) while many psychological connections are not. Take memory as an example: suppose that Paul broke the neighbor’s window as a kid, an incident he remembers vividly when he starts working as a primary school teacher in his late 20s. As an old man, Paul remembers his early years as a teacher, but has forgotten ever having broken the neighbor’s window. Assume, for reductio, that personal identity consists in direct memory connections. In that case the kid is identical with the primary school teacher and the primary school teacher is identical with the old man; the old man, however, is not identical with the kid. Since this conclusion violates the transitivity of identity (which states that if an X is identical with a Y, and the Y is identical with a Z, then the X must be identical with the Z), personal identity relations cannot consist in direct memory connections. Appeal to overlapping layers or chains of psychological connections avoids the problem by permitting indirect relations: according to this view, the old man is identical with the kid precisely because they are related to each other by those causal and cognitive relations that connect kid and teacher and teacher and old man.

Second, memory alone is not necessary for personal identity, as lack of memory through periods of sleep or coma does not obliterate one’s survival of these states. Appeal to causal and cognitive connections which relate not only memory but other psychological aspects is sufficient to eradicate the problem. Let us say that we are dealing with psychological connectedness if the relations in question are direct causal or cognitive relations, and that we are dealing with psychological continuity if overlapping layers of psychological connections are appealed to (cf. Parfit 1984).
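
The difference between connectedness and continuity in the case of Paul can be made concrete with a small sketch. It models the three stages and their direct memory links as a graph, treats connectedness as a direct link, and treats continuity as reachability through a chain of links. All names (`stages`, `direct`, `connected`, `continuous`, `is_transitive`) are hypothetical and chosen for illustration.

```python
from itertools import product

stages = ["kid", "teacher", "old_man"]

# Direct memory connections: (later, earlier) means the later stage
# remembers the experiences of the earlier one.
direct = {("teacher", "kid"), ("old_man", "teacher")}


def connected(a, b, links=direct):
    """Psychological connectedness: the same stage, or a direct link either way."""
    return a == b or (a, b) in links or (b, a) in links


def continuous(a, b, links=direct):
    """Psychological continuity: some chain of direct links joins a and b."""
    reached, frontier = {a}, [a]
    while frontier:
        current = frontier.pop()
        for other in stages:
            if other not in reached and connected(current, other, links):
                reached.add(other)
                frontier.append(other)
    return b in reached


def is_transitive(relation):
    """Check transitivity of a two-place relation over all triples of stages."""
    return all(
        relation(x, z) or not (relation(x, y) and relation(y, z))
        for x, y, z in product(stages, repeat=3)
    )


print(connected("old_man", "kid"))   # False: the old man has forgotten the window
print(continuous("old_man", "kid"))  # True: linked via the teacher stage
print(is_transitive(connected))      # False: so it cannot be identity
print(is_transitive(continuous))     # True
```

Because continuity is defined as reachability, it is transitive by construction, which is why it remains a candidate for analyzing identity where direct connectedness is not.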

One of the main problems a psychological approach faces is overcoming an alleged circularity associated with explicating personal identity relations in terms of psychological notions. Consider memory as an example. It seems that if John remembers having repaired the bike, then it is necessarily the case that John repaired the bike: saying that a person remembers having carried out an action which the person did not in fact carry out may be regarded as a misapplication of the verb “to remember.” To be sure, one can remember that an action was carried out by somebody else; it seems to be a matter of necessity, however, that one can only have first-person memories of experiences one had or actions one carried out. Consequently, the objection goes, if memory and other psychological predicates are not impartial with regard to identity judgments, a theory that involves these predicates and that at the same time proposes to explicate such identity judgments is straightforwardly circular: it plainly assumes what it intends to prove.

To make things clearer, consider the case of Teletransportation above: if at t2 Y on Mars remembers having had at t1 X’s experience on earth that the coffee is too hot, then, necessarily, X at t1 is identical with Y at t2. The dialectic of such thought experiments, however, requires that a description of the scenario is possible that does not presuppose the identity of the participants in question. We would wish to say that since X and Y share all psychological features, it is reasonable or intuitive to judge that X and Y are identical, and precisely not that since we describe the case as one in which there is a continuity between X’s and Y’s psychologies, X and Y are necessarily identical. If some psychological predicates presuppose personal identity in this way, an account of personal identity which constitutively appeals to such predicates is viciously circular.

In response, defenders of the psychological approach have created psychological concepts that share with our ordinary psychological predicates all features except presumptions of personal identity: for example, the concept of “quasi-memory” is exactly like ordinary memory apart from the fact that “memory” is judgmental with regard to personal identity whereas “quasi-memory” is not (cf. Shoemaker 1970). While many commentators regard the appeal to quasi-memory, and ultimately “quasi-psychology,” as sufficient to solve the circularity problem, some commentators think that personal concepts infiltrate extensionally articulated psychological concept-systems so deeply that any reductionist programme in personal identity is doomed from the start (cf. Evans 1982; McDowell 1997).

e. Reductionism (3): Physiological Approaches

Opponents of the psychological criterion typically favour a physiological approach. There are at least two of them: (i) the Bodily Criterion holds that the spatiotemporal continuity of a functioning human body constitutes personal identity (cf. Williams 1956-7; 1970; Thompson 1997); and (ii) the Somatic Criterion holds that the spatiotemporal continuity of the metabolic and other life-sustaining organs of a functioning human animal constitutes personal identity (cf. Mackie 1999; Olson 1997a; 1997b; Snowdon 1991; 1995; 1996). It is not obvious that there is a straightforward relation between them, for everything depends on how the notions of “functioning human body” and “life-sustaining organs” are understood. If these notions are understood similarly, the views are (close to) equivalent; the other extreme, even if unlikely to be held, is that the notions are understood differently, to the effect that they are incompatible (if, for example, a functioning human body and its life-sustaining organs could come apart). Physiological approaches have consequences many of us feel uncomfortable with. Consider the following thought experiment:

Body Swap

X’s brain is transplanted into Y’s body. X’s body and Y’s brain are destroyed; the resulting person is Z. Our alleged intuition: since Z shares with X all memories, character traits, and other psychological characteristics, X is identical with Z. Alleged conclusion: the Bodily and the Somatic Criteria are false (cf. Locke 1689, II.xxvii.15; Shoemaker 1963).

Defenders of the Bodily Criterion and the Somatic Criterion typically bite the bullet and argue that it is not the case that X and Y have swapped bodies, but rather that Y falsely believes that he or she is X, and therefore that Z is identical with Y.

Since the psychological and physiological approaches are mutually exclusive and, we may suppose in the current context, as candidates for an adequate theory of personal identity jointly exhaustive, any objection against the psychological approach is equally an argument for the physiological approach. The initial implausibility of the physiological approach is due to thought experiments that traditionally permeate the personal identity debate and often favour psychological considerations. Defenders of the somatic approach, most notably Olson and Snowdon, have tried to shift the focus to real-life cases in which descriptions along physiological lines look much more promising. Consider:

Human Vegetable

X has a motorcycle accident at t1. The being Y that is transported to the hospital is at t2 in a persistent vegetative state. Our alleged intuition: X at t1 is identical with Y at t2. Alleged conclusion: all views which postulate psychological continuity as a necessary condition are false.

Fetus

Since a fetus does not possess the cognitive capacities necessary to satisfy the demands of the Psychological Criterion, if the latter is true, no person can be identical with a past fetus. Our alleged intuition: Each of us is identical with a past fetus. Alleged conclusion: all views which postulate psychological continuity as a necessary condition are false.

A third problem for the psychological approach is that it implies, supposedly, that we are not human animals (Ayers 1990; Snowdon 1990; Olson 1997a; 2002a). The argument is simple:

Premise 1: Psychological continuity is neither necessary nor sufficient for the persistence of a human animal.

Premise 2: The psychological approach claims that psychological continuity is necessary and/or sufficient for our persistence.

A: for reductio: The psychological approach is true.

B: from 2, A: Psychological continuity is necessary and/or sufficient for our persistence.

Premise 3: Psychological continuity cannot at the same time be (i) necessary and/or sufficient for a thing’s persistence and (ii) neither necessary nor sufficient for the same thing’s persistence.

C: from 1, B, 3: None of us is identical with a human animal.

Premise 2 is implied by the psychological approach. The thought experiments that support premise 1 have already been given: since the human animal each of us is has been a fetus and could end up as a human vegetable, the thought experiments Fetus and Human Vegetable above demonstrate that psychological continuity is not necessary for human animal identity. A variant of Body Swap shows that psychological continuity is not sufficient for human animal identity. Suppose X’s brain to be transplanted into Y’s skull and X’s body and Y’s brain are destroyed. Suppose further that the resulting being Z is psychologically continuous with X. In this case, it does not seem to be the case that the surgeons transplant the human animal X from one head to another. Rather, it seems, the human animal Y receives a new organ, namely a brain. Consequently, psychological continuity is not sufficient for human animal identity and premise 1 holds. Premise 3 seems to be obvious, because its being false would entail that one and the same being can outlive itself, which is absurd. The defender of the physiological approach now argues that

Premise 4: We are human animals.

D: from C, 4: The psychological approach is false.

Premise 5: Physiological and psychological answers to the persistence question are mutually exclusive and jointly exhaustive.

Conclusion: The physiological approach is true.
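
The overall shape of this argument can be checked mechanically once it is compressed into propositional form. The sketch below is a simplification and not the author's own formalization: it assumes that premises 1 to 3 together yield the conditional "if the psychological approach is true, then none of us is a human animal" (step C), that premise 5 says exactly one of the two approaches is true, and it uses hypothetical names throughout. It then searches all truth-value assignments for a counterexample.

```python
from itertools import product


def implies(antecedent, consequent):
    """Material conditional."""
    return (not antecedent) or consequent


def premises_hold(psych, human_animal, physio):
    """psych: the psychological approach is true
    human_animal: we are human animals
    physio: the physiological approach is true
    """
    step_c = implies(psych, not human_animal)  # result of premises 1-3 under assumption A
    premise_4 = human_animal
    premise_5 = psych != physio                # mutually exclusive and jointly exhaustive
    return step_c and premise_4 and premise_5


# A counterexample would make every premise true and the conclusion false.
counterexamples = [
    (psych, human_animal, physio)
    for psych, human_animal, physio in product([True, False], repeat=3)
    if premises_hold(psych, human_animal, physio) and not physio
]

print(counterexamples)  # [] : no assignment satisfies the premises while denying the conclusion
```

An empty list shows only that the argument is valid in this compressed form; whether it is sound depends on the premises, and premise 4 in particular is disputed.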

It may be argued that premise 4 is not a matter of metaphysics but of biological classification. The underlying problem, however, is that it seems undeniable that there is a human animal located where each of us is. If this human animal has persistence conditions different from those that determine our persistence, then there must be two things wherever each of us is located. This conclusion raises important questions and problems a psychological approach must address.

3. The Paradox of Personal Identity

One of the most influential thought experiments in recent personal identity theory is the case of fission.

a. Fission

Fission

X’s brain is removed from X’s body and X’s body is destroyed. X’s brain’s corpus callosum, the bundle of fibres responsible for retaining the capacity of information-transfer between the two brain hemispheres, is severed, leaving two (potentially) equipollent brain hemispheres. The single lower brain is divided and each hemisphere is transplanted into one of two qualitatively identical bodies of the fission outcomes Y1 and Y2. Our alleged intuition: since both Y1 and Y2 share with X all psychological characteristics, both are candidates for being identical with X: either, in the absence of the other, would have been identical with X. Alleged conclusion: either, on pain of violating the transitivity of identity, the Psychological Criterion is false or the question of whether two person-stages X at t1 and Y1 at t2 are temporal parts of the same person depends on facts concerning not only X and Y1 but also, in this case, Y2. In the latter case, a “closest continuer” clause and/or a “no-branching” proviso must complement a psychological continuity analysis (For a development of this case, see Nozick 1981; Parfit 1984; and Wiggins 1967).

Fission scenarios emphasise the difficulty of deciding whether a thought experiment is acceptable or not. They assume the possibility of commissurotomy or brain bisection, that is, the severing of the corpus callosum, and hemispherectomy, that is, the surgical removal of the cerebral cortex of one brain hemisphere. Commissurotomy was used in epilepsy treatment in the 1950s (cf. Nagel 1971) and hemispherectomies too have been performed in the past. However, fission cases additionally assume the possibility, in some sense or other, of dividing the subcortical regions, and in particular the single lower brain. This is not physically possible without damaging the upper brain functions (cf. Parfit 1984). Many commentators regard fission as an acceptable challenge to theories of personal identity. Wilkes disagrees: she thinks that our ignorance about what actually happens in these cases jeopardises the theoretical relevance of fission scenarios (cf. 1988). The question of whether or not physically impossible but logically possible scenarios are acceptable remains to be answered.

Should fission be an acceptable scenario, it presents problems for the psychological approach in particular. The fission outcomes Y1 and Y2 are both psychologically continuous with X. According to the psychological approach, therefore, they are both identical with X. By congruence, however, they are not identical with each other: Y1 and Y2 share many properties, but even at the very time the fission operation is completed differ with regard to others, such as spatio-temporal location. Consequently, fission cases seem to show that the psychological approach entails that a thing could be identical with two non-identical things, which of course violates the transitivity of identity. Some commentators have attempted to save the psychological approach by appeal to the so-called “multiple occupancy view,” that is, the claim that, despite appearances, X was two people, namely Y1 and Y2, all along (cf. Lewis 1976; Noonan 1989; Perry 1972). Combined with a four-dimensionalist or temporal part ontology, this view is not as absurd as it initially seems, but it is certainly controversial.
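The structure of this objection can be set out schematically. The following is only a sketch: the predicate C(a, b), read as “a is psychologically continuous with b,” is a label introduced here for illustration and is not notation used elsewhere in this article, and line (2) states the psychological approach in its simplest, unqualified form.

```latex
\begin{align*}
&(1)\quad C(X, Y_1) \wedge C(X, Y_2)
  && \text{the fission scenario}\\
&(2)\quad \forall a\,\forall b\,\bigl(C(a,b) \rightarrow a = b\bigr)
  && \text{continuity suffices for identity}\\
&(3)\quad X = Y_1 \wedge X = Y_2
  && \text{from (1) and (2)}\\
&(4)\quad Y_1 = Y_2
  && \text{from (3), by symmetry and transitivity of identity}\\
&(5)\quad Y_1 \neq Y_2
  && \text{they differ in spatio-temporal location}\\
&(6)\quad \bot
  && \text{from (4) and (5)}
\end{align*}
```

On this reading, a defender of the psychological approach must give up or qualify one of the lines above: the multiple occupancy view can be understood as questioning whether “X” picks out a single person in (1), whereas restricting (2) is the route taken by those who add further conditions to psychological continuity.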

Others have acknowledged, as a consequence of fission scenarios, that psychological continuity is not sufficient for personal identity. These commentators typically complement their psychological theory with a non-branching proviso and/or a closest continuer clause. The former states that even though X would survive as Y1 or Y2 if the other did not exist, given that the other does exist, X ceases to exist. This proviso avoids the problem of violating the transitivity of identity. It is hard to believe, however, because it entails that I can kill you without you ever noticing: if I knock you unconscious, transplant one of your brain hemispheres into a different body, and drop you off at home before you wake up, then, if the transplant is successful and the psychological approach with non-branching proviso is true, you are dead. We could avoid this problem by adding a closest-continuer or best candidate clause, stating roughly that the best candidate for survival in a fission scenario, that is, the fission outcome which bears the most or the most important resemblances to the original person X, is identical with X. One of the problems with this suggestion is that it assumes that personal identity is an extrinsic relation. It thereby violates another important principle, namely the so-called “only X and Y rule,” which states, roughly, that if two person-stages at different times are stages of one and the same person, that will be true only in virtue of the intrinsic relation between these two stages (cf. Noonan 1989; Wiggins 2001). While this principle is not necessarily sacrosanct, it is desirable to avoid violating it.

b. The Paradox

The upshot of the preceding discussion is that we find ourselves in a perplexing situation. Let the underlying assumption be that there is a criterion of personal identity. The starting point of the debate has been that

Premise 1: A criterion of personal identity captures all those aspects of our existence that are necessary and sufficient for our persistence.

Premise 2: Our persistence is determinate.

A: from 1, 2: A criterion of personal identity determines for every possible past event e0 and future event e2, within the boundaries of an adequate delineation of the modality in question, whether a person X at t1 is identical with the being that has participated in e0 and the being that will participate in e2.

Premise 3: Personal identity relations are factual: criteria of personal identity are determined neither by conventions, norms, or other social or personal preferences, however basic, nor by analytic matters about the meaning of concepts. Their truth is, literally, a matter of life and death.

B: from A, 3: There is a factual relation R between a person X at t1 and a being Y at t0/t2 which, for every possible scenario, determines whether X at t1 is identical with Y at t0/t2.

Now, if we agree with the tentative conclusion that there is, at present, no satisfactory simple view of personal identity, then we assent to the claims that

Premise 4: IM and WRINI are, with respect to a specification of the necessary and sufficient conditions for personal identity, inadequate.

Premise 5: The distinction between IM and WRINI on the one hand and the reductionist views sketched in I.A.4 on the other is exclusive.

C: from 4, 5: The only feasible candidates for R are relations of physiological and/or psychological continuity.

Since B demands that R holds for every possible scenario, within the limits of an adequate delineation of the modality in question, a criterion of personal identity must deliver compatible judgments on the thought experiments sketched above. However, since these thought experiments deliver conflicting intuitions about which criterion is true, it cannot be the case that more than one such criterion is true. From this it follows that

Premise 6: Physiological and psychological criteria of personal identity are incompatible, that is, R cannot be a conjunction of physiological and psychological relations as well as issuing in determinate and compatible solutions to each thought experiment.

Now, if we are also prepared to accept the

Big Assumption: A criterion of identity must accept all alleged conclusions of the thought experiments sketched in I.A.5

then we must conclude that

D: from B, 6A: Neither physiological nor psychological continuity is both necessary and sufficient for personal identity.

The problem with D is that, in conjunction with premises 2, 4, and 5, it reduces the underlying assumption that there can be an informative criterion of personal identity ad absurdum. This argument may be referred to as the Paradox of Personal Identity.

One should refrain from drawing precipitate conclusions from its defining characteristic as a paradox, that is, the fact that denying any of its premises leads to a conclusion that either violates our intuitions or, in the case of 4, 5, and C, commits one to a philosophically disreputable stance. Rather, the Paradox should be regarded as the starting point of any discussion of personal identity, in the sense that taking a stand on its individual premises endows the various criteria of personal identity with their distinctive features. However, given that the paradox obliges us, in one way or other, to revise our pre-philosophical beliefs, a theory of personal identity should aim at meeting what will be referred to as the Adequacy Constraint AC on theories of personal identity, which demands that

AC: We ought to sanction a substantial revision of our pre-philosophical views of our metaphysical nature only on the conditions that (i) we receive an explanation of the unreliability of our intuiting faculties that in this domain outweighs our grounds for, and in other domains is compatible with, believing in their reliability; (ii) we receive an approximate demarcation of the extents to which we have to abandon our pre-philosophical beliefs and to which we can and we cannot have knowledge about ourselves.

How is the Paradox to be resolved? A, B, C, and D are deductions, and premise 1 is plausible on independent grounds. If identity is determinate, then premise 1 is true. Consequently, those arguments that deny the possibility of vague objects and indeterminate identity, in addition to our intuition that our own identity must be determinate, work in favor of 1. Note that, should personal identity be indeterminate, we might still be able to give a criterion of personal identity, even though such a criterion would then fall short of giving full necessary and sufficient conditions, since in some imaginary case it does not apply.

The denial of premise 3 seems to entail that we have, in a deep sense, an influence on whether we survive a given adventure, namely by possessing a particular normative, experiential, or attitudinal background. This contention may contradict our intuitions more than any thought experiment could. Since we assumed premises 4 and 5, only premises 2 and 6 and the Big Assumption remain. Could one deny premise 6? Given that the determinacy and factuality premises are accepted, it is hard to believe that we could: if a hybrid view were determinately true, a human being could die twice, once when her psychological and once when her physiological capacities cease to function. As a result, most commentators accept 6 but choose to accept a particular criterion in the vicinity of either side of the psychology-physiology divide. This implies that the Big Assumption must either not entail D or be rejected, which can be argued, always assuming that AC is being met, in three ways:

(a) One could define “adequacy of modality” in such a way as to exclude precisely those thought experiments which are problematic for a given criterion. There are two problems with this proposal: first, it is difficult to see how such a definition of adequacy of modality could not be ad hoc. And secondly, the suggestion is insufficient, for some thought experiments circumscribing physically possible scenarios, such as Human Vegetable, trigger incompatible intuitions as well. While some commentators think that Y is identical with X despite X’s loss of cognitive capacities, others regard Y as a living grave stone, nurtured merely for sentimental reasons, in commemoration of the deceased X.

(b) One could deny premise 2 instead, arguing that if personal identity is indeterminate, then our preferred criterion of personal identity does not have to deliver verdicts in all thought-experimental scenarios. This move has the further benefit that we do not have to quarrel with the alleged conclusion of another thought experiment, the combined spectrum:

Combined Spectrum

A spectrum of possible cases is imagined: at the near end, the normal case, X at t1 is fully psychologically and physiologically continuous with Y at t2, while at the far end X at t1 is neither psychologically nor physiologically continuous with Y at t2. In the intermediate cases, X at t1 is approximately halfway psychologically and physiologically continuous with Y at t2. Our alleged intuition: towards the near end of the spectrum X at t1 is identical with Y at t2 and towards the far end of the spectrum X at t1 is not identical with Y at t2. There could not even in principle be evidence for the existence of a sharp borderline between the cases in which X at t1 is and the cases in which X at t1 is not identical with Y at t2. Hence, it is implausible to believe that such a borderline exists. Alleged conclusion: personal identity is indeterminate.

Epistemicists like Timothy Williamson (cf. 1994) deny that we should render it implausible that there is such a sharp borderline merely because we are necessarily ignorant of its existence. Vagueness, according to epistemicism, consists precisely in our necessary ignorance of such sharp boundaries. The other problem is that even if personal identity is indeterminate, the claim cannot by itself establish one criterion over others: in order to do so, it would have to exclude those thought experiments that challenge opposing criteria while leaving untouched those that supposedly establish the preferred criterion. It is doubtful, however, that the indeterminacy of personal identity can be exploited selectively, for physiological and psychological continuity relations are equally indeterminate in a particular range of cases (cf. Parfit 1984). Furthermore, in those cases in which they are not, for example Body Swap, Human Vegetable, and Fetus, appeal to indeterminacy does little to remove the contradictory intuitions that these cases trigger. Consequently, unless one holds that personal identity is categorically indeterminate whenever the physiological and psychological features of a human being come apart, appeal to indeterminacy cannot establish the rejection of the Big Assumption in such a way as to avoid the Paradox’s conclusion.

(c) The most common strategy is to bite the bullet and accept some or other allegedly absurd conclusion of the thought experiments. The defender of the Psychological Criterion must hold that we are not identical with a past fetus or infant, and that we will not have survived if fallen into a persistent vegetative state. Defenders of a Physiological Criterion, on the other hand, must commit to the consequence that if X’s head is grafted onto Y’s body, then the resulting person is Y and not X, even though this person shares all psychological features with X before the operation.

The problem with this strategy is that, if it is accepted, we seem to be unable to decide on a criterion of personal identity on the basis of intuitions at all, on pain of unjustifiably favoring one’s own over other people’s intuitions. On the assumption that we are unable to hierarchically structure these conflicting intuitions, we have a classical stand-off: there are two sides to the coin of personal identity and appeal to intuition plainly underdetermines preferring one side over the other. The problem is that human beings are organic material objects, the persistence of which is determined by these objects’ following a continuous trajectory between space-time points. The further question of whether or not human beings are essentially organic material objects depends on the question of whether psychological properties render human beings sufficiently dissimilar from such objects so as to “deserve” their own identity criterion. The fear underlying the Paradox of Personal Identity, then, is that there may be no metaphysical fact of the matter as to whether the antecedently specifiable differences between human beings and other organic or inorganic material objects count as sufficient in order for us to have persistence conditions different from these objects. It does not seem as if any possible thought experiment, irrespective of how unequivocal our intuitions about it are, could allay this fear. Personal identity theorists, therefore, ought to offer a more comprehensive account of the ontological status of persons and their relation to the constituents that make them up.

4. Parfit and the Unimportance of Personal Identity

Derek Parfit proposes a theory of the ontological status of persons, which promises to answer the problem of fission and the paradox of personal identity. While this article cannot do justice to the complexities of Parfit’s theory, which has been the focal point of debate since 1970, it is worth mentioning its main features.

Although Parfit affirms the existence of persons, their special ontological status as non-separately-existing substances can be expressed by the claim that persons do not have to be listed separately on an inventory of what exists. In particular, persons themselves are distinct from their bodies and psychologies, but the existence of a person consists in nothing over and above the existence of a brain and body and the occurrence of an interrelated series of mental and physical events. These are the foundational claims of Parfit’s constitutive reductionism. Consider an analogy: Cellini’s Venus is made of bronze. Although the lump of bronze and the statue itself surely exist, these objects have different persistence conditions: if melted down, Venus ceases to exist while the lump of bronze does not. Therefore, they are not identical; rather, so the suggestion goes, the lump of bronze constitutes the statue. The same is true of persons, who are constituted by, but not identical with, a physiology, a psychology, and the occurrence of an interrelated series of causal and cognitive relations.

Now, how does this relate to the fission case? We must first note that Parfit believes (i) that our persistence consists in physical and/or psychological continuity; (ii) that personal identity is indeterminate in some cases, that is, that sometimes there is no right-or-wrong answer to the question of whether somebody has ceased to exist in the course of a certain adventure (see 3.b.); (iii) that what prudentially matters in survival is psychological continuity; (iv) that personal identity relations must respect the remaining formal properties of identity. This means that in the fission case Y1 and Y2 cannot be identical with X because the transitivity of identity is violated: therefore, X dies in the fission case. It further means, however, that X has two Parfitian survivors, Y1 and Y2, which is, according to Parfit, as good as (or even better than) being identical with Y1 and/or Y2. This is the upshot of Parfit’s claim that what prudentially matters is psychological continuity: for all we should care, from a purely rational point of view, it is good enough for us to be psychologically continuous with one or more future persons and consequently it would be irrational for us to prefer our own continued existence to death by fission. Generally, according to Parfit, psychological continuity with any reliable cause matters in survival, and since personal identity does not consist merely in psychological continuity with any reliable cause, personal identity is not what matters in survival.

5. References and Further Reading

ANTHOLOGIES

  • Bermúdez, José Luis; Marcel, Anthony & Eilan, Naomi eds. (1995), The Body and the Self (Cambridge, MA & London: The MIT Press)
  • Blakemore, Colin & Greenfield, Susan eds. (1987), Mindwaves (Oxford: Blackwell)
  • Charles, David & Lennon, Kathleen eds. (1992), Reduction, Explanation, and Realism (Oxford: Clarendon)
  • Cockburn, David ed. (1991), Human Beings, Royal Institute of Philosophy Supplement, Vol. 29 (Cambridge University Press)
  • Dancy, Jonathan ed. (1997), Reading Parfit (Oxford: Blackwell)
  • Davies, Martin & Stone, Tony eds. (1995), Folk Psychology: The Theory of Mind Debate (Oxford: Blackwell)
  • Harris, Henry ed. (1995), Identity (Oxford: Clarendon)
  • Lovibond, Sabina & Williams, Stephen G. eds. (1996), Essays for David Wiggins: Identity, Truth, and Value (Oxford: Blackwell)
  • Macdonald, Graham F. ed. (1979), Perception and Identity: Essays Presented to A. J. Ayer, with His Replies (Ithaca, New York: Cornell University Press)
  • Martin, Raymond & Barresi, John eds. (2003), Personal Identity (Oxford: Blackwell)
  • Perry, John ed. (1975), Personal Identity (Berkeley & Los Angeles, CA: University of California Press)
  • Rorty, Amelie O. ed. (1976), The Identities of Persons (Berkeley & Los Angeles, CA: University of California Press)

BOOKS AND ARTICLES

  • Ayers, Michael (1991), Locke: Epistemology and Ontology, 2 vols. (London & New York: Routledge)
  • Baker, Lynne Rudder (1997), “Why Constitution Is Not Identity,” The Journal of Philosophy, Vol. 94, No. 12, 599-621
  • Baillie, James (1993), “Recent Work on Personal Identity,” Philosophical Books, Vol. 34, No. 4, 193-206
  • Behrendt, Kathy (2003), “The New Neo-Kantian and Reductionist Debate,” Pacific Philosophical Quarterly Vol. 84, No. 4, 331-50
  • Blackburn, Simon W. (1984), “Has Kant Refuted Parfit?,” in Dancy ed. (1997), pp. 180-201
  • Butler, Joseph (1736), “Of Personal Identity,” First Dissertation to The Analogy of Religion (reprinted in Perry ed. (1975), pp. 99-105)
  • Campbell, John (1992), “The First Person: The Reductionist View of the Self,” in Charles & Lennon eds. (1992), pp. 381-419
  • Cassam, Quassim (1989), “Kant and Reductionism,” Review of Metaphysics, Vol. 43, No. 1, 72-106
  • Cassam, Quassim (1992), “Reductionism and First-Person Thinking,” in Charles & Lennon eds. (1992), pp. 361-80
  • Cassam, Quassim (1993), “Parfit on Persons,” Proceedings of the Aristotelian Society, Vol. 93, 17-37
  • Cassam, Quassim (1997), Self and World (Oxford University Press)
  • Chisholm, Roderick M. (1976), Person and Object (Chicago & La Salle, IL: Open Court)
  • Crane, Tim (2001), Elements of Mind (Oxford University Press)
  • Doepke, Frederick C. (1996), The Kinds of Things: A Theory of Personal Identity Based on Transcendental Argument (Chicago & La Salle, IL: Open Court)
  • Evans, Gareth M. (1982), The Varieties of Reference, ed. John McDowell (New York: Oxford University Press)
  • Evans, Gareth M. (1985), Collected Papers, ed. Antonia Phillips (Oxford: Clarendon)
  • Garrett, Brian (1991), “Personal Identity and Reductionism,” Philosophy and Phenomenological Research, Vol. 51, No. 2, 361-73
  • Garrett, Brian (1995), “Wittgenstein and the First Person,” Australasian Journal of Philosophy, Vol. 73, No. 3, 347-55
  • Garrett, Brian (1998), Personal Identity and Self-Consciousness (London: Routledge)
  • Geach, Peter (1967), “Identity,” Review of Metaphysics, Vol. 21, No.1 (reprinted in his (1972), Logic Matters (Oxford: Blackwell), pp. 238-47)
  • Gordon, Robert M. (1995), “Folk Psychology as Simulation,” in Davies & Stone eds. (1995), pp. 59-73
  • Heal, Jane (1995), “Replication and Functionalism,” in Davies & Stone eds. (1995), pp. 45-59
  • Hirsch, Eli (1991), “Divided Minds,” The Philosophical Review, Vol. 100, No. 1, 3-30
  • Hume, David (1739), A Treatise of Human Nature, ed. Norton, David F. & Norton, Mary J. (Oxford University Press)
  • Johnston, Mark (1992), “Constitution Is Not Identity,” Mind, Vol. 101, No. 401, 89-105
  • Johnston, Mark (1997), “Human Concerns Without Superlative Selves,” in Dancy ed. (1997), pp. 149-79
  • Locke, John (1689), An Essay Concerning Human Understanding, ed. Woolhouse, Roger (London: Penguin, 1997)
  • Lowe, E. Jonathan (1991), “Real Selves: Persons as Substantial Kinds,” in Cockburn ed. (1991), pp. 87-108
  • Lowe, E. Jonathan (1996), Subjects of Experience (Cambridge University Press)
  • Martin, Raymond (1998), Self-Concern: An Experiential Approach to What Matters in Survival (Cambridge University Press)
  • McDowell, John (1997), “Reductionism and the First Person,” in Dancy ed. (1997), pp. 230-50
  • Merricks, Trenton (1998), “There Are No Criteria of Identity Over Time,” Noûs, Vol. 32, No. 1, 106-124
  • Moore, Adrian W. (1997), Points of View (Oxford: Clarendon)
  • Nagel, Thomas (1971), “Brain Bisection and the Unity of Consciousness,” Synthese, Vol. 22, 396-413
  • Nagel, Thomas (1986), The View From Nowhere (Oxford: Clarendon)
  • Noonan, Harold W. (1989), Personal Identity (London: Routledge)
  • Noonan, Harold W. (1993), “Constitution Is Identity,” Mind, Vol. 102, No. 405, 133-46
  • Nozick, Robert (1981), Philosophical Explanations (Oxford: Clarendon)
  • Olson, Eric T. (1997a), The Human Animal: Personal Identity Without Psychology (Oxford University Press)
  • Olson, Eric T. (1997b), “Relativism and Persistence,” Philosophical Studies, Vol. 88, No. 2, 141-62
  • Parfit, Derek A. (1971a), “Personal Identity,” The Philosophical Review, Vol. 80, No. 1, 3-27
  • Parfit, Derek A. (1971b), “On ‘The Importance of Self-Identity’,” The Journal of Philosophy, Vol. 68, No. 20, 683-90
  • Parfit, Derek A. (1976), “Lewis, Perry, and What Matters,” in Rorty ed. (1976), pp. 91-107
  • Parfit, Derek A. (1982), “Personal Identity and Rationality,” Synthese, Vol. 53, 227-41
  • Parfit, Derek A. (1984), Reasons and Persons (Oxford University Press; revised reprint, Oxford: Clarendon, 1987)
  • Parfit, Derek A. (1986), “Comments,” Ethics, Vol. 96, No. 4, 832-872
  • Parfit, Derek A. (1987), “Divided Minds and the Nature of Persons,” in Blakemore & Greenfield eds. (1987), pp. 19-26
  • Parfit, Derek A. (1995), “The Unimportance of Identity,” in Harris ed. (1995), pp. 13-45 (reprinted in Martin & Barresi eds. (2003), pp. 292-318)
  • Parfit, Derek A. (1999), “Experiences, Subjects, and Conceptual Schemes,” Philosophical Topics, Vol. 26, Nos. 1-2, 217-70
  • Peacocke, Christopher (1983), Sense and Content: Experience, Thought, and Their Relations (Oxford: Clarendon)
  • Perry, John (1972), “Can the Self Divide?,” The Journal of Philosophy, Vol. 69, No. 16, 463-88
  • Shoemaker, Sydney (1963), Self-Knowledge and Self-Identity (Ithaca, New York: Cornell University Press)
  • Shoemaker, Sydney (1970), “Persons and Their Past,” American Philosophical Quarterly, Vol. 7, No. 4, 269-85 (reprinted in Shoemaker (1984), pp. 19-48)
  • Shoemaker, Sydney (1984), Identity, Cause, and Mind (Cambridge University Press; expanded edition, Oxford University Press, 2003)
  • Shoemaker, Sydney (1985), “Critical Notice of Reasons and Persons,” Mind, Vol. 94, No. 375, 443-53
  • Shoemaker, Sydney (1997), “Parfit on Identity,” in Dancy ed. (1997), pp. 135-48 (revised version of his 1985)
  • Shoemaker, Sydney (1999), “Self, Body, and Coincidence,” Aristotelian Society Supplementary Volume 73, 287-306
  • Shoemaker, Sydney & Swinburne, Richard (1984), Personal Identity (Oxford: Blackwell)
  • Snowdon, Paul F. (1991), “Personal Identity and Brain Transplants,” in Cockburn ed. (1991), pp. 109-26
  • Snowdon, Paul F. (1995), “Persons, Animals, and Bodies,” in Bermúdez, Marcel & Eilan eds. (1995), pp. 71-86
  • Snowdon, Paul F. (1996), “Persons and Personal Identity,” in Lovibond & Williams eds. (1996), pp. 33-48
  • Strawson, Peter F. (1959), Individuals: An Essay in Descriptive Metaphysics (London & New York: Methuen)
  • Strawson, Galen (1999), “Self, Body, and Experience,” Aristotelian Society Supplementary Volume 73, 307-32
  • Swinburne, Richard G. (1973-4), “Personal Identity,” Proceedings of the Aristotelian Society, Vol. 74, 231-47
  • Thomson, Judith J. (1997), “People and Their Bodies,” in Dancy ed. (1997), pp. 202-29
  • Unger, Peter (1979), “I Do Not Exist,” in Macdonald ed. (1979), pp. 235-51
  • Van Inwagen, Peter (1990), Material Beings (Ithaca, New York: Cornell University Press)
  • Wiggins, David R. P. (2001), Sameness and Substance Renewed (Oxford University Press)
  • Wilkes, Kathleen V. (1988), Real People: Personal Identity Without Thought Experiments (Oxford: Clarendon)
  • Williams, Bernard A. O. (1956-7), “Personal Identity and Individuation,” Proceedings of the Aristotelian Society, Vol. 57, 229-52 (my references are to the reprint in Williams (1973), pp. 1-18)
  • Williams, Bernard A. O. (1970), “The Self and the Future,” Philosophical Review, Vol. 79, No. 2, 161-80 (reprinted in Williams (1973), pp. 46-63)
  • Williams, Bernard A. O. (1973), Problems of the Self: Philosophical Papers 1956-1972 (Cambridge University Press)
  • Williams, Bernard A. O. (1978), Descartes: The Project of Pure Enquiry (Harmondsworth: Penguin Books)
  • Williamson, Timothy (1994), Vagueness (London & New York: Routledge)
  • Wittgenstein, Ludwig (1922), Tractatus Logico-Philosophicus, transl. D.F. Pears & B.F. McGuinness (London: Routledge and Kegan Paul, 1961)
  • Wittgenstein, Ludwig (1953), Philosophical Investigations, transl. G.E.M. Anscombe (Oxford: Blackwell)
  • Wright, Crispin (1983), Frege’s Conception of Numbers as Objects (Aberdeen University Press)
  • Zimmerman, Dean W. (1998), “Criteria of Identity and the ‘Identity Mystics’,” Erkenntnis, Vol. 48, Nos. 2-3, 281-301

Author Information

Carsten Korfmacher
Email: C.Korfmacher.99 (at) cantab.net
Linacre College, Oxford University
United Kingdom

Supervenience and Determination

The term “supervenience” gained prominence in the twentieth century when it was suggested that moral properties supervene on natural properties and that our mental characteristics supervene on our physical characteristics such as the properties of our nervous system. The term can be defined as follows. For two sets of properties, A (the supervenient set) and B (the subvenient set or supervenience base), A supervenes on B just in case there can be no difference in A without a difference in B. Turning this principle on its head gives us the converse concept of determination: B determines A just in case sameness with respect to B implies sameness with respect to A. Supervenience and determination are simply two sides of the same coin.

From the basic definition initially presented, supervenience might seem a fairly innocuous principle, yet it has led a somewhat murky and controversial existence: some love it; some hate it. It was, for example, described by John Post as an “accordion word: indefinitely stretchable” (1984, p. 163). It has certainly been pulled about throughout its history, but it does have its limits. Indeed, others view it as too limited to be of any philosophical worth whatsoever. This article charts the history of the concept of supervenience, discusses the current panoply of definitions, and reviews some of the more tractable portions of the contemporary debate. The primary aim is to gain a feel for the basic concept without getting bogged down with the more formal and abstruse aspects of supervenience. The aim of this first section is to get to grips with the core idea of supervenience, and see some of the contexts in which it has been and might be used.

Table of Contents

  1. Getting to Grips with Supervenience
  2. The Recent History of Supervenience
  3. The Unlovely Proliferation of Formulations
  4. Supervenience and Causation
  5. Reduction, Emergence, and Multiple Realization
  6. Adding Mystery to Mystery?
  7. References and Further Reading

1. Getting to Grips with Supervenience

As David Lewis puts it, “We have supervenience when there could be no difference of one sort without differences of another sort” (1986, p. 14). For example: no difference in an individual’s mental characteristics without some difference in physical characteristics; no difference in a computer’s program without a difference in the computer’s circuitry; no difference in the economy without some difference in the behavior of its underlying economic agents; no difference in the temperature of a gas without some difference in the behavior of the molecules forming it, and so on. But notice that there can be differences in the neurons, circuitry, agents, and molecules without a difference in mental, computational, economic, and thermal properties.

The idea in each of the above cases is that some property A (or family of properties) is “determined” by some other properties B, whose bearers do not themselves possess the property A, and that A does not reduce to B (though this is a controversial point, as we shall see): individual neurons don’t possess mental characteristics; circuits don’t possess computational properties; individual agents don’t possess economic properties; and individual molecules don’t have temperatures. The intent is to avoid the stronger relations (such as identity or definability) between the types of property, generally because it often isn’t clear how there could be such strong relations holding them together. Part of the reason for this, and one prime motivation for supervenience, is that mental, computational, economic, and thermodynamic characteristics are “multiply realizable”: the same properties might be realized by very different underlying physical configurations or stuff. However, the supervenience relation needs to be strong enough to support a kind of non-symmetric dependence between two levels of property, such that a “lower” level determines a “higher” level. This feature may give rise to the notion of “levels of dependence” and, in certain cases, “hierarchical organization”: the mental is at a higher “level,” is higher up the hierarchy, from the physical; the economy is at a higher level than the economic agents, and so on.

This hierarchy of levels charts out a progression of ontological dependence too: without the physical stuff of neurons, circuits, people, and molecules (or something like them), the higher level states would not exist at all. This feature thus makes supervenience a useful tool in analyzing relations between the subject matter of distinct theoretical disciplines, such as the relation between physics and biology. It is, more generally, useful in analyzing relations between things that are connected (correlated) in a way that doesn’t suggest reduction or identity. However, note that levels are not a generic feature of supervenience. Consider the case of the relationship between the length of the sides of a square and the area of the square. There is, in both directions, no difference in one without a difference in the other, and once the sides (respectively, area) are fixed the area (respectively, length) is fixed. So we have a clear case of supervenience. But this is a symmetric case, and so the notion of a level of dependence or hierarchy makes no sense; it only makes sense when the relation is asymmetric, and these make for the most philosophically interesting cases.
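The contrast between the symmetric and the asymmetric case can be made concrete over a small finite domain. The sketch below is illustrative only: the helper `supervenes`, the sample squares, and the toy “gas” samples (tuples of molecular speeds, with mean squared speed standing in for temperature) are all assumptions introduced here, and checking a handful of items is of course much weaker than the general claim.

```python
from itertools import combinations


def supervenes(items, a_prop, b_prop):
    """Return True if no two items differ in A without differing in B.

    Equivalently: every pair of items that agrees on b_prop also
    agrees on a_prop (sameness in B implies sameness in A).
    """
    return all(
        a_prop(x) == a_prop(y)
        for x, y in combinations(items, 2)
        if b_prop(x) == b_prop(y)
    )


# Symmetric case: squares, represented by their (positive) side length.
squares = [1, 2, 3, 2]


def side(s):
    return s


def area(s):
    return s * s


# Asymmetric case: hypothetical gas samples, each a tuple of molecular speeds.
gases = [(1, 3), (3, 1), (2, 2)]


def micro_state(g):
    return g  # the full molecular description


def temperature(g):
    # Toy stand-in for temperature: mean squared speed (an assumption).
    return sum(v * v for v in g) / len(g)


print(supervenes(squares, area, side))             # area on side length
print(supervenes(squares, side, area))             # side length on area
print(supervenes(gases, temperature, micro_state)) # temperature on molecules
print(supervenes(gases, micro_state, temperature)) # molecules on temperature
```

With this sample data the first three checks succeed and the last fails: the samples (1, 3) and (3, 1) have the same toy temperature but different molecular descriptions, which is the multiple-realizability pattern that makes the dependence run in one direction only.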

But, before we get bogged down with such details, what is the basic idea of supervenience? It is perhaps best understood by means of a colorful example. To this end, let us begin by adapting a simple story presented by Paul Teller (1983). Teller asks us to imagine a bunch of watches churned out of an assembly line in the same state, so that they are functionally and qualitatively (at least, in terms of their intrinsic properties) identical—clearly the watches will register the same time. The properties having to do with the physical makeup of the watches—their structure and composition, and so on—give us our B set of properties (the subvenient set). The supervenient A set has to do with the time-keeping properties of the watches—for example, whether they enable their owners to get into work on time, and so on. In this case, as Teller points out, the A properties of some particular watch will be the same as any other watch from the assembly line since they have the same physical makeup (B properties), and that is all that counts towards the A properties in this story. Being a good timekeeper supervenes on the physical makeup of the timekeeping device: one could not alter the time-keeping properties of the watches without altering their underlying structural and compositional properties. Moreover, any two devices that share their physical makeup will either both be good or both be bad timekeepers. That is to say, the physical make-up of a watch determines its time-keeping properties.

Though this captures much of the basic idea as encapsulated in our opening definition (which we can abbreviate to “no A-difference without a B-difference”), it misses one very crucial detail: modal impact. Supervenience is not intended to be a contingent “matter of actual fact” claim concerning two sets of properties that happen to be correlated at some particular time or place. Rather, it is intended to cover any situation involving A and B, covering any time, place, and world—though there will be natural restrictions concerning which worlds are to be included here (for example, logically possible [so that all logically coherent, non-contradictory worlds are considered], nomologically possible [so that all worlds permitted by the laws of physics are considered], and metaphysically possible [considering a class of worlds somewhere between the logically possible and the nomologically possible ones]). Different restrictions give different strengths. In our example, we should have to extend our story to include all possible watches that are indistinguishable in terms of their B-properties, including those inhabiting distinct worlds (from alien worlds and Twin-Earths, perhaps to worlds with different laws of physics). This additional modal aspect results in a profusion of distinct formulations that aim to adequately capture the fundamental notion of supervenience. Further proliferation results from the question of what are to be the objects that have the properties that enter into the supervenience/determination relation. Supervenience is, then, clearly far from innocuous!

2. The Recent History of Supervenience

Jaegwon Kim (1993, p. 131) notes that the term “supervenience” was in currency as far back as 1594. In its vernacular sense it means to “[come upon] a given event as something additional and extraneous (perhaps as something unexpected)” (ibid., p. 132). However, the concept of supervenience, as a philosophical term of art, is generally acknowledged to be traceable to G.E. Moore’s work on value theory, and thence to R.M. Hare’s work on meta-ethics, in which the term ‘supervenience’ was introduced into the philosophical literature. There it languished for some time, before being unearthed by Davidson, who applied it to the ‘mental-physical’ relationship. Let us review some central points from this historical development.

In “The Conception of Intrinsic Value” Moore writes that:

…if a given thing possesses any kind of intrinsic value in a certain degree, then not only must that same thing possess it, under all circumstances, in the same degree, but also anything exactly like it, must, under all circumstances, possess it in exactly the same degree. … it is not possible that of two exactly similar things one should possess it and the other not, or that one should possess it in one degree, and the other in a different one.

(Moore 1922, p. 261)

This sentiment is virtually parroted by Hare, this time specifically utilizing the term “supervenience” to describe the relation between certain natural (non-moral, physical) and moral properties, giving us ‘moral supervenience’:

…let us take that characteristic of “good” which has been called its supervenience. Suppose that we say ‘St. Francis was a good man.’ It is logically impossible to say this and to maintain at the same time that there might have been another man placed exactly in the same circumstances as St. Francis, and who behaved in exactly the same way, but who differed from St. Francis in this respect only, that he was not a good man.

(Hare 1952, p. 145)

Before we continue with the historical matters, let us briefly pause to consider what this means. Again, let’s give a simple example. Imagine we draw up a pair of catalogues of the properties of two people, Saint Francis and Faint Srancis. The properties of Saint Francis are, say, kindness, bravery, niceness, neighborliness, and goodness. Faint Srancis’ properties differ from Saint Francis’ only in that the last property, goodness, is missing from his catalogue. Suppose, instead, that he has the property “badness” in its place. Now, according to the moral supervenience thesis espoused by Hare, this is simply not a genuinely possible state of affairs. All of the other properties, minus goodness, serve to fix or determine the property of goodness. It is just not possible that there be two such individuals differing in this way (whether they occupy the same world or not). Therefore, in possessing all of Saint Francis’ properties up to, but not including, goodness, Faint Srancis must thereby possess the property of goodness too. This is what is meant in saying that the property of goodness supervenes on a family of natural properties not including goodness. (Note that this matches Stalnaker’s, 1996, p. 87, preferred definition of supervenience: “To say that the A-properties or facts are supervenient on the B-properties or facts is to say that the A-facts are, in a sense, redundant, since they are already implicitly specified when one has specified all the B-facts.”) Let us now return to the historical path of the concept.

As Kim and others have pointed out, it seems that both some version of the concept and the term ‘supervenience’ were in operation before Moore’s and Hare’s usage in the context of the British Emergentist School. The emergentist’s understanding of supervenience, being more in line with the vernacular sense, does not match the current understanding as well as Moore’s and Hare’s. See McLaughlin 1992 for an excellent analysis. Indeed, supervenience, as a concept, most likely has much earlier roots than this, and one can readily find examples (or approximations, at least) littered throughout the history of philosophy. Leibniz’s theory of space and time might be one such example, with spatial and temporal properties supervenient on non-spatial and non-temporal events. Hume’s theory of causation might be another example, with cause and effect supervening on sequences of events that do not have causal properties. However, for the purposes of a cleaner exposition we will stick with the orthodox historical trajectory of supervenience. Not many philosophers initially picked up on Hare’s use of supervenience, but new life was breathed into it when Donald Davidson (1970) utilized it to provide some of the support for his anomalous monism. For example, in an oft-quoted passage he writes:

Although the position I describe denies there are psychophysical laws, it is consistent with the view that mental characteristics are in some sense dependent, or supervenient, on physical characteristics. Such supervenience might be taken to mean that there cannot be two events alike in all physical respects but differing in some mental respect, or that an object cannot alter in some mental respect without altering in some physical respect.

(Davidson 1970, p.214)

Davidson uses this supervenience relation to defend a non-reductive, but nonetheless non-dualist, position with regard to the way in which the mental stands to the physical (that is, psychophysical supervenience). Though the mental is certainly dependent upon the physical, in the sense that the physical determines the mental, it cannot be reduced to it since there are no psychophysical laws while there are, of course, physical laws:

[P]sychological characteristics cannot be reduced to the others, nevertheless they may be (and I think are) strongly dependent on them. Indeed, there is a sense in which the physical characteristics of an event (or object or state) determine the psychological characteristics…

(Davidson 1973, p. 716)

Once supervenience entered the mainstream literature via Davidson, other philosophers (Jaegwon Kim in particular) began to focus on it as an object of study in its own right. The 1984 Spindel conference saw the beginnings of much of this new direction (see Horgan (ed.), 1984, which is required reading for those wishing to gain a deeper appreciation of the foundations of supervenience). This trend shows no signs of letting up, though there is certainly some increased negativity about the concept’s usefulness and significance. A large part of the perceived problem with supervenience is that there is no unique, agreed-upon formulation of it. Instead there are many distinct formulations. However, this might not be such a bad thing; different jobs may require different tools. It is entirely possible that the fortunes of supervenience will reverse with the coming of age of the so-called “science of complexity,” for this involves direct consideration of the relationship between levels in hierarchies whereby a higher level is generated by the level below; it also involves many of the “special sciences.” Supervenience might thus provide the required conceptual framework to make sense of this feature of complex systems. It has, for example, been endorsed by Elliott Sober (1993) as the best way of understanding the biological concept of “fitness,” the idea being that fitness is something exhibited by very different species and individuals in relation to very different environments.

3. The Unlovely Proliferation of Formulations

We come now to the “embarrassment of riches” issue concerning the formulation of supervenience—the problem of there appearing to be too many possible formulations. David Lewis refers to this as an “unlovely proliferation” (1986, p.14). The proliferation arises simply in trying to pin down what is meant by supervenience in a precise way. The core idea that a formulation needs to capture is that fixing some one set of properties fixes some other property (or properties). The first distinction we meet is that between weak and strong supervenience. These can be stated simply enough in plain English as follows:

[Weak-SV]: For any possible world w, B-duplicates in w are A-duplicates in w.

[Strong-SV]: For any possible worlds w and w*, B-duplicates (x and y) in w and w* respectively are A-duplicates in w and w* respectively.

So, for example, according to Weak-SV, if we (perhaps here on our ‘plain vanilla’ Earth) managed to create a Star-Trek style replication machine and proceeded to replicate the physical makeup of a person P, generating a copy P_rep, then P and P_rep would share their mental characteristics too: “same worldly” physical duplicates are also mental duplicates. To understand Strong-SV we simply imagine that some Twin-Earthlings (in another possible world) got hold of an exact blueprint of P and are sufficiently advanced to be able to create a physical replica. Once again P and P_rep are mental duplicates since they are physical duplicates. (By simply setting w = w*, and assuming the same types of worlds, we see that Strong-SV implies Weak-SV, but not vice versa.)
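The contrast between the two formulations can be made concrete with a toy model. The following is only a minimal sketch under stated assumptions: worlds are represented as dictionaries from individuals to a pair of property profiles, and the names used (`weak_sv`, `strong_sv`, `P_twin`, `brain-state-17`) are illustrative inventions, not part of any standard formalism. It shows a model in which Weak-SV holds but Strong-SV fails.

```python
from itertools import product

# A world maps each individual's name to a pair (b_profile, a_profile).
# Profiles are frozensets of property names; all names are hypothetical.

def weak_sv(worlds):
    """Weak-SV: within each single world, B-duplicates are A-duplicates."""
    return all(
        a1 == a2
        for w in worlds
        for (b1, a1), (b2, a2) in product(w.values(), repeat=2)
        if b1 == b2
    )

def strong_sv(worlds):
    """Strong-SV: across any pair of worlds, B-duplicates are A-duplicates."""
    return all(
        a1 == a2
        for w, v in product(worlds, repeat=2)
        for (b1, a1), (b2, a2) in product(w.values(), v.values())
        if b1 == b2
    )

brain = frozenset({"brain-state-17"})  # an assumed physical (B) profile

# On Earth, P and its replica share both physical and mental profiles.
earth = {
    "P": (brain, frozenset({"conscious"})),
    "P_rep": (brain, frozenset({"conscious"})),
}
# On Twin-Earth, a physical duplicate lacks the mental property.
twin_earth = {"P_twin": (brain, frozenset())}

worlds = [earth, twin_earth]
print(weak_sv(worlds))    # True: no within-world violation
print(strong_sv(worlds))  # False: P and P_twin differ mentally
```

Because `strong_sv` also compares each world with itself, any model that satisfies it automatically satisfies `weak_sv`, which mirrors the remark that setting w = w* recovers the weak formulation.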

The difference between Weak and Strong supervenience, then, simply boils down to their respective modal strengths. One world is quantified over in the former, with objects compared within a world, while all worlds (subject to some restriction) are quantified over in the latter, with objects compared across worlds. For this reason Jackson (1998, p. 9) refers to these types as “intra-world” and “inter-world” supervenience respectively. Clearly the weak formulation cannot support basic counterfactuals of the form “if there were some B-duplicate of some object, then it would be an A-duplicate too.” Without this ability, Weak-SV is pretty much useless, for some dependency might be purely accidental. For example, it is perfectly consistent with Weak-SV that there be a world physically identical to ours yet with no conscious beings. (Though, of course, if one wants to describe such possibly accidental relations then Weak-SV might indeed be the right tool for the job.) Note also that Weak-SV does not tell us that a certain group of B-properties makes one morally good, or a piece beautiful, or a piece of matter alive. All Weak-SV tells us is that B-twins are A-twins; it does not tell us whether B-twins are one way or the other morally speaking, for example, just that whatever goes for one goes for the other. Hence, it fails to accomplish the task we set it: namely, to encode a notion of dependence and determination. Strong-SV gets around this problem of course, but it has its own problems. Suppose that there are two individuals, Fred and Ted, inhabiting worlds w and w* respectively. Let Fred and Ted be “almost” B-duplicates, differing only in one single trivial B-property: suppose one is wearing aftershave and the other is not. Strong-SV then permits Fred to be conscious while Ted is not, all because Ted didn’t remember to put aftershave on!

There are alternative “modal operator” [MO] versions of the weak and strong formulations of supervenience. Again in “plain” English, these are:

[MO-Weak-SV]: Necessarily, if anything has property F in A, then there is some property G in B such that the thing has G, and whatever has G has F.

[MO-Strong-SV]: Necessarily, if anything has property F in A, then there is some property G in B such that the thing has G, and necessarily whatever has G has F.

The only difference between strong and weak here is that the strong formulation features an additional necessity operator. What these definitions amount to is this. Weak supervenience holds at any world (given restrictions on the class of worlds): once a world is selected, one compares B-duplicates in that world and sees whether they are A-duplicates; if weak supervenience is true, they will be. Strong supervenience holds at any world (again, given restrictions on the allowable worlds): once a world is selected, it follows that objects in the initially selected world and in any world accessible from it that are B-duplicates will be A-duplicates; hence, one can compare cross-world cases. The modal operator versions capture something that the possible worlds formulations miss, namely that possession of a supervenient property demands that a subvenient one be had as well. So, in the possible worlds formulation, two things can be B-duplicates by not possessing any B-properties (that is, if they have exactly zero B-properties)! Not so in the modal operator versions.
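For readers who prefer symbols, the two modal operator formulations can be rendered roughly as follows. This is one common way of symbolizing them, offered as a sketch rather than as the canonical statement: the box stands for necessity (of whatever strength is under discussion), F ranges over the properties in A, G over the properties in B, and x and y over individuals.

```latex
% A and B are sets of properties; \Box is the necessity operator.
\[
\textbf{[MO-Weak-SV]}\quad
\Box\,\forall x\,\forall F \in A\,
  \bigl[\,Fx \rightarrow \exists G \in B\,
    \bigl(Gx \wedge \forall y\,(Gy \rightarrow Fy)\bigr)\bigr]
\]
\[
\textbf{[MO-Strong-SV]}\quad
\Box\,\forall x\,\forall F \in A\,
  \bigl[\,Fx \rightarrow \exists G \in B\,
    \bigl(Gx \wedge \Box\,\forall y\,(Gy \rightarrow Fy)\bigr)\bigr]
\]
```

The two formulas differ only in the inner box governing the conditional from G to F, which is the additional necessity operator mentioned above.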

Another distinction concerns that between Weak-SV and Strong-SV, taken as a pair, and Global supervenience, which we can write as:

[Global-SV]: Possible worlds w and w* that are B-duplicates are also A-duplicates.

Thus, whereas Weak-SV and Strong-SV concern the properties of individual objects (within a world and potentially across worlds respectively), Global-SV concerns whole possible worlds and the pattern of properties distributed over them. One might wish for such a formulation to capture certain philosophical theses, such as physicalism (roughly: fixing the physical facts fixes everything), Humean supervenience (roughly: everything is fixed by the spatiotemporal distribution of local intrinsic properties), or determinism (roughly: everything in the future is fixed by the present, and perhaps past, facts), which involve worlds (or ‘world segments’) taken as individual objects. In each formulation, though, we can distinguish between cases with differing modal force by quantifying over different types of possible world (that is, by imposing different accessibility relations on the set of worlds). An accessibility relation is just a binary relation R_Mod(w, w*) holding between pairs of worlds, w and w*, so that R_Mod(w, w*) is true whenever w* satisfies the same Mod-laws (of physics, logic, and so forth) as w. If you’re only bothered about relations satisfying our laws of physics, then you will only want to consider the nomologically possible worlds, in which case R_Nom(w, w*) holds whenever w* follows the same physical laws as w. If you want to go beyond our laws, then quantification over the metaphysically possible worlds is more appropriate (one needs to ‘expand’ the accessibility relation).

There is some confusion in spelling out what is meant in saying that worlds are B-duplicates. Does it mean that the worlds may differ in other ways, so long as they do not differ with respect to B-properties? For example, might we consider two worlds B-duplicates where one world, but not the other, has ghosts (with C-properties)? If they are B-duplicates, and B-properties account for all there is, and the worlds contain the same individuals, then what distinguishes such worlds? These issues can cause problems when one tries to put supervenience to work. Moreover, Global-SV faces a similar problem to that mentioned with regard to Strong-SV. So long as two worlds are not B-duplicates they can differ in any way you like with respect to their A-properties. For example, if one single atom is out of place, then this could mean that one world has conscious beings and the other world only has zombies!

A further distinction is to be made between “single domain supervenience” and “multiple domain supervenience.” The difference here concerns whether we wish to consider the A- and B-properties associated with the same or with different things respectively. In the latter, multiple domain case, one would look at those cases where there cannot be A-differences in one thing without a B-difference in some other distinct thing. Thus, weak and strong are clearly single domain formulations. The multiple domain account has several applications. Consider, for example, the problem of material composition (for example, the way a statue stands to the lump of clay out of which it is composed). Those who believe that the statue and the clay literally coincide (sharing their spatial boundaries at a time, if not for all time; indeed, their divergent histories are what make them different, and they can also differ in their modal properties, so that they satisfy different counterfactuals) will want to say that the statue supervenes on the clay. But since these are two different things, according to the coincidence advocate, we will need a multiple domain account. For the same reasons, those who view societies, or other similar structures, as separate objects, autonomous from the individuals from which they are composed, will need a multiple domain account if they wish to say that social properties supervene on the properties of the underlying individuals. (One can also formulate “local” or “regional” supervenience, which restricts the supervenience relation to a spacetime region within a world, rather than some concrete object within a world. Again, this splits into weak and strong versions.)

There is something of a cottage industry devoted to spelling out the various entailment relations between the various formulations. We saw that Strong-SV implies Weak-SV, and it looks like Strong-SV implies Global-SV too. The converse is trickier: given a certain understanding of the properties involved, the two become equivalent. In general, however, equivalence is ruled out by a simple counterexample (due to Petrie): Suppose we have two worlds w and w*, two sets of properties A = {S} and B = {P}, and two individuals x and y (and no more) in world w, and x* and y* (and no more) in world w*. The world w is characterized by the following distribution of properties over its individuals: Px, Sx, Py, ~Sy. World w*, meanwhile, is characterized by the distribution: Px*, ~Sx*, ~Py*, and ~Sy*. Clearly, strong supervenience is ruled out by this model since x and x* are B-duplicates but not A-duplicates. But this isn’t incompatible with global supervenience because the worlds are not B-duplicates, so A-duplication is irrelevant. The fact that this model is consistent with global supervenience yet inconsistent with strong supervenience is enough, says Petrie, to show that they are not equivalent. There are objections to this argument, but we shan’t go into these matters here. Let us instead turn to some controversial issues that arise in contemporary debates.
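Petrie’s model is small enough to check mechanically. The sketch below encodes the two worlds exactly as described and tests both formulations. One simplifying assumption should be flagged: two worlds are here counted as B-duplicates (or A-duplicates) when they contain the same number of individuals with each B-profile (or A-profile), which stands in for the more careful notion of a property-preserving one-to-one mapping between worlds. The function names are illustrative.

```python
from collections import Counter
from itertools import product

P, S = "P", "S"  # B = {P}, A = {S}

# Each individual gets a pair (b_profile, a_profile), as in Petrie's model.
w = {
    "x": (frozenset({P}), frozenset({S})),   # Px, Sx
    "y": (frozenset({P}), frozenset()),      # Py, ~Sy
}
w_star = {
    "x*": (frozenset({P}), frozenset()),     # Px*, ~Sx*
    "y*": (frozenset(), frozenset()),        # ~Py*, ~Sy*
}
worlds = [w, w_star]

def strong_sv(worlds):
    """Across any pair of worlds, individual B-duplicates are A-duplicates."""
    return all(
        a1 == a2
        for u, v in product(worlds, repeat=2)
        for (b1, a1), (b2, a2) in product(u.values(), v.values())
        if b1 == b2
    )

def pattern(world, index):
    """Multiset of B-profiles (index 0) or A-profiles (index 1) in a world."""
    return Counter(profiles[index] for profiles in world.values())

def global_sv(worlds):
    """Worlds with the same B-pattern have the same A-pattern (simplified)."""
    return all(
        pattern(u, 1) == pattern(v, 1)
        for u, v in product(worlds, repeat=2)
        if pattern(u, 0) == pattern(v, 0)
    )

print(strong_sv(worlds))  # False: x and x* both have P, but only x has S
print(global_sv(worlds))  # True: w and w* are not B-duplicates
```

The global check passes only vacuously, since the antecedent is never satisfied for the pair w and w*, which is the feature of the model that the objections mentioned above tend to target.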

4. Supervenience and Causation

Supervenient properties are often those to which we wish to attach causal powers. For example, we speak of mental effects arising from mental causes, and even of physical effects arising from mental causes. If one thinks of an old love it may cause one to feel sad, or have some other emotion. It may cause one to cry. But the mental supervenes on the physical, which means that the physical fixes the mental. So both mental causes and mental effects are supervenient on some physical conditions. But then the mental cause is irrelevant here since the physical conditions are sufficient to bring about the effect. At best, the mental effect is over-determined by the mental and physical causes. At worst, this leads to epiphenomenalism about mental properties. Presumably the ground of the supervenience relation will be relevant here.

If the supervenient properties are understood as emergent, then it is possible that some “global” properties, to do with a whole system, can causally affect other things, and its parts (the supervenience base). For example, a group of agents can interact to generate an economy, but the economy has properties of its own (prices, interest rates, and such like); these will be able to influence how the agents behave. In other words, there is the possibility of a ‘feedback loop’ from global to local. Such a possibility would appear not to be available in the case of a “mereological” grounding of a supervenience relation, according to which the whole is just identified with the sum of its parts. In the former case, the whole is supposed to be somehow more than the sum of its parts (due to the non-linear nature of the interactions between the parts). But, nonetheless, in both cases, once we fix the subvenient properties, we fix the supervenient ones too. However, there are very problematic causal issues involved in the case with a feedback loop where we would appear to have “downward causation” so that the supervenient properties constrain and even modify the subvenient ones. The existence of a “preferred direction” to the relation seems to have been lost in such cases. This is an interesting topic in need of much further work, but we cannot pursue it further here.

5. Reduction, Emergence, and Multiple Realization

Reductionism is as old as philosophy itself. The ancient Greek cosmologists each defended what appear to be reductive theories according to which everything that exists is made up of some single fundamental element or a group of such elements. Most apt here is the version of atomism given to us by Leucippus and Democritus according to which all things, including secondary qualities, souls, and thoughts, were reduced to atoms moving in the void. But there are some things that, it seems, are not easily reducible. Take Beethoven’s Fifth Symphony. How does one reduce this? To a sound structure (that is, a sequence of sounds)? If so, then many different sound structures can realize it, on CDs, vinyl, a badly tuned piano, and so on. This piece of music is, then, multiply realizable (there is a many-to-one relationship between the subvenient realizations and the supervenient property). We might also consider some “higher order” properties of musical works, say “being a grand piece of music.” This property too is multiply realizable: there are many ways to be a grand piece of music. This seems to rule out reduction, at least to a unique sound structure. But, and here we return to Hare’s example, if there are two indistinguishable realizations, then if one is a grand work of music, the other cannot fail to be. The grandeur is determined by the sound structure (we are, of course, assuming that grandeur is a property intrinsic to a work; otherwise one and the same sound structure could be both grand and not grand).

This multiple realizability lies at the core of supervenience’s job, namely, to describe a dependency weaker than identity and reduction. The idea is that fixing the physical properties of the work of music (the tones, durations, intensities, and so on) suffices to fix any and all aesthetic properties the piece might have. But then the idea of emergence amounts to the claim that these aesthetic properties (and similar higher-level properties) are not reducible to the physical ones; they are something “novel” arising from the physical organization. (The distinction between physical and non-physical properties here amounts to both the fact that the latter type can be had by many objects with different natures and constitutions, and the fact that the former type obey the laws of, possibly complete, physics. However, nothing said here hinges on this distinction; one might as well say that aesthetic properties are physical too, since they occupy the world. Thus, this is just a way of speaking to label a curious fact, namely that some properties seem not to be reducible to what are standardly taken to be unproblematic ‘physical’ properties, such as mass, charge, spin, and so on.) Dualism and epiphenomenalism are avoided (1) because the physical facts are needed to fix the emergent facts and (2) because the emergent properties are supposed to be causally efficacious: the beauty of the Adagio from Mahler’s Fifth Symphony can cause a person to cry; it isn’t the durations, intensities, and pitches of the sounds that are causally responsible. One might, though, conceivably take a hard line here and argue that it is precisely the physical (subvenient) properties that cause the tears. (It must be understood that causation is far from simple in these contexts, as we saw in the previous section.)

In an early and pioneering work on supervenience and determination, in the context of a defense and formulation of physicalism, Hellman and Thompson were concerned with separating out supervenience from reduction. Physicalism can be understood simply as follows: When God made the World, did he just have to fix the facts regarding the elementary particles and the forces (the B-properties) and all the rest (the A-properties: colors, qualia, aesthetic properties, moral properties, and so forth) followed from that, or did he have to then attach all the rest? A physicalist will answer Yes to the former question. Supervenience, or rather determination, is supposed to support the affirmative answer, for it says precisely that the B-properties determine the A-properties. Hellman and Thompson wanted to show that supervenience is neutral in respect of reduction between supervenient and subvenient levels of properties.

Why might we wish to defend the view that supervenience is non-reductive? One reason, as we have seen, is to capture a notion of ontological dependence—say of the mind on physical brain states or processes—without eliminating the mind, or identifying the mind with the brain states. The problem with such a view is that prima facie it appears to let in ‘unphysical’ properties, that either amount to dualism or epiphenomenalism. There is certainly a problem in making ontological sense of supervenient properties, but one needn’t espouse either dualism or epiphenomenalism if one is committed to a supervenience thesis. For all that is being said is that fixing some one set of facts fixes some others. However, there is an argument that attempts to demonstrate that supervenience is reductive. Let us consider this argument, and then present one against reduction.

The argument is given in Kim’s “Supervenience and Nomological Incommensurables”. In capsule form, it goes as follows: Suppose we have two sets of properties, P (for physical) and S (for special, as in special science). Let s be a property in S and let p_1, …, p_n be the list of properties contained in P. Define the q_i to be the maximally conjunctive properties that can be built from the p_i (where the maximally conjunctive condition means that, for each p_i, either p_i or its negation is a conjunct of each q_j). If S is supervenient on P then any pair of objects that share some q_i must both possess s or both lack s. Now, let D be the disjunction of all of those q_i such that if an object has q_i then it has s too. Since every object has exactly one of the q_i, this implies that possession of an S property is equivalent to possession of a P property. In other words, for all x, x has s if and only if x has D (in shorthand: ∀x, s(x) iff D(x)). This, of course, is tantamount to a reduction of S to P, for the claim is that every higher level, supervenient, property is coextensive with some Boolean complex of lower level, subvenient, properties, say a long (possibly infinite) disjunction of properties. Thus, any two objects with the supervenient property A must possess the very same subvenient property B, but B is a very complex property that will involve an exhaustive list of the ways that A could be had by any object.
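The construction in this argument can be traced through a finite toy case. The following is only an illustrative sketch: the base properties `p1` and `p2`, the objects, and the function names are all hypothetical, and the model covers a single domain of objects, so it ignores the modal force the real argument requires. It builds each object’s maximal conjunctive property, forms the disjunction D, and checks that having s coincides with having D whenever supervenience holds.

```python
P = ["p1", "p2"]  # hypothetical base (subvenient) properties

# Each object is described by (set of P-properties it has, whether it has s).
objects = {
    "a": ({"p1"}, True),
    "b": ({"p1"}, True),
    "c": ({"p1", "p2"}, False),
    "d": (set(), False),
    "e": ({"p2"}, True),
}

def q_of(base):
    """The maximal conjunctive property an object satisfies.

    For each p in P the tuple records whether the object has p (True)
    or its negation (False), so every object satisfies exactly one q.
    """
    return tuple(p in base for p in P)

def supervenes(objs):
    """s supervenes on P: objects sharing a q never differ over s."""
    seen = {}
    for base, has_s in objs.values():
        if seen.setdefault(q_of(base), has_s) != has_s:
            return False
    return True

def disjunction_d(objs):
    """D: the set of maximal properties q whose bearers have s."""
    return {q_of(base) for base, has_s in objs.values() if has_s}

def coextensive_with_d(objs):
    """Check that, for every object x, s(x) holds iff D(x) holds."""
    d = disjunction_d(objs)
    return all(has_s == (q_of(base) in d) for base, has_s in objs.values())

print(supervenes(objects))          # True
print(coextensive_with_d(objects))  # True
```

If supervenience fails (say, two objects share `p1` alone but differ over s), the coextension check fails as well, which is why the argument depends on the supervenience premise.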

Hellman and Thompson’s strategy is to disallow infinite conjunctions and disjunctions of properties, thereby blocking the route to the infinitely complex properties that Kim’s argument let in, and therefore blocking the route to reduction. However, while an outright ban on such properties may be otherwise well motivated, it is too ad hoc in this case. A more promising approach to stop Kim’s argument is simply to deny that the kind of Boolean operations that Kim utilizes to generate new properties result in genuine properties. One might apply this strategy either to negations of properties, disjunctive properties, conjunctive properties, or some combination of these (see McLaughlin’s article “Varieties of Supervenience”).

In his “Reduction of Mind” Lewis speaks of supervenience as a reductive principle, going somewhat against the philosophical grain. As a build up he writes:

I hold, as an a priori principle, that every contingent truth must be made true, somehow, by the pattern of coinstantiation of fundamental properties and relations [that is, occurring all together]. The whole truth about the world, including the mental part of the world, supervenes on this pattern. If two possible worlds were exactly isomorphic in their patterns of coinstantiation of fundamental properties and relations, they would thereby be exactly alike simpliciter.

(Lewis 1994, p.292)

Lewis adds to this that all the fundamental properties and relations are physical, so that a materialist thesis is generated from the supervenience—the position amounts, more or less, to a statement of his “Humean Supervenience;” the claim that “All there is to the world is a vast mosaic of local matters of fact…And that is all” (1986, p.ix-x) so that “truth supervenes on being” (1994b, p.225). But how can supervenience be reductive? Lewis gives the following example:

Imagine a grid of a million tiny spots – pixels – each of which can be made light or dark. When some are light and some are dark, they form a picture, replete with interesting gestalt properties. The case evokes reductionist comments. Yes, the picture really does exist. Yes, it really does have those gestalt properties. However, the picture and the properties reduce to the arrangement of light and dark pixels. They are nothing over and above the pixels. They make nothing true that is not made true already by the pixels. They could go unmentioned in an inventory of what there is without thereby rendering that inventory incomplete. And so on.

(Lewis 1994, p. 294)

Such comments Lewis happily endorses: “The picture reduces to the pixels. And that is because the picture supervenes on the pixels” (loc. cit.). Lewis’ position here stems from the fact that the supervenience relation is (in this case, at least) non-symmetric and relates large to small—though it isn’t at all obvious that this is sufficient for reduction.
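
Lewis’s pixel example can be made concrete on a toy scale. The sketch below is illustrative only: the 2×2 grid, the “light row” property, the picture titles, and all function names are assumptions introduced here, and supervenience is checked over a small stipulated domain of pictures rather than over possible worlds. It shows a property that cannot vary without a difference in the pixels, alongside one that can.

```python
from itertools import product

def has_light_row(picture):
    """Hypothetical gestalt property: some row of pixels is entirely light."""
    return any(all(p == 1 for p in row) for row in picture["pixels"])

def is_titled_dawn(picture):
    """A property fixed by the picture's title, not by its pixels."""
    return picture["title"] == "Dawn"

def supervenes_on_pixels(prop, pictures):
    """True if no two pictures share a pixel pattern yet differ on prop."""
    seen = {}
    for pic in pictures:
        value = prop(pic)
        # setdefault records the first value seen for this pixel pattern
        if seen.setdefault(pic["pixels"], value) != value:
            return False
    return True

# All 16 possible 2x2 grids of dark (0) and light (1) pixels, each under two titles.
grids = [((a, b), (c, d)) for a, b, c, d in product((0, 1), repeat=4)]
pictures = [{"pixels": g, "title": t} for g in grids for t in ("Dawn", "Dusk")]

# Distinct pixel patterns that all realize the same gestalt property.
realizers = [g for g in grids if has_light_row({"pixels": g, "title": ""})]

print(supervenes_on_pixels(has_light_row, pictures))   # True
print(supervenes_on_pixels(is_titled_dawn, pictures))  # False
print(len(realizers))                                  # 7
```

The seven distinct realizers of the “light row” property give a small-scale picture of how one supervenient property can be had in several different subvenient ways.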

However, there is a way for the anti-reductionist to respond here, and this response ties in to much of the contemporary debate regarding supervenience (and emergence). The response is known as the “multiple realizability” objection, and was first used by Jerry Fodor (1974) in the context of the debate concerning the non-reducibility of special science to lower-level science (ultimately, physics). The argument, in a nutshell, is that properties associated with a “special science” (for example, psychology) can be realized by a multitude of heterogeneous lower-level properties or states. Let us see how this works by focusing on a simplified example given by Putnam (1975).

We are asked to consider a board that has a round hole in it of 5 inches in diameter, and a square peg that is 5 inches on each of its sides. Clearly the peg will not go into the hole. The question we are faced with is why the peg does not go through. Obviously, says Putnam, the respective size and shape of the peg and hole give us the answer. These properties, size and shape, Putnam refers to as “macroproperties”, as contrasted with the “microproperties” of the peg and board, namely the positions, momenta, charge, and so forth, of the atoms composing them. Clearly the shape and size of the peg and the board supervene on the microproperties. Do these microproperties provide an answer to the above question? Putnam says not, because the details at that level are irrelevant to why the peg did not penetrate the board: the microproperties could have been very different, in fact, and the result would have been the same. What are we to conclude from this? That the “peg/board/hole”-level features (the macroproperties) are autonomous, so that they cannot be reduced to lower-level features (the microproperties). This is, more or less, just multiple realizability again, but here it keys in to an interesting aspect of that concept. It tells us that what is explainable using supervenient features is not always explainable using the associated subvenient features. Here one can make connections with traditional issues in the philosophy of science.
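
The macro-level explanation in this example amounts to a short piece of geometry. As a sketch, assuming a rigid peg and board and a peg pushed straight in along its axis (assumptions added here, with s and d as illustrative symbols for the peg’s side and the hole’s diameter), the square cross-section can pass through the round hole only if its diagonal is no longer than the diameter:

```latex
\[
s\sqrt{2} \;=\; 5\sqrt{2}\ \text{in} \;\approx\; 7.07\ \text{in} \;>\; d \;=\; 5\ \text{in}
\]
```

Nothing about the particular atoms composing the peg or the board enters this calculation, which is the sense in which the explanation is pitched entirely at the level of the macroproperties.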

There are dissenting voices to Putnam’s thesis, but we shall not go any further into the ins and outs of the debate here since it quickly becomes dense and complex. Suffice it to say that supervenience is still “live” in many philosophical debates and will no doubt continue to remain so for some time to come.

6. Adding Mystery to Mystery?

Supervenience is something of a halfway house. It is called upon by some to ground a view according to which certain properties that we think of as “unphysical” are not definable in terms of, or reducible to, physical properties and yet are nonetheless connected to them in some way. It is supposed to somehow avoid the mystery of how physical matters can have a determinative role to play in unphysical properties, without those unphysical properties causing a problem in being materialistically un-kosher. For others, supervenience is a reductive principle, a matter of how the world is and must be.

Many philosophers have complained about the (in)significance of supervenience. Stephen Schiffer suggests that the invocation of supervenience simply moves the explanatory task back a step. How, he asks,

could being told that non-natural moral properties stood in the supervenience relation to physical properties make them any more palatable? On the contrary, invoking a special primitive metaphysical relation of supervenience to explain how non-natural moral properties were related to physical properties was just to add mystery to mystery, to cover one obscurantist move with another.

(Schiffer 1987, p.153-4)

Much recent work has been devoted to questioning the philosophical utility of supervenience, whether of specific formulations or of the general idea, or to proving equivalences between formulations. All of the formulations we have seen do no more than chart certain correlations between properties. They do not tell us anything about dependency or determination between the properties, in the sense of, say, a causal relation. Supervenience directs us to search for the underlying reason for the correlation, though such a reason might not always be there. In the case of the special sciences it isn’t clear that an “underlying reason” is to be found. Kim (1987, p. 167), for example, believes that supervenience is not a “deep” metaphysical relation, but instead is a superficial relation that points to some other “deeper” relation that might explain the superficial pattern of dependency, though more recently Kim has shifted to a reductive view of the relation (see Kim, 2005, for a clear account). In this sense, supervenience is a useful concept, for it can function as a filter on types of relations, letting through those of a certain type. Once we have identified a dependence relation, we can then delve deeper to see what might account for it: causation, mereology, definition, emergence, and so forth. In this sense there is no question of supervenience being an explanatory device, so there is no mystery here; but it can nonetheless be used in the search for explanations.

Supervenience has many useful applications too, in making other areas of philosophy clearer and more navigable. For example, the internalism/externalism distinction concerning mental content [very roughly, externalism is the view that mental content depends on things outside of the mind as well as inside; internalism denies this, saying that only what’s inside matters] can be cast into the endorsement and denial respectively of the following supervenience thesis: the content of a mental state (that is, what it is about) supervenes on certain neurobiological properties (narrow content). On the other hand, the externalist, as can be discerned from the rough characterization above, believes that there is more to content than this: the world plays a role too. One can clarify the distinction between internal and external relations too: an internal relation is one that supervenes on the intrinsic properties of its relata (for example, being heavier than), while this is not true in the case of external relations (for example, being 2 miles away from); it does not matter what something is like for it to satisfy this latter relation, but it does for the former. We have seen too that it allows for a definition of physicalism and helps with the puzzle of material coincidence. Surely, if by a concept’s work shall you know it, supervenience deserves the central place that it has found in the philosophers’ toolbox.

7. References and Further Reading

For a more technical and detailed presentation of the concept of supervenience, see McLaughlin and Bennett’s article in the Stanford Encyclopedia of Philosophy.

  • Beckermann, A., Flohr, H., & Kim, J., (eds.). Emergence or Reduction? Essays on the Prospects of Nonreductive Physicalism. Berlin: Walter de Gruyter, 1992.
  • Davidson, D. 1970. “Mental Events.” In D. Davidson (ed.), Essays on Actions and Events, 1980: 207-225.
  • Davidson, D. “The Material Mind.” In P. Suppes (ed.), Logic, Methodology and the Philosophy of Science. North-Holland. Reprinted in Essays on Actions and Events (Oxford University Press, 1980).
  • Fodor, J. “Special Sciences, or the Disunity of Science as a Working Hypothesis.” Synthese, 1974, 28: 97-115.
  • Hare, R.M. The Language of Morals. Oxford: Oxford University Press, 1952.
  • Hellman, G. & Thompson, F. “Physicalism, Ontology, Determination, and Reduction,” The Journal of Philosophy, 1975, 72: 551-64.
  • Horgan, T. “From Supervenience to Superdupervenience: Meeting the Demands of a Material World.” Mind, 1993, 102: 555-86.
  • Horgan, T. (ed.) Southern Journal of Philosophy 22: The Spindel Conference 1983 Supplement. Supervenience, 1984.
  • Jackson, F. From Metaphysics to Ethics. Oxford: Oxford University Press, 1998.
  • Kim, J. Supervenience, or Something Near Enough. Princeton University Press, 2005.
  • Kim, J. Supervenience and Mind. Cambridge University Press, 1993.
  • Kim, J. “Concepts of Supervenience.” Philosophy and Phenomenological Research 1984, 45, 2: 153-176.
  • Kim, J. “Supervenience as a Philosophical Concept.” Reprinted in J. Kim, Supervenience and Mind, 1993 (1990): 131-160.
  • Kim, J. “’Strong’ and ‘Global’ Supervenience Revisited.” Philosophy and Phenomenological Research, 1987, 48, 2: 315-326.
  • Lewis, D.K. The Plurality of Worlds. Oxford: Oxford University Press, 1986.
  • Lewis, D. K. “Reduction of Mind.” In D. Lewis (ed.), Papers in Metaphysics and Epistemology. Cambridge University Press, 1999 (1994): 291-324.
  • McLaughlin, B. & Bennett, K. “Supervenience.” The Stanford Encyclopedia of Philosophy (Fall 2005 Edition), edited by Edward N. Zalta.
  • McLaughlin, B.P. “The Rise and Fall of British Emergentism.” In A. Beckermann et al. (eds.), Emergence or Reduction? Essays on the Prospects of Nonreductive Physicalism. Walter de Gruyter, 1992: 49-93.
  • McLaughlin, B.P. “Varieties of Supervenience.” In E. Savellos & U. Yalcin (eds.), Supervenience: New Essays. Cambridge University Press, 1995: 16-59.
  • Moore, G.E. Philosophical Studies. London: Routledge, 1922.
  • Paull, C.P. & Sider, T.R. “In Defense of Global Supervenience,” Philosophy and Phenomenological Research, 32, 1992: 830-45.
  • Post, J. F. “Comment on Teller.” In Horgan (ed.), The Spindel Conference 1983 Supplement. Supervenience, 1984: 163-167.
  • Putnam, H. “Philosophy and our Mental Life.” In Mind, Language, and Reality. Cambridge University Press, 1975.
  • Savellos, E. & Yalcin, U. (eds.). Supervenience: New Essays. Cambridge University Press, 1995.
  • Schiffer, S. Remnants of Meaning. Cambridge, Mass.: MIT Press, 1987.
  • Sober, E. The Nature of Selection: Evolutionary Theory in Philosophical Focus. University of Chicago Press, 1993.
  • Stalnaker, R. “Varieties of Supervenience.” Philosophical Perspectives 10, 1996: 221-241.
  • Teller, P. “A Poor Man’s Guide to Supervenience and Determination.” In Horgan (ed.), The Spindel Conference 1983 Supplement. Supervenience, 1984: 137-50.

Author Information

Dean Rickles
Email: drickles@ucalgary.ca
University of Calgary

Cognitive Relativism

Cognitive relativism asserts the relativity of truth. Because of the close connections between the concept of truth and concepts such as knowledge, rationality, and justification, cognitive relativism is often taken to encompass, or imply, the relativity of these other notions also. Thus, epistemological relativism, which asserts the relativity of knowledge, may be understood as a version of cognitive relativism, or at least as entailed by it.

This kind of relativism can take different forms depending on the nature of the standpoint or framework to which truth is relativized. If truth is relativized to the individual subject, for instance, the result is a form of subjectivism. If the standpoint is an entire culture, the result is some form of cultural relativism. Other possible frameworks include languages, historical periods, and conceptual schemes. These frameworks do not exclude one another, of course, and in the positions developed by thinkers such as Thomas Kuhn and Michel Foucault (both generally regarded as holding relativistic views of truth) they are presented as interwoven.

Cognitive relativism is not so widely held as moral relativism. Moral relativism is the view that moral judgments (those employing concepts like good, bad, right or wrong) should only be assessed relative to a particular, limited standpoint (usually that of a specific culture). This doctrine became a commonplace for many growing up in modernized societies in the second half of the twentieth century and is virtually the default position encountered among undergraduates by countless philosophy instructors today. One major reason for its popularity is the importance attached by so many thinkers to the distinction between facts and values. Factual judgments are generally thought to be objective and provable; value judgments, by contrast, are commonly held to express subjective attitudes and to be unprovable, rather like judgments of taste.

Gradually, however, cognitive relativism has gained in credibility as the sharp logical dichotomy between facts and values has been increasingly questioned. Instead of a dichotomy, many now argue for a spectrum of judgments with a greater or lesser evaluative component to them. Moreover, these components themselves may not be seen as radically different; they may, for instance, simply reflect the degree to which a judgment is controversial within a particular community, with what we call factual judgments being the least disputed. From this point of view, cognitive relativism is broader and more fundamental than moral relativism, for it asserts that the truth value of all judgments, not just moral ones, is relative.

Table of Contents

  1. Ancient relativism
  2. The emergence of relativism in modern times
  3. The definition of relativism
  4. Arguments for relativism
  5. Objections to relativism
    1. Relativism is Self-Refuting
    2. Relativism has Pernicious Consequences
  6. Conclusion
  7. References and Further Reading

1. Ancient relativism

In Western philosophy, relativism first appears as a philosophical outlook associated with the Sophists in fifth-century B.C.E. Greece. Cosmopolitan and skeptically inclined, these traveling intellectuals were struck by the variations in law, mores, practices and beliefs found in different communities. They drew the conclusion that much of what is commonly regarded as natural is in fact a matter of convention. There is thus no objectively right way to worship the gods or organize society, any more than there is an objectively correct way to dress or to prepare food. The main critical thrust of this way of thinking was directed against traditional moral and political values, but the relativity of truth itself seems to be implicated in Protagoras’ famous assertion that “man is the measure of all things–of things that are, that they are, and of things that are not, that they are not.” The fact that the Sophists taught rhetoric, and in stressing the value of persuasion appeared indifferent to questions of truth, reinforced this attitude.

The first great critic of relativism was Plato. In the Theaetetus, he links Protagorean relativism to the view that knowledge should be identified with sense perception, and also to the Heracleitean doctrine that reality is in a continual state of flux. Plato’s criticisms of Protagoras’ position prefigure arguments advanced against relativism by its critics ever since. One objection he raises is that relativism collapses the distinction between truth and falsity; for if each individual is really the “measure” of what is, then everyone would be infallible, which is absurd. The implausibility of the Protagorean thesis is especially obvious, Plato argues, when we consider two people making incompatible predictions about the future. Events will prove that one of them, at least, was not a good measure of what is true. His other main objection is that relativism is self-refuting. If Protagoras is right, then whatever a person thinks is true, is true. But in that case, Protagoras must concede that those who think relativism is false are correct. So if Protagorean relativism is true, it must also be false.

Although skepticism about the possibility of knowledge became part of the mainstream of ancient philosophy, relativism did not. Socrates and Plato may be willing to concede that human understanding, in this life at least, is very limited, but they do not doubt the existence of an ideal vantage point from which the objective truth about the world could be known. Also, Aristotle appears fairly confident that such a vantage point is accessible to human reason properly employed.

2. The emergence of relativism in modern times

Between Aristotle and Kant there are no major Western philosophers whom one could plausibly describe as cognitive relativists. Montaigne and Hume certainly stressed the importance of custom in shaping people’s beliefs, especially on moral matters; but this led them towards skepticism rather than relativism. The door to modern relativism was unlocked by Kant’s claim in the Critique of Pure Reason that the only world we can know or talk about meaningfully is one that has been shaped by the human mind. On Kant’s view, the concept of “objective reality” is employed speculatively and hence illegitimately if it is taken to refer to reality as it is independent of our experience of it. This obviously has implications for the traditional notion of objective truth. The judgments we call true are true for us and of our world; but to claim they are true in the sense of describing an independently existing reality is to go beyond what we can meaningfully or justifiably assert.

Kant is not generally considered a relativist since he held that the forms our mind imposes on the world are common to all human beings. Truths like the truths of geometry or the statement that every event is caused are thus universally accepted and constitute a priori knowledge. The forms we impose on experience also give the world a certain necessary character that is independent of our beliefs and wishes. For instance, causes must precede their effect, and time can only flow in one direction. In this sense, the forms confer objectivity on the world we experience, and our well-founded judgments about that world can be called objectively true. Later thinkers, however, took Kant’s ideas further down the road toward fully-fledged relativism. Hegel, while upholding a concept of “absolute knowledge”, allows every stage that human consciousness has passed through in the historical development of civilization to express an outlook that is true in a partial way. Marx highlights the influence of the mode of production along with class and economic interests in shaping the way people understand their world; and although he appears to recognize the epistemic authority of science in some areas, he rejects the idea of a neutral standpoint from which to adjudicate between different views of social reality. Nietzsche is explicitly relativistic about both moral values and truth, preferring to evaluate claims according to what sort of will to power the claims express rather than according to their objective truth-value.

In the twentieth century, a relativistic view of truth can be found in or inferred from the work of many major philosophers, including James, Dewey, Wittgenstein, Quine, Kuhn, Gadamer, Foucault, Rorty, and most of those commonly labeled “postmodernists”. Numerous others, including some who regard themselves as staunch opponents of relativism, have been accused of harbouring relativistic tendencies. There is thus a general consensus that modern philosophy has shifted in a relativistic direction. Even fierce critics of relativism like Allan Bloom (author of The Closing of the American Mind) concede this. Indeed, it is this trend, along with its trickle-down effect on the outlook of rising generations, that occasions lamentations such as his.

3. The definition of relativism

There is no generally agreed-upon definition of cognitive relativism. Here is how it has been described by a few major theorists:

  • “Reason is whatever the norms of the local culture believe it to be”. (Hilary Putnam, Realism and Reason: Philosophical Papers, Volume 3 (Cambridge, 1983), p. 235.)
  • “The choice between competing theories is arbitrary, since there is no such thing as objective truth.” (Karl Popper, The Open Society and its Enemies, Vol. II (London, 1963), p. 369f.)
  • “There is no unique truth, no unique objective reality” (Ernest Gellner, Relativism and the Social Sciences (Cambridge, 1985), p. 84.)
  • “There is no substantive overarching framework in which radically different and alternative schemes are commensurable” (Richard Bernstein, Beyond Objectivism and Relativism (Philadelphia, 1985), pp. 11-12.)
  • “There is nothing to be said about either truth or rationality apart from descriptions of the familiar procedures of justification which a given society—ours—uses in one area of enquiry” (Richard Rorty, Objectivity, Relativism and Truth: Philosophical Papers, Volume 1 (Cambridge, 1991), p. 23.)

Without doubt, this lack of consensus about exactly what relativism asserts is one reason for the unsatisfactory character of much of the debate about its coherence and plausibility. Another reason is that very few philosophers are willing to apply the label “relativist” to themselves. Even Richard Rorty, who is widely regarded as one of the most articulate defenders of relativism, prefers to describe himself as a “pragmatist”, an “ironist” and an “ethnocentrist”.

Nevertheless, a reasonable definition of relativism may be constructed: one that describes the fundamental outlook of thinkers like Rorty, Kuhn, or Foucault while raising the hackles of their critics in the right way.

Cognitive relativism consists of two claims:

(1) The truth-value of any statement is always relative to some particular standpoint;

(2) No standpoint is metaphysically privileged over all others.

The first of these claims asserts the relativity of truth, obviously an essential element in this form of relativism. Oddly, though, this is not the most controversial part of the doctrine. After all, even committed realists might be willing to conceive of objective truth as equivalent to “true from a God’s eye point of view” or “true from the standpoint of the cosmos”. It is the second claim, the denial of any metaphysically privileged standpoint, that most provokes relativism’s critics. A brief look at the role of this thesis in the thought of three leading relativists (Kuhn, Rorty, and Foucault) will help reveal why it should be so controversial.

In The Structure of Scientific Revolutions, Kuhn argues that science progresses by means of what he calls paradigm shifts. A paradigm theory is an overarching theory like Dalton’s atomic theory or the theory of evolution. These provide the background conceptual scheme within which what Kuhn calls “normal science” occurs. On Kuhn’s account, a paradigm shift such as that by which Copernican astronomy displaced the Ptolemaic view of the universe should not be thought of as a shift between two different ways of looking at an independent reality. Rather, theory and observation are so intertwined that the shift amounts to a change in the reality the scientists inhabit. Consequently, there is no independent standpoint from which a paradigm shift can be judged to take us closer to a true picture of the way things really are. Kuhn likens debates over paradigms to political controversies, saying that “as in political revolutions, so in paradigm choice—there is no standard higher than the assent of the relevant community.” (p. 110)

Richard Rorty extends what Kuhn says about science to every other sphere of culture, particularly politics. The traditional view–call it Platonist, absolutist, objectivist or realist–is that when we do something like abolish slavery we move closer to an independent ideal and we bring our way of thinking closer to the One Right Way, the way dictated by reason or by our essential human nature. Rorty thinks this sort of thinking has been valuable in the past; but in more recent times it has become constraining rather than liberating. He therefore urges us to see intellectual and cultural progress as simply consisting in our exchanging one vocabulary for another. Descriptions of human beings that view them as entitled to equal rights before the law, and descriptions of the solar system that view it as heliocentric, are both preferable to the descriptions they replaced; but not because they are closer to the truth. In both cases, we should prefer the newer descriptions on pragmatic grounds; they better enable us to achieve our purposes.

Michel Foucault’s relativism is similar to Kuhn’s in being based on and justified by historical researches. The domain of his studies is different, however. In works like Madness and Civilization, The Order of Things, and Discipline and Punish, Foucault tries to show how what we call “reason”, “science”, “knowledge” and “truth” are socially constituted and shaped by political forces. He argues that in order to pass muster as “scientific” or as “rational”, a discourse must satisfy certain conditions, and these conditions are socially and historically relative, reflecting the needs and interests of existing power structures. This relativity is more obvious in the case of classifications based on distinctions such as normal-perverted, natural-unnatural, rational-insane, or healthy-sick. But Foucault suggests that it applies also to other, more epistemologically central distinctions such as scientific-unscientific, knowledge-error, and true-false. The ideal of a neutral standpoint transcending epochs and interests is thus a chimera.

4. Arguments for relativism

Relativism is the radical offspring of non-realism, which is itself descended from the idealism of Berkeley and Kant. Non-realism holds that we cannot meaningfully talk about the way things are independent of our experience of them: to use Michael Dummett’s formulation, what makes a statement true is not independent of our procedures for deciding it is true. The main argument in favour of non-realism is essentially negative: it avoids the difficulties endemic to metaphysical realism (a.k.a. “objectivism” or “absolutism”).

Realists hold that our judgments are true when they accurately describe or correspond to a reality that exists independently of our perceptions, conceptions, theories or desires. On this view, a true statement such as “water contains oxygen” describes a fact about this independent reality. It rests on a scientific model that may be said to “carve nature at the joints”. But an obvious question arises: how can we determine that our judgments are true in this sense? The obvious answer is that we test them by making experiments and observations. I say it will snow today, and I test this by watching the sky. I say water contains oxygen and I confirm this by showing that one of the elements separated out by electrolysis supports combustion. When our assertions are decisively confuted by experience, we conclude that they are false—i.e. they describe a state of affairs that does not obtain.

Relativists accept that this is how we normally conceive of truth and falsity—in ordinary usage, the word “true” means something like “corresponds to the facts”–and as an account of our everyday epistemic procedures it is unobjectionable. But they argue that it loses coherence if it is elevated to the metaphysical level. For what is really happening, even when we are confirming the most mundane belief about the empirical world, is that we are satisfying ourselves that this belief coheres with our other beliefs. We confirm that the sea is salty by tasting it or by conducting a chemical analysis of seawater. But these procedures only confirm our belief about sea water in the sense of showing it to be compatible with or even entailed by a host of other beliefs: for instance, that the sample we are examining is typical; that nothing else tastes quite like salt; that our sensory faculties are trustworthy on this occasion; that salt tastes roughly the same at different times. What we can never do, argue relativists and other non-realists, is check the degree of correspondence between our judgments and reality as it is independent of our experience of it. To do this we would have to take a “sideways on” view of the cognitive relation between subject and object. But this is impossible since any vantage point we adopt will necessarily be that of the subject. For the same reason, we cannot compare our overall conceptual scheme or theoretical model of reality with reality as it is “in itself.”

The driving idea behind empiricism and the upshot of Kant’s critique of speculative metaphysics is thus that concepts must be tied to experience if they are to have legitimate employment in science or philosophy. Relativists argue that the metaphysical realist’s concept of truth fails this test, for it takes the notion of “correspondence with reality” out of its everyday employment, where it is genuinely useful, and tries to press it into metaphysical service, where it is neither useful nor legitimate. So even if, in its normal usage, “truth” means something like correspondence with reality, the ultimate criterion of truth turns out to be coherence with other beliefs. To put it another way: our philosophical conception of truth cannot simply be an expanded version of our commonsense notion of truth as correspondence. And this implies that truth must always be relative to some belief system, to some particular epistemic standpoint. This is the first of the two theses identified above as constituting the doctrinal kernel of relativism. Numerous philosophers have affirmed it. Yet many of these have sought to avoid relativism by rejecting the second thesis—that no standpoint is metaphysically privileged over all others.

This second thesis is what gives relativism its bad name. Critics commonly reduce it to the claim that any point of view is as good as any other and then attack it with some variation of Plato’s arguments against Protagoras. But virtually no well-known philosophers actually hold that all standpoints are of equal worth. Richard Rorty, for instance, who is widely regarded as a relativist, dismisses that position as “silly” (Richard Rorty, Objectivity, Relativism and Truth, p. 89). Rorty, Kuhn and most other relativists accept that one can have cogent reasons for preferring one standpoint to another; the preferred point of view may, for instance, exhibit greater logical consistency or greater predictive power than other available perspectives. But they argue that such reasons cannot confer any special metaphysical status on the standpoint in question. They cannot, for instance, show it to be the one favoured by God, or dictated by Reason, or most in accord with human nature.

Relativists typically justify this conclusion along the following lines. Any proof of a standpoint’s superiority must rest on premises that express fundamental assumptions and basic values. For instance, arguments for the superiority of the standpoint of modern science over that of religion will presuppose the value of consistency, of solving theoretical puzzles, and of being able to manipulate one’s environment. A person who defends the literal truth of the Bible but shares these values is likely to be persuaded fairly quickly by these arguments. But a person who holds that truth appears to humans as paradoxical, and who values tradition and religious faith over experimental evidence and predictive power, will not be persuaded. An argument can only be convincing to one who accepts its premises. Some premises, though, like those just mentioned, are so fundamental that they are not usually argued for at all. Rather, they are constitutive of a particular outlook.

The relativists’ thesis is not that one cannot support standpoints with arguments; it is that in the end all such arguments must be circular since they inevitably rest on premises that are themselves part of the standpoint. Critics will here point out that there is a difference between denying that the superiority of one standpoint over all others can be proved and denying that such a standpoint exists. In reply, relativists are likely to claim that this distinction is an abstract one that no consistent empiricist or pragmatist would make. To insist that one standpoint is objectively superior to all others, they argue, even though there is no way of proving this, is dogmatic and pointless; to claim that one’s own standpoint enjoys this unique but undemonstrable superiority is dogmatic and implausible.

A critic might also object that what relativists call “cogent” reasons for preferring one standpoint to another are not epistemically relevant: that is, they do not provide grounds for thinking that the standpoint generates or ensures beliefs that are objectively true. But this is clearly a point most relativists would be willing to concede. The notion of objective truth referred to here is not a concept for which they have a use; they prefer instead something like William James’ conception of truth as “what is good in the way of belief.”

5. Objections to Relativism

Critics of relativism are legion, but the objections leveled against it are usually of two kinds, both pioneered by Plato in his critique of Protagoras. One line of attack tries to show that relativism is incoherent because it is self-refuting. The other common objection is that relativism, if taken seriously, would have bad practical consequences. Let us consider both of these in turn.

a. Relativism is Self-Refuting

A doctrine is self-refuting if its truth implies its falsehood. Relativism asserts that the truth-value of a statement is always relative to some particular standpoint. This implies that the same statement can be both true and false. The qualification that the statement is true relative to standpoint A but false relative to standpoint B may save relativism from the charge of embracing gross contradictions. But it still clearly implies that relativism itself is false, at least relative to some standpoints. One might say that it is just as much false as it is true, in which case there seems to be no good reason to prefer relativism to alternative positions such as realism.

One possible response to this objection would be to modify the theory and hold that all truths are relative except for the truth that all truths are relative. On this view, the relativist thesis enjoys a unique status, being true in some non-relativistic sense. This position may be coherent, but it is rather implausible. It is hard to see what could justify granting the thesis of relativism this exceptional status. A more plausible option is for relativists to concede that their view is false relative to at least some non-relativistic theoretical frameworks but to deny that this admission is damaging. Relativism, they can claim, is simply in the same situation as any other theory. The theory of evolution is true from the perspective of modern science and false from the perspective of Christian fundamentalism. Relativists deny that one of these perspectives is demonstrably better than the other. But this does not mean that they cannot affirm the scientific perspective, and do so for cogent reasons. In the same way, they can acknowledge that relativism is false from the standpoint of metaphysical realism; but they can do this without inconsistency or incoherence since they are not metaphysical realists, and they have reasons for preferring relativism to realism.

A variation on the charge that relativism is self-refuting is the argument that it is somehow self-refuting for relativists to assert or to argue for their position. This line of attack has been pressed forcefully by Hilary Putnam and others. Putnam’s argument is that ordinary rational discourse presupposes a non-relativistic notion of truth. Jürgen Habermas offers a similar sort of argument in his critique of postmodernists like Foucault and Derrida, claiming that a commitment to truth, like a commitment to sincerity, is a necessary condition of successful communication.

Relativists, however, are likely to remain skeptical about these alleged presuppositions and implicit commitments. It may be true that when we engage in rational discourse we implicitly commit ourselves to the truth of what we are saying. But it is not at all obvious that we implicitly commit ourselves to a non-relativistic conception of truth. And even if this were the case, it is not clear why this supposed presupposition of everyday communication should be accorded so much respect and made the basis for a philosophical account of truth. Our everyday notions of space and time may also be non-relativistic, but we do not demand that physicists’ theories of space and time conform to our pre-scientific ideas.

b. Relativism has Pernicious Consequences

This criticism also was first ventured by Plato and continues to be endorsed by many. Cognitive relativism is thought to undermine our commitment to improving our ways of thinking rather as moral relativism is thought to undermine our belief in the possibility of moral progress. Several reasons have been given to support this anxiety. To some, the fact that relativism countenances the possibility of multiple true but incompatible points of view entails a kind of epistemic nihilism. If creationism and the theory of evolution, Ptolemaic and Copernican astronomy, astrology and modern psychology are all equally true, then what purpose is served by developing new scientific theories? All views are of equal value, so why not just rest content with whatever happens to be “true for us”?

Against this, relativists can offer two responses. First, truth is not the only epistemic value. We can also prefer theories on the basis of such values as coherence with our other beliefs, predictive power, and practical fruitfulness. Second, by endorsing relativism one does not lose the right to judge beliefs according to their truth or falsity. Modern relativists will believe that the earth orbits the sun and that Copernicus’ discovery represented scientific progress over earlier astronomy. But their philosophical account of the status of these beliefs will be relativistic. The Copernican theory is true and its acceptance represents progress according to the values and concerns that constitute the modern scientific standpoint—a standpoint shared by both relativists and non-relativists. The difference between them is that the relativists do not believe this standpoint can be proved superior to others except by arguments that are essentially circular and question-begging.

Hilary Putnam presses a slightly different version of the above objection. Relativism, he argues, tries to “naturalize” the concept of reason. What he means is that relativists try to discuss questions of truth, knowledge, and rationality in a thoroughly descriptive, non-normative way. Like social scientists afraid of allowing value-judgments to creep into their work, they take a detached stance and simply report the epistemic customs and practices of different cultures, eschewing any impulse to endorse or criticize them. And this amounts, in Putnam’s words, to “mental suicide”. For, while particular norms of rationality will be entrenched within a particular culture, reason has an inalienable critical or transcendent function which can be used to criticize existing epistemic norms. Relativism can thus be accused of encouraging a certain kind of intellectual passivity.

Relativists have also been accused of embracing determinism, and certainly thinkers like Nietzsche and Foucault sometimes invite this charge. The epistemic norms of a culture or a period are taken to be shaped by non-rational forces such as class interests, technology, or the will to power of a group or individual. And what people then come to believe is seen as a function of these norms. For example, Foucault suggests that the classification of homosexuality as a disease results from employing a certain kind of theoretical framework, one that posits a sharp distinction between the natural and the unnatural and correlates the former with the healthy, the latter with the sick. And this framework becomes established because it serves certain interests. So truth is identified with what is believed to be true, and what is believed to be true is determined by larger social forces operating within a culture or historical epoch.

This deterministic tendency, like the attempt to naturalize reason, is held by critics to entail, or at least encourage, a renunciation of the longstanding project of using reason to criticize existing norms, beliefs, and practices in order to furnish ourselves with better ones. Relativism is thus associated with the counter-Enlightenment aspects of postmodernism. But association is not the same thing as logical entailment. It may well be true that some relativists are drawn towards determinism or feel they must eschew value judgments. But it is not clear that these tendencies must be part of a relativistic outlook. Other relativists will argue that the connection between relativism and determinism, say, is historical and contingent rather than logical and necessary. In their view, one can consistently endorse a relativistic view of truth while still being committed to the relative superiority of some views over others, to the value of critical reflection, and to the possibility of using reason as an instrument of scientific and social progress.

6. Conclusion

Cognitive relativism continues to be an important but controversial position that one encounters in contemporary debates about the nature of truth, knowledge, rationality, and science. These debates can sometimes be confusing because people agree neither about exactly what relativism affirms nor about whose views should be described as relativistic.

Critics of relativism sometimes seem to assume that relativists are denying that they believe—or denying themselves the right to believe—obvious truths. But the more sophisticated relativists do not deny that statements like “the earth is round” are true. They just favour a certain philosophical account of what is involved and implied when we describe such statements as “true”. The situation here is reminiscent of the debate between idealists and some of their materialist critics. The critics charge idealists like Berkeley with holding that our sense perceptions are illusions, and they think they can refute this doctrine by doing things like kicking stones. But the idealists do not see themselves as holding or implying any such view. They just think that the materialist explanation of our sense-experiences is philosophically problematic; so they offer what they take to be a more coherent alternative.

On the other hand, relativism is sometimes advanced quite crudely. Then, instead of being a philosophical view about the status of our beliefs and the limitations on how we might support these beliefs, it becomes an excuse for accepting uncritically one’s own culture’s assumptions and epistemic norms; or it serves to rationalize intellectual apathy or slackness masquerading as tolerance of diverse opinions. Just as idealists still have to negotiate what we normally call the material world, so relativists have to make decisions about whether particular claims are true or false. Their philosophical relativism may incline them towards being more open-minded and tolerant than dyed-in-the-wool absolutists and objectivists. But they cannot avoid adopting specific standpoints, choosing between theories, and endorsing particular beliefs and values. At bottom, the debate over relativism is about whether it is possible for relativists to make these commitments consistently and sincerely.

7. References and Further Reading

  • Bernstein, Richard J. Beyond Objectivism and Relativism. Philadelphia: University of Pennsylvania Press, 1985.
  • Davidson, Donald. “On the Very Idea of a Conceptual Scheme.” Proceedings and Addresses of the American Philosophical Association (1974), 5-20.
  • Field, Hartry. “Realism and Relativism.” Journal of Philosophy 79 (1982): 553-557.
  • Forster, Paul D. “What Is at Stake Between Putnam and Rorty?” Philosophy and Phenomenological Research LII, No. 3 (1992): 585-603.
  • Foucault, Michel. Power/Knowledge: Selected Interviews and Other Writings. Edited by Colin Gordon. Translated by Colin Gordon, Leo Marshall, John Mepham, and Kate Soper. New York: Pantheon Books, 1980.
  • Foucault, Michel. The Foucault Reader. Edited by Paul Rabinow. New York: Pantheon Books, 1984.
  • Gadamer, Hans-Georg. Truth and Method. Second revised edition. Translated and revised by J. Weinsheimer and D. G. Marshall. New York: Crossroad, 1989.
  • Gellner, E. Relativism and the Social Sciences. Cambridge: Cambridge University Press, 1985.
  • Geertz, Clifford. The Interpretation of Cultures. New York: Basic Books, 1973.
  • Goodman, Nelson. Ways of Worldmaking. Indianapolis: Hackett, 1978.
  • Habermas, Jürgen. The Theory of Communicative Action, vol. 1, Reason and the Rationalization of Society. Translated by Thomas McCarthy. Boston: Beacon Press, 1984.
  • Habermas, Jürgen. The Philosophical Discourse of Modernity. Translated by Frederick Lawrence. Cambridge, Mass.: M.I.T. Press, 1987.
  • Hollis, Martin and Lukes, Steven (eds). Rationality and Relativism. Cambridge, Mass.: The M.I.T. Press, 1982.
  • Jackson, Ronald Lee. “Cultural Imperialism or Benign Relativism? A Putnam-Rorty Debate.” International Philosophical Quarterly XXVIII, No. 4, Issue 112 (1988).
  • Jarvie, I. C. Rationality and Relativism: In search of a philosophy and history of anthropology. London: Routledge & Kegan Paul, 1984.
  • Johnson, Jeffery L. “Making Noises in Counterpoint or Chorus: Putnam’s Rejection of Relativism.” Erkenntnis 34 (1991): 323-345.
  • Kelly, Michael, ed. Critique and Power: Recasting the Foucault/Habermas Debate. Cambridge, Mass.: M.I.T. Press, 1994.
  • Krausz, Michael, and Meiland, Jack W., eds. Relativism: Cognitive and Moral. Notre Dame: University of Notre Dame Press, 1982.
  • Krausz, Michael. Relativism: Conflicts and confrontations. Notre Dame: University of Notre Dame Press, 1989.
  • Kuhn, Thomas S. The Structure of Scientific Revolutions, 2nd Edition. Chicago: University of Chicago Press, 1970.
  • MacIntyre, Alasdair. “Relativism, Power, and Philosophy.” Proceedings and Addresses of the American Philosophical Association. Newark, Delaware: APA (1985): 5-22.
  • Plato. Theaetetus. Translated by M. J. Levett, revised by Myles Burnyeat. Indianapolis: Hackett, 1990.
  • Preston, John. “On Some Objections to Relativism.” Ratio 5, No. 1 (1992): 57-73.
  • Putnam, Hilary. Reason, Truth and History. Cambridge: Cambridge University Press, 1981.
  • Putnam, Hilary. Realism and Reason: Philosophical Papers, Volume 3. Cambridge: Cambridge University Press, 1983.
  • Putnam, Hilary. The Many Faces of Realism. La Salle, Illinois: Open Court, 1987.
  • Quine, Willard Van Orman. Ontological Relativity and Other Essays. New York: Columbia University Press, 1969.
  • Rorty, Richard. Consequences of Pragmatism. Minneapolis: University of Minnesota Press, 1982.
  • Rorty, Richard. Contingency, Irony, and Solidarity. Cambridge: Cambridge University Press, 1989.
  • Rorty, Richard. Objectivity, Relativism, and Truth: Philosophical Papers, Volume 1. Cambridge: Cambridge University Press, 1991.
  • Rorty, Richard. Truth and Progress: Philosophical Papers, Volume 3. Cambridge: Cambridge University Press, 1998.
  • Scheffler, Israel. Science and Subjectivity. Indianapolis: Bobbs-Merrill, 1967.
  • Solomon, Miriam. “On Putnam’s argument for the inconsistency of relativism.” The Southern Journal of Philosophy XXVIII, No. 2 (1990): 213-220.
  • Throop, William M. “Relativism and Error: Putnam’s Lessons for the Relativist.” Philosophy and Phenomenological Research 49 (1989): 675-678.
  • Westacott, Emrys. “Relativism, Truth, and Implicit Commitments.” International Studies in Philosophy 32:2 (2000): 95-126.
  • Whorf, Benjamin Lee. Language, Thought and Reality. Cambridge, Mass.: M.I.T. Press, 1956.
  • Winch, Peter. The Idea of a Social Science and its Relation to Philosophy. London: Routledge & Kegan Paul, 1958.
  • Wilson, Bryan. Rationality. Oxford: Basil Blackwell, 1970.
  • Wittgenstein, Ludwig. Philosophical Investigations. Translated by G. E. M. Anscombe. Oxford: Basil Blackwell, 1953.

Author Information

Emrys Westacott
Email: westacott@alfred.edu
Alfred University
U. S. A.

Moral Realism

The moral realist contends that there are moral facts, so moral realism is a thesis in ontology, the study of what is. The ontological category “moral facts” includes both the descriptive moral judgment that is allegedly true of an individual, such as, “Sam is morally good,” and the descriptive moral judgment that is allegedly true for all individuals such as, “Lying for personal gain is wrong.” A signature of the latter type of moral fact is that it not only describes an enduring condition of the world but also prescribes what ought to be the case (or what ought not to be the case) in terms of an individual’s behavior.

The traditional areas of disagreement between the realist camp and the antirealist camp are cognitivism, descriptivism, moral truth, moral knowledge, and moral objectivity. The long and recalcitrant history of the realism/antirealism debate records that the focal point of the debate has been shaped and reshaped over centuries, with a third way, namely, Quasi-realism, attracting more recent attention. Quasi-realism debunks the positions of both realism and antirealism.

On the one hand, considering cognitivism, descriptivism, moral truth, moral knowledge, and moral objectivity as specifying the sufficient conditions for moral realism ignores the quasi-realist way. On the other hand, defining moral realism in a way that accommodates quasi-realism concedes too much: unlike the moral realist, the quasi-realist denies that moral facts are explanatory. Consequently, one can view quasi-realism as the contemporary heir of antirealism.

Table of Contents

  1. The Realism/Antirealism Debate
    1. Cognitivism
      1. Descriptivism
      2. Mackie’s Error Theory
      3. Waller’s Megaethical Level
    2. Truth in Moral Judgments
      1. An Analogy
      2. Skorupski’s Irrealist Cognitivism
      3. The Correspondence Theory Requires Realism, Not Vice Versa
    3. Literal Moral Truth?
    4. Moral Knowledge
    5. Moral Objectivity
  2. Quasi-Realism, Antirealism, and the EI thesis
    1. An Analogy: Quasi-Realism about Derogatory Judgments
    2. Quasi-Realism, Antirealism, and Explanationist Moral Realism
  3. Moral Realism after Quasi-Realism
  4. References and Further Reading

1. The Realism/Antirealism Debate

If there are moral facts, how can we know them? For a realist, moral facts are as certain as mathematical facts. Moral facts and mathematical facts are abstract entities, and as such, are different in kind from natural facts. One cannot literally display moral facts as one could display, say, a plant. One can display a token of the type, for example one can write “lying for personal gain is wrong” or one can write an equation; however, one cannot observe moral and mathematical facts in quite the same way as one can observe, with the aid of a microscope, chlorophyll in a leaf. Such limitations of experience do not stop realists and antirealists from disagreeing on virtually every aspect of the moral practices that seem to presuppose the existence of moral facts. The list of contested areas includes moral language, moral truth, moral knowledge, moral objectivity, moral psychology, and so on. These areas are not discrete but intermingle.

The moral realist may argue for the view that there are moral facts as follows:

(1) Moral sentences are sometimes true.

(2) A sentence is true only if the truth-making relation holds between it and the thing that makes it true.

(3) Thus, true moral sentences are true only because there holds the truth-making relation between them and the things that make them true.

Therefore,

(4) The things that make some moral sentences true must exist.

It is a short inference from the existence of the things that make some moral sentences true to the existence of moral facts.

The moral antirealist can respond to the argument by denying any of the three premises. The antirealist could be a non-descriptivist in rejecting premise (1): no moral sentences are true, for they do not describe how the world is; or, she may reject a version of the correspondence theory of truth by denying premise (2): she may argue that a sentence can be true even if there holds no truth-making relation between it and the thing that makes it true. For instance, she may be a proponent of the coherence theory of truth, which holds that a sentence can be true only when there is a truth-making relation between it and other sentences relevant to it. Or, she may even reject as illegitimate the inference from “things that make some moral sentences true” to the “existence of moral facts.”

In the past, many antirealists were noncognitivists, holding that moral judgments are not cognitive states like ordinary beliefs: that is, antirealists hold that unlike beliefs, the essential function or aim of moral judgments is not to represent the world accurately. (A non-descriptivist claim is that cognitivism, more specifically descriptivism, is necessary, but not sufficient for moral realism, as will be shown presently.) Moral judgments are, according to the noncognitivist, mental states of some other kind: they are emotions, desires, or intentions of the sort that are expressed by commands or prescriptions.

If moral judgments are expressed by commands or prescriptions, then there cannot be literal moral truths. (Cf. Wright 1993. He argues that the focal discussion in the realist/antirealist debate should be about the acceptable theories of truth.) If there are no literal moral truths, then no moral judgments may be cited as evidence for knowing how the world is. Moral knowledge can no longer be considered as descriptive or propositional; or, no one is justified in believing certain things about the world in making moral judgments. This illustrates how the noncognitivist analysis of moral judgments can be escalated into the antirealist rejection of (those good names that we take for granted when we participate in moral practices such as) “moral truths” and “moral knowledge.” The antirealist’s noncognitivism threatens moral objectivity as well. Objectivity is to be found within the world. If moral judgments are not about accurately describing the world (for example, if moral judgments are about us), then moral objectivity will not be found within the world. If moral objectivity is to be found within us, then it is not the same objectivity with which we began; or so the old antirealist’s argument went.

a. Cognitivism

If it is noncognitivism that provides the antirealist a way of rejecting moral truth, moral knowledge, and moral objectivity, the denial of noncognitivism (that is, cognitivism) must be necessary for the realist to properly claim them. Cognitivism is the view that moral judgments are cognitive states just like ordinary beliefs. It is part of their function to describe the world accurately. The realist argument that stems from cognitivism, as we saw from the above argument, is oftentimes guided by the apparent difficulties that the noncognitivist analysis of moral judgments faces. For instance, there is the famous Frege-Geach problem, namely, the noncognitivist difficulty of rendering emotive, prescriptive or projective meaning for embedded moral judgments.

Geach (1965) uses “the Frege point,” according to which “a proposition may occur in discourse now asserted, now unasserted, and yet be recognizably the same proposition,” to establish that no noncognitivist (“the anti-descriptive theorist”) analysis of moral sentences and utterances can be adequate.

Consider a simple moral sentence: “Setting a kitten on fire is wrong.” Suppose that the simple sentence means, “Boo to setting a kitten on fire!” The Frege point dictates that the antecedent of “if setting a kitten on fire is wrong, then getting one’s friends to help setting a kitten on fire is also wrong” must mean the same as the simple sentence. But this cannot be because the antecedent of the conditional makes no such assertions while the simple moral sentence does. In other words, the noncognitivist analysis of moral sentences cannot be given to the conditional sentences with the embedded simple moral sentence. The problem can be generally applied to cases of other compound sentences such as “It is wrong to set a kitten on fire, or it is not.” Even if the noncognitivist analysis of the simple sentence were correct, compound sentences within which a simple moral sentence is embedded should be given an analysis independently of the noncognitivist analysis of it. This seems unacceptable to many. For the following argument is valid: “It is wrong to set a kitten on fire, or it is not; it is not ‘not wrong’; hence, it is wrong to set a kitten on fire.” If the argument is valid, then the conclusion must mean the same as one of the disjuncts of its first premise. The argument would be otherwise invalid because of an equivocation, and the noncognitivist seems to be forced to say that the argument is invalid.
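The validity claim in this passage can be checked mechanically. What follows is only a minimal sketch, assuming a classical two-valued reading on which “it is wrong to set a kitten on fire” is a single proposition W; the function name `is_valid` and the variable names are illustrative and are not drawn from Geach. It enumerates every truth assignment and checks whether the conclusion holds whenever both premises do. The second check models the equivocation by treating the conclusion as a distinct item V, unrelated in meaning to the disjunct W; on that reading a countermodel exists.

```python
from itertools import product

def is_valid(premises, conclusion, n_vars):
    """Return True if every assignment that satisfies all premises
    also satisfies the conclusion (classical truth-table validity)."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # found a countermodel
    return True

# Univocal reading: W means the same thing in every occurrence.
univocal = is_valid(
    premises=[
        lambda w: w or (not w),   # "It is wrong ..., or it is not"
        lambda w: not (not w),    # "It is not 'not wrong'"
    ],
    conclusion=lambda w: w,       # "Hence, it is wrong ..."
    n_vars=1,
)

# Equivocal reading (an assumption made for illustration): the conclusion V
# does not share its meaning with the embedded disjunct W.
equivocal = is_valid(
    premises=[
        lambda w, v: w or (not w),
        lambda w, v: not (not w),
    ],
    conclusion=lambda w, v: v,
    n_vars=2,
)

print(univocal, equivocal)
```

On the univocal reading the argument comes out valid; on the equivocal reading it does not, which is the dilemma the passage presses against the noncognitivist.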

The Frege-Geach problem demonstrates the noncognitivists’ requirement of adequately rendering emotive, prescriptive, expressive, or projective meaning of those moral sentences that are embedded within compound moral sentences. (For more on the Frege-Geach problem, see Non-Cognitivism in Ethics. See also Darwall, Gibbard, and Railton 1992: 151-52.)

The cognitivist understanding of moral judgments is at the center of moral realism. For the cognitivist, moral judgments are mental states; moral judgments are of the same kind as ordinary beliefs, that is, cognitive states. But how are we to know this? One manageable way is to focus on what we intend to do when we make moral judgments, and also on how we express them. Moral judgments are intended to be accurate descriptions of the world, and statements express moral judgments (as opposed to command or prescription) just as statements express ordinary beliefs. That is, statements express moral language. The statements that express moral judgments are either true or false just as the statements that express ordinary beliefs are. Moral truths occur when our signs match the world.

Language allows us to communicate with one another, typically using sentences and utterances. A large part of language involves, among many other things, influencing others and ourselves. Normative language, in contrast with descriptive language, includes moral language (that is, moral language is part of evaluative or normative language). It is even more important not to be swayed by moral language because moral reality grips us. It is bad that others try to deceive us, but it is worse that we deceive ourselves into accepting moral facts simply because of the language that we use. That is, moral language, if it is not to describe the world, must not be mistaken as descriptive. Moral language binds us in a certain manner, and the manner in which it binds us is important.

i. Descriptivism

Moral language and descriptive language share the same syntactic structure. “Sam is good” predicates a kind of goodness to Sam just as “Sam is four-legged” predicates having four legs to her. “Being good” as in “being good is being able to bear one’s own scrutiny” and “having four legs” as in “having four legs is not required of being a dog” are both noun-like phrases. Again, to say, “If Sam is good, then she will be able to bear her own scrutiny,” illustrates that moral predication could be embedded to form a compound sentence just as descriptive predication could. We use both parts of language with equal ease. Almost all of us are proficient in using moral language. Most of us understand what others express with it; and, we are expected to have understood what moral language means. Few people would apply the term “morally permissible” to an apparent case of wanton cruelty. Furthermore, moral language is governed by the same fundamental rules of logic as descriptive language. For instance, one and the same action cannot be good and bad at the same time. (The philosophical rejection of moral facts remains popular, although this focal reliance on the logico-linguistic aspect of the moral practices is no longer fashionable. See Darwall, Gibbard, and Railton 1992, especially p. 123.)

From this, must we then infer that there are entities like “moral goodness” and “obligation” to which moral language refers in the world? Are the three characteristics of structural similarity between moral and descriptive languages, the equal ease with which we employ them, and the logical interplay between them good enough reasons for thinking that there are moral facts? Is it not possible that our ways of influencing others and ourselves are exactly where the syntax and semantics of our language betray us and, consequently, that moral language suffers from a lack of referents in the way that terms such as “nothing” and “the present king of France” do?

Either moral language describes (or, it is intended to describe accurately) the world or it does not. According to descriptivists, moral language describes the world. The descriptivist position has been thought of as the mark of moral realism, and the non-descriptivist position as that of antirealism. This is captured as follows:

(C1) S is a moral realist if and only if S is a moral descriptivist.

So while one may hold that there are no moral facts, according to C1, one may not at the same time hold that moral language describes or is intended to describe the world. Again, one may not hold both that there are moral facts and that our language about them does not describe the world. For if C1 were true, being a moral realist and being a descriptivist about moral language would be logically equivalent. So any non-descriptivist realism and any descriptivist antirealism would show that C1 is false. The possibilities will be discussed shortly in §2 and §3. Descriptivism and, hence, the truth-aptness of moral language are discussed in more detail in what follows. (Ignored for the moment is what Blackburn calls “quietism,” according to which “at some particular point the debate is not a real one, and that we are only offered, for instance, metaphors and images from which we can profit as we please” (1984, 146). One may claim quietism to be present in pretty much any important and interesting philosophical dispute, like “primary versus secondary, fact versus value, description versus expression, or of any other significant kind” (1998, 157). Quietism about whether moral language describes the world, if true, would render the traditional realism/antirealism debate over descriptivism a dispute over no difference, where there is nothing more than “the celebration of the seamless web of language” (1998, 157).)
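The logical content of C1 can be made concrete with a small enumeration. This is only a minimal sketch, assuming that being a realist and being a descriptivist can each be treated as a simple yes/no property; the names `c1_holds` and `counterexamples` are illustrative. It lists the combinations of positions that C1 rules out, which are exactly the two whose existence would falsify it.

```python
from itertools import product

def c1_holds(realist: bool, descriptivist: bool) -> bool:
    """C1 read as a biconditional: realist if and only if descriptivist."""
    return realist == descriptivist

# Every combination of the two properties that is inconsistent with C1.
counterexamples = [
    (realist, descriptivist)
    for realist, descriptivist in product([True, False], repeat=2)
    if not c1_holds(realist, descriptivist)
]

for realist, descriptivist in counterexamples:
    print(
        "realist" if realist else "antirealist",
        "+",
        "descriptivist" if descriptivist else "non-descriptivist",
    )
```

The two combinations printed are the non-descriptivist realist and the descriptivist antirealist, the positions the text identifies as showing C1 to be false if either is coherent.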

Descriptivism in meta-ethics is a cognitivist view, according to which moral language describes (or, is intended to describe) the world. (Cf. Horgan and Timmons 2000, 124. This rough definition, according to them, falls under the dogma of the “[mistaken] semantic assumption: All genuinely cognitive content is descriptive content.” Conflating descriptivism with cognitivism is, according to them, “a largely unquestioned dogma.”) An inevitable corollary of descriptivism is that moral language is apt to truth evaluation; that is, statements express moral judgments that are either true or false. We may say alternatively that moral sentences express propositions without affecting the result of the discussion. As Nicholas Sturgeon puts it, “moral [sentences] typically express [statements] capable of truth and falsity” (1986, 116). Strictly speaking, then, descriptivism says little about, and remains neutral with respect to, the two views in moral epistemology: there are moral statements that are known to be true. Descriptivism does not tell us whether there is any moral statement known to be true. Nor does it tell us anything about the things by virtue of which moral statements are true when they are true. (Cf. Skorupski 1999. He thinks that descriptivism without a substantial theory of truth is no descriptivism at all. There is just a terminological difference, and descriptivism in conjunction with a substantial theory of truth will be discussed in section 2.)

The moral descriptivist believes that moral statements express moral judgments, and that they are either true or false. If every sentence that is capable of truth-value describes the world, then so does every moral statement. Moral language describes the world because every truth-apt sentence describes, or is intended to describe the world. The non-descriptivist denies that. The non-descriptivist believes that moral statements do not express moral judgments. Rather, the non-descriptivist believes that moral judgments are expressed by commands or prescriptions. Neither commands nor prescriptions are truth-apt, and as a result they typically are not meant to describe the world. Moral language does not describe the world, according to the non-descriptivist. That is, it represents our wishes, preferences, emotions, and so on, but it represents nothing over and above them. Figure 1 illustrates the disagreement between the descriptivist and the non-descriptivist. (Definite antirealist positions are marked with the dotted boxes in the figures that follow. An oval box will mark definite realist positions. See figure 5.)


Figure 1

Non-descriptivists disagree about exactly what moral language accomplishes, while they are unanimous about what it does not. G. E. Moore’s open question argument, advanced at the beginning of the 20th century, supports emotivism, a form of non-descriptivism, contrary to his intention. A. J. Ayer and C. L. Stevenson argue that moral judgments express feelings of approval or disapproval, or that making moral judgments is equivalent to emoting in reference to the behavior of others and of ourselves. (See Ayer 1952 and Stevenson 1937, 1944, and 1963.) Stevenson says that “Mr. G. E. Moore’s familiar objection about the open question is chiefly pertinent in this regard. No matter what set of scientifically knowable properties a thing may have (says Moore, in effect), you will find, on careful introspection, that it is an open question to ask whether anything having these properties is good” (1937, 18). R. M. Hare’s universal prescriptivism, according to which “‘ought’-judgments are prescriptive like plain imperatives, but differ from them in being universalizable” (1991, 457), emphasizes that moral language facilitates ways of prescribing actions for all of us. The norm-expressivism of Allen Gibbard has renewed arguments for non-descriptivism recently. Rejecting emotivism, Gibbard (1990) holds that moral judgments are concerned with rational-to-have or justified moral sentiments, not just with feelings or preferences one has. Apparently, he holds that some moral feelings can be called rational-to-have or justified. It is when “one’s acceptance of norms that permit the feeling” (Darwall, Gibbard, and Railton 1992, 150-51) is expressed that a feeling may be called rational-to-have. So while moral judgments (and moral language) are expressive of what we accept as norms, namely, a state of mind, they are not about describing the world; this is non-descriptivism about moral judgment and language.
Blackburn’s projectivism seems difficult to classify one way or another, especially when it is considered in conjunction with his quasi-realism (Blackburn 1984, 1993, and 1998). Moral language according to the projectivist lets us spin our own story onto the world. Non-descriptivists agree, nonetheless, that moral language is the tool of choice when we are panting for help, recommending a course of action, passing judgments on what others do, and so on, but it is never the tool for describing the world.

The views discussed above can be illustrated with an example. Consider the moral sentence, “Petal ought to avoid eating too much.” The utterance of the sentence expresses the speaker’s judgment about Petal and perhaps about her tendency to the excessive consumption of food. The cognitivist holds that the speaker’s judgment is of the same kind as ordinary beliefs; that is, the cognitivist holds that the speaker’s moral judgment is a cognitive state. Beliefs are representations of how things are, namely, possible states of affairs; and language typically expresses beliefs. According to the cognitivist, then, the moral sentence that expresses the moral judgment represents a possible state of affairs. We may say that the descriptivist maintains that the moral sentence describes what ought to be the case about Petal and her tendency toward food. Petal could be instantiating the property of the “oughtness” of avoiding the excessive consumption of food, although this is not the only way for the cognitivist to maintain her descriptivism about moral language. Just as “the morning star” refers to Venus, the linguistic item “ought to avoid eating too much” may refer to a moral property. It might even be maintained that a referential relation obtains between moral expressions and the things in the world that they are supposed to pick out.

Noncognitivists hold that the speaker’s judgment in saying, “Petal ought to avoid eating too much,” is not of the same kind as cognitive states. Some noncognitivists go further and deny that the moral sentence represents a possible state of affairs. That is, some noncognitivists are non-descriptivists as well. The non-descriptivists maintain that the surface structure of moral language, and the logical interplay it displays within our use of it, is not a good guide to understanding what moral language does for us (and what we intend to do with it). The word “nothing” picks out no object whatsoever, although it serves as a grammatical subject; the definite description “the present King of France” refers to no one, although its article “the” indicates a unique satisfier of the description, and so on. These are familiar cases (of our language betraying us ontologically). So, part of the non-descriptivist claim is that moral language ontologically manipulates us just as “nothing” and “the present King of France” do. The merit of the view according to which there lurks a deeper structure (or meaning) to our moral language must be judged on how successful the non-descriptivist construal of the sentence about Petal is.

The non-descriptivist construal of “Petal ought to avoid eating too much” varies. The emotivist construes it as a way of emoting the speaker’s disapproval of Petal’s excessive consumption of food, or a way of informing Petal of the speaker’s feeling. The expressivist construes it as the speaker’s way of expressing her preference with regard to Petal’s eating habits. The prescriptivist construes it as a way of commanding Petal not to eat excessively. The norm-expressivist construes it as a way of expressing the speaker’s non-acceptance of the norms that allow such consumption of food. Perhaps the projectivist would construe the statement about Petal as a way of “objectifying” the speaker’s disapproval. However, all deny that there is a dyadic relationship, of reference or correspondence, between the moral sentence and how the world is. The dyadic relation has all but been reduced to the monadic relation of showing/manifesting the speaker’s psychological state. (The truth of this does not entail that people do not believe in moral principles. A. J. Ayer says that “[t]o say…that these moral judgments are merely expressive of certain feelings, feelings of approval or disapproval, is an over simplification” 1954, 238.) Figure 2 diagrams the non-descriptivist positions.


Figure 2

The contrast between descriptivism and non-descriptivism seems inapt for Gilbert Harman’s relativism because his relativism is a definite moral antirealist position. He rejects the objective status of moral facts. (See his 1977, 1986, and 2000; see also Harman and Thomson 1996 in which an interesting discussion of reasons both for and against moral objectivity is presented.) The relativist maintains that there are some ethical questions that can be correctly answered with “yes” for one person, and “no” for another. Her claim implies nothing about what moral language is meant to do. Error theorists maintain that moral judgments systematically err by positing moral facts. (For instance, Mackie says that “[t]he assertion that there are objective values or intrinsically prescriptive entities or features of some kind, which ordinary moral judgments presuppose, is, I hold, not meaningless but false” 1977, 40.) That is, moral language aims to get the world right, but it always misses the mark. Mackie’s error theory in this respect occupies an important niche between the sides of the descriptivism divide and the sides of the moral realism divide. Figure 3 incorporates projectivism, relativism, and error theories into figures 1 and 2.


Figure 3

The ontological ramification of accepting descriptivism (or, cognitivism) is not inevitably moral realism. Figure 3 indicates that descriptivism is not sufficient for moral realism. Mackie’s error theory is discussed in §2 in establishing the insufficiency. Blackburn’s projectivism and John Skorupski’s “irrealist cognitivism” will be very briefly discussed as well. Descriptivism is nonetheless necessary for moral realism. The necessity is argued in §3 when Bruce Waller’s “megaethical level” is considered and rejected. That is, one conjunct of C1 will be shown to be false while the other conjunct of C1 will be shown to be true, thereby making the conjunction C1 false; more specifically, it will be shown that “if S is a moral descriptivist, then S is a moral realist” is false and it will be shown that “S is a moral realist only if S is a moral descriptivist” is true.
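
The logical shape of this plan can be set out schematically. What follows is only a sketch under stated assumptions: the symbols R(S) ("S is a moral realist") and D(S) ("S is a moral descriptivist") are shorthand introduced here, not notation used elsewhere in this article, and the quantifiers are read as ranging over coherent meta-ethical positions rather than over actual philosophers. The fragment assumes the amsmath package.

```latex
% Shorthand (an assumption of this sketch, not the article's notation):
%   R(S): S is a moral realist      D(S): S is a moral descriptivist
\begin{align*}
\text{(C1)}\quad
  & \forall S\,\bigl(R(S) \leftrightarrow D(S)\bigr) \\
\equiv\quad
  & \forall S\,\bigl(D(S) \rightarrow R(S)\bigr)
    \;\wedge\;
    \forall S\,\bigl(R(S) \rightarrow D(S)\bigr) \\[4pt]
\text{\S 2 (insufficiency):}\quad
  & \exists S\,\bigl(D(S) \wedge \neg R(S)\bigr)
    \;\Rightarrow\;
    \neg\,\forall S\,\bigl(D(S) \rightarrow R(S)\bigr) \\
\text{\S 3 (necessity):}\quad
  & \forall S\,\bigl(R(S) \rightarrow D(S)\bigr) \\[4pt]
\text{Hence:}\quad
  & \neg\,\text{(C1)}, \text{ since one of its two conjuncts is false.}
\end{align*}
```

On this reading, a single coherent descriptivist antirealism is enough to refute the first conjunct, which is why the consistency of such positions carries the weight of the argument in §2.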

ii. Mackie’s Error Theory

Is it true that S is a moral realist if and only if S is a descriptivist? That is, is C1 true? Any coherent descriptivist antirealism would establish that C1 is false. Another way that C1 could be shown to be false is to establish the possibility of non-descriptivist realism. The insufficiency of descriptivism will be established in this section. The realist territory, as it were, will not be properly marked by descriptivism.

Consider Mackie’s remark that:

The assertion that there are objective values or intrinsically prescriptive entities or features of some kind, which ordinary moral judgments presuppose, is, I hold, not meaningless but false (1977, 40).

Moral judgments are false, or so the above-quoted passage reads. But why are they all false? It is because there are no entities to which moral language refers. Moral language purports to describe things that are not there. According to Mackie, it is a (perpetual) error to suppose that there are moral entities; hence the name “error theory.” Mackie’s error theory is a prima facie descriptivist antirealist position: it maintains that there are no moral facts, and in addition it accepts that moral judgments are meant to describe the world. Is this combination of moral antirealism and descriptivism plausible? Blackburn certainly thinks that it is not.

Blackburn, whose own view seems to be indeterminate between descriptivism and non-descriptivism, thinks that Mackie’s error theory is inconsistent. This is partly because of the apparent difficulty in attributing a pervasive systematic error to our making moral judgments. As Blackburn puts it, “[T]he puzzle is why, in the light of the error theory, Mackie did not at least indicate how a shmoral vocabulary [that is, a moral vocabulary cleansed of its ontological error] would look, and why he did not himself go on only to shmoralize, not to moralize.” According to Blackburn, this is so seriously puzzling that Mackie’s failure to shmoralize “in itself suggests that no error can be incorporated in mere use of those concepts” (1985, 2).

Trying to avoid the pervasive and systematic error should appear reasonable to those who are aware of it. But Mackie seemed “quite happy to go on to express a large number of straightforward moral views [namely, to moralize rather than to shmoralize]” (Blackburn 1985, 1).

Does Blackburn’s charge establish that Mackie’s combination of antirealism and descriptivism is inconsistent? No, it does not. What Blackburn demands of Mackie is the consistent deployment of his meta-ethical view in his moral practice. But to lead a moral life strictly according to one’s meta-ethical view requires heroic efforts. Try imagining an error theorist deploying her meta-ethical views when it comes to the existence of an external world! She cannot help but conduct her business as if it were no error to think that there exists a world external to her. It is impossible for her to show that it is an error to believe in the existence of such a world. More generally, second-order beliefs about first-order moral practices are rarely made explicit. Everyday moral practices (within which Mackie continues to moralize) are not a translucent showcase for meta-ethical views. So, Blackburn fails to establish that descriptivist antirealism is inconsistent. That is, Blackburn should expect no explicit display of Mackie’s error-theoretic commitments.

Blackburn’s projectivism may qualify as a descriptivist antirealism. (Blackburn’s descriptivism will be discussed in §2 of section 1.2 in more detail.) Moral language has content, according to Blackburn, but the content is not determined by the world. The content of moral language is determined rather by what “the mind [expresses as] a reaction by ‘spreading itself on the world’” (Blackburn 1984, 75). That moral language has content suggests that part of its function is to accurately describe the world. At the same time, Blackburn’s projectivism is an antirealist position because he maintains that the content is somehow “written” by us.

There are other recent theories that result from explicit attempts at combining descriptivism and antirealism. Hatzimoysis says “a minimalist conception of truth fits the bill of antirealist cognitivism in ethics.” (See for example, Hatzimoysis 1997, 448.) Skorupski’s “irrealist cognitivism” is one such theory. He argues for it by denying that “all content is factual content” (1999, 438).

The fact that moral language expresses cognitive states, that is, that moral language has descriptive content, according to Skorupski does not guarantee the existence of moral facts; nor does it justify belief in the existence of moral facts. (Cf. Horgan and Timmons 2000. They distinguish three different kinds of content: declarative, cognitive, and descriptive.) Skorupski says that “normative claims are truth-apt contents of cognition…but their truth is not a matter of correspondence or representation” (1999, 436). The truth-apt fragment of language is truth-apt because of its descriptive content. So the first conjunct of Skorupski’s remark is descriptivist. But when moral language is true (or false), it is so not because it corresponds to the world: there is nothing that answers to moral language. That is, Skorupski rejects the existence of moral facts, and his position is hence antirealism.

Is Skorupski’s irrealist cognitivism consistent? Descriptivism by no means entails the correspondence theory of truth, and Skorupski’s antirealism is based solely on his denial of the correspondence theory of truth. Irrealist cognitivism is hence consistent.

Mackie’s error theory, Blackburn’s projectivism, and Skorupski’s irrealist cognitivism are instances of descriptivist antirealism. We may then conclude that moral descriptivism is not sufficient for moral realism. But is it a necessary condition for moral realism? If it is, then we may hope to mark the proper realist territory by adding additional necessary conditions. (My emphasis on the consistency of maintaining both descriptivism and antirealism is not meant to suggest that a descriptivism/non-descriptivism debate, as represented by, say, the Frege-Geach problem, which claims that embedded moral language appears to have descriptive content rather than emotive, prescriptive, or projective content, is not as important and relevant to the realism/antirealism debate. See Darwall, Gibbard, and Railton 1992, especially pp. 151-152.) The necessity of descriptivism for realism will be discussed in the following section. Another conjunct of C1, “S is a moral realist only if S is a descriptivist,” will be examined.

iii. Waller’s Megaethical Level

Few philosophers take the noncognitivist realist position seriously. For instance, Geoffrey Sayre-McCord (1988, 9-14) dismisses it quickly as inconsistent. But noncognitivist realism is certainly a logical possibility. In this section, we shall examine Waller’s arguments for its tenability.

Waller’s noncognitivism is attenuated: moral judgments are not cognitive states when no fundamental common values are in place. He says “noncognitivism insists that when fundamental value conflicts arise and basic value questions are posed, then the disputes and values are noncognitive” (1994, 63). Statements only express moral judgments when an assumed set of common fundamental values is present. Waller’s remark that “such independent moral conversion is evidence in favor of moral realism and against noncognitivism” sounds inconsistent with the label of his theory “noncognitivist moral realism.” (See his 1992, 129.) Waller’s remark makes it seem as if moral realism and noncognitivism are contradictory to each other. Waller’s strategy is to distinguish the “megaethical” level from the level where there are uncontested fundamental values. This allows Waller to maintain that at one level “the moral facts are internally real,” but at another level, namely, the megaethical level, “[the moral facts] are ideal” (1994, 67). Waller’s divide-and-conquer strategy entitles him to either cognitivist moral realism at the level of assumed values, or noncognitivist antirealism at the megaethical level. So Waller’s “noncognitivist realism” fails as a noncognitivist realist position. We may then conclude that cognitivism (or, descriptivism) is necessary for moral realism. Cognitivism, the view that moral judgments are cognitive states like ordinary beliefs (with its two corollaries, namely, descriptivism and their truth-aptness), could facilitate the realist/antirealist debate, but cognitivism alone is not sufficient in facilitating the discussion, not solely in its terms anyway.

The necessity of cognitivism for realism may lead us to expect that specifying additional necessary conditions for realism could mark the proper realist territory. Cognitivism combined with some substantial theory of truth is taken up next.

b. Truth in Moral Judgments

Moral statements express judgments, and for some, moral statements describe the world. But moral realism is not present everywhere cognitivism (or, descriptivism) is present. That is, cognitivism and descriptivism, which had once crystallized the realism/antirealism debate, no longer do so. Crispin Wright’s recommendation that “moral anti-realists, for instance, should grant that moral judgments are apt for truth and falsity” (1993, 65) illuminates more recent discussions of the subject. Mackie’s error theory (1977), Skorupski’s irrealist cognitivism (1999), and perhaps Blackburn’s projectivism (for example, 1984) illustrate, as we saw earlier, the possibility of consistently combining cognitivism with antirealism.

An error theorist maintains her antirealism by insisting that moral judgments involve a pervasive error. No moral judgments are true, according to the error theorist, although they are truth-apt because they purport to describe the world. Moral realists part company with the error theorists over truth in moral judgments: according to realists, some moral judgments are true. Still, this is not sufficient for moral realism. The projectivist functioning as a quasi-realist and Skorupski should be able to claim that some moral judgments are true. Moral truths can be literal or figurative; and they can be a matter of correspondence or coherence (coherence with other already held beliefs stands in here for the range of “modified characteristics” of truth). Figure 4 illustrates this point:


Figure 4

Deflationist theorists of truth deny that the truth-predicate “is true” adds to the meaning of linguistic items. For instance, “snow is white” and “‘snow is white’ is true” mean, according to them, the same. Deflationist theories include F. P. Ramsey’s redundancy theory of truth (or, the prosentential theory of truth) and Paul Horwich’s more recent minimalism. Inflationist (substantive or robust) theorists of truth, in contrast with deflationists, maintain that truth is a real and important property. Inflationist theories include the correspondence theory of truth, the coherence theory of truth, and the so-called pragmatic theory of truth. Inflationists disagree not only about the nature of the property of truth, but also about the bearers of that property.

i. An Analogy

Consider the judgment, “Suffering from lack of food is bad.” The judgment is usually expressed with the statement “suffering from lack of food is bad.” Call it a “B-statement.” Sometimes, we find it necessary to express it with “it is true that suffering from lack of food is bad.” Call it a “T-statement.” (To complete it, there are “F-statements” like “it is false that suffering from lack of food is bad.”) We use T-statements to emphasize partiality toward “being true to the world.” However, regardless of what motivates us to use T-statements, the explicit ascription of truth in T-statements commands our attention. Does the T-statement add anything extra to the B-statement? If so, what is it that the T-statement says over and above the B-statement?

There are two broad ways to answer the question: deflationism and various forms of substantial theory (or what we called above “inflationist theory”). Substantial theorists deny that the B-statement and the T-statement are exactly the same while the deflationist maintains that the difference is merely stylistic. If the deflationist has her way, then it is obvious that antirealists could have truth in moral judgments. (David Brink argues against the coherentist theory of truth with respect to moral constructivism. See Brink 1989, 106-7 and 114; see Tenenbaum, 1996, for the deflationist approach.) Antirealist moral truths would seem irrelevant in marking the realist territory. If some form of substantial theory is true, then the T-statement adds something to what the B-statements say. Here are two alternatives.

Letting a coherence theory of truth stand in for the range of “modified theories” (namely, the inflationist theories of truth that are different from the correspondence theory of truth), and the “B-proposition” for what the B-statement describes about the world, the T-statement adds that:

(1) The B-proposition corresponds to an actual state of affairs.

(2) The B-proposition belongs to a maximally coherent system of belief.

It is worth noting also that even the non-descriptivist may say that the T-statement adds to the B-statement, insofar as the B-statement expresses something other than the B-proposition. The non-descriptivist has two alternatives as well.

The T-statement adds that (letting a coherence theory of truth stand in for the range of “modified theories,” and the “B-feeling-proposition” stand in for the range of non-descriptivism, for example, the speaker dislikes suffering from lack of food):

(3) The B-feeling-proposition corresponds to an actual state of affairs.

(4) The B-feeling-proposition belongs to a maximally coherent system of belief.

We may say that the T-statement specifies truth conditions for the B-proposition or for the B-feeling-proposition. It could be objected that the non-descriptivist must deny that there are truth-conditions for moral language. Nonetheless, she need not object to moral language describing something about the world figuratively.

If option (1) were true, then there would have to be an actual state of affairs that makes the B-statement true. That is, there must be a truth-maker for the statement, “suffering from lack of food is bad,” and the truth-maker is the fact that suffering from lack of food is bad. But no other alternatives require the existence of the fact for them to be true.

If one ignores deflationism, truth in moral judgments gives rise to exactly four alternative theories of truth. Realists cannot embrace options (3) and (4) because, as we saw, non-descriptivism is sufficient for moral antirealism. The remaining option (2), although it is a viable option for the realist, falls short of guaranteeing that there are moral facts. In other words, moral realists must find other ways to establish the existence of moral facts, even if option (2) allows a way of maintaining moral truths for the realists. Modified theories, for example the coherence theory of truth, are simply silent about whether there are B-facts. That is, option (2) could be maintained even if there were no B-facts such as the fact that suffering from lack of food is bad. Thus, the most direct option for realists in marking their territory from the above list of alternatives is (1). It appears then that correspondence truth in moral judgments properly marks the realist territory. This is captured in C2:

(C2) S is a moral realist if and only if S is a descriptivist, S believes that moral judgments express truth, and S believes that the moral judgments are true when they correspond to the world.

Is C2 true? No, it is not. For the antirealist may choose to deny that moral judgments literally describe the world. This is how Skorupski earns his antirealist title.

ii. Skorupski’s Irrealist Cognitivism

If C2 were true, then there could not be any cognitivist antirealist who believes that some moral judgments are true, and who also holds that moral truth is a matter of correspondence to the world. However, Skorupski’s irrealist cognitivism qualifies as one such position.

Skorupski maintains that moral judgments have truth-apt contents, but he denies that the contents of moral judgments are factual. Skorupski remarks “[normative language’s] truth is not a matter of correspondence or representation” (1999, 436). This remark may suggest that Skorupski’s irrealist cognitivism is a variant of option (2) above about what the T-statement adds to the B-statement. Nonetheless, there is an extension of Skorupski’s theory that would consistently allow it to fall within option (1). This extension of Skorupski’s theory would be a cognitivist antirealist position, combined with a correspondence theory of truth.

Moral statements express moral judgments, and as such, moral statements can be either true or false. What makes moral statements true when they are true? Skorupski’s remark above rejects that correspondence to the world is the truth-making relation. As was mentioned, this rejection could indicate that Skorupski holds a modified theory of truth or a deflationist theory. Perhaps he does, but it is not explicit. What is explicit is Skorupski’s denial that moral judgments have factual contents. How is it possible that some moral judgments are true if moral judgments are not factual? One way to answer it—and to extend Skorupski’s irrealism—is to maintain that moral judgments are not literal. Moral judgments are still expressed by moral statements, but what moral statements describe are not moral states of affairs. Moral statements express states of affairs of the world other than moral ones. In this way, moral statements can be true by corresponding to the world, once moral statements are recognized as describing, for example, a psychological aspect of the world.

Consider the statement “Santa Claus came early last year.” Call it the S-statement. (The “S-statement,” “T-statement,” “S-proposition,” “S-feeling-proposition,” and their cognates are used as the “B-statement,” “T-statement,” “B-proposition,” “B-feeling-proposition,” and their cognates are above.) Does the S-statement describe the world as it was last year? Surely, it does. It reports either that (1) there was at least one person whose image fits the description of Santa, or that (2) there was a giver of toys around Christmas. It reports also that the person in either case came earlier than in other years. Children are delighted by Santa’s early appearance primarily in the sense of (2). And they wonder, “Will Santa come early this year as well?” Similarly, children reason, “If Santa comes early, I will have an early Christmas present.” Of course, very few of us are Santa realists, although most of us are cognitivists about the S-statement in either sense.

How are adults able to maintain both cognitivism about the S-statement (more specifically descriptivism about it) and antirealism about Santa facts in the sense of S-statement (1)? Adults acknowledge the existence of surrogate toy-givers, while denying that the S-statement expresses the S-proposition in the sense of (1), namely, adults deny that there was at least one person whose image fits the description of Santa. Instead, adults believe that the S-statement expresses the S-feeling-proposition, or something equivalent to it. This is how one maintains antirealist cognitivism about Santa judgments.

There are many garden-variety Santa judgments. Santa judgments are expressed by Santa-statements, but no Santa-statements express the S-proposition. The S-statement does not involve the state of affairs in which there is a person whose name is Santa Claus. Nonetheless, the S-statement could be either true or false. Suppose that it is true that Santa did come early last year, but suppose that we are also not realists about Santa Claus. We know better than those who are perplexed by the existence of people who fit perfectly the descriptions of Santa. We know that the S-statement does not say anything about a person named Santa Claus. For most, the S-statement is never about Santa, but rather it is about, for example, the toy-givers, the state of one’s national economy, and so on. That is, we deny that the S-statement expresses the S-proposition. However, this rejection does not force us to adopt deflationism or a modified theory of truth. The S-statement could express something true when it corresponds with the world as long as it expresses something other than the S-proposition. For instance, the S-statement expresses something true if the S-statement expresses the fact that the state of the national economy was good last year, and if the state of the national economy last year was actually good: in this case the S-statement expresses something true when it correctly reports the economy of last year. There is no inconsistency.

Analogously, moral statements express moral judgments. Insofar as moral statements are understood as expressing psychological facts about the world, moral statements can be true or false. Some “moral” statements are true in this way. Furthermore, they are true because they correspond to the world. Even if this is not Skorupski’s theory, it is an extension of his theory that instantiates cognitivist antirealism, combined with a correspondence theory of truth. This shows that C2 is false.

iii. The Correspondence Theory Requires Realism, Not Vice Versa

Our previous discussion of Skorupski’s cognitivist irrealism gives no details about the correspondence theory of truth it employs. It might be objected that such lack makes it impossible to judge whether or not Skorupski’s theory, or an extension of it, constitutes a counterexample to C2. But the “correspondence theory” is ambiguous between the general conception of truth that appeals to correspondence as the truth-making relation, and the very detailed analysis of truth that satisfactorily specifies the notion of truth in terms of the correspondence relation. As the general conception, the correspondence theory of truth is insufficient for moral realism. Antirealists are entitled to the correspondent truth of moral judgments insofar as moral judgments are understood “figuratively.” For as the general conception, the correspondence theory of truth imposes “for any proposition P, it is true that P just in case there is a way things could be such that anyone who believed, doubted, etc. that P would believe, doubt, etc. that things were that way, and things are that way” (Wright 1999, 218). Apparently, the conception “offers little more than a long-hand version of the correspondence platitude,” and it “certainly carries no direct implications for the realism debate in its modern conception” because “there is so far no commitment to any specific general conception of the kind of relations that may be involved in truth, or of the nature of the non-propositional items in their fields” (Wright 1999, 223-24). On the other hand, as analysis, the correspondence theory perhaps is too strong for realism. The latter point will not be discussed further as our purpose here is to establish the non-sufficiency and the non-necessity of the correspondence theory of truth for moral realism. It seems reasonable to suppose that Skorupski’s irrealist cognitivism, or an extension of it, constitutes a counterexample to C2 as the general conception of correspondence theory of truth.

To sum up, consider the following five claims:

  1. The correspondence theory of truth is false or implausible.
  2. The correspondence theory of truth requires the truth of realism.
  3. The correspondence theory of truth is not required for realism (and no particular theory of truth is).
  4. “The correspondence theory of truth in conjunction with cognitivism” is not sufficient for realism.
  5. “The correspondence theory of truth in conjunction with cognitivism and the correspondence (truth) of moral judgments” is not sufficient for realism.

The discussion of Skorupski’s (extended) antirealism aims at establishing claim (5), but since (5) implies (4) there is no need for independently establishing claim (4). Claim (1) is apparently bold, controversial, and not required for our purpose. Claim (2) seems false: an error theorist like Mackie is a moral antirealist; however, he may adopt a correspondence theory of truth without contradicting his particular brand of moral antirealism. Furthermore, claim (2) is not required for our purpose either. To properly mark the realist territory, we need not determine whether the correspondence theory of truth (whether one considers it to be a general theory or an analysis) requires realism. Finally, claim (3) seems at least plausible, and it is relevant to our goal. The T-statement discussed above, namely the T-statement that “‘Santa came early last year’ belongs to a maximally coherent system of beliefs,” shows that realists, moral or otherwise, are not forced to accept the correspondence theory of truth. That said, if moral realists opt for moral truths of the non-correspondence kind, then they would have to find other ways of establishing the existence of moral facts.

c. Literal Moral Truth?

The previous section proposed that one need not be a moral realist even if one is a cognitivist who believes that moral judgments express moral truths and that these truths are true because of a correspondence between the judgments and facts in the world. The argument might attract the following response: such an antirealist position appears possible simply because it involves denying that there are any literal truths in moral discourse; even if cognitivism and moral truths that are obtained by employing a revisionary theory of meaning are not adequate for moral realism, cognitivism and moral truths that are obtained on a literal understanding of moral language should be considered adequate for moral realism. This section offers replies to such a potential response.

Consider again the Santa statement, “Santa Claus came early last year.” An antirealist may construe it as saying

The national economy last year was good, and the economic boom was manifested by consumer confidence.

Consequently, the antirealist can say that because the S-statement expresses the S-feeling-proposition about the national economy and consumer confidence, nothing prevents the antirealist from adopting a correspondence conception of truth. Children, of course, insist that the S-statement is literal, that is, it expresses the S-proposition, “Santa Claus came early last year.” If the S-statement were to be taken literally, no antirealist could hold both that there are some Santa truths and that those Santa truths are matters of correspondence to the world. Santa antirealists cannot acknowledge any Santa fact if such an acknowledgement presupposes the existence of Santa, the person. The S-statement obviously expresses something other than the S-proposition, but is it the same with moral judgments and statements?

The preceding discussion signals a shift in the realist/antirealist debate. The literal meaning of moral language now comes to the fore of the discussion. We seem to have come full circle. The non-descriptivist and the non-cognitivist point out that moral language may manipulate us ontologically because it misleads us into thinking that moral statements describe the world: obviously, the Santa statement cannot be taken literally. Even if it is unreasonable to insist on the literal interpretation of the S-statement, the same cannot be maintained with equal confidence about moral statements. It is not obvious that moral language must not be taken literally. We are certain that there is no such living person as Santa Claus: that is why we can be certain that the S-statement cannot be taken literally. Nonetheless, with respect to moral statements, the existence of moral facts is exactly the issue. As a result, we cannot be as certain about moral language as we are about the S-statement that it must not be taken literally.

Granted, one of the most deeply rooted realist and antirealist disagreements has been whether moral language expresses things literally. Should moral language be taken literally or in some revisionist fashion? Skorupski, an antirealist cognitivist, must maintain that moral language describes the world, yet it does not do so literally. For instance, it expresses our ways of influencing others and ourselves. Realists, on the other hand, must maintain that moral language describes the world, and it does so literally. Moral language comes with shades of normativity, but that does not entail that moral language cannot be taken literally. Instead, the logico-linguistic considerations prove that moral language is no different from ordinary declarative statements that express ordinary beliefs. How are we to decide between the two? Does “species-ism is as (morally) bad as racism” express whatever it expresses literally? Is it even feasible to apply literalism, in the first place, to the realist/antirealist debate?

Surely, it is difficult to decide between the two above-mentioned alternatives. Language allows many things for us. For example, people sometimes disagree about whether an utterance expresses a genuine question or whether it expresses an assertion (in the form of a rhetorical question). This indicates that it can be difficult to know when a statement is to be taken literally and when it is not. If literalism were to carry any weight for the realism/antirealism debate, then there should be some independent way of telling when a statement is to be taken literally. That is, literalism about moral language requires an independent footing. Furthermore, it is very difficult to imagine that the long and recalcitrant history of the realist/antirealist debate has been just about the literal meaning of moral language. We presumably understand what moral statements express, if only in a rudimentary fashion. The disagreement about literalism may help explain why moral realists and antirealists often seem to talk past each other. Nevertheless, attributing different meaning to moral terms fails to further our inquiry. At any rate, it does not seem feasible to make literalism a criterion for moral realism, especially when the difficulty associated with literalism about moral language is considered.

d. Moral Knowledge

Some moral judgments are literally true, but some truths are not known. It is sometimes thought that we get moral facts right, while others get them totally wrong. Is there any merit to such a claim? Does one ever know certain moral judgments to be true? (Joel Kupperman asks, for instance, “[i]f there is some set of moral truths, or approximately correct moral beliefs, independent of our feelings, attitudes, or opinions, then how can we ever know that we have found or arrived at them?” (1988, 33).) We get some moral facts right sometimes, according to the realist. That is, we succeed in knowing certain moral judgments to be true. Moral realism implies some sort of literal success theory, and so moral knowledge is implied by it. Or, moral realism entails at least the possibility of such knowledge.

Moral realists hold that we can have justified true moral beliefs, or that we can have warranted moral beliefs, according to some post-Gettier theories of knowledge. (See, for instance, Alvin Plantinga’s discussion of “warrant”; see Gettier 1963, and Plantinga 1993a and 1993b.) Some moral antirealists deny this. For example, Mackie’s error theory insists that no moral judgments are known to be true because the moral statements that express them always describe the world falsely. It is impossible to know something false as true! Moral skeptics hold that no moral judgments are justified or warranted. The epistemic success claim at once provokes epistemological questions: under what conditions are we ever justified or warranted in holding moral beliefs? And, how can we truly say that we have correct moral facts?

In answer, some moral realists have adopted a coherentist theory of justification, while others have opted for foundationalism and intuitionism. For instance, David Brink adopts coherentism in defense of a naturalist version of moral realism. (See especially Brink 1989, 122-43.) Naturalistic epistemology also deserves serious consideration. (Consider Jaegwon Kim’s worry of losing normativity. See Kim 1988, and Quine 1986.) Some theories of justification are able to accommodate moral knowledge more easily than others. A causal theory of knowledge and justification, for instance, is ill suited for the task. Alvin Goldman’s reliabilism may not be the best-suited theory for it either. (See Goldman 1978 and 1986.) But it seems obvious that the belief that moral knowledge is possible can be maintained even with these externalist theories of justification. Consider, for instance, a version of reliabilism: S is justified in holding “that p” iff p is the result of a reliable cognitive process. One can be justified in holding that Doctor Evil is no good if the judgment results from a reliable cognitive process, say, for example, the cognitive process that results in Austin Powers being good.

The possibility of moral knowledge does not entail moral realism, even though moral realism entails moral knowledge. As was shown above, there is nothing to stop the moral antirealist from claiming moral knowledge once she helps herself to cognitivism, moral truths, and some theory of justification. On the other hand, moral realists need not be shy about adopting an externalist epistemology either. A naturalistic realist would hope that moral knowledge is on a par with empirical knowledge. The realist may even agree that the paradigm justification for empirical knowledge is perceptual and is thus causal. The moral realist would have to reject causal reductionism, according to which the causal power of the supervening facts is entirely reducible to that of base facts. Moral judgments are true just in case they correctly report the supervening facts that depend on the non-moral base facts.

e. Moral Objectivity

Moral realists maintain that some literal moral truths are known, or that we are justified in holding them. Moral judgments are true just in case they correctly report the supervening facts that depend on the non-moral base facts. But are moral facts, the supposed truth-makers of moral judgments, objective? It could be the case that no ethical judgments are true independently of the desires or emotions that we happen to have, or, there could be different yet valid answers to the same ethical question as ethical relativists insist. Neither subjectivists nor relativists are obliged to deny that there is literal moral knowledge. Of course, according to them, moral truths imply truths about human psychology. Moral realists must maintain that moral truths, and hence moral knowledge, do not depend on facts about our desires and emotions for their truth. For instance, W. D. Falk analyzes the good as “a dispositional property of things as ideally assessed, a power to evoke favor by way of an ideal assessment” (Piker 1995, 102). Having objective literal moral knowledge seems to be sufficient for moral realism because no moral antirealists would acknowledge the possibility of such knowledge. Figure 5 summarizes the results of the discussion from 1.1-1.5.

Figure 5

We finally arrive at the definite moral realist position, which is marked by the oval box above. The combination of cognitivism, descriptivism, success theory, literalism, and objectivism seems sufficient for moral realism. Nonetheless, there are a couple of reasons why the moral realist territory is better marked by the explanationist consideration. This consideration leads to explanationist moral realism according to which there must be moral facts because they are essential in our understanding of the world. Literalism faces uncertainty if one considers what moral sentences mean, a consideration that is not ideal for the realism/antirealism debate. Despite these categories, the advent of quasi-realism signals the new antirealist way. A quasi-realist can claim that cognitivism, descriptivism, moral truths, moral knowledge, and even moral objectivity, are within the antirealist camp.

2. Quasi-Realism, Antirealism, and the EI thesis

Quasi-realists such as R. M. Hare, Gilbert Harman, and Simon Blackburn promise to set people free from the unduly rigid ontology of moral realism, namely, the existence of moral facts. Quasi-realism would allow people to enjoy the traditional realist comforts such as moral truths, moral knowledge, and moral objectivity, without the realists’ baggage of commitments, theoretical burdens, and practical costs, or so they contend. It all sounds too good to be true, but such a possibility seems exciting: why insist on the existence of moral facts if all aspects of our moral practices, especially the realist-sounding ones, could be understood without the fact-multiplying realist ontology? Of course, the real question is this: is there anything significant that will be lost in our understanding of our moral practices if we were to settle for quasi-realism? A definite “yes” to the question has to be given, and we shall see why in this section.

The possibility that the quasi-realist extends to people is that quasi-realism poses no serious threat to the moral realist position. However, the quasi-realist contention that nothing significant will be lost in our understanding of our moral practices by siding with quasi-realism is simply mistaken. The quasi-realist loses some of the best explanations of events, states of affairs, and phenomena within the world: the quasi-realist must reject folk moral explanations. This is so, it will be argued, because the quasi-realist cannot accommodate folk moral explanations without reducing them to naturalistic explanations.

a. An Analogy: Quasi-Realism about Derogatory Judgments

Blackburn discusses derogatory judgments in his attempt to show how the quasi-realist allows for realist comforts. The quasi-realistic understanding of these judgments, according to Blackburn, allows for antirealist cognitivism about derogatory judgments, derogatory descriptivism, derogatory truth, derogatory knowledge, and even derogatory objectivity. The same may be said of the quasi-realistic understanding of moral judgments: for example, the quasi-realist might be entitled to cognitivism when it comes to moral judgments, descriptivism when it comes to moral language, moral truth, moral knowledge, and the quasi-realist perhaps may even be entitled to moral objectivity. Analogously to the quasi-realism about derogatory judgments, Blackburn claims that quasi-realists are entitled to all these, without being committed to the existence of moral facts as part of the supposed fabric of the world.

Blackburn’s derogatory judgments argument goes something like this: “Kraut” is an inherently derogatory expression. The judgment “Franz is a Kraut” is a cognitive state just like ordinary non-derogatory beliefs. It consists partly of the judgment that Franz is German. The sentence or utterance “Franz is a Kraut” expresses a statement that describes how the world is. The Franz sentence expresses something true, namely, that Franz is a German insofar as it expresses nothing further about him. But the Franz sentence expresses more than just his nationality. It also expresses that Germans, including Franz, are fit objects of derision. We may call this additional part the “derogatory judgment” of the Franz sentence. The Franz sentence expresses something false because, according to Blackburn, the part that expresses the derogatory judgment is false. No one is a fit object of derision solely because of his nationality. Consequently, the Franz statement describes the world falsely.

What makes the Franz statement false? What makes the Franz statement false is twofold: 1) no one is a fit object of derision solely because of his nationality, so, the statement is false because it has failed to refer to anything; and 2) there is no person in the world toward whom it is appropriate to have the derogatory attitude and/or intention that is expressed by way of the Franz statement. The quasi-realist may maintain that the truth or falsity of the Franz statement is to be determined by the existence or non-existence of the person toward whom it is appropriate to have such an attitude. Since there is no such person, the Franz statement is false. That is to say, the speaker of the Franz sentence speaks falsely because she reports a state of affairs as actual that is non-actual, namely she is falsely reporting that it is appropriate to have derogatory attitudes toward some people solely because of their nationality, although she may be correctly identifying Franz’s nationality as German. Truth or falsity in derogatory judgments may be found in the way that they correspond or do not correspond to the world.

Analogously, quasi-realists may earn the right to maintain cognitivism when it comes to moral judgments, descriptivism, moral truths, moral knowledge, moral objectivity, and so on. For the quasi-realist, the inner workings of moral language are such that they afford realist-sounding expressions such as moral truths without ever accepting the realist ontology.

b. Quasi-Realism, Antirealism, and Explanationist Moral Realism

The quasi-realist paints a rosy philosophical picture in which one can enjoy realist-sounding luxuries while not multiplying entities beyond necessity. Nonetheless, the nagging question remains: is it not better to have a real thing than to have a quasi-real thing, especially when the theoretical price is right? We must challenge the quasi-realist’s entitlement to be regarded as the contemporary heir of moral antirealism, and examine her reasons for thinking that quasi-realism is true. It is ethical relativism that wins Harman antirealist entitlements. Blackburn earns his spurs through projectivism that eventually allows for the ontological parsimony. But why do quasi-realists think their particular brand of antirealism is true? Both Harman and Blackburn give a surprisingly unanimous explanation. They call it the explanatory inadequacy thesis of the moral and it addresses the comparative explanatory inferiority of moral facts, the total lack of explanatory power of moral facts, or explanatory reductionism.

For instance, according to Blackburn, projectivism must be true because “we need to explain the ban on mixed worlds, and the argument goes that antirealism [projectivism] does this better than realism” (1984, 184). Harman thinks that ethical relativism—the view that “there is no single true morality”—must be true because it is a “reasonable inference from the most plausible explanation of moral diversity” (Harman and Thomson 1996, 8). Harman’s reason is a version of the explanatory inadequacy of moral facts thesis. It is the inadequacy thesis that entitles the quasi-realist to the antirealist parsimony. To mark the moral realist territory in such a way that implies the irrelevance view (the view that the explanatory inadequacy of moral facts does not constitute evidence against moral realism) ignores the fact that it is primarily the inadequacy thesis that entitles the quasi-realist to anti-realism. The explanatory power of moral facts is the only realist doctrine that is immune from quasi-realist debunking.

It is puzzling for the quasi-realist to advance the explanatory inadequacy thesis since she has ample room for accommodating folk moral explanations. She only needs to appeal to the putative moral facts as though they are real. The “as though” attitude does a yeoman’s work. It gives her the right to use notions such as bivalence, moral truth, moral knowledge, and so on. It seems rather arbitrary to stop at accommodating moral explanations. The quasi-realist’s dismissive attitude toward moral explanations is the quasi-realist’s qualification as an antirealist.

3. Moral Realism after Quasi-Realism

Quasi-delicacies such as quasi-moral-truths, quasi-moral-knowledge, or quasi-moral-objectivity allow for contemporary antirealist ways, but moral realists surely cannot rest content with them. Moral realists must find a way not only of rejecting the quasi-realist’s debunking of the disagreements between the traditional realist and the antirealist, but also of establishing “real” moral comforts. Two ways in which moral realists attempt this are by asserting the existence of objective literal moral truths and by defending explanationist moral realism.

Figure 5 indicates an inflated way of establishing the realist’s ontological thesis, namely, that there are moral facts. On this inflated moral realism, the realist view turns out to be a jumble of four major theories in philosophy: cognitivism, descriptivism, literalism, and success theory. (The correspondence theory of truth is neither necessary nor sufficient for moral realism as we saw above.) Although the existence of objective literal moral truths may show that the aforementioned theories are jointly sufficient for moral realism, it ignores the quasi-realist’s ways of saying the realist-sounding things (the quasi-realist’s way of masquerading as a moral realist, if you will). A less inflated way of marking the realist territory would be advisable, should there be such a way. This is because quasi-realists insist that they are as much entitled to cognitivism, descriptivism, moral truth, moral knowledge and even moral objectivity as moral realists. Their insistence effectively thwarts realist attempts at marking their territory by relying on the traditional disagreement between realists and antirealists mapped in figure 5.

Explanationist moral realism has been suggested as a way of blocking the alleged quasi-realist masquerade. It focuses on the significance of having moral explanations. The explanationist moral realist holds that moral facts genuinely explain events and states of affairs in the world. In a rough and ready way, the explanationist realist maintains that there are moral facts because they explain non-moral events. However, her claim is debated even within the realist camp. Some moral realists consider that explanatory adequacy (or, inadequacy for that matter) is irrelevant in establishing the truth of moral realism; and, it is no easy task to show that moral facts are genuinely explanatory (or, that the quasi-realist’s accommodation of moral explanations is not as robust as she claims it to be). Nonetheless, since explanationist moral realism is much simpler than the inflated moral realism of figure 5, explanationist moral realism demands the realist’s close attention.

4. References and Further Reading

  • Alston, William P. 1996. A Realist Conception of Truth. Ithaca: Cornell University Press.
  • Ayer, A. J. 1952. Language, Truth, and Logic. New York: Dover Publications.
  • Blackburn, Simon. 1981. “Rule Following and Moral Realism,” in Holtzman and Leich (1981).
  • Blackburn, Simon. 1984. Spreading the Word: Groundings in the Philosophy of Language. Oxford: Oxford University Press.
  • Blackburn, Simon. 1993. Essays in Quasi-Realism. New York: Oxford University Press.
  • Blackburn, Simon. 1998. Ruling Passions: A Theory of Practical Reasoning. Oxford: Clarendon Press.
  • Blackburn, Simon, and Keith Simmons, eds. 1999. Truth. Oxford: Oxford University Press.
  • Brink, David O. 1989. Moral Realism and the Foundations of Ethics. Cambridge: Cambridge University Press.
  • Darwall, Stephen, Allan Gibbard, and Peter Railton. 1992. “Toward Fin de siècle Ethics: Some Trends,” The Philosophical Review, 101 (1):115-89.
  • Dodd, Julian. 2002. “Recent Work on Truth,” Philosophical Books, 43:279-91.
  • Fine, Kit. 2001. “The Question of Realism,” Philosophers’ Imprint 1 (1):1-30.
  • Geach, Peter. 1965. “Assertion,” The Philosophical Review, 74:449-465.
  • Gettier, E. L. 1963. “Is Justified True Belief Knowledge?” Analysis, 23 (6).
  • Gibbard, Allan. 1990. Wise Choices, Apt Feelings. Cambridge: Harvard University Press.
  • Goldman, Alvin I. 1978. “A Causal Theory of Knowing,” in Essays on Knowledge and Justification, edited by G. S. Pappas and M. Swain. Ithaca: Cornell University Press.
  • Goldman, Alvin I. 1986. “What is Justified Belief?” in Empirical Knowledge: Readings in Contemporary Epistemology, edited by P. K. Moser: Rowman & Littlefield Publishers, Inc.
  • Hare, R. M. 1952. The Language of Morals. Oxford: Oxford University Press.
  • Harman, Gilbert. 1977. The Nature of Morality. New York: Oxford University Press.
  • Harman, Gilbert. 1986. “Moral Explanations of Natural Facts—Can Moral Claims Be Tested Against Moral Reality?” The Southern Journal of Philosophy, XXIV (Supplement):57-68.
  • Harman, Gilbert. 2000. Explaining Value and Other Essays in Moral Philosophy. Oxford: Oxford University Press.
  • Harman, Gilbert, and Judith Jarvis Thomson. 1996. Moral Relativism and Moral Objectivity. Cambridge: Blackwell.
  • Hatzimoysis, Anthony. 1997. “Minimalism about Truth and Ethical Cognitivism,” in Analyomen, 2, Volume III: Philosophy of Mind, Practical Philosophy, Miscellanea, edited by G. Meggle. de-Gruyter: Hawthorne.
  • Horgan, Terence, and Mark Timmons. 2000. “Nondescriptivist Cognitivism: Framework for a New Metaethic,” Philosophical Papers, 29:121-153.
  • Horwich, Paul. 1998. Truth. 2nd ed. Oxford: Clarendon Press.
  • Kim, Jaegwon. 1988. “What is ‘Naturalized Epistemology’?” Philosophical Perspectives 2 (Epistemology):381-405.
  • Kupperman, Joel J. 1988. “Ethical Fallibility,” Ratio 1:33-46.
  • Lynch, Michael P. 1997. “Critical Study: Minimal Realism or Realistic Minimalism?” The Philosophical Quarterly 47 (189):512-518.
  • Piker, Andrew. 1995. “W. D. Falk’s Alternative to Moral Realism and Anti-Realism,” Auslegung 20 (2):100-105.
  • Plantinga, Alvin. 1993a. Warrant: the Current Debate. New York: Oxford University Press.
  • Plantinga, Alvin. 1993b. Warrant and Proper Function. New York: Oxford University Press.
  • Quine, W. V. O. 1986. “Epistemology Naturalized,” in Empirical Knowledge: Readings in Contemporary Epistemology, edited by P. K. Moser: Rowman & Littlefield Publishers, Inc.
  • Sayre-McCord, Geoffrey. 1988. “The Many Moral Realisms,” in Essays on Moral Realism. Ithaca, NY: Cornell University Press.
  • Skorupski, John. 1999. “Irrealist Cognitivism,” Ratio XII:436-459.
  • Stevenson, C. L. 1937. “The Emotive Meaning of Ethical Terms,” Mind 46:14-31.
  • Stevenson, C. L. 1944. Ethics and Language. New Haven: Yale University Press.
  • Stevenson, C. L. 1963. Facts and Values. New Haven: Yale University Press.
  • Sturgeon, Nicholas L. 1986. “Harman on Moral Explanations of Natural Facts,” The Southern Journal of Philosophy XXIV (Supplement):69-78.
  • Tenenbaum, Sergio. 1996. “Realists without a Cause: Deflationary Theories of Truth and Ethical Realism,” Canadian Journal of Philosophy 26 (4):561-90.
  • Waller, Bruce N. 1994. “Noncognitivist Moral Realism,” Philosophia 24 (1-2):57-75.
  • Wedgwood, Ralph. 2007. The Nature of Normativity. Oxford: Oxford University Press.
  • Wright, Crispin. 1992. Truth and Objectivity. Cambridge: Harvard University Press.
  • Wright, Crispin. 1993. “Realism: The Contemporary Debate: Whither Now?” in Reality, Representation and Projection, edited by J. Haldane and C. Wright. Oxford: Oxford University Press.
  • Wright, Crispin. 1999. “Truth: A Traditional Debate Reviewed,” in Blackburn and Simmons (1999).

Author Information

Shin Kim
Email: skim@hufs.ac.kr
Hankuk University of Foreign Studies
Korea

The Phenomenological Reduction

There is an experience in which it is possible for us to come to the world with no knowledge or preconceptions in hand; it is the experience of astonishment. The “knowing” we have in this experience stands in stark contrast to the “knowing” we have in our everyday lives, where we come to the world with theory and “knowledge” in hand, our minds already made up before we ever engage the world. However, in the experience of astonishment, our everyday “knowing,” when compared to the “knowing” that we experience in astonishment, is shown up as a pale epistemological imposter and is reduced to mere opinion by comparison.

The phenomenological reduction is at once a description and prescription of a technique that allows one to voluntarily sustain the awakening force of astonishment so that conceptual cognition can be carried throughout intentional analysis, thus bringing the “knowing” of astonishment into our everyday experience. It is by virtue of the “knowing” perspective generated by the proper performance of the phenomenological reduction that phenomenology claims to offer such a radical standpoint on the world phenomenon; indeed, it claims to offer a perspective that is so radical, it becomes the standard of rigor whereby every other perspective is judged and by which they are grounded. In what follows there will be close attention paid to correctly understanding the rigorous nature of the phenomenological reduction, the epistemological problem that spawned it, how that problem is solved by the phenomenological reduction, and the truly radical nature of the technique itself.

In other words, the phenomenological reduction is properly understood as a regimen designed to transform a philosopher into a phenomenologist by virtue of the attainment of a certain perspective on the world phenomenon. The path to the attainment of this perspective is a species of meditation, requiring rigorous, persistent effort and is no mere mental exercise. It is a species of meditation because, unlike ordinary meditation, which involves only the mind, this more radical form requires the participation of the entire individual and initially brings about a radical transformation of the individual performing it similar to a religious conversion. Husserl discovered the need for such a regimen once it became clear to him that the foundation upon which scientific inquiry rested was compromised by the very framework of science itself and the psychological assumptions of the scientist; the phenomenological reduction is the technique whereby the phenomenologist puts him or herself in a position to provide adequately rigorous grounds for scientific or any other kind of inquiry.

Table of Contents

  1. Introduction
  2. Historical Background of the Phenomenological Reduction
    1. Husserl’s Early Works
    2. Husserl’s Later Works
  3. The Epistemological Problem the Phenomenological Reduction Aims to Solve
  4. The Analysis That Disclosed the Need for the Reduction
    1. The Self-Refutation of the Sciences
    2. The Reduction Prefigured
  5. The Structure, Nature and Performance of the Phenomenological Reduction
    1. The Structure of the Phenomenological Reduction
      1. The Two Moments of the Phenomenological Reduction
        1. The Epoché
        2. The Reduction Proper
    2. The Nature of the Phenomenological Reduction
      1. Self-Meditation Radicalized
      2. Radical, Rigorous, and Transformative
    3. The Performance of the Phenomenological Reduction
      1. Self-Meditation
  6. How the Reduction Solves the Epistemological Problem
    1. The Problem of Constitution
    2. The Reduction and the Theme of Philosophy
  7. References and Further Reading

1. Introduction

The phenomenological reduction is the meditative practice described by Edmund Husserl, the founder of phenomenology, whereby one, as a phenomenologist, is able to liberate oneself from the captivation in which one is held by all that one accepts as being the case. According to Husserl, once one is liberated from this captivation-in-an-acceptedness, one is able to view the world as a world of essences, free from any contamination that presuppositions of conceptual framework or psyche might contribute. Many have variously misunderstood the practice of the phenomenological reduction, not in the sense that what they are doing is wrong, but in the sense that they do not take what they do far enough; this article will acquaint the reader with the extent to which Husserl and Fink’s original account intended the performance of the reduction to be taken.

The procedure of the phenomenological reduction emerges in Husserl’s thought as a necessary requirement of the solution he proposed to a problem that he himself had raised with respect to the adequacy of the foundation upon which scientific inquiry rests. Thus, if we are ever to achieve an appropriate level of appreciation for the procedure of the phenomenological reduction, we must begin by acquainting ourselves with the role that Husserl sees it playing in his overall project of giving the sciences an adequate epistemological foundation. This problem of the foundation of scientific inquiry spans Husserl’s entire career from his early to later work; we see its beginning arguments in Logical Investigations, one of his earlier works, and we also see it playing a prominent role later in his career as it dominates one of his latest works, The Crisis of European Sciences and Transcendental Phenomenology. Accordingly, this article will take as themes for its major divisions: 1) the historical background of the phenomenological reduction, 2) Husserl’s analysis of the foundation of scientific inquiry that demonstrates a need for the phenomenological reduction, and 3) the structure, nature, and performance of the phenomenological reduction.

The section on the historical background of the phenomenological reduction will serve to show that this procedure does not arrive as “a bolt out of the blue,” as it were; rather, it appears as the logically required solution to a specific problem. The problem that it addresses is the problem of the adequacy of the foundations of scientific inquiry. To illustrate Husserl’s misgivings with the foundations of scientific inquiry, consider the logical relationship between the axioms of geometry and its theorems and proofs. The point of doing proofs in geometry is to show that each theorem of geometry is adequately grounded in the axioms, that which is taken as being “given” in geometry. In scientific inquiry, what scientists take as being given is the natural world and the things in that world; consequently, those things and the world itself are never questioned but taken to be the logical bedrock upon which the subsequent scientific investigations are based. In other words, scientists take the world to be their axioms; and it is this axiomatic status that Husserl throws into question when he shows that the results of scientific investigation are a function of both the architectonics of scientific hypotheses and the psychological coloring of the investigating scientist. For this reason, Husserl says that if we are ever to be able to access the pure world so that it can act as a proper foundation, we must strip away both of these qualifications and return to the “things themselves” [die Sachen selbst]. That is, we must return to the world as it is before it is contaminated by either the categories of scientific inquiry or the psychological assumptions of the scientist. The phenomenological reduction is the technique whereby this stripping away occurs; and the technique itself has two moments: the first Husserl names epoché, using the Greek term for abstention, and the second is referred to as the reduction proper, an inquiring back into consciousness.

2. Historical Background of the Phenomenological Reduction

a. Husserl’s Early Works

Since the main burden of this article lies in the specific area of the phenomenological reduction, it is not necessary to go into great detail regarding Husserl’s early work beyond noting that it dealt almost exclusively with mathematics and logic; and that it is the ground out of which his later thought grew. In his Philosophy of Arithmetic (1891), Husserl questions the psychological origin of basic arithmetical concepts such as unity, multiplicity, and number; a project that he pursues later into the Prolegomena to the Logical Investigations. In the former work, Husserl gives us an analysis of the origin of the authentic concept of number, i.e., number to be conceived intuitionally. It is here that Husserl pays special attention to the question of the foundation of abstraction for the basic arithmetical concepts. Thus, we find that Husserl’s early efforts at providing a subjective complement to objective logic led him to investigate the general a priori of correlation of cognition, of the sense of cognition and the object of cognition, and led him also to conceive an absolute science designed as a universal analysis of constitution in which the origins of objectivity in transcendental subjectivity are elucidated.

A crucial element of Husserl’s early work in the Philosophy of Arithmetic is his critique of psychologism; it is this critique that is continued in his Logical Investigations and that sets the stage for the emancipation of the formal-logical objects and laws from the psychological determinations that the then-current view imposed on them. However, this liberation was not Husserl’s ultimate goal, but merely the preparatory work for understanding the connection between pure logic and concrete (psychical, or rather phenomenological) processes of thinking, between ideal conditions of cognition and temporally individuated acts of thinking.

b. Husserl’s Later Works

It is owing to this goal that Husserl’s later work moves quickly away from the strictly logical and mathematical character of his early work and takes on a more transcendental character. Thus, the trend of Husserl’s thought moves from his critique of the psychologistic account of mathematical and logical objects to transcendental subjectivity by means of his persistent questioning of the foundation of knowledge. It is important to note that his questioning of the foundation of knowledge is not the same as the quest for certainty that characterizes much of modernist thought, a quest to which some philosophers believe Husserl’s American contemporary, John Dewey, presented successful objections in his The Quest for Certainty. Husserl’s quest was not for certainty but for the founding of the conditions for the possibility of knowledge. That is, he was not searching for an answer to the question: How do we know the tree is in the quad? He was seeking an answer to the question: How does it come about that consciousness can make contact with the tree in the quad? This is what was meant above when mention was made that Husserl’s ultimate goal was to understand the connection between pure logic and concrete processes of thinking.

In his dogged pursuit of an answer to this question, Husserl is pushed from the then current psychological theory to the object; from the object back to consciousness, and finally all the way back to transcendental consciousness and the emergence of the “ultimate question of phenomenology” regarding the phenomenology of phenomenology. It is this question of the phenomenology of phenomenology that dominates the inquiry into the nature of the phenomenological reduction that we find in Sixth Cartesian Meditation and in the articles that Eugen Fink wrote around 1933 and 1934 in his attempt to further explain the phenomenological philosophy of Edmund Husserl. However, what we need is a more finely tuned elucidation of the epistemological problem that was the initial impetus driving Husserl’s early efforts.

3. The Epistemological Problem the Phenomenological Reduction Aims to Solve

The prevailing epistemology in Husserl’s time was a neo-Kantian position; indeed, it was owing to the criticism brought against phenomenology by this cadre of philosophers that Eugen Fink was constrained to publish his very important article, “The Phenomenological Philosophy of Edmund Husserl and Contemporary Criticism” in the journal, Kant-Studien; Fink uses the locution “contemporary criticism” in his title as a euphemism for “neo-Kantians.” Roughly put, the Kantian epistemological model is one that strives to ameliorate the stark contrast between the position Descartes put forward and the one brought about by the criticism of his position in the writings of Locke, Berkeley, and Hume, to name a few; that is, Kant’s position is one that seeks an irenic modulation between the rationalists and the empiricists. Kant’s epistemology, however conciliatory toward each camp, still leaned heavily on certain aspects of Descartes’ thought; notably, the distinction between consciousness and object (mind and body), albeit in Kant’s terms this distinction was taken up as a distinction between a noumenal world and a phenomenal world—a difference that Kant bridged by means of the categories. The categories themselves were arrived at by asking the question: what would have to be the case in order for our experience of the world to be as it is? This question is commonly referred to as the question determining the conditions for the possibility of experience and more specifically as the Transcendental Deduction.

Husserl’s epistemological insight is that there is no such distinction between consciousness and object, as had been assumed by Descartes and subsequently taken up in a slightly different form by Kant. In Husserl’s thought, the terms “noesis” and “noema” do not so much identify distinct items set over against each other (e.g. consciousness and object) as much as they provide a linguistic vehicle to speak about the interpenetration of each by the other as aspects of a more inclusive whole, the Life-world—understood in its broadest sense. A key point made by Fink in his article for the neo-Kantians is that when we think of the world, it is always a world already containing us thinking it; this fact is overlooked by the Kantian picture of the world; a picture which assumes a perspective that is neither consciousness nor world but which sets each over against the other. For Kant, this imagined perspective is what gives us access to the distinction between the noumenal and phenomenal worlds; ironically, it is also this perspective that makes the transcendental deduction necessary, since the distinction between noumenal and phenomenal is a state of affairs to which we do not have direct access and must, of necessity, deduce it.

Husserl constructs his epistemological position by first noticing the very obvious fact that all consciousness is consciousness of something; and it is this insight that establishes the relationship between the noesis and noema. If knowledge is ever to be established at all, it must be established in consciousness; the epistemological problem, then, for Husserl is to describe consciousness, since without consciousness, no knowledge is possible. Or, to put a more Kantian spin on it, consciousness itself is the condition for the possibility of knowledge. Furthermore, since we are always already in a world, the first task of epistemology is to properly and accurately describe what is already the case; and we can do this only if we begin with a thorough examination of consciousness itself and carry that examination all the way back to the “I” in the “I Am.” Husserl speaks of going “back” [Rückfrage] because we must begin where we are; and where we are includes a sense of self whose identity is temporarily seated in the sedimented layers of consciousness built up through our temporal experiences. Hence, if we are to encounter the “I” we must dig back down through those layers or we must continually present ourselves with the question: who is “I”? as we consider the great variety of things with which we have identified. This questioning back is the method of the phenomenological reduction and aims to lay bare the “I,” the condition for the possibility of knowledge.

It is important to keep in mind that Husserl’s phenomenology did not arise out of the questioning of an assumption in the same way that much of the history of thought has progressed; rather, it was developed, as so many discoveries are, pursuant to a particular experience, namely, the experience of the world and self that one has if one determinedly seeks to experience the “I”; and, Hume notwithstanding, such an experience is possible.

4. The Analysis That Disclosed the Need for the Reduction

Although it is generally conceded that Husserl’s thought underwent a significant transformation from his early interests in logic and mathematics, as indicated in his “On the Concept of Number” and his Philosophy of Arithmetic, to his later transcendental interests, as indicated by The Crisis of European Sciences and Transcendental Phenomenology, the actual “turning point” is not so generally accepted. This is due, in part, to the fact that Husserl’s work can be viewed developmentally both according to the chronological appearance of his work and according to its systematic connections. Thus, the “development” of his thought can be seen either in terms of his published work, i.e., chronologically, or in terms of key systematic methodological concepts. Viewed chronologically, Bernet, Kern, and Marbach (Bernet, 1989) put the beginning of the split around 1915-1917, the last years Husserl spent at Göttingen, although it is clearly seen only in the early years of Husserl’s teaching at Freiburg (around 1917-1921) (p.1); but considered systematically, they say that the partition relates to the consistent extension of the research program of phenomenological philosophy towards a genetic-explanatory phenomenology as a supplement to the hitherto carried-out static-descriptive phenomenology (p.1). The terms “static,” “genetic,” and “generative” phenomenology refer to aspects of phenomenology that come into play after the reduction has been performed; however, they articulate distinctions that must be kept clearly in mind when evaluating phenomenological analyses.

In the early phases of his thinking, Husserl was concerned chiefly with the phenomenological-descriptive analysis of specific types of experiences and their correlates as well as with describing general structures of consciousness; he also aimed at the foundation and elaboration of the corresponding methodology (phenomenological reflection, reduction, and eidetics) (p.1). Similarly in the later phases of his thought, there is the attempt by means of genetic phenomenology to elucidate the concrete unification of experiencing in the personal ego and in the transcendental community of egos, or monads, as well as in the constitution of the correlative surrounding worlds and of the one world common to all (p.2).

For the purposes of tracing the development of the phenomenological reduction, I take the relevant period of the transformation of Husserl’s thought from early to late to be between 1900 and 1913; the two volumes of Logical Investigations were published in 1900 and 1901 but it wasn’t until the appearance of The Idea of Phenomenology in 1907 that many of the characteristic themes of phenomenology were explicitly articulated. This little volume was soon followed by the publication of “Philosophy as Rigorous Science” in 1911; and that by the publication of Ideas I in 1913, where the most explicit treatment, up to that time, of the main phenomenological themes is given.

a. The Self-Refutation of the Sciences

In order to grasp the full import of the move that Husserl makes to phenomenology, we must understand the arguments that motivate that move; and we get a glimpse of those arguments in his “Philosophy as Rigorous Science” published in 1911. In that article, Husserl’s chief aim is epistemological and expresses itself first as a critique of the natural sciences and psychology and then as an adumbration of a technique that later, in 1913 with the publication of Ideen I, would be termed the “epoché” or the “reduction.”

Husserl begins his critique of the natural sciences by noting certain absurdities that become evident when such naturalism is adopted in an effort to “naturalize” consciousness and reason; these absurdities are both theoretical and practical. Husserl says that when “the formal-logical principles, the so-called ‘laws of thought,’ are interpreted by naturalism as natural laws of thinking,” there occurs a kind of “inevitable” absurdity owing to an inherent inconsistency involved in the naturalist position. His claim in this article alludes to the more fully formed argument from volume 1 of his Logical Investigations (Husserl, 1970), which will be summarized here.

The natural sciences are empirical sciences and, as such, deal only with empirical facts. Thus, when the formal-logical principles are subsumed under the “laws of Nature” as “laws of thought,” this makes the “law of thought” just one among many of the empirical laws of nature. However, Husserl notes that “the only way in which a natural law can be established and justified, is by induction from the singular facts of experience” (p.99). Furthermore, induction does not establish the holding of the law, “only the greater or lesser probability of its holding; the probability, and not the law, is justified by insight” (p.99). This means that logical laws must, without exception, rank as mere probabilities; yet, as he then notes, “nothing, however, seems plainer than that the laws of ‘pure logic’ all have a priori validity” (p.99). That is to say, the laws of ‘pure logic’ are established and justified, not by induction, but by apodictic inner evidence; insight justifies their truth itself. Thus, Husserl remarks in “Philosophy as Rigorous Science” (1965) that “naturalism refutes itself” (p.80). It is this theoretical absurdity that leads to a similar absurdity in practice.

The absurdity in practice, says Husserl, becomes apparent when we notice that the naturalist is “dominated by the purpose of making scientifically known whatever is genuine truth, the genuinely beautiful and good; he wants to know how to determine what is its universal essence and the method by which it is to be obtained in the particular case” (pp.80-81). Thus, the naturalist believes that through natural science and through a philosophy based on the same science the goal has been attained; but, says Husserl, the naturalist is going on presuppositions; indeed, to the extent that he theorizes at all, it is just to that extent “that he objectively sets up values to which value judgments are to correspond, and likewise in setting up any practical rules according to which each one is to be guided in his willing and in his conduct” (p.81). It is this state of affairs that drives Husserl to the observation that the naturalist is “idealist and objectivist in the way he acts”; since both of these cannot be true at the same time, the naturalist is involved in an absurdity (p.80).

Husserl claims that the natural scientist is not outwardly aware of these absurdities owing to the fact that he “naturalizes reason” and, on this account, is blinded by prejudice. He adds, “One who sees only empirical science will not be particularly disturbed by absurd consequences that cannot be proved empirically to contradict facts of nature” (pp.81-82). This is not to say that Husserl is arguing against science as such; to the contrary, he says that there is “in all modern life no more powerfully, more irresistibly progressing idea than that of science” and that “with regard to its legitimate aims, it is all-embracing. Looked upon in its ideal perfection, it would be reason itself, which could have no other authority equal or superior to itself” (p.82). The problem is that naturalism, which wanted to establish philosophy both on a basis of strict science and as a strict science, appears completely discredited along with its method. To this point in the argument, Husserl has simply shown that the foundation upon which scientific inquiry rests is self-contradictory and fails to offer adequate grounding. So, if the natural scientist cannot provide us with a “rigorous science,” then what is needed and to whom can we look?

b. The Reduction Prefigured

Husserl’s idea is that the problems belonging to the domain of a “strict science,” namely, theoretical, axiological, and practical problems, give us a clue themselves as to the method required for their solution. He says, “through a clarification of the problems and through penetration into their pure sense, the methods adequate to these problems, because demanded by their very essence, must impose themselves on us” (p.83). It is for this reason that the refutation of naturalism based on its consequences, which he has just finished, accomplishes very little for him; what is important is the principiant critique of the foundations of naturalism, and by this he means that he wants to direct a critical analysis at the philosophy that believes “it has definitely attained the rank of an exact science” (p.84). So what Husserl will be putting to the test is the relative strength of the term “exact” when it is used in this context. It is not the case that Husserl thinks that a science of nature does not produce important results; he thinks it does. The problem, as Husserl sees it, is that a science of nature is inadequate if it is not ultimately grounded in a strictly scientific philosophy. Husserl is not criticizing the results of science (the structural design and dignity of the house that science built) but only the foundation upon which those results rest.

With respect to the foundation, Husserl says that all natural science is naïve in regard to its point of departure because the nature that it investigates “is for it simply there.” In other words, the things that natural science investigates are its foundation because they mark the point of departure for natural science. These things are simply taken for granted uncritically as being there and “it is the aim of natural science to know these unquestioned data in an objectively valid, strictly scientific manner” (p.85). The same holds true for psychology in its domain of consciousness. It is the task of psychology “to explore this psychic element scientifically within the psychophysical nexus of nature, to determine it in an objectively valid way, to discover the laws according to which it develops and changes, comes into being and disappears” (p.86). Even where psychology, as an empirical science, concerns itself with determinations of bare events of consciousness and not with dependencies that are psychophysical, “those events are thought of, nevertheless, as belonging to nature, that is, as belonging to human or brute consciousnesses that for their part have an unquestioned and co-apprehended connection with human and brute organisms” (p.86). Thus, he states that “every psychological judgment involves the existential positing of physical nature, whether expressly or not” (p.86).

This uncritical acceptance is also reflected in the naïveté that characterizes natural science since at every place in its procedure it accepts nature as given and relies upon it when it performs experiments. Thus, ultimately, every method of experiential science leads back precisely to experience. But isolated experience is of no worth to science; rather, “it is in the methodical disposition and connection of experiences, in the interplay of experience and thought which has its rigid logical laws, that valid experience is distinguished from invalid, that each experience is accorded its level of validity, and that objectively valid knowledge as such, knowledge of nature, is worked out” (p.87). Although this critique of experience is satisfactory, says Husserl, as long as we remain within natural science and think according to its point of view, a completely different critique of experience is still possible and indispensable. It is a critique that places in question all experience as such as well as the sort of thinking proper to empirical science (p.87).

For Husserl, this is a critique that raises questions such as: “how can experience as consciousness give or contact an object? How can experiences be mutually legitimated or corrected by means of each other, and not merely replace each other or confirm each other subjectively? How can the play of a consciousness whose logic is empirical make objectively valid statements, valid for things that are in and for themselves? Why are the playing rules, so to speak, of consciousness not irrelevant for things?” It is by means of these questions that Husserl hopes to highlight his major concern of how it is that natural science can be comprehensible in every case, “to the extent that it pretends at every step to posit and to know a nature that is in itself—in itself in opposition to the subjective flow of consciousness” (p.88). He says that these questions become riddles as soon as reflection upon them becomes serious and that epistemology has been the traditional discipline to which these questions were referred, but epistemology has not answered the call in a manner “scientifically clear, unanimous, and decisive.”

To Husserl, this all points to the absurdity of a theory of knowledge that is based on any psychological theory of knowledge. He punctuates this claim by noting that if certain riddles are inherent, in principle, to natural science, then “it is self-evident that the solution of these riddles according to premises and conclusions in principle transcends natural science.” He adds that “to expect from natural science itself the solution of any one of the problems inherent in it as such—thus inhering through and through, from beginning to end—or even merely to suppose that it could contribute to the solution of such a problem any premises whatsoever, is to be involved in a vicious circle” (pp.88-89).

With this being the case, it becomes clear to Husserl that every scientific, as well as every pre-scientific, application of nature “must in principle remain excluded in a theory of knowledge that is to retain its univocal sense. So, too, must all expressions that imply thetic existential positings of things in the framework of space, time, causality, etc. This obviously applies also to all existential positings with regard to the empirical being of the investigator, of his psychical faculties, and the like” (p.89). It is here, in this passage, that we see the formal beginnings of what will later be termed the “epoché” and “reduction” in Ideen I.

Husserl is advocating a theory of knowledge that will investigate the problems of the relationship between consciousness and being in a way that excludes, not only the “thetic existential positings of things in the framework of space, time, causality, etc.,” but also the “existential positings” and “psychical faculties” of the investigator. In other words, he wants to separate the subject matter he is investigating from both the theoretical framework of science and the coloring with which any investigator might qualify it. But to do so, knowledge theory can have before its eyes “only being as the correlate of consciousness: as perceived, remembered, expected, represented pictorially, imagined, identified, distinguished, believed, opined, evaluated, etc.” And for Husserl, this means that the investigation must be directed “toward a scientific essential knowledge of consciousness, toward that which consciousness itself ‘is’ according to its essence in all its distinguishable forms” (p.89). Husserl also notes that the investigation must also be directed toward “what consciousness ‘means,’ as well as toward the different ways in which—in accord with the essence of the aforementioned forms—it intends the objective, now clearly, now obscurely, now by presenting or by presentifying, now symbolically or pictorially, now simply, now mediated in thought, now in this or that mode of attention, and so in countless other forms, and how ultimately it ‘demonstrates’ the objective as that which is ‘validly,’ ‘really’” (p.89).

To summarize, what Husserl wants to do is to provide an unshakable ground for science, so as to make it “rigorous” and “exact.” He dismisses the efforts of both science and psychology to provide such a ground owing to the fact that the “riddles” inherent in each necessarily put the solution outside of their reach. He also notes that the traditional discipline of epistemology has failed to do this and suggests that what is needed is an investigation that is directed toward “a scientific essential knowledge of consciousness, toward that which consciousness itself ‘is’ according to its essence in all its distinguishable forms.” Furthermore, this can only be done if we separate the matter in question from the qualifications imposed on it by either the theoretical framework of science or the existential “positings” of the investigator. In other words, we must return to the matters in question, as they are themselves; and the procedure whereby this is accomplished is phenomenology, specifically, the phenomenological reduction.

5. The Structure, Nature and Performance of the Phenomenological Reduction

a. The Structure of the Phenomenological Reduction

i. The Two Moments of the Phenomenological Reduction

What actually occurs when one undertakes to perform the reduction can be discerned by giving careful attention to the things Husserl and Fink have said about it; but let me first address some terminological concerns regarding two key concepts. In Sixth Cartesian Meditation (Fink, 1995), Fink tells us “epoché and the action of the reduction proper are the two internal basic moments of the phenomenological reduction, mutually required and mutually conditioned” (p.41). This passage alerts us to the fact that the locution, phenomenological reduction, denotes two separate “moments,” each of which requires and conditions the other. Thus, in speaking of “the reduction” one needs to be careful to specify whether it is the reduction proper, which is only one of the two moments, that is meant, or whether one means the entire operation of the phenomenological reduction.

Let me also draw attention to the term “moments” here because, in order to get an accurate conception and understanding of the phenomenological reduction, we must see that it is not done in two “steps.” The moments are internal logical moments and do not refer to two “steps” that one might take to conclude the procedure as one might do, for example, in waxing a floor: where the first step is to strip off the old wax and the second step is to apply the new wax; steps imply a temporal individuation that is not true of the moments of the phenomenological reduction. Husserl’s term, epoché, the negative move whereby we bracket the world, is not a “step” that we do “first” in an effort to prepare ourselves for the later “step,” reduction proper; rather, the bracketing and the move whereby we drive the self back upon itself, the reduction proper, occur together.

There were many during his day who misunderstood what Husserl and Fink were trying to communicate; and I think part of what might have contributed to this misunderstanding is that Husserl’s readers thought that the reduction was a “two-step” process conducted wholly within the realm of the mind or imagination, not requiring any other kind of bodily participation.

1) The Epoché

Husserl’s insight is that we live our lives in what he terms a “captivation-in-an-acceptedness;” that is to say, we live our lives in an unquestioning sort of way by being wholly taken up in the unbroken belief-performance of our customary life in the world. We take for granted our bodies, the culture, gravity, our everyday language, logic and a myriad other facets of our existence. All of this together is present to every individual in every moment and makes up what Fink terms “human immanence”; everyone accepts it and this acceptance is what keeps us in captivity. The epoché is a procedure whereby we no longer accept it. Hence, Fink notes in Sixth Cartesian Meditation: “This self consciousness develops in that the onlooker that comes to himself in the epoché reduces ‘bracketed’ human immanence by explicit inquiry back behind the acceptednesses in self-apperception that hold regarding humanness, that is, regarding one’s belonging to the world; and thus he lays bare transcendental experiential life and the transcendental having of the world” (p.40). Husserl has referred to this variously as “bracketing” or “putting out of action” but it boils down to the same thing: we must somehow come to see ourselves as no longer of this world, where “this world” means to capture all that we currently accept.

At this point it may prove prudent to head off some possible misunderstandings with respect to the epoché. Perhaps the most frequent error concerns its role in the abstention from belief in the world. Here it is important to realize two things. The first is that withdrawal of belief in the world is not a denial of the world. Abstaining from belief in the world’s existence is not the same as denying its existence; indeed, the whole point of the epoché is that it is neither an affirmation nor a denial of the existence of the world. In fact, says Fink, “the misunderstanding that takes the phenomenological epoché to be a straightforwardly thematic abstention from belief (instead of understanding it as transcendentally reflective!) not only has the consequence that we believe we have to fear the loss of the thematic field, but is also intimately connected with a misunderstanding of the reductive return to constituting consciousness” (p.43). The second thing has to do with who it is that is doing the abstaining, and this directly concerns the moment of the reduction proper.

2) The Reduction Proper

The second moment of the phenomenological reduction is what Fink terms the “reduction proper;” he says, “under the concept of ‘action of reduction proper’ we can understand all the transcendental insights in which we blast open captivation-in-an-acceptedness and first recognize the acceptedness as an acceptedness in the first place” (p.41). If the epoché is the name for whatever method we use to free ourselves from the captivity of the unquestioned acceptance of the everyday world, then the reduction is the recognition of that acceptance as an acceptance. Fink adds, “abstention from belief can only be radical and universal when that which falls under disconnection by the epoché comes to be clearly seen precisely as a belief-construct, as an acceptedness.” It is the seeing of the acceptance as an acceptance that is the indication of having achieved a transcendental insight; it is transcendental precisely because it is an insight from outside the acceptedness that is holding us captive. It should be kept in mind that the “seeing” to which Fink refers is not a “knowing that” we live in captivation-in-an-acceptedness, since this can be achieved in the here and now by simply believing that Fink is telling the truth; the kind of “seeing” to which Fink refers is rather more like the kind of seeing that occurs when one discovers that the mud on the carpet was put there by oneself and not by another, as was first suspected.

Thus, as Fink points out, it is through the reductive insight into the transcendental being-sense of the world as “acceptedness” that “the radicality of the phenomenological epoché first becomes possible;” but “on the other hand, the reduction consistently performed and maintained, first gives methodic certainty to the reductive regress” (p.41). Taken together, the epoché and the reduction proper comprise the technique referred to as the phenomenological reduction; since these two moments cannot occur independently, it is easy to see how the single term, “reduction,” can come to be the term of preference to denote the whole of the phenomenological reduction.

Fink also brings out a misunderstanding relating to the reduction proper, which is that it is taken as a species of speculation: “hand in hand with this misunderstanding of the epoché goes a falsification of the sense of the action of reduction proper (the move back behind the self-objectivation of transcendental subjectivity). The latter is rejected as speculative construction, for instance when one says: in actuality the phenomenologist has no other theme than human inwardness” (p.47). To think that there is such reinterpretation or speculation is to miss the point of the reduction proper; that is, it is to miss the fact that it interrogates man and the world and makes them the theme of a transcendental clarification. It is precisely the world phenomenon, or “being”, which is bracketed.

According to Fink and Husserl, the phenomenological reduction consists in these two “moments” of epoché and reduction proper; epoché is the “moment” in which we abandon the acceptedness of the world that holds us captive and the reduction proper indicates the “moment” in which we come to the transcendental insight that the acceptedness of the world is an acceptedness and not an absolute. The structure of the phenomenological reduction has belonging to it the human I standing in the natural attitude, the transcendental constituting I, and the transcendental phenomenologizing I, also called the onlooker or spectator. Fink says that “the reducing I is the phenomenological onlooker. This means he is, first, the one practicing the epoché and then the one who reduces, in the strict sense” (p.39).

Thus, it is by means of the epoché and reduction proper that the human I becomes distinguished from the constituting I; it is by abandoning our acceptance of the world that we are enabled to see it as captivating and hold it as a theme. It is from this perspective that the phenomenologist is able to see the world without the framework of science or the psychological assumptions of the individual.

b. The Nature of the Phenomenological Reduction

The phenomenological reduction is a radical, rigorous, and transformative meditative technique. To illustrate this, let me turn to comments that Fink makes in his “What Does the Phenomenology of Edmund Husserl Want to Accomplish: The Phenomenological Idea of Laying a Ground” (Fink, 1966/1972; German/English).

i. Self-Meditation Radicalized

The most important point to be made in reference to the nature of the phenomenological reduction is that it is a meditative technique and not a mere mental or imaginative technique. Furthermore, it is a self-meditation that has been radicalized. Fink introduces this in his discussion of laying a ground. He says that “the laying-of-a-ground of a philosophy is the original beginning of the philosopher himself, not with and for others but for himself alone; it is the disclosing of the ground which is capable of bearing the totality of a philosophical interpretation of the world” (p.161/11). In this passage we can plainly see that the ground of which Fink is speaking is not considered to be propositions, ideas, or anything else of that sort; rather the ground is precisely the philosopher him or herself. Thus, Fink says, “it is a fateful error to suppose that the principles, in accordance with which a ground-laying of philosophy is to proceed, would be present—transported, as it were, from the conflict of philosophers—as a normative ideal prior to and outside of philosophy” (p.161/11). Hence, regardless of “how such a ground-laying is carried out—be it as a return to the concealed, a priori law-giving of reason, or be it as a progression towards essentials, and the like—the meditation [die Besinnung], in which such a ground-laying is carried out, is always the first, fundamental decision of a philosophizing” (p.161/11).

Unless the term “meditation,” as Fink uses it in this context, springs out at one when reading it, the heart of this passage is likely to be misunderstood. Here there is a clear connection being established between some meditative practice [Besinnung] and the laying of a ground for philosophy. It is important to draw attention to this feature since we typically think of axioms or assumptions when we assay to discern the foundation of a philosophy; but Fink is making a clear break with that practice, holding instead that the first, fundamental decision of a philosophizing is “the meditation, in which a ground-laying is carried out” [“immer ist die Besinnung, in der sich eine solche Grundlegung vollzieht, die erste grundsätzliche Entscheidung eines Philosophierens.”] (p.162/11).

Fink adds to this by noting that “the commencement of the idea of laying-a-ground, which determines a philosophy, is always already the implicit (and perhaps only obscurely conscious) fore-grasp upon the system. Thus in embryonic form, the idea of the system is sketched out in the idea of laying-a-ground” (p.162/11). In other words, the idea of the ground-laying works itself out in whatever philosophy it grounds; the philosophy is itself pre-figured in the ground-laying and reflects it.

He explains this pre-figuring further by saying that, in the case of the philosophy of Husserl, the idea of the ground-laying working itself out “can, at first, be made understandable from the pathos of phenomenology, that is, from the deportment of the human existence lying at its ground” (p.162/11). Fink allows that this pathos is “in no way a specifically ‘phenomenological’ one, but is, rather, the constant pathos of every philosophy which, when taken seriously in a particular, inexorable way, must lead to phenomenology itself” (p.162/11). Indeed, this pathos is “nothing other than the world-wide storm of the passion of thinking which, extending out into the totality of entities and grasping it, subjects it to the spirit” (p.163/11). Fink is saying here that the will, as the pathos of philosophy, is “resolved to understand the world out of the spirit [die Welt aus dem Geist zu verstehen],” which does not mean the “naïve belief in a pre-given and present-at-hand ‘spiritual sense’ of the world, but solely the willingness to bring the spirit first to its realization precisely through the knowledge of the All of entities” (p.163/12).

Although this passage would seem to indicate the crassest “intellectualism,” since it seems to be saying that knowledge is the main operative process, Fink is insistent that neither the “‘rationalistically’ claimed self-certainty of the spirit” (here read Descartes), nor “the fascination with chaos” (read Nietzsche) that “all too easily is transformed into a defeatism of reason,” captures what he means. Rather, he says, “precisely in the face of chaos, standing fast against it, the philosopher ventures the spiritual conquering of the entity; he raises the claim of a radical and universal knowledge of the world” (p.164/12). If we inquire as to how it is possible that spirit can maintain itself and its claim, or whether it has itself already become a “ground experience”; whether we “Know what authentically is ‘spirit’” or what the true power of philosophizing existence is, Fink tells us: “Understanding itself in the passion of thinking, the pathos of the one who is philosophizing is cast back upon itself: it radicalizes itself into self-meditation [Selbstbesinnung], as into the way in which the spirit [der Geist] experiences itself. The phenomenological philosophy of Husserl lives in the pathos of that self-realization of the spirit [der Geist] which takes place in self-meditation” (p.164/13). Indeed, “the idea of the ground-laying of philosophy peculiar to phenomenology is the idea of the pure and persistent self meditation [der reinen und konsequenten Selbstbesinnung]” (p.164/13).

Although, as Fink notes, in the subjective mode of self-meditation, every philosophy carries out the business of laying a ground; “phenomenology is also materially grounded exclusively on self-meditation [gründet auch sachlich ausschließlich auf Selbstbesinnung]” (p.164/13). What Fink means here by using the term “exclusively” is that “from the very beginning phenomenology foregoes ever abandoning the deportment of pure self-meditation in favor of an objective deportment. It wants to be grounded solely upon the results of a radical and persistent self-meditation and to establish upon them the entirety of its philosophical system” (p. 164/13). Hence, for phenomenology, self-meditation is not a “mere subjective method for disclosing, as the ground and basis of the philosophical interpretation of the world, an objectivity sketched out in our spirit, for example, the objective essence of reason; rather it re-delineates the sole fundamental realm in which the philosophical problem of the world can arise” (p.164/13). Thus, in phenomenology “the concept of ‘ground,’ in return to which the philosophical grasping of the world realizes itself, has lost its usual ‘objective’ sense precisely through the persistent adherence to self-meditation, carried out with a certain radicalism of ‘purity,’ as the exclusive thematic source of philosophy” (p.165/13). Fink adds: “The ground, posited in the phenomenological idea of laying-a-ground, is the ‘self’ which uncovers itself only in pure self-meditation” (p.165/13-14).

The general logical form of this argument will reappear in 1954 with the publishing of The Crisis of European Sciences and Transcendental Phenomenology. There the argument is made that the sciences not only take the everyday life-world for granted, the everyday life-world is actually the ground for all that the sciences do because it is from there that they take their starting point. In a similar move of reasoning, the argument in this article is aimed at drawing attention to the obvious fact that the philosopher is always the real ground for any philosophy; and that if we wish, as it were, to ground that ground, we must embark on a procedure of self-meditation—indeed, if rigor is to be maintained, we are required to undertake such a course of action.

Of course, a number of questions immediately surround the suggestion of “self-meditation,” all of which derive from “the naïve and familiar, pre-given concept of ‘self-meditation’”; but it is precisely this concept that must be transformed, says Fink: “the dimension of philosophy can be attained only in the radical change of self-meditation from the indeterminateness of the preliminary, still unclarified concept into the determined phenomenological setting” (p.165/14). Thus, the former questions are now transformed into questions such as: How can this change be accomplished, and what must the nature of self-meditation be, such that, precisely in the thematization of the self, the question of the totality of entities is included and traced out in its fundamental solution? Fink’s response is that to this there is only one answer: “the transformation of the idea of the common self-meditation happens eo ipso in an extremely intensified taking of self-meditation seriously. The seriousness demanded here wants nothing less than to expose the spirit to a ground-experience which will bring it back into the power of the essence that is purely proper to it. In the self-meditation radicalized into the ‘phenomenological reduction,’ the spirit should accomplish a movement towards itself, should come unto itself” (p.165/14). But in what sense is this self-meditation radical?

ii. Radical, Rigorous, and Transformative

Some today have misunderstood the phenomenological reduction and it is probable that this failure to grasp what Husserl has discovered is partly owing to the radical nature of Husserl’s project being completely missed. Fink pieces together the very analysis of the reduction that is wanted here if we are ever to disabuse ourselves of the view that the reduction is nothing more than a mere incantation or formal condition—a mental exercise.

This type of misunderstanding of the nature of phenomenology is not something new; Fink himself made explicit reference to its breadth, even as late as 1934 when this article was originally published, saying: “The contemporary judgment of the phenomenological philosophy of Husserl fails, almost without exception, to recognize its true meaning” (Accomplish, p. 6). He then cites examples, noting that “Husserl is judged, admired and reproached sometimes as an eidetician and logician, at other times as a theoretician of knowledge, on the one hand, as an ontologist giving word to the ‘matters themselves,’ and, on the other hand, as an ‘Idealist.’ Thereby, every such Interpretation is capable, with moderate violence, of ‘proving’ itself from his writings. The authentic and central meaning of Edmund Husserl’s philosophy is today still unknown” (p. 6). Fink attributes this lack of authentic understanding, not to a lack of willingness to understand on the part of the community of readers, but, to the essence of phenomenology itself. So, the important question is: what is it about the essence of phenomenology that makes it so difficult for the devotee to come away with an authentic understanding of it?

According to Fink, we find the answer to this question by considering the fact that the appropriation of the true meaning of phenomenology “cannot at all come about within the horizon of our natural deportment of knowledge. Access to phenomenology demands a radical reversal of our total existence reaching into our depths, a change of every pre-scientifically-immediate comportment to world and things as well as of the disposition of our life lying at the basis of all scientific and traditionally-philosophical attitudes of knowledge” (p. 6).

Nearly everyone who has had even a casual acquaintance with Husserl’s writings has read something akin to this passage somewhere, claiming the radicality of what phenomenology attempts. Husserl is continually drawing our attention to the radical nature of phenomenology and how it affects all of our scientific knowledge and understanding; indeed, he emphasizes how it grounds that very knowledge and understanding. The important thing to notice with regard to such passages, however, is that the misunderstanding of phenomenology arises precisely because the notions of “radical” that would-be readers employ as a hermeneutical guide in their efforts to come to an authentic appreciation of the practice of phenomenology fail to capture all that Husserl intends by his use of the term, and this in spite of the fact that he tells us, time and again, that his use of the term “radical” is new.

Consider, for instance, Husserl’s introduction to the Cartesian Meditations where he expounds on the need for a “radical new beginning” of philosophy saying, “to renew with greater intensity the radicalness of their spirit, the radicalness of self-responsibility, to make that radicalness true for the first time by enhancing it to the last degree…” (Cartesian Meditations, p. 6). Husserl’s emphatic demand that the radicalness become true “for the first time” indicates that his sense of “radical” is much more radical than might ordinarily be thought. Again, in Sixth Cartesian Meditation we read, “This is the problem of the proper methodological character of the phenomenological fore-knowledge that first makes it possible to pose the radical questions—in a new sense of ‘radical’—, to provide the motive for performing the phenomenological reduction” (Sixth, p. 36). Here we see an explicit mention of the fact that the term “radical” is being employed in a “new” sense.

Thus, when some misunderstand the reduction, they are most probably not taking seriously Husserl’s claim of radicality; that is, they have not understood exactly how extreme Husserl’s sense of the term is. If, however, we take a close look at Fink’s development and analysis of phenomenology in this article and pay close attention to the intensity of the language he uses in relation to it, we can remedy this deficiency quite easily, but not without also considering the rigor required to perform the phenomenological reduction.

One important feature of the way Fink sets up his discussion of the ground and his illustration of the rigor required in the performance of the phenomenological reduction is his dramatic use of Plato’s allegory of the cave. He says, “the violence, tension and struggle of the accomplishment of philosophizing symbolized in this allegory also determines the phenomenological philosophy of Edmund Husserl” (Accomplish, p. 160/9). If there is any doubt as to how we should understand the terms “violence” and “struggle,” as he uses them in this context, Fink dispatches it immediately with the following: “The philosophical ‘unchaining,’ the tearing oneself free from the power of one’s naïve submission to the world, the stepping-forth from out of that familiarity with entities which always provides us with security, in one word, the phenomenological ‘epoché,’ is anything but a noncommittal, ‘merely’ theoretical, intellectual act; it is rather a spiritual movement of one’s self encompassing the entire man and, as an attack upon the ‘state-of-motionlessness’ supporting us in our depths, the pain of a fundamental transformation down to our roots” (p. 160-1/9). It should be clear that Fink’s use of terms such as “violence,” “struggle,” “unchaining,” “pain,” and “fundamental transformation” indicates a much more rigorous project than armchair philosophy has been wont to allow up to this point. But what is it that makes it so rigorous? What is it that we do when we perform the phenomenological reduction?

We get a preliminary description of what is required from Fink: “Our era can really attain to Husserl’s philosophy, which down to today is still unknown and ungrasped, only by ascending out of the cave of world-constraint, by passing through the pain of self-releasement—and not through ‘critiques’ that are thoroughly bound to the naïve understanding of the world, enslaved to the natural thought-habits and entangled in the pre-constituted word-meanings of the everyday and scientific language” (p. 161/10). Here, again, we find familiar language; language that might have been encountered in any number of Husserl’s other writings, but what is of interest to us in this passage is the picture of what it is we are “ascending out of.” In this regard, it is helpful to recall the phrase used in Sixth Cartesian Meditation to describe the same thing, namely, “captivation-in-an-acceptedness.” The situation Fink is describing is this: the lives that we live in our everyday world are lived in toto with that world, i.e., the world, as we understand it, is part of what makes us who we think we are; and, conversely, the world is only what it is (what we think it is) by virtue of having us in it, because when we think of the totality of the world, we must remember that it is a totality already containing us thinking it. Hence, we (the world and ourselves) hold each other mutually captive by virtue of what we accept—the acceptednesses—to be true. This reflexive containment is part of what Fink means when he says, “To know the world by returning to a ‘transcendence’ which once again contains the world within it signifies the realization of a transcendental knowledge of the world. This is the sole sense in which phenomenology is to be considered as a ‘transcendental philosophy’” (Criticism, p. 100).

With this statement we finally arrive at the core of what Fink means to communicate; the phenomenological reduction is self-meditation radicalized. On its face, his statement may seem to involve the presupposition that the self is already estranged from its own essence; however, as Fink points out, “phenomenology does not begin with a ‘presupposition’; rather, by an extreme enhancement and transformation of the natural self-meditation, it leads to the ground-experience which opens-up not only the concealed-authentic essence of the spirit, but also the authentic sense of the natural sphere from out of which self-meditation comes forth” (Accomplish, p. 166/14-15). The ground-experience, furthermore, can succeed “only when, with the most extreme sharpness and consequence, every naïve claiming of the mundane-ontological self-understanding is cut off, when the spirit is forced back upon itself to Interpret itself purely as that ‘self’ which is the bearer and accomplisher of the valuation of every natural ‘self-understanding’” (p. 169/17-18). This view is already made explicit in direct connection with the phenomenological onlooker in Fink’s discussion in Sixth Cartesian Meditation (pp. 39-40). The meditation does not bring the reducing “I” into being; the reducing “I” is disclosed once the shrouding cover of human being is removed. That is, by un-humanizing ourselves we discover the reducing “I”—the phenomenological onlooker who is the one practicing the epoché.

Now we can more clearly grasp the meaning of Fink’s statement: when he speaks of spirit being “forced back upon itself,” the “itself” is the phenomenological onlooker, that is, spirit; and the radicalization of self-meditation is the procedure whereby we discover what Husserl earlier referred to as “I am, this life is.” This is “radicalization” precisely because it is to be done without any reference to the mundane. Let me explain. The world is familiarly and horizonally pre-given to us in its totality; furthermore, we are pre-given in it. So, the mundane-ontological self-interpretedness of the spirit is a moment in the totality of the pre-givenness of the world. Hence, if we use any element of the mundane-ontological interpretedness of the world, we have not exercised a “radical” shift. In order for the shift to be truly radical in Husserl’s sense, no element of the mundane can enter into either the motivation for self-meditation or the ground of it, in the sense of an understanding of the essence of spirit prior to the ground-experience that brings spirit to itself. What we want to accomplish is a radical shift in which the spirit (phenomenological onlooker) is forced back upon itself to interpret itself purely as that “self” that is the bearer (as the human ego) and accomplisher (transcendental constituting ego) of the valuation of the entirety of the mundane-ontological self-interpretedness.

The radical nature of the phenomenological reduction seems to have been greatly underestimated by some. We can only get a truly accurate picture of what Husserl means by taking seriously his claim that the reduction is not only radical but radical in a “new” sense of that term. This “new” radicality is linked directly to self-meditation that has been radicalized, that is, a self-meditation that is “forced back upon itself to Interpret itself purely as that ‘self’ which is the bearer and accomplisher of the valuation of every natural ‘self-understanding.’” One practical way to grasp what it means for the self to be so forced back upon itself is to understand this ‘self’ as the “I” in “I am.” Let us now take a closer look at exactly how this technique is performed.

c. The Performance of the Phenomenological Reduction

Husserl criticizes scientific inquiry on the grounds that it does not have a philosophically rigorous foundation. It lacks such a foundation because it has failed to take into consideration the fact that both the framework of its own inquiry (that is, the assumptions of time, space, causality, etc.) and the psychological assumptions of the individual scientist act to color its findings. Since there has to be a way that consciousness can contact the objective world, the rigorous philosophical grounding that is wanted must be disclosed in this relationship. Hence, what is needed is a way to examine consciousness as it is in itself, free from the scientific framework and psychological assumptions. This procedure is the phenomenological reduction; “reduction” is the term that Husserl uses to indicate a reflective inquiring back into consciousness, an interrogation conducted by consciousness into itself. In the idiom of our own everyday parlance, we might phrase this inquiry as an exercise in determining who the “I” is whenever we say “I AM.” Indeed, the path that we naturally follow in seeking an answer to this question leads precisely to the kind of interrogation of the self by the self that Husserl and Fink both claim to be ingredient in the performance of the reduction.

i. Self-Meditation

Phrases such as “resolved to understand the world out of the spirit,” “spiritual movement,” “religious conversion,” “fundamental transformation,” “ground experience,” “un-humanize,” and “meditation” are all leading clues as to how this technique should be understood and performed. We know that the technique is similar to the ordinary self-meditation, only radicalized; we know that it requires strenuous effort and, once completed, brings a transformation similar to a religious conversion. We also know that in the process we are “un-humanized” yet have the “entire man” encompassed. These leading clues not only direct our steps in the performance of the technique, but also give us criteria by which to judge our attempts. For instance, if we think we have performed the reduction, then we should feel as though we have experienced a religious transformation; if we do not feel that way, then chances are our technique was faulty and we did not perform it after all.

If we are to build up a picture of this technique we must begin by assuming that Husserl and Fink have an authentic discovery that they are trying to communicate and that their choice of terms to describe this experience is not careless. The title of Fink’s article gives us the framework we need to complete this task. He tells us right away that he is interested in the idea of laying a ground. Laying a ground is another way of saying that preparation is being made; indeed, the ground that is laid is preparing the way for the phenomenological philosophy of Edmund Husserl; and the ground in question is the philosopher. Fink is telling us that the philosopher is the ground for phenomenology and that the philosopher, as ground, needs preparation. What is it that prepares the philosopher to be the ground for phenomenology? It is the phenomenological reduction. The phenomenological reduction prepares the philosopher to be a phenomenologist in the same way that the experience associated with religious conversion prepares the devotee to live the religious life. Husserl says in the Crisis: “the total phenomenological attitude and the epoché belonging to it are destined in essence to effect…a complete personal transformation, comparable in the beginning to a religious conversion, which then, however, over and above this, bears within itself the significance of the greatest existential transformation which is assigned as a task to mankind as such” (p.137).

The phenomenological reduction is properly understood as a regimen designed to transform a philosopher into a phenomenologist by virtue of the attainment of a certain perspective on the world phenomenon. The path to the attainment of this perspective is a species of meditation, requiring rigorous and persistent effort. It is a distinct species of meditation because, unlike ordinary meditation, which involves only the mind, this more radical form requires the participation of the entire individual, including, as Fink says, “the pathos of the one who is philosophizing.” However, because it is still a species of meditation, one can assume the basic starting point of stilling the body, mind, and emotions while sitting in a comfortable position, having made provisions not to be disturbed. What is aimed at with these outward preparations is the goal of taking as much of the world “out of play” as possible, leaving only the meditative task to occupy one’s attention.

Once settled in this comfort, the “inquiring back” into consciousness may begin; it is the having of the self as the only object of meditation that makes this a self-meditation. Since what we are after is a self-meditation, the focus of attention is on the self and the radicalization of this meditation consists in one relentlessly pushing back and forcing the self onto itself. This can be done by repeatedly affirming, not merely saying, “I am” to oneself while trying to experience or “catch” the “I” in the present instead of remembering it. In the attempt to experience the “I” in the present, one will be forced to feel the I-ness of it; this is why Fink says the performance of the technique encompasses the “entire man” and speaks of the “pathos of the one who is philosophizing.”

In the course of this practice, one will become aware of the three “I”s: the human ego, the constituting ego, and the onlooker, or spectator. It is unlikely that much progress will be made on the first attempt; however, each try makes the return easier until there will come a day when you feel your consciousness rising (or yourself sinking) and the brightness of the world around you seems to be increasing. At that point you will know “I AM” and your perspective on the world will be the one that Husserl has promised: you will be a phenomenologist and will never be the same again. Indeed, Fink says that “the phenomenological ‘epoché,’ is anything but a noncommittal, ‘merely’ theoretical, intellectual act; it is rather a spiritual [geistig] movement of one’s self encompassing the entire man and, as an attack upon the ‘state-of-motionlessness’ supporting us in our depths, the pain of a fundamental transformation down to our roots” (Accomplish, p. 9). He adds that in the epoché “the transcendental tendency that awakens in man and drives him to inhibit all acceptednesses nullifies man himself; man un-humanizes [entmenscht] himself” (Sixth, 40). It should be clear from these passages that whatever is involved in the epoché, it is certainly no mere mental exercise; and if we take Fink and Husserl at their word, it is a “spiritual movement of one’s self encompassing the entire man,” which indicates a far more radical effort than is suggested by those who treat the phenomenological reduction as something no more strenuous than exercising the imagination or reciting an incantation.

6. How the Reduction Solves the Epistemological Problem

a. The Problem of Constitution

I have already noted that in his Logical Investigations Husserl found serious fault with psychologism in his efforts to emancipate ideal objects from psychology and demonstrate their independence. With this critique, however, came the following question: How do the ideal objects come to be given? This is simply the question concerning the correlation of subject and object noted above with respect to the tree and the quad. In his “The Decisive Phases in the Development of Husserl’s Philosophy,” Walter Biemel addresses this very concern and brings his considerable familiarity with Husserl’s works to bear upon it. He offers the following quotation from the Nachlass (F I 36, B1.19a f.) for consideration: “When it is made evident that ideal objects, despite the fact that they are formed in consciousness, have their own being in themselves, there still remains an enormous task which has never been seriously viewed or taken up, namely, the task of making this unique correlation between the ideal objects which belong to the sphere of pure logic and the subjective psychical experience conceived as a formative activity a theme for investigation. When a psychical subject such as I, this thinking being, performs certain (and surely not arbitrary but quite specifically structured) psychical activities in my own psychical life, then a successive formation and production of meaning is enacted according to which the number-form in question, the truth in question, or the conclusion and proof in question…emerges as the successively developing product.”

Biemel uses this quotation to make the point that in it Husserl expresses his real concern and the real theme of his phenomenology; Biemel draws our attention to the parenthetical phrase concerning psychical activities, namely, “(and surely not arbitrary but quite specifically structured),” to make the point that “the subject cannot arbitrarily constitute (and surely the issue here is that of constitution) any meaning whatsoever; rather are the constitutive acts dependent upon the essence of the objects in question.” In other words, if we are to consider the essence of the number three, for example, it is not the case that the essence of that number, contra psychologism, is dependent upon what psychical activities are required in order to form the number; rather, in order to understand the meaning of the number three, “we must perform determinate acts of collective connecting, otherwise the meaning of 3 in general will remain entirely closed to us. There is something like the number three for us when we can perform the collecting-unifying activity in which three become capable of being presented.” This does not mean that the essence of the number three would be arbitrarily determined by this activity so that the number would in each case change according to the manner in which one constitutes it. “Either I perform the acts which disclose the essence of the number three, with the result that for me there is something like three, or I do not perform them and then there is no 3 except for those who have performed this activity.” This “collecting-unifying activity” is the activity of constitution.

Biemel reminds us that the problem of constitution is the source of many a misunderstanding and adds, “the ordinary use of ‘constitution’ equates it with any kind of production, but ‘constitution’ in the strong sense is more of a ‘restitution’ than a constitution insofar as the subject ‘restores’ what is already there, but this, however, requires the performance of certain activities.” Citing a letter from Husserl to Hocking dated January 25, 1903, Biemel drives his point home: “Regarding the meaning of the concept of constitution employed in the Logical Investigations Husserl states: ‘The recurring expression that ‘objects are constituted’ in an act always signifies the property of an act which makes the object present (vorstellig): not ‘constitution’ in the usual sense.’” Hence, the best way to discuss the concept of constitution, says Biemel, is to discuss it as the-becoming-present-of-an-object; and the acts which make this becoming-present possible, which set it in motion, are the constituting acts. Or, as Husserl would put it in his Formal and Transcendental Logic, “This manner of givenness—givenness as something coming from such original activity—is nothing other than the way of their being ‘perceived’ which uniquely belongs to them.”

This problem of constitution first appears in the Logical Investigations and continues to be one of the basic problems of phenomenology; however, the interest in it here is that constitution figures prominently in the resolution of the epistemological problem.

b. The Reduction and the Theme of Philosophy

In his “The Problem of the Phenomenology of Edmund Husserl,” Fink allows that access to the fundamental problem of Husserl’s phenomenology is uncertain owing to the fact that the fundamental problem of any philosophy is often not identical with the particular questions with which its literature begins. Indeed, the fundamental problem may often even await a proper formulation; one that can emerge only after the philosopher’s later stages of the development of his or her own thought are reworked. And although Husserl’s thought started with the sense-formation of mathematics and logic, these interests do not comprise what Fink terms the genuine problem or theme of phenomenology.

This very zigzag process of moving back and forth from one stage to the whole and back again within which the formulation of the genuine problem occurs discloses a distinction between two types of knowing. The first type is one in which we are engaged in a developmental process that will answer certain formulatable questions; that is, it is an expecting-to-know that is characterized chiefly by the fact that it advances an already established body of knowledge—in short, it is a knowing about knowledge that is lacking. For instance, in archaeology we might plan digs in areas surrounding certain cities expecting to add to our stock of knowledge about the ancient life in that setting in order to fill in known gaps in our accounts. This is knowledge of what is lacking.

This type of knowing is not, however, the type of knowing that emerges in the zigzag process to which I just referred. The type of knowing prevalent in the zigzag process is one in which what is obvious becomes questionable; not in the sense of creating arbitrary doubts or from the mere mistrust of the human mind; rather, questionable because, as Fink says, “philosophy is an experience that man has of himself and the existent;” and it is owing to this that the origin of philosophical problems is wonder. This means that “problem” in the philosophical sense is not an expecting-to-know on the basis of a path to knowledge but rather the formation of an expecting-to-know. Philosophy is, therefore, the shaking of the ground which bears human familiarity with the existent; it is the shaking of the basis which forms the presupposition for the progressive augmentation of knowledge, i.e., the shaking of the basis of expecting-to-know of the first type. It is the very unsettling of the foundations of knowledge and the questioning of the existent qua existent as well as the questioning of the nature of truth.

The astonishment in question is just the very experience that man has of himself and the existent that is the foundation needed for epistemology; because it is in this wonder that the “unsettling idea of a genuine mode of knowing the existent suddenly emerges from beneath the ordered, familiar world in which we are at home and about which we have fixed meanings concerning things, man and God, meanings which make certainty in life possible.” It is a “genuine mode” precisely because it is not already decided what the nature of the existent and the nature of truth are; after all, it cannot be original if the original formation of the ideas of “existent” and “truth” has already occurred; whether it is decided through a lengthy effort belonging to the past of human spirit or through the inconspicuous obviousness of the natural world-view. In other words, the only “knowing” that is original is the “knowing” that properly belongs to astonishment; because it is only in astonishment that man experiences the complete collapse of his traditional knowledge and pre-acquaintance with the world and with things; a collapse that is due entirely to a new confronting of the existent and a new projection of the senses of “being” and “truth.” We should be sensitive to Fink’s use of the term “original” here because the way he uses it in this passage heralds the sense of “founding” invoked in the way phenomenology provides a ground for epistemology.

Fink has told us that the astonishment in which philosophy begins is in no way “merely a ‘disposition,’ a feeling.” Rather, “it is the fundamental disposition of pure thought; it is original theory.” What Fink means to communicate with this is that in astonishment a change and transformation of knowing occurs such that what we already know is reduced to mere opinion and that even the very nature of knowing is altered. In other words, Fink marks a distinction between the “knowing” that stands in need of a foundation and the “knowing” that does the founding. The knowing that does the founding is the original knowing of astonishment; it is original precisely because it does not come to the existent and truth with conceptions in hand, having already decided their nature; and the door to sustained astonishment is opened by the rigorous performance of the phenomenological reduction.

It should not be inferred from this passage that there is anything whimsical about the way astonishment proclaims the existent; as though, for example, that being and truth are presented as mere conventions. Rather, what is wanted is the ability to, as Fink says, sustain and develop astonishment “by the awakening force of conceptual cognition” because it is the extent of the creative force of wonder that ultimately determines the rank and achievement of a philosophy. It is precisely this burden that is borne by the phenomenological reduction, which aims at voluntarily awakening the force of conceptual cognition and sustaining it throughout intentional analysis. Thus, it is borne out as was noted above that philosophy does not begin with an assumption but an experience; namely, the experience of having performed the phenomenological reduction. This experience is the astonishment in which original knowing occurs; and it is upon original knowing that the “knowing” of the existent, or epistemology, is grounded.

This relation, in which a physical experience is the condition for the possibility of thought, is not new to philosophy; logical analysis crucially depends upon one having the ability (experience) to be aware of logical connections; absent this ability, as Wittgenstein has also noticed, there is nothing we can do to atone for it in the individual—the individual either sees the logical connections or does not. It is the experience of being aware of, and noticing, logical connections that really grounds logical analysis. So, too, with the phenomenological reduction; without the experience of astonishment granted by having successfully performed the phenomenological reduction, no epistemology can be truly grounded because every epistemological claim must sometime trace itself back to the original knowledge; and the original knowledge can be had only in astonishment, the very fruit of accurately performing the phenomenological reduction. In other words, the ground for epistemology is, in the final analysis, the philosopher’s own astonishment; if this astonishment is voluntarily taken up and sustained, as in the performance of the phenomenological reduction, then the report of what is disclosed in that experience can be entered into the stock of human knowledge as an epistemological datum. And, in the same way that the validity of any logical argument is verified by each individual at every step by seeing for him or herself whether each step follows logically from the previous step by invoking one’s own ability to recognize logical connections, every epistemological datum must be similarly verified by the phenomenologist returning to astonishment through the phenomenological reduction and comparing the results achieved with those at hand. 
What is needed to assure consistent results and the scientific rigor Husserl said properly belonged to phenomenology is a more careful adherence to the rigorous conditions of performing the phenomenological reduction by phenomenologists so that it does not deteriorate into the psychologistic practice of free association or mere mental exercise; it is, after all, a rigorous meditative exercise requiring the struggle of the whole person.

7. References and Further Reading

  • Berger, Gaston. The Cogito in Husserl’s Philosophy. Translated by Kathleen McLaughlin. Evanston: Northwestern University Press, 1972.
  • Bernet, Rudolf. “Phenomenological Reduction and the Double Life of the Subject.” In Reading Heidegger from the Start: Essays in His Earliest Thought, eds. Theodore Kisiel and John van Buren, Albany: SUNY Press, 1994.
  • Bernet, Rudolf, Iso Kern, and Eduard Marbach. An Introduction to Husserlian Phenomenology. Evanston: Northwestern University Press, 1993.
  • Biemel, Walter. “Les phases décisives dans le développement de la philosophie de Husserl.” In Husserl: Cahiers de Royaumont, no. III. Paris: Minuit, 1959.
  • Bochenski, I. M. Contemporary European Philosophy. Translated by Donald Nicholl and Karl Aschenbrenner. Berkeley: University of California Press, 1966.
  • Boehm, Rudolf. “Basic Reflections on Husserl’s Phenomenological Reduction.” International Philosophical Quarterly 5 (1965): 183-202.
  • Boehm, Rudolf. “Les Ambiguités des Concepts Husserliens d’‘immanence’ et de ‘transcendence.’” Revue Philosophique de la France et de l’Etranger 149 (1959): 481-526.
  • Boehm, Rudolf. Vom Gesichtspunkt der Phänomenologie. Den Haag: Martinus Nijhoff, 1968.
  • Boehm, Rudolf. Vom Gesichtspunkt der Phänomenologie II. Den Haag: Martinus Nijhoff, 1981.
  • Bruzina, Ronald. “Construction in Phenomenology.” In The Reach of Reflection: Issues for Phenomenology’s Second Century, eds. Steven Crowell, Lester Embree, and Samuel J. Julian (Electron Press, October 2001), 46-71.
  • Bruzina, Ronald. Edmund Husserl and Eugen Fink: Beginnings and Ends in Phenomenology 1928-1938. New Haven: Yale University Press, 2004.
  • Carr, David. “The ‘Fifth Meditation’ and Husserl’s Cartesianism.” Philosophy and Phenomenological Research 34 (1973): 14-35.
  • Carr, David. The Paradox of Subjectivity: The Self in the Transcendental Tradition. Oxford: Oxford University Press, 1999.
  • Depraz, Natalie, and Marc Richir, eds. Eugen Fink: Actes Du Colloque de Cerisy-la-Salle 23-30 Juillet 1994. Atlanta: Rodopi, 1997.
  • Elveton, R. O., ed. The Phenomenology of Husserl: Selected Readings. Chicago: Quadrangle Books, 1970.
  • Farber, Marvin. The Aims of Phenomenology: The Motives, Methods, and Impact of Husserl’s Thought. New York: Harper Torchbooks, 1966.
  • Farber, Marvin. The Foundation of Phenomenology: Edmund Husserl and the Quest for a Rigorous Science of Philosophy. Albany: SUNY Press, 1943.
  • Fink, Eugen. “L’Analyse intentionnelle et le problème de la pensée spéculative.” In Problèmes actuels de la phénoménologie, 54-87. Brussels: Desclée de Brouwer, 1952.
  • Fink, Eugen. “The Phenomenological Philosophy of Edmund Husserl and Contemporary Criticism.” In The Phenomenology of Husserl, 73-147. Chicago: Quadrangle Books, 1970.
  • Fink, Eugen. Sixth Cartesian Meditation: The Idea of a Transcendental Theory of Method. Translated by Ronald Bruzina. Bloomington: Indiana University Press, 1995.
  • Fink, Eugen. “Was will die Phänomenologie Edmund Husserls?” In Studien zur Phänomenologie 1930-1939. Den Haag: Martinus Nijhoff, 1966.
  • Fink, Eugen. “What Does the Phenomenology of Edmund Husserl Want to Accomplish?” Translated by Arthur Grugan. Research in Phenomenology 2 (1972): 5-27.
  • Hopkins, Burt C. “Husserl’s Account of Phenomenological Reflection and Four Paradoxes of Reflexivity.” Research in Phenomenology 19 (1989): 180-194.
  • Husserl, Edmund. Analyses Concerning Passive and Active Synthesis: Lectures on Transcendental Logic. Translated by Anthony J. Steinbock. Dordrecht: Kluwer Academic Publishers, 2001.
  • Husserl, Edmund. Cartesian Meditations. Translated by Dorion Cairns. The Hague: Martinus Nijhoff, 1977.
  • Husserl, Edmund. The Crisis of European Sciences and Transcendental Phenomenology. Translated by David Carr. Evanston: Northwestern University Press, 1970.
  • Husserl, Edmund. Husserliana Vol. VIII. Erste Philosophie (1923/24), II. Edited by Rudolf Boehm. Den Haag: Martinus Nijhoff, 1959.
  • Husserl, Edmund. Experience and Judgment. Translated by James S. Churchill and Karl Ameriks. Evanston: Northwestern University Press, 1973.
  • Husserl, Edmund. Formal and Transcendental Logic. Translated by Dorion Cairns. The Hague: Martinus Nijhoff, 1969.
  • Husserl, Edmund. The Idea of Phenomenology. Translated by William P. Alston and George Nakhnikian. The Hague: Martinus Nijhoff, 1973.
  • Husserl, Edmund. The Idea of Phenomenology. Translated by Lee Hardy. Dordrecht: Kluwer Academic Publishers, 1999.
  • Husserl, Edmund. Ideas Pertaining to a Pure Phenomenology and to a Phenomenological Philosophy: First Book. Translated by F. Kersten. Dordrecht: Kluwer Academic Publishers, 1998.
  • Husserl, Edmund. Ideas Pertaining to a Pure Phenomenology and to a Phenomenological Philosophy: Second Book. Translated by R. Rojcewicz and A. Schuwer. Dordrecht: Kluwer Academic Publishers, 1989.
  • Husserl, Edmund. Ideas: General Introduction to Pure Phenomenology. Translated by W. F. Boyce Gibson. New York: Collier Books, 1962.
  • Husserl, Edmund. Logical Investigations. Translated by J. N. Findlay. 2 vols. London: Routledge and Kegan Paul, 1970.
  • Husserl, Edmund. Phenomenology and the Crisis of Philosophy. Translated by Quentin Lauer. New York: Harper Torchbooks, 1965.
  • Husserl, Edmund. The Phenomenology of Internal Time-consciousness. Translated by James S. Churchill. Bloomington: Indiana University Press, 1964.
  • Kearney, Richard and Mara Rainwater, eds. The Continental Philosophy Reader. London: Routledge, 1998.
  • Kersten, Fred. “Notes From Underground: Merleau-Ponty and Husserl’s Sixth Cartesian Meditation.” In The Prism of the Self, ed. Steven Crowell. Dordrecht: Kluwer Academic Publishers, no date.
  • Kockelmans, Joseph J, ed. Phenomenology: The Philosophy of Edmund Husserl and Its Interpretation. Garden City, New York: Doubleday, 1967.
  • Lauer, Quentin. Phenomenology: Its Genesis and Prospect. New York: Harper Torchbooks, 1965.
  • Lawlor, Leonard. Derrida and Husserl: The Basic Problem of Phenomenology. Bloomington: Indiana University Press, 2002.
  • McKenna, William, Robert M. Harlan and Laurence E. Winters, eds. Apriori and World: European Contributions to Husserlian Phenomenology. The Hague: Martinus Nijhoff, 1981.
  • Natanson, Maurice. Edmund Husserl: Philosopher of Infinite Tasks. Evanston: Northwestern University Press, 1973.
  • Ricoeur, Paul. “Husserl’s Fifth Cartesian Meditation.” In Husserl: An Analysis of His Phenomenology. Translated by Edward G. Ballard and Lester Embree. Evanston: Northwestern University Press, 1967.
  • Ricoeur, Paul. “A Study of Husserl’s Cartesian Meditations I-IV.” In Husserl: An Analysis of His Phenomenology. Translated by Edward G. Ballard and Lester Embree. Evanston: Northwestern University Press, 1967.
  • Sokolowski, Robert. Husserlian Meditations. Evanston: Northwestern University Press, 1974.
  • Souche-Daques, S. “La Lecture Husserlienne de Sein und Zeit.” Philosophie 21 (1989): 7-36.
  • Stapleton, Timothy J. “The ‘Logic’ of Husserl’s Transcendental Reduction.” Man and World 15 (1982): 369-382.
  • Welton, Donn, ed. The Essential Husserl: Basic Writings in Transcendental Phenomenology. Bloomington: Indiana University Press, 1999.
  • Welton, Donn, ed. The New Husserl: A Critical Reader. Bloomington: Indiana University Press, 2003.
  • Zahavi, Dan. Husserl and Transcendental Intersubjectivity. Translated by Elizabeth A. Behnke. Athens, Ohio: Ohio University Press, 2001.
  • Zahavi, Dan. Husserl’s Phenomenology. Stanford: Stanford University Press, 2003.

Author Information

John Cogan
Email: jmcogan@siu.edu
St. Petersburg College
U. S. A.

René Descartes:
The Mind-Body Distinction

One of the deepest and most lasting legacies of Descartes’ philosophy is his thesis that mind and body are really distinct, a thesis now called “mind-body dualism.” He reaches this conclusion by arguing that the nature of the mind (that is, a thinking, non-extended thing) is completely different from that of the body (that is, an extended, non-thinking thing), and therefore it is possible for one to exist without the other. This argument gives rise to the famous problem of mind-body causal interaction still debated today: how can the mind cause some of our bodily limbs to move (for example, raising one’s hand to ask a question), and how can the body’s sense organs cause sensations in the mind when their natures are completely different? This article examines these issues as well as Descartes’ own response to this problem through his brief remarks on how the mind is united with the body to form a human being. This will show how these issues arise because of a misconception about Descartes’ theory of mind-body union, and how the correct conception of their union avoids this version of the problem. The article begins with an examination of the term “real distinction” and of Descartes’ probable motivations for maintaining his dualist thesis.

Table of Contents

  1. What is a Real Distinction?
  2. Why a Real Distinction?
    1. The Religious Motivation
    2. The Scientific Motivation
  3. The Real Distinction Argument
    1. The First Version
    2. The Second Version
  4. The Mind-Body Problem
  5. Descartes’ Response to the Mind-Body Problem
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. What is a Real Distinction?

It is important to note that for Descartes “real distinction” is a technical term denoting the distinction between two or more substances (see Principles, part I, section 60). A substance is something that does not require any other creature to exist—it can exist with only the help of God’s concurrence—whereas, a mode is a quality or affection of that substance (see Principles part I, section 5). Accordingly, a mode requires a substance to exist and not just the concurrence of God. Being sphere shaped is a mode of an extended substance. For example, a sphere requires an object extended in three dimensions in order to exist: an unextended sphere cannot be conceived without contradiction. But a substance can be understood to exist alone without requiring any other creature to exist. For example, a stone can exist all by itself. That is, its existence is not dependent upon the existence of minds or other bodies; and, a stone can exist without being any particular size or shape. This indicates for Descartes that God, if he chose, could create a world constituted by this stone all by itself, showing further that it is a substance “really distinct” from everything else except God. Hence, the thesis that mind and body are really distinct just means that each could exist all by itself without any other creature, including each other, if God chose to do it. However, this does not mean that these substances do exist separately. Whether or not they actually exist apart is another issue entirely.

2. Why a Real Distinction?

A question one might ask is: what’s the point of arguing that mind and body could each exist without the other? What’s the payoff for going through all the trouble and enduring all the problems to which it gives rise? For Descartes the payoff is twofold. The first is religious in nature in that it provides a rational basis for a hope in the soul’s immortality [because Descartes presumes that the mind and soul are more or less the same thing]. The second is more scientifically oriented, for the complete absence of mentality from the nature of physical things is central to making way for Descartes’ version of the new, mechanistic physics. This section investigates both of these motivating factors.

a. The Religious Motivation

In his Letter to the Sorbonne published at the beginning of his seminal work, Meditations on First Philosophy, Descartes states that his purpose in showing that the human mind or soul is really distinct from the body is to refute those “irreligious people” who only have faith in mathematics and will not believe in the soul’s immortality without a mathematical demonstration of it. Descartes goes on to explain how, because of this, these people will not pursue moral virtue without the prospect of an afterlife with rewards for virtue and punishments for vice. But, since all the arguments in the Meditations, including the real distinction arguments, are for Descartes absolutely certain on a par with geometrical demonstrations, he believes that these people will be obliged to accept them. Hence, irreligious people will be forced to believe in the prospect of an afterlife. However, recall that Descartes’ conclusion is only that the mind or soul can exist without the body. He stops short of demonstrating that the soul is actually immortal. Indeed, in the Synopsis to the Meditations, Descartes claims only to have shown that the decay of the body does not logically or metaphysically imply the destruction of the mind: further argumentation is required for the conclusion that the mind actually survives the body’s destruction. This would involve both “an account of the whole of physics” and an argument showing that God cannot annihilate the mind. Yet, even though the real distinction argument does not go this far, it does, according to Descartes, provide a sufficient foundation for religion, since the hope for an afterlife now has a rational basis and is no longer a mere article of faith.

b. The Scientific Motivation

The other motive for arguing that mind and body could each exist without the other is more scientifically oriented, stemming from Descartes’ intended replacement of final causal explanations in physics thought to be favored by late scholastic-Aristotelian philosophers with mechanistic explanations based on the model of geometry. Although the credit for setting the stage for this scholastic-Aristotelian philosophy dominant at Descartes’ time should go to Thomas Aquinas (because of his initial, thorough interpretation and appropriation of Aristotle’s philosophy), it is also important to bear in mind that other thinkers working within this Aristotelian framework such as Duns Scotus, William of Ockham, and Francisco Suarez, diverged from the Thomistic position on a variety of important issues. Indeed, by Descartes’ time, scholastic positions divergent from Thomism became so widespread and subtle in their differences that sorting them out was quite difficult. Notwithstanding this convoluted array of positions, Descartes understood one thesis to stand at the heart of the entire tradition: the doctrine that everything ultimately behaved for the sake of some end or goal. Though these “final causes,” as they were called, were not the only sorts of causes recognized by scholastic thinkers, it is sufficient for present purposes to recognize that Descartes believed scholastic natural philosophers used them as principles for physical explanations. For this reason, a brief look at how final causes were supposed to work is in order.

Descartes understood all scholastics to maintain that everything was thought to have a final cause that is the ultimate end or goal for the sake of which the rest of the organism was organized. This principle of organization became known as a thing’s “substantial form,” because it was this principle that explained why some hunk of matter was arranged in such and such a way so as to be some species of substance. For example, in the case of a bird, say, the swallow, the substantial form of swallowness was thought to organize matter for the sake of being a swallow species of substance. Accordingly, any dispositions a swallow might have, such as the disposition for making nests, would then also be explained by means of this ultimate goal of being a swallow; that is, swallows are disposed for making nests for the sake of being a swallow species of substance. This explanatory scheme was also thought to work for plants and inanimate natural objects.

A criticism of the traditional employment of substantial forms and their concomitant final causes in physics is found in the Sixth Replies where Descartes examines how the quality of gravity was used to explain a body’s downward motion:

But what makes it especially clear that my idea of gravity was taken largely from the idea I had of the mind is the fact that I thought that gravity carried bodies toward the centre of the earth as if it had some knowledge of the centre within itself (AT VII 442: CSM II 298).

On this pre-Newtonian account, a characteristic goal of all bodies was to reach their proper place, namely, the center of the earth. So, the answer to the question, “Why do stones fall downward?” would be, “Because they are striving to achieve their goal of reaching the center of the earth.” According to Descartes, this implies that the stone must have knowledge of this goal, know the means to attain it, and know where the center of the earth is located. But, how can a stone know anything? Surely only minds can have knowledge. Yet, since stones are inanimate bodies without minds, it follows that they cannot know anything at all, let alone anything about the center of the earth.

Descartes continues on to make the following point:

But later on I made the observations which led me to make a careful distinction between the idea of the mind and the ideas of body and corporeal motion; and I found that all those other ideas of . . . ‘substantial forms’ which I had previously held were ones which I had put together or constructed from those basic ideas (AT VII 442-3: CSM II 298).

Here, Descartes is claiming that the concept of a substantial form as part of the entirely physical world stems from a confusion of the ideas of mind and body. This confusion led people to mistakenly ascribe mental properties like knowledge to entirely non-mental things like stones, plants, and, yes, even non-human animals. The real distinction of mind and body can then also be used to alleviate this confusion and its resultant mistakes by showing that bodies exist and move as they do without mentality, and as such principles of mental causation such as goals, purposes (that is, final causes), and knowledge have no role to play in the explanation of physical phenomena. So the real distinction of mind and body also serves the more scientifically oriented end of eliminating any element of mentality from the idea of body. In this way, a clear understanding of the geometrical nature of bodies can be achieved and better explanations obtained.

3. The Real Distinction Argument

Descartes formulates this argument in many different ways, which has led many scholars to believe there are several different real distinction arguments. However, it is more accurate to consider these formulations as different versions of one and the same argument. Each rests on the same fundamental premise: the natures of mind and body are completely different from one another.

The First Version

The first version is found in this excerpt from the Sixth Meditation:

[O]n the one hand I have a clear and distinct idea of myself, in so far as I am simply a thinking, non-extended thing [that is, a mind], and on the other hand I have a distinct idea of body, in so far as this is simply an extended, non-thinking thing. And accordingly, it is certain that I am really distinct from my body, and can exist without it (AT VII 78: CSM II 54).

Notice that the argument is given from the first person perspective (as is the whole of the Meditations). This “I” is, of course, Descartes insofar as he is a thinking thing or mind, and the argument is intended to work for any “I” or mind. So, for present purposes, it is safe to generalize the argument by replacing “I” with “mind” in the relevant places:

  1. I have a clear and distinct idea of the mind as a thinking, non-extended thing.
  2. I have a clear and distinct idea of body as an extended, non-thinking thing.
  3. Therefore, the mind is really distinct from the body and can exist without it.

At first glance it may seem that, without justification, Descartes is bluntly asserting that he conceives of mind and body as two completely different things, and that from his conception, he is inferring that he (or any mind) can exist without the body. But this is no blunt, unjustified assertion. Much more is at work here, most notably his doctrine of clear and distinct ideas and their veridical guarantee. Indeed the truth of his intellectual perception of the natures of mind and body is supposed to be guaranteed by the fact that this perception is “clear and distinct.” Since the justification for these two premises rests squarely on the veridical guarantee of whatever is “clearly and distinctly” perceived, a brief side trip explaining this doctrine is in order.

Descartes explains what he means by a “clear and distinct idea” in his work Principles of Philosophy at part I, section 45. Here he likens a clear intellectual perception to a clear visual perception. So, just as someone might have a sharply focused visual perception of something, an idea is clear when it is in sharp intellectual focus. Moreover, an idea is distinct when, in addition to being clear, all other ideas not belonging to it are completely excluded from it. Hence, Descartes is claiming in both premises that his idea of the mind and his idea of the body exclude all other ideas that do not belong to them, including each other, and all that remains is what can be clearly understood of each. As a result, he clearly and distinctly understands the mind all by itself, separately from the body, and the body all by itself, separately from the mind.

According to Descartes, his ability to clearly and distinctly understand them separately from one another implies that each can exist alone without the other. This is because “[e]xistence is contained in the idea or concept of every single thing, since we cannot conceive of anything except as existing. Possible or contingent existence is contained in the concept of a limited thing…” (AT VII 166: CSM II 117). Descartes, then, clearly and distinctly perceives the mind as possibly existing all by itself, and the body as possibly existing all by itself. But couldn’t Descartes somehow be mistaken about his clear and distinct ideas? Given the existence of so many non-thinking bodies like stones, there is no question that bodies can exist without minds. So, even if he could be mistaken about what he clearly and distinctly understands, there is other evidence in support of premise 2. But can minds exist without bodies? Can thinking occur without a brain? If the answer to this question is “no,” the first premise would be false and, therefore, Descartes would be mistaken about one of his clear and distinct perceptions. Indeed, since we have no experience of minds actually existing without bodies as we do of bodies actually existing without minds, the argument will stand only if Descartes’ clear and distinct understanding of the mind’s nature somehow guarantees the truth of premise 1; but, at this point, it is not evident whether Descartes’ “clear and distinct” perception guarantees the truth of anything.

However, in the Fourth Meditation, Descartes goes to great lengths to guarantee the truth of whatever is clearly and distinctly understood. This veridical guarantee is based on the theses that God exists and that he cannot be a deceiver. These arguments, though very interesting, are numerous and complex, and so they will not be discussed here. Suffice it to say that since Descartes believes he has established God’s inability to deceive with absolute, geometrical certainty, he would have to consider anything contradicting this conclusion to be false. Moreover, Descartes claims that he cannot help but believe clear and distinct ideas to be true. However, if God put a clear and distinct idea in him that was false, then he could not help but believe a falsehood to be true and, to make matters worse, he would never be able to discover the mistake. Since God would be the author of this false clear and distinct idea, he would be the source of the error and would, therefore, be a deceiver, which must be false. Hence, all clear and distinct ideas must be true, because it is impossible for them to be false given God’s non-deceiving nature.

That said, the clarity and distinctness of Descartes’ understanding of mind and body guarantee the truth of premise 1. Hence, the two “clear and distinct” premises are not blunt, unjustified assertions of what he believes but have very strong rational support from within Descartes’ system. However, if it turns out that God does not exist or that he can be a deceiver, then all bets are off. There would then no longer be any veridical guarantee of what is clearly and distinctly understood and, as a result, the first premise could be false. Consequently, premise 1 would not bar the possibility of minds requiring brains to exist and, therefore, this premise would not be absolutely certain as Descartes supposed. In the end, the conclusion is established with absolute certainty only when considered from within Descartes’ own epistemological framework but loses its force if that framework turns out to be false or when evaluated from outside of it.

These guaranteed truths express some very important points about Descartes’ conception of mind and body. Notice that mind and body are defined as complete opposites. This means that the ideas of mind and body represent two natures that have absolutely nothing in common. And, it is this complete diversity that establishes the possibility of their independent existence. But, how can Descartes make a legitimate inference from his independent understanding of mind and body as completely different things to their independent existence? To answer this question, recall that every idea of limited or finite things contains the idea of possible or contingent existence, and so Descartes is conceiving mind and body as possibly existing all by themselves without any other creature. Since there is no doubt about this possibility for Descartes and given the fact that God is all powerful, it follows that God could bring into existence a mind without a body and vice versa just as Descartes clearly and distinctly understands them. Hence, the power of God makes Descartes’ perceived logical possibility of minds existing without bodies into a metaphysical possibility. As a result, minds without bodies and bodies without minds would require nothing besides God’s concurrence to exist and, therefore, they are two really distinct substances.

The Second Version

The argument just examined is formulated in a different way later in the Sixth Meditation:

[T]here is a great difference between the mind and the body, inasmuch as the body is by its very nature always divisible, while the mind is utterly indivisible. For when I consider the mind, or myself in so far as I am merely a thinking thing, I am unable to distinguish any parts within myself; I understand myself to be something quite single and complete….By contrast, there is no corporeal or extended thing that I can think of which in my thought I cannot easily divide into parts; and this very fact makes me understand that it is divisible. This one argument would be enough to show me that the mind is completely different from the body…. (AT VII 86-87: CSM II 59).

This argument can be reformulated as follows, substituting “mind” for “I” as in the first version:

  1. I understand the mind to be indivisible by its very nature.
  2. I understand body to be divisible by its very nature.
  3. Therefore, the mind is completely different from the body.

Notice the conclusion that mind and body are really distinct is not explicitly stated but can be inferred from 3. What is interesting about this formulation is how Descartes reaches his conclusion. He does not assert a clear and distinct understanding of these two natures as completely different but instead makes his point based on a particular property of each. However, this is not just any property but a property each has “by its very nature.” Something’s nature is just what it is to be that kind of thing, and so the term “nature” is here being used as synonymous with “essence.” On this account, extension constitutes the nature or essence of bodily kinds of things; while thinking constitutes the nature or essence of mental kinds of things. So, here Descartes is arguing that a property of what it is to be a body, or extended thing, is to be divisible, while a property of what it is to be a mind or thinking thing is to be indivisible.

Descartes’ line of reasoning in support of these claims about the respective natures of mind and body runs as follows. First, it is easy to see that bodies are divisible. Just take any body, say a pencil or a piece of paper, and break it or cut it in half. Now you have two bodies instead of one. Second, based on this line of reasoning, it is easy to see why Descartes believed his nature or mind to be indivisible: if a mind or an “I” could be divided, then two minds or “I’s” would result; but since this “I” just is my self, this would be the same as claiming that the division of my mind results in two selves, which is absurd. Therefore, the body is essentially divisible and the mind is essentially indivisible: but how does this lead to the conclusion that they are completely different?

Here it should be noted that a difference in just any non-essential property would have only shown that mind and body are not exactly the same. But this is a much weaker claim than Descartes’ conclusion that they are completely different. For two things could have the same nature, for example, extension, but have other, changeable properties or modes distinguishing them. Hence, these two things would be different in some respect, for example, in shape, but not completely different, since both would still be extended kinds of things. Consequently, Descartes needs their complete diversity to claim that he has completely independent conceptions of each and, in turn, that mind and body can exist independently of one another.

Descartes can reach this stronger conclusion because these essential properties are contradictories. On the one hand, Descartes argues that the mind is indivisible because he cannot perceive himself as having any parts. On the other hand, the body is divisible because he cannot think of a body except as having parts. Hence, if mind and body had the same nature, it would be a nature both with and without parts. Yet such a thing is unintelligible: how could something both be separable into parts and yet not separable into parts? The answer is that it can’t, and so mind and body cannot be one and the same but two completely different natures. Notice that, as with the first version, mind and body are here being defined as opposites. This implies that divisible body can be understood without indivisible mind and vice versa. Accordingly each can be understood as existing all by itself: they are two really distinct substances.

However, unlike the first version, Descartes does not invoke the doctrine of clear and distinct ideas to justify his premises. If he had, this version, like the first, would be absolutely certain from within Descartes’ own epistemological system. But if removed from this apparatus, it is possible that Descartes is mistaken about the indivisibility of the mind, because the possibility of the mind requiring a brain to exist would still be viable. This would mean that, since extension would then be part of the nature of mind, the mind would, being an extended thing, be composed of parts and would therefore be divisible. As a result, Descartes could not legitimately reach the conclusion that mind and body are completely different. This would also mean that the further, implicit conclusion that mind and body are really distinct could not be reached either. In the end, the main difficulty with Descartes’ real distinction argument is that he has not adequately eliminated the possibility of minds being extended things like brains.

4. The Mind-Body Problem

The real distinction of mind and body based on their completely diverse natures is the root of the famous mind-body problem: how can these two substances with completely different natures causally interact so as to give rise to a human being capable of having voluntary bodily motions and sensations? Although several versions of this problem have arisen over the years, this section will be exclusively devoted to the version of it Descartes confronted as expressed by Pierre Gassendi, the author of the Fifth Objections, and Descartes’ correspondent, Princess Elizabeth of Bohemia. Their concern arises from the claim at the heart of the real distinction argument that mind and body are completely different or opposite things.

The complete diversity of their respective natures has serious consequences for the kinds of modes each can possess. For instance, in the Second Meditation, Descartes argues that he is nothing but a thinking thing or mind, that is, a “thing that doubts, understands, affirms, denies, is willing, is unwilling, and also imagines and has sensory perceptions” (AT VII 28: CSM II 19). It makes no sense to ascribe such modes to entirely extended, non-thinking things like stones, and therefore, only minds can have these kinds of modes. Conversely, it makes no sense to ascribe modes of size, shape, quantity and motion to non-extended, thinking things. For example, the concept of an unextended shape is unintelligible. Therefore, a mind cannot be understood to be shaped or in motion, nor can a body understand or sense anything. Human beings, however, are supposed to be combinations of mind and body such that the mind’s choices can cause modes of motion in the body, and motions in certain bodily organs, such as the eye, cause modes of sensation in the mind.

The mind’s ability to cause motion in the body will be addressed first. Take for example a voluntary choice, or willing, to raise one’s hand in class to ask a question. The arm moving upward is the effect while the choice to raise it is the cause. But willing is a mode of the non-extended mind alone, whereas the arm’s motion is a mode of the extended body alone: how can the non-extended mind bring about this extended effect? It is this problem of voluntary bodily motion or the so-called problem of “mind to body causation” that so troubled Gassendi and Elizabeth. The crux of their concern was that in order for one thing to cause motion in another, they must come into contact with one another as, for example, in the game of pool the cue ball must be in motion and come into contact with the eight-ball in order for the latter to be set in motion. The problem is that, in the case of voluntary bodily movements, contact between mind and body would be impossible given the mind’s non-extended nature. This is because contact must be between two surfaces, but surface is a mode of body, as stated at Principles of Philosophy part II, section 15. Accordingly, the mind does not have a surface that can come into contact with the body and cause it to move. So, it seems that if mind and body are completely different, there is no intelligible explanation of voluntary bodily movement.

Although Gassendi and Elizabeth limited themselves to the problem of voluntary bodily movement, a similar problem arises for sensations, or the so-called problem of “body to mind causation.” For instance, a visual sensation of a tree is a mode of the mind alone. The cause of this mode would be explained by the motion of various imperceptible bodies causing parts of the eye to move, then movements in the optic nerve, which in turn cause various “animal spirits” to move in the brain and finally result in the sensory idea of the tree in the mind. But how can the movement of the “animal spirits,” which were thought to be very fine bodies, bring about the existence of a sensory idea when the mind is incapable of receiving modes of motion given its non-extended nature? Again, since the mind is incapable of having motion and a surface, no intelligible explanation of sensations seems possible either. Therefore, the completely different natures of mind and body seem to render their causal interaction impossible.

The consequences of this problem are very serious for Descartes, because the problem undermines his claim to have a clear and distinct understanding of the mind without the body. For humans do have sensations and voluntarily move some of their bodily limbs and, if Gassendi and Elizabeth are correct, this requires a surface and contact. Since the mind would then need a surface and a capacity for motion, it would also have to be extended and, therefore, mind and body would not be completely different. This means the “clear and distinct” ideas of mind and body, as mutually exclusive natures, must be false in order for mind-body causal interaction to occur. Hence, Descartes has not adequately established that mind and body are two really distinct substances.

5. Descartes’ Response to the Mind-Body Problem

Despite the obviousness of this problem, and the amount of attention given to it, Descartes himself never took this issue very seriously. His response to Gassendi is a telling example:

These questions presuppose amongst other things an explanation of the union between the soul and the body, which I have not yet dealt with at all. But I will say, for your benefit at least, that the whole problem contained in such questions arises simply from a supposition that is false and cannot in any way be proved, namely that, if the soul and the body are two substances whose nature is different, this prevents them from being able to act on each other (AT VII 213: CSM II 275).

So, Descartes’ response to the mind-body problem is twofold. First, Descartes contends that a response to this question presupposes an explanation of the union between the mind (or soul) and the body. Second, Descartes claims that the question itself stems from the false presupposition that two substances with completely different natures cannot act on each other. Further examination of these two points will occur in reverse order.

Descartes’ principles of causation put forward in the Third Meditation lie at the heart of this second presupposition. The relevant portion of this discussion is when Descartes argues that the less real cannot cause something that is more real, because the less real does not have enough reality to bring about something more real than itself. This principle applies on the general level of substances and modes. On this account, an infinite substance, that is, God, is the most real thing because only he requires nothing else in order to exist; created, finite substances are next most real, because they require only God’s creative and conservative activity in order to exist; and finally, modes are the least real, because they require a created substance and an infinite substance in order to exist. So, on this principle, a mode cannot cause the existence of a substance since modes are less real than finite substances. Similarly, a created, finite substance cannot cause the existence of an infinite substance. But a finite substance can cause the existence of another finite substance or a mode (since modes are less real than substances). Hence, Descartes’ point could be that the completely diverse natures of mind and body do not violate this causal principle, since both are finite substances causing modes to exist in some other finite substance. This indicates further that the “activity” of the mind on the body does not require contact and motion, thereby suggesting that mind and body do not bear a mechanistic causal relation to each other. More will be said about this below.

The first presupposition concerns an explanation of how the mind is united with the body. Descartes’ remarks about this issue are scattered across both his published works and his private correspondence. These texts indicate that Descartes did not maintain that voluntary bodily movements and sensation arise because of the causal interaction of mind and body by contact and motion. Rather, he maintains a version of the form-matter theory of soul-body union endorsed by some of his scholastic-Aristotelian predecessors and contemporaries. Although a close analysis of the texts in question cannot be conducted here, a brief summary of how this theory works for Descartes can be provided.

Before providing this summary, however, it is important to note that this scholastic-Aristotelian interpretation is a minority position amongst Descartes scholars. The traditional view maintains that Descartes’ human being is composed of two substances that causally interact in a mechanistic fashion. This traditional view led some of Descartes’ successors, such as Malebranche and Leibniz (who also believed in the real distinction of mind and body), to devise metaphysical systems wherein mind and body do not causally interact despite appearances to the contrary. Other philosophers considered the mind-body problem to be insurmountable, thereby denying their real distinction: they claim that everything is either extended (as is common nowadays) or mental (as George Berkeley argued in the 18th century). Indeed, this traditional, mechanistic interpretation of Descartes is so deeply ingrained in the minds of philosophers today, that most do not even bother to argue for it. However, a notable exception is Marleen Rozemond, who argues for the incompatibility of Descartes’ metaphysics with any scholastic-Aristotelian version of mind or soul-body union. Those interested in closely examining her arguments should consult her book Descartes’s Dualism. A book arguing in favor of the scholastic-Aristotelian interpretation is entitled Descartes and the Metaphysics of Human Nature; Chapter 5 specifically addresses Rozemond’s concerns.

Two major stumbling blocks Rozemond raises for the scholastic-Aristotelian interpretation concern the mind’s status as a substantial form and the extent to which Descartes can maintain a form of the human body. However, recall that Descartes rejects substantial forms because of their final causal component. Descartes’ argument was based on the fact (as he understood it) that the scholastics were ascribing mental properties to entirely non-mental things like stones. Since the mind is an entirely mental thing, these arguments just do not apply to it. Hence, Descartes’ particular rejection of substantial forms does not necessarily imply that Descartes did not view the mind as a substantial form. Indeed, as Paul Hoffman noted:

Descartes really rejects the attempt to use the human soul as a model for explanations in the entirely physical world. This makes it possible that Descartes considered the human mind to be the only substantial form. At first glance this may seem ad hoc but it is also important to notice that rejecting the existence of substantial forms with the exception of the mind or rational soul was not uncommon amongst Descartes’ contemporaries.

Although the mind’s status as a substantial form may seem at risk because of its meager explicit textual support, Descartes suggests that the mind is a “substantial form” twice in a draft of an open letter to his enemy Voetius:

Yet, if the soul is recognized as merely a substantial form, while other such forms consist in the configuration and motion of parts, this very privileged status it has compared with other forms shows that its nature is quite different from theirs (AT III 503: CSMK 207-208).

Descartes then remarks “this is confirmed by the example of the soul, which is the true substantial form of man” (AT III 508: CSMK 208). Although other passages do not make this claim explicitly, they do imply (in some sense) that the mind is a substantial form. For instance, Descartes claims in a letter to Mesland dated 9 February 1645, that the soul is “substantially united” with the human body (AT IV 166: CSMK 243). This “substantial union” was a technical term amongst the scholastics denoting the union between a substantial form and matter to form a complete substance. Consequently, there is some reason for believing that the human mind is the only substantial form left standing in Descartes’ metaphysics.

Another major stumbling block recognized by Rozemond is the extent to which, if any, Descartes’ metaphysics can maintain a principle for organizing extension into a human body. This was a point of some controversy amongst the scholastics themselves. Philosophers maintaining a Thomistic position argued that the human soul is the human body’s principle of organization, while others, maintaining a basically Scotistic position, argued that some other form besides the human soul is the form of the body. This “form of corporeity” organizes matter for the sake of being a human body but does not result in a full-fledged human being. Rather it makes a body with the potential for union with the human soul. The soul then actualizes this potential resulting in a complete human being. If Descartes did hold a fundamentally scholastic theory of mind-body union, then is it more Thomistic or Scotistic? Since intellect and will are the only faculties of the mind, it does not have the faculty for organizing matter for being a human body. So, if Descartes’ theory is scholastic, it must be most in line with some version of the Scotistic theory. Rozemond argues that Descartes’ rejection of all other substantial forms (except the human mind or soul) precludes this kind of theory since he cannot appeal to the doctrine of substantial forms like the Scotists.

Although Descartes argues that bodies, in the general sense, are constituted by extension, he also maintains that species of bodies are determined by the configuration and motion of their parts. This doctrine of “configuration and motion of parts” serves the same purpose as the doctrine of substantial forms with regard to entirely physical things. But the main difference between the two is that Descartes’ doctrine does not employ final causes. Recall that substantial forms organize matter for the purpose of being a species of thing. The purpose of a human body endowed with only the form of corporeity is union with the soul. Hence, the organization of matter into a human body is an effect that is explained by the final cause or purpose of being disposed for union. But, on Descartes’ account, the explanatory order would be reversed: a human body’s disposition for union is an effect resulting from the configuration and motion of parts. So, even though Descartes does not have recourse to substantial forms, he still has recourse to the configuration of matter and to the dispositions to which it gives rise, including “all the dispositions required to preserve that union” (AT IV 166: CSMK 243). Hence, on this account, Descartes gets what he needs, namely, a body properly configured for potential union with the mind, but without recourse to the scholastic notion of substantial forms with their final causal component.

Another feature of this basically Scotistic position is that the soul and the body were considered incomplete substances themselves, while their union results in one, complete substance. Surely Descartes maintains that mind and body are two substances but in what sense, if any, can they be considered incomplete? Descartes answers this question in the Fourth Replies. He argues that a substance may be complete insofar as it is a substance but incomplete insofar as it is referred to some other substance together with which it forms yet some third substance. This can be applied to mind and body as follows: the mind insofar as it is a thinking thing is a complete substance, while the body insofar as it is an extended thing is a complete substance, but each taken individually is only an incomplete human being.

This account is repeated in the following excerpt from a letter to Regius dated December 1641:

For there you said that the body and the soul, in relation to the whole human being, are incomplete substances; and it follows from their being incomplete that what they constitute is a being through itself (that is, an ens per se; AT III 460: CSMK 200).

The technical sense of the term “being through itself” was intended to capture the fact that human beings do not require any other creature but only God’s concurrence to exist. Accordingly, a being through itself, or ens per se, is a substance. Also notice that the claim in the letter to Regius that two incomplete substances together constitute a being through itself is reminiscent of Descartes’ remarks in the Fourth Replies. This affinity between the two texts indicates that the union of mind and body results in one complete substance or being through itself. This just means that mind and body are the metaphysical parts (mind and body are incomplete substances in this respect) that constitute one, whole human being, which is a complete substance in its own right. Hence, a human being is not the result of two substances causally interacting by means of contact and motion, as Gassendi and Elizabeth supposed, but rather they bear a relation of act and potency that results in one, whole and complete substantial human being.

This sheds some light on why Descartes thought that an account of mind-body union would put Gassendi’s and Elizabeth’s concerns to rest: they misconceived the union of mind and body as a mechanical relation when in fact it is a relation of act and potency. Their version of the problem is avoided because modes of voluntary motion (and sensations, by extrapolation) should be ascribed to a whole human being and not to the mind or the body taken individually. This is made apparent in a 21 May 1643 letter to Elizabeth where Descartes distinguishes between various “primitive notions.” The most general are the notions of being, number, duration, and so on, which apply to all conceivable things. He then goes on to distinguish the notions of mind and body:

Then, as regards body in particular, we have only the notion of extension, which entails the notions of shape and motion; and as regards the soul on its own, we have only the notion of thought, which includes the perceptions of the intellect and the inclinations of the will (AT III 665: CSMK 218).

Here body and soul (or mind) are primitive notions and the notions of their respective modes are the notions “entailed by” or “included in” these primitives. Descartes then discusses the primitive notion of mind-body union:

Lastly, as regards the soul and the body together, we have only the notion of their union, on which depends our notion of the soul’s power to move the body, and the body’s power to act on the soul and cause its sensations and passions (AT III 665: CSMK 218).

In light of the immediately preceding lines, this indicates that voluntary bodily movements and sensations are not modes of the body alone, or the mind alone, but rather are modes of “the soul and the body together.” This is at least partially confirmed in the following lines from Principles, part I, article 48:

But we also experience within ourselves certain other things, which must not be referred either to the mind alone or to the body alone. These arise, as will be made clear in the appropriate place, from the close and intimate union of our mind with the body. This list includes, first, appetites like hunger and thirst; secondly, the emotions or passions . . . (AT VIIIA 23: CSM I 209).

These texts indicate that the mind or soul is united with the body so as to give rise to another whole complete substance composed of these two metaphysical parts. And, moreover, this composite substance now has the capacity for having modes of its own, namely, modes of voluntary bodily movement and sensation, which neither the mind nor the body can have individually. So, voluntary bodily movements are not modes of the body alone caused by the mind, nor are sensations modes of the mind alone caused by the body. Rather, both are modes of a whole and complete human being. On this account, it makes no sense to ask how the non-extended mind can come into contact with the body to cause these modes. To ask this would be to get off on the wrong foot entirely, since contact between these two completely diverse substances is not required for these modes to exist. Rather all that is necessary is for the mind to actualize the potential in a properly disposed human body to form one, whole, human being to whom is attributed modes of voluntary movement and sensation.

Although the scholastic-Aristotelian interpretation avoids the traditional causal interaction problem based on the requirements of contact and motion, it does run up against another version of that problem, namely, a problem of formal causation. This is a problem facing any scholastic-Aristotelian theory of mind or soul-body union where the soul is understood to be an immaterial substantial form. Recall that the immaterial mind or soul as substantial form is supposed to act on a properly disposed human body in order to result in a full-fledged human being. The problem of formal causal interaction is: how can an immaterial soul as substantial form act on the potential in a material thing? Can any sense be made of the claim that a non-extended or immaterial thing acts on anything? Descartes noticed in a letter to Regius (AT III 493: CSMK 206) that the scholastics did not try to answer this question and so he and Regius need not either. The likely explanation of their silence is that the act-potency relation was considered absolutely fundamental to scholastic-Aristotelian philosophy and, therefore, it required no further explanation. So, in the end, even if Descartes’ theory is as described here, it does not evade all the causal problems associated with uniting immaterial souls or minds to their respective bodies. However, if this proposed account is true, it helps to cast Descartes’ philosophy in a new light and to redirect the attention of scholars to the formal causal problems involved.

6. References and Further Reading

Primary Sources

  • Descartes, Rene, Oeuvres de Descartes, 11 vols., eds. Charles Adam and Paul Tannery, Paris: Vrin, 1974-1989.
    • This is still the standard edition of all of Descartes’ works and correspondence in their original languages. Cited in the text as AT, volume, page.
  • Descartes, Rene, The Philosophical Writings of Descartes, 3 vols., trans. John Cottingham, Robert Stoothoff, Dugald Murdoch and Anthony Kenny, Cambridge: Cambridge University Press, 1984-1991.
    • This is the standard English translation of Descartes’ philosophical works and correspondence. Cited in the text as CSM or CSMK, volume, page.

Secondary Sources

  • Broughton, Janet and Mattern, Ruth, “Reinterpreting Descartes on the Notion of the Union of Mind and Body,” Journal of the History of Philosophy 16 (1978), 23-32.
    • A reinterpretation of the notion of mind-body union in the correspondence with Elizabeth, which addresses Radner’s interpretation of it. See below.
  • Garber, Daniel, “Understanding Interaction: What Descartes Should Have Told Elizabeth,” Southern Journal of Philosophy, Supp. 21 (1983), 15-32.
    • Article addressing the issues of the primitive notions and how this theory should be used to explain mind-body causal interaction to Elizabeth.
  • Hoffman, Paul, “The Unity of Descartes’ Man,” The Philosophical Review 95 (1986), 339-369.
    • Article arguing that Descartes’ theory of mind-body union is more in line with scholastic-Aristotelian theories of soul-body union than previously supposed.
  • Kenny, Anthony, Descartes: A Study of His Philosophy, New York: Random House, 1968. See especially chapters 4 and 10.
    • These chapters provide classic interpretations of the real distinction between mind and body and the mind-body problem.
  • Mattern, Ruth, “Descartes’ Correspondence with Elizabeth Concerning both the Union and Distinction of Mind and Body” in Descartes: Critical and Interpretive Essays, ed. Michael Hooker, Baltimore: Johns Hopkins University Press, 1978, 212-222.
    • Short essay examining Descartes’ correspondence with Elizabeth on this issue and how it was supposed to direct her to a correct understanding of mind-body causal interaction.
  • Radner, Daisie, “Descartes’ Notion of the Union of Mind and Body,” Journal of the History of Philosophy 9 (1971), 159-170.
    • This is the first article in Anglo-American scholarship to address the issue of mind-body union. It addresses several texts, including the letter to Elizabeth enumerating the primitive notions.
  • Rozemond, Marleen, Descartes’s Dualism, Cambridge: Harvard University Press, 1998.
    • This book argues for a particular understanding of the real distinction between mind and body that would preclude Hoffman’s scholastic-Aristotelian account of their union.
  • Skirry, Justin, Descartes and the Metaphysics of Human Nature, London and New York: Thoemmes-Continuum Press, 2005.
    • This book takes issue with Rozemond’s account of the mind-body union through a close re-examination of fundamental features of Descartes’ metaphysics and by building on certain features of Hoffman’s account.
  • Voss, Stephen, “Descartes: The End of Anthropology” in Reason, Will and Sensation, ed. John Cottingham, Oxford: Clarendon Press, 1994.
    • This essay provides a close textual analysis of Descartes’ account of the union of mind and body on the supposition that he maintained a Platonic rather than scholastic-Aristotelian theory of mind-body union.
  • Williams, Bernard, Descartes: The Project of Pure Enquiry, Sussex: Harvester Press, 1978. See especially chapter 4.
    • This is another classic account of the mind-body relation in Descartes.
  • Wilson, Margaret, Descartes, London and Boston: Routledge and Kegan Paul, 1978.
    • Provides classic accounts of the real distinction argument and issues concerning mind-body causal interaction.

Author Information

Justin Skirry
Email: jskirry@yahoo.com
U. S. A.

Praise and Blame

Joel Feinberg observed that “moral responsibility… is a subject about which we are all confused” (1970: 37). Perhaps nowhere is this confusion more evident than in our understandings of praise and blame. This entry will contrast three influential philosophical accounts of our everyday practices of praise and blame, in terms of how they might be justified. On the one hand, a broadly Kantian approach sees responsibility for actions as relying on forms of self-control that point back to the idea of free will. On this account praise and blame are justified because a person freely chooses her actions. Praise and blame respond to the person as the chooser of her deed; they recognize her dignity as a rational agent, as Kantians tend to put it. This approach sharply contrasts with two further ways of thinking about the issues. One is utilitarian, where praise and blame are justified in terms of their social benefits. Another, more complex approach is roughly Aristotelian. This approach situates practices of praise and blame in terms of our on-going relationships with one another. This approach stresses the importance of mutual accountability, moral education, and assessments of character in terms of the many vices and virtues.

Table of Contents

  1. Introduction
  2. The Problem of Free Will
  3. Two Contrasting Approaches
    1. The Utilitarian Account
    2. The Aristotelian Account
  4. The Kantian Account and Moral Worth
  5. The Idea of Moral Worth
  6. Conclusion
  7. References and Further Reading

1. Introduction

This article will not try to convey the exact details of these accounts, but to show how these ways of looking at mutual accountability capture important parts of our everyday common sense. One modern commentator claimed that, in our attitudes to moral responsibility, “we are all Kantians now” – by “we” meaning not just philosophers but all Western persons (Adkins, 1960: 2). Another central figure in this debate, Bernard Williams, agrees that Kant captured a widespread tendency of modern moral thinking, but also claims that there exist important counter-tendencies in our actual practices of responsibility. For Williams, ancient Greek understandings are actually more realistic and helpful than the Kantian one. So far as our modern praising and blaming actually make sense, he claims, they are better captured by a (roughly) Aristotelian account.

There are some important differences between praise and blame that will not be central to this entry; in fact, blame will get the greatest attention here. This is partly because praise seems less problematic: misplaced blame is felt as deeply unfair, not least because being exposed to blame is unpleasant and costly in a way that being praised is not. But it is principally because blame has a closer connection than praise to matters of intense philosophical interest, including freedom, responsibility and desert. We often praise inanimate objects (such as art works or buildings) and animals (a loyal pet, for example), although we could not blame such entities, however deeply dissatisfied we felt with them. The focus of this article, however, will be upon entities that are clearly open to blame as well as praise: human beings.

What is blame, such that only human beings can be blamed? We are all familiar with resentment, reproach and accusation regarding a person’s past actions; likewise, we all know the sense of guilt, shame or indignation they can elicit. Philosophers differ on how far certain emotions may be central to blame (this relates to a wider dispute, regarding which emotions, if any, constitute a proper basis for moral action). What is clear is that blame suggests both responsibility and culpability. Here, responsibility only implies that the act can be identified with a person, such that she can reasonably be expected to respond for it in some way. That is, it does not necessarily imply fault, or culpability. Culpability, by contrast, is the idea that the person is “in the wrong,” that fault somehow attaches to her so that she deserves blame. (Philosophers tend to describe this as “blameworthiness.”) What sense we should give to these ideas of culpability or desert, and what is necessary for us to think of a person as responsible: these are central issues for this entry.

For further aspects of responsibility, see the sister entry to this article, responsibility. Another article also examines the topic of free will in depth. Nonetheless, since Kant’s account begins with the question of free will, it is also necessary to say something about this straightaway. The entry will then set out the utilitarian and Aristotelian accounts, before returning to Kant’s theory. It concludes by discussing ideas of moral worth and desert that make Kant’s account so appealing.

2. The Problem of Free Will

The free will debate has become an old chestnut of modern philosophy. It is an intuitively plausible way of approaching the issues – familiar to many even before they encounter philosophical texts. It is perhaps surprising, then, that this debate is actually a rather modern one.

The basic gist is this: if I am to be responsible (really responsible) for my conduct, then it must be within my control. However, if it is true that every event in the universe is determined by causal laws, then this must be true of the events that constitute my actions. Therefore, my conduct cannot really be within my control; therefore, I am not really responsible for my conduct. Two conclusions immediately suggest themselves. One is that it is incoherent to praise or blame me – and everyone else – for our actions, because it is so difficult to doubt the causal well-orderedness of the universe. The alternative conclusion, scarcely more appealing, is that the human will somehow sits outside this causal framework – ie, we have free will – because it is unthinkable that our moral ideas be so desperately incoherent.

Both lines of thought are incompatibilist; that is, they see the ideas of responsibility involved in praise and blame as incompatible with the causal well-orderedness of the universe. But while both attract some limited support among philosophers, the overwhelming consensus now lies with compatibilism. This is simply the thesis that responsibility and causal order are compatible. Most philosophers agree that the alleged incompatibility results from some important confusions, although there is much less consensus about what these may be. At least one area of confusion is clear, however, and forms the central issue of this article: what sort of responsibility for conduct is involved in praise and blame? Several familiar points in the free will debate are helpful for approaching this.

In the first place, it is well-known that this debate does not turn on the truth of determinism as such. Determinism is the idea that every event is determined by fixed causal laws. Yet it may well be that every event is somehow random in origin. One interpretation of quantum physics claims that causal laws are the product of statistical regularities, while these regularities stem from a near infinite number of random events. So far as the human will is concerned, this makes no difference. If my conduct is the product of chance, this makes me no more responsible for it than does its being generated by causal laws. The point is that if I am to be blamed or praised, then I must control my conduct – not causal laws, nor mere chance, nor some particular combination of the two.

Second, the free will debate bears a disquieting similarity to an older controversy. In medieval philosophy it used to be asked how God’s omniscience – his knowledge of everything that has happened and will happen – could be reconciled with our being subject to his moral judgment (that is, being sent to heaven or to hell). If God knows what we will do then this seems to imply that it is already decided whether we will act well or badly. And this, in turn, suggests that it makes no sense to punish or reward us. Theologians developed various doctrines to overcome this difficulty, but few sound convincing to modern ears – perhaps because the problem itself is no longer a live one, even for most believers. However that may be, it is interesting that many modern versions of the debate seem to take at least one of the planks of Christian theology for granted: that individuals have wills that can be bad or good, usually now expressed in the terms of people’s “blameworthiness” or (less often) “praiseworthiness.”

In this way, the modern American philosopher Joel Feinberg ironically referred to “a moral bank account” that we carry through life, which sums up our moral credits and debits in a single sum (1970: 20). Whether or not such an “account” makes sense, it is at least clear that the idea of “the will” is by no means self-explanatory. For Kant, as we shall see, it was obvious that all my choices can be summed up in a single moral evaluation, whether I have a “good” or “bad” will. Kant is equivocal, however, as to whether only God might make this evaluation, or whether human beings might also form reasonable opinions on the matter. But especially if we take the point of view of mutual, human accountability, it is not obvious why we should believe any such single evaluation to be possible, or what role this evaluation might play in our individual or collective lives. Certainly, we usually praise and blame in terms of particular actions and particular vices and virtues – not a good or bad will.

Third, this way of framing the issues creates a problematic gulf between normal moral agents (adult human beings of sound mind) and other creatures – animals and children. At some stage of evolution, and at some stage toward maturity, certain animals become “free,” when before they had all been determined in their conduct. Although it is grossly implausible that there are no relevant moral differences between the other animals, children, and human adults, it is no more plausible that the free will simply pops into existence at a certain stage of human development. (Within a Christian framework this issue was less problematic: human beings, and only human beings, have souls.) Nonetheless, we tend to think there is something sufficiently distinctive about human action, so that many non-religious people find the idea of free will plausible, and almost everyone assumes that blame (if not praise) only makes sense with regard to (mature?) human beings.

Taking the last three points together generates a further point. If the idea of the will is complex, and there is no straightforward moral dividing line between children and adults, between humans and other animals – together, these ideas suggest that a “will” is not something we all straightforwardly “have.” In other words: it is implausible that all adult humans have the same capacities, all to the same extent, that are involved in controlling action. One way of retaining the idea of the will might be to think of it as the bundle of capacities that are needed to control action in the light of moral concerns, these capacities being set only at such a level that all adult human beings of sound mind really seem to possess them. But two points need to be kept in mind about such a strategy. First, it remains the case that people will vary in how far they possess such capacities, and this variation will largely be a product of upbringing and natural qualities – that is, not something within an individual’s own control. Second, the sort of ultimate control over one’s moral character supposed in Kant’s or similar “free will” accounts is unlikely to be vindicated in this way.

3. Two Contrasting Approaches

Two influential lines of thought oppose the idea that praise and blame relate to “free will,” the metaphysical idea that we are responsible for our actions because they are controlled by us and not (simply) caused by the world around us. For the utilitarian, praise and blame, like all our other practices, can only be justified in terms of their social consequences. A more complex account was given by Aristotle, who shares the utilitarian’s sense that praise and blame have important social consequences, but also offers an extended account of how they relate to the capacities needed for moral action.

a. The Utilitarian Account

The utilitarian case is straightforward. Blame and praise encourage us to perform socially valuable actions and to avoid socially costly actions. If we know we will be blamed for greed or cruelty, for example, then we have powerful motives to avoid these. Praise and blame also involve us in making assessments of people’s strengths and weaknesses, which is important when it comes to deciding who should be entrusted with which tasks and responsibilities. The stingy person might make a good banker, but a bad organizer of social occasions.

This approach does seem to capture important truths: we want to encourage and discourage different sorts of activity, and we need to have a sense of what different people are good at. It also makes sense of why we don’t blame some actions, even if they had bad outcomes (even though, in principle, only outcomes matter to the utilitarian). If the bad outcome was not chosen by the person (for example, she was forced to act that way by someone else), then there is nothing to be gained from blaming her (much better to blame the person who forced her). Thus the utilitarian can accommodate the important fact that praise and blame relate to free action. This freedom need not be thought of in terms of metaphysical “free will,” but rather as the compatibilist freedom involved in choosing one’s actions independently of others’ interference.

But the utilitarian account faces a simple objection: does it really provide for responsibility, still more culpability? For example, if we know that someone does not respond well to criticism, it seems that the utilitarian case for blame is undermined. We would do much better to flatter and cajole them into acting differently. Of course, the utilitarian might reply that this is often what we in fact do with such people. Further, he might add that we do still blame such people when we discuss their characters behind their backs, perhaps describing them as self-righteous or stubborn. What seems to be missing in this response, however, is the idea that the person deserves blame. They seem to deserve criticism in just the same way that a faulty machine or a cracked mug deserves criticism: it’s useful that everyone knows they’re faulty, but they can hardly be described as blameworthy. Especially when we move from blame to the question of sanctions or punishment, this lack of desert seems to present a real problem for the utilitarian account.

Utilitarians face a more complex criticism, which goes beyond the scope of this entry. Historically more concerned with the actions of government than individuals, utilitarianism never developed a realistic moral psychology – that is, very roughly, an account of what makes the decent person tick. This lack of attention has permitted some of the most devastating critiques of utilitarianism, such as Bernard Williams’s and Susan Wolf’s. But if we want to understand responsibility, our capacity to accept praise and blame as well as our tendency to dole them out, then we need to have a fairly good picture of moral agency.

b. The Aristotelian Account

This is where Aristotle’s more complex account enters the story. The most famous discussion of when people can be praised and blamed for their actions remains Aristotle’s. As with the utilitarians, Aristotle saw no need to talk about praise and blame in terms of free will. Aristotle speaks of whether acts are voluntary, and whether we attribute them to a person or to other factors. Some have ascribed this way of framing the issues to a lack of moral or scientific sophistication on the part of the ancient Greeks. However, a number of modern philosophers, most prominently Bernard Williams and Martha Nussbaum, have suggested that an Aristotelian account is actually more coherent and sophisticated than those typical of modern philosophy – and, indeed, more coherent than our modern, “common sense” intuitions about moral responsibility.

At first glance, it looks as if Aristotle takes it for granted that we are responsible for our actions, so that others can reasonably praise or blame or punish us. What he does is to highlight various conditions that lessen or cancel our responsibility. He discusses force of events, threats and coercion, ignorance, intoxication and bad character. Yet, taken together, his account shows us the basic elements involved in being a person who can reasonably be praised or blamed.

The first limitation upon voluntary action that Aristotle discusses is force of circumstances. His well-known example concerns a ship caught in a storm; the sailors must throw goods overboard if the ship is not to sink (NE 1110a). In this case the action is not fully voluntary, and we would not blame the sailors for their actions. (Nor, of course, would we blame the storm: the undesirable consequence, the loss of the goods, must be chalked off as the product of natural causes, for which no one can be blamed.) Note that such cases are extreme examples of the force of necessity under which we always live – we are always constrained in our actions by circumstances, although we only tend to notice this when the constraint is sudden or unexpected. (If blame were to arise in such a situation, it would be where the sailors failed to take account of necessity, so that the ship and many aboard perished.)

In fact, it tends to be the interference of other people that causes us the most grief – and which really causes problems for responsibility attributions. Such interference can take many forms, but its paradigmatic forms are coercion and manipulation. Regarding coercion, Aristotle’s judgment is balanced. It depends on what action my coercer is demanding of me, and what threats he makes. Some actions are so heinous that we should be blamed for doing them, whatever we are threatened with (and whatever blame also attaches to our coercer) – thus Aristotle dismisses the idea that a man might be “compelled” to kill his mother (NE 1110a). This makes it clear that a central issue at stake in attributions of responsibility is the expectations that people have of one another. There are some forms of coercion we do not usually expect people to resist, but there are also some sorts of action that we think people should never undertake, regardless of such factors. In such cases praise and blame are clearly working to clarify and reinforce these expectations – in other words, they provide for a form of moral education.

Aristotle does not comment on manipulation, where other people lead us to a false view of our circumstances. But he does discuss ignorance of these circumstances, and how it undermines our responsibility. If we are ignorant of who someone is, for example – as was Oedipus, who did not know that the old man obstructing him was actually his father – we may commit acts we would otherwise abhor – thus Oedipus committed patricide, killing his own father. For Aristotle, such actions are not to be blamed (with the important provisos that the ignorance is not itself culpable and the action was otherwise justified). What decides good or bad character is how a person reacts when he finds out the truth – if we fail to regret our deeds, then we can certainly be blamed, even if the original choice was justifiable. Our regret about the deed shows that we want to disown it, and prepares us to make up for it as best we can. A lack of regret shows we are happy for the deed to have been done anyhow, even though we are now aware of facts that others think should have prevented us from acting that way.

This argument hints at an important point. For Aristotle, the moral judgment of the self may be quite different from the judgments of others. The actor should regret his action deeply but, as long as he does so, on-lookers should not blame, but rather pity or perhaps console him. If we suppose that both actor and on-looker are making a judgment about the actor’s moral worth this seems puzzlingly inconsistent. Yet Aristotle’s account has a different logic: The actor’s regret reveals his determination not to be associated with such an action. The on-lookers’ pity relates to their awareness that this “self-blame” is proper yet not earned; it is something that could fall upon anyone in the wrong circumstances. Simplifying, we could say that on-lookers make a positive judgment of the actor, based on his preparedness to make a negative judgment of himself. But this is not so paradoxical if we think of these judgments, not as relating to moral worth, but as preparations for action. Something has gone wrong, after all, and those affected seem to deserve some recompense. In such a situation, the actor will feel duty-bound to help put things right (perhaps to compensate, at any rate to apologise or show remorse). On-lookers, pitying rather than blaming, try to make his task easier, since the responsibility, in such a case, was not earned by the actor.

We have just discussed actions done in ignorance of the facts. But not every form of ignorance excuses; factual knowledge is very different from moral knowledge. What if a man did not know murder was wrong? Would this make his murders morally innocent? Aristotle says not: there are certain things we can and do expect people to know – above all, basic moral truths such as the wrongness of murder. But this knowledge is not as straightforward as it might appear: it must include a fairly good capacity to judge which sorts of killing count as murder. Nazi bureaucrat Adolf Eichmann organized the killing of thousands, without a sense of its wrongness. Aristotle is clear: such moral ignorance, an inability or failure to judge, excuses no adult. Eichmann should be held responsible for murder. But why should moral ignorance not excuse, when factual ignorance does? We must recognize that moral knowledge is actually rather different from factual knowledge. If a person is morally ignorant it is his whole character, his lasting ability to judge and act well, that is impaired – and presumably very difficult to set right. Isolated errors in factual knowledge, on the other hand, can be easily corrected. So long as we subsequently recognize and regret what we have done, factual mistakes involve no lasting corruption of character.

Still, if a person is morally ignorant it follows that they are unable to choose well. Aristotle agrees, arguing that those of settled bad character – be they morally ignorant or otherwise – are unable to make decent moral judgments. Does this mean that blame is incoherent or misplaced? He claims not. Even if the vicious person cannot now choose to act otherwise, there was a time when her vices were not fixed, when she could have chosen not to be vicious. Therefore, Aristotle says, she can be blamed. This is neat but rather unconvincing. Aristotle is famous for emphasising the importance of good upbringing and habituation, and presumably many vices are formed in childhood, before people have formed capacities for deliberating reasonably. Indeed, many vices undercut the capacity for rational deliberation. So it is a clear implication of Aristotle’s own account that the badly brought up person may never be in a position to choose not to be vicious. Note, further, that this move represents Aristotle at his most Kantian: blame is justified by reference to control, to a “could have done otherwise” – even when his own account of character formation suggests that such control probably never existed.

What are we to say, then, when a person seems unlikely to change: she appears quite settled in some particular vice, either because she cannot understand the criticism or because she is unable to alter her character or habits? Such cases are very common, and – unless we suppose that they are not morally deplorable – seem to undermine the modern assumption that blame must relate only to conduct under our control. (The same sort of argument can also be made with praise: a virtuous person might be quite unable to do certain things – commit cruelty, for example.) Clearly, if we think a character trait is really beyond alteration, by us or by the person concerned, our blaming won’t involve an attempt to reason with the person we condemn. But our condemnation might have another rationale: for example, to clarify what sort of standards we expect of others, or to signal our fellow-feeling with those who have been adversely affected by someone’s vices.

In sum, Aristotle’s account is not entirely self-consistent. Generally his focus is two-fold: upon the qualities of character revealed by acts, in terms of our overall moral expectations; and upon the responsibilities that must be borne, given the effects of an action. For most of the time, his account proceeds without much reference to desert, and it is this neglect that seems to pose the chief difficulty for the Aristotelian story. It is interesting, then, that Aristotle himself sometimes suggests that bad qualities are to be blamed because they were originally subject to choice, even though this quasi-Kantian claim is not (on his own account of character formation) really supportable. Whether or not Aristotle should have made this argument, it does show how powerful is the thought that blame must be justified in terms of what the person herself chose – however long ago that choice supposedly was made.

Despite this, philosophers have returned to Aristotle’s account again and again to illuminate key ingredients of responsible agency.

  • The capacity to respond to others’ censure and encouragement, whether expressed emotionally (e.g., as resentment) or in the more articulated forms of praise and blame.
  • A reasonable grasp of how actions are understood by people around us and how they affect others, including the need to share out responsibilities for “patching things up” where something has gone wrong. (That we praise and blame children, however, emphasises the educative and encouraging role that praise and blame play in developing such knowledge.)
  • Together with our own ability to express judgments of others, these capacities allow us to participate in forms of mutual accountability, whereby we inculcate and to some extent enforce shared standards of action.

This list is not comprehensive, but it serves to illustrate the underlying point of an Aristotelian account: our praising and blaming of one another rest on these sorts of fairly basic capacities, which do not seem to demand any strong metaphysical elaboration. Indeed, if we approach the matter this way, the puzzle seems to be inverted. Not, “how might free will and determinism be reconciled?”; rather, “why should we feel there is a metaphysical issue at all?”

4. The Kantian Account and Moral Worth

We have seen that the Aristotelian and utilitarian accounts face a common criticism. Illuminating as they may be, they seem to pay too little attention to the question of desert, or culpability. Is the vicious person blameworthy? Does the person of good will, however much she is hindered by bad luck and hard circumstances, not deserve moral recognition? Our intuitions tend to answer such questions affirmatively. And the most usual justification is that the bad person has less moral worth than the person of good will, and therefore deserves blame and perhaps even punishment. A utilitarian such as J.J.C. Smart sees such justifications as “pharisaical” – that is, as hypocritically self-righteous, and encouraging of excessively moralistic forms of blame and retribution. But there is no denying the power and influence of such justifications.

The reason why so many people – within and without academic philosophy – feel the pull of the free will debate lies in the idea of moral worth we often associate with responsibility attributions such as blame. Galen Strawson expresses the core idea as follows: “if we have [true responsibility], then it makes sense, at least, to suppose that it might be just to punish some with eternal torment in hell, and reward others with eternal bliss in heaven” (1991: viii). Any such “ultimate” merit or demerit clearly has to be a matter of strictly individual desert. If it were merely a matter of chance who went to heaven or hell – or who would do so, if those fates really existed – this would plainly be a matter of mere fortune. Such intense good or bad luck would make the world even more morally arbitrary than it already is. If such merit is to be fairly allocated, therefore, it needs to be seen as something that lies within individuals’ own control. This line of thought, in turn, is based on what John Skorupski calls an “ideal of pure egalitarian desert” (1999: 156). Modern morality regards each person as equal in moral standing, as having an intrinsic dignity and deserving of equal respect. The thought is that we all equally possess control over our will, so that it makes sense to imagine everybody reaping an equally fair return on how well we exercise that control. (Clearly, this line of thought goes against the idea of the will referred to above, as a “bundle” of capacities unequally distributed among human beings.)

The thinker who grapples most systematically with these questions is Kant. He sees us all as equal in our capacity to strive for morality. But he knows that we don’t all do this, and claims that only some are worthy of happiness.

For Kant, our moral worth – the goodness of our will – is gauged by how sincerely and persistently we have sought to do our duty. To do our duty may be much harder for some people, for instance, those who have violent passions or who were brought up with bad habits. But moral worth is not about results; it is about the will. We all have such a will, an ability to choose well, despite the fact that some of us face stronger counter-inclinations or more difficult circumstances. To truly judge a person’s moral worth involves seeing past all the obstacles that their will has faced. Kant argues that this makes moral worth impossible for us to judge with any assurance; only God can see beyond all those things. This lack of knowledge corresponds to Kant’s main concern, which is how we judge ourselves. Our concern should be to do the right thing, and to do it because it is the right thing. To Kant it’s no problem that we’re never sure about others’ wills, and the obstacles or benefits they have faced. The point is that we can never be sure of our own motivations, and must always be attempting to do better in the future.

Moreover, Kant claims we are all equally well able to see what we should do. For Kant “even the most hardened scoundrel” would act morally, were it not for the opposing incentives of his inclinations and desires (Groundwork, 4:454). Kant needs to claim this because otherwise he would not be able to justify condemning people who suppose they are doing the right thing, when in fact their acts are quite wicked – the problem of the self-righteous wrong-doer. Adolf Eichmann, whom we mentioned before, seems to have been sincere in thinking his acts were defensible (he even justified his actions with a twisted version of Kant’s moral philosophy!). Yet no one, and certainly not Kant, would doubt that he deserved the gravest condemnation for his crimes. In simplest form, the Kantian thought is that, if only we wanted to, we could all see that certain things are wrong – for example, no one could possibly want a world where everyone committed actions like Eichmann’s. Nonetheless, such examples are problematic for Kant, because it does seem implausible that people are equal in their capacities for moral knowledge. People’s sensitivity to different moral considerations is highly variable, and is clearly shaped by up-bringing and environment.

(By way of contrast, it may be worth noting that from an Aristotelian perspective, the realities of moral ignorance and moral disagreement pose no theoretical problems. In fact, they provide an important justification for praise and blame in terms of mutual accountability – that is, they help with moral learning by communicating when we have met or failed to meet moral standards. But because Kant’s account goes inward, to my scrutiny of my motives and intentions, he says remarkably little about this crucial educative aspect of responsibility attributions.)

Modern Kantian writers differ on how to deal with these two issues, the invisibility of the will and the claim that we share equal access to moral knowledge. One important line of thought is Christine Korsgaard’s. When we blame someone, she claims, we are recognising his capacity to reason about his conduct. Many people have felt that it is “enlightened” not to blame people for bad conduct, and instead to offer explanations that excuse or mitigate – for instance, by taking a person’s anti-social behaviour to have been caused by a bad childhood rather than a bad will. But Kantians insist that this is to deny someone recognition as a rational agent, as someone capable of choosing his action in the light of reasons. This corresponds to the important intuition that there is something patronising about making excuses for people, and not taking their own point of view seriously.

It is not clear whether blame, on this account, need have any link with the idea that someone’s will has proved defective; and it is this which is important if we are to give a place to culpability within the Kantian schema. Modern Kantians usually concede that Kant was too optimistic about our ability always to see the right thing to do. In this case, it is sometimes difficult for us to judge correctly, and so we have to work together at discovering the moral standards applicable in complex situations. Clearly, then, we need to communicate concerning the rights and wrongs of our individual actions. What this seems to omit, however, is the fact that desert is in play when we blame: blame often has an emotional content, and rarely sounds like a disinterested conversation about what would have been the right thing to do. One reason for this, in turn, is that we are identified by our acts, and tend to identify ourselves with them: if our acts are faulty, and none of the standard excusing conditions apply (such as factual ignorance, as discussed by Aristotle), so too must our character be, if blame is to be deserved. (On the other hand, perhaps it is true that we tend to “take things too personally.”)

This points to a real difficulty for Kantians. Moral evaluation is supposed to concern the will, not all the other complicated factors that have formed our character. (Aristotelians, and many others, reject the idea that such a separation can be made, even in theory.) Although Kantians think such a separation is theoretically possible, in practice they concede that we can only guess at the will. This seems to suggest that we should not blame one another, inasmuch as blame implies culpability, an individual failure to will rightly. But this leaves us with two unrealistic alternatives. One is that we explain bad conduct in terms of mitigating factors, which is plainly unattractive, for the very good Kantian reason that it fails to respect people as the choosers of their deeds. Yet the other obvious alternative, that instead of blame we should pursue an enlightened, as well as enlightening, conversation about correct responses to situations, is patently unreal. If people as we know them are going to change, or learn, by and large it will not be unemotional reasoning that alters them, but the many forces that speak to all aspects of character – for instance, resentment, shame, force of opinion. Yet, for all that these characteristic aspects of blame do not operate on the will (as Kantians conceive it), they certainly convey moral disapproval, and can be very effective.

5. The Idea of Moral Worth

The notion of moral worth central to Kant’s account is probably what one writer on ancient Greek ethics – A.W.H. Adkins – had in mind when he said, “We are all Kantians now” (1960: 2). Kant’s idea attractively reconciles two broad value judgments: (i) the egalitarian idea that all persons are moral equals by virtue of having freedom to choose morally; and (ii) the idea that responsibility relates to desert, so that people can nonetheless be judged very differently – some being condemned for their lives and characters, others praised. Although we have seen serious problems with the idea that people have an equal ability to choose well, most people agree that blame which attaches to parts of our character that we cannot control is deeply unfair. Does this mean, then, that we should accept a Kantian idea of moral worth, where praise and blame are understood as responses to people’s ultimate deserts?

To begin with, contrast Kant with Aristotle. Aristotle makes no claims about a person’s ultimate merit or demerit. People might be vicious or virtuous in various ways, and there might be rare paragons who possess a comprehensive set of virtues (yes, these are philosophers). Naturally we would not want to associate with the vicious, and naturally we will want to condemn their vices in no uncertain terms: It might help them to learn to do better, and it may caution others against them, and it should reinforce our own and other people’s sense of what character traits are desirable. But for Aristotle there is no sense that the vicious are earning a lasting form of discredit that should condemn them in the eyes of an ultimate judge. If the vicious person were to protest to Aristotle that the condemnations he faced were unfair, perhaps because his character had been shaped by his vicious parents, one suspects Aristotle would be rather unmoved. Life isn’t fair, he might say, and we certainly won’t make it fairer by pretending some vices are less real because of their origin in early childhood, let alone because of their fixity within an individual’s character. It may be unpleasant (he might continue) for you to hear this blame and condemnation – indeed, I’m glad that it is, because at least it shows that you are not so vicious that you don’t care about others’ opinions of you – but there are other matters at stake here, above all the standards and expectations which regulate all our lives together.

So Aristotle’s characteristic view is that some people just are better than others, in their abilities to choose rightly as in other regards. Given this “brute fact,” it is all the more important to give attention to mutual moral education and ensuring that people feel the need to take responsibility where things have gone wrong. Yet it does seem true that Aristotle paid too little attention to the question of desert. We can see this by recalling that he is not wholly consistent here. As we saw, he does try to justify our blame of the vicious person in terms of that person’s choice to become vicious, supposing that otherwise our condemnation would be unfair. Nonetheless, the main thrust of his account seems to be that Kant’s egalitarian fairness is not something we can really achieve.

On the other hand, it is difficult to deny the basic, very appealing intuition of Kant’s ethics: that people’s happiness should correspond to their moral worth – to the sincere intentions that are within everyone’s control. Apart from its appeal to fairness, this conception is also plausible because it corresponds well to several features of praise and blame. We do tend to judge the intent behind people’s actions, rather than the often haphazard results of their deeds. We take account of people’s circumstances, and judge less harshly where these place hard or immoral pressures on people. We also, quite often, feel that allowances should be made for the effects on character of abusive or deprived upbringings. In each case, we can interpret these concessions in Kantian terms – as drawing a distinction between the person’s will and the obstacles of circumstance, thus keeping our moral evaluation to what is within a person’s control – and, therefore, what concerns their deserts.

There are, however, reasons to doubt whether this Kantian interpretation is really the best account of these intuitions. The most obvious problem is that we often expect people to take responsibility for things they didn’t intend. This is not only in those cases where we judge that someone should have formed their intentions more carefully. Certainly we judge the negligent driver who causes an accident more harshly than a driver who was careful but nevertheless caused an accident. But even in the latter case, we expect the driver to bear important responsibilities. The problem that many of the things which attract moral culpability are wholly or partly outside of individual control is connected with the problem of moral luck. It is important to realise, however, that this problem is based on the Kantian idea that moral judgments, be it of character or future responsibilities, are deserved because they relate to a person’s “moral worth.”

Aristotle’s account offers a different way of understanding these everyday intuitions about when blame is justified. On his account we are judging the character of the person we are dealing with, based on how they act, how seriously they take their responsibilities, and how they respond to others’ responsibility attributions. To judge such questions we do indeed give a lot of weight to a person’s intentions: obviously, an intended action reveals a person’s character especially clearly. At the same time, we need to appreciate what he knew about the situation he was responding to, what pressures he was under, and special factors affecting his ability to deliberate and choose. Hence Aristotle’s concern with factual ignorance, force of circumstances, and intoxication; and we might note the more modern concern with mental illness. On an Aristotelian line, the point is that these factors alter the extent to which actions reveal the character of the person. That they undermine the person’s “control” is true, but subsidiary. To support this thought, we might consider how certain forms of bad character constitute a lack of control over one’s actions – thus the person who is weak-willed or indecisive, for example. Here weak-willed, indecisive action reveals the person, and her inability to control her actions.

This suggests that we do not need to accept Kant’s will-based view, where blame relates to moral worth. But we might still wonder if the other accounts can explain the culpability aspect of blame, the idea that it relates to desert.

Both utilitarians and Aristotelians can agree that at least one sense of desert clearly applies. A person deserves to be judged accurately, just as the facts deserve to be assessed truly, if they are to be assessed at all. As we need to judge one another, then clearly we deserve to be assessed fairly. But this doesn’t quite take us to the idea that a person has earned blame, for the fact is that a negative judgment of our character is unpleasant and costly. After all, human beings understand such judgments, and feel their effects, in a way that other entities do not.

There is another question of desert: praise raises the possibility of reward, while blame almost automatically suggests we ought to do something to make up for what we have done or how we have been. Moral philosophers continue to dispute whether utilitarians can give a proper account of this sort of responsibility. But we have already seen how Aristotle could respond. On his view responsibility attributions have a practical aspect: they are preparations for action. It is obvious that when something has gone wrong, we need to distribute the resulting responsibilities: who should pay compensation, apologise, or even be punished. If we take the view that there are always duties to be done, including making good when things have gone wrong, then the question is not what the results say about people’s moral worth, but rather how responsibilities for making good can be fairly divvied up.

But whether this is enough to justify the sense of desert that tends to attach to judgments of blame, or whether we tend to be too keen to invest blame with ideas of personal desert – these are questions much beyond the scope of this entry.

6. Conclusion

Praise and blame relate to our sense of people as capable of taking responsibility for their actions. As we saw, ideas about responsibility are usually presented in terms of a contest between two positions, compatibilism and incompatibilism. Incompatibilists accept the dilemma of free will versus determinism: responsibility depends on me controlling my actions, rather than other causal influences that operate around me. Praise, but especially blame, make no sense if determinism is true. Compatibilists, on the other hand, want to insist that the causal well-orderedness of the universe is, precisely, compatible with our responsibility for our actions. But for most philosophers the question is not whether responsibility and causal well-orderedness are compatible, but how. In other words, to adapt Adkins’s adage, “we are all compatibilists now.”

The essential issue for any compatibilist position lies in the conception of responsibility it relies on – an issue much less well-explored by philosophers than the metaphysics of freedom and determinism. This article has contrasted three broad schools of thought on how we put responsibility into practice, by praising and blaming one another. When Adkins claimed that “we are all Kantians now,” he was not referring to Kant’s (incompatibilist) metaphysics but rather to our tendency to feel that responsibility attributions must have depth, that they reflect something about a person’s “real” deserts. Yet this position leads us to claims about control over the self, to the idea of choices that are really ours and not the result of any external influence. In other words, it is more difficult than it may seem to separate Kant’s position from his metaphysical account of freedom and the incompatibilism which he, above all other writers, so strongly articulated.

The roughly Aristotelian alternative discussed here has been most influentially articulated in Bernard Williams’s critique of modern accounts of morality, which he thinks are most clearly expressed in Kant’s philosophy. Williams argues that these ideas neither make sense on their own terms, nor do they make sense of what we actually do when we do engage in attributions of responsibility. As we have seen, Aristotle’s account of praise and blame is based on: (i) how far acts reveal character; (ii) the fair distribution of responsibilities to act; and (iii) the attempt to exchange reasons, share standards, and maintain relationships with those whom we judge – and who judge us in turn.

What both the Aristotelian and utilitarian accounts lack is the deep thirst for equality and fairness which motivates Kant. Aristotle’s account provides no equivalent to the Kantian will – some moral quantity which all human beings possess and which grounds the idea of their equal worth. Nor does it really satisfy the widespread sense that moral judgment should offer fairness – even though the world does not. There is a deeply appealing sense of fairness in Kant’s concern to do justice to each person’s will, by isolating some moral core to the person independent of all formative and environmental factors. Even if wicked people prosper and the innocent suffer, our moral judgment of each constitutes a deep and subtle form of compensation: with regard to what really matters, the one is lacking while the other is undiminished. Even if goodness is made much harder for some, and its results may be correspondingly less, nonetheless we should try to see past those externals, once more, to what really matters.

To this, the Aristotelian and the utilitarian alike may say: to treat praise and blame as reflecting such a pure form of desert is to lose touch with what really matters about them. Praise and blame help us live together in a world where ultimate deserts are impossible to make out, if they exist at all. But just because we cannot make out people’s “moral worth,” it is still true that we need to take responsibility – not least, in our openness to one another’s praise and blame.

7. References and Further Reading

  • Adkins, A.W.H. (1960) Merit and responsibility, Clarendon Press, Oxford.
  • Aristotle, Nicomachean ethics (the most readable translation is Roger Crisp’s, Cambridge University Press, Cambridge, 2000).
  • Feinberg, Joel (1970) Doing and deserving: essays in the theory of responsibility (Princeton University Press, Princeton NJ).
    • A set of classic essays on responsibility for action, including justifications of praise and blame.
  • Fingarette, Herbert (1967) On responsibility (Basic Books, New York).
    • Another set of classic essays, including the argument that blame is intelligible insofar as it connects up with someone’s pre-existing concern for others.
  • Kant, Immanuel (1785) Groundwork of the metaphysics of morals (the best translation is Mary Gregor’s, Cambridge University Press, Cambridge, 1998).
  • Korsgaard, Christine (1996) “Creating the Kingdom of Ends: Reciprocity and Responsibility in Personal Relations” in her Creating the kingdom of ends (Cambridge University Press, Cambridge).
    • A sophisticated Kantian account of praise and blame.
  • Skorupski, John (1999) “The definition of morality” in his Ethical explorations (Oxford University Press, Oxford).
  • Smart, J.J.C. (1961) “Free will, praise and blame” Mind 70, 291-306.
    • A clear and succinct utilitarian account of praise and blame.
  • Smiley, Marion (1992) Moral responsibility and the boundaries of community: power and accountability from a pragmatic point of view (University of Chicago Press, Chicago).
    • Criticises conventional discussions of freedom and determinism, claiming that they fail to investigate the idea of responsibility.
  • Strawson, Galen (1991) Freedom and belief (Clarendon, Oxford).
  • Strawson, Peter (1974) “Freedom and resentment” in his Freedom and resentment and other essays (Methuen, London).
    • This famous essay resituates the free will debate by highlighting the importance of “reactive attitudes” such as resentment to interpersonal relations.
  • Williams, Bernard (1993) Shame and necessity (University of California Press, Berkeley CA).
    • A sustained argument that the ancient Greeks had a nuanced and sophisticated account of responsibility attributions.
  • Williams, Bernard (1995a) “How free does the will need to be?” in his Making sense of humanity and other philosophical papers, 1982-1993 (Cambridge University Press, Cambridge).
  • Williams, Bernard (1995b) “Voluntary acts and responsible agents,” in his Making sense of humanity.

Author Information

Garrath Williams
Email: g.d.Williams@lancaster.ac.uk
University of Lancaster
United Kingdom

Diogenes of Sinope (c. 404—323 B.C.E.)

The most illustrious of the Cynic philosophers, Diogenes of Sinope serves as the template for the Cynic sage in antiquity. An alleged student of Antisthenes, Diogenes maintains his teacher’s asceticism and emphasis on ethics, but brings to these philosophical positions a dynamism and sense of humor unrivaled in the history of philosophy. Though he was originally from Sinope, the majority of the stories comprising his philosophical biography occur in Athens, and some of the most celebrated of these place Alexander the Great or Plato as his foil.

It is disputed whether Diogenes left anything in writing. If he did, the texts he composed have since been lost. In Cynicism, living and writing are two components of ethical practice, but Diogenes is much like Socrates and even Plato in his sentiments regarding the superiority of direct verbal interaction over the written account. Diogenes scolds Hegesias after he asks to be lent one of Diogenes’ writing tablets: “You are a simpleton, Hegesias; you do not choose painted figs, but real ones; and yet you pass over the true training and would apply yourself to written rules” (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 48). In reconstructing Diogenes’ ethical model, then, the life he lived is as much his philosophical work as any texts he may have composed.

Table of Contents

  1. Life
  2. Philosophical Practice: A Socrates Gone Mad
  3. References and Further Reading

1. Life

The exceptional nature of Diogenes’ life makes it difficult to determine the exact events that make it up. He was a citizen of Sinope who either fled or was exiled because of a problem involving the defacing of currency. Thanks to numismatic evidence, the adulteration of Sinopean coinage is one event about which there is certainty. The details of the defacing, though, are murkier: “Diocles relates that [Diogenes] went into exile because his father was entrusted with the money of the state and adulterated the coinage. But Eubulides in his book on Diogenes says that Diogenes himself did this and was forced to leave home along with his father” (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 20). Whether it was Diogenes or his father who defaced the currency, and for whatever reasons they may have done so, the act led to Diogenes’ relocation to Athens.

Diogenes’ biography becomes, historically, only sketchier. For example, one story claims that Diogenes was urged by the oracle at Delphi to adulterate the political currency, but misunderstood and defaced the state currency (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 20). A second version tells of Diogenes traveling to Delphi and receiving this same oracle after he had already altered the currency, turning his crime into a calling. It is, finally, questionable whether Diogenes ever consulted the oracle at all; the Delphic advice is curiously close to Socrates’ own injunction, and the interweaving of life and legend in Diogenes’ case is just as substantial.

Once in Athens, Diogenes famously took a tub, or a pithos, for an abode. In Lives of Eminent Philosophers, it is reported that Diogenes “had written to some one to try and procure a cottage for him. When this man was a long time about it, he took for his abode the tub in the Metroön, as he himself explains in his letters” (Diogenes Laertius, Book 6, Chapter 23). Apparently Diogenes discovered that he had no need for conventional shelter or any other “dainties” from having watched a mouse. The lesson the mouse teaches is that he is capable of adapting himself to any circumstance. This adaptability is the origin of Diogenes’ legendary askēsis, or training.

Diogenes Laertius reports that Diogenes of Sinope “fell in” with Antisthenes who, though not in the habit of taking students, was worn out by Diogenes’ persistence (Lives of Eminent Philosophers, Book 6, Chapter 22). Although this account has been met with suspicion, especially given the likely dates of Diogenes’ arrival in Athens and Antisthenes’ death, it supports the perception that the foundation of Diogenes’ philosophical practice rests with Antisthenes.

Another important, though possibly invented, episode in Diogenes’ life centers on his enslavement in Corinth after having been captured by pirates. When asked what he could do, he replied “Govern men,” which is precisely what he did once bought by Xeniades. He was placed in charge of Xeniades’ sons, who learned to follow his ascetic example. One story tells of Diogenes’ release after having become a cherished member of the household, another claims Xeniades freed him immediately, and yet another maintains that he grew old and died at Xeniades’ house in Corinth. Whichever version may be true (and, of course, they all could be false), the purpose is the same: Diogenes the slave is freer than his master, whom he rightly convinces to submit to his obedience.

Though most accounts agree that he lived to be quite old, some suggesting he lived until ninety, the tales of Diogenes’ death are no less multiple than those of his life. The possible causes of death include a voluntary demise by holding his breath, an illness brought on by eating raw octopus, or death by dog bite. Given the embellished feel of each of these reports, it is more likely that he died of old age.

2. Philosophical Practice: A Socrates Gone Mad

When Plato is asked what sort of man Diogenes is, he responds, “A Socrates gone mad” (Diogenes Laertius, Book 6, Chapter 54). Plato’s label is representative, for Diogenes’ adaptation of Socratic philosophy has frequently been regarded as one of degradation. Certain scholars have understood Diogenes as an extreme version of Socratic wisdom, offering a fascinating, if crude, moment in the history of ancient thought, but one which ought not to be confused with the serious business of philosophy. This reading is influenced by the mixture of shamelessness and askēsis which riddles Diogenes’ biography. This understanding, though, overlooks the centrality of reason in Diogenes’ practice.

Diogenes’ sense of shamelessness is best seen in the context of Cynicism in general. Specifically, though, it stems from a repositioning of convention below nature and reason. One guiding principle is that if an act is not shameful in private, that same act is not made shameful by being performed in public. For example, it was contrary to Athenian convention to eat in the marketplace, and yet there he would eat for, as he explained when reproached, it was in the marketplace that he felt hungry. The most scandalous of these sorts of activities involves his indecent behavior in the marketplace, to which he responded “he wished it were as easy to relieve hunger by rubbing an empty stomach” (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 46).

He is labeled mad for acting against convention, but Diogenes points out that it is the conventions which lack reason: “Most people, he would say, are so nearly mad that a finger makes all the difference. For if you go along with your middle finger stretched out, some one will think you mad, but, if it’s the little finger, he will not think so” (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 35). In these philosophical fragments, reason clearly has a role to play. There is a report that Diogenes “would continually say that for the conduct of life we need right reason or a halter” (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 24). For Diogenes, each individual should either allow reason to guide her conduct, or, like an animal, she will need to be led by a leash; reason guides one away from mistakes and toward the best way in which to live life. Diogenes, then, does not despise knowledge as such, but despises pretensions to knowledge that serve no purpose.

He is especially scornful of sophisms. He disproves an argument that a person has horns by touching his forehead, and in a similar manner, counters the claim that there is no such thing as motion by walking around. He elsewhere disputes Platonic definitions and from this comes one of his more memorable actions: “Plato had defined the human being as an animal, biped and featherless, and was applauded. Diogenes plucked a fowl and brought it into the lecture-room with the words, ‘Here is Plato’s human being.’ In consequence of which there was added to the definition, ‘having broad nails’” (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 40). Diogenes is a harsh critic of Plato, regularly disparaging Plato’s metaphysical pursuits and thereby signaling a clear break from primarily theoretical ethics.

Diogenes’ talent for undercutting social and religious conventions and subverting political power can tempt readers into viewing his position as merely negative. This would, however, be a mistake. Diogenes is clearly contentious, but he is so for the sake of promoting reason and virtue. In the end, for a human to be in accord with nature is to be rational, for it is in the nature of a human being to act in accord with reason. Diogenes has trouble finding such humans, and expresses his sentiments regarding his difficulty theatrically. Diogenes is reported to have “lit a lamp in broad daylight and said, as he went about, ‘I am searching for a human being’” (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 41).

For the Cynics, life in accord with reason is lived in accord with nature, and therefore life in accord with reason is greater than the bounds of convention and the polis. Furthermore, the Cynics claim that such a life is the life worth living. As a homeless and penniless exile, Diogenes experienced the greatest misfortunes of which the tragedians write, and yet he insisted that he lived the good life: “He claimed that to fortune he could oppose courage, to convention nature, to passion reason” (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 38).

3. References and Further Reading

  • Billerbeck, Margarethe. Die Kyniker in der modernen Forschung. Amsterdam: B.R. Grüner, 1991.
  • Branham, Bracht and Marie-Odile Goulet-Cazé, eds. The Cynics: The Cynic Movement in Antiquity and Its Legacy. Berkeley: University of California Press, 1996.
  • Dudley, D. R. A History of Cynicism from Diogenes to the 6th Century A.D. Cambridge: Cambridge University Press, 1937.
  • Goulet-Cazé, Marie-Odile. L’Ascèse cynique: Un commentaire de Diogène Laërce VI 70-71, Deuxième édition. Paris: Librairie Philosophique J. Vrin, 2001.
  • Goulet-Cazé, Marie-Odile and Richard Goulet, eds. Le Cynisme ancien et ses prolongements. Paris: Presses Universitaires de France, 1993.
  • Diogenes Laertius. Lives of Eminent Philosophers Vol. I-II. Trans. R.D. Hicks. Cambridge: Harvard University Press, 1979.
  • Long, A.A. and David N. Sedley, eds. The Hellenistic Philosophers, Volume 1 and Volume 2. Cambridge: Cambridge University Press, 1987.
  • Malherbe, Abraham J., ed. and trans. The Cynic Epistles. Missoula, Montana: Scholars Press, 1977.
  • Navia, Luis E. Diogenes of Sinope: The Man in the Tub. Westport, Connecticut: Greenwood Press, 1990.
  • Navia, Luis E. Classical Cynicism: A Critical Study. Westport, Connecticut: Greenwood Press, 1996.
  • Paquet, Léonce. Les Cyniques grecs: fragments et témoignages. Ottawa: Presses de l’Université d’Ottawa, 1988.

Author Information

Julie Piering
Email: japiering@ualr.edu
University of Arkansas at Little Rock
U. S. A.

Models

The word “model” is highly ambiguous, and there is no uniform terminology used by either scientists or philosophers. Here, a model is considered to be a representation of some object, behavior, or system that one wants to understand. This article presents the most common types of models found in science as well as the different relations (traditionally called “analogies”) between models and between a given model and its subject. Although once considered merely heuristic devices, models are now seen as indispensable to modern science. Many different types of models are used across the scientific disciplines. The most familiar are physical models such as scale replicas of bridges or airplanes. These, like all models, are used because of their “analogies” to the subjects of the models. A scale model airplane has a structural similarity or “material analogy” to the full-scale version. This correspondence allows engineers to infer dynamic properties of the airplane based on wind tunnel experiments on the replica. Physical models also include abstract representations, which often include idealizations such as frictionless planes and point masses. Another, completely different type of model is constituted by sets of equations. These mathematical models were not always deemed legitimate models by philosophers. Model-to-subject and model-to-model relations are described using several different types of analogies: positive, negative, neutral, material, and formal.

Like unobservable entities, models have been the subject of debate between scientific realists and antirealists. One’s position often depends on what one considers the truth-bearers in science to be. Those who take fundamental laws and/or theories to be true believe that models are true in inverse proportion to the degree of idealization used. Highly idealized models would therefore be (in some sense) less true. Others take models to be true only insofar as they describe the behavior of empirically observable systems. This empiricism leads some to believe that models built from the bottom-up are realistic, while those derived in a top-down manner from abstract laws are not.

Models also play a key role in the semantic view of theories. What counts as a model on this approach, however, is more closely related to the sense of models in mathematical logic than in science itself.

Table of Contents

  1. Models in Science
  2. Physical Models
  3. Mathematical Models
  4. State Spaces
  5. Models and Realism
  6. Models and the Semantic View of Theories
  7. References and Further Reading

1. Models in Science

The word “model” is highly ambiguous, and there is no uniform terminology used by either scientists or philosophers. This article presents the most common types of models found in science as well as the different relations (traditionally called “analogies”) between models and between a given model and its subject. For most of the 20th century, the use of models in science was a neglected topic in philosophy. Far more attention was given to the nature of scientific theories and laws. Except for a few philosophers in the 1960s, Mary Hesse in particular, most did not think the topic was particularly important. The philosophically interesting parts of science were thought to lie elsewhere. As a result, few articles on models were published in the twenty-five years following Hesse’s (1966). [These include (Redhead, 1980) and (Wimsatt, 1987), and parts of (Bunge, 1973) and (Cartwright, 1983).] The situation is now quite different. As philosophers of science have come to pay greater attention to actual scientific practice, the use of models has become an important area of philosophical analysis.

2. Physical Models

One familiar type of model is the physical model: a material, pictorial, or analogical representation of (at least some part of) an actual system. “Physical” here is not meant to convey an ontological claim. As we shall see, some physical models are material objects; others are not. Hesse classifies many of these as either replicas or analogue models. Examples of the former are scale models used in wind tunnel experiments. There is what she calls a “material analogy” between the model and its subject, that is, a pretheoretic similarity in how their observable properties are related. Replicas are often used when the laws governing the subject of the model are either unknown or too computationally complex to derive predictions. When a material analogy is present, one assumes that a “formal analogy” also exists between the subject and the model. In a formal analogy, the same laws govern the relevant parts of both the subject and model.

Analogue models, in contrast, have a formal analogy with the subject of the model but no material analogy. In other words, the same laws govern both the subject and the model, although the two are physically quite different. For example, ping-pong balls blowing around in a box (like those used in some state lotteries) constitute an analogue model for an ideal gas. Some analogue models were important before the age of digital computers when simple electric circuits were used as analogues of mechanical systems. Consider a mass M on a frictionless plane that is subject to a time varying force f(t) (Figure 1). This system can be simulated by a circuit with a capacitor C and a time varying voltage source v(t). The voltage across C at time t corresponds to the velocity of M.

Figure 1: Analogue Machine

Today engineers and physicists are more familiar with simplifying models. These are constructed by abstracting away properties and relations that exist in the subject. Here we find the usual zoo of physical idealizations: frictionless planes, perfectly elastic bodies, point masses, and so forth. Consider a textbook mass-spring system with only one degree of freedom (that is, the spring oscillates perfectly along one dimension) shown in Figure 2. This particular system is physically possible, but nonactual. Real springs always wobble just a bit. If by chance a spring did oscillate in one dimension for some time, the event would be unlikely but would not violate any physical laws. Frictionless planes, on the other hand, are nonphysical rather than merely nonactual.

Figure 2: Physical Water Drop Model

Simplifying models provide a context for Hesse’s other relations known as positive, negative, and neutral analogies. Positive analogies are the ways in which the subject and model are alike: the properties and relations they share. Negative analogies occur when there is a mismatch between the two. The idealizations mentioned in the previous paragraph are negatively analogous to their real-world subjects. In a scale-model airplane (a replica), the length of the wing relative to the length of the tail is positively analogous since the ratio is the same in the subject and the model. The wood used to make the model is negatively analogous since the real airplane would use different materials. Neutral analogies are relations that are in fact either positive or negative, but it is not yet known which. The number of neutral analogies is inversely related to our knowledge of the model and its subject. One uses a physical model with strong, positive analogies in order to probe its neutral analogies for more information. Ideally, all neutral analogies will be sorted into either positive or negative. The early success of the Bohr model of the atom showed that it had positive analogies to real hydrogen atoms. In Hesse’s terms, the neutral analogies proved to be negative when the model was applied to atoms with more than one electron.

The use of “analogy” in this regard has declined somewhat in recent years. “Idealization” has replaced “negative analogy” when these simplifications are built into physical models from the start. The degree to which a model has positive analogies is more typically described by how “realistic” the model is. One might also use the notion of “approximate truth”—a term long recognized as more suggestive than precise. The rough idea is that more realistic models—those with stronger positive analogies—contain more truth than others. “Negative analogy” contains an ambiguity. Some are used at the beginning of the model-building process. The modeler recognizes the false properties for what they are and uses them for a specific purpose—usually to simplify the mathematics. Other negative analogies, known as “artifacts,” are unintended consequences of idealizations, data collection, research methods, and limitations of the medium used to construct the model. Some artifacts are benign and obvious. Consider the wooden models of molecules used in high school chemistry classes. Three balls held together by sticks can represent a water molecule, but the color of the balls is an artifact. (As the early moderns were fond of pointing out, atoms are colorless.) Other artifacts are produced by measuring devices. It is impossible, for example, to fully shield an oscilloscope from the periodic signal produced by its AC current source. This produces a periodic component in the output signal not present in the source itself.

The heavy emphasis here on models in the physical sciences has more to do with the interests of philosophers than scientific practice. Physical models are used throughout the sciences, from immunoglobulin models of allergic reactions to macroeconomic models of the business cycle.

3. Mathematical Models

Philosophers have generally taken physical models as paradigm cases of scientific models. In many branches of science, however, mathematical models play a far more important role. There are many examples, especially in dynamics. Equation (1) below is an ordinary differential equation representing the motion of a frictionless pendulum. [θ is the angle of the string from vertical, l is the length of the string, and g is the acceleration due to gravity. The two dots in the first term stand for the second derivative with respect to time.] Even when sets of equations have clearly been used “to model” some behavior of a system, philosophers were often unwilling to take these as legitimate models. The difference is driven in part by greater familiarity with models in mathematical logic. In the logician’s realm, a model satisfies a set of axioms; the axioms themselves are not models. To philosophers, equations look like axioms. Referring to a set of equations as “a model” then sounds like a category mistake.

$$\ddot{\theta} + \frac{g}{l}\sin\theta = 0 \qquad (1)$$

This attitude was eroded in part by the central role mathematical models played in the development of chaos theory. The 1980s saw a deluge of scientific articles with equations governing nonlinear systems as well as the state spaces that represented their evolution over time (see section 4). Physical models, on the other hand, were often bypassed altogether. This made it far more difficult to dismiss “mathematical model” as a scientist’s misnomer. It soon became apparent that all of the issues regarding idealizations, confirmation, and construction of physical models had mathematical counterparts.

Consider the physical model of the electric circuit in Figure 1. A common idealization is to stipulate that the circuit has no resistance. When we look to the associated differential equations (a mathematical model), there is a corresponding simplification, in this case the elimination of an algebraic term that represented the resistance of the wire. Unlike this example, simplification is often more than a mere convenience. The governing equations for many types of phenomena are intractable as they stand. Simplifications are needed to bridge the computational gap between the laws and the phenomena they describe. In the old (pre-1926) quantum theory, for example, it was common to run across a Hamiltonian (an important type of function in physics that expresses the total energy of the system) that blocked the usual mathematical techniques, such as separation of variables. Instead, a perturbation parameter λ was used to convert the problematic Hamiltonian into a power series such as in equation (2) below. [I, θ are classical action-angle variables. See any text on classical mechanics for more on this method.] Once in this form, one may generate an approximate solution to an arbitrary degree of precision by keeping a finite number of terms and discarding the rest. This is sometimes called a “mediating mathematical model” (Morton 1993) since it operates, in a sense, between the intractable Hamiltonian and the phenomenon it is thought to describe.

$$H(I, \theta) = H_0(I) + \lambda H_1(I, \theta) + \lambda^2 H_2(I, \theta) + \cdots \qquad (2)$$
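The step of "keeping a finite number of terms and discarding the rest" can be written out as follows. This is only a sketch: it assumes the power series has the standard form used in canonical perturbation theory, with the unperturbed part depending on the action variable alone, and the symbol $H^{(N)}$ for the truncated Hamiltonian is introduced here for illustration.

```latex
% Assumed form of the series, with H_0 depending on I only:
H(I,\theta) = H_0(I) + \lambda H_1(I,\theta) + \lambda^2 H_2(I,\theta) + \cdots

% Truncation after N terms (H^{(N)} is a label chosen for this sketch):
H^{(N)}(I,\theta) = H_0(I) + \sum_{n=1}^{N} \lambda^n H_n(I,\theta)

% What is discarded is formally of order \lambda^{N+1}:
H(I,\theta) - H^{(N)}(I,\theta) = \sum_{n > N} \lambda^n H_n(I,\theta)
  = O(\lambda^{N+1}) \quad \text{as } \lambda \to 0
```

For small λ, raising N shrinks the discarded remainder, which is the sense in which the approximation can be made as precise as one likes, provided the series is well behaved.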

4. State Spaces

State spaces received scant attention in the philosophical literature until recently. They are often used in tandem with a mathematical model as a means for representing the possible states of a system and its evolution. The “system” is often a physical model, but might also be a real-world phenomenon essentially free of idealizations. Figure 3 is the state space associated with equation (1), the mathematical model for an ideal (frictionless) pendulum. Since θ represents the angle of the string, a,b correspond to the two highest points of deflection. $\dot{\theta}$ represents velocity. [The coefficient .] Hence c,d are the points at which the pendulum is moving the fastest.

Figure 3: State Space for Ideal Pendulum
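As a rough numerical illustration of this state space, the following Python sketch traces one trajectory of the ideal pendulum. It assumes the pendulum equation has the standard form θ'' + (g/l) sin θ = 0 and sets g/l to 1, an illustrative choice and not a value taken from the article. The function names and the fourth-order Runge-Kutta step are likewise choices made for this example.

```python
import math


def pendulum_step(theta, omega, dt, g_over_l=1.0):
    """Advance (theta, omega) by one RK4 step of theta'' = -(g/l) sin(theta)."""
    def deriv(th, om):
        # d(theta)/dt = omega, d(omega)/dt = -(g/l) sin(theta)
        return om, -g_over_l * math.sin(th)

    k1 = deriv(theta, omega)
    k2 = deriv(theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
    k3 = deriv(theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
    k4 = deriv(theta + dt * k3[0], omega + dt * k3[1])
    theta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    omega += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return theta, omega


def trajectory(theta0, omega0=0.0, dt=0.001, steps=20000, g_over_l=1.0):
    """Return the list of state-space points (theta, omega) visited over time."""
    points = [(theta0, omega0)]
    theta, omega = theta0, omega0
    for _ in range(steps):
        theta, omega = pendulum_step(theta, omega, dt, g_over_l)
        points.append((theta, omega))
    return points


def energy(theta, omega, g_over_l=1.0):
    """Energy per unit (mass * length^2); constant along a frictionless orbit."""
    return 0.5 * omega ** 2 - g_over_l * math.cos(theta)


# Release the pendulum from rest at 0.5 rad (an arbitrary starting angle).
points = trajectory(theta0=0.5)
max_theta = max(abs(th) for th, _ in points)   # extreme deflection
max_speed = max(abs(om) for _, om in points)   # greatest angular velocity
```

In this sketch the largest deflection is reached where the angular velocity is zero, and the largest angular velocity where the angle is zero, which is the pattern the labeled points on the closed curve are meant to convey.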

State spaces take a variety of forms. Quantum mechanics uses a Hilbert space to represent the state governed by Schrödinger’s equation. The space itself might have an infinite number of dimensions with a vector representing an individual state. The ordinary differential equations used in dynamics require many-dimensional phase spaces. Points represent the system states in these (usually Euclidean) spaces. As the state evolves over time, it carves a trajectory through the space. Every point belongs to some possible trajectory that represents the system’s actual or possible evolution. A phase space together with a set of trajectories forms a phase portrait (Figure 4). Since the full phase portrait cannot be captured in a diagram, only a handful of possible trajectories are shown in textbook illustrations. If the system allows for dissipation (for example friction), attractors can develop in the associated phase portrait. As the name implies, an attractor is a set of points toward which neighboring trajectories flow, though the points themselves possess no actual attractive force. The center of Figure 4a, known as a point attractor, might represent a marble coming to rest at the bottom of a bowl. Simple periodic motion, like a clock pendulum, produces limit cycles, attracting sets forming closed curves in phase space (Figure 4b).

Figure 4: Sample Phase Portraits
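A point attractor of the kind described above can be illustrated with a small Python sketch. It assumes a linear damped oscillator, x'' = -kx - cx', as a stand-in for the marble in the bowl. The stiffness and damping values, the step size, and the starting points are all illustrative assumptions.

```python
import math


def damped_step(x, v, dt, damping=0.5, stiffness=1.0):
    """One semi-implicit Euler step of x'' = -stiffness*x - damping*x'."""
    v = v + dt * (-stiffness * x - damping * v)
    x = x + dt * v
    return x, v


def final_state(x0, v0, dt=0.001, steps=40000):
    """Follow one trajectory through phase space and return where it ends up."""
    x, v = x0, v0
    for _ in range(steps):
        x, v = damped_step(x, v, dt)
    return x, v


# Three arbitrary starting points in the (position, velocity) plane.
starts = [(1.0, 0.0), (-2.0, 1.0), (0.5, -3.0)]
ends = [final_state(x0, v0) for x0, v0 in starts]

# Distance of each end point from the origin, the candidate point attractor.
distances = [math.hypot(x, v) for x, v in ends]
```

Every trajectory in this sketch ends close to the origin regardless of where it starts. Nothing at the origin pulls on the state. The convergence comes entirely from the dissipative term in the equation of motion.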

Let us consider a very simple system—a leaky faucet—that illustrates the use of each type of model mentioned. Researchers at the University of California, Santa Cruz, believed that the time between drops does not change randomly over time, but instead has an underlying dynamical structure (Martien 1985). In other words, one drip interval causally influences the next. In order to explore this hypothesis, a simplified physical model for a drop of water was developed (the one shown above in Figure 2). They believed that a water drop is roughly like a one-dimensional, oscillating mass on a spring. Part of the mass detaches when the spring extends to a critical point. The amount of mass that detaches depends on the velocity of the block when it reaches this point.

The mathematical model (3) for this system is relatively simple. Here y is the vertical position of the drop, v is its velocity, m is its mass prior to detachment, and Δm is the amount of mass that detaches (k, b, and c are constants). When this model is simulated on a computer, the resulting phase portrait is very similar to the one that was reconstructed from the data in the lab. Although this qualitative agreement is too weak to completely vindicate these models of the dripping faucet, it does provide a small degree of confirmation.

(3)

Going back to the physical model, there are two clear idealizations/negative analogies. First, of course, is that water drops are not shaped like rigid blocks. Second, the mass-spring model only oscillates along one axis. Real liquids are not constrained in this way. However, these idealizations allow for a far simpler mathematical model to be used than one would need for a realistic fluid. (Without these idealizations, (3) would have to be replaced by a difficult partial differential equation.) In addition, Peter Smith has argued that this mathematical tractability came with a steep price, namely, an unrecognized artifact (1998). The problem is that the state space for this particular system contains a “strange attractor” with a fractal structure, a geometrical structure far more complex than the attractors in Figure 4. Smith argues that the infinitely intricate structure of this attractor is an artifact of the mathematics used to describe the evolution of the system. If more realistic physical and mathematical models were used, this negative analogy would likewise disappear.

5. Models and Realism

One of the perennial debates in the philosophy of science has to do with realism. What aspects of science—if any—truly represent the real world? Which devices, on the other hand, are merely heuristic? Antirealists hold that some parts of the scientific enterprise—laws, unobservable entities, and so forth—do not correspond to anything in reality. (Some, like van Fraassen (1980), would say that if by chance the abstract terms used by scientists did denote something real, we have no way of knowing it.) Scientific realists argue that the successful use of these devices shows that they are, at least in part, truly describing the real world. Let’s now consider what role models have played in this debate.

Whether models should be taken realistically depends on what one takes the truth-bearers in science to be. Some hold that foundational, scientific truths are contained either in mature theories or their fundamental laws. If so, then idealized models are simply false. The argument for this is straightforward (Achinstein 1965). Let’s say that theory T describes a system S in terms of properties p1, p2, and p3. As we have seen, simplified models either modify or ignore some of the properties found in more fundamental theories. Say that a physical model M describes S in terms of p1 and p4. If so, then T describes S in one way; M describes S in a logically incompatible way. The simplifying assumptions needed to build a useful model contradict the claims of the governing theory. Hence, if T is true, M is false.

In contrast, Nancy Cartwright has long argued that abstract laws, no matter how “fundamental” to our understanding of nature, are not literally true. In her earlier work (1983), she argued that it is not models that are highly idealized, but rather the laws themselves. Abstract laws are useful for organizing scientific knowledge, but are not literally true when applied to concrete systems. They are “true,” she argues, only insofar as they correctly describe simplified physical models (or “simulacra”). Fundamental laws are true-of-the-model, not true simpliciter. The idea is something like being true-in-a-novel. The claim “The beast that terrorized the island of Amity in 1975 was a squid” is false-in-the-novel Jaws. Similarly, Newton’s second law of motion plus universal gravitation are only true-in-Newtonian-particle-models.

For most scientific realists, whether physical models are “true” or “real” is not a simple yes-or-no question. Most would point out that even idealizations like the frictionless plane are not simply false. For two blocks of iron sliding past each other, neglecting friction is a poor approximation. For skis sliding over an icy slope, it is much better. In other words, negative analogies come in degrees. If the idealizations are negligible, we may properly say that a physical model is realistic.
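The claim that negative analogies come in degrees can be made concrete with a small calculation. The Python sketch below assumes a body sliding down an incline under Coulomb friction, so that the real acceleration is g(sin α - μ cos α) while the frictionless model predicts g sin α. The friction coefficients and the slope angle are illustrative placeholders, not measured values for iron or for skis on ice.

```python
import math


def relative_error_of_ignoring_friction(mu, slope_deg):
    """Fractional overestimate of acceleration made by the frictionless model.

    With Coulomb friction the acceleration down the slope is
    g * (sin(a) - mu * cos(a)), floored at zero if the body does not slide.
    The frictionless model predicts g * sin(a); g cancels in the ratio.
    """
    alpha = math.radians(slope_deg)
    ideal = math.sin(alpha)
    real = max(0.0, math.sin(alpha) - mu * math.cos(alpha))
    return (ideal - real) / ideal


# Illustrative placeholder coefficients (assumptions, not measurements).
MU_IRON_ON_IRON = 0.4
MU_SKI_ON_ICE = 0.05
SLOPE_DEG = 30.0

err_iron = relative_error_of_ignoring_friction(MU_IRON_ON_IRON, SLOPE_DEG)
err_ski = relative_error_of_ignoring_friction(MU_SKI_ON_ICE, SLOPE_DEG)
```

Under these assumed numbers the frictionless idealization is off by a much larger fraction for the iron block than for the ski, which is one way of quantifying how realistic the same idealization is in two different settings.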

Scientific realists have not always held similar views about mathematical models. Textbook model building in the physical sciences often follows a “top-down” approach: start with general laws and first principles and then work toward the specifics of the phenomenon of interest. Dynamics texts are filled with models that can serve as the foundation for a more detailed mathematical treatment (for example, an ideal damped pendulum or a point particle moving in a central field). Philosophers have paid much less attention to models constructed from the bottom-up, that is, models that begin with the data rather than theory. What little attention bottom-up modeling did receive in the older modeling literature was almost entirely negative. Conventional wisdom seemed to be that phenomenological laws and curve-fitting methods were devices researchers sometimes had to stoop to in order to get a project off the ground. They were not considered models, but rather “mathematical hypotheses designed to fit experimental data” (Hesse 1967, 38). According to Ernan McMullin, sometimes physicists—and other scientists presumably—simply want a function that summarizes their observations (1967, 390-391). Curve-fitting and phenomenological laws do just that. The question of realism is avoided by denying the legitimacy of bottom-up mathematical models.

In her broad attack on “theory-driven” philosophy of science, Cartwright has recently defended a nearly opposite view (1999). She argues that top-down mathematical models are not realistic, but bottom-up models are. Once again, this verdict follows from a more general thesis about the truth-bearers in science. Cartwright is an antirealist about fundamental laws and abstract theories which, she claims, serve only to systematize scientific knowledge. Since top-down mathematical models use these laws as first principles from which to begin, they cannot possibly represent real systems. Bottom-up models, on the other hand, are not derived from covering laws. They are instead tied to experimental knowledge of particular systems. Unlike fundamental theories and their associated top-down models, bottom-up models are designed to represent actual objects and their behavior. It is this grounding in empirical knowledge that allows these kinds of mathematical models to be the primary device in science for representing real-world systems.

6. Models and the Semantic View of Theories

This typology of models and their properties has been developed with an eye toward scientific practice. Within the philosophy of science itself, models have also played a central role in understanding the nature of scientific theories. For most of the 20th century, philosophers considered theories to be special sets of sentences. Theories on this so-called “syntactic view” are linguistic entities. The meaning of the theory is contained in the sentences that constitute it, roughly the same way the meaning of this article is contained in these sentences. The semantic view, in contrast, uses the model-theoretic language of mathematical logic. In broad terms, a theory just is a family of models. The theory/model distinction collapses. Using the terminology we have already defined, a model in this sense might be an idealized physical model, an existing system in nature, or even a state space. The semantic content of a theory, on this view, is found in a family of models rather than in the sentences that describe them. If a given theory were axiomatized—a rare occurrence—one could think of these models as those entities for which the axioms are true. To take a toy example, say T1 is a theory whose sole axiom is “for any two lines, at most one point lies on both.” Figure 5 is one model that constitutes T1:

Figure 5: A Model of Theory T1
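The relation between the axiom of T1 and its models can be sketched in Python. The example assumes that a candidate model is a finite set of named lines, each given as the set of points lying on it, and that "any two lines" means any two distinct lines. The two structures below are hypothetical and are not meant to reproduce the diagram in Figure 5.

```python
from itertools import combinations


def satisfies_t1(lines):
    """Return True if every pair of distinct lines shares at most one point.

    `lines` maps a line's name to the set of points lying on that line.
    """
    return all(len(a & b) <= 1 for a, b in combinations(lines.values(), 2))


# A triangle: three lines, each pair meeting in exactly one point.
triangle = {
    "l1": {"A", "B"},
    "l2": {"B", "C"},
    "l3": {"A", "C"},
}

# Two distinct lines sharing two points, which violates the axiom.
not_a_model = {
    "l1": {"A", "B", "C"},
    "l2": {"A", "B", "D"},
}

triangle_is_model = satisfies_t1(triangle)
other_is_model = satisfies_t1(not_a_model)
```

The axiom itself appears only as the test inside `satisfies_t1`. The structures that pass the test are the models, which mirrors the logician's distinction between a set of axioms and the entities that satisfy them.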

A model for ideal gases would be a physical model of dilute, perfectly elastic atoms in a closed container with an ordered set of parameters $\langle P, V, m, M, T \rangle$ that satisfies the equation $PV = (m/M)RT$. (Respectively, pressure, volume, mass of the gas, molecular weight of the molecules, and temperature. R is a constant.) In fact, two different sets of parameters $\langle P_1, V_1, m_1, M_1, T_1 \rangle$ and $\langle P_2, V_2, m_1, M_1, T_2 \rangle$ constitute two separate models in the same family.
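The idea that each admissible parameter set is a separate model in one family can be sketched in Python. The example assumes the ideal gas law in the form PV = (m/M)RT with SI units (pascals, cubic metres, kilograms, kilograms per mole, kelvins). The numerical values, including a molar mass roughly that of nitrogen, are illustrative, and the helper names are invented for this sketch.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)


def is_ideal_gas_model(P, V, m, M, T, rel_tol=1e-9):
    """Check whether the parameter tuple satisfies P*V = (m/M)*R*T."""
    return math.isclose(P * V, (m / M) * R * T, rel_tol=rel_tol)


def pressure(V, m, M, T):
    """Pressure the ideal gas law assigns to the remaining parameters."""
    return (m / M) * R * T / V


# Same gas sample (m and M held fixed), two different states.
m1, M1 = 0.028, 0.028          # kg and kg/mol, illustrative values
V1, T1 = 0.0224, 273.15        # m^3 and K
V2, T2 = 0.0300, 350.00

model_1 = (pressure(V1, m1, M1, T1), V1, m1, M1, T1)
model_2 = (pressure(V2, m1, M1, T2), V2, m1, M1, T2)

# Doubling the pressure while leaving everything else fixed breaks the law.
non_model = (2 * model_1[0], V1, m1, M1, T1)
```

Both `model_1` and `model_2` satisfy the equation and so belong to the same family, while `non_model` does not. On the semantic view it is the satisfying tuples, not the equation, that count as models.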

Some advocates of the semantic view claim that the use of the term “model” is similar in science and in logic (van Fraassen, 1980). This similarity has been one of the motivating forces behind this particular understanding of scientific theories. Given the distinctions made in previous sections of this article, this similarity seems to be questionable.

First, many things that would count as a model on the semantic view, for example the geometric diagram in Figure 5, are not physical models, mathematical models, or state spaces. In what sense, one wonders, are they scientific models? Moreover, a model on the semantic view might be an existing physical system. For example, Jupiter and its moons would constitute another model of Newton’s laws of motion plus universal gravitation. This blurs the distinction between the model and its subject. One may use a physical and/or mathematical model to study celestial bodies, but such entities are not themselves models. The scientist’s use of the term is not this broad.

Second, as we have already seen, sets of equations often constitute mathematical models. In contrast, laws and equations on the semantic approach are said to describe and classify models, but are never themselves taken to be models. Their relation is satisfaction, not identity.

Some time before the semantic view became popular, Hesse issued what still seems to be the correct verdict: “[M]ost uses of ‘model’ in science do carry over from logic the idea of interpretation of a deductive system,” however, “most writers on models in the sciences agree that there is little else in common between the scientist’s and the logician’s use of the term, either in the nature of the entities referred to or in the purpose for which they are used” (1967, 354).

7. References and Further Reading

  • Achinstein, P. “Theoretical Models.” The British Journal for the Philosophy of Science 16 (1965): 102-120.
  • Bunge, M. Method, Model and Matter. Dordrecht: Reidel, 1973.
  • Cartwright, N. How the Laws of Physics Lie. New York: Clarendon Press, 1983.
  • Cartwright, N. The Dappled World. Cambridge: Cambridge University Press, 1999.
  • Hesse, M. Models and Analogies in Science. Notre Dame: University of Notre Dame Press, 1966.
  • Hesse, M. “Models and Analogy in Science.” The Encyclopedia of Philosophy. New York: Macmillan Publishing, 1967.
  • McMullin, E. “What do Physical Models Tell Us?” Logic, Methodology, and Philosophy of Science III. Eds. B. van Rootselaar and J. F. Staal. Amsterdam: North-Holland Publishing, 1967: 385-396.
  • Morrison, M. and M. Morgan, eds. Models as Mediators. Cambridge: Cambridge University Press, 1999.
  • Morton, A. “Mathematical Models: Questions of Trustworthiness.” The British Journal for the Philosophy of Science 44 (1993): 659-674.
  • Morton, A. and M. Suárez. “Kinds of Models.” Model Validation in Hydrological Science. Eds. P. Bates and M. Anderson. New York: John Wiley Press, 2001.
  • Redhead, M. “Models in Physics.” The British Journal for the Philosophy of Science 31 (1980): 154-163.
  • Smith, P. Explaining Chaos. Cambridge: Cambridge University Press, 1998.
  • Van Fraassen, B. The Scientific Image. New York: Clarendon Press, 1980.
  • Wimsatt, W. “False Models as Means to Truer Theories.” Neutral Models in Biology. Eds. M. Nitecki and A. Hoffmann. New York: Oxford University Press, 1987.

Author Information

Jeffrey Koperski
Email: koperski@svsu.edu
Saginaw Valley State University
U. S. A.

Cynics

Cynicism originates in the philosophical schools of ancient Greece that claim a Socratic lineage. To call the Cynics a “school,” though, immediately raises a difficulty for so unconventional and anti-theoretical a group. Their primary interests are ethical, but they conceive of ethics more as a way of living than as a doctrine in need of explication. As such, askēsis, a Greek word meaning a kind of training of the self or practice, is fundamental. The Cynics, as well as the Stoics who followed them, characterize the Cynic way of life as a “shortcut to virtue” (see Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 104 and Book 7, Chapter 122). Though they often suggest that they have discovered the quickest, and perhaps surest, path to the virtuous life, they recognize the difficulty of this route.

The colorfulness of the Cynic way of life presents certain problems. The triumph of the Cynic as a philosophical and literary character complicates discussions of the historical individuals, a complication further troubled by a lack of sources. The evidence regarding the Cynics is limited to apothegms, aphorisms, and ancient hearsay; none of the many Cynic texts have survived. The tradition records the tenets of Cynicism via their lives. It is through their practices, the selves and lives that they cultivated, that we come to know the particular Cynic ēthos.

Table of Contents

  1. History of the Name
  2. Major Figures and the Cynic Lineage
  3. Cynic Ethics
    1. Living in Accord with Nature and Opposing Conventions
      1. Freedom and Parrhēsia
      2. Training and Toughness
  4. Cosmopolitanism
  5. The Cynic Legacy
  6. References and Further Reading

1. History of the Name

The origin of the Cynic name kunikos, a Greek word meaning “dog-like”, is a point of contention. Two competing stories explain the source of the name using the figure of Antisthenes (whom Diogenes Laertius identifies controversially as the original Cynic), and yet a third explanation uses the figure of Diogenes of Sinope. First, Antisthenes is said to have taught in the Cynosarges, which is a Greek word that might mean “White Dog,” “Quick Dog,” or even “Dog’s Meat”. The Cynosarges is a gymnasium and temple for Athenian nothoi. “Nothoi” is a term that designates those without Athenian citizenship because they were born to a slave, foreigner, or prostitute; one can also be a nothos if one’s parents were citizens but not legally married. According to the first explanation, the term Cynic would, then, derive from the place in which the movement’s founder worshipped, exercised, and, most importantly, lectured. Such a derivation is suspect insofar as later writers could have created the story through an analogy to the way in which the term “Stoic” came from the Stoa Poikilē in which Zeno of Citium taught. Though nothing unquestionably links Antisthenes or any other Cynic to the Cynosarges, Antisthenes was a nothos and the temple was used for worshipping Hercules, the ultimate Cynic hero.

A second possible derivation comes from Antisthenes’ alleged nickname Haplokuōn, a word that probably means a dog “pure and simple”, and is presumably referring to his way of living. Though Antisthenes was known for a certain rudeness and crudeness that could have led to such a name, and later authors, including Aelian, Epictetus, and Stobaeus, identify him as a kuōn, or dog, his contemporaries, such as Plato and Xenophon, do not label him as such. This lack lends some credence to the notion that the term kunikos was applied to Antisthenes posthumously, and only after Diogenes of Sinope, a more illustrious philosopher-dog, had arrived on the scene.

If Antisthenes was not the first Cynic by name, then the origin of the appellation falls to Diogenes of Sinope, an individual well known for dog-like behavior. As such, the term may have begun as an insult referring to Diogenes’ style of life, especially his proclivity to perform all of his activities in public. Shamelessness, which allowed Diogenes to use any space for any purpose, was primary in the invention of “Diogenes the Dog.”

The precise source of the term “Cynic” is, however, less important than the wholehearted appropriation of it. The first Cynics, beginning most clearly with Diogenes of Sinope, embraced their title: they barked at those who displeased them, spurned Athenian etiquette, and lived from nature. In other words, what may have originated as a disparaging label became the designation of a philosophical vocation.

Finally, because Cynicism denotes a way of living, it is inaccurate to equate Cynicism with the other schools of its day. The Cynics had no set space where they met and discoursed, such as the Garden, the Lyceum, or the Academy; for Diogenes and Crates, the streets of Athens provide the setting for both their teaching and their training. Moreover, the Cynics neglect, and very often ridicule, speculative philosophy. They are especially harsh critics of dogmatic thought, theories they consider useless, and metaphysical essences.

2. Major Figures and the Cynic Lineage

The major figures within Cynicism form the pivotal points within a lineage traced from Antisthenes, Socrates’ companion and a major interlocutor in the Socratic dialogues of Xenophon (see especially his Memorabilia and Symposium), through his student, Diogenes of Sinope, to Diogenes’ pupil Crates, and from Crates to both Hipparchia of Maronea, the first known woman Cynic philosopher, and Zeno of Citium, the founder of Stoicism.

Some others among the more notable Cynics include Metrocles of Maronea, brother to Hipparchia and pupil of Crates, Menippus, Demonax of Cyprus, Bion of Borysthenes, and Teles. Thinkers heavily influenced by Cynic thought include Zeno of Citium, Cleanthes of Assos, Aristo of Chios, Musonius Rufus, Epictetus, Dio Chrysostom, and the emperor Julian.

The Socratic schools tend to trace their lineage directly back to Socrates and the Cynics are no exception. As such, the historical authenticity of this heredity is suspect. Nevertheless, it accurately tracks a kind of intellectual transmission that begins with Antisthenes and is passed on to Diogenes, Crates, and Zeno. Cynics seem to have survived at least into the fourth century CE; two of Julian’s orations from 361 CE disparage the Cynics of his day for lacking the asceticism and hardiness of “real” Cynics. As a “school” of thought, Cynicism ends in the sixth century CE, but its legacy continues in both philosophy and literature.

3. Cynic Ethics

Foremost for understanding the Cynic conception of ethics is that virtue is a life lived in accord with nature. Nature offers the clearest indication of how to live the good life, which is characterized by reason, self-sufficiency, and freedom. Social conventions, however, can hinder the good life by compromising freedom and setting up a code of conduct that is opposed to nature and reason. Conventions are not inherently bad; however, for the Cynic, conventions are often absurd and worthy of ridicule. The Cynics deride the attention paid to the Olympics, the “big thieves” who run the temples and are seen carrying away the “little thieves” who steal from them, politicians as well as the philosophers who attend their courts, fashion, and prayers for such things as fame and fortune.

Only once one has freed oneself from the strictures that impede an ethical life can one be said to be truly free. As such, the Cynics advocate askēsis, or practice, over theory as the means to free oneself from convention, promote self-sufficiency, and live in accord with nature. Such askēsis leads the Cynic to live in poverty, embrace hardship and toil, and permits the Cynic to speak freely about the silly, and often vicious, way life is lived by his or her contemporaries. The Cynics consistently undermine the most hallowed principles of Athenian culture, but they do so for the sake of replacing them with those in accord with reason, nature, and virtue.

a. Living in Accord with Nature and Opposing Conventions

Though the imperative to live life in accord with nature is rightly associated with Stoicism, the Stoics are following a Cynic lead. Diogenes of Sinope fervently rejects nomos, or convention, by showing the arbitrary and frequently amusing nature of Athenian social, religious, and political mores and trampling the authority of religious and political leaders. Fundamental to this is a redefinition of what is worthy of shame. Diogenes’ body is disorderly, a source of great shame among the Athenians and the reservoir for the principle of shamelessness among the Cynics.

Diogenes uses his body to upend the conventional association of decorum with the good. He breaks etiquette by publicly carrying out activities an Athenian would typically perform in private. For example, he eats, drinks, and masturbates in the marketplace, and ridicules the shame felt when one’s body is unruly or clumsy. This does not mean, however, that there is nothing about which a person ought to feel shame. For example, in Lives of Eminent Philosophers, one finds the following anecdote: “Observing a fool tuning a harp, ‘Are you not ashamed,’ he said, ‘to give this wood concordant sounds, while you fail to harmonize your soul with your life?’ To one who protested ‘I am unfit to study philosophy,’ Diogenes said, ‘Why then live, if you do not care to live well?’” (Diogenes Laertius, Book 6, Chapter 65; R.D. Hicks’ translation is altered for this article.)

As Diogenes’ reappraisal of shame suggests, the Cynics are not relativists. Nature replaces convention as the standard for judgment. The Cynics believe that it is through nature that one can live well and not through conventional means such as etiquette or religion. One reads that Diogenes of Sinope “would rebuke men in general with regard to their prayers, declaring that they asked for things which seemed to them to be good, not for such as are truly good” (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 43). This captures the crux of the Cynic notion of living in accord with nature and contrary to convention. Praying for wealth, fame, or any of the other trappings convention leads one to believe are good is a mistaken enterprise. Life, as given by nature, is full of hints as to how to live it best; but humans go astray, ashamed by petty things and striving after objects that are unimportant. Consequently, their freedom is hindered by convention.

i. Freedom and Parrhēsia

The Cynics clearly privilege freedom, but not merely in a personal sense as a kind of negative liberty. Instead, freedom is advocated in three related forms: eleutheria, freedom or liberty, autarkeia, self-sufficiency, and parrhēsia, freedom of speech or frankness. Their conception of freedom has some shared aspects with other ancient schools; the notion of autonomy which derives from the imperative that reason rule over the passions is found in the ethics of multiple Classical and Hellenistic thinkers. A specifically Cynic sense of freedom, though, is evident in parrhēsia.

An element of parrhēsia, which can be overlooked when it is defined as free or frank speech, is the risk that accompanies speaking so freely and frankly. Legendary examples of the Cynic’s fearlessly free speech occur in Diogenes of Sinope’s interchanges with Alexander the Great. One such example is the following: “When he was sunning himself in the Craneum, Alexander came and stood over him and said, ‘Ask of me any boon you like.’ To which he replied, ‘Stand out of my light’” (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 28). At another point, Alexander pronounces his rank to Diogenes of Sinope by saying, “I am Alexander the Great King.” Diogenes responds with his own rank, “I am Diogenes the Cynic,” which is to say “Diogenes the Dog” (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 60).

The examples above demonstrate the unique confluence of humor, fearless truth telling, and political subversion which distinguishes the Cynic way of living. With a few notable exceptions, the philosophers of antiquity can be found at some time or another in the company of rulers (Plato, Aeschines, and Aristippus all attended the court of Dionysius, Xenophon is intimately associated with Cyrus, Aristotle with the Macedonian ruling family, and so on). The Cynics, however, made it a point to shun such contact. The Cynics strive for self-sufficiency and strength, neither of which can be maintained once one enters into the conventional political game. The life of an impoverished but virtuous and self-sufficient philosopher is preferable to the life of a pampered court philosopher.

Diogenes Laertius writes that, “Plato saw [Diogenes of Sinope] washing lettuces, came up to him and quietly said to him, ‘Had you paid court to Dionysius, you wouldn’t now be washing lettuces,’ and [Diogenes] with equal calmness answered, ‘If you had washed lettuces, you wouldn’t have paid court to Dionysius’” (Lives of Eminent Philosophers, Book 6, Chapter 58). The lesson of this exchange is clear: whereas Plato views paying court as freeing one from poverty, the Cynic sees poverty as freeing one from having to pay court to a ruler. This second sense of freedom, so forcefully advocated by the Cynics, comprises both autarkeia, or self-sufficiency, and parrhēsia, or the freedom to speak the truth: something one at court is never free to do. It is no surprise, then, that when asked what is “the most beautiful thing in the world,” Diogenes replied, “Parrhēsia.” (Diogenes Laertius, Lives of Eminent Philosophers, Book 6, Chapter 69.)

ii. Training and Toughness

In order to live the Cynic life, one had to be inured to the various physical hardships entailed by such freedom. This required, then, a life of constant training, or askēsis. The term askēsis, defined above as a kind of training of the self but which also means “exercise” or “practice,” is appropriated from athletic training. Instead of training the body for the sake of victory in the Olympic Games, on the battlefield, or for general good health, the Cynic trains the body for the sake of the soul.

The examples of Cynic training are multiple: Antisthenes praised toil and hardship as goods; Diogenes of Sinope walked barefoot in the snow, hugged cold statues, and rolled about in the scalding summer sand in his pithos; Crates rid himself of his considerable wealth in order to become a Cynic. The ability to live without any of the commodities usually mistaken for necessities is liberating and beneficial. It is also, however, a difficult lesson: “[Diogenes of Sinope] used to say that he followed the example of the trainers of choruses; for they too set the note a little high, to ensure that the rest should hit the right note” (Diogenes Laertius, Book 6, Chapter 35).

4. Cosmopolitanism

The Cynics are not always given credit when it comes to the notion of cosmopolitanism, for the origin of this term is at times ascribed to Stoicism. Moreover, when it is attributed to Cynicism, it is often characterized as a negative tenet that gains content only once it is transplanted into Stoic doctrine (see John L. Moles’ discussion of “Cynic Cosmopolitanism” in The Cynics). However, cosmopolitanism can be fully understood within its Cynic context if it is taken as more than an oxymoron or a pithy retort: “Asked where he came from, [Diogenes of Sinope] said, ‘I am a citizen of the world [kosmopolitēs]’” (Diogenes Laertius, Book 6, Chapter 63). In this last quote, Diogenes is responding to a question calling for him to state his origin with what seems to be a neologism. To be a politēs is to belong to a polis, to be a member of a specific society with all of the benefits and commitments such membership entails. By not responding with the expected “Sinope,” Diogenes is renouncing his duty to Sinopeans as well as his right to be aided by them. It is important to note that Diogenes does not say that he is apolis, that is, without a polis; he claims allegiance to the kosmos, or the universe.

The Cynics, then, cast the notion of citizenship in a new light. To the Greek male of the Classical and Hellenistic period, citizenship was of utmost value. The restrictions on citizenship made it a privilege and these exclusions are, to the Cynic, absurd. Under cosmopolitanism, the Cynic challenges the civic affiliation of the few by opening the privilege to all. General national affiliation was likewise esteemed, and Diogenes’ cosmopolitan response is therefore also a rejection of the limitations of such a view.

Finally, cosmopolitanism revises the traditional conception of the political duties of an individual. As such, the Cynic is freed to live according to nature and not according to the laws and conventions of the polis. The conventional polis is not just rejected but replaced. This has important ethical connections to the notion of living in accord with nature, and can likewise be seen as an important precursor to the Stoic understanding of physis, or nature, as identical to the kosmos, or universe.

5. The Cynic Legacy

The first and most direct Cynic influence is upon the founding of Stoicism. One story, preserved in Diogenes Laertius, tells of Zeno of Citium reading a copy of Xenophon’s Memorabilia in a bookshop while shipwrecked in Athens. He became so taken with the figure of Socrates that he asked the bookseller where he might find such a man. At just that moment, Crates passed by, and the bookseller pointed him out as the one to follow.

Though this, like many of Diogenes Laertius’ stories, may strike one as too propitious to be historically accurate, it preserves the way in which the primary tenets of Stoicism emerge out of Cynicism. The primacy of ethics, the sufficiency of virtue for happiness, the cultivation of indifference to external affairs, the definition of virtue as living in accord with nature, and the importance placed on askēsis, all mark the shared terrain between the Cynics and the Stoics. Indeed, when various Stoic thinkers list the handful of Stoic sages, Cynics, and especially Diogenes of Sinope, are typically among them. Epictetus in particular advocates the Cynic stance, but warns against taking up lightly something so difficult (see Discourses 3.22).

Within political philosophy, the Cynics can be seen as originators of anarchism. Since humans are both rational and able to be guided by nature, it follows that humans have little need for legal codes or political affiliations. Indeed, political associations at times require one to be vicious for the sake of the polis. Diogenes’ cosmopolitanism represents, then, a first suggestion that human affiliation ought to be to humanity rather than a single state.

The impact of Cynicism is also felt in Christian, Medieval, and Renaissance thought, though not without a good deal of ambivalence. Christian authors, for example, praise the Cynics for their self-discipline, independence, and mendicant lifestyle, but rebuke the bawdy aspects of Cynic shamelessness.

Finally, the mark of the Cynic is found throughout the texts of literature and philosophy. Menippean satire has a clear debt to the Cynics, and Diogenes of Sinope in particular appears as a character in literary and philosophical contexts. Dante, for example, situates Diogenes with other virtuous but pagan philosophers in the first level of hell, and Nietzsche is especially fond of both Diogenes and the Cynic attitude. One striking example occurs in section 125 of The Gay Science. Here Nietzsche alludes to the anecdote wherein Diogenes searches for a human being with a lit lamp in daylight (D.L. 6.41). In his own rendition, Nietzsche tells the story of the madman who entered the marketplace with a lit lamp on a bright morning seeking God. It is this same madman who pronounces that God is dead.

6. References and Further Reading

  • Billerbeck, Margarethe. Die Kyniker in der modernen Forschung. Amsterdam: B.R. Grüner, 1991.
  • Branham, Bracht and Marie-Odile Goulet-Cazé, eds. The Cynics: The Cynic Movement in Antiquity and Its Legacy. Berkeley: University of California Press, 1996.
  • Dudley, D. R. A History of Cynicism from Diogenes to the 6th Century A.D. Cambridge: Cambridge University Press, 1937.
  • Epictetus. The Discourses as Reported by Arrian. Trans. W.A. Oldfather. Cambridge: Harvard University Press, 1928.
  • Goulet-Cazé, Marie-Odile. L’Ascèse cynique: Un commentaire de Diogène Laërce VI 70-71, Deuxième édition. Paris: Librairie Philosophique J. Vrin, 2001.
  • Goulet-Cazé, Marie-Odile and Richard Goulet, eds. Le Cynisme ancien et ses prolongements. Paris: Presses Universitaires de France, 1993.
  • Hock, R.F. “Simon the Shoemaker as an Ideal Cynic,” in Greek, Roman and Byzantine Studies, 17 (1976).
  • Diogenes Laertius. Lives of Eminent Philosophers Vol. I-II. Trans. R.D. Hicks. Cambridge: Harvard University Press, 1979.
  • Long, A.A. and David N. Sedley, eds. The Hellenistic Philosophers, Volume 1 and Volume 2. Cambridge: Cambridge University Press, 1987.
  • Malherbe, Abraham J., ed. and trans. The Cynic Epistles. Missoula, Montana: Scholars Press, 1977.
  • Navia, Luis E. Diogenes of Sinope: The Man in the Tub. Westport, Connecticut: Greenwood Press, 1990.
  • Navia, Luis E. Classical Cynicism: A Critical Study. Westport, Connecticut: Greenwood Press, 1996.
  • Navia, Luis E. Antisthenes of Athens. Westport, Connecticut: Greenwood Press, 2001.
  • Paquet, Léonce. Les Cyniques grecs: fragments et témoignages. Ottawa: Presses de l’Université d’Ottawa, 1988.
  • Sloterdijk, Peter. Critique of Cynical Reason. Trans. Michael Eldred. Minneapolis: University of Minnesota Press, 1987.

Author Information

Julie Piering
Email: japiering@ualr.edu
University of Arkansas at Little Rock
U. S. A.

Free Will

Most of us are certain that we have free will, though what exactly this amounts to is much less certain. According to David Hume, the question of the nature of free will is “the most contentious question of metaphysics.” If this is correct, then figuring out what free will is will be no small task indeed. Minimally, to say that an agent has free will is to say that the agent has the capacity to choose his or her course of action. But animals seem to satisfy this criterion, and we typically think that only persons, and not animals, have free will. Let us then understand free will as the capacity unique to persons that allows them to control their actions. It is controversial whether this minimal understanding of what it means to have a free will actually requires an agent to have a specific faculty of will, whether the term “free will” is simply shorthand for other features of persons, and whether there really is such a thing as free will at all.

This article considers why we should care about free will and how freedom of will relates to freedom of action. It canvasses a number of the dominant accounts of what the will is, and then explores the persistent question of the relationship between free will and causal determinism, articulating a number of different positions one might take on the issue. For example, does determinism imply that there is no free will, as the incompatibilists argue, or does it allow for free will, as the compatibilists argue? This article explores several influential arguments that have been given in favor of these two dominant positions on the relationship between free will and causal determinism. Finally, there is a brief examination of how free will relates to theological determinism and logical determinism.

Table of Contents

  1. Free Will, Free Action and Moral Responsibility
  2. Accounts of the Will
    1. Faculties Model of the Will
    2. Hierarchical Model of the Will
    3. Reasons-Responsive View of the Will
  3. Free Will and Determinism
    1. The Thesis of Causal Determinism
    2. Determinism, Science and “Near Determinism”
    3. Compatibilism, Incompatibilism, and Pessimism
  4. Arguments for Incompatibilism (or Arguments against Compatibilism)
    1. The Consequence Argument
    2. The Origination Argument
    3. The Relation between the Arguments
  5. Arguments for Compatibilism (or Arguments against Incompatibilism)
    1. Rejecting the Incompatibilist Arguments
    2. Frankfurt’s Argument against “the Ability to Do Otherwise”
    3. Strawson’s Reactive Attitudes
  6. Related Issues
    1. Theological Determinism
    2. Logical Determinism
  7. References and Further Reading

1. Free Will, Free Action and Moral Responsibility

Why should we even care whether or not agents have free will? Probably the best reason for caring is that free will is closely related to two other important philosophical issues: freedom of action and moral responsibility. However, despite the close connection between these concepts, it is important not to conflate them.

We most often think that an agent’s free actions are those actions that she does as a result of exercising her free will. Consider a woman, Allison, who is contemplating a paradigmatic free action, such as whether or not to walk her dog. Allison might say to herself, “I know I should walk the dog—he needs the exercise. And while I don’t really want to walk him since it is cold outside, I think overall the best decision to make is that I should take him for a walk.” Thus, we see that one reason we care about free will is that it seems necessary for free action—Allison must first decide, or choose, to walk the dog before she actually takes him outside for his walk. If we assume that human actions are those actions that result from the rational capacities of humans, we then see that the possibility of free action depends on the possibility of free will: to say that an agent acted freely is minimally to say that the agent was successful in carrying out a free volition or choice.

Various philosophers have offered just such an account of freedom. Thomas Hobbes suggested that freedom consists in there being no external impediments to an agent doing what he wants to do: “A free agent is he that can do as he will, and forbear as he will, and that liberty is the absence of external impediments.” In An Enquiry Concerning Human Understanding, David Hume held that free will (or “liberty,” to use his term) is simply the “power of acting or of not acting, according to the determination of the will: that is, if we choose to remain at rest, we may; if we choose to move, we also may.… This hypothetical liberty is universally allowed to belong to everyone who is not a prisoner and in chains.” This suggests that freedom is simply the ability to select a course of action, and an agent is free if he is not being prevented by some external obstacle from completing that course of action. Thus, Hobbes and Hume would hold that Allison is free to walk her dog so long as nothing prevents her from carrying out her decision to walk her dog, and she is free not to walk her dog so long as nothing would compel her to walk her dog if she decided not to.

However, one might still believe this approach fails to make an important distinction between these two related, but conceptually distinct, kinds of freedom: freedom of will versus freedom of action. This distinction is motivated by the apparent fact that agents can possess free will without also having freedom of action. Suppose that before Allison made the choice to walk the dog, she was taking a nap. And while Allison slept, there was a blizzard that moved through the area. The wind has drifted the snow up against the front of her house so that it is impossible for Allison to get out her front door and walk her dog even if she wanted to. So here we have a case involving free will, because Allison has chosen to take the dog for a walk, but not involving free action, because Allison is not able to take her dog for a walk.

Whether or not one can have freedom of action without free will depends on one’s view of what free will is. Note also that, on the account of freedom offered by Hobbes and Hume, the truth of causal determinism would not entail that agents lack the freedom to do what they want to do. An agent could do what she wants to do, even if she is causally determined to do that action. Thus, both Hobbes and Hume are rightly characterized as compatibilists.

Even if there is a distinction between freedom of will and freedom of action, it appears that free will is necessary for the performance of free actions. If Allison is brainwashed during her nap to want to walk her dog, then even if no external impediment prevents her from carrying through with this decision, we would say that her taking the dog for a walk is not a free action. Presumably, the reason it would not be a free action is that, in the case of brainwashing, Allison’s decision does not arise from her free will. Thus, it looks like free will might be a necessary condition for free action, even if the two are distinct. In what follows, the phrase “acting with free will” means engaging in an action as the result of the utilization of free will. Use of the phrase does not deny the distinction between free will and free action.

The second reason to care about free will is that it seems to be required for moral responsibility. While there are various accounts of what exactly moral responsibility is, it is widely agreed that moral responsibility is distinct from causal responsibility. Consider a falling branch that lands on a car, breaking its window. While the branch is causally responsible for the broken window, it is not morally responsible for it because branches are not moral agents. Depending on one’s account of causation, it also might be possible to be morally responsible for an event or state of affairs even if one is not causally responsible for that same event or state of affairs. For present purposes, let us simply say that an agent is morally responsible for an event or state of affairs only if she is the appropriate recipient of moral praise or moral blame for that event or state of affairs (an agent can thus be morally responsible even if no one, including herself, actually does blame or praise her for her actions). According to the dominant view of the relationship between free will and moral responsibility, if an agent does not have free will, then that agent is not morally responsible for her actions. For example, if Allison is coerced into doing a morally bad act, such as stealing a car, we shouldn’t hold her morally responsible for this action since it is not an action that she did of her own free will.

Some philosophers do not believe that free will is required for moral responsibility. According to John Martin Fischer, human agents do not have free will, but they are still morally responsible for their choices and actions. In a nutshell, Fischer thinks that the kind of control needed for moral responsibility is weaker than the kind of control needed for free will. Furthermore, he thinks that the truth of causal determinism would preclude the kind of control needed for free will, but that it wouldn’t preclude the kind of control needed for moral responsibility. See Fischer (1994). As this example shows, virtually every issue pertaining to free will is contested by various philosophers.

However, many think that the significance of free will is not limited to its necessity for free action and moral responsibility. Various philosophers suggest that free will is also a requirement for agency, rationality, the autonomy and dignity of persons, creativity, cooperation, and the value of friendship and love [see Anglin (1990), Kane (1998) and Ekstrom (1999)]. We thus see that free will is central to many philosophical issues.

2. Accounts of the Will

Nearly every major figure in the history of philosophy has had something or other to say about free will. The present section considers three of the most prominent theories of what the will is.

a. Faculties Model of the Will

The faculties model of the will has its origin in the writings of ancient philosophers such as Plato and Aristotle, and it was the dominant view of the will for much of medieval and modern philosophy [see Descartes (1998) and the discussion of Aquinas in Stump (2003)]. It still has numerous proponents in the contemporary literature. What is distinct about free agents, according to this model, is their possession of certain powers or capacities. All living things possess some capacities, such as the capacities for growth and reproduction. What is unique about free agents, however, is that they also possess the capacities for intellection and volition. Another way of saying this is that free agents alone have the faculties of intellect and will. It is in virtue of having these additional faculties, and the interaction between them, that agents have free will.

The intellect, or the rational faculty, is the power of cognition. As a result of its cognitions, the intellect presents various things to the will as good under some description. To return to the case of Allison contemplating walking her dog, Allison’s intellect might evaluate walking the dog as good for the health of the dog. Furthermore, all agents that have an intellect also have a will. The will, or the volitional faculty, is an appetite for the good; that is, it is naturally drawn to goodness. The will, therefore, cannot pursue an option that the intellect presents as good in no way. The will is also able to command the other faculties; the will can command the body to move or the intellect to consider something. In the case of Allison, the will could command the body to pick up the leash, attach it to the dog, and go outside for a walk. As Aquinas, a proponent of this view of the will, puts it: “Only an agent endowed with an intellect can act with a judgment which is free, in so far as it apprehends the common note of goodness; from which it can judge this or the other thing to be good. Consequently, wherever there is intellect, there is free will” (Summa Theologiae, q. 59 a. 3). Thus, through the interaction between the intellect and will, an agent has free will to pursue something that it perceives as good.

b. Hierarchical Model of the Will

A widely influential contemporary account of the will is Harry Frankfurt’s hierarchical view of the will [see Frankfurt (1971)]. This account is also sometimes called a “structuralist” or “mesh” account of the will, since a will is free if it has a certain internal structure or “mesh” among the various levels of desires and volitions. According to the hierarchical model, agents can have different kinds of desires. Some desires are desires to do a particular action; for example, Allison may desire to go jogging. Call these desires “1st order desires.” But even if Allison doesn’t desire to go jogging, she may nevertheless desire to be the kind of person who desires to go jogging. In other words, she may desire to have a certain 1st order desire. Call desires of this sort “2nd order desires.” If agents also have further desires to have particular 2nd order desires, one could construct a seemingly infinite hierarchy of desires.

Not all of an agent’s desires result in action. In fact, if an agent has conflicting desires, then it is impossible for her to satisfy all of them. Suppose that Allison not only desires to run, but that she also desires to stay curled up in bed, where it is nice and warm. In such a case, Allison cannot fulfill both of her 1st order desires. If Allison decides to act on her desire to run, we say that her desire to run has moved her to action. An effective desire of this sort is called a volition; a volition is a desire that moves the agent all the way to action. Similarly, one can differentiate between a mere 2nd order desire (simply a desire to have a certain desire) and a 2nd order volition (a desire for a desire to become one’s will, or a desire for a desire to become a volition). According to the hierarchical view of the will, free will consists in having 2nd order volitions. In other words, an agent has a free will if she is able to have the sort of will that she wants to have. An agent acts on her own free will if her action is the result of a 1st order desire that she wants to become a 1st order volition.

Hierarchical views of the will are problematic, however, because it looks as if certain sorts of questionable manipulation can be compatible with this view’s account of free will. According to the view under consideration, Allison has free will with regard to going jogging if she has a 2nd order desire that her 1st order desire to go jogging will move her to go jogging. Nothing in this account, however, depends on how she got these desires. Even if she were manipulated, via brainwashing, for example, into having her 2nd order desire for her 1st order desire to go running become her will, Allison would have the right “mesh” between her various orders of desires to qualify as having free will. This is an untoward consequence. While more robust hierarchical accounts of the will have the resources for explaining why Allison might not be free in this case, it is widely agreed that cases of manipulation and coercion are problematic for solely structural accounts of the will [see Ekstrom (1999), Fischer (1994), Kane (2005), Pereboom (2001) and van Inwagen (1983)].

c. Reasons-Responsive View of the Will

A third treatment of free will takes as its starting point the claim that agency involves a sensitivity to certain reasons. An agent acts with free will if she is responsive to the appropriate rational considerations, and she does not act with a free will if she lacks such responsiveness. To see what such a view amounts to, consider again the case of Allison and her decision to walk her dog. A reasons-responsive view of the will says that Allison’s volition to walk her dog is free if, had she had certain reasons for not walking her dog, she would not have decided to walk her dog. Imagine what would have happened had Allison turned on the television after waking from her nap and learned of the blizzard before deciding to walk her dog. Had she known of the blizzard, she would have had a good reason for deciding not to walk her dog. Even if such reasons never occur to her (that is, if she doesn’t learn of the blizzard before her decision), her disposition to have such reasons influence her volitions shows that she is responsive to reasons. Thus, reasons-responsive views of the will are essentially dispositional in nature.

Coercion and manipulation undermine free will, on this view, in virtue of making agents not reasons-responsive. If Allison has been brainwashed to walk the dog at a certain time, then even if she were to turn on the news and see that it is snowing, she would attempt to walk the dog despite having good reasons not to. Thus, manipulated agents are not reasons-responsive, and in virtue of this they lack free will. [See Fischer and Ravizza (1998) for one of the primary reasons-responsive views of free will.]

3. Free Will and Determinism

a. The Thesis of Causal Determinism

Most contemporary scholarship on free will focuses on whether or not it is compatible with causal determinism. Causal determinism is sometimes also called “nomological determinism.” It is important to keep causal determinism distinct from other sorts of determinism, such as logical determinism or theological determinism (to be discussed below). Causal determinism (hereafter, simply “determinism”) is the thesis that the course of the future is entirely determined by the conjunction of the past and the laws of nature. Imagine a proposition that completely describes the way that the entire universe was at some point in the past, say 100 million years ago. Let us call this proposition “P.” Also imagine a proposition that expresses the conjunction of all the laws of nature; call this proposition “L.” Determinism then is the thesis that the conjunction of P and L entails a unique future. Given P and L, there is only one possible future, one possible way for things to end up. To make the same point using possible world semantics, determinism is the thesis that all the states of affairs that obtain at some time in the past, when conjoined with the laws of nature, entail which possible world is the actual world. Since a possible world includes those states of affairs that will obtain, the truth of determinism amounts to the thesis that the past and the laws of nature entail what states of affairs will obtain in the future, and that only those states of affairs entailed by the past and the laws will in fact obtain.

A system’s being determined is different from its being predictable. It is possible for determinism to be true and for no one to be able to predict the future. The fact that no human agent knows or is able to know future truths has no bearing on whether there are future truths entailed by the conjunction of the past and the laws. However, there is a weaker connection between the thesis of determinism and the predictability of the future. If determinism were true, then a being with a complete knowledge of P and L and with sufficient intellective capacities should be able to infallibly predict the way that the future will turn out. However, given that we humans lack both the relevant knowledge and the intellective capacities required, the fact that we are not able to predict the future is not evidence for the falsity of determinism.

b. Determinism, Science and “Near Determinism”

Most philosophers agree that whether or not determinism is true is a contingent matter; that is, determinism is neither necessarily true nor necessarily false. If this is so, then whether or not determinism is true becomes an empirical matter, to be discovered by investigating the way the world is, not through philosophical argumentation. This is not to deny that the truth of determinism would have metaphysical implications. For one, the truth of determinism would entail that the laws of nature are not merely probabilistic—for if they were, then the conjunction of the past and the laws would not entail a unique future. Furthermore, as we shall see shortly, philosophers care very much about what implications the truth of determinism would have for free will. But the point to note is that if the truth of determinism is a contingent truth about the way the world actually is, then scientific investigation should give us insight into this matter. Let us say that a possible world is deterministic if causal determinism is true in that world. There are two ways that worlds could fail to be deterministic. As already noted, if the laws of nature in a given world were probabilistic, then such a world would not be deterministic. Secondly, if there are entities within a world that are not fully governed by the laws of nature, then even if those laws are themselves deterministic, that world would not be deterministic.

Some scientists suggest that certain parts of physics give us reason to doubt the truth of determinism. For example, the standard interpretation of Quantum Theory, the Copenhagen Interpretation, holds that the laws governing nature are indeterministic and probabilistic. According to this interpretation, whether or not a small particle such as a quark swerves in a particular direction at a particular time is described properly only by probabilistic equations. Although the equations may predict the likelihood that a quark swerves to the left at a certain time, whether or not it actually swerves is indeterministic or random.

There are also deterministic interpretations of Quantum Theory, such as the Many-Worlds Interpretation. Fortunately, the debate over whether Quantum Theory is most properly interpreted deterministically or indeterministically can be largely avoided for our current purposes. Even if (systems of) micro-particles such as quarks are indeterministic, it might be that (systems involving) larger physical objects such as cars, dogs, and people are deterministic. It is possible that the only indeterminism is on the scale of micro-particles and that macro-objects themselves obey deterministic laws. If this is the case, then causal determinism as defined above is, strictly speaking, false, but it is “nearly” true. That is, we could replace determinism with “near determinism,” the thesis that despite quantum indeterminacy, the behaviors of all large physical objects, including all our actions, obey deterministic laws [see Honderich (2002), particularly chapter 6].

What would be the implications of the truth of either determinism or near determinism? More specifically, what would be the implications for questions of free will? One way to think about the implications is to ask the following question: Could we still be free even if scientists were to discover that causal determinism (or near determinism) is true?

c. Compatibilism, Incompatibilism, and Pessimism

The question at the end of the preceding section (Could we have free will even if determinism is true?) is a helpful way to differentiate the main positions regarding free will. Compatibilists answer this question in the affirmative. They believe that agents could have free will even if causal determinism is true (or even if near determinism is true; in what follows, I will omit this qualification). In other words, the existence of free will in a possible world is compatible with that world being deterministic. For this reason, this position is known as “compatibilism,” and its proponents are called “compatibilists.” According to the compatibilist, it is possible for an agent to be determined in all her choices and actions and still make some of her choices freely.

According to “incompatibilists,” the existence of free will is incompatible with the truth of determinism. If a given possible world is deterministic, then no agent in that world has free will for that very reason. Furthermore, if one assumes that having free will is a necessary condition for being morally responsible for one’s actions, then the incompatibility of free will and determinism would entail the incompatibility of moral responsibility and causal determinism.

There are at least two kinds of incompatibilists. Some incompatibilists think that determinism is true of the actual world, and thus no agent in the actual world possesses free will. Such incompatibilists are often called “hard determinists” [see Pereboom (2001) for a defense of hard determinism]. Other incompatibilists think that the actual world is not deterministic and that at least some of the agents in the actual world have free will. These incompatibilists are referred to as “libertarians” [see Kane (2005), particularly chapters 3 and 4]. However, these two positions are not exhaustive. It is possible that one is an incompatibilist, thinks that the actual world is not deterministic, and yet still thinks that agents in the actual world do not have free will. While it is less clear what to call such a position (perhaps “free will deniers”), it illustrates that hard determinism and libertarianism do not exhaust the ways to be an incompatibilist. Since all incompatibilists, whatever their stripe, agree that the falsity of determinism is a necessary condition for free will, and since compatibilists deny this assertion, the following sections speak simply of incompatibilists and compatibilists.

It is also important to keep in mind that both compatibilism and incompatibilism are claims about possibility. According to the compatibilist, it is possible that an agent is both fully determined and yet free. The incompatibilist, on the other hand, maintains that such a state of affairs is impossible. But neither position by itself is making a claim about whether or not agents actually do possess free will. Assume for the moment that incompatibilism is true. If the truth of determinism is a contingent matter, then whether or not agents are morally responsible will depend on whether or not the actual world is deterministic. Furthermore, even if the actual world is indeterministic, it doesn’t immediately follow that the indeterminism present is of the sort required for free will (we will return to a similar point below when considering an objection to incompatibilism). Likewise, assume both that compatibilism is true and that causal determinism is true in the actual world. It does not follow from this that agents in the actual world actually possess free will.

Finally, there are free will pessimists [see Broad (1952) and G. Strawson (1994)]. Pessimists agree with the incompatibilists that free will is not possible if determinism is true. However, unlike the incompatibilists, pessimists do not think that indeterminism helps. In fact, they claim, rather than helping support free will, indeterminism undermines it. Consider Allison contemplating taking her dog for a walk. According to the pessimist, if Allison is determined, she cannot be free. But if determinism is false, then there will be indeterminacy at some point prior to her action. Exactly where one locates this indeterminacy will depend on one’s particular view of the nature of free will. Let us assume that the indeterminacy is located in which reasons occur to Allison. It is hard to see, the pessimist argues, how this indeterminacy could enhance Allison’s free will, for if the occurrence of her reasons is indeterministic, then having those reasons is not within Allison’s control. But if Allison decides on the basis of whatever reasons she does have, then her volition is based upon something outside of her control. It is based instead on chance. Thus, pessimists think that the addition of indeterminism actually makes agents lack the kind of control needed for free will. While pessimism might seem to be the same position as that advocated by free will deniers, pessimism is a stronger claim. Free will deniers think that while free will is possible, it just isn’t actual: agents in fact don’t have free will. Pessimists, however, have a stronger position, thinking that free will is impossible. Not only do agents lack free will, there is no way that they could have it [see G. Strawson (1994)]. The only way to preserve moral responsibility, for the pessimist, is thus to deny that free will is a necessary condition for moral responsibility.

As pessimism shows us, even a resolution to the debate between compatibilists and incompatibilists will not by itself solve the debate about whether or not we actually have free will. Nevertheless, it is to this debate that we now turn.

4. Arguments for Incompatibilism (or Arguments against Compatibilism)

Incompatibilists say that free will is incompatible with the truth of determinism. Not all arguments for incompatibilism can be considered here; let us focus on two major varieties. The first variety is built around the idea that having free will is a matter of having a choice about certain of our actions, and that having a choice is a matter of having genuine options or alternatives about what one does. The second variety of arguments is built around the idea that the truth of determinism would mean that we don’t cause our actions in the right kind of way. The truth of determinism would mean that we don’t originate our actions in a significant way and our actions are not ultimately controlled by us. In other words, we lack the ability for self-determination. Let us consider a representative argument from each set.

a. The Consequence Argument

The most well-known and influential argument for incompatibilism from the first set of arguments is called the “Consequence Argument,” and it has been championed by Carl Ginet and Peter van Inwagen [see Ginet (1966) and van Inwagen (1983)]. The Consequence Argument is based on a fundamental distinction between the past and the future. First, consider an informal presentation of this argument. There seems to be a profound asymmetry between the past and the future based on the direction of the flow of time and the normal direction of causation. The future is open in a way that the past is not. It looks as though there is nothing that Allison can now do about the fact that Booth killed Lincoln, given that Lincoln was assassinated by Booth in 1865.

This point stands even if we admit the possibility of time travel. For if time travel is possible, Allison can influence what the past became, but she cannot literally change the past. Consider the following argument:

  1. The proposition “Lincoln was assassinated in 1865” is true.
  2. If Allison travels to the past, she could prevent Lincoln from being assassinated in 1865 (temporarily assumed for reductio purposes).
  3. If Allison were to travel to the past and prevent Lincoln from being assassinated in 1865, the proposition “Lincoln was assassinated in 1865” would be false.
  4. A proposition cannot both be true and false.
  5. Therefore, 2 is false.

So, at most the possibility of time travel allows for agents to have causal impact on the past, not for agents to change what has already become the past. The past thus appears to be fixed and unalterable. However, it seems that the same is not true of the future, for Allison can have an influence on the future through her volitions and subsequent actions. For example, if she were to invent a time machine, then she could, at some point in the future, get in her time machine and travel to the past and try to prevent Lincoln from being assassinated. However, given that he was assassinated, we can infer that her attempts would all fail. On the other hand, she could refrain from using her time machine in this way.

The asymmetry between past and future is illustrated by the fact that we don’t deliberate about the past in the same way that we deliberate about the future. While Allison might deliberate about whether a past action was really the best action that she could have done, she deliberates about the future in a different way. Allison can question whether her past actions were in fact the best, but she can both question what future acts would be best as well as which future acts she should perform. Thus, it looks like the future is open to Allison, or up to her, in a way that the past is not. In other words, when an agent like Allison is using her free will, what she is doing is selecting from a range of different options for the future, each of which is possible given the past and the laws of nature. For this reason, this view of free will is often called the “Garden of Forking Paths Model.”

The Consequence Argument builds upon this view of the fixed nature of the past to argue that if determinism is true, the future is not open in the way that the above reflections suggest. For if determinism is true, the future is as fixed as is the past. Remember from the above definition that determinism is the thesis that the past (P) and the laws of nature (L) entail a unique future. Let “F” refer to any true proposition about the future. The Consequence Argument depends on two modal operators and two inference rules. Let the modal operator “☐” abbreviate “It is logically necessary that ...,” so that, when it operates on some proposition p, “☐p” abbreviates “It is logically necessary that p.” Let the modal operator “N” be such that “Np” stands for “p is true and no one has, or ever had, any choice about whether p was true.” Call the following two inference rules “Alpha” and “Beta”:

Alpha: ☐p implies Np

Beta: {Np and N(p → q)} implies Nq

According to Alpha, if p is a necessary truth, then no one has, or ever had, any choice about whether p was true. Similarly, according to Beta, if no one has, or ever had, any choice about p being true, and no one has, or ever had, any choice that p entails q, then no one has, or ever had, any choice about whether q is true. To see the plausibility of Beta, consider the following application. Let p be the proposition “The earth was struck by a meteor weighing 100 metric tons one billion years ago,” and let q be the proposition “Thousands of species went extinct,” so that p → q is the proposition “If the earth was struck by a meteor weighing 100 metric tons one billion years ago, then thousands of species went extinct.” Since I have no choice about such a meteor hitting in the past, and have no choice that if such a meteor hits, it will cause thousands of species to go extinct, I have no choice that thousands of species went extinct. Beta thus looks extremely plausible. But if Beta is true, then we can construct an argument to show that if determinism is true, then I have no choice about anything, including my supposed free actions in the future. The argument begins with the definition of determinism given above:

(1) ☐{(P and L) → F}

Using a valid logical rule of inference (exportation), we can transform 1 into 2:

(2) ☐{P → (L → F)}

Applying Alpha, we can derive 3:

(3) N{P → (L → F)}

The second premise in the Consequence Argument is called the “fixity of the past.” No one has, or ever had, a choice about the true description P of the universe at some point in the distant past:

(4) NP

From 3, 4 and Beta, we can deduce 5:

(5) N(L → F)

The final premise in the argument is the fixity of the laws of nature. No one has, or ever had, a choice about what the laws of nature are (try as I might, I cannot make the law of universal gravitation not be a law of nature):

(6) NL

And from 5 and 6, again using Beta, we can infer that no one has, or ever had, a choice about F:

(7) NF

Given that F was any true proposition about the future, the Consequence Argument concludes that if determinism is true, then no one has or ever had a choice about any aspect of the future, including what we normally take to be our free actions. Thus, if determinism is true, we do not have free will.
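The derivation in steps (2) through (7) can be checked mechanically. What follows is only a sketch in Lean 4, under stated assumptions: the operators ☐ and N are left as uninterpreted functions on propositions (named Nec and N here), Alpha and Beta are taken as hypotheses rather than proved, and the proof starts from line (2), so the exportation step from (1) is not itself verified. The theorem name and hypothesis labels are illustrative and do not come from the literature.

```lean
-- Illustrative sketch only: Nec stands in for ☐, N for the "no choice" operator.
-- Both are abstract; nothing here says what they mean, only how the rules use them.
theorem consequence_argument
    (Nec N : Prop → Prop)
    -- Rule Alpha: ☐p implies Np (assumed, not proved)
    (alpha : ∀ p : Prop, Nec p → N p)
    -- Rule Beta: Np and N(p → q) imply Nq (assumed, not proved)
    (beta : ∀ p q : Prop, N p → N (p → q) → N q)
    -- P: the past, L: the laws of nature, F: any true proposition about the future
    (P L F : Prop)
    (h2 : Nec (P → (L → F)))   -- line (2): determinism after exportation
    (h4 : N P)                 -- line (4): fixity of the past
    (h6 : N L)                 -- line (6): fixity of the laws
    : N F :=
  -- line (3): apply Alpha to line (2)
  have h3 : N (P → (L → F)) := alpha (P → (L → F)) h2
  -- line (5): apply Beta to lines (4) and (3)
  have h5 : N (L → F) := beta P (L → F) h4 h3
  -- line (7): apply Beta to lines (6) and (5)
  beta L F h6 h5
```

A check of this kind shows only that the conclusion follows from the premises and rules as stated. Whether Beta and the two fixity premises are true is a separate, philosophical question.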

b. The Origination Argument

The second general set of arguments for the incompatibility of free will and determinism builds on the importance of the source of a volition for free will. Again, it will be helpful to begin with an informal presentation of the argument before considering a formal presentation of it. According to this line of thought, an agent has free will when her volitions issue from the agent herself in a particular sort of way (say, her beliefs and desires). What is important for free will, proponents of this argument claim, is not simply that the causal chain for an agent’s volition goes through the agent, but that it originates with the agent. In other words, an agent acts with free will only if she originates her action, or if she is the ultimate source or first cause of her action [see Kane (1998)].

Consider again the claim that free will is a necessary condition for moral responsibility. What reflection on cases of coercion and manipulation suggests to us is that even if a coerced or manipulated agent is acting on her beliefs and desires, this isn’t enough for moral responsibility. We normally assume that coercion and certain forms of manipulation undercut an agent’s moral responsibility precisely because a coerced or manipulated agent isn’t the originator of her coerced action. If Allison is coerced into walking her dog via brainwashing, then her walking of the dog originates in the brainwashing, and not in Allison herself. Consider, then, the similarities between cases of coercion and manipulation, on the one hand, and the implications of the truth of determinism on the other. If determinism were true, it might be true that Allison chooses to walk her dog because of her beliefs and desires, but those beliefs and desires would themselves be the inevitable products of causal chains that began millions of years ago. Thus, a determined agent is at most a source, but not the ultimate source, of her volitions. According to proponents of this sort of argument for incompatibilism, the truth of determinism would mean that agents don’t cause their actions in the kind of way needed for free will and, ultimately, moral responsibility.

We can represent a formal version of the argument, called the “Origination Argument,” as follows:

  1. An agent acts with free will only if she is the originator (or ultimate source) of her actions.
  2. If determinism is true, then everything any agent does is ultimately caused by events and circumstances outside her control.
  3. If everything an agent does is ultimately caused by events and circumstances beyond her control, then the agent is not the originator (or ultimate source) of her actions.
  4. Therefore, if determinism is true, then no agent is the originator (or ultimate source) of her actions.
  5. Therefore, if determinism is true, no agent has free will.

The Origination Argument is valid. So, in evaluating its soundness, we must evaluate the truth of its three premises. Premise 3 is clearly true, since for an agent to be an originator just is for that agent not to be ultimately determined by anything outside of herself. Premise 2 of this argument is true by the definition of determinism. To reject the conclusion of the argument, one must therefore reject premise 1.
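The validity claim can likewise be checked formally. The following Lean 4 sketch is one possible regimentation, not the only one: it treats “has free will,” “is the originator of her actions,” and “has all her actions ultimately caused by factors beyond her control” as unanalyzed predicates on agents, and treats determinism as a single proposition. All names are illustrative assumptions.

```lean
-- Illustrative sketch only: the predicates are left uninterpreted.
theorem origination_argument
    (Agent : Type)
    (FreeWill Originator ExternallyCaused : Agent → Prop)
    (Determinism : Prop)
    -- Premise 1: free will requires being the originator of one's actions
    (p1 : ∀ a : Agent, FreeWill a → Originator a)
    -- Premise 2: under determinism, every agent's actions are externally caused
    (p2 : Determinism → ∀ a : Agent, ExternallyCaused a)
    -- Premise 3: an externally caused agent is not the originator
    (p3 : ∀ a : Agent, ExternallyCaused a → ¬ Originator a)
    -- Line 5: if determinism is true, no agent has free will
    : Determinism → ∀ a : Agent, ¬ FreeWill a :=
  -- Line 4 is the intermediate step p3 a (p2 hD a) : ¬ Originator a
  fun hD a hFW => p3 a (p2 hD a) (p1 a hFW)
```

As with any validity check, this establishes nothing about soundness; the dispute over premise 1 discussed here is untouched by it.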

Earlier we briefly noted one account of free will which implicitly denies premise 1, namely the hierarchical model of free will. According to this model, an agent acts with free will so long as the causal chain for that action goes through the agent’s 1st- and 2nd-order desires. One way of emphasizing the need for origination over against such a hierarchical model is to embrace agent-causation. If premise 1 is true, then the agent’s volition cannot be the product of a deterministic causal chain extended beyond the agent. What other options are there? Two options are that volitions are uncaused, or only caused indeterministically. It is difficult to see how an agent could be the originator or ultimate source of volitions if volitions are uncaused. Similarly, for reasons we saw above when discussing the free will pessimist, it looks as if indeterministic causation would undermine, rather than enhance, an agent’s control over her volitions. For these reasons, some incompatibilists favor looking at the causation involved in volitions in a new light. Instead of holding that a volition is caused by a previous event (either deterministically or indeterministically), these incompatibilists favor saying that volitions are caused directly by agents. [For an extended defense of this view, see O’Connor (2000).] They hold that there are two irreducibly different kinds of causation, event-causation and agent-causation, and the latter is involved in free will. Proponents of agent-causation propose that agents are enduring substances that directly possess the power to cause volitions. Although many philosophers question whether agent-causation is coherent, if it were coherent, then it would provide support for premise 1 of the Origination Argument.

c. The Relation between the Arguments

The above way of delineating the Consequence and Origination Arguments may unfortunately suggest that the two kinds of arguments are more independent from each other than they really are. A number of incompatibilists have argued that agents originate their actions in the way required by premise 1 of the Origination Argument if and only if they have a choice about their actions in the way suggested by the Consequence Argument. In other words, if my future volitions are not the sort of thing that I have a choice about, then I do not originate those volitions. And as the above arguments contend, the truth of causal determinism threatens both our control over our actions and volitions, and our ability to originate those same actions and volitions. For if causal determinism is true, then the distant past, when joined with the laws of nature, is sufficient for every volition that an agent makes, and the causal chains that lead to those volitions would not begin within the agent. Thus, most incompatibilists think that having a choice and being a self-determiner go hand-in-hand. Robert Kane, for instance, argues that if agents have “ultimate responsibility” (his term for what is here called “origination” or “self-determination”), then they will also have alternative possibilities open to them. According to this line of argumentation, the power to cause one’s own actions is not a distinct power from the power to choose and do otherwise. Thus, the two different kinds of arguments for incompatibilism may simply be two sides of the same coin [see Kane (1996) and (2005)].

5. Arguments for Compatibilism (or Arguments against Incompatibilism)

Having laid out representatives of the two most prominent arguments for incompatibilism, let’s consider arguments in favor of compatibilism. In considering these kinds of arguments, it is pedagogically useful to approach them by using the arguments for incompatibilism. So, this section begins by considering ways that compatibilists have responded to the arguments given in the preceding section.

a. Rejecting the Incompatibilist Arguments

As noted above, the Origination Argument for incompatibilism is valid, and two of its premises are above dispute. Thus, the only way for the compatibilist to reject the conclusion of the Origination Argument is to reject its first premise. In other words, given the definition of determinism, compatibilists must deny that free will requires an agent’s being the originator or ultimate source of her actions. But how might this be done? Most frequently, compatibilists motivate a rejection of the “ultimacy condition” of free will by appealing to either a hierarchical or reasons-responsive view of what the will is [see Frankfurt, (1971) and Fischer and Ravizza, (1998)]. If all that is required for free will, for example, is a certain mesh between an agent’s 1st-order volitions and 2nd-order desires, then such an account does not require that an agent be the originator of those desires. Furthermore, since the truth of determinism would not entail that agents don’t have 1st- and 2nd-order desires and volitions, a hierarchical account of the will is compatible with the truth of determinism. Similarly, if an agent has free will if she has the requisite level of reasons-responsiveness such that she would have willed differently had she had different reasons, ultimacy is again not required. Thus, if one adopts certain accounts of the will, one has reason for rejecting the central premise of the Origination Argument.

Compatibilists have a greater number of responses available to them with regard to the Consequence Argument. One way of understanding the N operator that figures in the Consequence Argument is in terms of having the ability to do otherwise. That is, to say that Allison has no choice about a particular action of hers is to say that she could not have performed a different action (or even no action at all). Incompatibilists can easily account for this ability to do otherwise. According to incompatibilists, an agent can be free only if determinism is false. Consider again the case of Allison. If determinism is false, even though Allison did choose to walk her dog, she could have done otherwise than walk her dog since the conjunction of P and L is not sufficient for her taking her dog for a walk. Compatibilists, however, can give their own account of the ability to do otherwise. For them, to say that Allison could have done otherwise is simply to say that Allison would have done otherwise had she willed or chosen to do so [see, for example, Chisholm (1967)]. Of course, if determinism is true, then the only way that Allison could have willed or chosen to do otherwise would be if either the past or the laws were different than they actually are. In other words, saying that an agent could have done otherwise is to say that the agent would have done otherwise in a different counterfactual condition. But saying this is entirely consistent with one way of understanding the ability to do otherwise. Thus, these compatibilists are saying that Allison has the ability to do something such that, had she done it, either the past or the laws of nature would have been different than they actually are. If P and L entail that the agent does some action A, then the agent’s doing otherwise than A entails that either P or L would have been different than they actually are. 
Some compatibilists favor saying that agents have this counterfactual power over the past, while others favor counterfactual power over the laws of nature [Compare Lewis (1981) and Fischer (1984)]. Regardless, adopting either strategy provides the compatibilist with a way of avoiding the conclusion of the Consequence Argument by denying either premise 4 or premise 6 of that argument. Furthermore, having such a power is not a hollow victory, for it demarcates a plausible difference between those actions an agent would have done even if she didn’t want to (as in the case of coercion or manipulation) from those actions that an agent only would have done had she had certain beliefs and desires about that action. This view thus differentiates between those actions that were within the agent’s power to bring about from those that were not.

A second compatibilist response to the Consequence Argument is to deny the validity of the inference rule Beta the argument uses. While there are several approaches to this, perhaps the most decisive is the following, called the principle of Agglomeration [see McKay and Johnson (1996)]. Using only the inference rules Alpha, Beta and the basic rule of logical replacement, one can show that

(1) Np

and

(2) Nq

would entail

(3) N(p and q)

if Beta were valid. 1 and 2 do not entail 3, so Beta must be invalid.
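The derivation alluded to above can be spelled out in a few lines. This is only a sketch, and it assumes the two rules in their standard form: Alpha licenses the move from "p is logically necessary" to Np, and Beta licenses the move from Np and N(p ⊃ q) to Nq. The box symbol for logical necessity and the step labels (a) through (c) are notation introduced here for illustration.

```latex
\begin{align*}
&(1)\ \ Np && \text{premise}\\
&(2)\ \ Nq && \text{premise}\\
&(a)\ \ \Box\,\bigl(p \supset (q \supset (p \land q))\bigr) && \text{truth of logic}\\
&(b)\ \ N\bigl(p \supset (q \supset (p \land q))\bigr) && \text{from (a) by Alpha}\\
&(c)\ \ N\bigl(q \supset (p \land q)\bigr) && \text{from (1) and (b) by Beta}\\
&(3)\ \ N(p \land q) && \text{from (2) and (c) by Beta}
\end{align*}
```

Since step (a) is a logical truth and Alpha is not in question, the only rule left to blame if (3) turns out false while (1) and (2) are true is Beta.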

To see why 3 does not follow from 1 and 2, consider the case of a coin-toss. If the coin-toss is truly random, then Allison has no choice regarding whether the coin (if flipped) lands heads. Similarly, she has no choice regarding whether the coin (again, if flipped) lands tails. For purposes of simplicity, let us stipulate that the coin cannot land on its side and, if flipped, must land either heads or tails. Let p above represent ‘the coin doesn’t land heads’ and q represent ‘the coin doesn’t land tails’. If Beta were valid, then 1 and 2 would entail 3, and Allison would not have a choice about the conjunction of p and q; that is, she wouldn’t have a choice about the coin not landing heads and the coin not landing tails. If Allison didn’t have a choice about the coin not landing heads and didn’t have a choice about the coin not landing tails, then she wouldn’t have a choice about the coin landing either heads or tails. But Allison does have a choice about this: after all, she can ensure that the coin lands either heads or tails by simply flipping the coin. So Allison does have a choice about the conjunction of p and q. Since Alpha and the relevant rules of logical replacement in the transformation from Np and Nq to N(p and q) are beyond dispute, Beta must be invalid. Thus, the Consequence Argument for incompatibilism is invalid. [For an incompatibilist reply to the argument from Agglomeration, see Finch and Warfield (1998).]
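The coin-toss case can also be checked mechanically with a toy model. The sketch below is illustrative only and rests on two assumptions not stated in the text: that in the actual world the coin is never flipped, and that "Allison has no choice about p" is read as "p is true and no action open to Allison guarantees that p is false." The names ACTIONS, ACTUAL, can_ensure_false, and N are hypothetical.

```python
# Toy model of the coin-toss counterexample (all names are illustrative).
# Each action open to Allison maps to the set of outcomes it might produce.
ACTIONS = {
    "refrain": {"unflipped"},     # if she does not flip, the coin stays unflipped
    "flip": {"heads", "tails"},   # a truly random flip may land either way
}
ACTUAL = "unflipped"  # assumption: in the actual world the coin is never flipped


def p(outcome):
    """'The coin doesn't land heads.'"""
    return outcome != "heads"


def q(outcome):
    """'The coin doesn't land tails.'"""
    return outcome != "tails"


def p_and_q(outcome):
    """The conjunction of p and q."""
    return p(outcome) and q(outcome)


def can_ensure_false(prop):
    """True if some action guarantees prop is false on every possible outcome."""
    return any(
        all(not prop(outcome) for outcome in outcomes)
        for outcomes in ACTIONS.values()
    )


def N(prop):
    """Assumed reading: prop is actually true and Allison cannot ensure it is false."""
    return prop(ACTUAL) and not can_ensure_false(prop)


print(N(p), N(q), N(p_and_q))  # True True False
```

On this reading Np and Nq both hold, because flipping does not guarantee heads and does not guarantee tails, while N(p and q) fails, because flipping guarantees that the coin lands one way or the other.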

b. Frankfurt’s Argument against “the Ability to Do Otherwise”

Two other arguments for compatibilism build on the freedom requirement for moral responsibility. If one can show that moral responsibility is compatible with the truth of determinism, and if free will is required for moral responsibility, one will have implicitly shown that free will is itself compatible with the truth of determinism. The first of these arguments for compatibilism rejects the understanding of having a choice as involving the ability to do otherwise mentioned above. While most philosophers have tended to accept that an agent can be morally responsible for doing an action only if she could have done otherwise, Harry Frankfurt has attempted to show that this requirement is in fact false. Frankfurt gives an example in which an agent does an action in circumstances that lead us to believe that the agent acted freely [Frankfurt (1969); for recent discussion, see Widerker and McKenna (2003)]. Yet, unbeknown to the agent, the circumstances include some mechanism that would bring about the action if the agent did not perform it on her own. As it happens, though, the agent does perform the action freely and the mechanism is not involved in bringing about the action. It thus looks like the agent is morally responsible despite not being able to do otherwise. Here is one such scenario:

Allison is contemplating whether to walk her dog or not. Unbeknown to Allison, her father, Lloyd, wants to ensure that she does decide to walk the dog. He has therefore implanted a computer chip in her head such that if she is about to decide not to walk the dog, the chip will activate and coerce her into deciding to take the dog for a walk. Given the presence of the chip, Allison is unable not to decide to walk her dog, and she lacks the ability to do otherwise. However, Allison does decide to walk the dog on her own.

In such a case, Frankfurt thinks that Allison is morally responsible for her decision since the presence of Lloyd and his computer chip plays no causal role in her decision. Since she would have been morally responsible had Lloyd not been prepared to ensure that she decide to take her dog for a walk, why think that his mere presence renders her not morally responsible? Frankfurt concludes that Allison is morally responsible despite lacking the ability to do otherwise. If Frankfurt is right that such cases are possible, then even if the truth of determinism is incompatible with a kind of freedom that requires the ability to do otherwise, it is compatible with the kind of freedom required for moral responsibility.

c. Strawson’s Reactive Attitudes

In an influential article, Peter Strawson argues that many of the traditional debates between compatibilists and incompatibilists (such as how to understand the ability to do otherwise) are misguided [P. Strawson (1963)]. Strawson thinks that we should instead focus on what he calls the reactive attitudes, that is, those attitudes we have toward other people based on their attitudes toward and treatment of us. Strawson says that the hallmark of reactive attitudes is that they are “essentially natural human reactions to the good or ill will or indifference of others toward us, as displayed in their attitudes and actions.” Examples of reactive attitudes include gratitude, resentment, forgiveness and love. Strawson thinks that these attitudes are crucial to interpersonal interactions and that they provide the basis for holding individuals morally responsible. Strawson then argues for two claims. The first of these is that an agent’s reactive attitudes would not be affected by a belief that determinism was true:

The human commitment to participation in ordinary interpersonal relationships is, I think, too thoroughgoing and deeply rooted for us to take seriously the thought that a general theoretical conviction might so change our world that, in it, there were no longer such things as inter-personal relationships as we normally understand them.… A sustained objectivity of inter-personal attitude, and the human isolation which that would entail, does not seem to be something of which human beings would be capable, even if some general truth were a theoretical ground for it.

Furthermore, Strawson also argues for a normative claim: the truth of determinism should not undermine our reactive attitudes. He thinks that there are two kinds of cases where it is appropriate to suspend our reactive attitudes. One involves agents, such as young children or the mentally disabled, who are not moral agents. Strawson thinks that we should not have reactive attitudes toward non-moral agents. The second kind of case where it is appropriate to suspend our reactive attitudes is one in which, while the agent is a moral agent, her action toward us is not connected to her agency in the correct way. For instance, while I might have the reactive attitude of resentment towards someone who bumps into me and makes me spill my drink, if I were to find out that the person was pushed into me, I would not be justified in resenting that individual. The truth of determinism, however, would entail neither that no agents are moral agents nor that none of an agent’s actions are connected to her moral agency. Thus, Strawson thinks, the truth of determinism should not undermine our reactive attitudes. Since moral responsibility is based on the reactive attitudes, Strawson thinks that moral responsibility is compatible with the truth of determinism. And if free will is a requirement for moral responsibility, Strawson’s argument gives support to compatibilism.

6. Related Issues

The above discussion should help explain the perennial attraction philosophers have to the issues surrounding free will, particularly as it relates to causal determinism. However, free will is also intimately related to a number of other recurrent issues in the history of philosophy. In this final section, I will briefly articulate two other kinds of determinism and show how they are connected to free will.

a. Theological Determinism

The debate about free will and causal determinism parallels, in many ways, another debate about free will, this one stemming from what is often called ‘theological determinism’. Some religious traditions hold that God is ultimately responsible for everything that happens. According to these traditions, God’s willing x is necessary and sufficient for x. But if He is ultimately responsible for everything in virtue of what He wills, then He is ultimately responsible for all the actions and volitions performed by agents. God’s willing that Allison take the dog for a walk is thus necessary and sufficient for Allison taking the dog for a walk. But if this is true, it is hard to see how Allison could have free will. The problem becomes especially acute when considering traditional doctrines of eternal punishment. The traditional Christian doctrine of Hell, for example, is that Hell is a place of eternal punishment for non-repentant sinners. But if theological determinism is true, then whether or not agents repent is ultimately up to God, not to the agents themselves. This worry over free will thus gives rise to a particular version of the problem of evil: why does God not will that all come to faith, when His having such a will is sufficient for their salvation? [For a discussion of these, and related issues, see Helm, (1994).]

b. Logical Determinism

In addition to the causal and theological forms of determinism, there is also logical determinism. Logical determinism builds off the law of excluded middle and holds that propositions about what agents will do in the future already have a truth value. For instance, the proposition “Allison will take the dog for a walk next Thursday” is already true or false. Assume that it is true. Since token propositions cannot change in truth value over time, it was true a million years ago that Allison would walk her dog next Thursday. But the truth of the relevant proposition is sufficient for her actually taking the dog for a walk (after all, if it is true that she will walk the dog, then she will walk the dog). But then it looks like no matter what happens, Allison will in fact take her dog for a walk next Thursday and that this has always been the case. However, it is hard to see how Allison’s deciding to walk the dog can be a free decision since she must (given that the relevant token proposition is true and was true a million years ago) decide to walk him. In response to this problem, some philosophers have attempted to show that free will is compatible with the existence of true propositions about what we will do in the future, and others have denied that propositions about future free actions have a truth value, that is, that the law of excluded middle fails for some propositions. [For an introduction to these issues, see Finch and Warfield, (1999) and Kane, (2002).] If God is a being who knows the truth value of every proposition, this debate also connects with the debate over the relationship between divine foreknowledge and free will.

From this brief survey, we see that free will touches on central issues in metaphysics, philosophy of human nature, action theory, ethics and the philosophy of religion. Furthermore, we’ve seen that there are competing views regarding virtually every aspect of free will (including whether there is, or even could be, such a thing). Perhaps this partially explains the perennial philosophical interest in the topic.

7. References and Further Reading

  • Anglin, W. S. (1990). Free Will and the Christian Faith (Clarendon Press).
  • Broad, C. D. (1952). “Determinism, Indeterminism, and Libertarianism,” in Ethics and the History of Philosophy (Routledge and Kegan Paul).
  • Chisholm, Roderick (1967). “He Could Have Done Otherwise,” Journal of Philosophy 64: 409-417.
  • Descartes, René (1998). Discourse on Method and Meditations on First Philosophy, 4th edition (Hackett Publishing Company).
  • Ekstrom, Laura Waddell (1999). Free Will: A Philosophical Study (HarperCollins Publishers).
  • Finch, Alicia and Ted Warfield (1999). “Fatalism: Logical and Theological,” Faith and Philosophy 16.2: 233-238.
  • Finch, Alicia and Ted Warfield (1998). “The Mind Argument and Libertarianism,” Mind 107: 515-528.
  • Fischer, John Martin (1984). “Power Over the Past,” Pacific Philosophical Quarterly 65: 335-350.
  • Fischer, John Martin (1994). The Metaphysics of Free Will (Blackwell).
  • Fischer, John Martin and Mark Ravizza (1998). Responsibility and Control: A Theory of Moral Responsibility (Cambridge University Press).
  • Frankfurt, Harry (1969). “Alternate Possibilities and Moral Responsibility,” reprinted in Pereboom (1997), pages 156-166.
  • Frankfurt, Harry (1971). “Freedom of the Will and the Concept of a Person,” reprinted in Pereboom (1997), pages 167-183.
  • Ginet, Carl (1966). “Might We Have No Choice,” in Keith Lehrer, ed., Freedom and Determinism (Random House), pages 205-224.
  • Helm, Paul (1994). The Providence of God (InterVarsity Press).
  • Honderich, Ted (2002). How Free are You?, 2nd edition (Oxford University Press).
  • Kane, Robert (1998). The Significance of Free Will (Oxford University Press).
  • Kane, Robert, ed. (2001). Free Will (Blackwell).
  • Kane, Robert, ed. (2002). The Oxford Handbook of Free Will (Oxford University Press).
  • Kane, Robert (2005). A Contemporary Introduction to Free Will (Oxford University Press).
  • Lewis, David (1981). “Are We Free to Break the Laws?” Theoria 47: 113-121.
  • McKay, Thomas and David Johnson (1996). “A Reconsideration of an Argument against Compatibilism,” Philosophical Topics 24: 113-122.
  • O’Connor, Timothy (2000). Persons and Causes: The Metaphysics of Free Will (Oxford University Press).
  • Pereboom, Derk, ed. (1997). Free Will (Hackett).
  • Pereboom, Derk (2001). Living Without Free Will (Cambridge University Press).
  • Smilansky, Saul (2000). Free Will and Illusion (Clarendon Press).
  • Strawson, Galen (1994). “The Impossibility of Moral Responsibility,” Philosophical Studies 75: 5-24.
  • Strawson, Peter (1963). “Freedom and Resentment,” reprinted in Pereboom (1997), pages 119-142.
  • Stump, Eleonore (2003). Aquinas (Routledge).
  • Van Inwagen, Peter (1983). An Essay on Free Will (Clarendon Press).
  • Widerker, David and Michael McKenna (2003). Moral Responsibility and Alternative Possibilities: Essays on the Importance of Alternative Possibilities (Ashgate).

Author Information

Kevin Timpe
Email: ktimpe@nnu.edu
Northwest Nazarene University
U. S. A.

Analytic Philosophy

The school of analytic philosophy has dominated academic philosophy in various regions, most notably Great Britain and the United States, since the early twentieth century. It originated around the turn of the twentieth century as G. E. Moore and Bertrand Russell broke away from what was then the dominant school in the British universities, Absolute Idealism. Many would also include Gottlob Frege as a founder of analytic philosophy in the late 19th century, and this controversial issue is discussed in section 2c. When Moore and Russell articulated their alternative to Idealism, they used a linguistic idiom, frequently basing their arguments on the “meanings” of terms and propositions. Additionally, Russell believed that the grammar of natural language often is philosophically misleading, and that the way to dispel the illusion is to re-express propositions in the ideal formal language of symbolic logic, thereby revealing their true logical form. Because of this emphasis on language, analytic philosophy was widely, though perhaps mistakenly, taken to involve a turn toward language as the subject matter of philosophy, and it was taken to involve an accompanying methodological turn toward linguistic analysis. Thus, on the traditional view, analytic philosophy was born in this linguistic turn. The linguistic conception of philosophy was rightly seen as novel in the history of philosophy. For this reason analytic philosophy is reputed to have originated in a philosophical revolution on the grand scale—not merely in a revolt against British Idealism, but against traditional philosophy on the whole.

Analytic philosophy underwent several internal micro-revolutions that divide its history into five phases. The first phase runs approximately from 1900 to 1910. It is characterized by the quasi-Platonic form of realism initially endorsed by Moore and Russell as an alternative to Idealism. Their realism was expressed and defended in the idiom of “propositions” and “meanings,” so it was taken to involve a turn toward language. But its other significant feature is its turn away from the method of doing philosophy by proposing grand systems or broad syntheses and its turn toward the method of offering narrowly focused discussions that probe a specific, isolated issue with precision and attention to detail. By 1910, both Moore and Russell had abandoned their propositional realism—Moore in favor of a realistic philosophy of common sense, Russell in favor of a view he developed with Ludwig Wittgenstein called logical atomism. The turn to logical atomism and to ideal-language analysis characterizes the second phase of analytic philosophy, approximately 1910-1930. The third phase, approximately 1930-1945, is characterized by the rise of logical positivism, a view developed by the members of the Vienna Circle and popularized by the British philosopher A. J. Ayer. The fourth phase, approximately 1945-1965, is characterized by the turn to ordinary-language analysis, developed in various ways by the Cambridge philosophers Ludwig Wittgenstein and John Wisdom, and the Oxford philosophers Gilbert Ryle, John Austin, Peter Strawson, and Paul Grice.

During the 1960s, criticism from within and without caused the analytic movement to abandon its linguistic form. Linguistic philosophy gave way to the philosophy of language, the philosophy of language gave way to metaphysics, and this gave way to a variety of philosophical sub-disciplines. Thus the fifth phase, beginning in the mid 1960s and continuing beyond the end of the twentieth century, is characterized by eclecticism or pluralism. This post-linguistic analytic philosophy cannot be defined in terms of a common set of philosophical views or interests, but it can be loosely characterized in terms of its style, which tends to emphasize precision and thoroughness about a narrow topic and to deemphasize the imprecise or cavalier discussion of broad topics.

Even in its earlier phases, analytic philosophy was difficult to define in terms of its intrinsic features or fundamental philosophical commitments. Consequently, it has always relied on contrasts with other approaches to philosophy, especially approaches to which it found itself fundamentally opposed, to help clarify its own nature. Initially, it was opposed to British Idealism, and then to “traditional philosophy” at large. Later, it found itself opposed both to classical Phenomenology (for example, Husserl) and its offspring, such as Existentialism (Sartre, Camus, and so forth) and also “Continental” or “Postmodern” philosophy (Heidegger, Foucault and Derrida). Though classical Pragmatism bears some similarity to early analytic philosophy, especially in the work of C. S. Peirce and C. I. Lewis, the pragmatists are usually understood as constituting a separate tradition or school.

Table of Contents

  1. The Revolution of Moore and Russell: Cambridge Realism and The Linguistic Turn
  2. Russell and the Early Wittgenstein: Ideal Language and Logical Atomism
    1. The Theory of Descriptions
    2. Ideal-Language Philosophy vs. Ordinary-Language Philosophy
    3. Frege: Influence or Instigator?
    4. Logical Atomism and Wittgenstein’s Tractatus
  3. Logical Positivism, the Vienna Circle, and Quine
    1. Logical Positivism and the Vienna Circle
    2. W. V. Quine
  4. The Later Wittgenstein and Ordinary-Language Philosophy
    1. Ordinary-Language Philosophy
    2. The Later Wittgenstein
  5. The 1960s and After: The Era of Eclecticism
    1. The Demise of Linguistic Philosophy
    2. The Renaissance in Metaphysics
    3. The Renaissance in History
  6. References and Further Reading
    1. The Revolution of Moore and Russell: Cambridge Realism and The Linguistic Turn
    2. Russell and the Early Wittgenstein: Ideal Language and Logical Atomism
    3. Logical Positivism, the Vienna Circle, and Quine
    4. The Later Wittgenstein, et al.: Ordinary-Language Philosophy
    5. The 1960s and After: The Era of Eclecticism
    6. Critical and Historical Accounts of Analytic Philosophy
    7. Anthologies and General Introductions

1. The Revolution of Moore and Russell: Cambridge Realism and The Linguistic Turn

“It was towards the end of 1898,” wrote Bertrand Russell,

that Moore and I rebelled against both Kant and Hegel. Moore led the way, but I followed closely in his footsteps…. I felt…a great liberation, as if I had escaped from a hot house onto a windswept headland. In the first exuberance of liberation, I became a naïve realist and rejoiced in the thought that grass really is green. (Russell 1959, 22)

This important event in Russell’s own intellectual history turned out to be decisive for the history of twentieth-century philosophy as a whole; for it was this revolutionary break with British Idealism—then the most influential school of philosophical thought in the British universities—that birthed analytic philosophy and set it on the path to supplanting both Idealism and philosophy as traditionally conceived and practiced.

To understand Russell’s elation at the rebellion, one needs to know something about him and also something about British Idealism. Let’s begin with the latter.

At the end of the 19th century, F.H. Bradley, Bernard Bosanquet, and J.M.E. McTaggart were the leading British Idealists. They claimed that the world, although it naively appears to us to be a collection of discrete objects (this bird, that table, the earth and the sun, and so forth), is really a single indivisible whole whose nature is mental, or spiritual, or Ideal rather than material. Thus, idealism was a brand of metaphysical monism, but not a form of materialism, the other leading form of metaphysical monism. It was also a form of what we would now call anti-realism, since it claimed that the world of naïve or ordinary experience is something of an illusion. Their claim was not that the objects of ordinary experience do not exist, but that they are not, as we normally take them to be, discrete. Instead, every object exists and is what it is at least partly in virtue of the relations it bears to other things—more precisely, to all other things. This was called the doctrine of internal relations. Since, on this view, everything that exists does so only in virtue of its relations to everything else, it is misleading to say of any one thing that it exists simpliciter. The only thing that exists simpliciter is the whole—the entire network of necessarily related objects. Correspondingly, the Idealists believed that no statement about some isolated object could be true simpliciter, since, on their view, to speak of an object in isolation would be to ignore the greater part of the truth about it, namely, its relations to everything else.

Analytic philosophy began when Moore and then Russell started to defend a thoroughgoing realism about what Moore called the “common sense” or “ordinary” view of the world. This involved a lush metaphysical pluralism, the belief that there are many things that exist simpliciter. It was not this pluralism, however, nor the content of any of his philosophical views, that inspired the analytic movement. Instead, it was the manner and idiom of Moore’s philosophizing. First, Moore rejected system-building or making grand syntheses of his views, preferring to focus on narrowly defined philosophical problems held in isolation. Second, when Moore articulated his realism, he did so in the idiom of “propositions” and “meanings.” There is a noteworthy ambiguity as to whether these are linguistic items or mental ones.

This terminology is further ambiguous in Moore’s case, for two reasons. First, his views about propositions are highly similar to a view standard in Austro-German philosophy from Bolzano and Lotze to Husserl according to which “propositions” and “meanings” have an Ideal existence—the kind of existence traditionally attributed to Platonic Forms. It is likely that Moore got the idea from reading in that tradition (cf. Bell 1999, Willard 1984). Second, despite strong similarities with the Austro-German view, it is clear that, in Moore’s early thought, “propositions” and “meanings” are primarily neither Ideal nor mental nor linguistic, but real in the sense of “thing-like.” For Moore and the early Russell, propositions or meanings were “identical” to ordinary objects—tables, cats, people. For more on this peculiar view, see the article on Moore, section 2b.

The deep metaphysical complexity attaching to Moore’s view was largely overlooked or ignored by his younger contemporaries, who were attracted to the form of his philosophizing rather than to its content. Taking the linguistic aspect of “propositions” and “meanings” to be paramount, they saw Moore as endorsing a linguistic approach to philosophy. This, along with his penchant for attending to isolated philosophical problems rather than constructing a grand system, gave rise to the notion that he had rebelled not merely against British Idealism but against traditional philosophy on the grand scale.

Though Moore was later to object that there was nothing especially linguistic about it (see Moore 1942b), the linguistic conception of Moore’s method was far from baseless. For instance, in a famous paper called “A Defense of Common Sense” (Moore 1925), Moore seems to argue that the common sense view of the world is built into the terms of our ordinary language, so that if some philosopher wants to say that some common sense belief is false, he thereby disqualifies the very medium in which he expresses himself, and so speaks either equivocally or nonsensically.

His case begins with the observation that we know many things despite the fact that we do not know how we know them. Among these “beliefs of common sense,” as he calls them, are such propositions as “There exists at present a living human body, which is my body,” “Ever since it [this body] was born, it has been either in contact with or not far from the surface of the earth,” and “I have often perceived both body and other things which formed part of its environment, including other human bodies” (Moore 1925; in Moore 1959: 33). We can call these common sense propositions.

Moore argues that each common sense proposition has an “ordinary meaning” that specifies exactly what it is that one knows when one knows that proposition to be true. This “ordinary meaning” is perfectly clear to most everyone, except for some skeptical philosophers who

seem to think that [for example] the question “Do you believe that the earth has existed for many years past?” is not a plain question, such as should be met either by a plain “Yes” or “No,” or by a plain “I can’t make up my mind,” but is the sort of question which can be properly met by: “It all depends on what you mean by ‘the earth’ and ‘exists’ and ‘years’….” (Moore 1925; in 1959: 36)

Moore thought that to call common sense into question this way is perverse because the ordinary meaning of a common sense proposition is plain to all competent language-users. So, to question its meaning, and to suggest it has a different meaning, is disingenuous. Moreover, since the bounds of intelligibility seem to be fixed by the ordinary meanings of common sense propositions, the philosopher must accept them as starting points for philosophical reflection. Thus, the task of the philosopher is not to question the truth of common sense propositions, but to provide their correct analyses or explanations.

Moore’s use of the term “analysis” in this way is the source of the name “analytic philosophy.” Early on in analytic history, Moorean analysis was taken to be a matter of rephrasing some common sense proposition so as to yield greater insight into its already-clear and unquestionable meaning. For example, just as one elucidates the meaning of “brother” by saying a brother is a male sibling or by saying it means “male sibling,” so one might say that seeing a hand means experiencing a certain external object—which is exactly what Moore claims in his paper “Proof of an External World” (Moore 1939).

The argument of that essay runs as follows. “Here is one hand” is a common sense proposition with an ordinary meaning. Using it in accordance with that meaning, presenting the hand for inspection is sufficient proof that the proposition is true—that there is indeed a hand there. But a hand, according to the ordinary meaning of “hand,” is a material object, and a material object, according to the ordinary meaning of “material object,” is an external object, an object that isn’t just in our mind. Thus, since we can prove that there is a hand there, and since a hand is an external object, there is an external world, according to the ordinary meaning of “external world.”

These examples are from papers written in the second half of Moore’s career, but his “linguistic method” can be discerned much earlier, in works dating all the way back to the late 1800s—the period of his rebellion against Idealism. Even in Moore’s first influential paper, “The Nature of Judgment” (Moore 1899), he can be found paying very close attention to propositions and their meanings. In his celebrated paper, “The Refutation of Idealism” (Moore 1903b), Moore uses linguistic analysis to argue against the Idealist’s slogan Esse est percipi (to be is to be perceived). Moore reads the slogan as a definition or, as he would later call it, an analysis: just as we say “bachelor” means “unmarried man,” so the Idealist says “to exist” means “to be cognized.” However, if these bits of language had the same meaning, Moore argues, it would be superfluous to assert that they were identical, just as it is superfluous to say “a bachelor is a bachelor.” The fact that the Idealist sees some need to assert the formula reveals that there is a difference in meanings of “to be” and “to be perceived,” and hence a difference in the corresponding phenomena as well.

Moore’s most famous meaning-centered argument is perhaps the “open question argument” of his Principia Ethica (Moore 1903a). The open question argument purports to show that it is a mistake to define “good” in terms of anything other than itself. For any definition of good—“goodness is pleasure,” say—it makes sense to ask whether goodness really is pleasure (or whatever it has been identified with); thus, every attempt at definition leaves it an open question as to what good really is. This is so because every purported definition fails to capture the meaning of “good.”

All of these cases exhibit what proved to be the most influential aspect of Moore’s philosophical work, namely his method of analysis, which many of his contemporaries took to be linguistic analysis. For instance, Norman Malcolm represents the standard view of Moore for much of the twentieth century when he says that “the essence of Moore’s technique of refuting philosophical statements consists in pointing out that these statements go against ordinary language” (Malcolm 1942, 349). In the same essay, he goes on to tie Moore’s entire philosophical legacy to his “linguistic method:”

Moore’s great historical role consists in the fact that he has been perhaps the first philosopher to sense that any philosophical statement that violates ordinary language is false, and consistently to defend ordinary language against its philosophical violators. (Malcolm 1942, 368)

Malcolm is right to note the novelty of Moore’s approach. Although previous philosophers occasionally had philosophized about language, and had, in their philosophizing, paid close attention to the way language was used, none had ever claimed that philosophizing itself was merely a matter of analyzing language. Of course, Moore did not make this claim either, but what Moore actually did as a philosopher seemed to make saying it superfluous—in practice, he seemed to be doing exactly what Malcolm said he was doing. Thus, though it took some time for the philosophical community to realize it, it eventually became clear that this new “linguistic method,” pioneered by Moore, constituted a radical break not only with the British Idealists but with the larger philosophical tradition itself. To put it generally, philosophy was traditionally understood as the practice of reasoning about the world. Its goal was to give a logos—a rationally coherent account—of the world and its parts at various levels of granularity, but ultimately as a whole and at the most general level. There were other aspects of the project, too, of course, but this was the heart of it. With Moore, however, philosophy seemed to be recast as the practice of linguistic analysis applied to isolated issues. Thus, the rise of analytic philosophy, understood as the relatively continuous growth of a new philosophical school originating in Moore’s “linguistic turn,” was eventually recognized as being not just the emergence of another philosophical school, but as constituting a “revolution in philosophy” at large. (See Ayer et al. 1963 and Tugendhat 1982.)

2. Russell and the Early Wittgenstein: Ideal Language and Logical Atomism

The second phase of analytic philosophy is characterized by the turn to ideal language analysis and, along with it, logical atomism—a metaphysical system developed by Bertrand Russell and Ludwig Wittgenstein. Russell laid the essential groundwork for both in his pioneering work in formal logic, which is covered in Sections 2a and 2b. Though this work was done during the first phase of analytic philosophy (1900-1910), it coalesced into a system only toward the end of that period, as Russell and Whitehead completed their work on the monumental Principia Mathematica (Russell and Whitehead 1910-13), and as Russell began to work closely with Ludwig Wittgenstein.

Wittgenstein seems to have been the sine qua non of the system. Russell was the first to use the term “logical atomism,” in a 1911 lecture to the French Philosophical Society. He was also the first to publicly provide a full-length, systematic treatment of it, in his 1918 lectures on “The Philosophy of Logical Atomism” (Russell 1918-19). However, despite the centrality of Russell’s logical work for the system, in the opening paragraph of these lectures Russell acknowledges that they “are very largely concerned with explaining certain ideas which I learnt from my friend and former pupil Ludwig Wittgenstein” (Russell 1918, 35). Wittgenstein’s own views are recorded in his Tractatus Logico-Philosophicus. First published in 1921, the Tractatus proved to be the most influential piece written on logical atomism. Because of its influence, we shall pay special attention to the Tractatus when it comes to presenting logical atomism as a complete system in Section 2d.

Though Russell and Wittgenstein differed over some of the details of logical atomism, these disagreements can be ignored for present purposes. What mattered for the development of analytic philosophy on the whole was the emergence in the second decade of the twentieth century of a new view of reality tailored to fit recent developments in formal logic and the philosophical methodology connected to it, as discussed in Section 2b. This was the common core of the Russellian and Wittgensteinian versions of logical atomism; thus, blurring the lines between Russell and Wittgenstein actually enables us to maintain better focus on the emerging analytic tradition. It will also make convenient a brief word on Frege, to see why some have wanted to include him as a founder of analytic philosophy (Section 2c).

a. The Theory of Descriptions

Much of Russell’s exuberance over Moore’s realism had to do with its consequences for logic and mathematics. Like so many philosophers before him, Russell was attracted to the objective certainty of mathematical and logical truths. However, because Idealism taught that no proposition about a bit of reality in isolation could be true simpliciter, an apparently straightforward truth such as 2+2=4, or If a=b and b=c then a=c, was not so straightforward after all. Even worse, Idealism made such truths dependent upon their being thought or conceived. This follows from the doctrine of internal relations; for, on the natural assumption that knowledge is or involves a relation between a knower (subject) and something known (object), the doctrine implies that objects of knowledge are not independent of the subjects that know them. This left Idealism open to the charge of endorsing psychologism—the view that apparently objective truths are to be accounted for in terms of the operations of subjective cognitive or “psychological” faculties. Psychologism was common to nearly all versions of Kantian and post-Kantian Idealism (including British Idealism). It was also a common feature of thought in the British empirical tradition, from Hume to Mill (albeit with a naturalistic twist). Moore’s early realism allowed Russell to avoid psychologism and other aspects of Idealism that prevented treating logical and mathematical truths as absolutely true in themselves.

A crucial part of this early realism, however, was the object theory of meaning; and this had implications that Russell found unacceptable. On the object theory, the meaning of a sentence is the object or state of affairs to which it refers (this is one reason why Moore could identify ordinary objects as propositions or meanings; see Section 1). For instance, the sentence “That leaf is green” is meaningful in virtue of bearing a special relationship to the state of affairs it is about, namely, a certain leaf’s being green.

This may seem plausible at first glance; problems emerge, however, when one recognizes that the class of meaningful sentences includes many that, from an empirical point of view, lack objects. Any statement referring to something that does not exist, such as a fictional character in a novel, will have this problem. A particularly interesting species of this genus is the negative existential statement—statements that express the denial of their subjects’ existence. For example, when we say “The golden mountain does not exist,” we seem to refer to a golden mountain—a nonexistent object—in the very act of denying its existence. But, on the object theory, if this sentence is to be meaningful, it must have an object to serve as its meaning. Thus it seems that the object theorist is faced with a dilemma: either give up the object theory of meaning or postulate a realm of non-empirical objects that stand as the meanings of these apparently objectless sentences.

The Austrian philosopher Alexius Meinong took the latter horn of the dilemma, notoriously postulating a realm of non-existent objects. This alternative was too much for Russell. Instead, he found a way of going between the horns of the dilemma. His escape route was called the “theory of descriptions,” a bit of creative reasoning that the logician F. P. Ramsey called a “paradigm of philosophy,” and one which helped to stimulate extraordinary social momentum for the budding analytic movement. The theory of descriptions appears in Russell’s 1905 essay, “On Denoting,” which has become a central text in the analytic canon. There, Russell argues that “denoting phrases”—phrases that involve a noun preceded by “a,” “an,” “some,” “any,” “every,” “all,” or “the”—are incomplete symbols; that is, they have no meaning on their own, but only in the context of a complete sentence that expresses a proposition. Such sentences can be rephrased—analyzed in Moore’s sense of “analyzed”—into sentences that are meaningful and yet do not refer to anything nonexistent.

For instance, according to Russell, saying “The golden mountain does not exist” is really just a misleading way of saying “It is not the case that there is exactly one thing that is a mountain and is golden.” Thus analyzed, it becomes clear that the proposition does not refer to anything, but simply denies an existential claim. Since it does not refer to any “golden mountain,” it does not need a Meinongian object to provide it with meaning. In fact, taking the latter formulation to be the true logical form of the statement, Russell construes the original’s reference to a non-existent golden mountain as a matter of grammatical illusion. One dispels the illusion by making the grammatical form match the true logical form, and this is done through logical analysis. The idea that language could cast illusions that needed to be dispelled by some form of linguistic analysis was to be a prominent theme in analytic philosophy, both in its ideal language and ordinary language camps, through roughly 1960.

b. Ideal-Language Philosophy vs. Ordinary-Language Philosophy

Russellian analysis has just been identified as logical rather than linguistic analysis, and yet it was said in a previous paragraph that this was analysis in the sense made familiar by Moore. In truth, there were both significant similarities and significant differences between Moorean and Russellian analysis. On the one hand, Russellian analysis was like Moore’s in that it involved the rephrasing of a sentence into another sentence semantically equivalent but grammatically different. On the other hand, Russell’s analyses were not given in ordinary language, as Moore’s were. Instead, they were given in symbolic logic, that is, in a quasi-mathematical, symbolic notation that made the structure of Russell’s analyzed propositions exceedingly clear. For instance, with the definitions of Mx as “x is a mountain” and Gx as “x is golden,” the proposition that the golden mountain does not exist becomes

~(∃x)[(Mx & Gx) & (∀y)((My & Gy) → y=x)]

Equivalently, in English, it is not the case that there is some object such that (1) it is a mountain, (2) it is golden, and (3) all objects that are mountains and golden are identical to it. (For more on what this sort of notation looks like and how it works, see the article on Propositional Logic, especially Section 3.)
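Read over a small finite domain, this analysis can be checked mechanically. The following Python sketch is offered only as an illustration: the toy domain, the predicate arguments is_mountain and is_golden, and the function names are assumptions introduced here and are not part of Russell's text.

```python
def exactly_one_golden_mountain(domain, is_mountain, is_golden):
    """True iff some x is a golden mountain and every golden mountain is identical to x."""
    return any(
        is_mountain(x) and is_golden(x)
        and all((not (is_mountain(y) and is_golden(y))) or y == x for y in domain)
        for x in domain
    )


def golden_mountain_does_not_exist(domain, is_mountain, is_golden):
    """The analyzed sentence: the denial of the existential claim above."""
    return not exactly_one_golden_mountain(domain, is_mountain, is_golden)


# Hypothetical toy domain in which nothing is both a mountain and golden.
domain = ["Everest", "gold ring", "leaf"]
mountains = {"Everest"}
golden = {"gold ring"}

result = golden_mountain_does_not_exist(
    domain, lambda x: x in mountains, lambda x: x in golden
)
print(result)
```

Nothing in the sketch has to stand for a golden mountain in order for the analyzed sentence to receive a truth value, which is the point of the analysis.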

By 1910, Russell, along with Alfred North Whitehead, had so developed this symbolic notation and the rules governing its use that it constituted a fairly complete system of formal logic. This they published in the three volumes of their monumental Principia Mathematica (Russell and Whitehead 1910-1913).

Within the analytic movement, the Principia was received as providing an ideal language, capable of elucidating all sorts of ordinary-language confusions. Consequently, Russellian logical analysis was seen as a new species of the genus linguistic analysis, which had already been established by Moore. Furthermore, many took logical analysis to be superior to Moore’s ordinary-language analysis insofar as its results (its analyses) were more exact and not themselves prone to further misunderstandings or illusions.

The distinction between ordinary-language philosophy and ideal-language philosophy formed the basis for a fundamental division within the analytic movement through the early 1960s. The introduction of logical analysis also laid the groundwork for logical atomism, a new metaphysical system developed by Russell and Ludwig Wittgenstein. Before we discuss this directly, however, we must say a word about Gottlob Frege.

c. Frege: Influence or Instigator?

In developing the formal system of Principia Mathematica, Russell relied heavily on the work of several forebears including the German mathematician and philosopher Gottlob Frege. A generation before Russell and the Principia, Frege had provided his own system of formal logic, with its own system of symbolic notation. Frege’s goal in doing so was to prove logicism, the view that mathematics is reducible to logic. This was also Russell’s goal in the Principia. (For more on the development of logic in the late 19th and early 20th centuries, see the article on Propositional Logic, especially Section 2). Frege also anticipated Russell’s notion of incomplete symbols by invoking what has come to be called “the context principle:” words have meaning only in the context of complete sentences.

Frege’s focus on the formalization and symbolization of logic naturally led him into terrain that we would now classify as falling under the philosophy of language, and to approach certain philosophical problems as if they were problems about language, or at least as if they could be resolved by linguistic means. This has led some to see in Frege a linguistic turn similar to that perceivable in the early work of Moore and Russell (on this point, see the article on Frege and Language).

Because of these similarities and anticipations, and because Russell explicitly relied on Frege’s work, many have seen Frege as a founder of analytic philosophy more or less on a par with Moore and Russell (See Dummett 1993 and Kenny 2000). Others see this as an exaggeration both of Frege’s role and of the similarities between him and other canonical analysts. For instance, Peter Hacker notes that Frege was not interested in reforming philosophy the way all the early analysts were:

Frege’s professional life was a single-minded pursuit of a demonstration that arithmetic had its foundations in pure logic alone … One will search Frege’s works in vain for a systematic discussion of the nature of philosophy. (Hacker 1986: 5, 7)

There is no doubt that Frege’s views proved crucially useful and inspiring to key players on the ideal-language side of analytic philosophy. Whether or not this qualifies him as a founder of analytic philosophy depends on the extent to which we see the analytic movement as born of a desire for metaphilosophical revolution on the grand scale. To the extent that this is essential to our understanding of analytic philosophy, Frege’s role will be that of an influence rather than a founder.

d. Logical Atomism and Wittgenstein’s Tractatus

Ludwig Wittgenstein came to Cambridge to study mathematical logic under Russell, but he quickly established himself as his teacher’s intellectual peer. Together, they devised a metaphysical system called “logical atomism.” As discussed at the beginning of Section 2, qua total system, logical atomism seems to have been Wittgenstein’s brainchild. Still, this should not be seen as in any way marginalizing Russell’s significance for the system, which can be described as a metaphysics based on the assumption that an ideal language the likes of which was provided in Principia Mathematica is the key to reality.

According to logical atomism, propositions are built out of elements corresponding to the basic constituents of the world, just as sentences are built out of words. The combination of words in a meaningful sentence mirrors the combination of constituents in the corresponding proposition and also in the corresponding possible or actual state of affairs. That is, the structure of every possible or actual state of affairs is isomorphic with both the structure of the proposition that refers to it and the structure of the sentence that expresses that proposition–so long as the sentence is properly formulated in the notation of symbolic logic. The simplest sort of combination is called an atomic fact because this fact has no sub-facts as part of its structure. An atomic fact for some logical atomists might be something like an individual having a property—a certain leaf’s being green, for instance. Linguistically, this fact is represented by an atomic proposition: for example, “this leaf is green,” or, in logical symbolism “F(a).” Both the fact F(a) and the proposition “F(a)” are called “atomic” not because they themselves are atomic [that is, without structure], but because all their constituents are. Atomic facts are the basic constituents of the world, and atomic propositions are the basic constituents of language.

More complex propositions representing more complex facts are called molecular propositions and molecular facts.  The propositions are made by linking atomic propositions together with truth-functional connectives, such as “and,” “or” and “not.” A truth-functional connective is one that combines constituent propositions in such a way that their truth-values (that is, their respective statuses as true or false) completely determine the truth value of the resulting molecular proposition. For instance, the truth value of a proposition of the form “not-p” can be characterized in terms of, and hence treated as determined by, the truth value of “p” because if “p” is true, then “not-p” is false, and if it is false, “not-p” is true. Similarly, a proposition of the form “p and q” will be true if and only if its constituent propositions “p” and “q” are true on their own.
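The idea of a truth-functional connective can be made concrete by modeling each connective as a function from truth values to a truth value. The Python sketch below is a minimal illustration under that assumption; the names not_, and_, or_, and truth_table are chosen here for convenience and come from no particular logical text.

```python
from itertools import product


# Each connective is modeled as a function from truth values to a truth value.
def not_(p):
    return not p


def and_(p, q):
    return p and q


def or_(p, q):
    return p or q


def truth_table(connective, arity):
    """Map every assignment of truth values to the connective's output."""
    return {
        values: connective(*values)
        for values in product([True, False], repeat=arity)
    }


# Print the table for "p and q": one row per assignment of truth values.
for values, result in truth_table(and_, 2).items():
    print(values, "->", result)
```

Because the output depends on nothing but the inputs, the table for a connective says everything there is to say about its contribution to a molecular proposition.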

The logic of Principia Mathematica is entirely truth-functional; that is, it only allows for molecular propositions whose truth-values are determined by their atomic constituents. Thus, as Russell observed in the introduction to the second edition of the Principia, “given all true atomic propositions, together with the fact that they are all, every other true proposition can theoretically be deduced by logical methods” (Russell 1925, xv). The same assumption—called the thesis of truth-functionality or the thesis of extensionality—lies behind Wittgenstein’s Tractatus Logico-Philosophicus.
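The thesis can be pictured as a procedure: fix the truth values of the atomic propositions, and the truth value of any molecular proposition follows by computation. The sketch below assumes a simple representation of formulas as nested tuples; that representation, the function name evaluate, and the sample atomic propositions are all hypothetical and serve only to illustrate the idea.

```python
def evaluate(formula, atomic_facts):
    """Compute the truth value of a formula from the truth values of its atoms.

    A formula is either a string naming an atomic proposition or a tuple:
    ("not", f), ("and", f, g), or ("or", f, g).
    """
    if isinstance(formula, str):
        return atomic_facts[formula]
    operator = formula[0]
    if operator == "not":
        return not evaluate(formula[1], atomic_facts)
    if operator == "and":
        return evaluate(formula[1], atomic_facts) and evaluate(formula[2], atomic_facts)
    if operator == "or":
        return evaluate(formula[1], atomic_facts) or evaluate(formula[2], atomic_facts)
    raise ValueError(f"unknown connective: {operator!r}")


# Hypothetical atomic propositions and a molecular proposition built from them.
atomic_facts = {"leaf_is_green": True, "sky_is_red": False}
molecular = ("and", "leaf_is_green", ("not", "sky_is_red"))
print(evaluate(molecular, atomic_facts))
```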

As mentioned previously, Wittgenstein’s Tractatus proved to be the most influential expression of logical atomism. The Tractatus is organized around seven propositions, here taken from the 1922 translation by C. K. Ogden:

  1. The world is everything that is the case.
  2. What is the case, the fact, is the existence of atomic facts.
  3. The logical picture of the facts is the thought.
  4. The thought is the significant proposition.
  5. Propositions are truth-functions of elementary propositions. (An elementary proposition is a truth function of itself.)
  6. The general form of a truth-function is…. This is the general form of a proposition.
  7. Whereof one cannot speak, thereof one must be silent.

The body of the Tractatus consists in cascading levels of numbered elaborations of these propositions (1 is elaborated by 1.1 which is elaborated by 1.11, 1.12 and 1.13, and so forth)—except for 7, which stands on its own. Propositions 1 and 2 establish the metaphysical side of logical atomism: the world is nothing but a complex of atomic facts. Propositions 3 and 4 establish the isomorphism between language and reality: a significant (meaningful) proposition is a “logical picture” of the facts that constitute some possible or actual state of affairs. It is a picture in the sense that the structure of the proposition is identical to the structure of the corresponding atomic facts. It is here, incidentally, that we get the first explicit statement of the metaphilosophical view characteristic of early analytic philosophy: “All philosophy is a ‘critique of language’ …” (4.0031).

Proposition 5 asserts the thesis of truth-functionality, the view that all complex propositions are built out of atomic propositions joined by truth-functional connectives, and that atomic propositions are truth-functional in themselves. Even existentially quantified propositions are considered to be long disjunctions of atomic propositions. It has since been recognized that a truth-functional logic is not adequate to capture all the phenomena of the world; or at least that, if there is an adequate truth-functional system, we haven’t found it yet. Certain phenomena seem to defy truth-functional characterization; for instance, moral facts are problematic. Knowing whether the constituent proposition “p” is true doesn’t seem to tell us whether “It ought to be the case that p” is true. Similarly problematic are facts about thoughts, beliefs, and other mental states (captured in statements such as “John believes that…”), and modal facts (captured in statements about the necessity or possibility of certain states of affairs). And treating existential quantifiers as long disjunctions doesn’t seem to be adequate for the infinite number of facts about numbers, since there surely are more real numbers than there are available names to name them, even if we were willing to accept infinitely long disjunctions. The hope that truth-functional logic will prove adequate for resolving all these problems has inspired a good bit of thinking in the analytic tradition, especially during the first half of the twentieth century. This hope lies at the heart of logical atomism.
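The reading of existential quantification as a long disjunction can be illustrated for the easy case, a finite domain in which every individual has a name. The Python sketch below assumes exactly that; the function name exists_as_disjunction and the three named leaves are hypothetical.

```python
from functools import reduce


def exists_as_disjunction(atomic_values):
    """Treat 'there is an x such that F(x)' as F(a) or F(b) or ... over named individuals."""
    return reduce(lambda acc, value: acc or value, atomic_values, False)


# Hypothetical atomic propositions about three named leaves.
is_green = {"leaf_a": False, "leaf_b": True, "leaf_c": False}
some_leaf_is_green = exists_as_disjunction(is_green.values())
print(some_leaf_is_green)
```

The sketch works only because the individuals can be listed one by one. It offers no way to handle a domain such as the real numbers, which is the limitation noted above.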

In its full form, Proposition 6 includes some unusual symbolism that is not reproduced here.  All it does, however, is to give a general “recipe” for the creation of molecular propositions by giving the general form of a truth-function. Basically, Wittgenstein is saying that all propositions are truth-functional, and that, ultimately, there is only one kind of truth-function. Principia Mathematica had employed a number of truth-functional connectives: “and,” “or,” “not,” and so forth.  However, in 1913 a logician named Henry Sheffer showed that propositions involving these connectives could be rephrased (analyzed) as propositions involving a single connective consisting in the negation of a conjunction. This was called the “not and” or “nand” connective, and was supposed to be equivalent to the ordinary language formulation “not both x and y.” It is usually symbolized by a short vertical line ( | ) called the Sheffer stroke. Though Wittgenstein uses his own idiosyncratic symbolism, this is the operation identified in proposition 6 and some of its elaborations as showing the general form of a truth-function. Replacing the Principia’s plurality of connectives with the “nand” connective made for an extremely minimalistic system—all one needed to construct a complete picture/description of the world was a single truth-functional connective applied repeatedly to the set of all atomic propositions.
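That a single connective suffices can be verified by exhaustive check. The sketch below defines “nand” and rebuilds “not,” “and,” and “or” from it using the standard textbook definitions; the function names are chosen here for illustration, and the code does not attempt to reproduce Wittgenstein's own symbolism.

```python
from itertools import product


def nand(p, q):
    """Sheffer stroke: 'not both p and q'."""
    return not (p and q)


# The familiar connectives rebuilt from nand alone.
def not_(p):
    return nand(p, p)


def and_(p, q):
    return nand(nand(p, q), nand(p, q))


def or_(p, q):
    return nand(nand(p, p), nand(q, q))


# Check every assignment of truth values against Python's built-in operators.
all_match = all(
    not_(p) == (not p) and and_(p, q) == (p and q) and or_(p, q) == (p or q)
    for p, q in product([True, False], repeat=2)
)
print(all_match)
```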

Proposition 7, which stands on its own, is the culmination of a series of observations made throughout the Tractatus, and especially in the elaborations of proposition 6. Throughout the Tractatus there runs a distinction between showing and saying. Saying is a matter of expressing a meaningful proposition. Showing is a matter of presenting something’s form or structure. Thus, as Wittgenstein observes at 4.022, “A proposition shows its sense. A proposition shows how things stand if it is true. And it says that they do so stand.”

In the introduction to the Tractatus, Wittgenstein indicates that his overarching purpose is to set the criteria and limits of meaningful saying. The structural aspects of language and the world—those aspects that are shown—fall beyond the limits of meaningful saying. According to Wittgenstein, the propositions of logic and mathematics are purely structural and therefore meaningless—they show the form of all possible propositions/states of affairs, but they do not themselves picture any particular state of affairs, thus they do not say anything. This has the odd consequence that the propositions of the Tractatus themselves, which are supposed to be about logic, are meaningless. Hence the famous dictum at 6.54:

My propositions are elucidatory in this way: he who understands me finally recognizes them as senseless, when he has climbed out through them, on them, over them. (He must so to speak throw away the ladder, after he has climbed up on it.) He must transcend these propositions, and then he will see the world aright.

Though meaningless, the propositions of logic and mathematics are not nonsense. They at least have the virtue of showing the essential structure of all possible facts. On the other hand, there are concatenations of words, purported propositions, that neither show nor say anything and thus are not connected to reality in any way. Such propositions are not merely senseless, they are nonsense. Among nonsense propositions are included the bulk of traditional philosophical statements articulating traditional philosophical problems and solutions, especially in metaphysics and ethics. This is the consequence of Wittgenstein’s presumption that meaningfulness is somehow linked to the realm of phenomena studied by the natural sciences (cf. 4.11 ff). Thus, as he claims in 6.53:

The correct method in philosophy would really be the following: to say nothing except what can be said, that is propositions of natural science—that is something that has nothing to do with philosophy—and then, whenever someone else wanted to say something metaphysical, to demonstrate to him that he had failed to give a meaning to certain signs in his propositions.

In the eyes of its author (as he avers in its Introduction), the real accomplishment of the Tractatus was to have solved, or rather dissolved, all the traditional problems of philosophy by showing that they were meaningless conundrums generated by a failure to understand the limits of meaningful discourse.

3. Logical Positivism, the Vienna Circle, and Quine

a. Logical Positivism and the Vienna Circle

Logical positivism is the result of combining the central aspects of the positivisms of Auguste Comte and Ernst Mach with the meta-philosophical and methodological views of the analytic movement, especially as understood by the ideal-language camp. In all its forms, positivism was animated by the idealization of scientific knowledge as it was commonly understood from at least the time of Newton through the early twentieth century. Consequently, at its core is a view called scientism: the view that all knowledge is scientific knowledge.

As twentieth-century philosophy of science has shown, the definition and demarcation of science is a very difficult task. Still, for several centuries it has been common to presume that metaphysics and other branches of philosophy-as-traditionally-practiced, not to mention religious and “common sense” beliefs, do not qualify as scientific. From the standpoint of scientism, these are not fields of knowledge, and their claims should not be regarded as carrying any serious weight.

At the heart of logical positivism was a novel way of dismissing certain non-scientific views by declaring them not merely wrong or false, but meaningless. According to the verification theory of meaning, sometimes also called the empiricist theory of meaning, any non-tautological statement has meaning if and only if it can be empirically verified. This “verification principle” of meaning is similar to the principle maintained in Wittgenstein’s Tractatus that the realm of meaning is coextensive with the realm of the natural (empirical) sciences. In fact the logical positivists drew many of their views straight from the pages of the Tractatus (though their reading of it has since been criticized as being too inclined to emphasize the parts friendly to scientific naturalism at the expense of those less-friendly). With Wittgenstein, the logical positivists concluded that the bulk of traditional philosophy consisted in meaningless pseudo-problems generated by the misuse of language, and that the true role of philosophy was to establish and enforce the limits of meaningful language through linguistic analysis.

Logical positivism was created and promoted mainly by a number of Austro-German thinkers associated with the Vienna Circle and, to a lesser extent, the Berlin Circle. The Vienna Circle began as a discussion group of scientifically-minded philosophers—or perhaps philosophically-minded scientists—organized by Moritz Schlick in 1922. Its exact membership is difficult to determine, since there were a number of peripheral figures who attended its meetings or at least had substantial connections to core members, but who are frequently characterized as visitors or associates rather than full-fledged members. Among its most prominent members were Schlick himself, Otto Neurath, Herbert Feigl, Friedrich Waismann and, perhaps most prominent of all, Rudolf Carnap. The members of both Circles made contributions to a number of different philosophical and scientific discussions, including logic and the philosophy of mind (see for example this Encyclopedia’s articles on Behaviorism and Identity Theory); however, their most important contributions vis-à-vis the development of analytic philosophy were in the areas of the philosophy of language, philosophical methodology and metaphilosophy. It was their views in these areas that combined to form logical positivism.

Logical positivism was popularized in Britain by A.J. Ayer, who visited with the Vienna Circle in 1933. His book Language, Truth and Logic (Ayer 1936) was extremely influential, and remains the best introduction to logical positivism as understood in its heyday. To escape the turmoil of World War II, several members of the Vienna Circle emigrated to the United States where they secured teaching posts and exercised an immense influence on academic philosophy. By this time, however, logical positivism was largely past its prime; consequently, it was not so much logical positivism proper that was promulgated, but something more in the direction of philosophizing focused on language, logic, and science. (For more on this point, see the article on American Philosophy, especially Section 4.)

Ironically, the demise of logical positivism was caused mainly by a fatal flaw in its central view, the verification theory of meaning. According to the verification principle, a non-tautological statement has meaning if and only if it can be empirically verified. However, the verification principle itself is non-tautological but cannot be empirically verified. Consequently, it renders itself meaningless. Even apart from this devastating problem, there were difficulties in setting the scope of the principle so as to properly subserve the positivists’ scientistic aims. In its strong form (given above), the principle undermined not only itself, but also statements about theoretical entities, so necessary for science to do its work. On the other hand, weaker versions of the principle, such as that given in the second edition of Ayer’s Language, Truth and Logic (1946), were incapable of eliminating the full range of metaphysical and other non-scientific statements that the positivists wanted to disqualify.
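
The self-undermining argument can be laid out step by step. The following is only an illustrative formalization, not one given by the positivists or their critics: the predicate letters M (is meaningful), T (is a tautology) and E (is empirically verifiable), and the label VP for the verification principle, are notational assumptions introduced here.

```latex
% Assumed notation (not in the original text):
%   M(s): statement s is meaningful
%   T(s): statement s is a tautology
%   E(s): statement s is empirically verifiable
\begin{align*}
\text{(VP)}\quad & \forall s\,\bigl(\neg T(s) \rightarrow (M(s) \leftrightarrow E(s))\bigr)
  && \text{the verification principle} \\
\text{(1)}\quad & \neg T(\mathrm{VP})
  && \text{VP is non-tautological} \\
\text{(2)}\quad & \neg E(\mathrm{VP})
  && \text{VP cannot be empirically verified} \\
\text{(3)}\quad & M(\mathrm{VP}) \leftrightarrow E(\mathrm{VP})
  && \text{from (VP) and (1), taking } s = \mathrm{VP} \\
\text{(4)}\quad & \neg M(\mathrm{VP})
  && \text{from (2) and (3)}
\end{align*}
```

On this reconstruction, the principle, applied to itself, yields the conclusion that it is meaningless.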

b. W. V. Quine

Willard Van Orman Quine was the first American philosopher of any great significance in the analytic tradition. Though his views had their greatest impact only as the era of linguistic philosophy came to an end, it is convenient to take them up in contrast with logical positivism.

An important part of the logical positivist program was the attempt to analyze or reduce scientific statements into so-called protocol statements having to do with empirical observations. This reductionist project was taken up by several members of the Vienna Circle, but none took it so far as did Rudolf Carnap, in his The Logical Structure of the World (1928) and in subsequent work.

The basic problem for the reductionist project is that many important scientific claims and concepts seem to go beyond what can be verified empirically. The claim that the sun will come up tomorrow goes beyond today’s observations. Claims about theoretical entities such as atoms also provide obvious cases of going beyond what can be verified by specific observations, but statements of scientific law run into essentially the same problem. Assuming empiricism, what is required to place scientific claims on a secure, epistemic foundation is to eliminate the gap between observation and theory without introducing further unverifiable entities or views. This was the goal of the reductionist project. By showing that every apparently unverifiable claim in science could be analyzed into a small set of observation-sentences, the logical positivists hoped to show that the gap between observation and theory does not really exist.

Despite being on very friendly terms with Carnap and other members of the Vienna Circle (with whom he visited in the early 1930s), and despite being dedicated, as they were, to scientism and empiricism, Quine argued that the reductionist project was hopeless. “Modern Empiricism,” he claimed,

has been conditioned in large part by two dogmas. One is a belief in some fundamental cleavage between truths which are analytic, or grounded in meanings independently of matters of fact, and truths which are synthetic, or grounded in fact. The other dogma is reductionism: the belief that each meaningful statement is equivalent to some logical construct upon terms which refer to immediate experience. (Quine 1951, 20)

“Both dogmas,” says Quine, “are ill-founded.”

The first dogma with which Quine is concerned is that there is an important distinction to be made between analytic and synthetic claims. Traditionally, the notions of analytic truth, a priori truth, and necessary truth have been closely linked to one another, forming a conceptual network that stands over against the supposedly contradictory network of a posteriori, contingent, and synthetic truths. Each of these categories will be explained briefly prior to addressing Quine’s critique of this “dogma” (for a more extensive treatment see the article on A Priori and A Posteriori).

An a priori truth is a proposition that can be known to be true by intuition or pure reason, without making empirical observations. For instance, neither mathematical truths such as 2+2=4, nor logical truths such as If ((a=b) & (b=c)) then (a=c), nor semantic truths such as All bachelors are unmarried men, depend upon the realization of any corresponding, worldly state of affairs, either in order to be true or to be known. A posteriori truths, on the other hand, are truths grounded in or at least known only by experience, including both mundane truths such as The cat is on the mat and scientific truths such as Bodies in free-fall accelerate at 9.8 m/s².

Many (if not all) a priori truths seem to be necessary—that is, they could not have been otherwise. On the other hand, many (if not all) a posteriori truths seem to be contingent—that is, they could have been otherwise: the cat might not have been on the mat, and, for all we know, the rate of acceleration for bodies in free-fall might have been different from what it is.

Finally, the necessity and a prioricity of such truths seem to be linked to their analyticity. A proposition is analytically true if the meanings of its terms require it to be true. For example, the proposition “All bachelors are men” is analytically true, because “man” is connected to “bachelor” in virtue of its meaning—a fact recognized by analyzing “bachelor” so as to see that it means “unmarried man”. On the other hand, “All bachelors have left the room” is not analytically true. It is called a synthetic proposition or truth, because it involves terms or concepts that are not connected analytically by their individual meanings, but only insofar as they are synthesized (brought together) in the proposition itself. Such truths are usually, and perhaps always, a posteriori and contingent.

Historically, philosophers have tended to try to explain necessity, a prioricity and analyticity by appealing to abstract objects such as Plato’s Forms or Aristotle’s essences. Such entities purportedly transcend the realm of time, space, and/or the senses, and hence the realm of “nature” as defined by science—at least as this was understood by the scientific naturalism of the late nineteenth and early twentieth centuries. Consequently, devotees of scientific naturalism required an alternative account of necessity, a prioricity, and analyticity; and here analytic philosophy’s linguistic turn seemed to offer a way forward.

For obvious reasons, and as the above quotation from Quine hints, analytic truths traditionally have been characterized as “true in virtue of meaning.” However, historically, “meaning” has been cashed out in different ways: in terms of abstract, Ideal entities (Plato, Aristotle, Husserl), in terms of concepts (Locke, Hume), and in terms of language (construed as a system of concrete, sensible symbols with conventionally approved uses). In the context of analytic philosophy’s “linguistic turn,” it was all too easy to take the last of these approaches, and hence to treat analyticity as deriving from some linguistic phenomenon such as synonymy or the interchangeability of terms.

Such a view was highly amenable to the scientistic, naturalistic, and empiricistic leanings of many early analysts, and especially to the logical positivists. On the assumptions that meaning is fundamentally linguistic and that language is a conventional symbol-system in which the symbols are assigned meanings by fiat, one can explain synonymy without referring to anything beyond the realm of time, space and the senses. If one can then explain analyticity in terms of synonymy, and explain both necessity and a prioricity in terms of analyticity, then one will have theories of analytic, necessary, and a priori truths consistent with scientific naturalism.

Given Quine’s own commitment to scientific naturalism, one might have expected him to join the logical positivists and others in embracing this model and then striving for a workable version of it. However, Quine proposed a more radical solution to the scientific naturalist’s problem with necessity, a prioricity, and analyticity: namely, he proposed to reject the distinctions between analytic and synthetic, a priori and a posteriori, necessary and contingent.

He begins by undermining the notion that synonymy-relations are established by fiat or “stipulative definition.” On the naturalistic view of language and meaning, all meanings and synonymy relations would have to have been established by some person or people making stipulative definitions at some particular place and time. For instance, someone would have had to have said, at some point in history, “henceforth, the symbol ‘bachelor’ shall be interchangeable with the symbol ‘unmarried man’.” However, Quine asks rhetorically, “who defined it thus, or when?” (Quine 1951, 24). The point is that we have no evidence of this ever having happened. Thus, at the very least, the naturalistic account of meaning/synonymy is an unverifiable theory of the sort the positivists wanted to avoid. Moreover, what empirical evidence we do have suggests that it is likely false, for, as Quine sees it, “definition—except in the extreme case of the explicitly conventional introduction of new notation—hinges on prior relationships of synonymy” (Quine 1951, 27). In cases where it appears that someone is making a stipulative definition—as in a dictionary, for example—Quine explains that, far from establishing synonymy, the stipulator is either describing or making use of synonymy relations already present in the language. After exploring several kinds of cases in which stipulative definitions seem to establish synonymy relations, he concludes that all but one—the banal act of coining an abbreviation—rely on pre-existing synonymy relations. The upshot is that stipulative definition cannot account for the breadth of cases in which synonymy is exemplified, and thus that it cannot be the general ground of either synonymy or analyticity.

With its foundation thus undermined, the naturalistic theory of analyticity, necessity and a prioricity collapses. However, rather than rejecting naturalism on account of its inability to explain these phenomena, Quine rejects the notion that naturalism needs to explain them on the ground that they are spurious categories. Prima facie, of course, there seems to be a distinction between the analytic and the synthetic, the a priori and the a posteriori, the necessary and the contingent. However, when we attempt to get a deeper understanding of these phenomena by defining them, we cannot do it. Quine explores several other ways of defining analyticity in addition to synonymy and stipulative definition, ultimately concluding that none work. To the contrary, analyticity, synonymy, necessity and related concepts seem to contribute to each other’s meaning/definition in a way that “is not flatly circular, but something like it. It has the form, figuratively speaking, of a closed curve in space” (Quine 1951, 29). Because none of them can be defined without invoking one of the others, no one of them can be eliminated by reducing it to one of the others. Rather than concluding that analyticity, a prioricity, necessity, and so forth are primitive phenomena, Quine takes their indefinability to indicate that there is no genuine distinction to be drawn between them and their traditional opposites.

This brings us to the second dogma. When Quine criticizes “reductionism,” he has principally in mind the logical positivists’ tendency to pursue the reductionist project as if every and any scientific statement, considered in isolation, could be reduced to/analyzed into a small set of observational statements related to it in such a way that they counted uniquely as that claim’s verification and meaning. Over against this “atomistic” or “isolationist” or “local” conception of verification/reductive analysis, Quine argued that scientific claims have predictive power, and hence verifiability or falsifiability, and hence also meaning, only as parts of large networks of claims that together form far-reaching theories that might be called “worldviews.” For this reason, one can never verify or falsify an isolated scientific claim; rather, verification and falsification—and hence also meaning—are holistic. Observations (and observation sentences) that may seem to verify a lone claim actually make a partial contribution to the verification of the total theoretical network to which it belongs.

As the language here suggests, viewed holistically, verification is never absolute. There is no manageable set of observations that will verify a total theory or any of its constitutive claims once and for all. By the same token, observations (and observation sentences) that may seem to falsify a lone claim do not decisively falsify either it or the theory to which it belongs. Rather, such observations require only that some adjustment be made to the theory. Perhaps one of its constitutive claims must be rejected, but not necessarily the one that initially seemed to be falsified. On Quine’s view, any constitutive claim can be saved by making adjustments elsewhere in the theory-network.

This holistic view of meaning and verification reinforces Quine’s rejection of the analytic/synthetic distinction and its fellows. Holism in these areas implies that no claim in one’s total theory is immune from revision or rejection in light of observational evidence. This means that even claims traditionally thought to be necessary and/or analytic, such as those of mathematics and logic, can be revised or rejected in order to preserve other claims to which one is more deeply committed.

Quine’s assault on the analytic/synthetic distinction undermines not merely the positivists’ reductionist project, but also the general practice of analysis which, from the beginning, had been understood to involve the transformation of a sentence into another sentence semantically equivalent (synonymous) but grammatically different. At the same time, Quine’s holism about the meaning of scientific claims and their verification generalizes to become a theory of meaning holism that applies to all meaningful claims whatsoever. However, following Moore’s practice, the analytic method was usually applied to claims in isolation, apart from considerations of their connection to other claims that together might constitute a philosophical “worldview.” Quinean meaning holism undermines this aspect of analysis just as much as it does the logical positivists’ “isolationist” view of verification.

4. The Later Wittgenstein and Ordinary-Language Philosophy

a. Ordinary-Language Philosophy

Thanks to G.E. Moore, ordinary-language analysis had had a place in the analytic movement from the very beginning. Because of the perceived superiority of ideal-language analysis, however, it dropped almost completely out of sight for several decades. In the 1930s, ordinary-language analysis began to make a comeback thanks mainly to Wittgenstein—whose views had undergone radical changes during the 1920s—but also to a number of other talented philosophers including John Wisdom, John Austin (not to be confused with the nineteenth-century John Austin who invented legal positivism), Gilbert Ryle, Peter Strawson and Paul Grice. Despite differences in their reasons for adopting the ordinary-language approach as well as their respective manners of employing it, these figures’ common focus on ordinary language was a substantial point of unity over against the initially dominant ideal-language approach.

Ordinary-language philosophy became dominant in analytic philosophy only after World War II—hence the dates for the ordinary-language era given in the Introduction are 1945-1965. Indeed, with the exception of several articles by Ryle, the most important texts of the ordinary-language camp were published in 1949 and later—in some cases not until much later, when the linguistic approach to philosophy in all its forms was already on its way out.

Ordinary-language philosophy is sometimes called “Oxford philosophy.” This is because Ryle, Austin, Strawson and Grice were all Oxford dons. They were the most important representatives of the ordinary-language camp after Wittgenstein (who was at Cambridge). After Wittgenstein died in the early years of the ordinary-language era, they lived to promote it through its heyday.

Despite the strong connection to Oxford, Wittgenstein is usually taken to be the most important of the ordinary-language philosophers. For this reason, we will focus only on his later views in giving a more detailed example of ordinary language philosophy.

b. The Later Wittgenstein

While logical positivism was busy crumbling under the weight of self-referential incoherence, a larger problem was brewing for ideal-language philosophy in general. After publishing the Tractatus, Wittgenstein retired from philosophy and went to teach grade-school in the Austrian countryside. Why wouldn’t he leave academia? After all, he believed he had already laid to rest all the traditional problems of philosophy.

During his time away from the academy, Wittgenstein had occasion to rethink his views about language. He concluded that, far from being a truth-functional calculus, language has no universally correct structure—that is, there is no such thing as an ideal language. Instead, each language-system—be it a full-fledged language, a dialect, or a specialized technical language used by some body of experts—is like a game that functions according to its own rules.

These rules are not of the sort found in grammar books—those are just attempts to describe rules already found in the practices of some linguistic community. Real linguistic rules, according to the later Wittgenstein, cannot be stated, but are rather shown in the complex intertwining of linguistic and non-linguistic practices that make up the “form of life” of any linguistic community. Language is, for the later Wittgenstein, an intrinsically social phenomenon, and its correct modes are as diverse as the many successful modes of corporate human life. Consequently, it cannot be studied in the abstract, apart from its many particular embodiments in human communities.

In contrast with his views in the Tractatus, the later Wittgenstein no longer believed that meaning is a picturing-relation grounded in the correspondence relationships between linguistic atoms and metaphysical atoms. Instead, language systems, or language games, are unanalyzable wholes whose parts (utterances sanctioned by the rules of the language) have meaning in virtue of having a role to play—a use—within the total form of life of a linguistic community. Thus it is often said that for the later Wittgenstein meaning is use. On this view, the parts of a language need not refer or correspond to anything at all—they only have to play a role in a form of life.

It is important to note that even in his later thought, Wittgenstein retained the view that traditional philosophical problems arise from linguistic error, and that true philosophy is about analyzing language so as to grasp the limits of meaning and see that error for what it is—a headlong tumble into confusion or meaninglessness. However, his new understanding of language required a new understanding of analysis. No longer could it be the transformation of some ordinary language statement into the symbolic notation of formal logic purportedly showing its true form. Instead, it is a matter of looking at how language is ordinarily used and seeing that traditional philosophical problems arise only as we depart from that use.

“A philosophical problem,” says Wittgenstein, “has the form: ‘I don’t know my way about’” (Wittgenstein 1953, ¶ 123), that is, I don’t know how to speak properly about this, to ask a question about this, to give an answer to that question. If I were to transcend the rules of my language and say something anyhow, what I say would be meaningless nonsense. Such are the utterances of traditional, metaphysical philosophy. Consequently, philosophical problems are to be solved, or rather dissolved,

by looking into the workings of our language, and that in such a way as to make us recognize its workings: … The problems are solved, not by giving new information, but by arranging what we have always known. (Wittgenstein 1953, ¶ 109)

And “what we have always known” is the rules of our language. “The work of the philosopher,” he says, “consists in assembling reminders for a particular purpose” (Wittgenstein 1953, ¶ 127). These reminders take the form of examples of how the parts of language are ordinarily used in the language game out of which the philosopher has tried to step. Their purpose is to coax the philosopher away from the misuse of language essential to the pursuit of traditional philosophical questions. Thus the true philosophy becomes a kind of therapy aimed at curing a linguistic disease that cripples one’s ability to fully engage in the form of life of one’s linguistic community. True philosophy, Wittgenstein says, “is a battle against the bewitchment of our intelligence by means of language” (Wittgenstein 1953, ¶ 109). The true philosopher’s weapon in this battle is “to bring words back from their metaphysical to their everyday use” (Wittgenstein 1953, ¶ 116), so that “the results of philosophy are the uncovering of one or another piece of plain nonsense and of bumps that the understanding has gotten by running its head up against the limits of language” (Wittgenstein 1953, ¶ 119).

Though Wittgenstein developed these new views much earlier (mainly in the 1920s and 30s), they were not officially published until 1953, in the posthumous Philosophical Investigations. Prior to this, Wittgenstein’s new views were spread largely by word of mouth among his students and other interested persons.

5. The 1960s and After: The Era of Eclecticism

a. The Demise of Linguistic Philosophy

By the mid-1960s the era of linguistic philosophy was coming to a close. The causes of its demise are variegated. For one thing, it was by this time apparent that there were deep divisions within the analytic movement, especially between the ordinary-language and ideal-language camps, over the nature of language and meaning on the one hand, and over how to do philosophy on the other. Up to this point, the core of analytic philosophy had been the view that philosophical problems are linguistic illusions generated by violating the boundaries of meaning, and that they were to be solved by clearly marking those boundaries and then staying within them. It was now becoming clear, however, that this was no easy task. Far from being the transparent phenomenon that the early analysts had taken it to be, linguistic meaning was turning out to be a very puzzling phenomenon, itself in need of deep, philosophical treatment.

Indeed, it was becoming clear that many who had held the core analytic view about the nature of philosophy had relied upon different theories of meaning, sometimes implicit, never sufficiently clear, and frequently implausible. The internal failure of logical positivism combined with the external criticisms of Wittgenstein and Quine contributed to the demise of the ideal-language approach. On the other hand, many, including Bertrand Russell, saw the ordinary-language approach as falling far short of serious, philosophical work. For this and other reasons, the ordinary-language approach also drew fire from outside the analytic movement, in the form of Ernest Gellner’s Words and Things (1959) and C.W.K. Mundle’s A Critique of Linguistic Philosophy (1970). The former especially had a large, international impact, thereby contributing to what T. P. Uschanov has called “the strange death of ordinary language philosophy.”

The waning of linguistic philosophy signaled also the waning of attempts to specify the proper philosophical method, or even just the method distinctive of analytic philosophy. Quine’s take on the matter—that philosophy is continuous with science in its aims and methods, differing only in the generality of its questions—proved influential and achieved a certain level of dominance for a time, but not to the extent that the linguistic conception of philosophy had during its sixty-year run. Alternatives tied less tightly to the empirical sciences soon emerged, with the result that philosophical practice in contemporary analytic philosophy is now quite eclectic. In some circles, the application of formal techniques is still regarded as central to philosophical practice, though this is now more likely to be regarded as a means of achieving clarity about our concepts than as a way of analyzing language. In other circles meticulous expression in ordinary language is seen to provide a sufficient level of clarity.

Partly because of Quine’s view of philosophy as continuous with science (which, of course, is divided into specializations), and partly because analytic philosophy had always been given to dealing with narrowly-defined questions in isolation from others, post-linguistic analytic philosophy partitioned itself into an ever-increasing number of specialized sub-fields. What had been linguistic philosophy metamorphosed into what we now know as the philosophy of language. Epistemology, the philosophy of mind, the philosophy of science, ethics and meta-ethics, and even metaphysics emerged or re-emerged as areas of inquiry not indifferent to linguistic concerns, but not themselves intrinsically linguistic. Over time, the list has expanded to include aesthetics, social and political philosophy, feminist philosophy, the philosophy of religion, philosophy of law, cognitive science, and the history of philosophy.

On account of its eclecticism, contemporary analytic philosophy defies summary or general description. By the same token, it encompasses far too much to discuss in any detail here. However, two developments in post-linguistic analytic philosophy require special mention.

b. The Renaissance in Metaphysics

Metaphysics has undergone a certain sort of renaissance in post-linguistic analytic philosophy. Although contemporary analytic philosophy does not readily countenance traditional system-building metaphysics (at least as a respected professional activity), it has embraced the piecemeal pursuit of metaphysical questions so wholeheartedly that metaphysics is now seen as one of its three most important sub-disciplines. (The other two are epistemology and the philosophy of language; all three are frequently referred to as “core” analytic areas or sub-disciplines.) This is noteworthy given analytic philosophy’s traditional anti-metaphysical orientation.

The return of metaphysics is due mainly to the collapse of those theories of meaning which originally had banned it as meaningless, but later developments in the philosophy of language also played a role. In the 1960s, the ordinary-language philosopher Peter Strawson began advocating for what he called “descriptive metaphysics,” a matter of looking to the structure and content of natural languages to illuminate the contours of different metaphysical worldviews or “conceptual schemes.” At the same time, and despite his naturalism and scientism which pitted him against speculative metaphysics, Quine’s holistic views about meaning and verification opened the door to speculative metaphysics by showing that theory cannot be reduced to observation even in the sciences. In the 1960s and 70s, the attempts of Donald Davidson and others to construct a formal theory of meaning based on Alfred Tarski’s formal definition of truth eventually led to the development of possible worlds semantics by David Lewis. Consistent with the Quinean insight that meaning is connected to holistic worldviews or, in more metaphysical terms, world-states, possible worlds semantics defines important logical concepts such as validity, soundness and completeness, as well as concepts that earlier logics were incapable of handling—such as possibility and necessity—in terms of total descriptions of a way that some worlds or all worlds might be/have been. For example, a proposition p is necessary if p is true in all possible worlds. Thus, despite its formalism, possible worlds semantics approximates some aspects of traditional metaphysics that earlier analytic philosophy eschewed.
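
The possible-worlds definition of necessity can be illustrated with a small sketch. It is a toy model under stated assumptions rather than any particular philosopher's formal system: a "world" is represented as an assignment of truth values to two hypothetical atomic sentences, every world is treated as relevant to every other, and all names in the code are invented for the illustration.

```python
from typing import Callable, Dict, List

# Simplifying assumption: a "world" is a total assignment of truth values
# to a fixed stock of atomic sentences.
World = Dict[str, bool]
Proposition = Callable[[World], bool]


def necessary(p: Proposition, worlds: List[World]) -> bool:
    """p is necessary if it is true in every world of the model."""
    return all(p(w) for w in worlds)


def possible(p: Proposition, worlds: List[World]) -> bool:
    """p is possible if it is true in at least one world of the model."""
    return any(p(w) for w in worlds)


def contingent(p: Proposition, worlds: List[World]) -> bool:
    """p is contingent if it is true in some worlds and false in others."""
    return possible(p, worlds) and not necessary(p, worlds)


# Hypothetical toy model with three worlds (not drawn from the article).
WORLDS: List[World] = [
    {"cat_on_mat": True, "raining": True},
    {"cat_on_mat": False, "raining": True},
    {"cat_on_mat": True, "raining": False},
]


def cat_on_mat(w: World) -> bool:
    return w["cat_on_mat"]


def rain_or_not(w: World) -> bool:
    return w["raining"] or not w["raining"]


def rain_and_not(w: World) -> bool:
    return w["raining"] and not w["raining"]


if __name__ == "__main__":
    for label, p in [("cat on mat", cat_on_mat),
                     ("rain or not rain", rain_or_not),
                     ("rain and not rain", rain_and_not)]:
        print(label,
              "| necessary:", necessary(p, WORLDS),
              "| possible:", possible(p, WORLDS))
```

In this model "the cat is on the mat" comes out contingent, while the instance of excluded middle comes out necessary, which mirrors the traditional pairing of a posteriori with contingent truths and of logical with necessary truths.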

With the advent of possible worlds semantics, attention shifted from the notion of meaning to that of reference. The latter has to do explicitly with the language-world connection, and so has an overtly metaphysical aspect. In the 1970s, direct reference theories came to dominate the philosophy of language. Developed independently by Saul Kripke and Ruth Barcan Marcus, a direct reference theory claims that some words—particularly proper names—have no meaning, but simply serve as “tags” (Marcus’ term) or “rigid designators” (Kripke’s term) for the things they name. Tagging or rigid designation is usually spelled out in terms of possible worlds: it is a relation between name and thing such that it holds in all possible worlds. This then provides a linguistic analog of a metaphysical theory of identity the likes of which one finds in traditional “substance” metaphysics such as that of Aristotle. With the restrictions characteristic of earlier analytic philosophy removed, these positions in the philosophy of language made for an easy transition into metaphysics proper.
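
The contrast between a rigid designator and a non-rigid term can also be sketched in code. This is a minimal illustration under assumptions, not a rendering of Kripke's or Marcus's own formal apparatus: worlds are represented as simple records, a term is a function from a world to the individual it picks out there, and the example data (including the counterfactual teacher) are hypothetical.

```python
from typing import Callable, Dict, List

# Simplifying assumption: a world records who occupies a given role in it.
World = Dict[str, str]
# A term is modeled as a function from a world to the individual it designates.
Term = Callable[[World], str]

# Hypothetical worlds; in the second, someone else taught Alexander.
WORLDS: List[World] = [
    {"teacher_of_alexander": "Aristotle"},
    {"teacher_of_alexander": "Xenocrates"},
    {"teacher_of_alexander": "Aristotle"},
]


def name_aristotle(world: World) -> str:
    """A proper name, modeled as a tag: same individual in every world."""
    return "Aristotle"


def the_teacher_of_alexander(world: World) -> str:
    """A description: picks out whoever fills the role in the given world."""
    return world["teacher_of_alexander"]


def is_rigid(term: Term, worlds: List[World]) -> bool:
    """A term is rigid (in this model) if it designates one and the same
    individual in all of the worlds considered."""
    return len({term(w) for w in worlds}) == 1


if __name__ == "__main__":
    print("name is rigid:", is_rigid(name_aristotle, WORLDS))
    print("description is rigid:", is_rigid(the_teacher_of_alexander, WORLDS))
```

The name behaves as a constant function across worlds, whereas the description varies with the facts of each world. That invariance is what the passage above describes as a relation between name and thing holding in all possible worlds.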

c. The Renaissance in History

Because analytic philosophy initially saw itself as superseding traditional philosophy, its tendency throughout much of the twentieth century was to disregard the history of philosophy. It is even reported that a sign reading “just say no to the history of ideas” once hung on a door in the Philosophy building at Princeton University (Grafton 2004, 2). Though earlier analytic philosophers would sometimes address the views of a philosopher from previous centuries, they frequently failed to combine philosophical acumen with historical care, thereby falling into faulty, anachronistic interpretations of earlier philosophers.

Beginning in the 1970s, some in the analytic context began to rebel against this anti-historical attitude. The following remembrance by Daniel Garber describes well the emerging historical consciousness in the analytic context (though this was not then and is not now so widespread as to count as characteristic of analytic philosophy itself):

What my generation of historians of philosophy was reacting against was a bundle of practices that characterized the writing of the history of philosophy in the period: the tendency to substitute rational reconstructions of a philosopher’s views for the views themselves; the tendency to focus on an extremely narrow group of figures (Descartes, Spinoza, and Leibniz, Locke, Berkeley and Hume in my period); within that very narrow canon the tendency to focus on just a few works at the exclusion of others, those that best fit with our current conception of the subject of philosophy; the tendency to work exclusively from translations and to ignore secondary work that was not originally written in English; the tendency to treat the philosophical positions as if they were those presented by contemporaries, and on and on and on. (Garber 2004, 2)

Over against this “bundle of practices,” the historical movement began to interpret the more well-known problems and views of historical figures in the context of, first, the wholes of their respective bodies of work, second, their respective intellectual contexts, noting how their work related to that of the preceding generation of thinkers, and, third, the broader social environment in which they lived and thought and wrote.

Eventually, this new historical approach was adopted by philosopher-scholars interested in the history of analytic philosophy itself. As a result, the last two decades have seen the emergence of the history (or historiography) of analytic philosophy as an increasingly important sub-discipline within analytic philosophy itself. Major figures in this field include Tom Baldwin, Hans Sluga, Nicholas Griffin, Peter Hacker, Ray Monk, Peter Hylton, Hans-Johann Glock and Michael Beaney, among a good many others. The surge of interest in the history of analytic philosophy has even drawn efforts from philosophers better known for work in “core” areas of analytic philosophy, such as Michael Dummett and Scott Soames.

Some of these authors are responsible for discovering or re-discovering the fact that neither Moore nor Russell conceived of themselves as linguistic philosophers. Others have been involved in the debate over Frege mentioned in Section 2c. All this has served to undermine received views and to open a debate concerning the true nature of analytic philosophy and the full scope of its history. (For more on this, see Preston 2004, 2005a-b.)

6. References and Further Reading

The main divisions of this bibliography correspond to the main divisions of the article, which in turn correspond to the main historical phases of analytic philosophy. In addition, there is at the end a section on anthologies, collections and reference works that do not fit nicely under the other headings.

a. The Revolution of Moore and Russell: Cambridge Realism and The Linguistic Turn

Primary Sources

  • Moore, G. E. 1899: “The Nature of Judgment,” Mind 8, 176-93. Reprinted in Moore 1993, 1-19.
  • Moore, G. E. 1903a: Principia Ethica, Cambridge: Cambridge University Press.
  • Moore, G. E. 1903b: “The Refutation of Idealism” Mind 12, 433-53. Reprinted in Moore 1993, 23-44.
  • Moore, G. E. 1925: “A Defense of Common Sense” in J. H. Muirhead ed., Contemporary British Philosophy, London: Allen and Unwin, 193-223. Reprinted in Moore 1959, 126-148, and Moore 1993, 106-33.
  • Moore, G. E. 1939: “Proof of an External World,” Proceedings of the British Academy 25, 273-300. Reprinted in Moore 1993, 147-70.
  • Moore, G. E. 1942a: “An Autobiography,” in Schilpp ed., 1942, 3-39.
  • Moore, G. E. 1942b: “A Reply to My Critics,” in Schilpp ed., 1942, 535-677.
  • Moore, G. E. 1959: Philosophical Papers, London: George Allen and Unwin.
  • Moore, G. E. 1993: G.E. Moore: Selected Writings, ed. Thomas Baldwin, London: Routledge.
  • Russell, Bertrand. 1959: My Philosophical Development, London: George Allen and Unwin; New York: Simon and Schuster.

Secondary Sources

  • Ayer, A.J. (ed.) 1971: Russell and Moore: The Analytical Heritage, Cambridge, MA: Harvard University Press.
  • Baldwin, T. 1990: G. E. Moore, London: Routledge.
  • Baldwin, T. 1991: “The Identity Theory of Truth,” Mind, New Series, Vol. 100, No. 1, 35-52.
  • Bell, David. 1999: “The Revolution of Moore and Russell: A Very British Coup?” in Anthony O’Hear (ed.), German Philosophy Since Kant, Cambridge and New York: Cambridge University Press.
  • Griffin, Nicholas. 1991: Russell’s Idealist Apprenticeship, Oxford: Clarendon Press.
  • Hylton, Peter. 1990: Russell, Idealism, and the Emergence of Analytic Philosophy, Oxford: Clarendon Press.
  • Schilpp, P.A., ed. 1942: The Philosophy of G.E. Moore, Library of Living Philosophers Vol. 4, La Salle: Open Court.

b. Russell and the Early Wittgenstein: Ideal Language and Logical Atomism

Primary Sources

  • Frege, Gottlob. 1879: Concept Script, a formal language of pure thought modeled upon that of arithmetic, tr. by S. Bauer-Mengelberg, in J. van Heijenoort (ed.), From Frege to Gödel: A Source Book in Mathematical Logic, 1879-1931, Cambridge, MA: Harvard University Press, 1967.
  • Frege, Gottlob. 1892: “On Sense and Reference” tr. by M. Black, in Translations from the Philosophical Writings of Gottlob Frege, P. Geach and M. Black (eds.), Oxford: Blackwell, 3rd ed., 1980.
  • Russell, Bertrand. 1905: “On Denoting,” Mind 14: 479-93.
  • Russell, Bertrand. 1908: “Mathematical Logic as Based on the Theory of Types,” American Journal of Mathematics, 30, 222-262. Reprinted in Russell 1956, 59-102.
  • Russell, Bertrand. 1914: “On Scientific Method in Philosophy,” in Russell 1918, 97-124.
  • Russell, Bertrand. 1918-19: “The Philosophy of Logical Atomism,” The Monist 28:495-527 and 29:33-63, 190-222, 344-80; reprinted La Salle, Illinois: Open Court, 1985.
  • Russell, Bertrand. 1918: Mysticism and Logic: and Other Essays, New York: Longmans, Green and Co.
  • Russell, Bertrand. 1944a: “My Mental Development,” in Schilpp, ed. 1944, 3-20.
  • Russell, Bertrand. 1944b: “Reply to Criticisms,” in Schilpp, ed. 1944, 681-741.
  • Russell, Bertrand. 1946: “The Philosophy of Logical Analysis,” from A History of Western Philosophy, London: Allen and Unwin; New York: Simon and Schuster, 1946; reprinted in Dennon and Egner, eds., 1961, pp. 301-307.
  • Russell, Bertrand. 1950: “Is Mathematics Purely Linguistic?,” in Russell 1973, pp. 295-306.
  • Russell, Bertrand. 1956: Logic and Knowledge, Robert Marsh, ed., London: Unwin Hyman Ltd.
  • Russell, Bertrand. 1959: My Philosophical Development, London: Unwin.
  • Russell, Bertrand. 1973: Essays in Analysis, Douglas Lackey, ed., London: George Allen and Unwin Ltd.
  • Russell, Bertrand, and Whitehead, Alfred North. 1910-1913: Principia Mathematica 3 vols. London: Cambridge University Press. Second edition 1925.
  • Wittgenstein, Ludwig. 1922: Tractatus Logico-Philosophicus, tr. C.K. Ogden. London: Routledge & Kegan Paul.

Secondary Sources

  • Kenny, Anthony. 2000: Frege: An Introduction to the Founder of Modern Analytic Philosophy, Blackwell Publishers.
  • Baker, G.P. and Hacker, P.M.S. 1983: “Dummett’s Frege or Through a Looking-Glass Darkly,” Mind, 92, pp. 239-246.
  • Baker, G.P. and Hacker, P.M.S. 1984: Frege: Logical Excavations, Oxford: Blackwell.
  • Baker, G.P. and Hacker, P.M.S. 1987: “Dummett’s Dig: Looking-Glass Archaeology,” Philosophical Quarterly, 37, pp. 86-99.
  • Baker, G.P. and Hacker, P.M.S. 1989: “The Last Ditch,” Philosophical Quarterly, 39, pp. 471-477.
  • Dummett, Michael. 1991: Frege: Philosophy of Mathematics, London: Duckworth.
  • Monk, Ray and Palmer, Anthony (eds.). 1996: Bertrand Russell and the Origins of Analytical Philosophy, Bristol: Thoemmes Press.
  • Reck, Erich (ed.). 2001: From Frege to Wittgenstein: Perspectives on Early Analytic Philosophy, Oxford: Oxford University Press.
  • Pears, D.F. 1967: Bertrand Russell and the British Tradition in Philosophy, London: Collins.
  • Schilpp, P.A. 1944: The Philosophy of Bertrand Russell, Library of Living Philosophers Vol. 5, La Salle: Open Court.
  • Schoenman, R. (ed.) 1967: Bertrand Russell: Philosopher of the Century, London: Allen and Unwin.
  • Tait, William (ed.). 1997: Early Analytic Philosophy: Frege, Russell, Wittgenstein; Essays in Honor of Leonard Linsky, Chicago: Open Court.

c. Logical Positivism, the Vienna Circle, and Quine

Primary Sources

  • Ayer, A.J. 1936: Language, Truth and Logic, London: Gollancz; second edition 1946; reprinted New York: Dover, 1952.
  • Carnap, Rudolf. 1928: The Logical Structure of the World. English trans. published by Berkeley: University of California Press, 1969.
  • Carnap, Rudolf. 1934: “On the Character of Philosophical Problems,” tr. W.M. Malisoff, in Rorty (ed.) 1967, 54-62.
  • Hempel, Carl. 1950: “Problems and Changes in the Empiricist Criterion of Meaning.” Revue Internationale de Philosophie 4:41-63; reprinted in Ayer (ed.) 1959.
  • Quine, W. V. 1936: “Truth by Convention,” in O.H. Lee (ed.), Philosophical Essays for A.N. Whitehead, New York: Longmans; reprinted in Ways of Paradox, New York: Random House, 1966.
  • Quine, W. V. 1951: “Two Dogmas of Empiricism,” Philosophical Review 60: 20-43.
  • Quine, W. V. 1960: Word and Object, Cambridge, MA: MIT Press.
  • Quine, W. V. 1969: Ontological Relativity and Other Essays, New York: Columbia University Press.

Secondary Sources

  • Ayer, A.J. (ed.) 1959: Logical Positivism, Westport: Greenwood Press.
  • Schilpp, P.A. 1963: The Philosophy of Rudolf Carnap, Library of Living Philosophers, Vol. 11, La Salle: Open Court.
  • Schilpp, P.A. The Philosophy of W.V. Quine, Library of Living Philosophers, Vol. 18, La Salle: Open Court.
  • Schilpp, P.A. 1992: The Philosophy of A. J. Ayer, Library of Living Philosophers, Vol. 21, La Salle: Open Court.
  • Sarkar, Sahotra (ed.) 1996: Science and Philosophy in the Twentieth Century: Basic Works of Logical Empiricism, 6 vols., New York & London: Garland Publishing.

d. The Later Wittgenstein, et al.: Ordinary-Language Philosophy

Primary Sources

  • Austin, J.L. 1962: How to Do Things with Words, New York: Oxford University Press.
  • Austin, J.L. 1962: Sense and Sensibilia, London: Oxford University Press.
  • Grice, Paul. 1989: Studies in the Way of Words, Cambridge MA: Harvard University Press.
  • Ryle, Gilbert. 1949: The Concept of Mind, New York: Barnes and Noble.
  • Ryle, Gilbert. 1953: Dilemmas, Cambridge: Cambridge University Press.
  • Strawson, Peter. 1950: “On Referring” Mind, 59: 320-344.
  • Strawson, Peter and Grice, H. P. 1956: “In Defense of a Dogma,” Philosophical Review, 65: 141-58; reprinted in Grice 1989.
  • Wisdom, John. 1931: Interpretation and Analysis in Relation to Bentham’s Theory of Definition, London: Kegan, Paul, Trench, Trubner & Co.
  • Wisdom, John. 1952: Other Minds, Oxford: Blackwell.
  • Wittgenstein, Ludwig. 1953: Philosophical Investigations, tr. G.E.M. Anscombe. Oxford: Blackwell.

Secondary Sources

  • Canfield, J.V. (ed.) 1986: The Philosophy of Wittgenstein, New York and London: Garland Publishing, Inc.
  • Hacker, P.M.S. 1986: Insight and Illusion: Themes in the Philosophy of Wittgenstein, Oxford: Clarendon.
  • Kripke, Saul. 1982: Wittgenstein On Rules and Private Language, Cambridge, MA: Harvard University Press.
  • Urmson, J. O. 1956: Philosophical Analysis: Its Development Between the Two World Wars, London, Oxford, New York: Oxford University Press.

e. The 1960s and After: The Era of Eclecticism

  • Hacking, Ian, 1975: Why Does Language Matter to Philosophy?, Cambridge: Cambridge University Press.
  • Kripke, Saul. 1980: Naming and Necessity, Cambridge, MA: Harvard University Press.
  • Mundle, C. W. K. 1970: A Critique of Linguistic Philosophy, Oxford: Clarendon Press.
  • Gellner, E. 1959: Words and Things: A Critical Account of Linguistic Philosophy and a Study in Ideology, London: Gollancz.

f. Critical and Historical Accounts of Analytic Philosophy

  • Ayer, A. J., et al. 1963: The Revolution in Philosophy, London: Macmillan & Co. Ltd.
  • Ayer, A.J. (ed.) 1982: Philosophy in the Twentieth Century, London: Weidenfeld and Nicolson.
  • Beaney, Michael. 2003: “Analysis,” Stanford Encyclopedia of Philosophy, URL = <http://plato.stanford.edu/entries/analysis/>.
  • Biletzki and Matar (eds.). 1998: The Story of Analytic Philosophy: Plot and Heroes, London and New York: Routledge.
  • Capaldi, Nicholas. 2000: The Enlightenment Project in the Analytic Conversation, Dordrecht, Boston, London: Kluwer Academic Publishers.
  • Charlton, William. 1991: The Analytic Ambition: An Introduction to Philosophy, Oxford and Cambridge: Blackwell.
  • Clarke, D.S. 1997: Philosophy’s Second Revolution: Early and Recent Analytic Philosophy, La Salle: Open Court.
  • Coffa, J.A. 1991: The Semantic Tradition from Kant to Carnap, Cambridge: Cambridge University Press.
  • Cohen, L. J. 1986: The Dialogue of Reason: An Analysis of Analytical Philosophy, Oxford: Clarendon Press.
  • Collingwood, R.G. An Essay on Philosophical Method
  • Corrado, Michael. 1975: The Analytic Tradition in Philosophy: Background and Issues, Chicago: American Library Association.
  • Dummett, Michael. 1993: Origins of Analytical Philosophy, Cambridge, MA: Harvard University Press.
  • Garber, Daniel. 2004: “Philosophy and the Scientific Revolution,” in Teaching New Histories of Philosophy, Princeton: Princeton University Center for Human Values.
  • Glock, Hans-Johann (ed.). 1997: The Rise of Analytic Philosophy, Oxford: Blackwell Publishers.
  • Grafton, Anthony. 2004: “A Note from Inside the Teapot,” in Teaching New Histories of Philosophy, Princeton: Princeton University Center for Human Values.
  • Hanna, Robert. 2001: Kant and the Foundations of Analytic Philosophy, Oxford: Oxford University Press.
  • Mehta, Ved. 1961: Fly and the Fly Bottle: Encounters with British Intellectuals, New York: Columbia University Press.
  • Nagel, Ernest. 1936a-b: “Impressions and Appraisals of Analytic Philosophy in Europe,” The Journal of Philosophy vol. 33, no. 1, 5-24 and no. 2, 29-53.
  • Pap, Arthur. 1949: Elements of Analytic Philosophy. New York: Macmillan.
  • Preston, Aaron. 2004: “Prolegomena to Any Future History of Analytic Philosophy,” Metaphilosophy, vol. 35, no. 4, 445-465.
  • Preston, Aaron. 2005a: “Conformism in Analytic Philosophy: On Shaping Philosophical Boundaries and Prejudices,” The Monist, Volume 88, Number 2, April 2005.
  • Preston, Aaron. 2005b: “Implications of Recent Work on Analytic Philosophy,” The Bertrand Russell Society Quarterly, no. 127 (August 2005), 11-30.
  • Prosch, Harry. 1964: The Genesis of Twentieth Century Philosophy: The Evolution of Thought from Copernicus to the Present, Garden City: Doubleday and Co., Inc.
  • Soames, Scott. 2003: Philosophical Analysis in the Twentieth Century, 2 vols., Princeton: Princeton University Press.
  • Stroll, Avrum. 2000: Twentieth Century Analytic Philosophy, New York: Columbia University Press.
  • Warnock, G.J. 1958: English Philosophy Since 1900, London: Oxford University Press.

g. Anthologies and General Introductions

  • Ammerman, Robert (ed.). 1990: Classics of Analytic Philosophy, Indianapolis: Hackett.
  • Baillie, James (ed.). 2002: Contemporary Analytic Philosophy: Core Readings, 2nd edition, Prentice Hall.
  • Martinich, A. P. and Sosa, David (eds.). 2001a: Analytic Philosophy: An Anthology, Blackwell Publishers.
  • Martinich, A. P. and Sosa, David (eds.). 2001b: A Companion to Analytic Philosophy, Blackwell Publishers.
  • Rorty, Richard (ed.). 1992: The Linguistic Turn: Essays in Philosophical Method, Chicago and London: The University of Chicago Press.

Author Information

Aaron Preston
Email: Aaron.Preston@valpo.edu
Valparaiso University
U. S. A.

Consequentialism

Consequentialism is the view that morality is all about producing the right kinds of overall consequences. Here the phrase “overall consequences” of an action means everything the action brings about, including the action itself. For example, if you think that the whole point of morality is (a) to spread happiness and relieve suffering, or (b) to create as much freedom as possible in the world, or (c) to promote the survival of our species, then you accept consequentialism. Although those three views disagree about which kinds of consequences matter, they agree that consequences are all that matters. So, they agree that consequentialism is true. The utilitarianism of John Stuart Mill and Jeremy Bentham is a well-known example of consequentialism. By contrast, the deontological theories of John Locke and Immanuel Kant are nonconsequentialist.

Consequentialism is controversial. Various nonconsequentialist views are that morality is all about doing one’s duty, respecting rights, obeying nature, obeying God, obeying one’s own heart, actualizing one’s own potential, being reasonable, respecting all people, or not interfering with others—no matter the consequences.

This article describes different versions of consequentialism. It also sketches several of the most popular reasons to believe consequentialism, along with objections to those reasons, and several of the most popular reasons to disbelieve it, along with objections to those reasons.

Table of Contents

  1. Basic Issues and Simple Versions
    a. Introduction to Plain Consequentialism
    b. What is a “Consequence”?
    c. Plain Scalar Consequentialism
    d. Expectable Consequentialism and Reasonable Consequentialism
    e. Dual Consequentialism
    f. Rule Consequentialism
  2. Two Simple Arguments for Consequentialism
    a. Only Results Remain
    b. Love
  3. Arguments Against Consequentialism
    a. Partiality
    b. Equality
    c. Personal Rights
    d. Human Thinking
  4. Further Arguments for Consequentialism
    a. Reasons for Action
    b. It is Wrong to Choose the Worse Over the Better
    c. The Ideal Spectator
    d. What is Desirable
    e. Common Sense
  5. References and Further Reading
    a. Classic Works
    b. Recommended Collections
    c. Other Recommended Works

1. Basic Issues and Simple Versions

a. Introduction to Plain Consequentialism

There is disagreement about how consequentialism can best be formulated as a precise theory, and so there are various versions of consequentialism. Almost all lack standard names, so most of the names used here were invented for this article. Perhaps the most standard precise version of consequentialism is Plain Consequentialism.

Plain Consequentialism: Of all the things a person might do at any given moment, the morally right action is the one with the best overall consequences. (If there is no one best action because several actions are tied for best consequences, then of course any of those several actions would be right.)

Other versions of consequentialism may be generated by making small changes in this theory, as we shall see, so long as the new theory stays faithful to the broad idea that morality is all about producing the right kinds of overall consequences.

Consequentialism does not itself say what kinds of consequences are good. Hence people can agree on consequentialism while disagreeing about what kind of outcome is good or bad. If you happen to be in charge of setting speed limits, you might be thinking that a bad result is a death: the fewer deaths, the better. But the people who die in accidents were all going to die eventually anyway, so a fatal accident does not mean there are more deaths than there otherwise would have been. Perhaps, then, what counts as a good result is the amount of life that the action adds or subtracts in the world? That would explain why fatal accidents are bad, since an early death means less life. But if quantity of life were the only kind of good result, then a long happy life would be no better than a long unhappy life.

The most traditional view among Consequentialists is that the only kind of result that is good in itself is happiness. The picture is roughly as follows. Suppose you are on average just as happy as I am, but you live twice as long. Then you will have had twice as much happiness as I had. So the total happiness we had is three times the happiness I had. Or suppose you are on average twice as happy as I am, and we live equally long. Here too you end up having had twice as much happiness as I had, so the total happiness we had is three times the happiness I had. Or suppose you are unhappy instead: on average just as unhappy as I am happy and for the same amount of time. Unhappiness can be thought of as negative happiness, so that the total happiness we two have in this third case is zero. Now, to find the goodness of the consequences of an action, simply take the total amount of happiness in those consequences. The more happiness there is, the better. Note that if what matters is the total amount, then it does not matter whether the happiness belongs to you or your friend or a stranger—or even a dog, if dogs can have happiness. And it does not matter whether the happiness will happen today or next year. See Bentham (1789); Den Uyl & Machan (1983).
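The arithmetic in the three cases above can be checked with a short sketch. It assumes, as the passage seems to, that a person's happiness is the product of an average level and a duration; the function name and the units are hypothetical.

```python
def total_happiness(average_level, duration):
    """Happiness over a stretch of life, taken as average level times duration."""
    return average_level * duration

# Arbitrary, hypothetical units: my average level is 1 over 1 unit of time.
mine = total_happiness(1, 1)

# Case 1: you are just as happy on average but live twice as long.
case1_total = mine + total_happiness(1, 2)
# Case 2: you are twice as happy on average and we live equally long.
case2_total = mine + total_happiness(2, 1)
# Case 3: you are as unhappy as I am happy, for the same time (negative happiness).
case3_total = mine + total_happiness(-1, 1)

print(case1_total, case2_total, case3_total)
```

In the first two cases the combined total is three times my own happiness, and in the third it is zero, matching the text.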

If we take the above view that the good is happiness, and plug it into Plain Consequentialism, we get the view that the right action is the one that causes the most happiness—more than would have been caused by any of the available alternative actions.

On this view, a problem with setting a very high speed limit is that it causes early deaths, which reduce the amount of life and thus reduce the amount of happiness there will be. But a problem with setting a very low speed limit is that driving very slowly takes up time. If people can get where they are going more quickly, they will probably use the time they saved to do things that will add happiness to their lives or the lives of others. Consequentialism suggests that to set a speed limit rightly, you must balance such considerations accurately.
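The selection rule stated by Plain Consequentialism, including its clause about ties, can be sketched as follows. The happiness totals attached to each speed limit are invented numbers used only to show the structure of the choice; they are not estimates of anything.

```python
def right_actions(total_happiness_by_action):
    """Return every available action whose overall consequences are tied for best."""
    best = max(total_happiness_by_action.values())
    return sorted(a for a, v in total_happiness_by_action.items() if v == best)

# Hypothetical net-happiness totals for three candidate speed limits.
speed_limits = {"very low limit": 40, "moderate limit": 70, "very high limit": 55}
print(right_actions(speed_limits))

# When several actions are tied for best, any of them counts as right.
print(right_actions({"A": 10, "B": 10, "C": 3}))
```

The hard part in practice is producing the totals, not selecting among them, which is why the text stresses balancing the considerations accurately.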

b. What is a “Consequence”?

As mentioned above, in consequentialism the “consequences” of an action are everything the action brings about, including the action itself. That is, they include (a) the action itself, and (b) everything the action causes. What, then, do these two kinds of consequence have in common that makes them both “consequences”? If there is an answer, perhaps it is something like this: both A itself and the things A causes are things that happen if you do A rather than the alternatives to A.

Another important point about “consequences” is that the actual “consequences” of an action, beyond the action itself, need not be actual outcomes. (Before explaining this point, we should note that consequentialism on most versions is a theory about the moral quality of actions. And it is commonly thought that the main kinds of actions that can be morally right or wrong are intentional actions—things we do deliberately, not things like hiccups or small twitches. Hence in the context of consequentialism, perhaps “actions” should normally be understood to mean “intentional actions.”) Suppose I will bake a cake if you win a coin toss, and you are now deciding whether to toss the coin or just walk away. Eventually you decide to toss the coin, you win, and I bake the cake. Was the cake a consequence of your action of tossing the coin? Arguably it was not. For you could have tossed the coin in many slightly different ways, and in many slightly different positions. Your intentional action was to toss the coin, not to toss the coin in the precise manner and position in which you ended up tossing it. But it was the precise manner and position that made you win. Therefore, your intentional action of tossing did not make you win. (But see Tännsjö (1988), 41ff.) Hence, arguably, the consequence of your intentional action was a 50% chance of a cake—not a cake, not half a cake, but a 50% chance of a cake. Perhaps most consequences of most actions we decide on are like that: not actual outcomes, but only probabilities of outcomes.

The usual Consequentialist view is that a 50% chance of a certain good outcome is half as good as that good outcome itself, and a 10% chance is one tenth as good.
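This weighting of outcomes by their probability can be written out as a minimal sketch. The value assigned to the cake is a hypothetical number, and exact fractions are used so that the proportions are easy to read.

```python
from fractions import Fraction

def value_of_chance(probability, outcome_value):
    """Value of a chance of an outcome: the probability times the outcome's value."""
    return probability * outcome_value

cake = Fraction(10)  # hypothetical value of the good outcome itself

half_chance = value_of_chance(Fraction(1, 2), cake)    # a 50% chance of the cake
tenth_chance = value_of_chance(Fraction(1, 10), cake)  # a 10% chance of the cake

print(half_chance / cake, tenth_chance / cake)
```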

Hence it would be misleading to say that consequentialism is the view that morality is all about results. When your boss says she cares only about “results,” that commonly means she does not care whether your gamble had a 1% or a 99% chance of succeeding. She cares only about whether it actually succeeded—even though, as explained above, the success, when it happens, is arguably not a “consequence” of your intentional action at all.

c. Plain Scalar Consequentialism

Plain Consequentialism is a theory about which actions are right. Its standard is high. It says that among all the very many things we could do at any given time, only one or a very few of them are right. The implication is that the rest of them are wrong. So if your action does vastly more good than what most other people would do in similar circumstances, but you could have chosen an action that would have done even a little more, Plain Consequentialism says that what you did was morally wrong. Plain Scalar Consequentialism is different.

Plain Scalar Consequentialism: Of any two things a person might do at any given moment, one is better than another to the extent that its overall consequences are better than the other’s overall consequences.

That is, if A’s consequences are a little better than B’s, then A is morally a little better than B; and if A’s consequences are much better than C’s, then A is morally much better than C. This theory implies that the actions with the best consequences are morally best, but it does not say that if you do the second-best you are doing something morally wrong. It says nothing about right and wrong. See Singer (1977); Norcross (1997).
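The contrast between the two theories can be made concrete with a small sketch. The values assigned to A, B and C are invented, and the function names are hypothetical labels for the two ways of evaluating an action.

```python
# Hypothetical values of the overall consequences of three available actions.
consequences = {"A": 100, "B": 98, "C": 20}

def plain_verdict(action, values):
    """Plain Consequentialism: only actions tied for best are right."""
    return "right" if values[action] == max(values.values()) else "wrong"

def scalar_comparison(x, y, values):
    """Plain Scalar Consequentialism: how much better x is than y (negative if worse)."""
    return values[x] - values[y]

print(plain_verdict("B", consequences))           # B falls just short of the best
print(scalar_comparison("A", "B", consequences))  # A is a little better than B
print(scalar_comparison("A", "C", consequences))  # A is much better than C
```

The plain theory issues a verdict on B, while the scalar theory reports only the size of the gap and leaves the question of right and wrong unanswered.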

d. Expectable Consequentialism and Reasonable Consequentialism

Of course, we cannot know the overall consequences of our actions. For example, the setting of a speed limit will help some people and hurt others, but there is no way to know in advance who the people will be, what projects will be helped or hindered, and how the further effects of all these things will play out over the centuries. You cannot know all that before you act (or after).

Is that point an objection to consequentialism? On the one hand, one might think it is an objection, since we are responsible for doing what is morally right and so we must be able to know what is morally right. On the other hand, one might think it is impossible to know what is morally right; morality seems permanently controversial and mysterious. It is unclear, then, whether the standard to which we should hold theories of morality is that they must explain why morality is easy to know about or why morality is terribly hard to know about!

The fact that we do not know the overall consequences of our actions makes room for further versions of consequentialism. Suppose I donate $100 to Malaria Aid, but it turns out this group aids malaria and I have funded an outbreak. Now, Plain Consequentialism implies that what I did is morally wrong, and Plain Scalar Consequentialism implies that it is morally very bad. But you might think that whether my action was morally wrong depends on what consequences it would have been reasonable for me to expect, not on the actual consequences. If the evil group was so cleverly deceptive that even the Better Business Bureau’s web site said they do good work fighting malaria, then you may think the damage done by my money was not my fault. So you may prefer a different version of consequentialism.

Expectable Consequentialism: The morally right action is the action whose reasonably expectable consequences are best. (There can also be a scalar version of this view and of the others introduced below.)

Reasonable estimates of consequences seem to involve a different kind of probability from that discussed in 1.b above. For example, suppose there is a machine that tosses a fair coin with such precision that whenever you press the Toss button, the coin always comes up heads. Now, suppose that you do not happen to know whether this machine always yields heads or always tails. (Or perhaps you do not even know that it is a precision machine.) When you press Toss, your action will have heads as a consequence, but you do not know that. So far as you can tell, heads and tails are equally likely, even if objectively there is a 100% chance of heads. This point can be expressed by saying that there is a 50% epistemic probability of heads, or that the reasonably expectable consequences of pushing the Toss button include a 50% epistemic chance of heads. For purposes of Expectable Consequentialism, a 50% epistemic chance of a good result is half as good as a 100% probability of that same result.
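The difference between the two kinds of probability in the coin-machine example can be sketched as below. The assumption that heads is the good result with value 1 and tails has value 0 is made only for illustration.

```python
from fractions import Fraction

def expected_value(probabilities, values):
    """Sum of each outcome's value weighted by its probability."""
    return sum(probabilities[outcome] * values[outcome] for outcome in probabilities)

# Hypothetical valuation: heads is the good result.
values = {"heads": Fraction(1), "tails": Fraction(0)}

# Objective chances: the precision machine always yields heads.
objective = {"heads": Fraction(1), "tails": Fraction(0)}
# Epistemic chances: so far as the agent can tell, either result is equally likely.
epistemic = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}

print(expected_value(objective, values), expected_value(epistemic, values))
```

The same action thus receives a different evaluation depending on which distribution is used, and Expectable Consequentialism uses the epistemic one.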

But Expectable Consequentialism has a strange implication. Suppose someone from Tuberculosis Aid comes to my door, says only, “Would you give to Tuberculosis Aid?” and hands me a pamphlet, which explains their evil plans on page 2. The reasonable way to estimate consequences would involve at least glancing through the pamphlet, but I am not interested. I simply assume that this group fights tuberculosis, and I do not look at the pamphlet because I do not care. I do not donate. Thus, without reasonably thinking about my choice, I have done what it would have been reasonable to estimate would have the best results. So Expectable Consequentialism says my thoughtless selfish action was morally right. If you do not want to praise my conduct, you might prefer a new version of consequentialism:

Reasonable Consequentialism: An action is morally right if and only if it has the best reasonably expected consequences.

Reasonable Consequentialism says that for an action of mine to be right, I must actually come to a reasonable conclusion beforehand about the consequences. Expectable Consequentialism says that an action can be right even if I do not think reasonably about it at all, so long as it is the action I would have estimated to have the best consequences if I had done a reasonable job of making an estimate. See Smart (1961).
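One way to read the contrast just drawn is sketched below, using the Tuberculosis Aid example. The estimates are invented, and treating "thinking reasonably" as a simple true-or-false flag is a simplifying assumption of this sketch, not something the theories themselves specify.

```python
def expectably_right(action, reasonable_estimates):
    """Expectable Consequentialism: right if a reasonable estimate would rank it best."""
    return reasonable_estimates[action] == max(reasonable_estimates.values())

def reasonably_right(action, reasonable_estimates, agent_deliberated_reasonably):
    """Reasonable Consequentialism, on this reading: the agent must also have
    actually reached a reasonable conclusion beforehand."""
    return agent_deliberated_reasonably and expectably_right(action, reasonable_estimates)

# Hypothetical estimates a reasonable person would form after reading the pamphlet.
estimates = {"donate": -50, "do not donate": 0}

print(expectably_right("do not donate", estimates))
print(reasonably_right("do not donate", estimates, agent_deliberated_reasonably=False))
```

My thoughtless refusal to donate passes the first test and fails the second, which is the difference the two theories are meant to capture.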

e. Dual Consequentialism

Reasonable Consequentialism may be too simple. There was something right about my not donating. You might want to say that I fortunately did the right thing, but that my action was morally wrong. For another example, suppose I am sick and you are a doctor. You do a thorough and brilliant job of diagnosis and end up giving me the pill any responsible doctor would have to choose for the symptoms I display. But the pill turns out to harm me, because I have a rare and previously unknown virus. Now in one sense your prescription was wrong, but in another sense it was morally right. Dual Consequentialism can say both of those things. See Sidgwick (1907); Brink (1986).

Dual Consequentialism: The word “right” is ambiguous. It has a moral sense and an objective sense. (i) The objectively right action is the action with the best consequences, and (ii) the morally right action is any action with the best reasonably expected consequences.
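The two senses of “right” can be illustrated with the doctor example. All of the numbers are invented: the pill is assigned a good reasonably expected value but a bad actual value, because of the unknown virus.

```python
def best(actions, score):
    """Return the set of actions with the highest score."""
    top = max(score[a] for a in actions)
    return {a for a in actions if score[a] == top}

actions = ["prescribe pill", "withhold pill"]

# Hypothetical values of what actually happens, given the rare unknown virus.
actual = {"prescribe pill": -5, "withhold pill": 0}
# Hypothetical values a responsible doctor would reasonably expect from the symptoms.
reasonably_expected = {"prescribe pill": 8, "withhold pill": -3}

objectively_right = best(actions, actual)
morally_right = best(actions, reasonably_expected)

print(objectively_right, morally_right)
```

Because the two senses are scored against different inputs, they can pick out different actions without contradiction.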

f. Rule Consequentialism

If most people who live along a short river toss their garbage in the river, so that it is always full of garbage, then your tossing your own garbage in the river makes no difference to the river, and it saves the inconvenience of driving a few miles to the dump. So consequentialism would seem to support your tossing your garbage in the river. But if everyone hauled their garbage a few miles to the dump instead, in a year or two everyone would have a nice river, which is much more valuable to each person than the minor convenience of not having to haul one’s garbage to the dump. In this case, if each person follows consequentialism, the results are predictably worse than if everyone does something else instead. Thus consequentialism seems to defeat its own purpose.

Hence another kind of theory has been suggested, which might or might not be regarded as a version of consequentialism.

Rule Consequentialism: An action is morally right if and only if it does not violate the set of rules of behavior whose general acceptance in the community would have the best consequences—that is, at least as good as any rival set of rules or no rules at all.

(The name ‘Rule Consequentialism’ is an established term for many variant theories similar to the above.) On this theory, an action is not right or wrong because of its own consequences; rather, it is right or wrong depending on whether it violates the collective rules that would have the best consequences. According to Rule Consequentialism, the right thing for each person in the community near the river to do is to follow the rule, “Throw garbage in the dump, not in the river.” Even if nobody else is going to the dump, and your going to the dump causes only inconvenience and no benefit, Rule Consequentialism says to take your garbage to the dump because that is what the best set of community rules would require.
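
The contrast between judging single actions and judging rules in the river case can be put in numbers. The Python sketch below is a toy model under stated assumptions: the population size, the cost of hauling, the value of a clean river, and the idea that the river is clean only once a fixed number of residents haul are all invented for illustration.

```python
# Toy model of the river case. Every number here is an illustrative assumption.
RESIDENTS = 100            # people living along the river
HAUL_COST = 1.0            # inconvenience, per person, of driving to the dump
CLEAN_RIVER_VALUE = 10.0   # value of a clean river to each resident
HAULERS_NEEDED = 90        # assumed number of haulers required for a clean river


def total_welfare(haulers: int) -> float:
    """Community-wide welfare when `haulers` residents use the dump."""
    river_benefit = CLEAN_RIVER_VALUE * RESIDENTS if haulers >= HAULERS_NEEDED else 0.0
    return river_benefit - HAUL_COST * haulers


def effect_of_one_more_hauler(other_haulers: int) -> float:
    """Act-level view: change in total welfare if one more person hauls."""
    return total_welfare(other_haulers + 1) - total_welfare(other_haulers)


def best_general_rule() -> str:
    """Rule-level view: compare universal acceptance of each rule."""
    everyone_hauls = total_welfare(RESIDENTS)
    nobody_hauls = total_welfare(0)
    return "haul to the dump" if everyone_hauls > nobody_hauls else "toss in the river"
```

With these numbers, `effect_of_one_more_hauler(0)` is negative, so hauling alone makes things slightly worse, yet `best_general_rule()` returns the hauling rule because universal hauling yields far more total welfare than universal tossing.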

Rule Consequentialism in one or another form has received a great deal of discussion. But since many people regard it as not quite in the spirit of consequentialism and many of the issues surrounding Rule Consequentialism are unique to it, we shall say little more about it here. See Brandt (1979); Hooker et al. (2000).

There are more versions of consequentialism than are presented above. See Adams (1976); Railton (1988); Goodin (1995); Mulgan (1997); Murphy (1997). Some others are presented below, and anyone can invent new ones by following the instructions given in section 1a.

2. Two Simple Arguments for Consequentialism

In Section 2 we shall look at two initial reasons to think consequentialism is true and some worries about those reasons. In Section 3 we shall discuss reasons to think consequentialism is false and some worries about those reasons. In Section 4 we shall return to more complex reasons to think consequentialism is true and some worries about those reasons.

a. Only Results Remain

Actions are transient things, soon gone forever. Hence, one might think, in the long run only the results remain, so the only thing that really matters about an action is its results. So consequentialism must be true.

But this reason for favoring consequentialism seems confused. For one thing, consequentialism holds that actions do matter, because they are among their own consequences. More importantly, in the long run no result remains, or at least no earthly result. Pleasures pass by as quickly as actions. People too pass away, and planets evaporate. If only permanent things mattered, then your happiness and misery in this life would not matter at all; but surely they do matter.

b. Love

Arguably consequentialism is implicit in the very familiar conception of morality, shared by many cultures and traditions, which holds that moral perfection means loving all people, loving others as we love ourselves. For what is meant by “love” here? Forming many romantic attachments hardly seems like the path toward perfection; nor perhaps does the widespread spiritual exercise of focusing on wishing people well without actually helping them. If there is truth in the saying that we should “love all people,” perhaps it is simply that we should actively do what is good for people and not bad for them, as much as possible. If we try to produce the greatest total benefit, then we are loving “all people” in the sense that we are being impartial, caring for people in general, promoting each person’s well-being insofar as that is at stake in our actions and insofar as our helping one does not hurt others more.

A similar line of thought starts from the idea that morality is at bottom two things. First, abstractly, to be moral is to do one’s rational best to do what is objectively right. Second, more concretely, to be moral is to care about people. Now, rationality and objectivity are impartial; they do not favor one person over another. Hence to be moral is to care about people equally or impartially, so far as one can, which means trying to benefit people as much as one can. So consequentialism is correct.

One worry about these arguments is that if it happens that the most efficient way for you to help people is to send as much money as possible to help desperately poor people you do not know, then your following consequentialism may involve thinking of the people you know mainly as potential sources of money. And if someone thinks of the people she knows that way, it seems a stretch to call her a “loving” or even a “caring” person.

3. Arguments Against Consequentialism

We turn now to some of the most popular reasons to think consequentialism is false and some possible replies to these attacks.

a. Partiality

It is in the spirit of consequentialism to look at goodness ultimately from an impartial, impersonal point of view. For example, a Consequentialist who thinks the kind of consequence that matters is happiness is unlikely to think that one person’s happiness is more important than another’s (so long as the amounts of happiness in question are the same). Hence consequentialism tends to hold that in deciding what to do, you ought to give just as much weight to the needs of total strangers as to the needs of your friends, your family, and even yourself. And since your dollar can usually do more good for desperate refugees than for yourself or your friends, consequentialism seems to hold that you ought to spend most of your dollars on strangers. But when you are deciding whom to spend your money on, common sense seems to hold that you are normally morally permitted to favor yourself over strangers and often morally required to favor your children over strangers. Hence consequentialism conflicts with common sense.

One reply to this objection is that since you know better how to help yourself and those near to you, you will get better results if you focus on them rather than people strange to you or out of view. Further, it is more natural for you to want to help those closer to you, so if you start projects to help your own rather than strangers, you are more likely to follow through and less likely to burn out or lose track of your purpose. Hence the consequences will probably be better. Further, those near to you are counting on your help, so that if you stop helping them their plans will be disrupted, while strangers will not be hurt in that way if you do not spend money on them. Further, your ability to think well and act effectively depends in many ways on your having strong relationships with a few people near to you, so that your spending a bit of time or money on these people not only gives them directly a bit of help or happiness, it also indirectly supports all your other projects now and in the future. For all these reasons it would seem that even a consequentialism that impartially counts each person’s happiness or well-being as being of equal value would advise each of us to be somewhat partial to herself and those near to her, because in that way she can produce the best impartial results. And perhaps that is why common sense favors some partiality. See Singer (1972); Jackson (1991); Kidder (2003).

A different kind of reply to the objection is to adjust consequentialism itself so that it is no longer impartial. Here are two simple examples of such theories:

Egoistic Consequentialism: Of all the things a person might do at any given moment, the morally right action is the one that has the best consequences for that person.

Friendly Consequentialism: Of all the things a person might do at any given moment, the morally right action is the one that has the best consequences for that person and her friends.

Theories like these, which count the same kinds of consequence differently for each person acting, are sometimes called “agent-relative” forms of consequentialism, though one might wonder whether they are in the spirit of consequentialism at all. See Sen (1982), Nagel (1986), Scheffler (1994), Bennett (1989), Scheffler (1989), Brink (1986), and Skorupski (1995).

b. Equality

For consequentialism, the simplest way to conceive of the goodness of consequences is in terms of how much they contain of something that is considered good, such as happiness or personal well-being, regardless of who gets it. What matters is the total amount, not who gets what. Such a conception is egalitarian in the sense that it counts every bit of your happiness as being just as important as the same sized bits of my happiness. But one could object that in another sense, such a conception is not egalitarian because it does not care whether happiness is distributed equally or unequally among people. If the greatest total can be created only by exploiting the miserable to make the happy even happier, then such consequentialism would seem to say that you should do it. But common sense may rebel against that idea as being unfair or unjust. Hence consequentialism is wrong. See Le Guin (1973); Rawls (1999); Harsanyi (1977).

One reply to this objection is that our intuitive sense of fairness is not mainly concerned with distributions of ultimate goods like happiness or well-being. Rather, fairness is traditionally concerned with distributions of what we might call “external goods” – goods such as money, status, power, and political rights. These are good because of the further goods that they tend to produce. Now, serious inequality in external goods tends to reduce the total happiness. One reason is that, in general, external goods tend to produce more happiness or well-being when they go to people who have less of these goods than when they go to people who have more. For example, an extra dollar does more good for a poor person than for a rich person. That is a reason to think that promoting equality in external goods will tend to do more total good than promoting inequality. Another reason is that when there is more equality in the main external goods, the basic conditions of people’s lives will be more similar and people will find it easier to understand and sympathize with each other. Hence actions and policies that promote equality in external goods will cause more happiness by promoting a sense of community. Further, institutions that secure basic external equalities, or that aim to protect whoever is poorest and weakest, tend to give everyone more security. This makes life nicer and helps people be concerned for each other rather than fearful of each other, and they will therefore do more good for each other. Actions that promote egalitarian institutions, then, would tend to do the most good overall. Perhaps these points are the basis of our sense of the importance of equality.
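
The claim that an extra dollar does more good for a poor person than for a rich one can be checked with a small calculation. The Python sketch below assumes, purely for illustration, that happiness grows with the logarithm of wealth; the source does not commit to any particular function, and any function with diminishing returns would give the same qualitative result.

```python
import math


def happiness(wealth: float) -> float:
    """Assumed happiness function with diminishing returns (logarithmic)."""
    return math.log(wealth)


def total_happiness(wealths: list[float]) -> float:
    """Sum of happiness across people, counting everyone equally."""
    return sum(happiness(w) for w in wealths)


def gain_from_transfer(giver: float, receiver: float, amount: float = 1.0) -> float:
    """Change in total happiness when `amount` moves from giver to receiver."""
    before = total_happiness([giver, receiver])
    after = total_happiness([giver - amount, receiver + amount])
    return after - before
```

Under this assumption, `gain_from_transfer(1000.0, 10.0)` is positive (moving a dollar from rich to poor raises the total), while `gain_from_transfer(10.0, 1000.0)` is negative.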

A different kind of reply to the objection is to propose that one of the ultimate standards for goodness of consequences should be equality. One might propose, for example, that the consequences of an action are good insofar as they promote the total happiness and promote equality of happiness or of other goods. See Sidgwick (1907). However, once one introduces such a complex standard of goodness for consequences, questions arise as to how to rate the relative importance of the parts of the standard and about how such a view can be given theoretical elegance.

c. Personal Rights

Consequentialism may ask us to meddle too much in other people’s business. For example, perhaps we can do the most good overall if we forcibly stop people from wasting their time and energy on pointless or harmful things like driving SUVs, watching television, eating meat, following sports, and so on. See Frey (1984).

For a more extreme example of meddling, suppose that by using your grandmother’s pension to contribute to efficient and thoughtful charities you can develop permanent clean water supplies for many distant villages, thus saving hundreds of people from painful early deaths and permitting economic development to begin. You need only keep her bound and gagged in the cellar and force her to sign the checks. Consequentialism would seem to say that you should do this, but moral common sense says that you should not. Hence consequentialism is opposed to common sense and is probably wrong.

For another example, suppose you are a surgeon with five patients, each about to die for lack of a certain medicine that you can obtain (in sufficient quantity) only by killing and grinding up a sixth patient. Should you do it? Consequentialism says you should do this; but moral common sense says that you should not. Hence consequentialism is opposed to common sense and so is probably wrong. See Foot (1967).

Now, one reply to the extreme examples is that such opportunities are extremely unusual. (At least that is true of the surgery example.) Moral common sense is shaped by and for the demands of ordinary moral life and so common sense may not be very reliable in odd cases. Hence the fact that consequentialism disagrees with common sense about odd cases is no disproof of consequentialism.

Another reply to the extreme examples is to point out that although they rely on secrecy, they overlook secrecy’s consequential drawbacks. To keep a big secret, you must actively mislead and deceive people and keep them at a distance. Continued deception about a serious matter is difficult, so at the outset you must take into account the chance that you will fail or give up. See Jackson (1991). Continued difficult deception uses up mental resources. Hence if you have such a secret, your further projects will be more poorly chosen, designed, and carried out. Also, if you have important secrets, you may find it hard to have ordinary trust for others; you may become somewhat paranoid and ineffective. Further, if you have a big secret that would repel nice honest people, any nice honest person who learns your secret will not want to be your friend. Anyone who does not know your secret will not really know you and hence cannot be your real friend. But we need nice honest friends if we are to be effective doers of good in the long run. We need them for practical help, for mental health, and to help us see ourselves clearly. We need to see ourselves clearly in order to do good effectively in the long run. Now, if you are the sort of person who actually would send money to save distant strangers, anything that cripples your efforts will hurt many people. Hence the reasonable expectation is that embezzling your grandmother’s checks would have terrible consequences. And if you are a skilled surgeon, anything that hampers your operations will hurt people. Hence the reasonable expectation is that harvesting the healthy patient would have bad consequences. A similar argument might be made regarding almost any scheme that would horrify nice honest people.

A more general reply to the claim that consequentialism advises us to meddle in other people’s business is that even where secrecy would not be involved, there are Consequentialist reasons for you to avoid direct meddling with others’ private spheres and personal affairs. For one thing, each of us is in a better position to understand her own affairs than you are and more naturally and reliably concerned than you are to make sure that her own affairs are carried out well. If you get involved in meddling, can you trust yourself to meddle in the right direction and with adequate care? If you want to do good for me, doing the sorts of things that are normally thought of as violating my personal rights is probably a bad bet. That does not mean consequentialism tells you to leave me entirely alone. Consequentialism can still tell you to give me resources or opportunities, or to help me with my projects, or to help improve the laws of our community.

Further, it is important that people be free to make decisions for themselves, even poor decisions, because that is the only way that people develop strength of character and because constant experimentation is the only way humanity learns about the various possibilities of life. Hence consequentialism would seem to ask us to support laws that protect personal freedom against excessive interference by our neighbors or our government. See Mill (1859).

A different kind of reply to the objection is to propose a new standard for the goodness of consequences. One might propose, for example, that an action is good insofar as it decreases the amount of meddling in the world. Or one might propose instead that an action is good insofar as it causes less meddling and more total happiness. Of course, once one introduces such a complex standard of goodness for consequences, questions arise about how to rate the relative importance of the parts of the standard and about how such a view can be given theoretical elegance. A further worry about this new proposal is that it still does not directly tell us not to meddle. For if we can minimize the total amount of meddling in the long run by meddling today (perhaps by spying on terrorism suspects or by privately bombing the citizens of aggressive countries), this new theory tells us to do so. See Sen (1982).

d. Human Thinking

Consequentialism seems to tell us to make all our decisions by thinking about overall consequences. But that way of thinking about life is, one might think, inhuman and immoral. When someone asks you a question, you should not stop to calculate the consequences before deciding whether to answer truthfully. If you decide by looking to the consequences, you are not really an honest person. Also, when you are about to follow through on a project you have started, you should not stop to calculate the overall consequences anew before you proceed. A sane person will decide on a project and then simply follow through, unless some new situation arises. Anyone who stops to calculate consequences before taking any step to fulfill a commitment is not a person of integrity. And what moves you to spend an hour with your friend or spouse or child should not be impartial calculations about the overall impact on the world at large. If you decide by looking to the overall consequences, you do not really love that person. Therefore consequentialism is an inhuman and immoral theory and must be wrong. See Williams (1973); Williams (1981); Stocker (1976).

Now, this objection does not directly apply to Plain Consequentialism or Plain Scalar Consequentialism, for these theories do not say that we should think about consequences. On the contrary, if you think in the inhuman way described in the objection, your plans and your relationships are unlikely to go well, so Plain versions of consequentialism tend to oppose that way of thinking. Such thinking would be action that has bad consequences. See Bales (1971), Railton (1994).

Nor does the objection apply to Rule Consequentialism. Rule Consequentialism suggests that we should evaluate rules of behavior by asking what the consequences would be if everyone accepted this or that rule, but does not say that the rightness of actions has anything to do with the consequences of those actions themselves. See Rawls (1955).

The objection does, however, directly attack Reasonable Consequentialism and Dual Consequentialism, because these theories say that an action is morally wrong unless we have a reasonable estimate of its consequences.

The defender of Reasonable or Dual Consequentialism might argue that the objection has misunderstood what it is to have a reasonable estimate of an action’s consequences. Perhaps it does not involve explicitly thinking about the consequences at all. As I proceed to feed my cat, I almost never think about the consequences of doing so versus not doing so, but surely it would be wrong to say that I have no view or that my view is not reasonable.

Another way of replying to the objection is to propose yet another version of consequentialism.

Double Consequentialism: The word “right” is ambiguous. It has a moral sense and an objective sense. (i) The objectively right action is the action with the best consequences, and (ii) the morally right action is any action one reasonably estimates to be objectively right.

This Double Consequentialism differs from the Dual Consequentialism of 1.e above only in point (ii), on the morally right action. Where Dual Consequentialism had said that the morally right action is “any action with the best reasonably expected consequences,” Double Consequentialism says the morally right action is the action one reasonably estimates to be objectively right. To see the difference in principle between these theories, suppose there is a somewhat reliable authority on what specific kinds of actions are objectively right. For example, suppose God, who knows all the consequences, has announced that certain kinds of things are right. Or suppose a society’s conventional views about what is right and wrong reflect centuries of experience about what tends to cause trouble. Or suppose the recommendation that comes from your friend, your mother, your heart, or your prior resolution, reflects insight into the implications of your action that would not be reflected in the conscious estimates of consequences you might be able to work up on the spur of the moment. Further, suppose that God, society, your friend or your heart has sufficient authority on the points it addresses that the most reasonable way for you to estimate which of your own options are objectively right is to trust that authority. If there is such an authority, then actions one chooses by deferring to the authority may be morally right according to Double Consequentialism even if they are morally wrong according to Dual Consequentialism.

For example, suppose Paul is considering stealing money from his grandmother to help the poor. So far as he can reasonably guess, that scheme would have the best overall consequences. But he remembers that stealing is generally regarded as wrong. He may or may not find consequentialism plausible, but in any case he knows he does not have a solid theoretical understanding of rightness; so he reasonably decides to trust his community’s confident view and does not pursue the scheme. Double Consequentialism says his choice is morally right, even though his decision was not based on estimates of consequences and went against his estimates.

One might object that if the objectively right action is the one whose consequences are best, then general social opinion cannot be an authority on objective rightness, even on those issues where the general opinion is clear. For general social opinion does not agree that the objectively right action is the one whose consequences are best.

But this objection assumes that an authority on the question whether an action is objectively right would have to know exactly what objective rightness is. That assumption may be mistaken, because it is not true that an authority on whether something has a certain feature has to know exactly what that feature is. For example, suppose that many years ago, before anyone knew that gold is made of atoms or that it is the element with atomic number 79, Jack and Jill were hiking in unclaimed land and came upon some heavy shiny lumps. Jack had no idea how to identify gold. But Jill had handled gold a few times before and could make a good guess about whether the lumps were really gold. For the moment, Jill was an authority for Jack on whether these lumps were gold. It was reasonable for him to rely on her imperfect judgment, even though neither of them knew quite what gold is.

Since Double Consequentialism does not imply that you should estimate the consequences of your everyday actions, it seems to escape the objection that consequentialism requires inhuman and immoral thinking.

4. Further Arguments for Consequentialism

a. Reasons for Action

One argument for consequentialism begins from the premise that whatever a person does, she does in order to produce some sort of good result. It may be a benefit to herself or to someone else. It may be a short-run benefit or a long-run benefit. It may be a benefit of a particular kind: a financial benefit, a health benefit, entertainment or knowledge. It may be the prevention of some harm. But whatever a person does, she does in order to produce some sort of benefit. Her expectation that it will produce or promote that good outcome is her reason for performing the action. Now, different kinds of benefits yield different kinds of reasons. For example, if a certain action would be good for the bank account but bad for the health, there is a financial reason for it and a health reason against it. Similarly, if a certain action would be good for me but bad for you, there is a reason for it and a reason against it. To find out whether the action is rationally justifiable overall, one must look beyond these specific kinds of reason to find what overall reason there is. That is, one must look to see whether financial benefit outweighs the health drawback, and whether the benefit to me outweighs the harm to you. In other words, one must ask whether the action promotes benefit overall. Therefore, an action is rationally justifiable insofar as it does good overall. And since we ought to do what is rationally justifiable, we ought to do whatever does the most good overall. Hence consequentialism is true.

One worry about the above argument is that its initial premise may be false. We may sometimes act not to produce a benefit, but in order to obey a principle we accept. For example, you may do something simply because you have promised or because it is required by law, without looking to the consequences. Even if every action does aim at some benefit, this does not show that the benefit is the whole reason for each action. Perhaps our reason for each action is a combination of two things: the idea that the action will produce benefits and the idea that the action is morally permissible—that it would not violate any principles of morality. If every action is taken to produce some benefit, that shows only that the benefit is part of the reason for every action, not that the benefit is the whole reason.

Another worry about the above argument is that it presupposes that the notion of overall benefit makes sense. To see how someone might question that, think about skills and skill. Many of our actions are aimed at developing skill. But skill is not one thing. Many of our actions are aimed at developing a skill. To practice one skill, one must neglect or even undermine another skill. (Boxing makes me worse at the piano.) But that does not imply that there is a kind of skill that is neither boxing nor piano but simply “overall skill,” nor does it imply that my training actions are irrational unless I think they will promote overall skill. See Foot (1985); Scanlon (1998).

b. It Is Wrong to Choose the Worse Over the Better

Consider the following argument for consequentialism adapted from Foot (1985).

  1. The whole of an action’s consequences has no further consequences. (Premise)
  2. When we are choosing among such wholes, nothing else is at stake. (From 1)
  3. It can never be right to choose something worse over something better, when nothing else is at stake. (Premise)
  4. It can never be right to choose a worse whole set of consequences over a better. (From 2 and 3)
  5. In choosing an action, one is choosing its whole set of consequences. (Premise)
  6. One ought always to choose an action whose overall consequences are at least as good as the overall consequences of any of the alternative actions; in other words, consequentialism is true. (From 4 and 5)

A worry about the argument is that premise (5) may not be true. In choosing an action, one is normally not choosing its whole set of consequences, because one cannot know what most of the consequences are. One is normally not even choosing the reasonably expectable consequences, because one has not formed any expectation about the action’s likely overall consequences.

A second worry is that premise (1) may not support statement (2). Even though a whole set of consequences has no further consequences, it might have further implications. For not all implications are consequences. For example, one important implication of the fact that my speedometer’s hand is below the ‘55’ is that I am going slower than 55. That is why the position of the hand matters to me. But of course I know that the position of the hand has no effect on my speed. For another example, one important implication of an action I take may be that I (already) am a certain kind of person. An action can show what kind of person I am even if it does not make me be that kind of person. See Campbell and Sowden (1985).

A third worry about the above argument begins from a view about the adjective ‘good’. What we are saying about a knife when we say that it is a “good” one is very different from what we are saying about a painting when we say that it is a “good” one; and similarly the import of ‘good’ seems to differ in the phrases ‘good mathematician’, ‘good liar’, ‘good father’, and ‘good batch of crack’. Thus it would seem that the standards of goodness vary with the kind of thing we are talking about. Now, some kinds of thing do not suggest any standards of goodness: consider ‘good pebble’. If I point to a pebble and say that it is a “good pebble,” you will not know what I mean. Hence ‘good’ seems not to have a meaning in that context. To say that a certain pebble is good is meaningless. Similarly, there are no general standards of goodness for whole sets of consequences in general. The phrase ‘good whole set of consequences’ is no more communicative or meaningful than the phrase ‘good pebble’. If that is right, then consequentialism itself must be wrong because consequentialism is at root the idea that we ought to bring about good consequences. See Geach (1956); Foot (1985); Thomson (1993).

This controversial line of thought is not only an objection to the above argument for consequentialism, it is also an argument against consequentialism. For if ‘good consequences’ is meaningless, then it cannot be correct to define right action in terms of good consequences, as consequentialism normally does.

One possible reply to this argument against consequentialism is that even if ‘good overall consequences’ turns out to be meaningless, one might still think, for example, that the right action is the one that causes the most happiness. One could phrase consequentialism in general terms as, for example, the theory that “there is some feature of consequences of actions such that the right action is the one whose consequences have that feature to the greatest degree.”

The remaining arguments for consequentialism given here, like the argument from love, do not speak merely of “good consequences overall.” Rather they defend consequentialism by defending the importance of some particular kind of consequence, such as happiness, the satisfaction of desire, or the well-being of people.

c. The Ideal Spectator

Consider the following argument for consequentialism.

  1. What objectively ought to happen, what is objectively desirable, is whatever would be wished for by a spectator with full knowledge and no bias; that is, someone who knows everything and is equally sympathetic with everyone. (Premise)
  2. An impartially sympathetic being who knows everyone’s desires would share everyone’s desires in proportion to their strength. (Premise)
  3. An all-knowing impartial being would, overall, wish for the greatest possible balance of satisfaction of the desires of all people. (From 2)
  4. What objectively ought to happen is whatever would promote the greatest possible balance of satisfaction of the desires of all people. (From 1 and 3)
  5. The right action is the one that objectively ought to happen. (Premise)
  6. The right action is whatever would promote the greatest possible balance of satisfaction of the desires of all people. (From 4 and 5)
  7. Consequentialism is true. (From 6)

One worry about the above argument is that it is not clear why we should think Premise 1 is true. Why would the absence of bias mean being equally sympathetic with everyone? Perhaps an easier way to be free of bias is to have no sympathy for anyone.

Another worry is that 1 and 2 do not imply 3. For one thing, 1 and 2 do not tell us that the ideal spectator would have no concerns other than those she derives from sympathy, but 3 does make that assumption. For another thing, suppose this amazing being does lack all other concerns. Now, 2 tells us that she is full of desires that conflict with each other. 3 says that she has another desire—the desire that all her other desires be fulfilled as much as possible. Why would she have that additional desire? One might suppose that if a person has two conflicting desires, it is rational for her to replace them with a single compromise desire. But if the spectator replaces her conflicting desires, then according to 2 she no longer has the sympathy that makes her a reliable judge. See Firth (1952); Hare (1981), Seanor and Fotion (1988).

d. What is Desirable

Consider this argument for Plain Scalar Consequentialism, which is based on one proposed in Mill (1861):

  1. Desiring something is the same thing as thinking that it will increase one’s happiness or decrease one’s unhappiness. (Premise)
  2. What each person ultimately desires is only her own happiness. (From 1)
  3. What will satisfy each person’s desire is her own happiness—and whatever promotes that. (From 2)
  4. “X is desirable” means “If X occurs, X will help satisfy desire.” (Premise)
  5. What is ultimately desirable for each person is her own happiness—and whatever promotes that. (From 3 and 4)
  6. “Good” and “desirable” are synonyms. (Premise)
  7. What is good for you is happiness for you—and whatever promotes that. (From 5 and 6)
  8. What is good is happiness—and whatever promotes that. (From 7, crossing ‘for you’ out of both sides of the equation)
  9. An action is good insofar as its overall consequences contain happiness. (From 8)
  10. Plain Scalar Consequentialism is true. (From 9)

One worry about this argument is that 1 seems false. For example, people often procrastinate from laziness or fear, knowing that they are hurting themselves in the long run. And even people who do not believe in a life after death often give their lives for larger causes.

Another worry is that it is unclear exactly how 7 is supposed to imply 8. Even in mathematics, crossing the same thing out of both sides of a true equation does not always yield a new true equation. If you cross out “+2” from both sides of “10+2 = 3(2+2),” you change a truth to a falsehood.
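The arithmetic in this example can be checked directly. The short Python sketch below is only an illustration of the point about crossing out; the variable names are invented here and are not part of the argument.

```python
# The original equation: 10 + 2 = 3(2 + 2)
left_before = 10 + 2
right_before = 3 * (2 + 2)

# After crossing "+2" out of both sides, what remains is: 10 = 3(2)
left_after = 10
right_after = 3 * 2

print(left_before == right_before)  # True: both sides equal 12
print(left_after == right_after)    # False: 10 is not equal to 6
```

The first comparison holds and the second fails, so deleting the same expression from both sides has turned a true equation into a false one.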

A shorter cousin of the above argument, focusing on the fulfillment of desire rather than on happiness, avoids those worries.

  1. “X is desirable” means “X will help satisfy desire if X occurs.” (Premise)
  2. The words “good” and “desirable” are synonyms. (Premise)
  3. An action is good insofar as it helps to satisfy desire. (From 1 and 2)
  4. An action is good insofar as its consequences include the satisfaction of desire. (From 3)
  5. Consequentialism is true. (From 4)

One worry about this shorter argument is that Premise 2 may be false. For example, it sounds a bit odd to say that when you call someone a good person, you are calling her a desirable person.

Another worry is that it is obscure whether there is anything sensible that might be meant by a greater or lesser amount of “satisfaction of desire.” Are all desires to count or only those that exist at the time of the action or the decision (even if they disappear before most of the consequences arrive)? Presumably the stronger desires are to count for more. But if I desire something slightly and then intensely, which counts? Should a desire count for more if it is held for a longer time? Should it count if it is based on a factual mistake or if it is malicious? See Griffin (1986); Scanlon (1993).

e. Common Sense

There are many moral questions on which common sense is divided or simply stumped. People disagree with each other about the morality of using human embryos for stem cell research, downloading copyrighted music, giving little to the poor, eating animals, having certain kinds of sex, and many other things. One of the main reasons to investigate moral theory is to learn how to approach these questions reasonably.

But on many issues there is a broad range of solid agreement about what is morally obvious, at least in societies that have long permitted open discussion by all. We firmly agree, for example, that equality and rights are very important, that it is not wrong to favor our family and friends over strangers, that it is wrong to torture children, and so on. When we are thinking about morality, that is usually because we are puzzled about some hard question. At such times we might overlook the fact that the aspects of morality that we agree on as obvious cover so much territory that they sketch the basic shape of civilized life.

Yet there is not broad agreement on the abstract question, “What is morality all about? What is morality?” Consequentialism is, as we have seen, one of many different proposed answers to that question. The true answer would presumably have some sort of simplicity and would presumably support most of the concrete moral views that seem most obvious to our common sense. So if consequentialism agrees with common sense, that agreement is some reason to think that consequentialism is true.

Section 3 above presented several objections to consequentialism, arguing that consequentialism conflicts with one or another basic piece of common sense about morality. But in reply to most of these objections, Section 3 presented arguments to show that consequentialism supports those bits of common sense after all.

A worry about this line of thought is that if there were some simple theory like consequentialism that captured what morality is about, one might think that we would have recognized it long ago. But consequentialism is still controversial.

(For more discussion of consequentialism, see the consequentialism section of the article Ethics.)

5. References and Further Reading

a. Classic Works

  • Bentham, Jeremy (J. H. Burns and H. L. A. Hart, eds.). An Introduction to the Principles of Morals and Legislation [1789]. Oxford: Oxford University Press, 1996.
  • Mill, John Stuart (Roger Crisp, ed.), Utilitarianism [1861]. Oxford: Oxford University Press, 1998.
  • Sidgwick, Henry. The Methods of Ethics, Seventh Edition [1907]. Indianapolis: Hackett Publishing Company, 1981.
  • Moore, G. E. (Thomas Baldwin, ed.) Principia Ethica [1903]. Cambridge: Cambridge University Press, 1993.

b. Recommended Collections

Most of the best recent work on consequentialism is collected in the following anthologies. Any one of these collections provides an excellent introduction to consequentialism. In addition, the fine journal Utilitas is entirely devoted to the topic.

  • Darwall, Stephen. Consequentialism. Oxford: Blackwell Publishing, 2003.
  • Gorovitz, Samuel, ed. John Stuart Mill: Utilitarianism, With Critical Essays. Indianapolis: The Bobbs-Merrill Company, 1971.
  • Pettit, Philip, ed. Consequentialism (International Research Library of Philosophy, Vol. 6). Aldershot: Dartmouth Publishing Group, 1993.
  • Scheffler, Samuel, ed. Consequentialism and Its Critics. Oxford: Oxford University Press, 1988.

c. Other Recommended Works

  • Adams, Robert M. “Motive Utilitarianism.” Journal of Philosophy 73 (1976): 467-481.
  • Bales, R. Eugene. “Act-Utilitarianism: Account of Right-Making Characteristics or Decision-Making Procedures?” American Philosophical Quarterly 8 (1971): 257-65.
  • Bayles, Michael D., ed. Contemporary Utilitarianism. Garden City: Doubleday, 1968.
  • Bennett, Jonathan. “Two Departures from Consequentialism.” Ethics 100.1 (1989): 54-66.
  • Brandt, Richard B. A Theory of the Good and the Right. Oxford: Oxford University Press, 1979.
  • Brandt, Richard B. Morality, Utilitarianism, and Rights. Cambridge: Cambridge University Press, 1992.
  • Brink, David. “Utilitarian Morality and the Personal Point of View.” Journal of Philosophy 83.8 (1986): 417-38.
  • Brink, David. Moral Realism and the Foundations of Ethics. Cambridge: Cambridge University Press, 1989.
  • Campbell, Richmond, and Sowden, Lanning, eds. Paradoxes of Rationality and Cooperation. Vancouver: University of British Columbia Press, 1985.
  • Den Uyl, Douglas, & Machan, Tibor R. “Recent Work on the Concept of Happiness.” American Philosophical Quarterly 20.2 (1983): 115-134.
  • Driver, Julia, ed. Character and Consequentialism. Special Issue of Utilitas, 13.2 (2001).
  • Feldman, Fred. Utilitarianism, Hedonism, and Desert. Cambridge: Cambridge University Press, 1997.
  • Firth, Roderick. “Ethical Absolutism and the Ideal Observer.” Philosophy and Phenomenological Research 12 (1952): 317-345.
  • Foot, Philippa. “The Problem of Abortion and the Doctrine of Double Effect.” Oxford Review 5 (1967): 28-41.
  • Foot, Philippa. “Utilitarianism and the Virtues.” Mind 94 (1985): 196-209.
  • Frey, Raymond G. Utility and Rights. Oxford: Basil Blackwell, 1984.
  • Geach, Peter. “Good and Evil.” Analysis 17 (1956): 33-42.
  • Goodin, Robert E. Utilitarianism as a Public Philosophy. New York: Cambridge University Press, 1995.
  • Griffin, James. Well-Being. Oxford: Clarendon Press, 1986.
  • Hare, Richard M. Moral Thinking. Oxford: Clarendon Press, 1981.
  • Harsanyi, John C. “Morality and the Theory of Rational Behavior.” Social Research 44.4 (1977): 623-656.
  • Hart, H. L. A. “Natural Rights: Bentham and John Stuart Mill.” In Essays on Bentham: Studies in Jurisprudence and Political Theory, by H. L. A. Hart. Oxford: Clarendon Press, 1982.
  • Hooker, Brad, ed. Rationality, Rules, and Utility: New Essays on the Moral Philosophy of Richard Brandt. Boulder: Westview Press, 1993.
  • Hooker, Brad. “Rule Consequentialism.” Stanford Encyclopedia of Philosophy.
  • Hooker, Brad; Mason, Elinor; and Miller, Dale E. Morality, Rules, and Consequences. Edinburgh: Edinburgh University Press, 2000.
  • Jackson, Frank. “Decision-Theoretic Consequentialism and the Nearest and Dearest Objection.” Ethics 101 (1991): 461-82.
  • Jackson, Frank, and Pargetter, Robert. “Oughts, Options, and Actualism.” Philosophical Review 95 (1986): 233-255.
  • Kagan, Shelly. The Limits of Morality. Oxford: Clarendon Press, 1989.
  • Kagan, Shelly. Normative Ethics. Boulder: Westview, 1998.
  • Kidder, Tracy. Mountains Beyond Mountains. New York: Random House, 2003.
  • Le Guin, Ursula K. The Ones Who Walk Away From Omelas [1973]. Mankato, MN: Creative Education, 1992.
  • Lyons, David. Forms and Limits of Utilitarianism. Oxford: Clarendon Press, 1965.
  • Mill, John Stuart. On Liberty [1859] in John Gray and G. W. Smith, eds., J. S. Mill’s On Liberty in Focus. London: Routledge, 1991.
  • Mulgan, Tim. “Two Conceptions of Benevolence.” Philosophy and Public Affairs 26.1 (1997): 62-79.
  • Mulgan, Tim. The Demands of Consequentialism. Oxford: Clarendon Press, 2001.
  • Murphy, Liam B. “A Relatively Plausible Principle of Beneficence: Reply to Mulgan.” Philosophy and Public Affairs 26.1 (1997): 80-86.
  • Nagel, Thomas. The View From Nowhere. Oxford: Oxford University Press, 1986.
  • Norcross, Alastair. “Good and Bad Actions.” Philosophical Review 106.1 (1997): 1-34.
  • Nozick, Robert. Anarchy, State, and Utopia. New York: Basic Books, 1974.
  • Parfit, Derek. Reasons and Persons. Oxford: Clarendon Press, 1984.
  • Pettit, Philip. “The Consequentialist Perspective.” In Three Methods of Ethics, by Marcia Baron, Philip Pettit, and Michael Slote. Oxford: Blackwell Publishing, 1997.
  • Railton, Peter. “How Thinking about Character and Utilitarianism Might Lead to Rethinking the Character of Utilitarianism.” Midwest Studies in Philosophy, 13 (1988): 398-416.
  • Railton, Peter. “Alienation, Consequentialism, and the Demands of Morality.” Philosophy and Public Affairs 13.2 (1984): 134-71.
  • Rawls, John. “Two Concepts of Rules.” Philosophical Review 64 (1955): 3-32.
  • Rawls, John. A Theory of Justice, Revised Edition. Cambridge, MA: Harvard University Press, 1999.
  • Scanlon, Thomas M. “Value, Desire, and Quality of Life.” In Martha Nussbaum and Amartya Sen, eds., The Quality of Life. Oxford: Clarendon Press, 1993.
  • Scanlon, Thomas M. What We Owe to Each Other. Cambridge, MA: Harvard University Press, 1998.
  • Scarre, Geoffrey. Utilitarianism. London: Routledge, 1996.
  • Scheffler, Samuel. “Deontology and the Agent: A Reply to Bennett.” Ethics 100.1 (1989): 67-76.
  • Scheffler, Samuel. The Rejection of Consequentialism, Revised Edition. Oxford: Clarendon Press, 1994.
  • Seanor, Douglas, & Fotion, N. Hare and Critics. Oxford: Clarendon Press, 1988.
  • Sen, Amartya. “Rights and Agency.” Philosophy and Public Affairs 11.1 (1982): 3-39.
  • Sen, Amartya, and Williams, Bernard, eds. Utilitarianism and Beyond. Cambridge: Cambridge University Press, 1982.
  • Shaw, William H. Contemporary Ethics: Taking Account of Utilitarianism. Malden: Blackwell Publishing, 1999.
  • Singer, Marcus G. “Actual Consequence Utilitarianism.” Mind 86 (1977): 67-77.
  • Singer, Peter. “Famine, Affluence, and Morality.” Philosophy and Public Affairs 1 (1972): 229-243.
  • Singer, Peter. Practical Ethics, Second Edition. Cambridge: Cambridge University Press, 1993.
  • Sinnott-Armstrong, Walter. “Consequentialism.” In The Stanford Encyclopedia of Philosophy.
  • Skorupski, John. “Agent-Neutrality, Consequentialism, Utilitarianism: A Terminological Note.” Utilitas 7 (1995): 49-54.
  • Slote, Michael. “Object Utilitarianism,” Pacific Philosophical Quarterly 66 (1985): 111-124.
  • Slote, Michael. Common-Sense Morality and Consequentialism. London: Routledge and Kegan Paul, 1985.
  • Slote, Michael. Beyond Optimizing. Cambridge, MA: Harvard University Press, 1989.
  • Smart, J. J. C., “Free Will, Praise, and Blame,” Mind 70.279 (1961): 291-306.
  • Smart, J. J. C. “An Outline of a System of Utilitarian Ethics.” In Utilitarianism: For and Against, by J. J. C. Smart and Bernard Williams. Cambridge: Cambridge University Press, 1973.
  • Sprigge, T. L. S. The Rational Foundations of Ethics. London: Routledge & Kegan Paul, 1988.
  • Stocker, Michael. “The Schizophrenia of Modern Ethical Theories.” Journal of Philosophy 73 (1976): 453-466.
  • Sumner, L. W. Welfare, Happiness, and Ethics. Oxford: Clarendon Press, 1996.
  • Tännsjö, Torbjörn. Hedonistic Utilitarianism. Edinburgh: Edinburgh University Press, 1998.
  • Taurek, John. “Should the Numbers Count?” Philosophy & Public Affairs 6 (1977): 293-316.
  • Thomson, Judith Jarvis. “Goodness and Utilitarianism.” Proceedings and Addresses of the American Philosophical Association 67.2 (October 1993): 145-159.
  • Williams, Bernard. “A Critique of Utilitarianism,” in Utilitarianism: For and Against, by J.J.C. Smart and Bernard Williams. Cambridge: Cambridge University Press, 1973.
  • Williams, Bernard. “Persons, Character, and Morality,” in Bernard Williams, Moral Luck. Cambridge: Cambridge University Press, 1981.

Author Information

William Haines
Email: hainesw@hkucc.hku.hk
The University of Hong Kong
China

Donald Herbert Davidson: Mind and Action

Donald Davidson was a 20th century American philosopher whose most profound influences on contemporary philosophy were in the philosophy of mind and action. This article examines in detail two leading motifs in Davidson’s philosophy. One is that mental phenomena resist being “captured in the nomological net of physical theory.” Davidson claims there are no strict deterministic laws on the basis of which mental events can be predicted and explained. He rejects all deterministic, non-normative laws connecting either mental states with physical states or mental states with other mental states. The other motif concerns the problem of analyzing the explanatory force of an agent’s reasons for his or her actions. It is Davidson’s contention that explanation by appeal to reasons is a form of causal explanation, because this is the only way to account for the fact that we have many reasons for acting the way we did, but only one of them is the reason we acted that way.

Davidson’s argument that mental phenomena cannot be captured by strict, deterministic scientific laws as they are normally understood depends upon his treatment of propositional attitudes, attitudes of hoping that p, or fearing that p, or believing that p, where p is some proposition. Propositional attitudes have certain features that distinguish them from physical states and events, says Davidson. For Davidson there is no “underlying mental reality whose laws we can study in abstraction from the normative and holistic perspectives of interpretation.” His theory of propositional attitudes is guided by conclusions drawn from the project of Radical Interpretation, a project initiated by W.V.O. Quine, Davidson’s teacher. Quine challenged two central tenets of Logical Positivism: reductionism and the analytic/synthetic distinction. Following in Quine’s footsteps, Davidson does away with what he considers to be the third and last dogma of empiricism: the dogma of the dualism of scheme and reality.

Table of Contents

  1. Life and Influences
  2. Mind
    1. Anomalism of the Mental
    2. Propositional Attitudes
    3. No Psychophysical Laws
    4. No Psychological Laws
  3. Action
  4. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life and Influences

Donald Davidson was born on March 6, 1917 in Springfield, Massachusetts. He studied English, Comparative Literature and Classics in his undergraduate years at Harvard, and in his sophomore year he attended two classes that made a lasting impression on him. These were two philosophy classes taught by Alfred North Whitehead in the last year of his career. Afterwards, Davidson was accepted to graduate studies in philosophy at Harvard, where he studied under Willard Van Orman Quine. Quine set Davidson on a course in philosophy quite different from that of Whitehead. Subsequently, Davidson did his dissertation on Plato’s Philebus.

According to Davidson, “The central thesis that emerged was that when Plato had reworked the theory of ideas as a consequence of the explorations and criticisms of the Parmenides, Sophist, Theaetetus, and Politicus, he realized that the theory could no longer be deployed as a main support of an ethical position, as it had been developed in the Republic and elsewhere.” This dissertation reveals the development of Davidson’s philosophical method and his epistemological position.

Davidson’s most profound influences on contemporary philosophy stem from his philosophy of mind and action. However, Davidson’s philosophical positions in action theory and philosophy of mind are intrinsically tied into his work on the semantics of natural languages.

Davidson’s apprenticeship in philosophy took place in an intellectual milieu very different from today’s. In the Anglo-American philosophical community, the middle of the century was dominated by Logical Positivism. Davidson recalls that he got through graduate school at Harvard by reading an anthology of Logical Positivism by Feigl and Sellars. Logical Positivism emerged in the Austro-Hungarian Empire early in the twentieth century. Influenced by the logicist project of Bertrand Russell and Gottlob Frege on the one hand, and by advances in science on the other, the Logical Positivists of the Vienna Circle turned to physics as a model of theoretical discourse; and they considered sensory experiences to be fundamental. Although Logical Positivism was not entirely a unified movement, the Verification Principle was shared by most of its members. It states that the meaning of sentences can be accounted for in terms of experiences that would verify them. Logical Positivism usually promotes a reductionist program: the reduction of all special sciences to physics, and of all meaningful statements to reports about sensory experiences. In his famous paper “Two Dogmas of Empiricism,” Davidson’s teacher Quine challenged two central tenets of Logical Positivism: reductionism and the analytic/synthetic distinction. Following in Quine’s footsteps, Davidson does away with what he considers to be the third and last dogma of empiricism: the dogma of the dualism of scheme and reality. See his paper “On the Very Idea of a Conceptual Scheme.”

Of the two leading motifs in Davidson’s mature philosophy discussed in this article, one has to do with the fact that mental phenomena resist being “captured in the nomological net of physical theory.” Davidson rejects strict psychophysical and psychological laws. The other motif concerns the problem of analyzing the explanatory force of an agent’s reasons for his or her actions. It is Davidson’s contention that explanation by appeal to reasons is a form of causal explanation.

2. Mind

a. Anomalism of the Mental

Simply put, “anomalism of the mental” amounts to the claim that the mental is not governed by laws as we usually understand them. In Davidson’s own words:

There are no strict deterministic laws on the basis of which mental events can be predicted and explained.

In developing his position, Davidson attempts to retain his materialism while avoiding reductionism. Reductionism has usually been held to follow from materialism. When Davidson asserts that there can be no laws on the basis of which mental events can be predicted and explained, he has two different types of laws in mind. In the first type of law, an attempt is made to link mental states and events with physical states and events, and the law is used to explain the former on the basis of the latter. Davidson spends much of his effort in “Mental Events” showing the impossibility of such psychophysical laws. In the second type of law, there is an attempt to formulate strict deterministic laws linking mental states and events to other mental states and events. Davidson denies the possibility of these psychological laws as well. Davidson’s latter claim is considered to be a rejection of the most basic goal of the science of psychology.

In arguing against the possibility of psychophysical laws, Davidson has in mind the following kinds of laws:

(BL) ∀x (x is in M iff x is in P)

where M denotes some mental state or event and P denotes some physical state or event and “iff” abbreviates “if and only if.” The laws of the above kind are known as bridging laws (BL). A stronger version of a bridging law claims identity of properties from different theoretical discourses. A weaker version claims only that whenever an object instantiates one property it instantiates the other. An important distinction between laws and generalizations must be made. There has been general agreement among philosophers, Davidson included, that a law is distinguished from a mere generalization by the following features:

  1. A law must support counterfactual claims. A law of the form “All A are B,” for instance, is said to sustain the claim that if any arbitrary x were, contrary to fact, an A, it would also be B.
  2. It must be capable of confirmation by observable instances.

To illustrate the difference between generalizations that just happen to be true and real laws, consider the following story (adapted from Jaegwon Kim). Assume that all objects in a fixed domain, for instance all objects in my room, are either blue or red. In addition, all of these objects are either edible or inedible. By some coincidence it so happens that all red objects in my room are edible. Perhaps the red objects in my room are either ripe tomatoes or ripe cherries. This allows us to form a true generalization about this fixed domain:

(G) If x is red then x is edible.

It is obvious that (G) does not support counterfactual conditionals. For instance, (G) does not allow us to infer of some green object (say a copy of Davidson’s Essays on Actions and Events) that if it were red it would be edible. Davidson is quite explicit that his attack is aimed at psychophysical laws, not at true psychophysical generalizations:

The thesis is rather that the mental is nomologically irreducible: there may be true general statements relating the mental and the physical, statements that have the logical form of a law; but they are not lawlike (in a strong sense to be described). If by absurdly remote chance we were to stumble on a non-probabilistic, true, psychophysical generalization, we would have no reason to believe it more than roughly true; we would have no reason to believe it was a law.

Following this view, it is important to keep in mind the fact that whether any given psychophysical generalization is true is a contingent, empirical matter. As we will see later, it is an a priori matter for Davidson that no such generalization can be a law.
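The contrast between the accidental generalization (G) and a genuine law can be made concrete with a toy model. The Python sketch below is only an illustration under stated assumptions: the class `Thing`, the function `satisfies_g`, and the blue notebook are invented here, and the counterfactual is modeled very crudely by changing an object's color while holding everything else about it fixed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Thing:
    name: str
    color: str
    edible: bool


def satisfies_g(things):
    """(G): if x is red then x is edible, checked over the given objects."""
    return all(t.edible for t in things if t.color == "red")


# The fixed domain: every object in the room is blue or red,
# and by coincidence every red object happens to be edible.
room = [
    Thing("ripe tomato", "red", True),
    Thing("ripe cherry", "red", True),
    Thing("notebook", "blue", False),  # hypothetical blue object
]

# An object outside the domain: a green book.
book = Thing("Essays on Actions and Events", "green", False)

# Crude counterfactual (an assumption of this sketch): suppose the book
# were red, with nothing else about it changed.
red_book = replace(book, color="red")

print(satisfies_g(room))               # True: (G) holds of the actual room
print(satisfies_g(room + [red_book]))  # False: (G) fails for the red book
```

(G) is true of the room as it actually is, yet it gives no support to the claim that the book would be edible if it were red. In this toy setting, that is what it means for a true generalization to fall short of being a law.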

The core idea of Davidson’s argument against the possibility of psychophysical laws can be found in the following passage:

Nomological statements bring together predicates that we know a priori are made for each other — know, that is, independently of knowing whether the evidence supports a connection between them.

If we can know a priori when the predicates are made for each other, then we can know by the same token when they aren’t. Davidson finds that it is an a priori truth that mental and physical predicates are not made for each other. Here is the structure of his argument.

  1. Both mental and physical phenomena have distinct sets of features characteristic of their own domains, but these features are incompatible with each other.
  2. Bridging laws linking properties from two distinct theoretical discourses (in this case mental and physical) would transmit properties from one discourse to another, which in the case of mental and physical phenomena would lead to incoherence.
  3. Therefore, there could be no psychophysical laws linking mental and physical phenomena.

b. Propositional Attitudes

According to Davidson, the paradigmatic criterion of mental events is their susceptibility to description “in terms of vocabulary of propositional attitudes.” Propositional attitudes, or intentional states as they are sometimes called, are various cognitive attitudes: we can hope that the proposition p is true, we can fear that p is true, we can desire that p is true, and so forth. You and I can have different attitudes toward the proposition “Snow is white.” I hope that snow is white, whereas you believe that it is but don’t hope it is. The proposition itself, namely, that snow is white, toward which one has an attitude is said to give content to one’s mental state.

Propositional attitudes have certain features (or are constrained by certain principles) that distinguish them from physical states and events. Davidson’s theory of propositional attitudes is guided by conclusions drawn from the project of Radical Interpretation, a project initiated by Quine. Imagine that you have encountered a group of people in an unfamiliar land who display what appear to you to be shared verbal and non-verbal behavior. What do they mean when they point at a rabbit running by and say, “Gavagai”? Interpreting their behavior by assigning meaning to their actions (of which linguistic utterances are a subclass) is the task of Radical Interpretation. The principles and techniques we would apply in the situation described above are not unlike those we commonly apply in interpreting the actions and utterances of people whose language we already share. Radical Interpretation, according to Davidson, is guided by normative principles and must proceed holistically:

This method is intended to solve the problem of the interdependence of belief and meaning by holding belief constant as far as possible while solving for meaning. This is accomplished by assigning truth conditions to alien sentences that make native speakers right when plausibly possible, according, of course, to our own view of what is right.

The general normative principles that guide the task of Radical Interpretation, and therefore constrain the attribution of propositional attitudes, are principles such as “Don’t believe an open contradiction” or “If you believe that p and q, then also believe that p.” It is important to keep in mind that intentional states are capable of justifying other intentional states. In physical theory, by contrast, the movement of one ball is explained, but not justified, by the movement of another. Having a belief that pressing on a lever will stop the flow of water doesn’t just explain my action of stopping the flow of water. This belief (together with the desire to stop the flow of water) also justifies my action, in the sense that it makes the action reasonable in light of that belief and desire. (Intentional states justifying other intentional states will be discussed further in the second part of this article.) Davidson is explicit that it is a part of what it is for something to be a propositional attitude (like a belief) that it be subject to these normative principles. This makes these principles a priori and necessarily constitutive of the concept of propositional attitudes. In contrast, our knowledge of things physical is a posteriori and contingent in nature.

So far, we have explained the normative character of the mental. Interpretation must also proceed holistically:

There is no assigning beliefs to a person one by one on the basis of his verbal behavior, his choices, or other local signs no matter how plain and evident, for we make sense of particular beliefs only as they cohere with other beliefs, with preferences, with intentions, hopes, fears, expectations, and the rest.

It can be seen from the above remark that interpretation is holistic in the sense that the attribution of each individual mental state to another person must be made against the background of attribution of other mental states. In addition, the attribution to an agent of the entire system of propositional attitudes is further constrained by considerations that involve maximization of coherence and rationality.

c. No Psychophysical Laws

Davidson is quite aware of the fact that holism and interdependence are common to physical theory. In physical theory such a priori facts as the transitivity of “longer than” are what make physical measurements possible. Thus, the physical realm is also characterized by the a priori laws constitutive of our conception of the physical. What sets the realms of the mental and the physical apart is the disparate commitments of each realm. Rationality and the governing normative principles are essential characteristics of the mental. By contrast, the absence of rationality and normative principles is a characteristic of the physical. If there were bridging laws, we would find, unhappily, that the characteristics of the mental that have “no echo in physical theory” would be transmitted to the physical and vice versa. In the first of these scenarios we would have to apply the Principle of Charity, with its rule of maximization of coherence and rationality, to the physical, which, according to Davidson, is plainly absurd. In the second scenario the principles governing the attribution of the mental would be preempted by merely physical constraints. This happens for the following reason: if there were bridging laws of the type (BL), then neural states of the brain would be nomologically coextensive with certain intentional states. But neural states (being theoretical states of physical theory) are governed by conditions of attribution that in turn are regulated by the constitutive rules of the physical theory. Thus, constitutive rules of the mental are ignored in this scenario. Davidson concludes that:

There are no strict psychophysical laws because of the disparate commitments of the mental and physical schemes. It is a feature of physical reality that physical change can be explained by laws that connect it with other changes and conditions physically described. It is a feature of the mental that the attribution of mental phenomena must be responsible to the background of reasons, beliefs, and intentions of the individual. There cannot be tight connections between the realms if each is to retain allegiance to its proper source of evidence.

It is important for Davidson to note that the mental does have its own laws, for instance, the laws of rational decision making. The crucial difference between such laws and the laws that could be counted as psychophysical is the difference between the normative character of the former and the predictive power of the latter. When anomalism of the mental denies the existence of psychophysical and psychological laws, the sense of “law” is taken to involve strict nomological predictions and explanations of behavior. Thus, normative “laws” are quite compatible with anomalism of the mental. Whether Davidson’s notion of what constitutes a “law” has merit is an interesting question, but it will not be discussed here.

d. No Psychological Laws

The claim of the anomalism of the mental consists of two subsidiary claims. Thus far we have considered the support for the claim that there are no psychophysical laws. Davidson also defends the claim that there could be no precise psychological laws, that is, there are no precise laws that relate mental states and events to other mental states and events. The argument for this claim can be found in “Psychology as Philosophy.” As the title suggests, Davidson intends to contrast the claim that psychology is more like philosophy with the claim that it is more like science and then refute the latter claim. One point deserves special attention before proceeding to the exegesis of Davidson’s argument against psychological laws. Actions, although undeniably physical under some descriptions, are considered to be mental by Davidson. This is so because, when we state which action someone is performing versus merely describing the physical movement his body is undergoing, we are contributing an interpretation of him and interpretation, as we have seen, is guided by certain normative constraints. Thus, the laws that could relate an agent’s mental states to his actions would count as psychological laws.

The gist of the argument against psychological laws can be found in the following passage:

It is an error to compare truisms like “If a man wants to eat an acorn omelette, then he generally will if the opportunity exists and no other desire overrides” with a law that says how fast a body will fall in a vacuum. It is an error, because in the latter case, but not the former, we can tell in advance whether the condition holds, and we know what allowance to make if it doesn’t.

If the above truism were a psychological law, then for the antecedent to obtain, the agent must want to eat an acorn omelette. But our knowledge of an agent’s desires crucially depends upon our attribution of other mental states to him (or her). In addition, knowing his action subsequent to his desire will help us interpret whether the agent had the desire in the first place. Thus both the antecedent and the consequent of the supposed psychological law are related to each other through the holism of interpretation.

What is needed in the case of action, if we are to predict on the basis of desires and beliefs, is a quantitative calculus that brings all relevant beliefs and desires into the picture. There is no hope of refining the simple pattern of explanation on the basis of reasons into such a calculus.

Since no such hope exists, any psychological generalization purporting to be law must rely upon generous escape clauses such as “if no other desire overrides,” ceteris paribus, and so forth. The necessity of such fail-safe clauses is dictated by the fact that for Davidson there is no “underlying mental reality whose laws we can study in abstraction from the normative and holistic perspectives of interpretation.”

3. Action

Actions, according to Davidson, are events. Events, in his ontology, are particular dated occurrences, the essential feature of which is susceptibility to redescription. In order to admit an entity into one’s ontology, one must specify the conditions of individuation for that entity. On Davidson’s view:

[E]vents are identical if and only if they have exactly the same causes and effects.

This criterion may seem to have an air of circularity about it, but if there is circularity it certainly is not formal. For the criterion is simply this: where x and y are events,

x = y if and only if [(z) (z caused x if and only if z caused y) and (z) (x caused z if and only if y caused z)].

It is important to keep in mind that for an event to be an action, the event must be describable in a specific way. Actions are events that people perform with intentions and for reasons. One and the same action can be specified as intentional under some description and as purely physical under another description. But in order to be an action an event must have at least one description under which it is specified as intentional. This requirement for an action hinges on the larger distinction between specifying the whole of an event and wholly specifying it. The distinction comes up in the context of the discussion of causation and causal explanation:

The salient point that emerges so far is that we must distinguish firmly between causes and the features we hit on for describing them, and hence between the question whether a statement says truly that one event causes another and the further question whether the events are characterized in such a way that we can deduce, or otherwise infer, from laws or other causal lore, that the relation was causal.

In the case of one event causing another, any description that picks out the right event specifies the whole of the cause. Some descriptions, of course, will be richer in the information they disclose about an event. This richness should not affect in any way how much of a cause they refer to. The story is quite different when it comes to what Davidson calls “the further question” of causal explanation. Causal explanations are by their very nature attempts to explain events in terms of the causes of these events. But, according to Davidson, causal explanations are, in addition, sensitive to how the events in question are described. For instance, the two descriptions “Jack’s walking in the room” and “Jack’s stomping in the room” may refer to the same event that caused Jill to wake up. However, the latter may serve as a causal explanation of Jill’s waking up, whereas the former may not.

One of Davidson’s major contributions to philosophy of action is his claim that explanation via reasons is a form of causal explanation. In order to understand Davidson’s claims that reasons are the causes of the actions that they are reasons for and that “reason explanation” is a form of causal explanation, we must understand how on his view causal explanation works.

One theory of causal explanation arises out of Hume’s position that wherever there is a causal relation between two distinct events a and b there must be a law relating two types of events A and B that the events in question instantiate. This position was further developed in the middle of the twentieth century by Carl Hempel into the deductive-nomological theory of explanation (DN from now on). According to DN, an event E is causally explained just in case the statement asserting the occurrence of E deductively follows from

  1. the statement asserting the occurrence of its cause C, and
  2. the statement of some general causal law L.
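
The DN pattern just described can be laid out schematically. The following is only an illustrative sketch: the tabular layout and the gloss on L as a generalization over event types A and B are assumptions made here for clarity, not Hempel’s or Davidson’s own notation.

```latex
% Schematic form of a deductive-nomological (DN) explanation.
% Assumption: L is glossed as a universal generalization over event types A and B.
\[
\begin{array}{ll}
(1) & C \quad \text{(statement that the cause, an event of type } A\text{, occurred)} \\
(2) & L \quad \text{(law: whenever an event of type } A \text{ occurs, an event of type } B \text{ follows)} \\
\hline
\therefore & E \quad \text{(statement that the effect, an event of type } B\text{, occurred)}
\end{array}
\]
```

On this reading, the explanation is a valid deduction whose premises are (1) and (2) and whose conclusion is the statement that E occurred.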

The opponents of the DN model argue that one can judge that an event a caused an event b without knowing the laws that these events instantiate. Davidson contends that the opposition between the opponents and the champions of the DN model is more apparent than real. The solution to the conflict depends on the distinction between events and their descriptions:

Causality and identity are relations between individual events no matter how described. But laws are linguistic; and so events can instantiate laws, and hence be explained or predicted in the light of laws, only as those events are described in one or the other way.

In short, Davidson lends his support to the principle of Nomological Character of Causality. This principle says that “when events are related as cause and effect, they have descriptions that instantiate a law. It doesn’t say that every true singular statement of causality instantiates a law.” It is worth noting that Davidson accepts this principle on faith, as many commentators have pointed out. Unlike David Hume, who accepts the principle because his analysis of the nature of causation as a constant conjunction requires it, Davidson disavows analyzing the nature of causation itself. His goal, explicitly stated, is to provide an analysis of the logical form of causal statements.

We can now turn to the question of the causal explanation of action and briefly discuss Davidson’s impetus for his claim that reason explanation must be a form of causal explanation. Davidson’s opponents (the anti-causalists) on the explanation of actions claim that reason explanation is different in kind from causal explanation. There are two main types of arguments for the anti-causalist position: methodological and conceptual. Anti-causalists who rely on methodological arguments for their position claim that a DN model that relies on the concept of lawful regularity has a place only in the physical sciences. By contrast, the primary constraint placed on explanation in the social sciences is a normative one. Thus, lawful regularities relating reasons to actions would be simply irrelevant to explanation in the social sciences, according to anti-causalists.

Conceptual arguments are meant to establish the stronger claim that reasons cannot in principle be causes. One plausible argument of the conceptual variety rests on the assumption that “the presence of a reason cannot be ascertained independently of the occurrence of the action it rationalizes.” This, presumably, leads to the disparate evidential commitments of the causal explanation and reason explanation. Davidson himself appears to advocate the above point in the passage quoted above. Thus, all arguments against the causalist position, including the ones briefly mentioned, revolve around the normative constraints placed on the explanation of the mental.

In short, an explanation of an agent’s action can be considered adequate only if it shows the action in question to be reasonable against the background of an agent’s beliefs and desires. This latter condition together with the truth condition, which states that the propositional attitudes a rationalization attributes to an agent must be true, form the necessary conditions for the justification model of explanation. Davidson considers the above conditions necessary but not sufficient. The deficiency of the justification model is explained by drawing attention to the distinction between having a reason for an action and having the reason why one performs an action. For a reason to be the reason why one performs an action the reason must cause the action. For example, one has a reason to turn on the television, say, to watch one’s favorite TV show. But this need not be the reason why one turns on the television. This is because the above reason did not cause one to turn on the television. As Davidson puts it:

[S]omething essential has certainly been left out, for a person can have a reason for an action, and perform the action, and yet this reason not be the reason why he did it.

In our example, the reason for one to turn on the television, let’s say, is that one is lonely and desires company. Thus, one reason (namely, to keep one company) was the cause of the action while the other reason (namely, to watch one’s favorite show) was not. Davidson continues:

Of course, we can include this idea too in justification; but then the notion of justification becomes as dark as the notion of reason until we can account for the force of that “because.”

The mere possibility that a person acted on the basis of one reason rather than another presents an insurmountable obstacle. The anti-causalist has no way of accounting for the force of the “because” in the rationalization. Thus, the justification model is silent on what would count as the correct rationalization. The only solution, according to Davidson, is to view the efficacious reasons (the ones that account for the correct rationalization) as causes of action. This leaves us, according to Davidson, with only one alternative to justificationalism, namely, the view that reason explanation is a species of causal explanation.

4. References and Further Reading

Davidson’s research primarily ran in articles published from the 1960s through the 1990s, most of which have conveniently been reprinted. The first two collections contain Davidson’s most influential works, and the last volume cited below is a good place to begin. [This section on references and further reading was composed by Paul Saka.] See also the article Davidson: Philosophy of Language.

a. Primary Sources

  • Essays on Actions and Events. Oxford: Oxford University Press, 1980.
    • Includes “Mental Events,” which introduces anomalous monism; “The Logical Form of Action Sentences,” an important semantic theory of adverbs; “Actions, Reasons, and Causes”, which famously argues that rationalization is a species of causal explanation. To the revised edition (2001) is added “Adverbs of Action” and a short reply to Quine.
  • Inquiries into Truth and Interpretation. Oxford: Oxford University Press, 1984.
    • Includes “Semantics for Natural Languages,” a good place for beginners to start; “Truth and Meaning,” the locus classicus of Davidsonian semantics; “Quotation” and “On Saying That,” which offer extensional analyses of intensional phenomena; “Radical Interpretation,” “Belief and the Basis of Meaning,” and “On the Very Idea of a Conceptual Scheme” on the principle of charity; “Thought and Talk,” which argues that only verbal creatures can think; “Reality without Reference,” which concedes that reference is not real; and a pioneering treatment in analytic philosophy on metaphor. To the revised edition (2001) is added a short reply to Quine.
  • Subjective, Intersubjective, Objective. Oxford: Oxford University Press, 2001.
    • Includes “Knowing One’s Own Mind”, source of the Swampman argument.
  • Problems of Rationality. Oxford: Oxford University Press, 2004.
    • Follows up on themes from Davidson’s first collection; includes an interview of Davidson by Ernie Lepore.
  • Truth, Language, and History. Oxford: Oxford University Press, 2005.
    • Includes the highly cited “The Folly of Trying to Define Truth” plus six other articles on truth; six articles on language; two articles on anomalous monism; and minor articles in the history of philosophy.
  • Truth and Predication. Cambridge, MA: Harvard University Press, 2005.
    • Part I is a revised version of Davidson’s 1989 Dewey Lectures, first published as “The Structure and Content of the Theory of Truth” in the Journal of Philosophy. Part II, on predication, is a version of Davidson’s 2001 Hermes Lectures.
  • The Essential Davidson. Oxford: Oxford University Press, 2006.
    • Consists of six articles taken from Essays on Actions and Events, five articles taken from Inquiries into Truth and Interpretation, three articles taken from Davidson’s other collections, and “A Coherence Theory of Truth and Knowledge”, taken from the Journal of Philosophy.

b. Secondary Sources

  • Ludwig, Kirk, ed. Donald Davidson. Cambridge: Cambridge University Press, 2003.
    • Accessible contributions, each on one aspect of Davidson’s work: actions, events, truth and meaning, radical interpretation, literature, knowledge.
  • Lepore, Ernest, and Ludwig, Kirk. 2005. Donald Davidson: Meaning, Truth, Language, and Reality. Oxford: Oxford University Press.
    • A sustained and authoritative treatment of how Davidson’s projects tie together, and their significance to philosophy.
  • Lepore, Ernest, and Ludwig, Kirk. 2009. Donald Davidson’s Truth-Theoretic Semantics. Oxford: Oxford University Press.
    • Foundations and applications of Davidsonian semantics, relevant for philosophers of language and linguists.
  • Hahn, Lewis Edwin, ed. 1999. The Philosophy of Donald Davidson. The Library of Living Philosophers, volume XXVII. Peru, IL: Open Court Publishing Company.
    • Includes, as do all volumes in the Library of Living Philosophers, an intellectual autobiography and extensive bibliography.

Author Information

Vladimir Kalugin
Email: vladimir.kalugin@csun.edu
California State University, Northridge
U. S. A.

Avicenna (Ibn Sina) (c. 980—1037)

Abu ‘Ali al-Husayn ibn Sina is better known in Europe by the Latinized name “Avicenna.” He is probably the most significant philosopher in the Islamic tradition and arguably the most influential philosopher of the pre-modern era. Born in Afshana near Bukhara in Central Asia in about 980, he is best known as a polymath, as a physician whose major work the Canon (al-Qanun fi’l-Tibb) continued to be taught as a medical textbook in Europe and in the Islamic world until the early modern period, and as a philosopher whose major summa the Cure (al-Shifa’) had a decisive impact upon European scholasticism and especially upon Thomas Aquinas (d. 1274). He was primarily a metaphysical philosopher of being, concerned with understanding the self’s existence in this world in relation to its contingency, and his philosophy is an attempt to construct a coherent and comprehensive system that accords with the religious exigencies of Muslim culture. As such, he may be considered to be the first major Islamic philosopher. The philosophical space that he articulates for God as the Necessary Existence lays the foundation for his theories of the soul, intellect and cosmos. Furthermore, he articulated a development in the philosophical enterprise in classical Islam away from the apologetic concerns for establishing the relationship between religion and philosophy towards an attempt to make philosophical sense of key religious doctrines and even analyse and interpret the Qur’an. Late 20th century studies have attempted to locate him within the Aristotelian and Neoplatonic traditions. His relationship with the latter is ambivalent: although accepting some key aspects such as an emanationist cosmology, he rejected Neoplatonic epistemology and the theory of the pre-existent soul. However, his metaphysics owes much to the “Ammonian” synthesis of the later commentators on Aristotle and discussions in legal theory and kalam on meaning, signification and being.
Apart from philosophy, Avicenna’s other contributions lie in the fields of medicine, the natural sciences, musical theory, and mathematics. In the Islamic sciences (‘ulum), he wrote a series of short commentaries on selected Qur’anic verses and chapters that reveal a trained philosopher’s hermeneutical method and attempt to come to terms with revelation. He also wrote some literary allegories about whose philosophical value 20th and 21st century scholarship is vehemently at odds.

His influence in medieval Europe spread through the translations of his works first undertaken in Spain. In the Islamic world, his impact was immediate and led to what Michot has called “la pandémie avicennienne.” When al-Ghazali led the theological attack upon the heresies of the philosophers, he singled out Avicenna, and a generation later when al-Shahrastani gave an account of the doctrines of the philosophers of Islam, he relied upon the work of Avicenna, whose metaphysics he later attempted to refute in his Struggling against the Philosophers (Musari‘at al-falasifa). Avicennan metaphysics became the foundation for discussions of Islamic philosophy and philosophical theology. In the early modern period in Iran, his metaphysical positions began to be displaced by the creative modifications that they underwent at the hands of the thinkers of the school of Isfahan, in particular Mulla Sadra (d. 1641).

Table of Contents

  1. Life and Times
  2. Works
  3. Avicenna Latinus
  4. Logic
  5. Ontology
  6. Epistemology
  7. Psychology
  8. Mysticism and Oriental Philosophy
  9. The Avicennan Tradition and His Legacy
  10. References and Further Reading
    1. The Latin Avicenna (mainly sections of al-Shifa’)
    2. Studies in Avicenna Latinus
    3. Selected Works of Avicenna Available in European Language Translation
    4. General Introductions to Avicenna and His Thought
    5. Collections and Bibliographies
    6. Interpretations
    7. Avicenna’s Oriental Philosophy
    8. Metaphysics
    9. On Psychology
    10. Existence-Essence

1. Life and Times

Sources on his life range from his autobiography, written at the behest of his disciple ‘Abd al-Wahid Juzjani, his private correspondence, including the collection of philosophical epistles exchanged with his disciples and known as al-Mubahathat (The Discussions), to legends and doxographical views embedded in the ‘histories of philosophy’ of medieval Islam such as Ibn al-Qifti’s Ta’rikh al-hukama (History of the Philosophers) and Zahir al-Din Bayhaqi’s Tatimmat Siwan al-hikma. However, much of this material ought to be carefully examined and critically evaluated. Gutas has argued that the autobiography is a literary device to represent Avicenna as a philosopher who acquired knowledge of all the philosophical sciences through study and intuition (al-hads), a cornerstone of his epistemological theory. Thus the autobiography is an attempt to demonstrate that humans can achieve the highest knowledge through intuition. The text is a key to understanding Avicenna’s view of philosophy: we are told that he only understood the purpose of Aristotle’s Metaphysics after reading al-Farabi’s short treatise on it, and that often when he failed to understand a problem or solve the syllogism, he would resort to prayer in the mosque (and drinking wine at times) to receive the inspiration to understand – the doctrine of intuition. We will return to his epistemology later but first what can we say about his life?

Avicenna was born in around 980 in Afshana, a village near Bukhara in Transoxiana. His father, who may have been Ismaili, was a local Samanid governor. When he was still young, his family moved to Bukhara, where he studied Hanafi jurisprudence (fiqh) with Isma‘il Zahid (d. 1012) and medicine with a number of teachers. This training and the excellent library of the physicians at the Samanid court assisted Avicenna in his philosophical self-education. Thus, he claimed to have mastered all the sciences by the age of 18 and entered into the service of the Samanid court of Nuh ibn Mansur (r. 976-997) as a physician. After the death of his father, it seems that he was also given an administrative post. Around the turn of the millennium, he moved to Gurganj in Khwarazm, partly no doubt owing to the eclipse of Samanid rule after the Qarakhanids took Bukhara in 999. He then left again ‘through necessity’ in 1012 for Jurjan in Khurasan to the south, in search, no doubt, of a patron. There he first met his disciple and scribe Juzjani. After a year, he entered Buyid service as a physician, first with Majd al-Dawla in Rayy and then in 1015 in Hamadan, where he became vizier of Shams al-Dawla. After the death of the latter in 1021, he once again sought a patron and became the vizier of the Kakuyid ‘Ala’ al-Dawla, for whom he wrote an important Persian summa of philosophy, the Danishnama-yi ‘Ala’i (The Book of Knowledge for ‘Ala’ al-Dawla). Based in Isfahan, he was widely recognized as a philosopher and physician and often accompanied his patron on campaign. It was during one of these campaigns, to Hamadan in 1037, that he died of colic. An arrogant thinker who did not suffer fools, he was fond of his slave-girls and wine, facts which were ammunition for his later detractors.

2. Works

Avicenna wrote his two earliest works in Bukhara under the influence of al-Farabi. The first, a Compendium on the Soul (Maqala fi’l-nafs), is a short treatise dedicated to the Samanid ruler that establishes the incorporeality of the rational soul or intellect without resorting to the Neoplatonic insistence upon its pre-existence. The second is his first major work on metaphysics, Philosophy for the Prosodist (al-Hikma al-‘Arudiya), penned for a local scholar and his first systematic attempt at Aristotelian philosophy.

He later wrote three encyclopedias of philosophy. The first of these is al-Shifa’ (The Cure), a work modelled on the corpus of the philosopher, namely Aristotle, that covers the natural sciences, logic, mathematics, metaphysics and theology. It was this work that through its Latin translation had a considerable impact on scholasticism. It was solicited by Juzjani and his other students in Hamadan in 1016 and although he lost parts of it on a military campaign, he completed it in Isfahan by 1027. The other two encyclopedias were written later for his patron the Kakuyid prince ‘Ala’ al-Dawla in Isfahan. The first, in Persian rather than Arabic, is entitled Danishnama-yi ‘Ala’i (The Book of Knowledge for ‘Ala’ al-Dawla) and is an introductory text designed for the layman. It closely follows his own Arabic epitome of The Cure, namely al-Najat (The Salvation). The Book of Knowledge was the basis of al-Ghazali’s later Arabic work Maqasid al-falasifa (Goals of the Philosophers). The second, whose dating and interpretation have inspired debates for centuries, is al-Isharat wa’l-Tanbihat (Pointers and Reminders), a work that does not present completed proofs for arguments and reflects his mature thinking on a variety of logical and metaphysical issues. According to Gutas it was written in Isfahan in the early 1030s; according to Michot, it dates from an earlier period in Hamadan and possibly Rayy. A further work entitled al-Insaf (The Judgement), which purports to represent a philosophical position that is radical and transcends Aristotelianising Neoplatonism, is unfortunately not extant, and debates about its contents are rather like the arguments that one encounters concerning Plato’s esoteric or unwritten doctrines. One further work that has inspired much debate is The Easterners (al-Mashriqiyun) or The Eastern Philosophy (al-Hikma al-Mashriqiya), which he wrote at the end of the 1020s and is mostly lost.

3. Avicenna Latinus

Avicenna’s major work, The Cure, was translated into Latin in 12th and 13th century Spain (Toledo and Burgos) and, although it was controversial, it had an important impact in medieval scholastic philosophy. In certain cases the Latin manuscripts of the text predate the extant Arabic ones and ought to be considered more authoritative. The main significance of the Latin corpus lies in the interpretation of Avicennism, in particular regarding his doctrines on the nature of the soul and his famous existence-essence distinction (more about that below), along with the debates and censure that they raised in scholastic Europe. This was particularly the case in Paris, where Avicennism was later proscribed in 1210. However, the influence of his psychology and theory of knowledge upon William of Auvergne and Albertus Magnus has been noted. More significant is the impact of his metaphysics upon the work and thought of Thomas Aquinas. His other major work to be translated into Latin was his medical treatise the Canon, which remained a textbook into the early modern period and was studied in centers of medical learning such as Padua.

4. Logic

Logic is a critical aspect of, and propaedeutic to, Avicennan philosophy. His logical works follow the curriculum of late Neoplatonism and comprise nine books, beginning with his version of Porphyry’s Isagoge followed by his understanding and modification of the Aristotelian Organon, which included the Poetics and the Rhetoric. On the age-old debate whether logic is an instrument of philosophy (Peripatetic view) or a part of philosophy (Stoic view), he argues that such a debate is futile and meaningless.

His views on logic represent a significant metaphysical approach, and it could be argued generally that metaphysical concerns lead Avicenna’s arguments in a range of philosophical and non-philosophical subjects. For example, he argues in The Cure that both logic and metaphysics share a concern with the study of secondary intelligibles (ma‘qulat thaniya), abstract concepts such as existence and time that are derived from primary concepts such as humanity and animality. Logic is the standard by which concepts (or the mental “existence” that corresponds to things that occur in extra-mental reality) can be judged, and hence it has implications both for what exists outside of the mind and for how one may articulate those concepts through language. More importantly, logic is a key instrument and standard for judging the validity of arguments and hence acquiring knowledge. Salvation depends on the purity of the soul and in particular the intellect that is trained and perfected through knowledge. Of particular significance for later debates and refutations is his notion that knowledge depends on the inquiry of essential definitions (hadd) through syllogistic reasoning. The problem of course arises when one tries to make sense of an essential definition in a real, particular world, and when one’s attempt to complete the syllogism by striking on the middle term is foiled because one’s ‘intuition’ fails to grasp the middle term.

5. Ontology

From al-Farabi, Avicenna inherited the Neoplatonic emanationist scheme of existence. Contrary to the classical Muslim theologians, he rejected creation ex nihilo and argued that the cosmos has no beginning but is a natural logical product of the divine One. The super-abundant, pure Good that is the One cannot fail to produce an ordered and good cosmos that does not succeed him in time. The cosmos succeeds God merely in logical order and in existence.

Consequently, Avicenna is well known as the author of an important and influential proof for the existence of God. This proof is a good example of a philosopher’s intellect being deployed for a theological purpose, as was common in medieval philosophy. The argument runs as follows: There is existence, or rather our phenomenal experience of the world confirms that things exist, and that their existence is non-necessary because we notice that things come into existence and pass out of it. Contingent existence cannot arise unless it is made necessary by a cause. A causal chain in reality must culminate in one un-caused cause because one cannot posit an actual infinite regress of causes (a basic axiom of Aristotelian science). Therefore, the chain of contingent existents must culminate in and find its causal principle in a sole, self-subsistent existent that is Necessary. This, of course, is the same as the God of religion.

An important corollary of this argument is Avicenna’s famous distinction between existence and essence in contingents, between the fact that something exists and what it is. It is a distinction that is arguably latent in Aristotle although the roots of Avicenna’s doctrine are best understood in classical Islamic theology or kalam. Avicenna’s theory of essence posits three modalities: essences can exist in the external world associated with qualities and features particular to that reality; they can exist in the mind as concepts associated with qualities in mental existence; and they can exist in themselves devoid of any mode of existence. This final mode of essence is quite distinct from existence. Essences are thus existentially neutral in themselves. Existents in this world exist as something, whether human, animal or inanimate object; they are ‘dressed’ in the form of some essence that is a bundle of properties that describes them as composites. God on the other hand is absolutely simple, and cannot be divided into a bundle of distinct ontological properties that would violate his unity. Contingents, as a mark of their contingency, are conceptual and ontological composites both at the first level of existence and essence and at the second level of properties. Contingent things in this world come to be as mentally distinct composites of existence and essence bestowed by the Necessary.

This proof from contingency is also sometimes termed “radical contingency.” Later arguments raged concerning whether the distinction was mental or real, whether the proof is ontological or cosmological. The clearest problem with Avicenna’s proofs lies in the famous Kantian objection to ontological arguments: is existence meaningful in itself? Further, Cantor’s solution to the problem of infinity may also be seen as a setback to the argument from the impossibility of actual infinites.

Avicenna’s metaphysics is generally expressed in Aristotelian terms. The quest to understand being qua being subsumes the philosophical notion of God. Indeed, as we have seen, divine existence is a cornerstone of his metaphysics. Divine existence bestows existence and hence meaning and value upon all that exists. Two questions that were current were resolved through his theory of existence. First, theologians such as al-Ash‘ari and his followers were adamant in denying the possibility of secondary causality; for them, God was the sole agent and actor in all that unfolded. Avicenna’s metaphysics, although highly deterministic because of his view of radical contingency, still insists on the importance of human and other secondary causality. Second, the age-old problem was discussed: if God is good, how can evil exist? Divine providence ensures that the world is the best of all possible worlds, arranged in the rational order that one would expect of a creator akin to the demiurge of the Timaeus. But while this does not deny the existence of evil in this world of generation and corruption, some universal evil does not exist because of the famous Neoplatonic definition of evil as the absence of good. Particular evils in this world are accidental consequences of good. Although this deals with the problem of natural evils, the problem of moral evils and particularly ‘horrendous’ evils remains.

6. Epistemology

The second most influential idea of Avicenna is his theory of knowledge. The human intellect at birth is rather like a tabula rasa, a pure potentiality that is actualized through education and comes to know. Knowledge is attained through empirical familiarity with objects in this world from which one abstracts universal concepts. It is developed through a syllogistic method of reasoning; observations lead to propositional statements, which when compounded lead to further abstract concepts. The intellect itself possesses levels of development from the material intellect (al-‘aql al-hayulani), that potentiality that can acquire knowledge, to the active intellect (al-‘aql al-fa‘il), the state of the human intellect at conjunction with the perfect source of knowledge.

But the question arises: how can we verify if a proposition is true? How do we know that an experience of ours is veridical? There are two methods to achieve this. First, there are the standards of formal inference of arguments: is the argument logically sound? Second, and most importantly, there is a transcendent intellect in which all the essences of things and all knowledge reside. This intellect, known as the Active Intellect, illuminates the human intellect through conjunction and bestows upon the human intellect true knowledge of things. Conjunction, however, is episodic and only occurs to human intellects that have become adequately trained and thereby actualized. The active intellect also intervenes in the assessment of sound inferences through Avicenna’s theory of intuition. A syllogistic inference draws a conclusion from two propositional premises through their connection or their middle term. It is sometimes rather difficult to see what the middle term is; thus when someone reflecting upon an inferential problem suddenly hits upon the middle term, and thus understands the correct result, she has been helped through intuition (hads) inspired by the active intellect. There are various objections that can be raised against this theory, especially because it is predicated upon a cosmology widely refuted in the post-Copernican world.

One of the most problematic implications of Avicennan epistemology relates to God’s knowledge. The divine is pure, simple and immaterial and hence cannot have a direct epistemic relation with the particular thing to be known. Thus Avicenna concluded that while God knows what unfolds in this world, he knows things in a ‘universal manner’ through the universal qualities of things. God only knows kinds of existents and not individuals. This resulted in the famous condemnation by al-Ghazali, who said that Avicenna’s theory amounts to a heretical denial of God’s knowledge of particulars.

7. Psychology

Avicenna’s epistemology is predicated upon a theory of soul that is independent of the body and capable of abstraction. His proof for the self in many ways prefigures by 600 years the Cartesian cogito and the modern philosophical notion of the self. It demonstrates the Aristotelian base and Neoplatonic structure of his psychology. This is the so-called ‘flying man’ argument or thought experiment found at the beginning of his Fi’l-Nafs/De Anima (Treatise on the Soul). If a person were created in a perfect state, but blind and suspended in the air, unable to perceive anything through his senses, would he be able to affirm the existence of his self? Suspended in such a state, he cannot affirm the existence of his body because he is not empirically aware of it; thus the argument may be seen as affirming the independence of the soul from the body, a form of dualism. But in that state he cannot doubt that his self exists because there is a subject that is thinking; thus the argument can be seen as an affirmation of the self-awareness of the soul and its substantiality. This argument does raise an objection, which may also be levelled at Descartes: how do we know that the knowing subject is the self?

This rational self possesses faculties or senses in a theory that begins with Aristotle and develops through Neoplatonism. The first sense is common sense (al-hiss al-mushtarak) which fuses information from the physical senses into an epistemic object. The second sense is imagination (al-khayal) which processes the image of the perceived epistemic object. The third sense is the imaginative faculty (al-mutakhayyila) which combines images in memory, separates them and produces new images. The fourth sense is estimation or prehension (wahm) that translates the perceived image into its significance. The classic example for this innovative sense is that of the sheep perceiving the wolf and understanding the implicit danger. The final sense is where the ideas produced are stored and analyzed and ascribed meanings based upon the production of the imaginative faculty and estimation. Different faculties do not compromise the singular integrity of the rational soul. They merely provide an explanation for the process of intellection.

8. Mysticism and Oriental Philosophy

Was Avicenna a mystic? Some of his interpreters in Iran have answered in the positive, citing the lost work The Easterners, which on the face of it has a superficial similarity to the notion of Ishraqi or Illuminationist, intuitive philosophy expounded by Suhrawardi (d. 1191), and the final section of Pointers, which deals with the terminology of mysticism and Sufism. The question does not directly impinge on his philosophy so much, since The Easterners is mostly non-extant. But it is an argument relating to ideology and the ways in which modern commentators and scholars wish to study Islamic philosophy as a purely rational form of inquiry or as a supra-rational method of understanding reality. Gutas has been most vehement in his denial of any mysticism in Avicenna. For him, Avicennism is rooted in the rationalism of the Aristotelian tradition. Intuition does not entail mystical disclosure but is a mental act of conjunction with the active intellect. Gutas locates the notion of intuition itself in Aristotle’s Posterior Analytics 89b10-11. While some of the mystical commentators of Avicenna have relied upon his pseudo-epigraphy (such as some Persian Sufi treatises and the Mi‘rajnama), one ought not to throw the baby out with the bath water. The last sections of Pointers are significant evidence of Avicenna’s acceptance of some key epistemological possibilities that are present in mystical knowledge such as the possibility of non-discursive reason and simple knowledge. Although one can categorically deny that he was a Sufi (and indeed in his time the institutions of Sufism were not as established as they were a century later) and even raise questions about his adherence to some form of mysticism, it would be foolish to deny that he flirts with the possibilities of mystical knowledge in some of his later authentic works.

9. The Avicennan Tradition and His Legacy

Avicenna’s major achievement was to propound a philosophically defensible system rooted in the theological fact of Islam, and its success can be gauged by the recourse to Avicennan ideas found in the subsequent history of philosophical theology in Islam. In the Latin West, his metaphysics and theory of the soul had a profound influence on scholastic arguments, and as in the Islamic East, were the basis for considerable debate and argument. Just two generations after him, al-Ghazali (d. 1111) and al-Shahrastani (d. 1153) in their attacks testify to the fact that no serious Muslim thinker could ignore him. They regarded Avicenna as the principal representative of philosophy in Islam. In the later Iranian tradition, Avicenna’s thought was critically distilled with mystical insight, and he became known as a mystical thinker, a view much disputed in late 20th and early 21st century scholarship. Nevertheless the major works of Avicenna, especially The Cure and Pointers, became the basis for the philosophical curriculum in the madrasa. Numerous commentaries, glosses and super-glosses were composed on them and continued to be produced into the 20th century. While our current views on cosmology, on the nature of the self, and on knowledge raise distinct problems for Avicennan ideas, they do not address the important issue of why his thought remained so influential for such a long period of time. In the 20th and 21st centuries, Avicenna has been attacked by some contemporary Arab Muslim thinkers in search of a new rationalism within Arab culture, one that champions Averroes against Avicenna.

10. References and Further Reading

a. The Latin Avicenna (mainly sections of al-Shifa’)

  • Liber de anima seu sextus de naturalibus I-III. ed. Simone van Riet, Leiden, 1972.
  • Liber de philosophia prima sive scientia divina I-IV. ed. Simone van Riet, Leiden, 1977.
  • Liber de philosophia prima sive scientia divina V-X. ed. Simone van Riet, Leiden, 1980.
  • Liber primus naturalium: Tractatus primus de causis et principiis naturalium. ed. Simone van Riet, Leiden, 1992.
  • Liber quartus naturalium de actionibus et passionibus qualitatum primarum. ed. Simone van Riet, Leiden, 1989.

b. Studies in Avicenna Latinus

  • Charles Burnett and Anna Contadini (eds), Islam and the Italian Renaissance, Warburg Institute, 1999.
  • N. G. Siraisi, Avicenna in Renaissance Italy: The Canon and Medical Teaching in Italian Universities after 1500, Princeton, 1987.
  • Dag Hasse, Avicenna’s De Anima in the Latin West, London, 2000.
    • A study of the impact of Avicennan psychology upon the scholastics focusing on five key issues.

c. Selected Works of Avicenna Available in European Language Translation

  • Epistola sulla vita futura (Risalat al-Adhawiyya fi’l-ma’ad), tr. F. Lucchetta, Padua, 1969.
    • Compare it with this useful and critical commentary by the theologian Ibn Taymiyya (d. 1328) – Yahya Michot, ‘A Mamluk theologian’s commentary on Avicenna’s Risala Adhawiyya’, Journal of Islamic Studies 14 (2003), 149-203, 309-63.
  • The Life of Ibn Sina, tr. William Gohlman, Albany, 1974.
  • Avicenna’s De Anima (Fi’l-Nafs), tr. F. Rahman, London, 1954.
  • Livre de directives et remarques (al-Isharat wa’l-Tanbihat), tr. Anne-Marie Goichon, 2 vols., Paris, 1951.
  • Remarks and Admonitions Part One: Logic (al-Isharat wa’l-Tanbihat: mantiq), tr. Shams Inati, Toronto, 1984.
  • La Métaphysique du Shifa’ I-IV et V-X, tr. G. Anawati, Paris, 1978-86.
  • Le livre de science (Danishnama-yi ‘Ala’i) I: Logique, Métaphysique II: science naturelle, mathématique, trs. M. Achena and Henri Massé, Paris, 1986.
  • Ibn Sina on Mysticism (al-Isharat wa’l-Tanbihat namat IX), tr. Shams Inati, London, 1998.
  • The Metaphysica of Avicenna (Ilahiyyat-i Danishnama-yi ‘Ala’i), tr. Parviz Morewedge, New York, 1972; rpt., Binghamton, 2003.
  • Lettre au Vizier Abu Sa’d, ed./tr. Yahya Michot, Paris, 2000.
  • The Metaphysics of Avicenna (al-Ilahiyyat min Kitab al-Shifa’), ed./tr. Michael Marmura, Provo, 2004.

d. General Introductions to Avicenna and His Thought

  • Cruz Hernández, Miguel. La vida de Avicena. Salamanca, 1997.
    • A short and accessible intellectual biography written by perhaps the foremost Spanish historian of Islamic philosophy.
  • Goichon, Anne-Marie. Lexique de la langue philosophique d’Avicenne. Paris, 1938.
    • A pioneering work which remains a highly useful research tool.
  • Goodman, Lenn. Avicenna. London, 1992.
    • Although an attempt by a contemporary philosopher to come to grips with the enduring contributions of Avicenna to philosophy, it suffers from some serious textual misreadings.
  • Gutas, Dimitri. Avicenna and the Aristotelian Tradition. Leiden/Boston, 1988.
    • A solid work of scholarship that discusses Avicenna’s corpus and thought within a paradigm of Islamic Aristotelianism.
  • Nasr, Seyyed Hossein. Three Muslim Sages. Cambridge, 1966.
    • An old and contentious presentation of Avicenna as a polymath rooted in the mystical experience of God.
  • Sebti, Miriam. Avicenne. Paris, 2003.
    • An interpretation from a continental philosophical approach.
  • Street, Tony. Avicenna. Cambridge, 2005.
    • A solid presentation of the key ideas based on the most up-to-date research.

e. Collections and Bibliographies

  • Special Issue of Documenti e studi sulla tradizione filosofica medievale. Padua, 8 (1997) on Avicenna.
  • Special Issue of Arabic Sciences and Philosophy. Cambridge, 10 (2000) on Avicenna.
  • Anawati, G. C. Essai de bibliographie avicennienne. Cairo, 1950.
  • Various Authors, ‘Avicenna’, Encyclopaedia Iranica. New York, II, 66-110.
  • Janssens, Jules. Bibliography of Works on Ibn Sina, 2 vols. Leiden, 1991-99.
  • Janssens, Jules and Daniel de Smet (eds). Avicenna and His Heritage. Leuven, 2001.
    • Proceedings from a 1999 conference that brought together specialists on the Arabic and the Latin Avicenna and their legacies.
  • Rashed, Roshdi and Jean Jolivet (eds), Etudes sur Avicenne, Paris, 1984.
    • An excellent collection that includes insightful pieces on Avicennan physics and metaphysics.
  • David Reisman and Ahmed al-Rahim (eds), Before and After Avicenna, Leiden/Boston, 2003.
    • The proceedings of the First Conference of the Avicenna Research Group (based at Yale).
  • Robert Wisnovsky (ed), Aspects of Avicenna (Princeton Papers: Interdisciplinary Journal of Middle East Studies, 9), Princeton, 2001.
    • Includes two good pieces on Avicennan psychology.

f. Interpretations

  • Arberry, Arthur J. Avicenna on Theology. London, 1954.
    • Includes translations of texts and raises the interesting question of what is ‘Islamic’ about Avicenna’s ‘Islamic philosophy’.
  • Corbin, Henry. Avicenna and the Visionary Recital, Princeton, 1961.
    • An influential and controversial interpretation of Avicenna through the lens of the later Iranian tradition portraying him as a mystic.
  • Gardet, Louis. La pensée religieuse d’Avicenne, Paris, 1951.
  • Heath, Peter. Allegory and Philosophy in Avicenna, Philadelphia, 1992.
    • An interesting approach to allegory that draws on Corbin and suffers from the assumption that the famous pseudo-Avicennan work the Mi’rajnama is authentic.
  • Lüling, G. ‘Ein anderer Avicenna’, Zeitschrift der Deutschen Morgenländischen Gesellschaft Suppl III.1 (1977), 496-513.
  • Marmura, Michael. ‘Avicenna and the kalam’, Zeitschrift für Geschichte der arabisch-islamischen Wissenschaften (Frankfurt) 7 (1991-2), 172-206.
    • Considers Avicenna’s debt to the metaphysics of kalam.
  • Marmura, Michael. ‘Plotting the course of Avicenna’s thought’, Journal of the American Oriental Society 111 (1991), 333-42.
    • A critical assessment of Gutas’s 1988 work.
  • Michot, Yahya. ‘La pandémie avicennienne’, Arabica (Paris) 40 (1993), 287-344.
    • On the widespread hegemony of Avicennan philosophy in Islamic thought from the 12th Century.
  • Thom, Paul. Medieval Modal Systems, London, 2004.
    • The best study of Avicenna’s modal logic and his contributions to the field.

g. Avicenna’s Oriental Philosophy

  • Cruz Hernández, Miguel. ‘El problema de la “auténtica” filosofía de Avicena’, Revista de Filosofía 5 (1992), 235-56.
  • Gutas, Dimitri. ‘Avicenna’s Eastern (“Oriental”) Philosophy’, Arabic Sciences and Philosophy 10 (2000), 159-80.
  • Nasr, Seyyed Hossein. ‘Ibn Sina’s Oriental Philosophy’, in S. H. Nasr and Oliver Leaman (eds), History of Islamic Philosophy, London/New York, 1996, I, 247-51.
    • A classic restatement of Nasr’s mystical understanding of Avicenna.
  • Pines, Shlomo. ‘La philosophie orientale d’Avicenne’, in The Collected Works of Shlomo Pines Volume III, Jerusalem, 1996, 301-33.
    • Interprets ‘oriental’ to signify an Eastern alternative Peripateticism.

h. Metaphysics

  • Robert Wisnovsky, Avicenna’s Metaphysics in Context, London, 2003.
    • An excellent study that locates the origins of Avicennan thought in what he calls the ‘Ammonian synthesis’ in Late Antiquity and then explains the development of Avicennan metaphysics.

i. On Psychology

  • Helmut Gätje, Studien zur Überlieferung der aristotelischen Psychologie im Islam, Heidelberg, 1971.
    • A pioneering study of the key aspects of Aristotelian(ising) psychological theories in Islamic philosophy focusing on Avicenna.
  • Dag Hasse, Avicenna’s De Anima in the Latin West, London, 2000.
    • A study of the impact of Avicennan psychology upon the scholastics focusing on five key issues.
  • Michot, Jean R. La destinée de l’homme selon Avicenne, Brussels, 1986.
    • A key investigation of Avicennan psychology as a quest for an Islamic answer to the problem of the soul’s journey beyond this life and the persistence of personal identity.
  • Rahman, Fazlur. Avicenna’s Psychology, London, 1952.
    • A study that includes a translation of Avicenna’s De Anima.

j. Existence-Essence

  • Goichon, Anne-Marie. La distinction de l’essence et l’existence d’après ibn Sina (Avicenne), Paris, 1937.
  • Mayer, Toby. ‘Ibn Sina’s Burhan al-Siddiqin’, Journal of Islamic Studies 12 (2001), 18-39.
  • Parviz Morewedge, ‘Philosophical analysis of Ibn Sina’s essence-existence distinction’, Journal of the American Oriental Society 92 (1972), 425-35.
  • Rahman, Fazlur. ‘Essence and existence in Avicenna’, Mediaeval Studies (Toronto) 4 (1958), 1-16.
  • Rahman, Fazlur. ‘Essence and existence in Ibn Sina: the myth and the reality’, Hamdard Islamicus (Karachi) 4 (1981), 3-14.
  • Rizvi, Sajjad. ‘Roots of an aporia in later Islamic philosophy: the existence-essence distinction in the philosophies of Avicenna and Suhrawardi’, Studia Iranica (Paris) 29 (2000), 61-108.

Author Information

Sajjad H. Rizvi
Email: Sajjad.Rizvi@bristol.ac.uk
University of Bristol
United Kingdom

Antisthenes (c. 446–366 B.C.E.)

Known in antiquity as an accomplished orator, a companion of Socrates, and a philosopher, Antisthenes presently gains renown from his status as either a founder or a forerunner of Cynicism. He was the teacher to Diogenes of Sinope, and he is regarded by Diogenes Laertius as the first Cynic philosopher. He is credited with the authorship of over sixty titles, appears as one of the primary interlocutors in Xenophon’s Memorabilia and Symposium, and is mentioned as one of those present at Socrates’ death by Plato, with whom it seems he had a falling out. Antisthenes’ philosophical interests engage ethics rather than metaphysics or epistemology, and he advocates the practice of virtue through an ascetic life and the cultivation of wisdom. Like Socrates before him, Antisthenes adheres to ethical intellectualism, and like the Stoics who follow the Cynics, he claims that virtue is sufficient for happiness.

Table of Contents

  1. Life and Works
  2. Basic Tenets
  3. Philosophical Influence
  4. References and Further Reading

1. Life and Works

It is primarily through Xenophon’s dialogues and Diogenes Laertius’ Lives of Eminent Philosophers that certain aspects of Antisthenes’ life and thought are known. These sources are not, however, without problems: Xenophon is portraying Antisthenes as an interlocutor, which leads some scholars to question whether this character is in fact representative of the historical Antisthenes; Diogenes Laertius is thought of as a dubious source due to his penchant for recounting contradictory stories from multiple sources. Though each source is questionable independently, when they are treated in conjunction they provide a sketch of Antisthenes as both a Socratic and a Cynic thinker.

Born probably in either 446 or 445 B.C.E. of an Athenian father, also named Antisthenes, and a Thracian mother, Antisthenes was a nothos, which means literally someone born of an illegitimate union (due to being born from a slave, foreigner, or prostitute, or because one’s parents were citizens but not legally married) and therefore was not an Athenian citizen. Initially he was a pupil of Gorgias the rhetorician, and the rhetorical sounding titles that are ascribed to him by Diogenes Laertius almost certainly derive from this first phase of his career. In fact, of his prolific literary corpus, only his Ajax and Odysseus are extant, and both offer a demonstration of his rhetorical training under Gorgias.

After meeting Socrates and deriving great benefit from him, Antisthenes abandoned his study of rhetoric for philosophy and even encouraged his own pupils to join him under Socrates’ tutelage. His close friendship with Socrates is well documented in Xenophon’s dialogues, and his importance would have been aided by his position as an older and esteemed member of Socrates’ circle. In the years immediately following Socrates’ death, then, it is likely that Antisthenes was regarded as Socrates’ most important follower (see Kahn 4-5).

What little is known about Antisthenes’ life is marked by both his asceticism and humor. It is claimed that he was the first to double his cloak in order to sleep in it, and recommended this to Diogenes of Sinope (though Diogenes of Sinope is also claimed to be the first to do so) and that, in addition, he was equipped with those elements that would later be distinctive of the Cynics: the wallet and the staff. He chose to live in poverty, and more than one of the surviving anecdotes surrounds the ragged state of his cloak, usually involving those areas where the cloak is torn. In addition to eschewing luxuries so many of his fellow Athenians sought, he demonstrated an ad hoc and improvisational sense of humor which allowed him to ridicule commonly held beliefs and the mores of Athenian culture, a practice which would be perfected by Diogenes of Sinope.

2. Basic Tenets

Xenophon’s treatment of Antisthenes combines well with the details Diogenes Laertius provides of his philosophical position at 6.10-12. Though the list of his “favorite themes” is lengthy, it represents the central aspects of his ethical thought. In sum, the basic tenets are:

  1. Virtue can be taught.
  2. Only the virtuous are noble.
  3. Virtue is itself sufficient for happiness, since it requires “nothing else except the strength of a Socrates” (D.L. 6.11).
  4. Virtue is tied to deeds and actions, and does not require a great deal of words or learning.
  5. The wise person is self-sufficient.
  6. Having a poor reputation is something good, and is like physical hardship.
  7. The law of virtue rather than the laws established by the polis will determine the public acts of one who is wise.
  8. The wise person will marry in order to have children with the best women.
  9. The wise person knows who are worthy of love, and so does not disdain to love.

These themes, revolving as they do around virtue and the activity of the wise man, bear an unmistakable resemblance to Socrates’ convictions. The teachability of virtue, the emphasis on deeds over words, and the prominence of erōs are all explicitly found in Socratic literature. Furthermore, according to Diocles, Antisthenes held virtue to be the same for men as for women, a position that is echoed, if in a more inchoate form, in Socratic thought.

Antisthenes’ ethical views also, however, represent an innovation, and do not merely repeat those held by Socrates. First, the unambiguous statement of virtue as sufficient for happiness is a shift from Socrates’ hedging on this matter. Virtue and happiness are completely coincident and open to all. Second, he begins to separate morality and legality in a way that Socrates apparently did not. In Plato’s Crito, Socrates is clear that one is morally obliged to abide by the laws of one’s state, unless one can convince the state to change the laws. The Cynics show no such regard for nomos, a term which means both law and convention, whether it is in relation to cultural codes or legal regulations. By loosening law and virtue Antisthenes sets the stage for the more radical positions of Diogenes of Sinope and Crates.

Antisthenes takes a stronger position than did Socrates on the abstention from physical pleasures, claiming that he would prefer madness to pleasure (D.L. 6.3). The pursuit of pleasure is dangerous insofar as it can recommend precarious activities (as is recounted in the story of an adulterer fleeing for his life who Antisthenes claims could have escaped peril “at the price of an obol”), but more importantly, its effect on self-sufficiency is ruinous. One can become enslaved to pleasure and so lose all hope of being truly free. For this reason “When someone extolled luxury his reply was, ‘May the sons of your enemies live in luxury’” (D.L. 6.8).

Finally, he is much more obviously anti-theoretical than Socrates. Whereas Socrates claims to know nothing of theoretical philosophy, Antisthenes suggests that it is useless. Though the terms are not yet coined, the distinction is between metaphysics and ethics, and Antisthenes focuses upon the latter only. His privileging of practice over learning, or deeds over words, is clearly anti-theoretical, but it should not be viewed as opposed to reason. Reason, for Antisthenes, is the foundation of virtue. “Wisdom is a most sure stronghold which never crumbles away nor is betrayed. Walls of defense must be constructed in our own impregnable reasonings” (D.L. 6.13). Antisthenes’ caution against pleasure, his praise of poverty, and his privileging of reason will be palpable in the Cynics who follow him and in the Stoic cultivation of indifference.

3. Philosophical Influence

Antisthenes’ influence is primarily upon the “school” of Cynicism, both as a precursor and originator. Antisthenes’ life and thought provide a connection between Socrates and the Cynics. Diogenes Laertius makes just this point: “From Socrates he learned his hardihood, emulating his disregard of feeling, and thus he inaugurated the Cynic way of life” (D.L. 6.2). Some scholars are more dubious. Dudley, for example, claims that Antisthenes was a follower of Socrates, and nothing more. The attribution of “first Cynic” to Antisthenes is, on Dudley’s account, merely an invention of the Alexandrian writers of Successions meant to give the Stoic school the proper Socratic pedigree.

Branham and Goulet-Cazé propose that Antisthenes be considered a “forerunner” (The Cynics 7), and Navia claims that “in both Antisthenes and Diogenes we come upon one reaction to the problem of human existence, and one radical solution… for Cynicism emerged among the Greeks from both, as if from twin sources” (Classical Cynicism 67). The subtler approaches of Branham, Goulet-Cazé, and Navia grasp the impossibility of resolving the debate. The sources of antiquity have combined the tradition of Diogenes with that of Antisthenes. Thus, the Cynic movement is viewed as having begun with the Socratic ethical practices of Antisthenes, practices which receive their more robust instantiations through the life of Diogenes of Sinope.

The claim that Antisthenes had no connection to the Cynics is, given Antisthenes’ unique ethical position, tenuous. Antisthenes endorses the Socratic position, but contributes his own understanding of virtue and his insistence upon the importance of askēsis. His asceticism is comparable to that of Socrates, but his animosity toward pleasure and his pride in his poverty more closely resemble the position of later Cynics. Finally, the privileging of virtue and the claim that virtue is itself sufficient for happiness will be central to Stoic ethics. “Antisthenes gave the impulse to the indifference of Diogenes, the continence of Crates, and the hardihood of Zeno, himself laying the foundations of their state” (D.L. 6.15).

4. References and Further Reading

  • Billerbeck, Margarethe. Die Kyniker in der modernen Forschung. Amsterdam: B.R. Grüner, 1991.
  • Branham, Bracht and Marie-Odile Goulet-Cazé, eds. The Cynics: The Cynic Movement in Antiquity and Its Legacy. Berkeley: University of California Press, 1996.
  • Dudley, D. R. A History of Cynicism from Diogenes to the 6th Century A.D. Cambridge: Cambridge University Press, 1937.
  • Goulet-Cazé, Marie-Odile and Richard Goulet, eds. Le Cynisme ancien et ses prolongements. Paris: Presses Universitaires de France, 1993.
  • Kahn, Charles H. Plato and the Socratic Dialogue: The Philosophical Use of a Literary Form. Cambridge: Cambridge University Press, 1996.
  • Diogenes Laertius. Lives of Eminent Philosophers Vol. I-II. Trans. R.D. Hicks. Cambridge: Harvard University Press, 1979.
  • Long, A.A. and David N. Sedley, eds. The Hellenistic Philosophers, Volume 1 and Volume 2. Cambridge: Cambridge University Press, 1987.
  • Malherbe, Abraham J., ed. and trans. The Cynic Epistles. Missoula, Montana: Scholars Press, 1977.
  • Navia, Luis E. Classical Cynicism: A Critical Study. Westport, Connecticut: Greenwood Press, 1996.
  • Navia, Luis E. Antisthenes of Athens. Westport, Connecticut: Greenwood Press, 2001.
  • Paquet, Léonce. Les Cyniques grecs: fragments et témoignages. Ottawa: Presses de l’Université d’Ottawa, 1988.

Author Information

Julie Piering
Email: japiering@ualr.edu
University of Arkansas at Little Rock
U. S. A.

Political Philosophy of Alasdair MacIntyre

This article focuses on Alasdair MacIntyre’s contribution to political philosophy since 1981, although MacIntyre has also written influential works on theology, Marxism, rationality, metaphysics, ethics, and the history of philosophy. He has made a personal intellectual journey from Marxism to Catholicism and from Aristotle to Aquinas, and he is one of the preeminent Thomist political philosophers. The most consistent and most distinctive feature of MacIntyre’s work is his antipathy to the modern liberal capitalist world. He believes that modern philosophy and modern life are characterized by the absence of any coherent moral code, and that the vast majority of individuals living in this world lack a meaningful sense of purpose in their lives and also lack any genuine community. He draws on the ideal of the Greek polis and Aristotle’s philosophy to propose a different way of life in which people work together in genuinely political communities to acquire the virtues and fulfill their innately human purpose. This way of life is to be sustained in small communities which are to resist as best they can the destructive forces of liberal capitalism.

It is important to keep in mind that MacIntyre is not suggesting that we should merely tinker around the edges of liberal capitalist society; his goal is to fundamentally transform it. He does not believe that this will happen quickly or easily, and indeed it may not happen at all, but he believes that it will be a disaster for humanity if it does not happen. After Virtue famously closes with a warning about “the new dark ages which are already upon us” (After Virtue 263). It is also important to keep in mind that even if, after careful consideration, you do not agree with MacIntyre’s proposed solution, or you do not believe that it has any chance of actually coming about, it may still be that MacIntyre’s critique of the modern world is at least partially correct. MacIntyre is well aware that most of us who have been brought up in the liberal capitalist world see our world’s ideas and institutions as natural and desirable – not perfect, but fundamentally sound – and so we will not easily be persuaded that it is in fact inherently deeply flawed and profoundly unhealthy. But an openness to that possibility is essential to understanding MacIntyre.

Table of Contents

  1. Introduction
  2. Philosophy and Society
  3. The Current Moral Disorder and Its Consequences
  4. The Absence of Meaningful Moral Choices
  5. Emotivism and Manipulative Social Relations
  6. The Concept of a Practice and the Origin of the Virtues
  7. Politics in a World without Morality
  8. The Greek Way of Life
  9. Heroic Society and Homer
  10. The Athenian Polis and Aristotle
  11. Our Human Nature: Dependent Rational Animals and Human Virtues
  12. A New Politics
  13. A New Economics
  14. Conclusion
  15. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Introduction

Alasdair MacIntyre was born in 1929, in Glasgow, Scotland. He holds MA degrees from the University of Manchester and University College at Oxford, and taught at several institutions in the United Kingdom before moving to the United States in 1970. Since then he has taught at several American institutions, and he currently holds a position at the University of Notre Dame.

His first publication, “Analogy in Metaphysics,” appeared in 1950 when he was 21 years old. His first book, Marxism: An Interpretation, followed in 1953. Since then, he has written or edited nearly twenty books and hundreds of articles and book reviews on a wide range of subjects, including theology, Marxism, the nature of rationality, metaphysics, and the history of philosophy and ethics. For references that deal with his contributions to fields other than political philosophy, and for more detailed biographical information, see the References and Further Reading.

This essay concentrates on MacIntyre’s contributions to political philosophy and is primarily concerned with his best known work, After Virtue, which was originally published in 1981. A second edition of After Virtue was published in 1984; it included a postscript in which MacIntyre responded to a number of criticisms of the original edition. It is this second edition that will be cited below. The three main works which followed After Virtue expand on, clarify, or revise the arguments found there. These are Whose Justice? Which Rationality? (1988), Three Rival Versions of Moral Enquiry (1990), and Dependent Rational Animals (1999). The last is of particular importance for understanding the practical consequences of MacIntyre’s political philosophy. It is also likely to be the easiest of the three for the beginning student of MacIntyre’s work to read and understand. A useful source for MacIntyre’s thought is The MacIntyre Reader (1998), edited by Kelvin Knight, which brings together a number of MacIntyre’s shorter works going back to the 1950s, a pair of interviews with MacIntyre, excerpts from After Virtue and Whose Justice? Which Rationality?, and two thoughtful essays by Knight.

In the first of those essays, Knight claims that “MacIntyre’s politics may now, to an extent, be described in terms of resistance” (The MacIntyre Reader 23; see also Breen 2002 and McMylor 1994). Knight is certainly right about this. MacIntyre is trying to resist and transform essentially the entire modern world. His definition of “modern” stretches back roughly 350 years to the Enlightenment, which he considers to have been a mistake (After Virtue 118 and Chapters 4-6; see also Whose Justice? Which Rationality? Chapter 1), but in this article the term “modern” will mean the contemporary twentieth- and twenty-first-century world.

MacIntyre wants to overthrow the liberal capitalist ideology that currently dominates the world both in the realm of ideas and in its manifestations in political and social institutions and actions. He seeks to achieve this not through the use of force but by changing how people think about, understand, and act in the world. To show that the changes he wants are possible and desirable, he returns to an older conception of morality, derived from the teachings of St. Thomas Aquinas and ultimately, through Aquinas, the philosophy of Aristotle and the way of life of the Athenian polis. He portrays this older conception of morality as both superior to and fundamentally hostile to the modern order, and his philosophical arguments are meant to help restore it to the world. On the other hand, he understands that liberal capitalism has tremendous power and appeal both in the world of ideas and in the power it has over people in the social, political, and economic spheres. Ultimately his recommendation is that the particular conditions of the modern world require that those who agree with his arguments should, to the greatest possible degree, withdraw from the world into communities where the old morality can be kept alive until the time is right for it to re-emerge.

This article begins by describing the modern world as MacIntyre sees it, and then moves on to MacIntyre’s depiction of what he believes to be the very different world of the ancient Greeks, and specifically the ancient Athenians. Next, it contrasts the two and shows why MacIntyre believes the ancient world to be superior. The conclusion examines MacIntyre’s suggested alternative to the modern world, which draws on the ancient world without simply proposing a return to it.

2. Philosophy and Society

As we work through MacIntyre’s argument, we will be talking about both the world of ideas – that is, philosophy – and the world of institutions and actions – that is, politics and society. Although at times we will consider these two worlds separately, one of MacIntyre’s most strongly held convictions is that they are closely connected. MacIntyre has not always been clear or consistent about the strength or direction of that connection, but the importance of the connection for MacIntyre’s argument has been consistent ever since After Virtue. Contemporary philosophers, he says, tend to interpret and argue about the works of past philosophers without paying attention to the intellectual and especially the social context in which those works were created. They act as though all past philosophers are contributing to the same argument, seeking timeless and eternal moral truths. But this is wrong, because philosophies are in large part derived from sociologies and are specific to particular societies: “Morality which is no particular society’s morality is to be found nowhere” (After Virtue 265-266; see also The MacIntyre Reader 258). Although philosophers can and should learn from the work of earlier philosophers, this is not their main source of ideas when they are doing their job properly. What philosophers primarily do is study the actual world in which they live – its politics, traditions, social organization, families and so on – and try to find the ideas and values that must underlie those institutions and practices, even if the members of the society cannot articulate them, or cannot articulate them fully. 
When the philosophers have done their work correctly, the philosophy they articulate will reflect their society; and because philosophers are uniquely suited to see the society as a whole they will be in a unique position to point out inconsistencies, propose new ideas consistent with the old ones that are nevertheless improvements on those ideas, and show why things that seem trivial are actually crucial to the society, and vice versa. They are also in a position to examine not only what it is that the people in their society do but why they do it, even when those people cannot explain it for themselves. These are the things that MacIntyre himself wants to do: show the inconsistencies and incoherencies at the center of modern conceptions of morality and society and transform them so that the modern expression of morality, structure of society, and practices of politics can be transformed as well. But philosophers do not and cannot stand outside of all societies to offer objective truths or objective moralities, since these must always be connected to particular societies.

So, the political, social, and economic life of a society constrains the kinds of ideas and morality it can have (at times MacIntyre seems to agree with Marx that these things do not merely constrain ideas and morals but actually determine them), and those ideas and that morality, especially as articulated by philosophers, in turn influence economics and politics (again, in different writings MacIntyre seems to have different views about how much influence they have). Let us see what MacIntyre has to say about modern ideas and institutions in After Virtue.

3. The Current Moral Disorder and Its Consequences

MacIntyre begins After Virtue by asking the reader to engage in a thought experiment: “Imagine that the natural sciences were to suffer the effects of a catastrophe…. A series of environmental disasters [which] are blamed by the general public on the scientists” leads to rioting, scientists being lynched by angry mobs, the destruction of laboratories and equipment, the burning of books, and ultimately the decision by the government to end science instruction in schools and universities and to imprison and execute the remaining scientists. Eventually, enlightened people decide to restore science, but what do they have to work with? Only fragments: bits and pieces of theories, chapters of books, torn and charred pages of articles, hazy memories and damaged equipment with functions that are unclear, if not entirely forgotten. These people, he argues, would combine these fragments as best they could, inventing theories to connect them as necessary. People would talk and act as though they were doing “science,” but they would actually be doing something very different from what we currently call science. From our point of view, in a world where the sciences are intact, their “science” would be full of errors and inconsistencies, “truths” which no one could actually prove, and competing theories which were incompatible with one another. Further, the supporters of these theories would be unable to agree on any way to resolve their differences.

Why does MacIntyre ask us to imagine such a world? “The hypothesis I wish to advance is that in the actual world which we inhabit the language of morality is in the same state of grave disorder as the language of natural science in the imaginary world which I described” (After Virtue 2, After Virtue 256). People in the modern liberal capitalist world talk as though we are engaged in moral reasoning, and act as though our actions are chosen as the result of such reasoning, but in fact neither of these things is true. Just as with the people working with “science” in the imaginary world that MacIntyre describes, philosophers and ordinary people are working today with bits and pieces of philosophies which are detached from their original pre-Enlightenment settings in which they were comprehensible and useful. Current moral and political philosophies are fragmented, incoherent, and conflicting, with no standards that can be appealed to in order to evaluate their truth or adjudicate the conflicts between them – or at least no standards that all those involved in the disputes will be willing to accept, since any standard will presuppose the truth of one of the contending positions. To use an analogy that MacIntyre does not use, one might say that it is as if we tore handfuls of pages from books by Jane Austen, Shakespeare, Danielle Steel, Mark Twain, and J.K. Rowling, threw half of them away, shuffled the rest, stapled them together, and then tried to read the “story” that resulted. It would be incoherent, and any attempt to describe the characters, plot, or meaning would be doomed to failure. On the other hand, because certain characters, settings, and bits of narrative would reappear throughout, it would seem as though the story could cohere, and much effort – ultimately futile – might be expended in trying to make it do so. This, according to MacIntyre, is the moral world in which we currently live.

One consequence of this situation is that we have endless and interminable debates within philosophy and, where philosophy influences politics, within politics as well (After Virtue 6-8, Three Rival Versions of Moral Enquiry 7 and Chapter 1). MacIntyre demonstrates this with regard to philosophers by a comparison of the positions of John Rawls and Robert Nozick on what justice is, positions which are mutually exclusive, but internally coherent. Each conclusion follows reasonably from its premises (After Virtue Chapter 17). Each position has many adherents who can point out the flaws in the other but cannot successfully defend their own position against attack. In the political world, one of the examples MacIntyre uses is the abortion issue in the United States. One side of the debate, drawing largely on a particular interpretation of Christian ethics, asserts that abortion is murder and hence is both morally unacceptable and deserving of legal punishment; the other side, usually drawing either on a conception of privacy or of rights or both, asserts that women should have the right to make a private decision about terminating a pregnancy, and therefore abortion, while possibly morally problematic, deserves the protection of the law. In either case, the conclusion follows logically, that is, reasonably, from the premises. But the starting premises are incompatible, and there is no way to gain everyone’s agreement to either set of premises, nor is there even any agreement on what kind of argument might be able to gain a consensus. (And a look at public opinion polls about abortion taken in the United States shows that the percentage of people for or against legal abortion in particular circumstances has basically remained unchanged since Roe v. Wade was decided in 1973.)

It is also the case, according to MacIntyre, that those involved in these philosophical and political debates claim to be using premises that are objective, based on reason, and universally applicable. Many of them even believe these claims, misunderstanding the nature of their particular inadequate modern philosophy, just as the people in MacIntyre’s post-disaster world misunderstand what it means to be doing real science. But what they are really doing, whether they recognize it or not, is using the language of morality to try to gain their own preferences. They are not trying to persuade others by reasoned argument, because a reasoned argument about morality would require a shared agreement on the good for human beings in the same way that reasoned arguments in the sciences rely on shared agreement about what counts as a scientific definition and a scientific practice. This agreement about the good for human beings does not exist in the modern world (in fact, the modern world is in many ways defined by its absence) and so any attempt at reasoned argument about morality or moral issues is doomed to fail. Other parties to the argument are fully aware that they are simply trying to gain the outcome they prefer using whatever methods happen to be the most effective. (Below there will be more discussion of these people; they are the ones who tend to be most successful as the modern world measures success.) Because we cannot agree on the premises of morality or what morality should aim at, we cannot agree about what counts as a reasoned argument, and since reasoned argument is impossible, all that remains for any individual is to attempt to manipulate other people’s emotions and attitudes to get them to comply with one’s own wishes.

MacIntyre claims that protest and indignation are hallmarks of public “debate” in the modern world. Since no one can ever win an argument – because there’s no agreement about how someone could “win” – anyone can resort to protesting; since no one can ever lose an argument – how can they, if no one can win? – anyone can become indignant if they don’t get their way. If no one can persuade anyone else to do what they want, then only coercion, whether open or hidden (for example, in the form of deception) remains. This is why, MacIntyre says, political arguments are not just interminable but extremely loud and angry, and why modern politics is simply a form of civil war.

4. The Absence of Meaningful Moral Choices

But there is another problem. Just as no one can win an argument with anyone else by persuading them with reasons, no one can win such an argument with himself or herself in trying to determine what their own moral commitments should be. In other words, no one can have real reasons for choosing the moral positions and values that they do, and no one can have any real reasons for choosing any way of life over any other as the best possible life. So any choice about the kind of life one will lead (and of course these choices have to be made, either consciously or unconsciously) must be arbitrary; any individual could always just as easily have chosen some other life which would have a very different set of moral positions and values (After Virtue Chapter 4). And if I can choose to be anything, but have no way of discovering reasons that might persuade me that some choice is the best, then it is impossible for me to make any kind of meaningful commitment to any of my choices, and it will be extremely easy to revise my morals in the name of expediency. The temptation will therefore be strong to choose moral principles on the grounds of effectiveness. I will choose my values at any given time because they happen to be useful as a way of attaining something else I value, rather than rationally choosing the best possible life and then letting that choice of the best life determine what I should value and what I should do. Perhaps I will choose values that enable me to be more popular in my community, or values that are useful for justifying my desire for money, or values that I believe will make me more successful at my job. What most people cannot do and are not even aware that they should do is tie their moral positions to a coherent and defensible version of the good life for human beings. 
The modern philosophies that have received the most attention and support – theories of utility such as those put forward by Jeremy Bentham and John Stuart Mill, and theories of rights such as those advanced by John Locke and John Rawls – cannot provide such a description of the good life for human beings, and MacIntyre regards them as having failed in their ambitions to do so and therefore as having failed, even on their own terms, in their project of creating new moral systems (After Virtue Chapter 6).

Many would disagree with MacIntyre at this point. They would say that these moral debates are interminable not because of anything specific to modernity but because by their nature they do not and cannot have any resolution. In their view, the situation MacIntyre has described is not a sign of philosophical or political failure in modern times; it is simply a recognition that there are many diverse definitions of what the best life for human beings is and therefore what is just, or good, or virtuous, and that while many of them are legitimate, none is or can be absolutely true. It follows that each of us is entitled to our own viewpoint on these matters and to choose the version of the best life and the best moral code that we individually prefer, provided of course we do not harm others. In After Virtue, MacIntyre calls this point of view emotivism, “the doctrine that all evaluative judgments and more specifically all moral judgments are nothing but expressions of preference, expressions of attitude or feeling, insofar as they are moral or evaluative in character” (After Virtue 11-12, emphasis in original). In a world where people subscribe to emotivism, moral judgments, since they cannot be used for reasoned persuasion, are used for two reasons: to express our own preferences, and to try to change the emotions and attitudes of those with whom we disagree in order to make them agree with us and share our preferences. MacIntyre believes that emotivism is a false doctrine, because we can in fact rationally determine the best possible life for human beings and therefore can have moral judgments that are more than mere preferences, but it is nevertheless a doctrine that many people today subscribe to, and they act as though it is true. Because so many people act as if it is true, it takes on a degree of power in the world.
This is one example of the linkage between how people think and how they live: “A moral philosophy – and emotivism is no exception – characteristically presupposes a sociology” (After Virtue 23; see also Three Rival Versions of Moral Enquiry 80). Although few people would, if asked, say that they subscribed to the doctrine of emotivism (indeed, few people would even be able to explain what it is), it is only possible to make sense of their actions and lives if we say that they are acting according to emotivist principles – they act as though morality is nothing but an arbitrary choice that is an expression of their will, and so this is the doctrine to which we can say they subscribe.

5. Emotivism and Manipulative Social Relations

If we are to fully understand emotivism as a philosophical doctrine, MacIntyre says, we must understand what it would look like if it were socially embodied. That is, if we stipulate that nearly all the people in a given society subscribe to emotivism, what can we expect their society to look like? How will they behave? It turns out, MacIntyre says, that such a society would look much like ours, and that (as has been said) we act as though we believe emotivism to be true. MacIntyre says that “the key to the social content of emotivism….is the fact that emotivism entails the obliteration of any genuine distinction between manipulative and non-manipulative social relations” (After Virtue 23). Each of us regards the other members of our society as means to ends of our own. Because I cannot persuade people, and because we cannot have any common good that is not purely temporary and based on our separate individual desires, there is no kind of social relationship left except for each of us trying to use the others to achieve our own selfish goals. Even for someone who did not want to live this way, the fact that others would be trying to gain power over them in order to manipulate them would mean that they would still need to seek as much power as they could simply to avoid being manipulated. It would also mean that each of them would need to manipulate others in ways that would make it more difficult or impossible for them to be manipulated in return. This is similar to the argument that animates a good deal of Hobbes’ Leviathan, where the constant battle for power over one another in a state of nature leads to a life that is solitary, poor, nasty, brutish, and short, and eventually to the recognition of the need for a sovereign with absolute power – although this, of course, is not the solution MacIntyre advocates.

6. The Concept of a Practice and the Origin of the Virtues

In After Virtue, MacIntyre tries to explain another element of what is missing in modern life through his use of the concept of a practice. He illustrates this with the example of a person wishing to teach an uninterested child how to play chess.

The teaching process may begin with the teacher offering the child candy to play and enough additional candy if the child wins to motivate the child to play. It might be assumed that this is sufficient to motivate the child to learn to play chess well, but as MacIntyre notes, it is sufficient only to motivate the child to learn to win – which may mean cheating if the opportunity arises. However, over time, the child may come to appreciate the unique combination of skills and abilities that chess calls on, and may learn to enjoy exercising and developing those skills and abilities. At this point, the child will be interested in learning to play chess well for its own sake. Cheating to win will, from this point on, be a form of losing, not winning, because the child will be denying themselves the true rewards of chess playing, which are internal to the game. The child will also, it should be noted, enjoy playing chess; there is pleasure associated with developing one’s skills and abilities that cannot come if one cheats in order to win.

MacIntyre concludes that there are two kinds of goods attached to the practice of chess-playing and to practices in general. One kind, external goods, are goods attached to the practice “by the accidents of social circumstance” – in his example, the candy given to the child, but in the real world typically money, power, and fame (After Virtue 188). These can be achieved in any number of ways. Internal goods are the goods that can only be achieved by participating in the practice itself. If you want the benefits to be gained by playing chess, you will have to play chess. And in pursuing them while playing chess, you gain other goods as well – you will get an education in the virtues. The two kinds of goods differ as well in that external goods end up as someone’s property, and the more one person has of any of them the less there is for anyone else (money, power, and fame are often of this nature). Internal goods are competed for as well, “but it is characteristic of them that their achievement is a good for the whole community who participate in the practice” (After Virtue 190-191). A well-played chess game benefits both the winner and loser, and the community as a whole can learn from the play of the game and develop their own skills and talents by learning from it.

MacIntyre believes that politics should be a practice with internal goods, but as it is now it only leads to external goods. Some win, others lose; there is no good achieved that is good for the whole community; cheating and exploitation are frequent, and this damages the community as a whole. (MacIntyre has changed his terminology since After Virtue. He now calls internal goods “goods of excellence,” and external goods are now called “goods of effectiveness.” See The MacIntyre Reader 55.)

One important way to understand the community surrounding a genuine practice is as a community of teachers and learners, with each individual community member filling each of these roles at different times. “It belongs to the concept of a practice as I have outlined it…that its goods can only be achieved by subordinating ourselves within the practice in our relationship to other practitioners” (After Virtue 191). Throughout my time as a participant in a practice, but especially at the beginning, I must put myself under the authority of others. To continue MacIntyre’s example of chess playing beyond where he develops it, notice that I, the player, rely on other chess players to teach me rules and strategies, to evaluate my play and suggest improvements, answer questions, encourage and guide me, and provide opponents. In competing with one another, we develop one another’s skills, and each of us is able to recognize and value those skills in the other and hence values the other person for exhibiting those skills.

MacIntyre notes that when individuals first start to engage in a practice, they have no choice but to agree to accept external standards for the evaluation of their performance and to agree to follow the rules set out for the practice: “A practice involves standards of excellence and obedience to rules as well as the achievement of goods” (After Virtue 190). As a newcomer, I lack the knowledge and experience that would let me evaluate myself and my efforts, so I must rely on others to judge me according to the standards of the practice. And I cannot simply subordinate the standards to my will; I cannot simply decide that I am a grand master at chess because I want to be one. The standards that determine who is and who is not a grand master are already established, and I must accept them. Unilaterally declaring myself a grand master will not place me at the top of the chess hierarchy; it will place me outside it altogether. As I gain in talent, experience, and knowledge, I can begin to have input into the standards themselves, but I will never gain the ability to move outside them if I want to continue to participate in the practice. Nor will I ever gain the ability to move outside the rules if I want to be part of the practice, although in some cases the community can agree to change the rules if they believe it is beneficial to the practice. So, for example, the rules of chess have changed since the game’s origin, and MacIntyre would likely say that this has happened in order to more fully develop the principles of the game.

MacIntyre also emphasizes that chess, like other practices, has a history and is part of a tradition. So he might point out that an important part of becoming a grand master at chess is studying the records of games that have been played by previous grand masters, reading commentaries on those games, examining their philosophies, practice regimens, and the psychological tactics they employed on their opponents, and so on. The rules and standards have developed in the past and are binding on the present, and although they can sometimes be changed by the community as a whole those changes should be consistent with the principles of the game as it has developed in the past. This would seem to be a very conservative doctrine, as it is in the hands of someone like Edmund Burke (cf. Reflections on the Revolution in France), but MacIntyre is explicit that traditions that are in good order require ongoing internal debates about the meaning of the tradition and how it is to be improved and developed for the future. He is not advocating blind loyalty to the past, nor is he saying that all change is bad. He is only acknowledging that the present rests on the past and must take that past into account in its self-understanding as well as in its planning for the future. We have already mentioned changes in the rules of chess, but other transformations can occur without changing the rules. Today, for example, chess players may decide that they must revise what they know about the game and how it is played in order to compete against computer opponents which use very different methods of playing than human opponents do. This requires new approaches and tactics which will become part of the tradition that is available to players in the future. But developing new methods does not require starting from scratch – the past provides materials for use in the present and should not be dismissed as irrelevant.

Although MacIntyre does not emphasize this, he likely would agree with Burke that the idea that one is part of a tradition can serve to strengthen the community, as it encourages the present practitioners to think of themselves as tied to the past and with an obligation to the future, so that they will work to surpass the standards of the past and leave a tradition that is in good order to those who will practice it in the future.

Practices are also important because it is only within the context of a practice that human beings can practice the virtues. Goods that are external to practices, such as money and power, can be achieved in a variety of ways, some good and some bad. But achieving the goods that are internal to a practice, according to MacIntyre, requires the presence of the virtues, and in After Virtue he defines the virtues in terms of practices: “A virtue is an acquired human quality the possession and the exercise of which tends to enable us to achieve those goods which are internal to practices and the lack of which effectively prevents us from achieving any such goods….we have to accept as necessary components of any practice with internal goods and standards of excellence the virtues of justice, courage, and honesty” (After Virtue 191). The necessity of these virtues follows logically from the definition of a practice, as we shall see, but it is important to understand that as far as MacIntyre is concerned, virtues and therefore morality can only make sense in the context of a practice: they require a shared end, shared rules, and shared standards of evaluation. The virtues also define the relationships among those who share a practice: “….the virtues are those goods by reference to which, whether we like it or not, we define our relationships to those other people with whom we share the kind of purposes and standards which inform practices” (After Virtue 191). We must have the virtues if we are to have healthy practices and healthy communities. Let us consider the three virtues of honesty, courage, and justice and see how they arise from practices.

Members of a practice must be honest with each other when they instruct others in the principles of the practice, when they explain the rules to them, and when they evaluate their performance. And we have already seen that the practitioners must not lie or cheat when they engage in the practice, or they will not really be engaging in it and will not gain the benefits of doing so. Courage, MacIntyre says, is a virtue “because the care and concern for individuals, communities and causes which is so crucial to so much in practices requires the existence of such a virtue” (After Virtue 192). Practitioners of a shared practice come to genuinely care about each other, and genuinely caring about others means a willingness to risk harm or danger on their behalf, and that is what courage is. Finally, “Justice requires that we treat others in respect of merit or desert according to uniform and impersonal standards,” and we have seen that these are the standards that are a part of a practice (After Virtue 192). So virtues such as honesty, courage, and justice have meaning in the context of a practice, raising the possibility that there is a way out of the moral chaos that surrounds us today.

MacIntyre is vague about what things do and do not constitute practices; he gives some examples of each, stating that playing chess is a practice but playing tic-tac-toe isn’t; farming is, but planting turnips isn’t. More important to him than narrowly defining the boundaries of a practice is arguing that particular kinds of activities certainly are practices. Why does MacIntyre care so much about practices? It is because he believes that there are a number of things that have been practices in the past, currently are not, but could (and should) be again, and chief among these is politics. It is possible to think of politics as a practice within a community that has a shared aim, and where the members of that community have the same standards of excellence, the same rules, and the same traditions. Indeed, in MacIntyre’s view, politics is a sort of meta-practice, because it is the practice of determining the best life for human beings, a life which will include engaging in other practices. Here MacIntyre parallels Aristotle’s language about politics as the science ordering the other sciences (Aristotle, Nicomachean Ethics I.2). The benefits of a practice would then flow to those who participated in politics – in fact, certain important benefits could only be achieved by political participation – and politics would make people more virtuous rather than less virtuous as it now does. To see why politics currently makes people worse instead of better, and how this inevitably follows from our current moral anarchy, we need to take a closer look at contemporary politics.

7. Politics in a World without Morality

MacIntyre argues that today we live in a fragmented society made up of individuals who have no conception of the human good, no way to come together to pursue a common good, and no way to persuade one another about what that common good might be; indeed, most of us believe that the common good does not and cannot exist. What kind of politics can such a society have? “Politically the societies of advanced Western modernity are oligarchies disguised as liberal democracies. The large majority of those who inhabit them are excluded from membership in the elites that determine the range of alternatives between which voters are permitted to choose. And the most fundamental issues are excluded from that range of alternatives” (The MacIntyre Reader 237; see also The MacIntyre Reader 248, 272). What MacIntyre means by “the most fundamental issues” are the issues of what the best way of life is for individual human beings and for human communities as a whole, and how each can be ordered so as to enable the other to flourish. Modern politics has no space for such issues. Prior to the 2004 election in the United States he published a short essay on the Internet arguing that, in light of this lack of meaningful alternatives about the most fundamental issues, the proper thing to do was to refrain from voting. There are no meaningful alternatives on these issues because almost all citizens subscribe, consciously or not, to the modern idea that issues about the best way of life are not capable of political resolution or consensus and that they must be left to each individual to decide. MacIntyre and other critics of liberalism, which they see as the political manifestation of emotivism, argue that liberalism claims to be neutral about the best way of life and moves debates about it out of the public sphere and into the private, claiming that the state should take no position about what the good life or the good state is. This, however, has the effect of privileging a certain kind of life and a certain kind of state in the name of neutrality; it is another of the deceptions of the modern world. Because liberalism asserts that each individual has a right to pursue happiness in his or her own way, and because the versions of happiness individuals pursue are inevitably mutually incompatible (I wish to have prayer in schools, you do not; I wish to outlaw abortion, which you support; I wish to raise taxes on the wealthy to feed the poor, which you reject), and because we cannot persuade one another or agree on a common good, politics is, as MacIntyre says, “civil war carried on by other means” (After Virtue 253).

MacIntyre’s famous comment, quoted earlier, about the new dark ages we are living in is followed by the observation that in contrast to the earlier dark ages, the barbarians are not at the gates but in fact have been governing us for some time (After Virtue 263). This conclusion is what we would expect if MacIntyre’s view of the world is right. We would be ruled by people who are ruthlessly aggressive, ignorant of or actually hostile to the virtues required for civilized life, and destructive of social life. Since politics today is about using ideas and arguments not to search for truth but to manipulate others in the quest for power, we would expect the people with the most power to be the ones who are best at manipulating others for their own purposes and who have the greatest desire for power. The reasons they would give to justify their power would be false, but widely accepted, and they would use that power for their own selfish ends. Furthermore, they would pursue that power through whatever means they felt would be most effective, in the absence of any of the standards of right and wrong or success and failure that a practice would provide. In such a world, MacIntyre says, things that would appear to be vices would in fact be virtues. For example, keeping one’s word, which as we have already seen MacIntyre considers to be one of the most important virtues (it is part of honesty), would frequently have negative consequences for those who practiced it, since it might end up being an obstacle to achieving some goal most effectively. So instead of condemning people for not keeping their word, we praise them for the virtue of “adaptability” and the ability to change as the situation demands it. If politics were a practice with the possibility of internal goods and virtues, this would not be the case; but since it is currently not a practice, and therefore has only external goods to offer, it is. Anyone who has read The Prince cannot read MacIntyre on this point without recalling Machiavelli’s advice to the prince about the need to be adaptable and the only relevant standards being those of success or failure; MacIntyre would certainly agree that the modern world is characterized by its Machiavellian politics.

It would also be in the interest of such a ruling elite that no one raise any of the fundamental questions considered earlier about the best life for human beings and the community, because any answer to those questions, and indeed any attempt to find answers, could only undermine the legitimacy of their rule, which is based on the belief that there are no such answers. MacIntyre says in After Virtue that claims to rule are based on the claim to possess bureaucratic competence as described by Max Weber: people claim that they should have power because they are the ones that can use it most effectively, although the goals that they are pursuing in such an effective fashion are never questioned or discussed. MacIntyre further believes that these claims of managerial competence are and must be false; they are another of the deceptions of the modern age (After Virtue Chapters 6-8). But even if these claims were valid, valuing the effective use of power without considering the ends for which it is being used is a mistake. Trying to answer questions about the proper ends of human life not only reveals the nature of our current problems and the responsibility of those in power for creating and perpetuating them but also leads to the realization that the world needs radical change before it can even be possible to discover the answers.

MacIntyre argues that modern politics has no place for patriotism, because there is no patria, or fatherland. Although there can be nationalism, jingoism, and propaganda, there can be no genuine, healthy affection for the nation or for our fellow citizens because we lack a shared project that would connect us to the nation or to our fellow citizens. It would be bizarre for people to have a feeling of attachment to the modern state, since it is bound to thwart many of their projects, allows them no effective voice, and gives them no unifying vision of the good life or any kind of shared community. And if the state is purely instrumental, to be used to advance one’s own projects, why would anyone be willing to die for it, since death means the end of all such projects? Yet the state requires such a patriotic attachment, because it needs people willing to serve as soldiers, police officers, and in other similar life- and safety-threatening jobs. In trying to create such an attachment, the state reveals its own nature and its absurdity: “The modern state…behaves part of the time towards those subjected to it as if it were no more than a giant, monopolistic utility company and part of the time as if it were the sacred guardian of all that is most to be valued. In the one capacity it requires us to fill in the appropriate forms in triplicate. In the other, it periodically demands that we die for it” (The MacIntyre Reader 227; see also The MacIntyre Reader 236).

Finally, in addition to these political problems, the modern age is also characterized by global capitalism, which in MacIntyre’s view has its own deeply pernicious consequences. First, it reinforces emotivism by making the pursuit of one’s preferences the highest good. By doing so, it is like emotivism in that it promotes a false view of human happiness. We will see shortly what MacIntyre sees as the truly happy human life, or at least the potentially happy life, which is lived according to the objective standards of virtue found within a tradition. But we can say here that that life does not involve simply accumulating money or the things that money can buy. Money has a role to play in the virtuous life; there are certain virtues, such as generosity, which are impossible or at least very difficult to carry out without money – here MacIntyre agrees with Aristotle. But a life spent pursuing money is a wasted life, as far as MacIntyre is concerned.

Second, capitalism as an ideology also promotes the instrumental manipulation of people we have already discussed. The capitalist manager manipulates their employees in the production of goods, and the marketing department manipulates customers in order to get them to consume those goods. Free market economies “in fact ruthlessly impose market conditions that forcibly deprive many workers of productive work, that condemn parts of the labor force in metropolitan countries and whole societies in less developed areas to irremediable economic deprivation, that enlarge inequalities and divisions of wealth and income, so organizing societies into competing and antagonistic interests” (The MacIntyre Reader 249). And it is money that dominates the modern politics that is constructed by this capitalist competition and antagonism (Dependent Rational Animals 131). Money and the harm it does to the political process will not be removed from politics until people choose to pursue goods of excellence rather than goods of effectiveness. Capitalism is therefore not only harmful in and of itself but also for its effects on politics.

8. The Greek Way of Life

Given his abiding interest in and admiration for the polis, it would not be surprising if MacIntyre has another meaning for “barbarians” when he describes the people who rule us today. For the ancient Greeks, anyone who did not live in a polis and participate in polis life was a barbarian, and when we see what MacIntyre thinks the polis was and what kind of life was pursued there, we will see that the people who are on top in the world today are very far from living that kind of life – as, of course, we all are. So he is probably using the word as it was originally used, in addition to using it for its modern meaning. Overcoming the modern barbarians would mean creating and defending a modern version of the polis – and to do this, we must understand the ancient version of the polis.

It is time, then, to turn to the ancient world which was destroyed by the modern world we have been describing (MacIntyre offers a history of how the new world came to replace the old one in After Virtue, Chapter 16). Most of our attention will be focused, as MacIntyre’s is, on the Athenian polis, or city-state, in the time of Aristotle, and on Aristotle’s thought, which MacIntyre believes is an expression of the way of life of the Athenian upper class. As with his description of modernity, his descriptions of the ancient world and Aristotle’s thought are contentious, and there are many points on which other scholars disagree with his arguments and his conclusions. We will be focusing on the contrast between the ancient world and the modern world and the reasons MacIntyre believes the former to be in many ways superior. Keep in mind that ultimately he wants us to learn from the institutions and ideas of the past and modify them to fit the conditions of the modern world; the final part of this essay will describe how his new world would differ from the world in which we now live.

MacIntyre does not want to try to recreate the polis, nor does he believe it would be possible even if it were desirable. MacIntyre also does not simply offer uncritical praise of the polis. He is strongly opposed to many of the institutions that made day-to-day polis life possible: slavery, the treatment of women, the elitism of its politics and political philosophy, and its exclusion of outsiders. One can summarize these positions by saying that MacIntyre rejects those elements of the polis and of Aristotle’s thought that are hierarchical in a way that subordinates some people (actually most people) for the good of others. So MacIntyre realizes that there is much in the polis that we do not and should not wish to restore. He believes that it is possible to separate the positive features of the polis from its negative features, keeping the former while rejecting the latter; whether he is correct in this is an open question.

9. Heroic Society and Homer

For MacIntyre, understanding the polis means understanding its predecessor: heroic society as described by Homer in the Iliad and Odyssey (After Virtue Chapter 10; Whose Justice? Which Rationality? Chapter 2). In heroic society, MacIntyre says, people did not see themselves as we moderns do, as individuals bearing rights and seeking autonomy from external control through the manipulation of others. They also did not see themselves as constructing their own identities, choosing what they wanted to be and who they were. Instead, their identities came from their place within their society: “The self becomes what it is in heroic societies only through its role; it is a social creation, not an individual one” (After Virtue 129). Each individual had a fixed role resulting from their location in the social network, primarily through their particular ties to their family and kin, and each individual had the specific obligations and privileges attached to that location.

Many of these obligations were not chosen by the person bearing them, and that person was not free to choose other obligations instead. Nor would trying to evade one’s obligations be praised as an example of adaptability; it would be condemned as a violation of the social order, which was the framework on which morality was built. People in this society did not try to determine morality in terms of abstract objective rules which applied to all equally – to try to place oneself outside of society was to cease to exist, because each person’s identity made sense only in the context of that society. As MacIntyre puts it, each individual in such a society “has a given role and status within a well-defined and highly determinate system of roles and statuses….In such a society a man [sic] knows who he is by knowing his role in these structures; and in knowing this he knows also what he owes and what is owed to him by the occupant of every other role and status” (After Virtue 122). So in any particular situation, an individual would be able to understand what they should do in a straightforward way: the thing for them to do is the thing that it is appropriate for a person in their position to do by showing the proper regard for someone, meeting the particular obligations they have, doing what their duty requires them to do, and so forth. And it is also clear what actions must be performed in order to do these things. All they must do is ask what a person in their position is supposed to do in this situation and then do it.

In MacIntyre’s view, this kind of society, unlike modern societies, can have a genuine moral code, since failing to do what a person in a particular position is supposed to do is a moral failure, and that person can and will be judged accordingly by the other members of the society, who know what that person’s duties, obligations, and privileges are and have legitimate claims on that person for them. This moral code is based on what is agreed to be the shared end of the society and the best way to achieve it, which gives each member their proper role in the society and their proper tasks. Heroic society is not by any means democratic, and so it would appear that democracy is not necessary to have this kind of society, but MacIntyre does believe that societies which include practices and virtues nowadays will prove to be democratic – much more democratic than they are now, in fact.

Recall our earlier discussion of the practices and the virtues. Taken as a whole, this kind of society can be understood as a kind of practice. Each individual agrees about what the virtues are – those traits that make it possible for them to carry out their obligations as they ought to in order to bring about the best possible life for the society as a whole – and they follow the virtues in living out their lives. There is also a determinate pattern to the life of each individual in the society, as each meets their obligations and fulfills their role like characters in a story. Remember the earlier suggestion that making sense out of morality today is like trying to tell a coherent story by mixing up parts of five or six very different novels. In this society, each individual is like a character in a story that is told by the society as a whole. The story is about what the good life is, and it provides a shared narrative for everyone. What is good for the individual and what is good for society are mutually reinforcing. If each individual does what they are supposed to do, the society will function as it should, and at the same time the society provides the context for the happy life spent in pursuit of the virtues that give meaning to the lives of its members.

10. The Athenian Polis and Aristotle

MacIntyre asserts that the virtues of heroic society and the heroic ideal carry forward into classical Athens, but since Athenian society is organized very differently than heroic society, this leads to difficulties. The virtues that are expressed in a society organized primarily around family and kinship networks have to be expressed differently in a society organized around the principle of the equality of citizens and the activity of politics. In MacIntyre’s view, much of Athenian philosophy and art is engaged in redefining the heroic virtues to make them fit the new context of the polis; again we see how philosophy and society are interrelated, with changes in society leading to changes in philosophy. MacIntyre’s definition of the polis is somewhat idiosyncratic: “The application of [the virtues as a way to measure an individual’s goodness] in a community whose shared aim is the realization of the human good presupposes of course a wide range of agreement in that community on goods and virtues, and it is this agreement which makes possible the kind of bond between citizens which, on Aristotle’s view, constitutes a polis” (After Virtue 155; see also Whose Justice? Which Rationality? 33-34). Restoring this agreement is the sense in which MacIntyre wants to return to the polis.

That the polis was the setting for the good life was, MacIntyre says, taken for granted by everyone participating in the debate about what the virtues could mean in their new setting, and in After Virtue he examines four of the voices in this debate: Plato, the sophists, playwrights such as Sophocles, and Aristotle. It is Aristotle who comes to be MacIntyre’s focus, because it is Aristotle “whose account of the virtues decisively constitutes the classical tradition as a tradition of moral thought” (After Virtue 147). MacIntyre believes that Aristotle is essentially expressing the Athenian way of life in the form of a philosophy. Some scholars would disagree with this argument, but let us consider Aristotle more closely in order to see MacIntyre’s argument.

Aristotle’s philosophy has at its heart the idea of a telos, or final purpose. Think about a knife for a moment. If you were asked to describe a knife, what would you say about it? You would probably describe its size and shape, what it is made out of, the fact that it has a handle and a blade, and you would probably also say that its purpose is to cut things. That purpose is its telos, and your description of the knife would be incomplete in an important way if you did not include it. It is fairly easy to see that something made by human beings has a telos, since humans generally create things for specific purposes. But Aristotle believes that things in the natural world also have a telos. The acorn has as its telos growing into a big, tall, strong oak tree, full of healthy acorns. The baby thoroughbred horse has as its telos being a swift runner; the wolf cub will grow up to hunt well; and so on. Human beings also have a telos, and according to Aristotle it is to be happy by living a life in accordance with the virtues. This is the inherent purpose of human life, and each of us is intended by nature to live a virtuous life in the same way the acorn is meant to be an oak tree and the colt is meant to be a swift racehorse. We do not get to choose what our telos is, any more than a knife or an acorn or a horse does. We do get to choose whether or not we are going to try to achieve it, and we can be held responsible if we do not (The MacIntyre Reader, “Plain Persons and Moral Philosophy”).

The idea of a telos can be used to provide standards for normatively evaluating things. For example, if I have a knife that will not hold an edge, or has a handle that falls off, I have a knife that will not be able to fulfill its telos. It cannot do what it is supposed to do and what it was made to do. I can therefore say that it is a bad knife. Similarly, a wolf that is fat and lazy, or unable to scent animals, or runs slowly, is not the ideal wolf. It has not become what it was supposed to be. And human beings, if they do not pursue the life of happiness through virtuous behavior that is their telos, are bad human beings. They are guilty of moral failure, and everyone who agrees about what the human telos is will have to agree to that, in the same way they will have to agree that a knife that falls apart whenever someone tries to use it is a bad knife. Thus, for people who share a telos and whose community expresses that shared telos, morality has context and meaning.

It should be pointed out here that contemporary philosophies such as emotivism deny that there is a human telos (with ruinous consequences as far as MacIntyre is concerned). The idea that there is a human telos carries with it its own problems. Most obviously, it has at least so far proven impossible to unite all people behind a particular idea of what that telos is, or to demonstrate how we can be sure that a telos even exists. Often, the idea that nature or the gods want people to pursue certain goals and behave in certain ways has been used as a pretext for human tyranny. Many would point to the Taliban in Afghanistan, or the Catholic Inquisition, as an example of this. Also, there have been historical eras in which people in different societies strongly believed that there was a telos, but disagreed about what it was (in fact, the era of the polis in Greece was one such era). This has often led to war. The liberal idea of religious toleration, based on the idea that the proper work of government is the protection of people’s bodies and property rather than their soul (see Locke’s Letter Concerning Toleration), was in part the result of the religious wars, which were in part about the best life for human beings, that ravaged Europe for centuries (and ravage other parts of the world today). MacIntyre points out, however, that just because we haven’t reached agreement on this subject doesn’t mean that we can’t, and he argues that the belief that we can’t is a historically specific belief, rather than an objective and permanent truth about how the world works. If we reason correctly, and examine competing philosophical traditions of moral enquiry, we can choose the most accurate one. (This is the task of Three Rival Versions of Moral Enquiry).

You may want to think about physical health as an analogy. If I want to be healthy, I am much more likely to succeed if I am willing to exercise, eat sensibly, avoid tobacco and other drugs, and do what my doctor tells me, even when that means undergoing painful surgery, paying for expensive treatments, or swallowing foul-tasting medicines. I am certainly free not to do any of these things. I can smoke, overeat, lie on the couch all day, and never go near a doctor’s office. But in that case I won’t be healthy, and I don’t get to redefine “health” to cover my condition. If I said I was living such a lifestyle because I was trying to live a healthy life, anyone who knew anything about health would laugh at me. Since health is preferable to sickness, I should be willing to reject unhealthy behaviors that are temporarily pleasant to achieve what is really good for me in the long run. Yet often I do not. In the same way, I should give up things that do not bring me closer to my telos by contributing to a virtuous life. But, again, often I do not. And if we accept that certain things are inherently good or harmful for our bodies because of our nature as particular kinds of animals, why shouldn’t we accept the same principle regarding our souls?

As human beings, we are not always inclined to live a life devoted to the pursuit of the virtues, but that is the life that we should lead. MacIntyre calls this the distinction between “human nature as it is” and “human nature as it could be if it realized its telos” (After Virtue 52). The role of ethical theory is to take us from the former condition to the latter, teaching us how to overcome the weaknesses of our human nature and become what we are capable of becoming, as well as why this ought to be our good. It is like a road map, showing us where we are and where we need to get to and identifying the hazards along the way. Recall that MacIntyre said that in the modern world people believe that they do not have any fixed telos or purpose; there is nothing that we are meant to become, no innate goal that we move towards. (MacIntyre points to Hobbes and Leviathan as an example of this philosophical belief and its consequences.) Absent any conception of what human beings are supposed to become if they realize their telos, there can be no ethical theory, because it simply has no purpose. For people with no destination, a road map has no value.

We have seen MacIntyre’s description of modernity and its problems, and we have seen his description of the life of the polis and the philosophy of Aristotle. This brings us to the choice MacIntyre says confronts us. In After Virtue he says that we can either choose the modern world, with its emotivism, liberalism and capitalism – a world which, if we are honest, is actually a Nietzschean world – or we can choose to return to a morality and a conception of the virtues based on the philosophy of Aristotle (After Virtue Chapter 18). MacIntyre wants us to reject Nietzsche and choose Aristotle – not on the basis of the kind of arbitrary decision made under emotivism, but on the grounds that the kind of rational morality proposed by Aristotle does not fall prey to the criticisms of Nietzsche. It remains to describe what the future would hold if MacIntyre were successful in his project. How would a world based on the experience of the polis and the philosophy of Aristotle differ from the world we live in today? After Virtue ends without providing much guidance – MacIntyre says that we are waiting for a new Saint Benedict (who was the founder of monasticism in the Catholic tradition) to lead us out of the new dark ages (After Virtue 263) – but in his later writings he has offered more detail about what a better world would look like.

11. Our Human Nature: Dependent Rational Animals and Human Virtues

Much of what MacIntyre has to say on this topic is found in Dependent Rational Animals, and that book will be the focus of this section of the essay. MacIntyre intends the book to answer two questions: “Why is it important for us to attend to and to understand what human beings have in common with members of other intelligent animal species?” and “What makes attention to human vulnerability and disability important for moral philosophers?” (Dependent Rational Animals ix). The book reflects MacIntyre’s change of position regarding whether “an ethics independent of biology” is possible (Dependent Rational Animals x). In After Virtue he had rejected Aristotle’s biological teleology – which is the idea that human beings have a telos because of the particular kind of creature that we are. Aristotle says that only human beings have the ability to speak and reason and therefore our telos is to develop that reason. In Dependent Rational Animals MacIntyre now accepts the idea of a biological teleology, but much of his argument for this is based on the idea that it is not human beings alone that have the ability to speak and reason; dolphins and gorillas can also do these things, and we can learn something about humans from how these other animals pursue their individual and collective goods. What we learn is that for human beings the key to flourishing is to be an independent practical reasoner (Dependent Rational Animals 77). What are the consequences of this?

MacIntyre now believes that any successful ethical theory must comprehend three aspects of human existence: we are dependent, we are rational, and we are animals. The first and third of these, he says, are seldom taken into account by philosophers, and the second is frequently overemphasized. Aristotle comes in for particular criticism for denying the merit of the experiences of dependent human beings and making a virtue out of self-sufficient superiority (Dependent Rational Animals 6-7, 127). These are flaws which can be seen to contribute to MacIntyre’s turning away from Aristotle and towards Aquinas, whose account of the human telos and virtues includes resources that allow us to include everyone in the community rather than a small elite as Aristotle’s philosophy does. Much of the book is concerned with placing human beings in relationship to other animals, especially with regard to intelligence and rationality. MacIntyre argues that human beings retain their animal natures in important ways (Dependent Rational Animals 49) and that we are like gorillas and dolphins in that members of each species “pursue their respective goods in company with and in cooperation with each other” (Dependent Rational Animals 61).

Because we are animals, we are vulnerable to a wide range of inadequacies, deficiencies, and illnesses and are in need of the help of others if we are to survive and even more help if we are to thrive. Each of us has had the experience of dependency in infancy and childhood and most of us will face physical dependency again as we age. The kind of dependency that MacIntyre focuses on is our dependency on others to learn how to be rational and how to be ethical. This need is strongest in children, who at first simply follow whatever desires they happen to have at the moment. One of the things that parents must do (MacIntyre focuses on the mother throughout his discussion of parenting, without giving any reasons for this) is to teach their children that what they desire is not necessarily what is best for them at that time or what is best for them in the context of their life as a whole. Even when we pass beyond childhood, we still need others to watch and comment on our motives and actions, to ensure that those aim at what is good for us and not merely at satisfying our temporary and potentially harmful desires. These are our friends, who provide us with insight and self-understanding, not least because they call us to account for our actions when those seem immoral, short-sighted, or out of character. To provide such an account I must first reflect on my motivations and goals, and then explain them in such a way that my friend can make sense of them.

This is one of the ways in which I need other people, receive things from them, and am dependent on them. Throughout my life, other people assist me in developing the use of my reason, and I am dependent on others for this; I cannot become rational on my own. I can only grow if I can reason with and learn from others, and this requires certain traits from me: the virtues (honesty, courage, and justice, for example). Each of us also finds that others are dependent on us at different times and in different ways, and we are obligated to assist them in developing the same qualities and virtues others are helping us to develop; and this assistance is itself a virtue. We therefore find ourselves as part of a community of giving and receiving which is a network of duties and obligations. Potentially, of course, these same networks are dangerous; MacIntyre acknowledges that these structures of giving and receiving are also structures of unequal power distribution and potentially of domination and deprivation (Dependent Rational Animals 102). We must take care to see that they are not used in this way. But this network of obligations in the service of a shared good – the development of human capacities to reason and behave virtuously – means that this kind of society resembles the polis as MacIntyre understands it.

So acknowledging our nature as a particular kind of animal forces us to acknowledge our dependence on others to develop our rationality and become independent and our need to use our rationality to help dependent others (hence the title: Dependent Rational Animals). MacIntyre says that each of these is a different kind of virtue: the virtues of dependence differ from the virtues of independence but are nonetheless virtues (Dependent Rational Animals Chapter 10). This in turn requires us to acknowledge the networks of relationships of which we are a part, and once we have done this we can and must deliberate about the social and political institutions we wish to create in order to promote and protect these networks. Collectively promoting the social structures we need in order to flourish as individuals enables us to escape from false dichotomies between self-interest and the common interest and between selfishness and altruism. In supporting the networks that are necessary if we are to flourish, I am promoting both my interest and everyone else’s, and I am looking out for the common good as well as my own individual good. Practices, then, are both consequences of our nature as the kind of animals we are, when we properly understand the kind of animals we are, and forms of social order that are in keeping with our nature, as opposed to contemporary forms of social order (liberalism and capitalism) which are not.

12. A New Politics

MacIntyre has shown that his ideal society would be different from our own in two particular areas, politics and economics, and now it is time to consider what he believes we should do in order to bring this ideal society into being. As was stated at the very beginning of this essay, MacIntyre is writing in order to resist the modern world, including modern politics. “Modern systematic politics, whether liberal, conservative, radical, or socialist, simply has to be rejected from a standpoint that owes genuine allegiance to the tradition of the virtues; for modern politics itself expresses in its institutional forms a systematic rejection of that tradition” (After Virtue 255). When we have made the changes MacIntyre wants to see, politics will no longer be civil war by other means: “the politics of such communities…is not a politics of competing interests in the way in which the politics of the modern state is” (Dependent Rational Animals 144). It is instead a shared project, and one that is shared by all adults, rather than being limited to a few elites who have gained power through manipulation and use that power to gain the goods of effectiveness for themselves. Politics will not be about people selfishly fighting over power and money; instead there will be “a conception of political activity as one aspect of the everyday activity of every adult capable of engaging in it” (Dependent Rational Animals 141). Human beings, as the kind of creatures we are, need the internal goods/goods of excellence that can only be acquired through participation in politics if we are to flourish. Therefore, everyone must be allowed to have access to the political decision-making process. The matters to be discussed and decided on will not be limited as they are now; they will extend to questions about what the good life is for the community and those who make it up. 
Politics will be especially concerned with the virtues of justice and generosity, ensuring that citizens get what they deserve and what they need. And it is an important requirement of this new politics that everyone must “have a voice in communal deliberation about what these norms of justice require” (Dependent Rational Animals 129-130). This kind of deliberation requires small communities; although not every kind of small community is healthy, a healthy politics can only take place in a small community. Although their size cannot be precisely specified, they will be intermediate in scale between the family and the modern state (Dependent Rational Animals 131).

Politics will be understood and lived as a practice, and it will be about the pursuit of internal goods/goods of excellence rather than external goods/goods of effectiveness. “It is only because and when a certain range of moral commitments is shared, as it must be within a community structured by networks of giving and receiving, that not only shared deliberation, but shared critical enquiry concerning that deliberation and the way of life of which it is a part, becomes possible” (Dependent Rational Animals 161). When the community deliberates collectively about its best way of life it is choosing a telos, or final end. And that final end will be one which reflects the needs of all the citizens, including the need to have and use the virtues, which are part of our nature as dependent rational animals.

MacIntyre’s communities will also have traditions and histories, and they will have people who are authorities to whom the rest of us will submit ourselves while we learn about those traditions and histories. Think back to the discussion of chess. Authority in chess is derived from a mastery of the virtues internal to the game (or goods of excellence) rather than external virtues (or goods of effectiveness). Chess players with authority do not have authority because they dominate others, or because they have wealth or political power. Players recognize who has mastered the virtues internal to the game, and try to learn from them. Rather than hating or resenting or fearing those with authority, they welcome and value them; the powerful seek to share their knowledge and skills for the good of the game, rather than for purposes of domination or exploitation. All the players recognize the rules of the game that make it possible for the game to educate us in its virtues, and they follow those rules because they recognize them as necessary and desirable. They are loyal to the game, they enjoy it, and they genuinely care about those with whom they share it. There is competition, to be sure, but it is in the service of pursuing a common good. The political community, for MacIntyre, must be this kind of community.

13. A New Economics

Capitalism must be replaced or transformed, or at least ways must be found to shield individual small communities from its effects. “The tradition of the virtues is at variance with central features of the modern economic order and more especially its individualism, its acquisitiveness and its elevation of the values of the market to a central social place” (After Virtue 254). The ideas that the purpose of life is to get rich and that the well-being of a society can be measured by its economic production will both be rejected, for these both reflect a focus on the goods of effectiveness rather than the goods of excellence. In addition, capitalism undermines communities of all kinds, including the family; we must have a way of life that puts the common good first. “Market relationships can only be sustained by being embedded in certain types of local nonmarket relationship, relationships of uncalculated giving and receiving, if they are to contribute to overall flourishing, rather than, as they so often in fact do, undermine and corrupt communal ties” (Dependent Rational Animals 117). There are many possibilities for how we might construct new economic systems. “The institutional forms through which such a way of life is realized, although economically various, have this in common: they do not promote economic growth and they require some significant degree of insulation from and protection from the forces generated by outside markets” (Dependent Rational Animals 145; The MacIntyre Reader 249). The society MacIntyre prefers will have only small inequalities of income and wealth, to prevent people from being excluded from the community by their poverty or placing themselves above it on account of their great wealth, both of which phenomena we certainly see today (and which Aristotle recognized in his day). 
If MacIntyre is correct that growing up as human beings is about learning to overcome our immediate desires and learning to see our long term good, then advertising and marketing, which teach us to give in to our immediate desires, are going to become much less effective. Markets must be subordinated to the development of the virtues in individuals and the community, rather than the other way around, which is what happens in the world in which we now live.

14. Conclusion

MacIntyre’s ideal world would be very different from today’s world, and it is one that would undoubtedly take decades, and probably centuries, to arrive, just as the replacement of Aristotelian morality by liberal capitalism took a very long time. What are we to do in the meantime if we wish to carry out MacIntyre’s vision? MacIntyre says that we can begin to work on the kinds of small communities that are capable of preserving the practices and virtues even in the face of liberal capitalism (Whose Justice? Which Rationality? 99). We need to focus our energies on building and maintaining the kinds of small communities where practices and the virtues have a place and protecting them as much as possible from the depredations of the modern state and modern capitalism. At the end of Three Rival Versions of Moral Enquiry, he proposes ways to modify universities and their curricula to bring them closer to the kind of communities he wants to encourage. As far back as 1968’s Marxism and Christianity, MacIntyre was advocating “a politics of self-defence for all those local societies that aspire to achieve some relatively self-sufficient and independent form of participatory practice-based community” (Marxism and Christianity xxvi, cited in The MacIntyre Reader 23; see The MacIntyre Reader 248 and Breen 187). Small communities will also make it possible for people to evaluate political candidates in a variety of settings and judge them on the basis of integrity rather than adaptability (The MacIntyre Reader 249). We can evaluate our leaders on their actual characters rather than seeing them through the distortions of advertising and the manipulation of propaganda.

MacIntyre’s objections to liberal capitalism show the influences of both the Marxism to which he subscribed early in his career and the Catholic Church of which he is now a member. Both Marxism and Catholicism, for different reasons, critique the unbridled pursuit of wealth under capitalism. But there are many reasons to doubt that the kind of society MacIntyre promotes will turn out as he wishes. Many authors, from Adam Smith to Hayek to von Mises, have argued that attempts to control or limit markets inevitably have as a consequence attempts to control and limit human beings in ways that lead to the gulag rather than to the virtues. They would also argue that MacIntyre’s proposals, by limiting or discouraging economic growth, would condemn the poor to continued poverty and prevent improvements in living standards in general, and would punish people who are able to successfully provide people with what they want while profiting from this success. This would kill initiative and innovation and lead to stagnation. Whether people agree or disagree, MacIntyre would probably take some satisfaction in the fact that at least there is an argument going on – a serious discussion about the ultimate values and way of life the community should pursue – which is typically avoided or stifled on those rare occasions when it does arise. The next step would be to make this kind of argument a part of mainstream political discussions.

If his ideas become widespread and are widely adopted, MacIntyre’s small communities, like St. Benedict’s monasteries, will preserve the practices, the virtues, and morality until such a time as they can re-emerge into the world. In the meantime they will be the best way of life for those who are fortunate and hard-working enough to be a part of them. And of course those who, like MacIntyre, practice philosophy in his tradition must continue to strengthen and develop the arguments found in the Aristotelian tradition as it has developed through Aquinas, and continue to draw attention to the flaws and weaknesses of liberal philosophy in the hope of persuading others to change their allegiances.

15. References and Further Reading

a. Primary Sources

This bibliography includes only the most significant books from the period beginning with After Virtue and is in chronological order.

  • MacIntyre, Alasdair. After Virtue. Second Edition. Notre Dame: University of Notre Dame Press, 1984 (1981).
    • The foundation of his later work and the most important of his books to read. Includes his arguments about the failures of modern philosophy and politics and how those failures might be overcome, or at least diminished, with the help of the philosophy of Aristotle and the political way of life of the Greek city-state.
  • MacIntyre, Alasdair. Whose Justice? Which Rationality? Notre Dame: University of Notre Dame Press, 1988.
    • MacIntyre addresses “both what makes it rational to act in one way rather than another and what makes it rational to advance and defend one conception of practical rationality rather than another” (p. ix).
  • MacIntyre, Alasdair. Three Rival Versions of Moral Enquiry. Notre Dame: University of Notre Dame Press, 1990.
    • MacIntyre discusses three rival versions of moral enquiry: encyclopedia, tradition, and genealogy. He describes how they conflict with one another and the possibility that one of these traditions can “emerge as indisputably rationally superior” (p. 5). It is the Thomist tradition, he argues, that proves to be rationally superior to the others.
  • Knight, Kelvin. The MacIntyre Reader. Notre Dame: University of Notre Dame Press, 1998.
    • This is a collection of articles by MacIntyre, extracts from After Virtue and Whose Justice? Which Rationality?, and a pair of interviews of MacIntyre, along with an introductory essay on MacIntyre by Knight. The book is an excellent source for anyone looking for an overview of MacIntyre’s career, and Knight’s essay is an outstanding analysis of MacIntyre’s project. There is also a very thorough Guide to Further Reading, in essay form, in which Knight again reveals a sympathetic and extensive knowledge of MacIntyre’s work. Highly recommended.
  • MacIntyre, Alasdair. Dependent Rational Animals: Why Human Beings Need the Virtues. Chicago: Open Court, 1999.
    • MacIntyre begins this book with the claim that any moral philosophy must begin by acknowledging that human beings are a particular kind of animal with particular needs and goods that are determined by our animal nature. He then establishes what that nature is, and argues that it requires us to develop our rationality while acknowledging our dependence on others, thus providing us with a telos. He provides a sketch of what kind of social organization would be necessary to enable each of us to fulfill our telos, and how that kind of organization differs from the organization of the modern world.

b. Secondary Sources

  • Ballard, Bruce W. Understanding MacIntyre. Lanham: University Press of America, Inc., 2000.
    • This short (90 page) book has two parts: the first part explains the fundamentals of MacIntyre’s thought for beginning students, and the second part brings MacIntyre into contact with thinkers such as Marx, Kierkegaard, and Graybosch. Unfortunately these chapters are too brief to be really useful on their own; Chapter 10, for example, entitled “MacIntyre and His Critics,” is a mere five pages long.
  • Breen, Keith. “Alasdair MacIntyre and the Hope for a Politics of Virtuous Acknowledged Dependence.” Contemporary Political Theory (2002) 1, 181-201.
    • A political analysis of Dependent Rational Animals. The author concludes that MacIntyre must moderate his claims if he is to avoid self-contradiction and “a despairing purism.”
  • Fuller, Michael. Making Sense of MacIntyre. Aldershot, UK: Ashgate, 1998.
    • This 144 page book has a title that might lead one to expect an introductory volume, but while there is a summary of MacIntyre’s themes, the author also uses other philosophers, such as Donald Davidson and especially Richard Rorty, to make sense of MacIntyre’s thought, and the reader who is not already familiar with Davidson and Rorty may find this material difficult to understand.
  • Horton, John, and Susan Mendus, eds. After MacIntyre: Critical Perspectives on the Work of Alasdair MacIntyre. Notre Dame: University of Notre Dame Press, 1994.
    • This collection of essays is wide-ranging, including essays on MacIntyre’s conception of justice, his characterization of liberalism, his interpretation of Aquinas and his critique of the Enlightenment. The last chapter is written by MacIntyre himself; entitled “A Partial Response To My Critics,” it offers MacIntyre’s responses to some of the criticisms offered by the other authors. (MacIntyre’s willingness to engage with his critics is both rare and admirable).
  • McMylor, Peter. Alasdair MacIntyre: Critic of Modernity. London: Routledge, 1994.
    • The author is a sociologist who treats MacIntyre’s work as social criticism. Part one is entitled “MacIntyre – Christianity and/or Marxism?” and part two is “Markets, Managers, and the Virtues.”
  • Murphy, Mark C., ed. Alasdair MacIntyre. Cambridge: Cambridge University Press, 2003.
    • A collection of eight essays by various scholars, with an introduction by Murphy, that address different aspects of MacIntyre’s thought. Chapters 6 and 7, “MacIntyre’s Political Philosophy,” by Murphy, and “MacIntyre’s Critique of Modernity,” by Terry Pinkard, were especially helpful in working on this essay. The book concludes with an excellent bibliography of works by and about MacIntyre.
  • Weinstein, Jack Russell. On MacIntyre. No location: Thomson Wadsworth, 2003.
    • Intended for the beginning philosophy student and the general reader. Chapter 2 is a brief biography of MacIntyre’s life with an emphasis on his intellectual influences; Chapter 3 focuses on MacIntyre’s theological work, particularly MacIntyre’s early comparisons of Christianity and Marxism.

Author Information

Ted Clayton
Email: clayt1ew@cmich.edu
Central Michigan University
U. S. A.

George Edward Moore (1873–1958)

G. E. Moore was a highly influential British philosopher of the early twentieth century. His career was spent mainly at Cambridge University, where he taught alongside Bertrand Russell and, later, Ludwig Wittgenstein. The period of their overlap there has been called the “golden age” of Cambridge philosophy. Moore’s main contributions to philosophy were in the areas of metaphysics, epistemology, ethics, and philosophical methodology. In epistemology, Moore is remembered as a stalwart defender of commonsense realism. Rejecting skepticism on the one hand, and, on the other, metaphysical theories that would invalidate the commonsense beliefs of “ordinary people” (non-philosophers), Moore articulated three different versions of a commonsense-realist epistemology over the course of his career.

Moore’s epistemological interests also motivated much of his metaphysical work, which to a large extent was focused on the ontology of cognition. In this regard, Moore was an important voice in the discussion about sense-data that dominated Anglo-American epistemology in the early twentieth century.

In ethics, Moore is famous for driving home the difference between moral and non-moral properties, which he cashed out in terms of the non-natural and the natural. Moore’s classification of the moral as non-natural was to be one of the hinges upon which moral philosophy in the Anglo-American academy turned until roughly 1960.

Moore’s approach to philosophizing involved focusing on narrow problems and avoiding grand synthesis. His method was to scrutinize the meanings of the key terms in which philosophers expressed themselves while maintaining an implicit commitment to the ideals of clarity, rigor, and argumentation. This aspect of his philosophical style was sufficiently novel and conspicuous that many saw it as an innovation in philosophical methodology. In virtue of this, Moore, along with Bertrand Russell, is widely acknowledged as a founder of analytic philosophy, the kind of philosophy that has dominated the academy in Britain and the United States since roughly the 1930s.

Moore also had a significant influence outside of academic philosophy, through his contacts in the Cambridge Apostles and the Bloomsbury group. In both academic and non-academic spheres, Moore’s influence was due in no small part to his exceptional personality and moral character.

Table of Contents

  1. Biography
  2. Metaphysics and Epistemology
    1. Internal Relations and Absolute Idealism
    2. The Identity Theory of Truth, Propositional Realism, and Direct Realism
    3. Sense-Data and Indirect Realism
    4. From the Ontology of Cognition to Criteriology
  3. Ethics
    1. Goodness and Intrinsic Value
    2. The Open Question Argument and the Naturalistic Fallacy
    3. Ideal Utilitarianism
    4. The Influence of Moore’s Ethical Theory
  4. Philosophical Methodology
  5. Moore’s Influence and Character
  6. References and Further Readings
    1. Primary Sources
    2. Secondary Sources

1. Biography

George Edward Moore was born on November 4, 1873, one of seven children of Daniel and Henrietta Moore. There were eight Moore children in all, as Daniel had a daughter from his first wife. G. E. Moore was raised in the Upper Norwood district of South London. His early education came at the hands of his parents: his father taught him reading, writing, and music; and his mother taught him French. Moore was a more-than-competent pianist and composer. At eight he was enrolled at Dulwich College, where he studied mainly Greek and Latin, but also French, German, and mathematics. At eighteen he entered Cambridge University, where he began as a student in Classics.

His first two years of University study proved to be less than challenging, his time at Dulwich having already prepared him exceptionally well in Greek and Latin. It was during this time that Moore became interested in philosophy. As he later reminisced:

I had indeed at Dulwich read Plato’s Protagoras …; but I was certainly not then very keenly excited by any of the philosophical questions which that dialogue raises …. What must have happened, during this second year at Cambridge, was that I found I was very keenly interested in certain philosophical statements which I heard made in conversation. (Moore 1942a, 13)

The conversations in question involved such notables as Henry Sidgwick, James Ward, and J.M.E. McTaggart, who became his teachers, and Bertrand Russell—then a student two years ahead of Moore—who for a time became his friend and philosophical ally. Moore’s and Russell’s relationship was lifelong, but it became strained early on. It was Russell who convinced Moore to study Moral Science, a division of philosophy in the British University system. In 1896, Moore took first-class honors in both Classics and Moral Science. After this, he attempted to win a Prize-Fellowship, as McTaggart and Russell had done before him. He succeeded in 1898, on his second attempt, and remained at Cambridge as a Fellow of Trinity College until 1904.

Beginning around 1897, and continuing through his time as a Fellow, Moore began to act as a “professional” philosopher, participating in the doings of the extant philosophical societies (such as the Aristotelian Society and the Moral Sciences Club) and publishing his work. Many of his best known and most influential works date from this period. It was also during this period that Moore instigated the momentous break from the then dominant philosophy of Absolute Idealism that would prove to be the first step toward the rise of analytic philosophy.

After his fellowship ended, Moore left Cambridge for a period of seven years, during which time he lived in Edinburgh and Richmond, Surrey, and worked independently on various philosophical projects. He returned to Cambridge in 1911 as a lecturer in Moral Science, and he remained there for the majority of his career, and, indeed, his life. He earned a Litt.D. in 1913, was elected a fellow of the British Academy in 1918, and was chosen as James Ward’s successor as Professor of Mental Philosophy and Logic in 1925. He occupied that position until 1939, when he retired and was succeeded by Wittgenstein. From 1940 to 1944 Moore was a visiting professor at several universities in the United States. He then returned to Cambridge, but not to teaching. He served as editor of Mind, the leading philosophical journal of the day, from 1921 to 1947. In 1951, he was awarded the British Order of Merit.

Beyond his professional career, Moore had a successful family life. In 1916, at age 43, he married Dorothy Ely, who had been his student. The couple had two sons: Nicholas (b. 1918) and Timothy (b. 1922). By all accounts, Moore was an exemplary husband and father.

Moore died in Cambridge on October 24, 1958. He is buried in St. Giles’ churchyard.

2. Metaphysics and Epistemology

Two facts make it difficult to separate Moore’s contributions to metaphysics from his contributions to epistemology. First, his main contributions to metaphysics were in the ontology of cognition, which is often treated as a branch of epistemology. Second, his main contributions to epistemology were motivated by what he called the “commonsense” or “ordinary” view of the world, and this is properly a metaphysical conception, a worldview or Weltanschauung. Consequently, the next section treats Moore’s metaphysics and his epistemology together.

a. Internal Relations and Absolute Idealism

Moore became interested in philosophy at a time when Absolute Idealism had dominated the British universities for half a century, in a tradition stretching from S.T. Coleridge and T.H. Green to F.H. Bradley and J.M.E. McTaggart. McTaggart was Moore’s earliest philosophical mentor. Moore’s earliest philosophical views were inherited directly from him.

Absolute Idealism is a brand of metaphysical monism. It implies that, although the world presents itself to us as a collection of more or less discrete objects (this bird, that table, the earth and the sun, etc.), it really is one indivisible whole, whose nature is mental (or spiritual, or ideal) rather than material. Thus it is also a form of anti-realism, since it claims that the world of ordinary experience is something of an illusion—not that the objects of ordinary experience do not exist, but that they are not, as we normally take them to be, discrete. Instead, every object exists and is what it is at least partly in virtue of the relations it bears to other things—more precisely, to all other things. This is called the doctrine of internal relations, which Moore understood as the view that all relations are necessary. On this view, my coffee cup is not just the apparently self-contained entity that I lift off the table and draw to my lips. Instead, it contains, as essential parts of itself, relations to every other existing thing; thus, as I draw it to my lips, I draw the universe along with it, and am responsible for, in a sense, reconfiguring the universe. Since, on this view, everything that exists does so only in virtue of its relations to everything else, it is misleading to say of any one thing, for example, my coffee cup, that it exists simpliciter. The only thing that exists simpliciter is the whole—the entire network of necessarily related objects.

Though Moore accepted Absolute Idealism for a short while in his undergraduate years, he is best remembered for the views he developed in opposition to it. In fact, what is most characteristic of Moore’s mature philosophy is a thoroughgoing realism about what he came to call the “commonsense” or “ordinary” view of the world. This involves a lush metaphysical pluralism (the belief that there are many things that exist simpliciter) that stands in sharp contrast to the monism of the Absolute Idealists.

Inklings of Moore’s misgivings about Absolute Idealism begin to appear as early as 1897, in his first (unsuccessful) Prize-Fellowship dissertation on “The Metaphysical Basis of Ethics.” Though in it he openly identifies with the British Idealist school, it is here that Moore first raises a point that proved to be the hole in the Idealists’ dike. The Idealists’ doctrine of the internality of all relations has implications for the ontology of cognition. Specifically, it implies that objects of knowledge/cognition are not independent of their knowers. In other words, being known (cognized, perceived, etc.) makes a difference to the nature and being of the thing being known, the “object” of knowledge. Indeed, it was this aspect of the view which marked it as Idealist, as the Idealists commonly posited a great Mind, often simply called “the Absolute,” that “grounded” the whole of reality by cognizing it. And it is this view in the ontology of cognition that Moore obliquely rejects in his 1897 dissertation. He does not address it directly and in specie, but only in the restricted context of moral epistemology. In discussing Kant’s moral epistemology, Moore argues that Kant’s conception of practical reason conflates the faculty of judgment with judgments themselves (that is, bearers of objective truth), which he thinks should be kept separate. To maintain a sharp distinction between cognitive faculties and their activities, on the one hand, and their objects, on the other, is a staple of Austro-German philosophy from Bolzano and Lotze to Husserl, and it is likely that Moore got the idea from reading in that tradition (cf. Bell 1999).

At this point, Moore had neither the doctrine of internal relations nor British Idealism in his sights. It is probably more accurate to say that he was objecting to what is frequently called psychologism—the view that apparently objective truths (for example, of logic, mathematics, ethics, etc.) are to be accounted for in terms of the operations of subjective cognitive or “psychological” faculties. Psychologism was common to nearly all versions of Kantian and post-Kantian Idealism, including British Absolute Idealism. It was also a common feature of thought in the British empirical tradition, from Hume to Mill. For the British Idealists, psychologism was a consequence of the doctrine of internal relations as the latter applies to the ontology of cognition.

It was not long before Moore recognized this. Accordingly, he expanded the scope of his 1897 criticism from the ontology of moral knowledge to the ontology of knowledge in general, and this quickly became the principal weapon in his rebellion against British Idealism. This began in earnest in his successful 1898 Prize-Fellowship dissertation, which formed the basis for his first influential paper, “The Nature of Judgment” (Moore 1899). In both of these works, Moore pushes the anti-psychologistic distinction between subjective faculties/activities and their objects. He couples this, however, with a peculiar account of the nature of truth, of propositions and of ordinary objects.

b. The Identity Theory of Truth, Propositional Realism, and Direct Realism

The Idealist F.H. Bradley had held that truth was a matter of correspondence between a judgment (which was made up of ideas) and its object. At first glance Bradley’s view appears to be the classical correspondence theory of truth, but it is actually a peculiar inversion of that theory. On the classical correspondence theory, the “truth maker” is the object, not any subject who does the believing of this truth. That is, facts make truths true; believers do not. But, given the Idealists’ views about the ontological priority of the mental/ideal and the internality of all relations, it follows that any judgment’s being true is ultimately due to the great Mind, the Absolute. Thus, as Moore notes at the beginning of his paper, while Bradley affirms that truth is not a relation between reality and our judgments, but rather judgments “in themselves,” he does not remain true to this view, and ends up flirting with psychologism.

Replacing Bradley’s overtly psychologistic terms “idea” and “judgment” with the more neutral terms “concept” and “proposition,” and maintaining his anti-psychologistic distinction between subject and object, Moore rejects the Idealistic inversion of the correspondence theory of truth. He does not simply revert to the classical version, however. Instead, he seeks to secure the objectivity of truth by eliminating the notion of correspondence entirely. Truth could not be a matter of correspondence between proposition and object, Moore argues, since in a case like “2+2=4” we regard the proposition as true even though there is no object in the empirical world to which the proposition corresponds. Thus, propositions must be regarded as true (or false) “in themselves,” without reference either to a subject which entertains them as elements in occurrent acts of consciousness, or to any object beyond them which they might be “about.” Instead, when a proposition is true, it is because a peculiar relation obtains among the concepts that make it up. Since this view casts the proposition as its own truth-maker, it has been called the “identity theory” of truth (cf. Baldwin 1991). Moore sums up his view this way:

A proposition is composed not of words, nor yet of thoughts, but of concepts. Concepts are possible objects of thought; but this is no definition of them. … It is indifferent to their nature whether anybody thinks them or not. They are incapable of change, and the relation into which they enter with the knowing subject implies no action or reaction [on the part of the proposition]. … A proposition is a synthesis of concepts; and just as concepts are themselves immutably what they are, so they stand in infinite relations to one another equally immutable. A proposition is constituted by any number of concepts, together with a specific relation between them; and according to the nature of this relation the proposition may be either true or false. What kind of relation makes a proposition true, what false, cannot be further defined, but must be immediately recognised. (Moore 1899, 179-180)

Thus understood, propositions seem to be a lot like Platonic Forms: they are unchanging bearers of truth that exist independently of any “instances” of consciousness. Historically, there is nothing peculiar in this (apart from its appearance in the British context, perhaps). In fact, these views of Moore’s are in keeping with what may be called the “standard” nineteenth and early-twentieth century view of propositions held by Bolzano, Frege, Russell, W.E. Johnson, and L.S. Stebbing (cf. Willard 1984, 180 f.; Bell 1999).

What is novel in Moore, however, is his identity theory of truth, and his related identification of ordinary objects with propositions. One aspect of the standard view was that whenever a proposition happened to be involved in an occurrent act of consciousness, it played the role of “object”: the act was immediately of or about the proposition. Thus, prima facie, the only form of epistemological realism compatible with the standard view is “indirect” or “representative” realism. This is the view that the external world is not given to us directly, but only as mediated by a surrogate object, like a proposition or, in Moore’s later philosophy, a sense-datum. But this aspect of the standard view chafed against Moore’s growing partiality for common-sense (or “naïve”) realism, which assumes direct realism in epistemology. Thus, in order to secure direct, cognitive access to the external world, Moore cleverly eliminated the would-be mediators by identifying propositions with the objects of ordinary experience themselves.

His first move in this direction was to show that the identity theory of truth applies to propositions that, unlike “2+2=4,” do seem to require a relation to something outside themselves in order to be true. For instance, it is hard to see how the sentence “The cat is on the mat” could be true in itself, apart from a relation to some state of affairs in the empirical world. However, Moore says:

… this description [of truth] will also apply to those cases where there appears to be a reference to existence. Existence is itself a concept; it is something which we mean; and the great body of propositions, in which existence is joined to other concepts or syntheses of concepts are simply true or false according to the relation in which it stands to them. (Moore 1899, 181)

So, “The cat is on the mat” is true when the concepts constitutive of it (“cat,” “mat,” “on,” and so forth) are united with the concept “existence” by that indefinable, internal relation that is truth. Thus also for “The cat exists.” It is not that the proposition is true only if the cat exists; rather, it is that the cat exists only if the proposition is true in virtue of its own internal structure.

By making existence both dependent on truth and, like truth, internal to a proposition, Moore is in effect identifying the class of existents with the class of true propositions that involve the concept “existence” as a constituent. As Moore goes on to say, “an existent is seen to be nothing but a concept or complex of concepts standing in a unique relation to the concept of existence,” and thus “it now appears that perception is to be regarded philosophically as the cognition of an existential proposition” (Moore 1899, 182-3). In this way, “the opposition of concepts to existents disappears” (Moore 1899, 183), and Moore secures a direct realist account of cognition.

By the same token, he commits himself to what is, on the face of it, an unlikely view of the world: given the identity theory of truth, “it seems necessary to regard the world as formed of concepts” (Moore 1899, 182). But, Moore reminds us, this is not to be taken as a claim that reality is at bottom mentalistic or Ideal; for his account of concepts and propositions has already made clear that these exist independently of any acts of thinking. Thus, he says:

…the description of an existent as a proposition … seems to lose its strangeness, when it is remembered that a proposition is here to be understood, not as anything subjective—as an assertion or affirmation of something—but as the combination of concepts which is affirmed. (Moore 1899, 183)

Whether this really does alleviate the description’s strangeness is contestable; but it is clear that Moore means for it to be consistent with our commonsense view of the world. Unfortunately, however, the view has a peculiar consequence that is anything but commonsensical. Bertrand Russell called it the problem of “objective falsehoods.” Given Moore’s theory of truth and its attendant realism about propositions, false propositions have, or may have, the same ontological status as true propositions. At the very least, they are somehow “there” to be asserted or affirmed just as true propositions are. Moreover, since truth and falsity are prior to and independent of existence, there is no obvious reason why a false proposition could not include “existence” as a concept just as a true one can. By 1910, Bertrand Russell—who at first accepted Moore’s views—had convinced both himself and Moore that they were to be rejected precisely for these reasons (see Russell 1906, 1910; Moore 1953; see also the discussion of these matters in Baldwin 1991).

Nonetheless, Moore had held this view of truth and reality for approximately a decade, during which time many of his most influential works were published. Among these was his celebrated paper “The Refutation of Idealism” (Moore 1903b). Here he tackles Idealism head-on and in specie. Asserting that all forms of Idealism rest on the claim that esse is percipi (“to be is to be perceived,” or, as Moore treats it, “to be is to be experienced”), Moore argues that the claim is false. He begins by analyzing in great detail several possible meanings of the formula “esse is percipi.” Ultimately, he determines that Idealists take it to be an analytic truth, in that it is proved by the law of contradiction. Thus, they also believe existence and cognition to be somehow identical. According to this, for yellow to exist just is for someone to have a sensation of yellow. In identifying yellow and the sensation of yellow, the Idealist “fails to see that there is anything whatever in the latter that is not in the former” and thus, for him, “yellow and the sensation of yellow are absolutely identical” (Moore 1903b, 442). But, according to Moore, this is a mistake. Careful attention to the sensation of yellow, on the one hand, and yellow, on the other, will reveal that they are not identical. As he says, “the Idealist maintains that object and subject are necessarily connected, mainly because he fails to see that they are distinct” (Moore 1903b, 442); but Moore thinks he can show that they are distinct, and he deploys two arguments to this end.

His first argument turns upon what would later come to be called the paradox of analysis—an intractable problem that, ironically, would plague Moore’s own later work. The paradox can be explained in terms of the familiar act of defining a term. In any case of definition, one is confronted with two bits of language: the term to be defined (the definiendum) and the term that does the defining, the definition itself (the definiens). Both definiendum and definiens are supposed to have the same meaning—else the latter would not be able to illuminate the meaning of the former. But if both terms mean the same, it is hard to see how giving a definition could be illuminating. Consider the case of the definiendum “bachelor” and its definiens “unmarried man.” In order for “unmarried man” to be a good definition of “bachelor,” it must mean the same as “bachelor.” But if it means exactly the same thing, then it seems that saying “‘bachelor’ means ‘unmarried man’” shouldn’t be any different from saying “‘bachelor’ means ‘bachelor’” or “‘unmarried man’ means ‘unmarried man.’” And yet there does seem to be a difference in that we find the one informative; but the others, not. Thus it seems that there is a difference in meaning between “bachelor” and “unmarried man.”

In sum, then, the paradox is this: a term and its definition must say the same thing in order for the definition to be correct, and yet they must say something different in order for the definition to be informative. The paradox can be put into the form of a dilemma:

  1. If a definiens is correct, then its meaning is the same as that of the definiendum.
  2. If a definiens is informative, then its meaning is not the same as that of the definiendum.
  3. A definiens’ meaning cannot be both the same and not the same as that of the definiendum.
  4. Thus, a definiens cannot be both correct and informative.
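
The logical shape of this dilemma can be made explicit in a short propositional sketch. The letters C, I, and S are labels introduced here for illustration only; they are not Moore’s notation. Let C stand for “the definiens is correct,” I for “the definiens is informative,” and S for “the definiens has the same meaning as the definiendum.”

```latex
\begin{align*}
&1.\quad C \rightarrow S           && \text{(premise 1)}\\
&2.\quad I \rightarrow \neg S      && \text{(premise 2)}\\
&3.\quad \neg (S \wedge \neg S)    && \text{(premise 3, non-contradiction)}\\
&4.\quad C \wedge I                && \text{(assumption for reductio)}\\
&5.\quad S \wedge \neg S           && \text{(from 1, 2, 4)}\\
&6.\quad \neg (C \wedge I)         && \text{(from 3, 4--5, discharging the assumption)}
\end{align*}
```

On this reading the argument is valid, so anyone who wishes to resist the conclusion must reject premise 1 or premise 2, that is, deny that correctness requires sameness of meaning or that informativeness requires difference of meaning.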

Now, this paradox functions in Moore’s first argument against the formula “esse is percipi” in the following way. The formula itself can be read as a definition. Just as we say, “A bachelor is an unmarried man,” so the Idealist says, “To exist is to be cognized,” or “Yellow is the sensation of yellow.” However, if the two really were identical, it would be superfluous to assert that they were; thus, the fact that the Idealist sees some need to assert the formula reveals that there is, as with any definiendum and its definiens, some difference between existence and cognition, or yellow and the sensation of yellow. As Moore says,

Of course, the proposition [that is, the formula] also implies that experience is, after all, something distinct from yellow—else there would be no reason to insist that yellow is a sensation: and that the argument [that is, the formula] both affirms and denies that yellow and the sensation of yellow are distinct is what sufficiently refutes it. (Moore 1903b, 442)

The argument may seem decisive. However, we should note that it turns upon Moore’s decision to push the Idealists toward the second horn of the “paradox of analysis” dilemma. Both horns are utterly destructive to “knowledge by description” (of which definitional knowledge is a type), so the Idealists would fare no better with the first horn. But the paradox of analysis is a problem not only for the Idealists, but for everyone who wants to affirm the practice of giving a definition, or, as Moore would later call it, an “analysis” of a concept. Thus, one might be inclined to hold off on embracing either horn, and instead concentrate on resolving the paradox. Charity requires that we extend this reprieve to our adversaries as well. Indeed, except for the fact that Moore hadn’t yet fully grasped the scope of the paradox lying just below the surface of his argument, we’d have to say that he was being terribly unfair by insisting that the Idealists hurry up and impale themselves on the second horn.

Moore’s second argument is much better. It is essentially an application of the now familiar, anti-psychologistic distinction between subject and object. He begins by comparing a sensation of blue with a sensation of green. These are the same in one respect, in virtue of which they are both called “sensations”; but they differ in another respect, in virtue of which the one is said to be “of blue” and the other “of green.” Moore gives the name “consciousness” to the respect in which they are the same, and the respects in which they are different he calls “objects” of sensation or of consciousness. Thus, he says, every sensation is a complex of consciousness and object.

Having distinguished consciousness from object, Moore goes on to distinguish object from sensation. Focusing now on a single sensation, the sensation of blue, Moore says that, when it exists, either (1) consciousness alone exists, (2) the object alone (that is, blue) exists, or (3) both exist together (presumably this is the sensation of blue). But each of these possibilities represents a different state of affairs: neither (1) consciousness alone, nor (3) consciousness and blue together are identical to (2) blue. Thus it is not the case that the sensation of blue is identical to blue, and it is therefore false that esse is percipi.

This negative conclusion of Moore’s essay is the refutation of idealism, properly speaking. However, the essay also has a positive conclusion, which purports to establish the truth of a direct realist account of cognition. Most philosophers in the modern period have accepted some form of representationalism, according to which we have direct cognitive access only to our own mental states (ideas, impressions, perceptions, judgments, etc.). But, according to Moore, what his analysis of consciousness shows is that, “whenever I have a mere sensation or idea, the fact is that I am then aware of something which is … not an inseparable aspect of my experience;” and this has the monumental consequence that,

there is … no question of how we are to ‘get outside the circle of our own ideas and sensations.’ Merely to have a sensation is already to be outside that circle. It is to know something which is as truly and really not a part of my experience, as anything which I can ever know. (Moore 1903b, 450)

Consistent with his 1899 view, we have direct cognitive access to the objects of our experience.

c. Sense-Data and Indirect Realism

The direct realism of Moore’s early period depended heavily upon an ontology of cognition that included both his propositional realism and his identity theory of truth. When the problem of objective falsehoods finally drove him to abandon both, a revised account of cognition was required to secure some form of epistemological realism. For instance, no longer could he explain the difference between “2+2=4” and “The cat is on the mat” by referring to the presence of the concept “existence” in the latter proposition. Instead, Moore now cashed out the difference in terms of what he called “sense-data.”

Examples of sense-data include color patches (the octagonal patch of red associated with a stop sign) and appearances (the elliptical appearance of a coin when viewed at an angle). Beyond examples of this sort, exactly what sense-data are was never made sufficiently clear by Moore or others. Thanks largely to Moore, their nature was kept a matter of ongoing debate in the early twentieth century.

Most proponents of sense-data construed them as mental entities responsible for mediating our sensory experiences of external objects. For example, in perceiving a stop-sign, what one is immediately conscious of is some set of sense-data through which are conveyed the stop-sign’s size, shape, color, and so on. The stop-sign itself remains “outside the circle of ideas,” or rather, sense-data, and we are thus aware of it only indirectly. In its usual form, sense-data theory is a form of representationalism consistent with indirect realism, not direct realism.

Moore initially accepted this representationalist view of sense-data; but he was not long content with it, since it seemed to leave the commonsense view of the world open to skeptical doubts of a familiar, Cartesian variety. Consequently, he modified sense-data theory to make it a form of direct realism, just as he had previously done with proposition theory. His strategy in both cases was the same: by making the purported mental-mediators identical with external objects, he would eliminate the need for a mediator and make external objects directly available to consciousness. Thus, for a period of about fifteen years, Moore attempted off-and-on to defend a view according to which sense-data were identical to external objects or parts of such objects. For instance, a sense-datum could be identical to the whole of an object in the case of a sound, while for visible objects, which always have “hidden” sides (the underside of a table or the back side of a coin, for example) a single sense-datum could be identical to only a part of the object’s surface.

Ultimately, Moore could not sustain this sense-data version of direct realism any better than his previous, propositional version. It gave way under the weight of arguments such as the argument from illusion and the argument from synthetic incompatibility. The latter runs as follows. Suppose that person A is looking at the front side of a coin straight-on, and person B is looking at the same coin from an angle. To A, the front side of the coin appears to be circular; to B, it appears to be elliptical. The sense-data theorist accounts for this by saying that A is seeing a circular sense-datum, while B is seeing an elliptical sense-datum. But, given that A and B are looking at the same part of the coin’s surface (the whole surface of the front side), Moore’s proposal that sense-data are identical to parts of the surfaces of external objects entails that the whole surface of the front side of the coin is both circular and elliptical at the same time; but this implies a contradiction, and so cannot be true.

The argument from illusion raises problems analogous to the problem of “objective falsehoods,” which drove Moore from his early propositional realism. On the representationalist version of sense-data theory, we can explain the difference between true perceptions and false (illusory) perceptions by referring to the correspondence and lack of correspondence between a sense-datum and the external object it represents. On Moore’s direct realist version, however, it makes no sense to speak of a sense-datum as failing to correspond to the object. Since sense-data are identical to objects or their parts, there can be no sense-data without there being (or, rather, their being) an object, and this implies both that illusion is impossible (which flies in the face of experience) and that all those experiences that we would normally call “illusory” really aren’t: the “illusory object” really exists if illusory sense-data exist.

By 1925, Moore conceded that he could find no way around these sorts of arguments (cf. Moore 1925), and hence he fell back on a version of indirect realism.

d. From the Ontology of Cognition to Criteriology

With his failed attempt to sustain a direct realist version of sense-data theory, Moore had come to the end of his rope in trying to work out an adequate, realist ontology of cognition. This did not lead to his abandoning either epistemological or metaphysical realism in general, however. To do so would have been a genuine possibility, since to abandon direct realism is to admit that we have no direct evidence of the existence of the commonsense world. While “indirect” or “representational” versions of realism are possible, it is nonetheless natural to see representationalism as opening the door to the very sort of anti-realism (in forms like idealism, phenomenalism, and so on) that Moore had labored to overthrow.

Instead of sliding down the potentially slippery slope from representationalism to anti-realism, however, Moore dug in his heels, insisting that we are justified in accepting the commonsense view of the world despite the fact that we cannot adequately explain, ontologically, how the world is given to us. As Moore himself put it, “We are all, I think, in the strange position that we do know many things…and yet we do not know how we know them” (Moore 1925; in 1959, 44).

This approach comes through clearly in Moore’s 1925 paper “A Defense of Common Sense.” Here, Moore acknowledges that direct realism, indirect realism, and phenomenalism are more or less equally matched contenders for the correct account of cognition. Since we cannot determine the correct account, we do not know how it is that we know. However, he argues, it would be wrong to see this as grounds for calling into question that we know or what we know. Indeed, there are many things that we know perfectly well, despite our inability to say how we know them. Among these “beliefs of common sense” are such propositions as “There exists at present a living human body, which is my body,” “Ever since it [this body] was born, it has been either in contact with or not far from the surface of the earth,” and “I have often perceived both body and other things which formed part of its environment, including other human bodies” (Moore 1925; in 1959, 33).

Moore claims that he knows these and many other propositions to be certainly and wholly true; and one of the other propositions that Moore claims to know with certainty is that others have also known the aforementioned propositions to be true of themselves, just as he knows them to be true of himself. By claiming that these propositions of common sense (hereafter CS propositions) are certainly true, Moore means to oppose the skeptic who would deny that we know anything with certainty. By claiming that CS propositions are wholly true, he means to oppose the Idealist, who would claim that no statement about some isolated object can be true simpliciter, since each object has its identity only as a part of the whole universe.

In support of his view, Moore claims that each CS proposition has an “ordinary meaning” which specifies exactly what it is one knows when one knows it. This “ordinary meaning” is perfectly clear to most everyone, except for some philosophers who

seem to think that [for example] the question “Do you believe that the earth has existed for many years past?” is not a plain question, such as should be met either by a plain “Yes” or “No,” or by a plain “I can’t make up my mind,” but is the sort of question which can be properly met by: “It all depends on what you mean by ‘the earth’ and ‘exists’ and ‘years’….” (Moore 1925; in 1959, 36)

But Moore thinks that to call things into question this way is perverse; and, far from being the task of philosophy, it actually undermines that task. For even the skeptic tacitly assents to the truth of CS propositions, at least in referring to himself as a philosopher, by making references to other philosophers with whom he may disagree, and so on:

For when I speak of ‘philosophers’ I mean, of course (as we all do), exclusively philosophers who have been human beings, with human bodies that have lived upon the earth, and who have at different times had many different experiences. (Moore 1925; in 1959, 40)

On the face of it, Moore’s general idea seems to be that the truth of CS propositions, and hence of the commonsense view of the world, is built into the terms of our ordinary language, so that if some philosopher wants to say that some CS proposition is false, he thereby disqualifies the very medium in which he expresses himself, and so speaks nonsensically. Either that or he is using terms in something other than their ordinary senses, in which case his claims have no bearing on the commonsense view of the world.

Since the bounds of intelligibility seem to be fixed by the ordinary meanings of CS propositions, the job of the philosopher begins by accepting them as starting points for philosophical reflection. Then, the philosopher questions not their truth, but what Moore calls their correct analysis. Giving an analysis resembles giving a definition, and in fact it is very difficult to say what distinguishes the two. For Moore, the difference is ontological: definition is performed upon words, analysis upon propositions and concepts. But both involve setting forth two terms that are supposed to mean the same, one of which is supposed to elucidate the other. In definition these are the definiendum (the term being defined) and the definiens (the term doing the defining); in analysis, they are the analysandum (the term being analyzed) and the analysans (the term doing the analyzing). Both may take the same verbal form, for example, “A brother is a male sibling” or “‘Brother’ means ‘male sibling’.” These sentences could express either an analysis or a definition, depending upon the intentions of the speaker. The difference cannot be determined just by looking. This was a matter of great confusion for Moore’s contemporaries. In any case, it is as analyses of CS propositions that views like direct realism, indirect realism, sense-data theory, phenomenalism, and the like have their place in philosophy. These views should not, according to Moore, disqualify or in any way challenge the commonsense view of the world, but only give us a deeper understanding of what it is to have a sensory experience, or to think a thought, etc.

Moore’s new approach to defending common sense is also apparent in what is arguably his most famous paper, “Proof of an External World” (Moore 1939). Here, after expending considerable effort to nail down the meaning of “external object” as “something whose existence does not depend on our experience,” Moore claims that he can prove some such objects exist

By holding up my two hands, and saying, as I make a certain gesture with the right hand, ‘Here is one hand’, and adding, as I make a certain gesture with the left, ‘and here is another’. (Moore 1939; in 1993, 166)

Moore’s complete line of thought seems to be this: “Here is one hand” is a CS proposition with an ordinary meaning. Using it in accordance with that meaning, presenting the hand for inspection is sufficient proof that the proposition is true—that there is indeed a hand there. Ditto for the other hand. But a hand, according to the ordinary meaning of “hand,” is a material object; and a material object, according to the ordinary meaning of “material object,” is an external object. Because there are two hands, and because hands are external objects, it follows that there is an external world, according to the ordinary meaning of “external world.”

Neither Moore’s defense of common sense nor his proof of an external world was universally convincing. Some misunderstood the latter as an attempt to disprove skepticism. Taken this way, it is clearly a miserable failure. However, as Moore himself later insisted, he never meant to disprove skepticism, but only to prove the existence of the external world:

I have sometimes distinguished between two different propositions, each of which has been made by some philosophers, namely (1) the proposition ‘There are no material things’ and (2) the proposition ‘Nobody knows for certain that there are any material things.’ And in my latest British Academy lecture called ‘Proof of an External World’ … I implied with regard to the first of these propositions that it could be proved to be false in such a way as this; namely, by holding up one of your hands and saying ‘This hand is a material thing; therefore there is at least one material thing’. But with regard to the second of the two propositions …. I do not think I have ever implied that it could be proved to be false in any such simple way … (Moore 1942b, 668)

Even without this misunderstanding, however, Moore’s new approach to promoting common sense is open to the charge of begging the question by simply assuming that CS propositions are true according to their ordinary meanings. Wittgenstein put the point bluntly: “Moore’s mistake lies in this—countering the assertion that one cannot know that, by saying ‘I do know it’” (Wittgenstein 1969, § 521). By stonewalling the skeptic in this way, Moore was in effect refusing to recognize that, lacking a plausible, direct realist account of cognition, there are legitimate grounds for questioning the truth of CS propositions. If it is possible that direct realism is false, then it is possible that none of our experiences connect us with the commonsense world. Thus, we have no indubitable evidence for there being such a world, and, supposing there are such things as CS propositions and their ordinary meanings, it is possible that they fail to represent reality accurately. Thus, both Moore’s defense and his proof are ill-founded, and can be maintained only by begging the question. Or so the objection goes.

Some have attempted to defend Moore, or at least Moorean style rejoinders to skepticism, by taking seriously Moore’s claim that he was not trying to disprove skepticism, and his admission that this would be a very hard thing to do. If we put aside the issue of proof, we can interpret Moore’s new approach as first, making a clean division between the ontology of cognition and what has come to be recognized as the other main aspect of epistemology—criteriology; and, second, attempting to deal with skepticism solely in terms of the latter. Whereas the ontology of cognition deals with the problem of how we know, criteriology deals with the problem of what we know, in the sense of what we are justified in believing. On this view, then, the issue is not whether commonsense realism is certainly true and skepticism certainly false; rather, the issue is what we ought to believe or regard as true given that we can neither prove nor disprove either position. On this interpretation, central to the Moorean approach is what has come to be called “the G. E. Moore shift” (a term coined by William Rowe). Consider a standard sort of skeptical argument:

  1. If I cannot tell the difference between waking and dreaming, then I cannot be sure that I have a body.
  2. I cannot tell the difference between waking and dreaming.
  3. Therefore, I cannot be sure that I have a body.

Employing the G. E. Moore shift, we rearrange the propositions of the skeptic’s argument, thus:

  1. If I cannot tell the difference between waking and dreaming, then I cannot be sure that I have a body.
  2. I am sure that I have a body.
  3. Therefore, I can tell the difference between waking and dreaming.

The strategy can be generalized as follows, where CS is any proposition of common sense (such as “I am sure that I have a body”), and S is any skeptical proposition (such as “I cannot tell the difference between waking and dreaming”):

The Skeptic’s Argument

  1. If S then not-CS
  2. S
  3. not-CS

Moore’s Response (using “the shift”)

  1. If S then not-CS
  2. CS
  3. not-S
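The two schemas share the same conditional and differ only in which second premise is affirmed: the skeptic reasons by modus ponens, Moore by modus tollens. As an illustrative aside, here is a minimal Python sketch that checks both forms by brute-force truth table. The function names are hypothetical, and reading "if ... then" as the material conditional is an assumption of the sketch, not something drawn from Moore or Rowe.

```python
from itertools import product

def implies(p, q):
    """Material conditional (an assumed reading of 'if ... then')."""
    return (not p) or q

def is_valid(premises, conclusion):
    """Valid means: no assignment makes every premise true and the conclusion false."""
    for s, cs in product([True, False], repeat=2):
        if all(prem(s, cs) for prem in premises) and not conclusion(s, cs):
            return False
    return True

# Premise (1), shared by both arguments: if S then not-CS.
conditional = lambda s, cs: implies(s, not cs)

# The skeptic affirms S and concludes not-CS (modus ponens).
skeptic_valid = is_valid([conditional, lambda s, cs: s], lambda s, cs: not cs)

# Moore affirms CS and concludes not-S (modus tollens).
moore_valid = is_valid([conditional, lambda s, cs: cs], lambda s, cs: not s)

# Can premise (1), S, and CS all be true together?
jointly_satisfiable = any(
    conditional(s, cs) and s and cs
    for s, cs in product([True, False], repeat=2)
)

print(skeptic_valid, moore_valid, jointly_satisfiable)  # True True False
```

The final value is False because, once the conditional is granted, S and CS cannot both be true.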

Both arguments are valid, but only one can be sound. Since both accept the conditional (1), the question of soundness comes down to the question of whether S or CS is true. And here Moore and the skeptic would be at an impasse, except that (according to Moore) we have more reason to believe any proposition of common sense than any skeptical proposition. That is because every skeptical proposition worth its salt is going to rest on some speculative account of the ontology of cognition that puts a mental surrogate (such as a proposition or a sense-datum) in place of what we would normally say was the object of our experience. But, given the highly uncertain nature of theories in the ontology of cognition, we are wise to treat them and claims based on them (as all legitimate skeptical claims are) with suspicion, and to refuse to let them bear too much weight in our decisions about what to believe. Thus, we should always end up on the side of common sense.

In fact, this seems to be Moore’s procedure in a late paper called “Four Forms of Scepticism.” Taking as his S the claim made by Bertrand Russell that “I do not know for certain that this is a pencil,” Moore claims that it rests upon several assumptions, one of which is the denial of direct realism. And even though he admits to agreeing with Russell that direct realism is likely false, Moore nonetheless advocates rejecting S:

of no one of these [presuppositions of S] …do I feel as certain as that I do know for certain that this is a pencil. Nay, more: I do not think it is rational to be as certain of any one of these…propositions, as of the proposition that I do know that this is a pencil. (Moore 1959, 226)

It is clear that Moore is using the “shift” strategy. What is not clear is just what the source of justification for CS is supposed to be. In this case, at least, the shift seems to involve an appeal to a criterion of justification—and of rationality—that is not affected by the fact that we lack an adequate account of cognition. But Moore never tells us exactly what this criterion is. Since Moore, it has been the norm to attempt to do criteriology apart from the ontology of cognition, and the question about the criterion (or criteria) for justification remains a central matter of debate.

3. Ethics

Moore’s ethical views are presented in two books and two papers: Principia Ethica, Ethics, “The Conception of Intrinsic Value,” and “Is Goodness a Quality?” (respectively: Moore 1903a, 1912, 1922b, and 1932). Despite being vastly outnumbered by his writings on epistemology and metaphysics, his work in ethics was just as influential. The discrepancy in volume is due mainly to the fact that the details of Moore’s ethical views were far more stable, undergoing far less revision and development, than those of his metaphysical and epistemological views.

a. Goodness and Intrinsic Value

Moore’s most important ethical work is Principia Ethica. It had a profound impact in both philosophy and culture almost immediately upon its publication. In it, Moore lays out a version of ethical realism consistent with his early propositional realism and its attendant doctrines. In accordance with his “identity theory” of truth, ethical propositions, just like non-ethical propositions, are objectively true or false in themselves. Combined with his view that ordinary objects are identical to true existential propositions, this implies that ordinary objects which possess value do so intrinsically: they are true existential propositions that involve the concept “good.” Thus, an object’s status as good or bad (or, in the aesthetic realm, beautiful or ugly) depends on nothing outside of itself—neither its causes and effects nor its relationship to human beings, their preferences, or their judgments. It depends solely on the involvement of “good” as a concept, or, in the idiom of existence, a property.

Ethical propositions, then, differ from non-ethical ones only in virtue of the kinds of concepts they involve. Specifically, ethical propositions involve a range of unique concepts that we call “ethical” or “moral,” such as “good,” “right,” “duty,” etc. The most fundamental of these is “good”; the others count as moral concepts/properties only because they bear logical relationships (in the broad sense of “relations of meaning”) to “good.” This point will be discussed further below. For now, we will focus on Moore’s views concerning the nature of “good” itself.

The central thesis of Principia Ethica is that “good” is a simple, non-natural concept (or property). As we shall see (in Section 3b), it is not completely clear what Moore means by “non-natural.” What he means by “simple,” however, is clear enough; so we shall start with that. For something to be ontologically simple (which is the sense in question here) is for it to possess no parts, to admit of no divisions or distinctions in its own constitution. A simple is not made up out of anything, and thus cannot be broken down into anything. Simples are therefore unanalyzable. In the case of “good,” it is a concept not made up of other concepts. Consequently it cannot be analyzed—broken down into constituents—in the way that “bachelor” can (see Section 2b). Moore illustrates the situation by comparing “good” to color concepts like “yellow.” Color concepts cannot be known by analytic description, but only by acquaintance, that is, direct cognition. Attempts at description or definition (that is, analysis) such as “yellow is a color brighter than blue,” fail to capture the essence of yellow. Likewise, purported analyses of “good,” in terms of concepts like “pleasure” or “desire” or “evolutionary progress,” fail to capture what is meant by “good.”

b. The Open Question Argument and the Naturalistic Fallacy

Moore demonstrates the unanalyzability of “good” by what has come to be known as “the open question argument”: for any definition of “good”—“good(ness) is X”—it makes sense to ask whether goodness really is X, and whether X really is good. For instance, if we say “goodness is pleasure,” it makes sense to ask, “is goodness really pleasure?” and “is pleasure truly good?” Moore’s point is that every attempt at definition leaves it an open question as to what good really is. But this could be the case only if the definition failed to capture all of what is meant by “good.” Consider the case discussed above: “a bachelor is an unmarried man.” Here it makes no sense to respond “yes, but is a bachelor really an unmarried man?” or “but is every unmarried man really a bachelor?” The reason it doesn’t is that the full meaning of “bachelor” is captured by “unmarried man.” On the other hand, the reason it makes sense to ask these kinds of questions about purported definitions of “good” is that they fail to capture its full meaning. Since this is true of every purported definition of “good,” “good” cannot be defined; it can only be recognized in particular cases through acts of intuitive apprehension.

On this account, any ethical theory that attempts to define the good—and nearly all of them do—errs. Moore famously dubbed this particular error “the naturalistic fallacy.” In general, the fallacy “consists in identifying the simple notion which we mean by ‘good’ with some other notion” (Moore 1903a, 58); or, negatively, the “failure to distinguish clearly that unique and indefinable quality which we mean by good” (Moore 1903a, 59). To this extent, it is clear what Moore means by “the naturalistic fallacy.” However, his choice of “naturalistic” to describe this error is quite puzzling, as is his description of “good” as a non-natural property. In the modern era, “nature” has frequently been used as a synonym for the material world, the world studied by the natural sciences. Accordingly, “naturalistic” has usually been reserved for philosophical views amenable to the natural sciences, views like scientism, empiricism, materialism, and so on. In the Principia, Moore’s direct statements about the meanings of “natural,” “naturalistic,” etc., are in keeping with this norm. At one point, he describes “nature” (and hence the natural) as “that which is the subject-matter of the natural sciences and also of psychology” (Moore 1903a, Ch. 2 § 26). He also offers two alternative characterizations of the natural. The first is in terms of temporality, the second in terms of the capacity for independent existence in time (this latter applies specifically to properties). Even here he does not depart from the norm, for the objects of scientific inquiry are usually taken to be temporal individuals such as events or material individuals at varying levels of granularity (atoms, molecules, cells, “ordinary middle-sized objects,” planets, etc.).

On the one hand, then, Moore’s use of “natural” seems to be unremarkable. What is peculiar, on the other hand, is his use of “naturalistic” to describe the fallacy of equating “good” with any other concept. Moore’s “naturalistic fallacy” is not a matter of mistaking the temporal for the atemporal. Neither is it a matter of mistaking the empirical and the scientific for the non-empirical and non-scientific. This description might apply to hedonistic views that equate good with pleasure, since pleasure can be treated as an object of empirical study either for psychology or physiology. However, Moore means to charge even metaphysical theories of ethics—such as those of Aristotle, Aquinas and Kant—with committing the naturalistic fallacy (cf. Moore 1903a, Ch. 4), and none of these equates goodness with something empirical or scientific in the modern sense. In fact, the naturalistic fallacy is really just a matter of mistaking the non-synonymous for the synonymous (thus William Frankena suggested in an important 1939 paper that it should be called “the definist fallacy”), and this has nothing to do with the distinction between the natural and the non-natural per se, as that distinction is normally understood.

All this points to the fact that either Moore has a much broader understanding of “natural” than he admits to in the Principia, or “naturalistic fallacy” is not an apt name for the phenomenon at issue. In the Principia, Moore seems prepared to accept the latter possibility when he claims “I do not care about the name: what I do care about is the fallacy. It does not matter what we call it, provided we recognise it when we meet with it” (Moore 1903a, Ch. 1, § 12). However the natural/non-natural terminology must have meant more to him than he let on, for he retained it throughout his career, even parting ways with ordinary usage to do so. This occurs in a 1922 paper on “The Conception of Intrinsic Value.” Here, Moore holds that value concepts alone are to be counted as non-natural, so that “non-natural” is practically equivalent to “moral” and “natural” to “non-moral.” Thus, in the end, it seems that Moore did have a much broader understanding of “natural”—and a correspondingly narrower conception of “non-natural”—than is articulated in the Principia.

c. Ideal Utilitarianism

Although it is the focus of his later book Ethics, only a single chapter of the Principia is given to what Moore called “practical ethics.” This is the area of ethics that has to do with behavior, and hence deals in concepts like “right,” “permissible,” “obligatory,” and the like. In both places, Moore promotes a view that has come to be called “ideal utilitarianism.”

Moore’s account of intrinsic value is limited to objects; it does not include actions. Actions, for Moore, possess value only instrumentally, insofar as they are productive of good consequences. Thus “right,” “duty,” and “virtue” are different ways of labeling actions (or dispositions to act) that are useful as means to good ends. They differ in meaning only insofar as the secondary details of the causal situation differ: “duty” marks an action as productive of more good than any possible alternative, “right” or “permissible” marks an action as productive of no less good than any possible alternative (Moore 1903a, Ch. 5, § 89), while virtues are dispositions to perform particularly unattractive duties:

as duties from expedient actions, so virtues are distinguished from other useful dispositions, not by any superior utility, but by the fact that they are dispositions, which it is particularly useful to praise and to sanction, because there are strong and common temptations to neglect the actions to which they lead. (Moore 1903a, Ch. 5, § 103)

Moore’s view is that there is no important difference in meaning between concepts like “duty,” “right,” and “virtue” on the one hand, and “expedient” or “useful” on the other. In this he agrees with the classic utilitarians Jeremy Bentham and John Stuart Mill. However, whereas classic utilitarianism is hedonistic (that is, it defines good in terms of pleasure), Moore defends the sui generis status of “good” (see Section 3a). Moore’s utilitarianism is not, therefore, hedonistic. Instead, it is said to be ideal. To understand what this means, we must note two features of Moore’s view.

First, Moore’s utilitarianism is pluralistic. Since, on Moore’s account, “good” is a property/concept whose meaning is completely independent of any others, it can be instanced in any number of wholes—objects or states of affairs—of a variety of types. This means that many different kinds of objects can have intrinsic value—not just states of pleasure, as the classic utilitarians have it.

Second, “good” for Moore is a degreed property—one object or state of affairs can have more or less value than another. This is implicit in the way Moore distinguished between “duty” and “right.” “Duty” concerns producing the most good possible, while “right” concerns producing no less good than other options. Both definitions assume that possible outcomes (states of affairs) can be ranked in respect of their degrees of value. This is made explicit in Chapter 6 of the Principia, where Moore articulates his conception of an ideal state of affairs. In general, Moore says, an ideal state is one that is “good in itself in a high degree” (Moore 1903a, Ch. 6, § 110). Ideal utilitarianism, therefore, will be a brand of utilitarianism in which actions are to be ordered not to the greatest happiness or pleasure, but to those states of affairs possessing the highest degree of good.

Indeed, as Moore has set things up, duty will always be directed toward some ideal state (toward the state with the highest degree of good). Thus, to know which states are ideal, and, more specifically, which are most valuable and hence the most ideal, is crucial for practical ethics. According to Moore, the most valuable states we know of are the pleasures of personal relationships and aesthetic enjoyment. Thus, he concludes, “the ultimate and fundamental truth of Moral Philosophy” is that

it is only for the sake of these things [that is, the two ideal states of aesthetic and interpersonal enjoyment]—in order that as much of them as possible may at some time exist—that any one can be justified in performing any public or private duty; that they are the raison d’être of virtue; that it is they—these complex wholes themselves, and not any constituent or characteristic of them—that form the rational ultimate end of human action and the sole criterion of social progress. (Moore 1903a, Ch. 6, § 113)

d. The Influence of Moore’s Ethical Theory

Moore’s ethical theory had a tremendous influence both within and beyond the academy. Within the academy, non-cognitive theories of ethics dominated until nearly 1960. This was the logical consequence of adapting Moore’s ethical theory to a naturalistic worldview. Both his own and subsequent generations of philosophers took to heart Moore’s treatment of moral value as non-natural and his corresponding refusal to allow any characterization of good in natural terms. In doing so, however, they either failed to recognize or simply ignored the fact that Moore’s use of “natural” etc. was somewhat idiosyncratic. Taking these terms in their standard sense, Moore’s claims about “good” indicated that it was not merely indefinable, but unknowable by any scientific or “natural” means. Together with a scientistic outlook that restricted either the knowable or the existent to the scientifically verifiable, this yielded the view that “good” was unknowable.

It was essentially this view—albeit given a linguistic twist—that provided the theme upon which the most prominent ethical theories of the early- to mid-1900s counted as so many variations. This began with the logical positivist treatment of ethics. According to the logical positivists’ “verifiability principle of meaning,” the meaning of a proposition is its manner of empirical verification. If a proposition cannot be verified empirically, it is thereby revealed as meaningless. Given the Moorean characterization of “good” as non-natural and the usual sense of “non-natural” as connoting, among other things, “non-empirical,” the verification principle made ethical propositions meaningless. Still, ethical discourse obviously plays an important role in human life. According to the logical positivists, this was to be explained by treating ethical propositions not as statements of fact, but as expressions of emotion. For example, “honesty is good” is to be taken as equivalent to “hooray for honesty!” This view, commonly called “emotivism,” was popularized by A. J. Ayer in his book Language, Truth and Logic (Ayer 1936), and later modified by C. L. Stevenson (1944, 1963).

To an extent, emotivism had been anticipated in Moore’s treatment of practical ethics, in his view that

the true distinction between duties and expedient actions is not that the former are actions which it is in any sense more useful or obligatory or better to perform, but that they are actions which it is more useful to praise and to enforce by sanctions, since they are actions which there is a temptation to omit. (Moore 1903a, Ch. 5, § 101)

In other words, the language of practical ethics adds to non-ethical language only the connotation of approval or disapproval and their consequent “hortatory force” (cf. Daly 1996, 45-47). In emotivism this claim was extended to all ethical discourse.

The larger part of the mid-century debate over the status of ethical claims was taken up with creative rejections of emotivism which were nonetheless in keeping with the basic Moorean disjunction between the moral and the natural(/empirical/scientific). Such alternatives came from Stuart Hampshire (1949), J. O. Urmson (1950), Stephen Toulmin (1950), and R. M. Hare (1952). British and American philosophers began to part ways with the Moorean disjunction only in the late 1950s and early 1960s, due largely to the work of Elizabeth Anscombe (Anscombe 1958) and Philippa Foot (1958, 1959, 1961).

Beyond the academy, Moore’s emphasis on the value of personal relationships and aesthetic experiences endeared him to members of the Bloomsbury group, who embraced Moore as their patron saint. Bloomsbury was a group of avant-garde writers, artists, and intellectuals that proved to be immensely influential in culture beyond the academy. The group included (among others) Clive Bell, Roger Fry, Desmond MacCarthy, John Maynard Keynes, and Leonard and Virginia Woolf. Many of the Bloomsbury men were also members of the Cambridge Apostles, and had first met each other and Moore in that context. Moore had been elected to this secret student society in 1894. As members of Bloomsbury, they embraced Moore’s idealization of friendship and aesthetic enjoyment as the highest human goods, and, through their own example and through their work, conveyed at least some of Moore’s views and values beyond the halls of academia and into the broader culture.

However, they also used Moore’s intuition-based moral epistemology as a justification for flouting the mores of their culture, especially in the area of sexual ethics. In fact, on account of Bloomsbury’s reputation for moral laxity, Moore’s views were often unfairly criticized as encouraging libertine behavior. This is clearly a case of guilt by association, as Moore himself never claimed that “free love” was a good. The closest he comes to the topic is in discussing social conventions about chastity as an example of rules that might, under certain circumstances, be suspended (Moore 1903a, Ch. 5, §§ 95-96). However, far from endorsing that they actually be suspended, he argues that it is obligatory to obey the conventions of one’s society, since this will usually generate a state of greater good (in the form of social harmony) than violating them.

The situation with Bloomsbury illustrates the greatest weakness of Moore’s ethical system. It is not a theoretical weakness, but a practical one. From a theoretical perspective, intuitionism is invulnerable, and it is invulnerable because intuition is unverifiable—if someone claims to have an intuition that such and such is the case, there’s nothing anyone can do to prove or disprove it. However, because it is unverifiable, intuition can be used to justify anything. This is the practical problem with intuitionist ethics. Of course, the problem is not unique to Moore’s version of intuitionism, but attaches to intuitionism in specie.

4. Philosophical Methodology

Moore is usually regarded as an important methodological innovator. In fact, his method of philosophical analysis is supposed to have been a formative inspiration for the analytic movement in philosophy. However, it is a bit misleading to speak of “Moore’s philosophical method.” Moore was what we might call an occasional philosopher. By his own admission, he possessed no innate drive to develop a systematic philosophy; rather, he was agitated into philosophizing only by the bizarre challenges some philosophers’ claims posed to his commonsense beliefs:

I do not think that the world or the sciences would ever have suggested to me any philosophical problems. What has suggested philosophical problems to me is things which other philosophers have said about the world or the sciences. (Moore 1942a, 14)

In the Library of Living Philosophers volume on Moore, V.J. McGill criticizes Moore’s piecemeal approach to philosophy. He rightly notes that Moore attempted to develop no grand system of philosophy, but worked instead in a few specific areas, for example, ethics, perception, and philosophical method. McGill blames Moore’s approach to philosophy on his commitment to a method which was simply not suited to deal with other sorts of philosophical issues. In his reply to McGill, however, Moore rejects this idea:

it is, of course true that there are ever so many interesting philosophical problems on which I have never said a word … Mr. McGill suggests that the reason why I have not dealt with some of these other questions may have been that I was wedded to certain particular methods, and that these methods were not suitable for dealing with them. But I think I can assure him that this was not the case. I started discussing certain kinds of questions, because they happened to be what interested me most; and I only adopted certain particular methods (so far as I had adopted them) because they seemed to me suitable for those kinds of questions. I had no preference for any method…. (Moore 1942b, 676)

In a sense, then, Moore did not have a method. But, of course, he did have a way of going about his philosophizing, and one might call this “Moore’s method.” In this case, the “method” would consist, first, in tackling isolated philosophical problems rather than trying to build a philosophical system. Second, in tackling one of these isolated problems, it would involve the attempt to get very clear on what was meant by the propositions and concepts essential to stating the problem—in other words, the propositions and concepts would have to be analyzed. Likewise with the propositions and concepts involved in the answer (or possible answers).

In point of historical fact, Moore’s use of analysis to solve isolated philosophical problems—and so his “method”—proved to have a greater impact on philosophy than any of his developed theories in metaphysics, epistemology, or ethics. Though his early views about truth and propositions provided a necessary metaphysical and epistemological departure from British Idealism, these merely facilitated the rise of analytic philosophy. The substance of the movement came from Moore’s use of analysis as a method. Indeed, though use of the word “analysis” in philosophy antedates Moore, it was Moore who first used it in the sense that ultimately gave the movement its name.

Unfortunately, much of Moore’s influence in this regard was based on a mistake. It was mentioned above that the empirical equivalence of definition and analysis was a source of confusion for Moore’s contemporaries. Despite Moore’s best efforts to explain otherwise, many took him to have invented and endorsed linguistic analysis. Norman Malcolm represents this common misconception when he says, “The essence of Moore’s technique of refuting philosophical statements consists in pointing out that these statements go against ordinary language” (Malcolm 1942, 349). Malcolm goes on to tie Moore’s entire philosophical legacy to his “linguistic method:”

Moore’s great historical role consists in the fact that he has been perhaps the first philosopher to sense that any philosophical statement that violates ordinary language is false, and consistently to defend ordinary language against its philosophical violators. (Malcolm 1942, 368)

But Moore explicitly rejected the idea that his analyses had been in any important sense “linguistic.” “In my usage,” he insisted, “the analysanda must be a concept, or idea, or proposition, and not a verbal expression” (Moore 1942b, 663 f.):

I never intended to use the word [“analysis”] in such a way that the analysandum would be a verbal expression. When I have talked of analyzing anything, what I have talked of analyzing has always been an idea or concept or proposition, and not a verbal expression; that is to say, if I talked of analyzing a “proposition,” I was always using “proposition” in such a sense that no verbal expression (no sentence, for instance), can be a “proposition,” in that sense. (Moore 1942b, 661)

Our survey of Moore’s metaphysics in Section 2b makes it clear enough that a Moorean proposition is anything but a linguistic entity. How, then, did this misunderstanding arise? Even a brief survey of Moore’s work will reveal that he often used terms such as “meaning,” “definition,” and “predicate” to describe what he was dealing with or looking for in his philosophical activities, and it is easy to see how these suggest that he was engaged in some linguistic enterprise. In a particularly glaring example from Principia Ethica, Moore identifies the object of his study in clearly grammatical terms: “My discussion hitherto has fallen under two main heads. Under the first, I tried to shew what ‘good’—the adjective ‘good’—means” (Moore 1903a, Ch. 5, § 86). In this case, it seems that Moore himself conflated a linguistic entity—the adjective “good”—with a conceptual one.

With characteristic humility, Moore was quick to count himself as partially responsible for the linguistic interpretation of his method. “I have often,” he admitted, “in giving analyses, used this word ‘means’ and thus given a false impression; …” (Moore 1942b, 664 f.). Though the linguistic interpretation of Moore persisted until well after his death, recent scholarship has continued to hammer the point home that this is a mistake, and the message seems to have finally been heard.

Even apart from the linguistic error, however, the general contours of Moore’s genuine “method” seem to have had a lasting impact of their own. In his recent work on the history of analytic philosophy, Scott Soames counts as two of the movement’s three characteristic features “an implicit commitment…to the ideals of clarity, rigor, and argumentation” (Soames 2003, xiii) and “a widespread presumption…that it is often possible to make philosophical progress by intensively investigating a small, circumscribed range of philosophical issues while holding broader, systematic issues in abeyance” (Soames 2003, xv), and among its two most important achievements he includes “the recognition that philosophical speculation must be grounded in pre-philosophical thought” (Soames 2003, xi). Each of these can be traced directly back to Moore and his “method.”

5. Moore’s Influence and Character

It cannot be doubted that Moore was one of the most influential philosophers of the early twentieth century. It is peculiar, though, that his influence seems to have had little to do with his actual views. Though his early views about truth and propositions influenced Bertrand Russell for a time, they have long since ceased to play a role in mainstream philosophical discussions. The same can be said of his views in ethics and, except in the very general respects mentioned by Soames, philosophical methodology. Moreover, even when the influence of Moore’s ethical and methodological views was at its highest, there remains the fact that much of the detailed content of his views was ignored by those who claimed to be influenced by them. For both the “ordinary language” branch of analytic philosophy and the Bloomsbury group, Moore’s views were influential mainly in the sense that they provided forms into which they could pour their own content. And yet Moore himself was revered by all.

This puzzle about Moore’s influence has been addressed by Paul Levy (Levy 1979), who argues that Moore’s influence was due more to his character than to his views. And, in fact, the uniqueness of Moore’s character is frequently mentioned by those who knew him and have written about him. G. J. Warnock, for instance, would seem to agree with Levy when he says:

…special notice should be paid to the character of Moore…it was not solely by reason of his intellectual gifts that Moore differed so greatly from his immediate predecessors, or influenced so powerfully his own contemporaries. He was not, and never had the least idea that he was, a much cleverer man than McTaggart … or Bradley. It was in point of character that he was different, and importantly so. (Warnock 1958, 12)

Foremost among his virtues were his unwavering honesty and his devotion to clarity and truth. Moore was never afraid to appear silly or naïve in his search for truth, and so he always said exactly what he thought in the best way he knew how. He was never afraid to admit an error. He gave no appearance of trying to promote either himself or his own agenda or system. This was remarkably refreshing in a context dominated by a philosophical system that had achieved the status of orthodoxy. He held both himself and others to exacting intellectual standards while at the same time exhibiting a spirit of great generosity and kindness in his personal relationships. Gilbert Ryle, the most prominent Oxford philosopher in the generation after Moore, describes Moore’s significance this way:

He gave us courage not by making concessions, but by making no concessions to our youth or our shyness. He treated us as corrigible and therefore as responsible thinkers. He would explode at our mistakes and muddles with just that genial ferocity with which he would explode at the mistakes and muddles of philosophical high-ups, and with just the genial ferocity with which he would explode at mistakes and muddles of his own. (Ryle 1971, 270)

Similar reports come from Moore’s associates outside of academic philosophy. For instance, Leonard Woolf (a member of Bloomsbury and the Apostles) recalls:

There was in him an element which can, I think, be accurately called greatness, a combination of mind and character and behaviour, of thought and feeling, which made him qualitatively different from anyone else I have ever known. I recognize it in only one or two of the many famous dead men whom Ecclesiasticus and others enjoin us to praise for one reason or another. (Woolf 1960, 131)

There is no doubt that Moore’s character captured a certain philosophical ideal established by Socrates long ago. Whatever we make of Moore’s views, we can be grateful for his character and whatever influence it had and continues to have.

6. References and Further Readings

The most complete bibliography of Moore’s writings is found in the 1971 edition of The Philosophy of G. E. Moore (listed as “Schilpp, ed. 1942” in section b, below).

a. Primary Sources

  • Moore, G. E. 1899: “The Nature of Judgment,” Mind 8, 176-93. Reprinted in Moore 1993, 1-19.
  • Moore, G. E. 1901-2: “Truth” in J. Baldwin (ed.) Dictionary of Philosophy and Psychology, London: Macmillan. Reprinted in Moore 1993, 20-2.
  • Moore, G. E. 1903a: Principia Ethica, Cambridge: Cambridge University Press.
  • Moore, G. E. 1903b: “The Refutation of Idealism,” Mind 12, 433-53. Reprinted in Moore 1993, 23-44.
  • Moore, G. E. 1912: Ethics, London: Williams & Norgate.
  • Moore, G. E. 1922a: Philosophical Studies, London: K. Paul, Trench, Trubner & Co.
  • Moore, G. E. 1922b: “The Conception of Intrinsic Value” in Moore 1922a.
  • Moore, G. E. 1925: “A Defense of Common Sense” in J. H. Muirhead ed., Contemporary British Philosophy, London: Allen and Unwin, 193-223. Reprinted in Moore 1959, 126-148, and Moore 1993, 106-33.
  • Moore, G. E. 1939: “Proof of an External World,” Proceedings of the British Academy 25, 273-300. Reprinted in Moore 1993, 147-70.
  • Moore, G. E. 1942a: “An Autobiography,” in Schilpp ed., 1942, 3-39.
  • Moore, G. E. 1942b: “A Reply to My Critics,” in Schilpp ed., 1942, 535-677.
  • Moore, G. E. 1953: Some Main Problems of Philosophy, New York: Macmillan.
  • Moore, G. E. 1959: Philosophical Papers, London: George Allen and Unwin.
  • Moore, G. E. 1993: G. E. Moore: Selected Writings, ed. Thomas Baldwin, London: Routledge.

b. Secondary Sources

  • Ambrose and Lazerowitz (eds.). 1970: G. E. Moore: Essays in Retrospect, London: Allen and Unwin.
  • Anscombe, Elizabeth. 1958: “Modern Moral Philosophy,” Philosophy: The Journal of the Royal Institute of Philosophy, vol. 33, no. 124, 1-19.
  • Ayer, A. J. 1936: Language, Truth, and Logic, London: Gollancz.
  • Ayer, A. J. 1971: Russell and Moore: The Analytical Heritage, Cambridge: Harvard University Press.
  • Baldwin, T. 1990: G. E. Moore, London: Routledge.
  • Baldwin, T. 1991: “The Identity Theory of Truth,” Mind, New Series, Vol. 100, No. 1, 35-52.
  • Bell, David. 1999: “The Revolution of Moore and Russell: A Very British Coup?” in Anthony O’Hear (ed.), German Philosophy Since Kant, Cambridge and New York: Cambridge University Press.
  • Daly, Cahal B. 1996: Moral Philosophy in Britain: From Bradley to Wittgenstein, Dublin: Four Courts Press.
  • Foot, Philippa. 1958: “Moral Arguments,” Mind, Vol. 67, 502-513.
  • Foot, Philippa. 1959: “Moral Beliefs,” Proceedings of the Aristotelian Society, Vol. 59, 83-104.
  • Foot, Philippa. “Goodness and Choice,” Proceedings of the Aristotelian Society, Supplemental Vol. 35, 45-61.
  • Frankena, William. 1939: “The Naturalistic Fallacy,” Mind, Vol. 48, 464-477.
  • Hampshire, Stuart. 1949: “Fallacies in Moral Philosophy,” Mind, Vol. 58, 466-482.
  • Hare, R. M. 1952: The Language of Morals, Oxford: Clarendon Press.
  • Hutchinson, Brian. 2001: G. E. Moore’s Ethical Theory, Cambridge: Cambridge University Press.
  • Keynes, J. M. 1949: “My Early Beliefs” in Two Memoirs, London: Hart-Davis.
  • Levy, P. 1979: Moore: G. E. Moore and the Cambridge Apostles, Oxford and New York: Oxford University Press.
  • Lewy, Casimir. 1964: “G. E. Moore on the Naturalistic Fallacy,” Proceedings of the British Academy, vol. 50, 251-262.
  • Malcolm, N. 1942: “Moore and Ordinary Language,” in Schilpp (ed.) 1942, 343-368.
  • Olthuis, James H. 1968: Facts, Values and Ethics: a Confrontation with Twentieth-Century British Moral Philosophy, in Particular G. E. Moore, New York: Humanities Press.
  • Schilpp, P. A., ed. 1942: The Philosophy of G. E. Moore, Evanston: Northwestern University Press.
  • Soames, Scott. 2003: Philosophical Analysis in the Twentieth Century, vol. 1, Princeton and Oxford: Princeton University Press.
  • Stroll, A. 1994: Moore and Wittgenstein, Oxford and New York: Oxford University Press.
  • Stroll, A. 2000: Twentieth-Century Analytic Philosophy, New York: Columbia University Press.
  • Sylvester, R. P. 1990: The Moral Philosophy of G. E. Moore, Philadelphia: Temple University Press.
  • Regan, T. 1986: Bloomsbury’s Prophet, Philadelphia: Temple University Press.
  • Russell, B. 1906: “On the Nature of Truth,” Proceedings of the Aristotelian Society.
  • Russell, B. 1910: Philosophical Essays, London, New York, and Bombay: Longmans Green.
  • Ryle, G. 1971: “G. E. Moore,” in Collected Papers, vol. I, London: Hutchinson.
  • Stevenson, C. L. 1944: Ethics and Language, New Haven: Yale University Press.
  • Stevenson, C. L. 1963: Facts and Values, New Haven: Yale University Press.
  • Toulmin, Stephen. 1950: The Place of Reason in Ethics, Cambridge: Cambridge University Press.
  • Urmson, J.O. 1950: “On Grading,” Mind, Vol. 59, 145-169.
  • Warnock, G.J. 1958: English Philosophy Since 1900, London: Oxford University Press.
  • Willard, D. 1984: Logic and the Objectivity of Knowledge: A Study in Husserl’s Early Philosophy, Athens, Ohio: Ohio University Press.
  • Wittgenstein, L. 1969: On Certainty, Oxford: Blackwell.
  • Woolf, L. 1960: Sowing: An Autobiography of the Years 1880-1904, London: Hogarth Press.

Author Information

Aaron Preston
Email: Aaron.Preston@valpo.edu
Malone College
U. S. A.

Non-Cognitivism in Ethics

A non-cognitivist theory of ethics implies that ethical sentences are neither true nor false, that is, they lack truth-values. What this means will be investigated by giving a brief logical-linguistic analysis explaining the different illocutionary senses of normative sentences. The analysis will make sense of how normative sentences play their proper role even though they lack truth values, a fact which is hidden by the ambiguous use of those sentences in our language. The main body of the article explores various non-cognitivist logics of norms from the early attempts by Hare and Stevenson to the more recent ones by A. Gibbard and S. Blackburn. Jorgensen’s Dilemma and the Frege-Geach Problem are two important aspects of this logic of norms. Jorgensen’s Dilemma is the problem in the philosophy of law of inferring normative sentences from normative sentences, which is an apparent problem because inferences are typically understood as involving sentences with truth values. The Frege-Geach Problem is a problem in moral philosophy involving inferences in embedded contexts or in illocutionary mixed sentences. The article ends with a taxonomy of non-cognitivist theories. See also Ethical Expressivism.

Table of Contents

  1. Metaethical assumptions
    1. Different illocutionary acts
    2. Difference between language and metalanguage
    3. Ambiguity of normative sentences
    4. Definitions of ethical non-cognitivism
  2. The problem of a logic of norms
    1. Jorgensen’s dilemma: its importance for non-cognitivism
  3. From earlier non-cognitivism to the “new norm-expressivism”
    1. C. L. Stevenson and the role of persuasion
    2. R. M. Hare and the dictive indifference of logic
    3. The new “norm-expressivism”
  4. The Frege-Geach Problem
    1. Blackburn solutions to the Frege-Geach Problem
    2. Gibbard solution to the Frege-Geach Problem
  5. The significance of the Geach-Frege Problem and Jorgensen’s Dilemma for non-cognitivism
  6. A Taxonomy of Ethics
  7. References and Further Reading

1. Metaethical assumptions

In this section, we will introduce some preliminary linguistic notions that will allow us to give a better account of the cognitivism vs. non-cognitivism divide.

Canonically, forms of language are mainly divided into two species: cognitive sentences (cognitive use of language) and non-cognitive sentences (instrumental use of language). Cognitive sentences are fact-dependent or bear truth-values, while non-cognitive sentences are, on the contrary, fact-independent and do not bear truth-values.

Cognitive sentences typically describe states of affairs, such as “The earth is square” or “Schwarzenegger won the last California election;” such sentences are verifiable and can be either true or false. On the other hand, sentences such as “You shall not steal,” “You ought to pay your taxes,” and “Don’t shut the door, please,” neither describe states of affairs nor can be understood as carrying truth or falsehood; rather, they have a different kind of illocutionary force.

a. Different illocutionary acts

Before introducing the notion of illocutionary force, we need to say more about language and its usage. The basic meaning-carrying unit of a language is called a sentence, such as “The actual king of France is bald” or “Close that door, please!” By “sentence” we do not mean a speaker’s actual empirical performance (here and now) of a linguistic expression. We are rather referring to a class including all the possible empirical performances of that particular expression, by any possible speaker, in any language and on any occasion. Propositions, on the other hand, are the meanings of sentences: they are true or false, they can be known, believed or doubted and, finally, they remain constant under translation from one language to another (Lyons, 1995, p. 141).

The same proposition may be used in different occurrences for doing different things. In other words, the same proposition can be used for asserting, questioning, asking, demanding and so on. A sentence, therefore, can be understood as an illocutionary act. The general form of illocutionary acts, according to Searle, is:

F(p)

where “F” stands for any indicator of illocutionary force, and “p” takes expressions for propositions. In this way, we can symbolize different kinds of illocutionary acts such as assertions:

├ p      such as in “You are going to shut the door”

commands:

!p        such as in “Shut the door!”

or questions:

?p        such as in “Are you going to shut the door?”

According to Reichenbach (1947, p. 337), illocutionary acts are not true or false. They are instruments constructed with the help of propositions, and therefore they belong to language; this is what distinguishes them from other instruments devised to reach a certain aim. We can distinguish two elements, not necessarily separated, within an illocutionary act, namely the propositional indicator (p) and the indicator of illocutionary force (F). What is called propositional content (or proposition, or radical-proposition) is symbolized with “p,” and it is the invariant ingredient in an illocutionary act (in our example above it is “your going to shut the door,” or the possible state of affairs “you are going to shut the door”). It gives the “descriptive content” of a sentence; in other words, it stands for a possible state of affairs, carrying meaning and, consequently, having truth-values.

By contrast, the indicator of illocutionary force shows the way a proposition is used, that is, which illocutionary force the sentence carries. Illocutionary force therefore has no semantic meaning whatsoever, and so it does not form part, for example, of the conceptual content of a norm sentence. Importantly, illocutionary forces are not like alethic modalities (such as “it is necessary that”); they are not intensional operators, and therefore they cannot be used to create propositions from propositions. For this reason, Frege’s Rule states that signs of illocutionary force cannot (a) be iterated or (b) fall within the scope of propositional connectives.
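The F(p) schema and the two clauses of Frege’s Rule can be illustrated with a minimal Python sketch. The names used here (Force, Proposition, IllocutionaryAct, conjoin) are hypothetical and introduced only for illustration; they are not part of Searle’s or Frege’s own apparatus. The sketch assumes that a force indicator may take only a proposition as its argument, and that connectives may join only propositions.

```python
from dataclasses import dataclass
from enum import Enum


class Force(Enum):
    """Indicators of illocutionary force (the F in F(p))."""
    ASSERTION = "├"
    COMMAND = "!"
    QUESTION = "?"


@dataclass(frozen=True)
class Proposition:
    """Propositional content p: the invariant, truth-apt ingredient."""
    content: str


@dataclass(frozen=True)
class IllocutionaryAct:
    """An act F(p): a force indicator applied to a proposition."""
    force: Force
    proposition: Proposition

    def __post_init__(self):
        # Clause (a): force signs cannot be iterated, so the argument
        # of F must be a proposition, never another illocutionary act.
        if not isinstance(self.proposition, Proposition):
            raise TypeError("a force indicator applies only to a proposition")

    def __str__(self):
        return f"{self.force.value}({self.proposition.content})"


def conjoin(p, q):
    """Propositional conjunction.

    Clause (b): force signs cannot fall within the scope of a
    connective, so only bare propositions may be conjoined.
    """
    for operand in (p, q):
        if not isinstance(operand, Proposition):
            raise TypeError("connectives operate only on propositions")
    return Proposition(f"{p.content} and {q.content}")


# The same proposition used with three different forces.
door = Proposition("you shut the door")
acts = [IllocutionaryAct(force, door) for force in Force]
print([str(act) for act in acts])
```

On this modelling, the force indicator always stays outermost: a compound content is built first with conjoin, and a single force is applied to the result.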

Finally, the illocutionary dimension has a perlocutionary element attached. According to Levinson (1983, p. 237), a perlocutionary act is specific to the circumstances of issuance and is therefore not conventionally achieved just by uttering that particular utterance; it includes all those effects, intended or unintended, often indeterminate, that some particular utterance in a particular situation may cause. The main difference between a perlocutionary act and an illocutionary act lies in the fact that the latter has a conventional nature, as it can be represented in explicit form using the performative formula; this conventional nature does not apply to perlocutionary acts. In the following, we will see the importance of perlocutionary acts within the emotive theories of ethics, which represent a kind of non-cognitivist theory.

b. Difference between language and metalanguage

Another notion fundamental to understanding the difference between cognitivism and non-cognitivism is the linguistic distinction between language and meta-language. This distinction brings out another problematic feature intrinsic to the ordinary use of natural languages, namely the ambiguity of normative sentences and prescriptions. Non-cognitivist positions are often confused with relativistic positions because of the shift from the object-language into the meta-language. When we say, “Hitler was a bad leader,” we are uttering a normative sentence. When we say, “Winston said Hitler was a bad leader,” we are not uttering a normative, albeit relativistic, sentence. Rather, we are moving from the object-language (that is, the sentence “Hitler was a bad leader”) to a meta-linguistic sentence (that is, “Winston said Hitler was a bad leader”), which, taken as a whole, is typically a descriptive sentence talking about a normative sentence (that is, “Hitler was a bad leader”). There is no room for relativism here: the latter is not a moral sentence but simply a descriptive sentence (or, following Max Weber, a sociological sentence), which, according to B. Russell (1935, p. 214-215), belongs to psychology or biography. An important feature of descriptive sentences is that “The descriptive sentences of obligation and permission are relative in a sense in which the prescriptive sentences are not”; they always refer to the utterer/authority of that sentence (in our case, Winston): “conceptually, the reference to the authority is necessary to identify the normative proposition [that is “Hitler was a bad leader”] expressed by a normative sentence used in a descriptive way” (Alchourrón, 1993).

c. Ambiguity of normative sentences

Notice that normative sentences are ambiguous; at the level of common language they can be uttered both in descriptive and in normative ways. In other words, the same normative sentence can be used either to issue a prescription or to describe the fact that a particular norm exists. Jeremy Bentham (1970, p. 104; Bentham, 1789, chap. XVII, § XXIX n.1; see Alchourron and Bulygin, 1989 and Bulygin, 1982) was intuitively aware of the ambiguity of normative sentences. This semantic shift is due to a peculiar capacity of natural languages to mix up the language level with the meta-language level, to the extent that we cannot appreciate any difference between them when using ordinary language. According to Bentham, on the contrary, such a linguistic difference should be clear; in fact he pointed out that “The property and very essence of law, it may be said, is to command; the language of the law then should be the language of command. For expressing commands there is in all languages a particular mood, which is styled the imperative” (Bentham, 1970, p. 105). Bentham also argues that “There is still enough that serves, and that as effectually as in the other case, to distinguish the imperative from the ordinary didactic, narrative, informative or assertive style: the language of the will from the language of the understanding” (ibid.). This distinction is very important in the practice of law and in the field of ethics because “What is been termed a declaratory law, so far as it stands distinguished from either a coercive or a discoercive law, is not properly speaking a law. It is not the expression of an act of will exercised at the time: it is a mere notification of the existence of a law, either of the coercive or the discoercive kind, as already subsisting; of the existence of some document expressive of some act of will, exercised, not at the time, but at some former period” (Bentham, 1789, p.).

More recently, von Wright made that intuition more precise, explaining, “Tokens of the same sentences are used, sometimes to enunciate a prescription (that is, to enjoin, permit, or prohibit a certain action), sometimes again to express a proposition to the effect that there is a prescription enjoining, or permitting or prohibiting a certain action. Such propositions are called norm-propositions [or descriptive sentences of norms]” (von Wright, 1963, p. viii). Norms “should be carefully distinguished from ‘normative propositions’, i.e. descriptive propositions stating that ‘p’ is obligatory (forbidden or permitted) according to some unspecified norm or set of norms. Normative propositions – which can be regarded as propositions about sets (systems) of norms – also contain normative terms like ‘obligatory’, ‘prohibited’, etc. but these have a purely descriptive meaning” (Alchourrón and Bulygin, 1981).

The most influential analysis on the nature of normative sentences (especially in the field of philosophy of law) was carried out by Hans Kelsen (especially in Kelsen, 1941).

d. Definitions of ethical non-cognitivism

Ethical non-cognitivism claims that prescriptions have a different nature than descriptive sentences; they have no truth-values, they are not describing anything, and they have a different illocutionary role. That is to say, they do not express factual claims or beliefs and therefore are neither true nor false (they are not truth-apt); they belong to a different illocutionary force, the prescriptive mood.

These theories, as opposed to cognitivist theories, do not hold that ethical sentences are objectively and consistently true or false, nor do they presuppose new Platonic-like entities (in the way naturalistic theories do); they therefore do not need to explain how we can epistemically access such entities (see Blackburn, 1984, p. 169 and Hale, 1993). In other words, non-cognitivism claims that the principal feature of normative sentences (their lack of truth-values) is a consequence of the illocutionary role of such sentences. These sentences do not bear any cognitive meaning (as assertions or descriptions do); they are just used to utter prescriptions.

Therefore, cognitivist theories reject three traditional theses: (1) Hume’s Law (that is, the claim that a moral conclusion cannot be validly inferred from non-moral premises), as some cognitivist theories suppress the distinction between cognitive and normative sentences; (2) Ockham’s Razor, because some cognitivist theories multiply entities without necessity, as they presuppose a (platonic) realm of norms; and (3) Jorgensen’s Dilemma (see below).

Non-cognitivist theories do not infringe Ockham’s Razor, as they do not imply any platonic entity (we saw that the difference between normative sentences and descriptive sentences is just at the illocutionary level), and they accept the challenge of Hume’s Law.

We can find two main theories within non-cognitivism: emotivism and prescriptivism. These two theories, often confused, need to be carefully distinguished. They differ in two main respects. First, for emotivists a normative sentence is basically a sentence which expresses a speaker’s feeling (such as “Gasp!”), whereas for prescriptivists a normative sentence is used for uttering overriding universalizable prescriptions (such as “You shall not steal!”). Second, the two theories differ about the possibility of a genuine logic of norms. Emotivists, at least in classical formulations (from Ayer to Stevenson), claim that a logic of norms is very problematic or even impossible to build, while for prescriptivists (in particular in Hare’s theory or in von Wright’s works) the possibility of a logic of norms is open, although problematic.

2. The problem of a logic of norms

The main challenge non-cognitivist theories face concerns the possibility of a logic of norms. Cognitivist theories do not face this dilemma, as they claim there is no difference between normative and descriptive sentences; classical logic based on truth-values is therefore sufficient for normative reasoning. But what about norms that lack truth-values?

The problem of a logic of norms is a vexata quaestio that dates back, in modern times, to Language, Truth and Logic by A.J. Ayer (1936). Ayer claimed that ethical sentences contain pseudo-concepts, which express emotions or commands and have no real meaning. The only purpose of ethical sentences is to persuade the listener to act in a certain way. In other words, ethical sentences have only a perlocutionary function. Therefore it is not possible to talk about disagreement and unsoundness in ethics; neither is it possible to speak about ethical reasoning, because ethical sentences such as “parsimony is a virtue” and “parsimony is a vice” do not express propositions (that is, they are not true or false). Thus they cannot be incompatible. On the other hand, Ayer acknowledged that people do dispute about questions of value, but he held that these disputes are not actually ethical dilemmas involving values but disputes about factual questions. According to Ayer, people actually reason about empirical facts bearing on which state of affairs to bring about, not about which ethical belief to agree on.

According to M. Warnock (1978), Ayer’s is a negative theory of ethics, because it leaves ethics without meaning or scientific basis. The last word in ethics is instead ideological: it amounts to asserting the superiority of one moral system over another. Ayer’s skeptical conclusion is a consequence of the linguistic model he adopted (basically the picture theory of Wittgenstein’s Tractatus, 1922). In fact, Ayer is not able (at least in Language, Truth and Logic) to distinguish in normative sentences between an emotive (perlocutionary) part and a descriptive (meaning) part. The distinction is necessary to give ethics its full significance back.

Two years after Ayer’s Language, Truth and Logic, another author dealt with the problem of the foundation of a logic of norms. Jorgen Jorgensen (in “Imperativer og Logik”, 1937-38) claimed that “any imperative sentences may be considered as containing two factors which I may call the imperative factor and the indicative factor, the first indicating that some thing is commanded or wished and the latter describing what it is that is commanded or wished.” In an actual sentence it is not possible to separate those two factors, because a command void of content is impossible; but the indicative factor can be kept apart from the imperative mood, and it can be used to express indicative sentences describing the action, changes or state of affairs which can be ordered or wished. For example, in the imperative “Close the door!” somebody is ordering that a door be closed. The order is that the proposition “the door once open is now closed” be true. Methodologically, Jorgensen was in line with the modern distinction between illocutionary force and propositional content in sentences (see, e.g., Searle, 1969).

Jorgensen concluded, “it seems to be a syntactical rule that from an imperative sentence of the form “Do so and so,” an indicative sentence of the form “This is so and so” may be derived.” In other words, Jorgensen claimed that imperative sentences can be transformed into indicative sentences in two ways: (1) the imperative factor is put outside the brackets, much like the assertion sign in ordinary logic, and the logical operations are performed only within the brackets; or (2) for each imperative sentence there is an equivalent indicative sentence which is derived from the former. This derived indicative sentence is subject to the rules of classical logic, and thereby the rules of logic apply indirectly to the imperative sentences, so that entailments of the latter may be made explicit.

Jorgensen’s first solution acknowledges the application of logic only within the propositional content (or indicative factor), without using the normative (or imperative) constituent. This solution is very similar to R.M. Hare’s dictive indifference of logic (Hare, 1949 and 1952) in which, as we will see, logic is valid only at the level of phrastics. Jorgensen’s second solution, on the other hand, seems to propose that normative sentences and descriptive sentences are linked through an isomorphic relation; that is, prescriptions obey the same logical rules as their descriptive counterparts. G.H. von Wright (1963) later explored this solution. Therefore Jorgensen, unlike Ayer, moved to a view of ethics that can be called moderate emotivism, close to Stevenson’s (1944) and Hare’s (1949). In fact, Jorgensen acknowledges a descriptive component within prescriptive sentences, and he also thinks that it is possible to apply logic to norms.
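Jorgensen’s first route, with the imperative factor kept outside the brackets and logic operating only on the indicative factor, can be sketched in Python as follows. This is only an illustrative model under stated assumptions: the indicative factors are treated as formulas of classical propositional logic, entailment is checked by brute-force truth tables, and the names (Imperative, entails, imperative_follows) and the example atoms are hypothetical, not drawn from Jorgensen’s text.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List

Valuation = Dict[str, bool]
Formula = Callable[[Valuation], bool]  # a formula maps a valuation to a truth-value


def entails(premises: List[Formula], conclusion: Formula, atoms: List[str]) -> bool:
    """Classical entailment: no valuation makes every premise true
    and the conclusion false."""
    for values in product([True, False], repeat=len(atoms)):
        valuation = dict(zip(atoms, values))
        if all(p(valuation) for p in premises) and not conclusion(valuation):
            return False
    return True


@dataclass(frozen=True)
class Imperative:
    """An imperative !(p): only the indicative factor p is stored,
    since the imperative factor stays outside the brackets."""
    content: Formula


def imperative_follows(premises: List[Imperative], conclusion: Imperative,
                       atoms: List[str]) -> bool:
    """Assumption of this sketch: an imperative conclusion 'follows' when
    the indicative factors of the premises classically entail the
    indicative factor of the conclusion."""
    return entails([i.content for i in premises], conclusion.content, atoms)


# Hypothetical example: "Close the door and the window!" and "Close the door!"
atoms = ["door_closed", "window_closed"]
close_both = Imperative(lambda v: v["door_closed"] and v["window_closed"])
close_door = Imperative(lambda v: v["door_closed"])

print(imperative_follows([close_both], close_door, atoms))
print(imperative_follows([close_door], close_both, atoms))
```

The design choice mirrors the text: no truth-value is ever assigned to an imperative as such; truth tables are computed only for what stands inside the brackets.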

a. Jorgensen’s dilemma: its importance for non-cognitivism

More importantly, Jorgensen proposed the so-called Jorgensen’s Dilemma, which is the first attempt to analyze the problem of inferring norms (prescriptive sentences) from norms (prescriptive sentences), starting from the premise that norms (prescriptive sentences) lack truth-values. Jorgensen analyzes this problem starting from the so-called Poincaré’s argument (a variant of Hume’s Law), which examines the role of logical inference in prescriptive contexts (which lack truth-values). Jorgensen still thinks of logical inference as a concept tied to the classical idea of logic, on which an inference yields true conclusions from true premises. However, Jorgensen noticed that in ordinary normative reasoning we perform inferences that can be accepted as valid, such as:

1. Keep your promises
2. This is a promise of yours
__________________________
├ Therefore, keep this promise

Here at least one of the premises (in our case, premise 1) is prescriptive. Hence, Jorgensen faces the following “puzzle”:

“According to a generally accepted definition of logical inferences only sentences which are capable of being true or false can function as premises or conclusion in a inference; nevertheless it seems evident that a conclusion in the imperative mood may be drawn from two premises one of which or both of which are in the imperative mood” (Jorgensen, 1937-38).

There are two ways to explain this phenomenon: widening the notion of logical inference beyond the “mere” sphere of truth, or bypassing this distinction by using descriptive sentences equivalent to prescriptive sentences and applying the classical notion of logical inference to them. Otherwise it is not possible to apply the notion of logical inference to norms: any normative discourse turns out to be illogical (as Ayer claimed).

The essence of the challenge to non-cognitivism is therefore this: how is it possible to apply any notion of logical inference whatsoever to the realm of sentences lacking truth-values?

3. From earlier non-cognitivism to the “new norm-expressivism”

If we believe that norms lack truth-values but that a logic of norms is possible, we hold an objectivist and non-cognitivist theory of norms, such as Hare’s; if instead we believe that logical inference cannot be applied to sentences lacking truth-values, we hold a non-cognitivist and subjectivist theory of norms, such as Ayer’s.

a. C. L. Stevenson and the role of persuasion

C. L. Stevenson (1944) developed another non-cognitivist and subjectivist theory of norms. Stevenson acknowledges that in moral sentences there is a descriptive component, which has no cognitive function but rather a quasi-imperative force which, operating through suggestion and intensified by your tone of voice, readily permits you to begin to influence or to modify another person’s behavior. Therefore, according to Stevenson, ethical terms are instruments used in a cooperative enterprise that leads to a mutual readjustment of human interest. So, when using ethical sentences, we are not using logical inference, but, actually, we are using methods of persuasion. According to Hare (1987), Stevenson treated what were perlocutionary features of moral language as if they were constitutive of its meaning, and as a result became an irrationalist, because perlocutionary acts are not subject to logical rules.

b. R. M. Hare and the dictive indifference of logic

According to Hare, normative sentences are characterized by three ingredients: prescriptivity, universalizability and overridingness/supervenience; these three ingredients are logical characteristics of normative sentences by virtue of their meaning (Hare, 1989).

According to Hare, moral sentences are prescriptions, that is, sentences used for guiding action or for answering the question “What shall I do?” (Hare, 1952). In other words, an indicative (or descriptive) sentence is used for telling someone that something is the case; an imperative is not about that – it is used for telling someone to make something the case (ibid.). Unlike emotive theories (such as Stevenson’s), Hare denies that telling someone to make something the case is a process of persuasion exerted by the speaker on the listener. Emotive theories, according to Hare, judge the success of imperatives solely by their effects, that is, by whether the person believes or does what we are trying to get him or her to believe or do. It does not matter whether the means used to persuade him or her are fair or foul, so long as they succeed. Persuasion therefore leaves no room for rationality in moral theory: using persuasion does not mean rationally answering the question “What shall I do?”, but is rather an attempt to make the listener answer it in a particular way.

Universalizability is a feature moral sentences share with descriptions but, according to Hare, it is still a logical component of phrastics (Hare’s term for the descriptive component of a sentence). Roughly speaking, it means that terms like “ought” and “must” are similar to words like “all” rather than to words like “red” or “blue”. In other words, normative concepts are to be compared to logical operators (such as “all”, “some” or “It is necessary that”) and not to predicates (see Hare, 1963 and 1967). Moreover, the rules that define their logical behavior make them universalizable. Another interpretation of the thesis of Universalizability claims that it is not about the way moral terms function, but is a principle (axiom) which is part of any possible normative system as such (see Hare, 1982). On this reading, Universalizability is similar to the “Golden Rule” (“Treat others only in a way that you’re willing to be treated in the same situation”) or to impartiality, rather than being an actual formal axiom in an ethical system. This thesis has been attacked by several authors, such as A. MacIntyre (1957), B. Williams (1985) and M. Singer (1985). All these scholars agree that there are in fact several levels of universalizability, which Hare’s monolithic formulation conflates. In particular, MacIntyre argues that Hare does not clearly distinguish between “generality” (that is, general principles) and “universality” (universal principles).

Supervenience is a feature moral sentences share with descriptions too. This issue is discussed also in the philosophy of mind. In moral philosophy, the issue of supervenience concerns the relationship which is said to hold between moral properties and natural or non-moral properties. Alternatively, it is put forward as a claim about a certain feature of moral terms or moral predicates. When it is said of “trust” that it is, say, good, “trust” is good because or in virtue of some subjacent or underlying property of it. Generally, it is held that these subjacent properties are natural properties of “trust”.

For Hare overridingness is a feature, not just of evaluative words, properties, or judgments, but of the wider class of judgments which have to have, at least in some minimal sense, reasons or grounds of explanations (Hare, 1989). Basically, Hare believes that overridingness and universalizability are similar concepts in that both involve a universal premise such as in the Golden Rule.

From a logical-linguistic point of view, Hare distinguishes in a sentence between a phrastic and a neustic:

“I shall call the part of the sentence that is common to [assertive and imperative] moods (…) the phrastic; and the part different in the case of commands and statements (…) the neustic” (Hare, 1952).

Roughly speaking, a phrastic is the component of the sentence that we called the descriptive component above, and a neustic is the illocutionary part of a sentence. According to Hare, logical connectives are part of phrastics; the combinations those connectives are able to create are valid whether we are dealing with normative sentences or with descriptive ones. It is, indeed, the proper function of these connectives to establish relations between sentences; in other words, the validity of a piece of reasoning depends upon the logical links that hold among phrastics. Hare’s thesis is called the “dictive indifference of logic”: “we shall see (…) that these connectives are all descriptive and not dictive. In fact, it is the descriptive part of sentences with which formal logicians are almost exclusively concerned; and this means that what they say applied as much to imperatives as to indicatives; for to any descriptor (or phrastic) we can add either kind of dictor (or neustic), and get a sentence” (Hare, 1949). Therefore there will be no difference between a logic of imperatives and a logic of assertions: “The method of reasoning used in (…) [imperative] inferences is, of course, exactly which is used in indicative logic: these considerations in no way support that there can be a separate ‘Logic of Imperatives’, but only that imperatives are logical in the same way as indicatives” (Ibid.). Phrastics, indeed, are the same in imperatives and assertions, and we can assert “that any formula of formal logic which is capable of an indicative interpretation is capable also of an imperative one,” that is, we can replace an indicative neustic with an imperative one, leaving the phrastic unchanged (Ibid.).
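The phrastic/neustic split can be pictured with a small data structure. The following Python fragment is only an illustrative sketch: the class `Sentence`, the helper `either`, the mood labels and the example strings are hypothetical names chosen here (the strings are modelled on Hare's door-shutting illustration), not Hare's own formalism. It shows a connective operating on phrastics alone, and a neustic being swapped while the phrastic stays fixed.

```python
from dataclasses import dataclass

# Hypothetical mood labels mapped to mood indicators ("neustics").
NEUSTICS = {"indicative": "yes", "imperative": "please"}


@dataclass(frozen=True)
class Sentence:
    phrastic: str  # descriptive content shared by both moods
    mood: str      # key into NEUSTICS

    def render(self) -> str:
        # A sentence is a phrastic with a neustic attached to it.
        return f"{self.phrastic}, {NEUSTICS[self.mood]}"


def either(p: str, q: str) -> str:
    """A connective that combines phrastics only; it never inspects a mood."""
    return f"({p}) or ({q})"


def with_mood(sentence: Sentence, mood: str) -> Sentence:
    """Swap the neustic and leave the phrastic untouched."""
    return Sentence(sentence.phrastic, mood)


statement = Sentence("your shutting the door in the immediate future", "indicative")
command = with_mood(statement, "imperative")

print(statement.render())
print(command.render())
print(Sentence(either("your shutting the door", "your opening the window"), "imperative").render())
```

Because `either` never looks at the mood, whatever logical behavior it has is the same for commands and statements, which is the point of the “dictive indifference” thesis as described above.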

c. The new “norm-expressivism”

Starting in the 1980s, there was a renewal of the analysis of morals in an emotivist key. These analyses were developed by Simon Blackburn and Allan Gibbard. In their work the emotive theory of morals is revised and enriched, even making room for a logic of norms (in contrast to earlier emotive theories, such as Stevenson’s).

Blackburn’s quasi-realism (1984) starts from the fact that ordinary language expresses itself in a realist way even when we utter moral sentences. Blackburn claims that this practice is, so to speak, the way we project our attitudes onto the world; in Blackburn’s own words, “we say we project an attitude or habit, or other commitment which is not descriptive onto the world, when we speak and think as though there were a property of things which our sayings describe, which we can reason about, know about, be wrong about and so on” (Blackburn, ibid.).

Blackburn, on the one hand, rehabilitates emotive theories of morals and, on the other hand, says – contrary to Mackie’s error theory – that our use of realist terminology is respectable and not in conflict with its projective origin. We will see in the next section how Blackburn can make room for a logic of norms.

Gibbard’s (1990) central idea is that to call something rational is to express one’s acceptance of norms that permit it. This applies to the rationality of actions, and it applies to the rationality of beliefs and feelings (ibid.). For Gibbard, cognitivist analyses fail to recognize that judging a behavior to be rational means endorsing it; even classical non-cognitivist analyses fail on this point, as they admit that moral judgments are not feelings but judgments of what moral feelings it is rational to have. Feelings, we think, can be apt or not, and moral judgments are judgments of when guilt and resentment are apt.

The primary function of norms (which Gibbard justifies on evolutionary grounds) is to facilitate social cooperation: while true factual sentences are coupled with representations of the world, normative ones have the function of making social cooperation stable and independent of environmental and social changes. Gibbard’s theory is non-cognitivist but naturalistic, which is necessary in order to give an account of rationality in terms of accepting a norm that is, in its turn, a standard for the rationality of actions; otherwise the account would end in a vicious circle.

Norms govern everybody’s feelings and actions, and they are the main component of a moral judgment; to judge an action wrong, in Gibbard’s terms, means that the agent’s feelings of guilt and the anger of those who judge the action are apt feelings. Of course, these will change from culture to culture. Finally, Gibbard suggests that normative judgments, because of their social function, commit us to adopting higher-level norms that encourage social cooperation.

Gibbard’s key concept is “accepting a norm”, which is to be justified by a psychological theory of meaning, in a way similar to Stevenson’s theory. For Gibbard, accepting a norm is a significant kind of psychological state, one which is not fully understandable to us. Therefore, Gibbard’s theory rests on an ambiguity: on the one hand, value judgments lack truth-values, but on the other hand, they express the existence of someone’s mental states.

4. The Frege-Geach Problem

The Frege-Geach problem (also known as the “embedding problem”) is used as the main “test” of rationality in non-cognitivist theories. The problem was posed in P. Geach’s article “Assertion” (Geach, 1964), but the discussion goes back to Geach’s article “Imperatives and Deontic Logic” (Geach, 1958). In particular, Geach used his own test to attack non-cognitivist claims; in fact, if we find a positive solution to the Frege-Geach problem we are de facto vindicating non-cognitivist moral reasoning. Conversely, if no solution to the problem is provided, the only options left open are cognitivism or excluding ethics from the realm of rationality (as in radical forms of emotivism such as Ayer’s).

Briefly, the Frege-Geach problem is that sentences that express moral judgments can form part of semantically complex sentences in a way that an expressivist cannot easily explain. According to Geach, the sentence “Telling lies is wrong” has the same meaning regardless of whether it occurs on its own or as the antecedent of “If telling lies is wrong, then getting your little brother to tell lies is also wrong”. This must be so, since we may derive “Getting your little brother to tell lies is wrong” from the two together by modus ponens without any fallacy of equivocation. Yet nothing is expressed (in the relevant sense) by “Telling lies is wrong” when it forms the antecedent of the conditional, since the antecedent does not itself carry the same illocutionary force as the free-standing premise, and so its meaning (regardless of where it occurs) apparently cannot be explained by an expressivist analysis. Analogous problems arise within other kinds of embedded contexts (Unwin, 1999).

However, Geach recommends attention to Frege’s distinction between assertion and predication, or in other words, between illocutionary force and propositional content, respectively. In fact, if we take illocutionary force into account, there would be a slight change in the meaning of the word “wrong” between its occurrence in the antecedent of the conditional “If telling lies is wrong, then getting your little brother to tell lies is also wrong” and its occurrence in the consequent of the same conditional. This problem is even clearer when set out as a modus ponens:

1. If tormenting the cat is wrong, then getting your little brother to torment the cat is also wrong
2. Tormenting the cat is wrong
Therefore, getting your little brother to torment the cat is wrong.

In the case above it is difficult to say that the occurrence of “wrong” in the antecedent of the conditional in premise 1 (which appears to be descriptive) has exactly the same meaning as “wrong” in premise 2 (which appears to be normative).

We saw non-cognitivism is characterized by the assumption that norms lack truth-values. Yet, the contexts introduced by ordinary logic operators such as “and”, “not”, “or”, “if… then”, and the quantifiers, together with predication itself, are normally explicated in terms of the more basic semantic concepts of truth. Therefore, it seems that this option is not available to non-cognitivists, in general, and in particular to expressivists.
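The truth-based explication at issue can be made explicit. What follows is the standard textbook formulation for the conditional and for modus ponens, not a formula taken from the works cited here; the symbols $v$ (a valuation), $A$ and $B$ are introduced only for illustration.

```latex
\[
v(A \to B) = \mathrm{T} \iff v(A) = \mathrm{F} \ \text{or}\ v(B) = \mathrm{T}
\]
\[
\{A \to B,\ A\} \models B \iff
\text{every valuation } v \text{ with } v(A \to B) = v(A) = \mathrm{T}
\text{ also has } v(B) = \mathrm{T}
\]
```

Both clauses presuppose that $A$ and $B$ take truth-values. If “Tormenting the cat is wrong” has none, neither clause applies to it as it stands.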

a. Blackburn’s solution to the Frege-Geach Problem

S. Blackburn (1984) redefines the Frege-Geach Problem in terms of whether expressive theories can cope with unasserted contexts in such a way as to allow sentences the same meaning within them, as they have when they are asserted. According to Blackburn, we use evaluative sentences as if they were not different from assertions (because of our projective attitude), and, therefore, we intuitively treat them as if they were bearing truth-values and linked to descriptive sentences.

The problem then concerns the interpretation of the connectives used to build up more complex commitments whose parts have different illocutionary characteristics (as in a conditional). Blackburn suggests that commitments are combined to create a more complex sentence, which is accepted only if all its parts are accepted, according to the following solution: “the notion of commitment is then capacious enough to include both ordinary beliefs, and these other attitudes, habits and prescriptions” (Blackburn, ibid., p. 192). Therefore a conditional will express someone’s endorsement of an attitude (which is an expression of a moral standpoint, too) preceded by a belief. In other words, it expresses a higher-order attitude, that is, an expression of disapproval or approval toward a combination of attitudes (such as attitudes toward lying). Conditionals, as they are used in ordinary language, show the way we express an endorsement of an involvement of commitments, which is itself the expression of a moral standpoint. In other words, using conditional forms (in normative contexts) is a higher-level form (compared to simple sentences like “it is wrong to tell lies”) which serves to express one’s attitudes toward attitudes, or meta-attitudes.

Blackburn introduces these kinds of sentences formally in the following way:

(a) H! (B!p → B!q)

Where H! stands for the “Hooray” operator (the expressive counterpart of the deontic operator “O”, for obligation) and B! is the “Booh” operator (the expressive equivalent of the deontic “F”, for forbidden). What appears between slashes shows that our argument is an attitude or a belief, which expresses a first-order attitude (such as “Playing for West Ham is wrong”).

The main limit of Blackburn’s solution to the Frege-Geach problem concerns the nature of the H! and B! operators when they are iterated in a higher-order sentence. Blackburn’s formulation does not make clear the illocutionary role of the operators. If we interpret all the operators in formula (a) in an expressive (or prescriptive) way (that is, as lacking truth-values), the whole expression will not make sense. According to Barcan Marcus (1966), iteration of normative operators looks like stammering. If, on the other hand, we interpret (following Blackburn) the external operator H! in an expressive (or prescriptive) way and those within the slashes as descriptive, we will have a correct way of interpreting the operators but no solution to the Frege-Geach problem. Formula (a) above is indeed formally correct, but it does not solve the problem of the identity of meaning between, for example, the antecedent of the conditional in premise 1 of the modus ponens shown above (which is descriptive) and premise 2 (which is normative).

b. Gibbard’s solution to the Frege-Geach Problem

Gibbard tries to solve the Frege-Geach problem using a slightly modified version of possible-worlds semantics, whose points he calls “factual-normative worlds”. A factual-normative world is an ordered pair ⟨w, n⟩ where “w” is a possible world (or a set of facts) and “n” is a complete system of general norms. The pair constitutes a completely opinionated credal-normative state (Gibbard, 1990, p. 95).

According to Gibbard, any particular normative judgment holds or not, as a matter of logic, in the factual-normative world ⟨w, n⟩. That is, the pair ⟨w, n⟩ includes a sound and complete set of norms which, for each possible human behavior, states the normative status (Forbidden, Obligatory or Indifferent) associated with it. In this way each individual can understand the normative qualification of his or her action.

Consider a human observer who is uncertain both factually and normatively. When the observer thinks about the rightness of a normative judgment, she or he will rule out every combination of factual elements and normative elements in which that normative judgment does not hold. Let’s take, for instance, the modus ponens above:

1. If tormenting the cat is wrong, then getting your little brother to torment the cat is also wrong
2. Tormenting the cat is wrong
Therefore, getting your little brother to torment the cat is wrong.

The first premise rules out all the combinations in which it is wrong to torment the cat but not wrong to get your little brother to torment the cat. The second premise rules out the set of combinations of norms and facts in which it is not wrong to torment the cat. Therefore both premises together rule out the whole set of norms and facts in which it is not wrong to get your little brother to torment the cat, and so they rule out every combination that the conclusion rules out.

What does it mean for a sentence to be valid in a particular factual-normative world? According to Gibbard it means that for each sentence containing a normative predicate there is an n-corresponding descriptive version, which makes the normative predicate (such as “rational”) refer to a particular set of norms (that is, “rational” according to the system n). Hence, Gibbard concludes, for any logically complex sentence S containing normative predicates in embedded contexts, we may construct the descriptive sentence Sn that arises from replacing all normative predicates in S by their n-corresponding versions. Therefore we can operate with embedded contexts by saying that the sentence S holds in ⟨w, n⟩ if and only if Sn holds in the possible world w.
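The ruling-out procedure can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Gibbard's own apparatus: each factual-normative world is a pair of a possible world w (reduced here to a single, irrelevant fact) and a norm system n, and n is simplified to a verdict “wrong / not wrong” for just two acts. All names (`ACTS`, `wrong`, `if_then`, `keep_compatible`, `entails`) are hypothetical.

```python
from itertools import product

# Assumption: only two acts matter, and a norm system n just says which are wrong.
ACTS = ("torment_cat", "get_brother_to_torment_cat")
# Assumption: a possible world w is reduced to one yes/no fact.
FACTS = (True, False)


def factual_normative_worlds():
    """Enumerate every pair (w, n) in this toy model."""
    worlds = []
    for w in FACTS:
        for verdicts in product((True, False), repeat=len(ACTS)):
            n = dict(zip(ACTS, verdicts))
            worlds.append((w, n))
    return worlds


def wrong(act):
    """Content of 'act is wrong': holds in (w, n) when n counts the act as wrong."""
    return lambda world: world[1][act]


def if_then(antecedent, consequent):
    """Content of a conditional, evaluated world by world."""
    return lambda world: (not antecedent(world)) or consequent(world)


def keep_compatible(worlds, sentence):
    """Rule out every factual-normative world in which the sentence fails."""
    return [world for world in worlds if sentence(world)]


def entails(premises, conclusion):
    """True when the conclusion holds in every world the premises leave open."""
    remaining = factual_normative_worlds()
    for premise in premises:
        remaining = keep_compatible(remaining, premise)
    return all(conclusion(world) for world in remaining)


premise_1 = if_then(wrong(ACTS[0]), wrong(ACTS[1]))
premise_2 = wrong(ACTS[0])
conclusion = wrong(ACTS[1])

print(entails([premise_1, premise_2], conclusion))  # prints True
print(entails([premise_1], conclusion))             # prints False
```

In this sketch the word “wrong” receives the same treatment whether it occurs embedded in the conditional or free-standing, which is what the factual-normative approach is meant to secure.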

In fact, Gibbard’s solution to the Frege-Geach problem is rather a way of bypassing the problem, because he explains the functioning of normative language by means of descriptive language and semantic models. According to Sinnott-Armstrong’s criticism (1993), Gibbard’s analysis appears to be compatible with a realist view of norms because of his ambiguous use of normative judgment (which is a state of mind) and his use of possible-worlds semantics.

5. The significance of the Geach-Frege Problem and Jorgensen’s Dilemma for non-cognitivism

The Frege-Geach problem and Jorgensen’s Dilemma are two sides of the same coin. The first deals with the problem of mixed, or embedded, contexts (normative and descriptive) and with how it is possible to handle mixed sentences. The main problem here is the interpretation of connectives and logical operators in contexts that partially lack truth-values.

Jorgensen’s Dilemma, on the other hand, deals with making inferences between norms, that is, between sentences that lack truth-values, and with creating a logical foundation that makes sense of the inferences between norms that we actually find sound in everyday discourse. Jorgensen’s Dilemma also tries to explain the very nature of moral disagreements and the way we can rationally deliberate on them.

Both are questions involving the different illocutionary role of normative/expressive sentences, and their solution represents a challenge to non-cognitivism. A positive solution to both challenges would make room for the rationality of non-cognitive discourse in ethics. A negative one, on the contrary, would show that the only option for rationalism in ethics is cognitivism and that the alternative, in the worst-case scenario, is irrationality and ethical nihilism.

Finally, it is worth noting that while both cover a similar perspective, the Frege-Geach problem is more popular in moral philosophy, whereas Jorgensen’s Dilemma is more popular in the philosophy of law. It is difficult to understand the reasons for this difference in interest. We can only guess that it arose because the analysis of sentences in terms of the Frege-Reichenbach model was popular among moral philosophers while it was virtually unknown (until the works by Alchourron and Bulygin, 1971) among philosophers of law.

6. A Taxonomy of Ethics

The following scheme is a development from R. M. Hare’s A Taxonomy of Ethical Theories (Hare, 1997, p. 42)

Descriptivism: Meanings of moral sentences are wholly determined by syntax and truth conditions.

Naturalism: Truth conditions of moral sentences are non-moral properties.

Objectivistic naturalism: These properties are objective.

Subjective naturalism: These properties are subjective.

Intuitionism: Truth conditions of moral sentences are sui generis moral properties.

Non-descriptivism: Meanings of moral sentences are not wholly determined by syntax and truth conditions.

Emotivism: Moral sentences are not governed by logic.

Rationalistic non-descriptivism: Moral sentences are governed by logic.

Universal prescriptivism: The logic, which governs moral sentences, is the logic of universal prescriptions.

Expressivism: The moral sentences are about beliefs and/or commitments; their logic is different from the logic of descriptive sentences.

7. References and Further Reading

  • Alchourrón, C. E. (1993): “Philosophical Foundations of Deontic Logic and the Logic of Defeasible Conditionals”, in Meyer and Wieringa (1993), Deontic Logic in Computer Science, Chichester, Wiley, pp. 43-84.
  • Alchourrón, C. E. and Bulygin, E. (1981): “The Expressive Conception of Norms”, in Hilpinen, H. (ed.) (1981), New Essays in Deontic Logic, Dordrecht, D. Reidel, pp. 95-124
  • Alchourrón, C. E. and Bulygin, E. (1989): “Limits of Logic and Legal Reasoning”, in Martino, A.A. (ed.) (1989), Deontic Logic, Computational Linguistics and Legal Information Systems, Amsterdam, North-Holland, pp. 1-20.
  • Ayer, A. J. (1936): Language, Truth and Logic, London, Gollancz
  • Bentham, J. (1789): An Introduction to the Principles of Morals and Legislation, eds. Burns, J.H. and Hart, H.L.A., London, Athlone Press, 1970
  • Bentham, J. (1970): Of Laws in General, ed. Hart, H.L.A., London, Athlone Press, 1970.
  • Blackburn, S. (1984): Spreading the Word, Oxford, Clarendon.
  • Bulygin, E. (1982): “Norms, normative propositions and legal statements”, in Floistad, G. (ed.), Contemporary Philosophy. A New Survey, The Hague, M. Nijhoff, pp. 157-163; repr. in Alchourrón and Bulygin (1991), pp. 215-238.
  • Geach, P. T., (1958): “Imperative and Deontic Logic”, Analysis, 18, 3, pp. 49-56.
  • Geach, P. (1964): “Assertion”, Philosophical Review, 74, pp. 449-465
  • Gibbard, A. (1990): Wise Choices, Apt Feelings. A Theory of Normative Judgement, Oxford, Clarendon Press
  • Hale, B. (1993): “Can There Be a Logic of Attitudes?”, in Haldane, J., and Wright, C. (eds.) (1995), pp. 337-363
  • Hare, R. M. (1949): “Imperative Sentences”, in Mind, LVIII; in Hare (1971), pp. 1-21.
  • Hare, R. M. (1952): The Language of Morals, Oxford, Clarendon.
  • Hare, R. M. (1963): Freedom and Reason, Oxford, Oxford U.P.
  • Hare, R. M. (1967): “Some Alleged Differences between Imperatives and Indicatives”, in Mind, LXXVI
  • Hare, R. M. (1982): Moral Thinking: Its Levels, Methods and Point, Oxford, Oxford U.P.
  • Hare, R. M. (1989): Essays in Ethical Theory, Oxford, Oxford U.P.
  • Hare, R. M. (1997): Sorting Out Ethics, Oxford, Oxford U.P.
  • Jørgensen, J. (1937-38): “Imperatives and Logic”, in Erkenntnis, 7, pp. 288-296
  • Kelsen, H. (1941): “The Pure Theory of Law and Analytical Jurisprudence”, in Harvard Law Review, 60, pp. 44-70
  • Levinson, S. C. (1983): Pragmatics. Cambridge, Cambridge U.P.
  • Lyons, J. (1995): Linguistic Semantics. An Introduction, Cambridge, Cambridge U.P.
  • MacIntyre, A. (1957): “What Morality is Not”, Philosophia, XXXII (123), pp. 325-335.
  • Marcus, R. B. (1966): “Iterated Deontic Modalities”, Mind, 75, pp. 580-582.
  • Reichenbach, H. (1947): Elements of Symbolic Logic, New York, Macmillan
  • Russell, B. (1935): Religion and Science, Oxford U.P.
  • Searle, J.R. (1969): Speech Acts. An Essay in the Philosophy of Language, London, O.U.P.
  • Singer, M. (1985): “The Generalization Principle”, in Potter, N.T. and Simmons, M. (eds.) Morality and Universality, Boston, Dordrecht, pp. 47-73.
  • Sinnott-Armstrong, W. (1993): “Some problems for Gibbard’s norm-expressivism”, Philosophical Studies, pp. 297-313.
  • Stevenson, C.L. (1944): Ethics and Language, New Haven, Yale U.P
  • Unwin, N. (1999): “Norms and Negation: A Problem for Gibbard’s Logic”, The Philosophical Quarterly, 51(202), pp.60-75
  • von Wright, G. H. (1963): Norm and Action. A Logical Inquiry, London, Routledge & Kegan Paul
  • Warnock, M. (1978): Ethics since 1900, Oxford, Oxford U.P.
  • Williams, B. A. O. (1985): Ethics and the Limits of Philosophy, Cambridge (Mass.), Cambridge U.P.

Author Information

Antonio Marturano
Email: marturano@btinternet.com
University of Exeter
United Kingdom

Universals

Universals are a class of mind-independent entities, usually contrasted with individuals (or so-called “particulars”), postulated to ground and explain relations of qualitative identity and resemblance among individuals. Individuals are said to be similar in virtue of sharing universals. An apple and a ruby are both red, for example, and their common redness results from sharing a universal. If they are both red at the same time, the universal, red, must be in two places at once. This makes universals quite different from individuals; and it makes them controversial.

Whether universals are in fact required to explain relations of qualitative identity and resemblance among individuals has engaged metaphysicians for two thousand years. Disputants fall into one of three broad camps. Realists endorse universals. Conceptualists and Nominalists, on the other hand, refuse to accept universals and deny that they are needed. Conceptualists explain similarity among individuals by appealing to general concepts or ideas, things that exist only in minds. Nominalists, in contrast, are content to leave relations of qualitative resemblance brute and ungrounded. Numerous versions of Nominalism have been proposed, some with a great deal of sophistication. Contemporary philosophy has seen the rise of a new form of Nominalism, one that makes use of a special class of individuals, known as tropes. Familiar individuals have many properties, but tropes are single property instances. Whether Trope Nominalism improves on earlier Nominalist theories is the subject of much recent debate. In general, questions surrounding universals touch upon some of the oldest, deepest, and most abstract of philosophical issues.

Table of Contents

  1. Introduction
    1. The Nature of Universals
    2. Reasons to Postulate Universals
    3. The Problem of Universals
  2. Versions of Realism
    1. Extreme Realism
    2. Strong Realism
    3. Objections to Realism
  3. Versions of Anti-Realism
    1. Predicate Nominalism
    2. Resemblance Nominalism
    3. Trope Nominalism
    4. Conceptualism
  4. Concluding Thoughts
  5. References and Further Reading

1. Introduction

An inventory of reality’s most fundamental entities would almost certainly include individuals. Individuals are singular objects. They can exist over time, but in only one place at a time. Individuals also have properties (also called qualities), at least most of which can vary over time. A ripening apple goes from being green to being red, for instance. Almost everyone agrees that individual apples exist, and that they are colored, but are redness and greenness entities themselves? If so, what are they like? And if redness and greenness are not real entities, how could our apple be colored at all? Without its distinctive qualities, an apple wouldn’t even be an apple.

Let us use the term “universal” for properties (or qualities).  In a philosophical tone of voice we can now ask, “Are there really such universals? If so, what is their nature? How are they related to individuals?” These questions start us down a road philosophers have been exploring since philosophy itself was young.

We can approach the question about the existence of universals from a linguistic perspective. Consider how often we speak of things having properties: “That apple is red;” “The oven is hot;” or “My shirt is dirty.” Such sentences have a subject-predicate structure. The subject term refers to the individual described in the sentence. The predicate, on the other hand, describes; it tells us something about the way that individual is, how it is qualified. Do predicates also refer? Some philosophers think they do. Alongside the individuals picked out by subject terms of sentences, it is thought, there are entities of a different kind, picked out by predicates. Once again we can call these “universals”.

Prima facie, there seems to be every reason to believe in universals. They look to be just as much a part of our experience as individuals are. Philosophical questions and problems arise, however, when we try to specify their natures. If universals are real, but are not individuals, what are they? Some philosophers contend that universals are too strange to accept into our world view. In a similar vein, it has been alleged that any philosophical work done by universals can be done just as well without them; whether they are strange or not, many argue, universals are simply unnecessary. Of course, it would need to be shown that universals really can be dispensed with, and we’ll return to this controversy. But first we will examine competing Realist conceptions of the nature of universals.

a. The Nature of Universals

In fundamental debates in metaphysics, it can be useful to understand the type of entity or concept in contrastive terms. For instance, it is helpful to understand universals by contrasting them with individuals. What then, is an individual, or a particular, in the philosophical or metaphysical sense of the term?

Traditionally, the term “individual” is used to pick out members of a certain category of existents, each member of which is said to be unique. More precisely, individuals are said to be non-repeatable (not multi-exemplifiable), which means that they can’t be in more than one place at a time. Examples include the familiar objects of sense-experience, such as chairs or tigers. A room may contain many chairs that are virtually alike in their intrinsic qualities, but each chair is nonetheless a distinct thing in one place at one time. By contrast, the universal “chair” is repeated around the room.

The individuals familiar from experience are also said to be material: they fill regions of space with impenetrable “stuff,” and are locatable in space and time. Some philosophers are committed to other types of individuals, as well: immaterial ones (such as souls and sense-data) and even ones that are also outside space and time (such as numbers and God). The crucial contrast for our purposes, however, is between what are repeatable (universals) and what are not (individuals).

Although individuals are nonrepeatable, universals can serve their characteristic functions only if they differ from individuals in this respect. In order to ground relations of qualitative identity, for instance, universals must be multi-exemplifiable (or repeatable), able to be here and there at the same time. My apple and yours are both individuals, and this implies that each can be in only one place at a time. But if the redness they share is a universal, then the redness they share is a real non-individual, literally in both. The apples are similar in virtue of sharing this universal, redness. And if redness is shared in this way, then it is in at least two places at once.

As we proceed we will get more precise about these characterizations, and explore variations that have been defended in opposing Realist accounts. But we can appreciate already why some philosophers balk at the existence of universals. For, as just noted, all defenders want to say that universals are repeatable. It seems, however, that defenders of universals must also say that universals are wholly present in each of the places they exist.

To explain, suppose we were to destroy one of the apples considered above. We’d have one fewer individual, to be sure. Would there be a diminishment of redness itself? It doesn’t seem so, since redness is held to be an entity in its own right. Nor does it seem to make sense to say that redness increases when another apple ripens and turns red. These considerations suggest that a universal is wholly present in each of its instances, and that the existence of a universal at one place is unrelated to its simultaneous existence at any other place. It’s not clear, however, how universals could be both wholly present in each of the places they exist, and, at the same time, present in many different places at once. This certainly would make them unusual, to say the least.

Moreover, it seems to be a mark of materiality that a material thing can be in only one place at a time. If so, then universals cannot be material. This in turn creates a problem when it comes to causation. For as we usually understand causal relations, one thing affects another by interacting with it, say by colliding with it. But that seems possible only if the entities in question are material. For these reasons it is difficult to explain how universals interact with other things that exist. The puzzle becomes more acute when we wonder how we can know universals at all. Don’t they have to interact with our brains for us to know them? If they are not material, this interaction is quite mysterious.

In summation, we’ve seen that universals are quite different from individuals, and in ways that make them odd. Philosophers with low tolerance for strangeness tend to dismiss them for these reasons. Why, then, do some philosophers continue to believe in them, despite their unusual natures?

b. Reasons to Postulate Universals

Universals are called on to serve many philosophical functions. For most of this article, we’ll focus on one particularly famous one – the role universals play in professed solutions to what has come to be called “The Problem of Universals.”

First, a word or two about postulating entities is in order. Here we might compare the philosophical enterprise of deciding whether universals exist with the scientific enterprise of deciding whether strange unobservable entities, like quarks or neutrinos, exist. The scientific case is itself controversial, but many scientists and philosophers believe in the existence of unobservables, provided the theories that postulate them best explain the observable phenomena under study. For example, many believe the universe contains what physicists call “black holes,” in part because the best (perhaps only) way to explain a range of stellar phenomena is to suppose that black holes are responsible. Again, this is controversial, but if the explanation provided is the best (or only) explanation, many scientists and philosophers claim a right to believe the postulated unobservables exist.

In parallel, we now ask, “Are there any philosophical puzzles or problems that can best be solved by believing in universals?” In fact, universals have been called on to answer a range of philosophical questions. Recall our points about subjects, predicates and reference. Prima facie, a name wouldn’t be a name if there weren’t something for it to refer to. Some philosophers think that the meaning of a name just is its referent. What about general terms, terms that can be said of many things, such as “red” or “wise”? What gives those terms meaning? Some have said that predicates must have referents to be meaningful, and universals fit the bill.

Universals have also been called on to solve problems in the theory of knowledge. Plato, for instance, said that for us to know something, that which is known must be unchanging. Since material individuals are subject to change, Plato argued, there must be things that don’t change, suitable as objects of genuine knowledge, not just belief. Universals might fit the bill here, too.

Relatedly, some philosophers have argued that we need universals to understand the stable, unchanging laws of nature that govern individuals’ changes. Indeed, it has been argued that a law of nature just is a relation among universals, by which one universal brings about, or necessitates, others.

Our focus in this essay concerns another role for universals, perhaps the most famous one. They are said to answer what seems a very simple question, but which turns out to be one of the most famous and long-standing issues in philosophy. This returns us to the so-called “Problem of Universals.”

c. The Problem of Universals

Often we predicate properties of individuals. When we say that both cherries and rubies are red, for instance, we seem to say individuals share common properties, those that make cherries cherries, those that make rubies rubies, and those that make both red. Predicates are said of many subjects, then, but is there anything in reality to match the linguistic one-over-many? Are there general truths? Is there commonality in nature, in reality; or is commonality imagined and illusory, perhaps a mere product of language? If the latter, how can we accommodate the intuition that it is the world, and not our conventions, that makes predications true or false? The Problem of Universals arises when we ask these questions. Attempts to solve this problem divide into three broad strategies: Realism, Nominalism, and Conceptualism. We’ll take these in turn, and consider the pros and cons of each.

2. Versions of Realism

We’ll begin by examining versions of Realism, all of which claim that yes, there are universals; yes, there are truths about the general; yes, there is commonality in nature. Unless we accept universals into our world view, the Realist argues, we will be unable to explain a fundamental and apparent fact, namely, that there is genuine commonality and systematicity in nature. Again, experience suggests that the individuals we encounter share properties with other individuals. Some are red, and some are not; some are blue, and some are not; some are emeralds, and some are not. Realists claim what makes it the case that these individuals seem to share properties is that in fact they do. There is an entity, a universal, present in each of these individuals at once, which in turn explains our right to say that they are qualitatively identical.

a. Extreme Realism

The oldest, and most famous, variant of Realism comes from Plato. Plato’s position is that in order to explain the qualitative identity of distinct individuals, we must accept that there is another entity besides the resembling individuals, an entity we’ve called a universal, and which Plato would call a Form. If two apples, for example, are both red, this is because there is a Form of Red that is able to manifest itself in both those apples at once.

Really there are three different components in this picture. There is the individual, a particular apple; there is the red of that apple – which exists right “in” or with that apple; and finally, there is the Form of Red, which manifests itself in the red of this apple (and of course, the red of other apples). What, then, is the nature of the Form itself, which provides for the bit of red we see in this apple or in that?

On Plato’s view, Forms are immaterial. They are also outside of space and time altogether. They are wholly abstract, we might say. Of course, for the Form of Red to make an individual apple red, the Form must somehow be related to the apple. Plato postulates a relation of participation to meet this need, and speaks of things “participating” in Forms, and getting their qualities by virtue of this relation of participation. One last point about the nature of Forms proves crucial. For the Form of Red to explain or ground the redness of an apple, the Form of Red must itself be red, or so it seems. How could a Form make an apple red, if the Form were not itself red?

As we noted, Plato’s account of generality was the first one, and it has held great appeal ever since. But it is also subject to serious criticisms. Interestingly, one of the most devastating objections to the theory of Forms comes from Plato himself. We will return later to this famous objection, which has come to be known as the Third Man Argument. Because of the power of this argument, many philosophers sympathetic to Realism have looked elsewhere for a solution to the Problem of Universals. We’ll explore one alternative now.

b. Strong Realism

Although the first position is credited to Plato, this next one is widely thought to be inspired by Aristotle. The key in this position is its rejection of independently existing Forms. As we noted in Section 2a, Extreme Realists posit an explanatory triad involving an individual, the quality of this individual, and the Form that grounds the quality of this individual (and that one, and others). Strong Realists, in contrast, resist this triad. When an individual has a quality, there is simply the individual and its quality. No third, independent thing is needed to ground possession of the quality. A universal, on this view, just is the quality that is in this individual and any other qualitatively identical individuals. The universal red, for example, is in this apple, that apple, and all apples that are similarly red. It is not distinct and independent from the individuals that have this color. Because it is a universal it can exist in many places at once. According to Strong Realism, the universal red in my apple is numerically identical to the red in yours; one universal is in two individuals at once. It is wholly present in each place where it exists.

As we’ll see, Strong Realism is immune to the Third Man Argument. It also reduces the strangeness of Realism. We need not have Forms that are abstract, in the sense of being outside of space and time, mysteriously grounding the qualities of material individuals. The Strong Realist’s universals are in space and time, and are able to be in many places at once. Multiple exemplification may be considered strange, but it is not as strange as existence outside space and time.

c. Objections to Realism

We turn now to objections. We’ve already seen what might be called the Strangeness Objection. This is the intuition some philosophers have that universals are just too odd-natured to be accepted into our world view. These philosophers typically countenance only what is material, spatiotemporal, and nonrepeatable; and universals just don’t fit the bill. Philosophers who believe in only individuals are known as Nominalists. We’ll return to them later. We should note, however, that there are other versions of Realism in addition to the two we’ve discussed. Medieval philosophers spent much time exploring these issues, and formulated many versions of Realism. This introduction to the Problem of Universals will not explore these other variants, though they too are vulnerable to the objection that closes this section.

Extreme Realism is challenged by the Third Man Argument. Recall the essentials of that position, in particular, what is said about the nature of the Forms. For any given quality had by an individual there is a Form of that quality, one that exists separately from individuals, and also from the quality found in each particular individual. There is the apple, the red of this apple (and the red of that apple), and the Form of Red. By participating in the Form of Red, the apple gets its particular bit of redness. And finally, as we saw, the Form of Red must itself be red. Otherwise it couldn’t provide for the redness of the apple. Suppose we now ask, “What explains the red of the Form of Red, which itself, as we said, is red?” Coming to believe in the existence of Forms begins with the urge to explain the redness of apples and other material individuals, but once this step is taken, the Extreme Realist is forced to explain the redness of the Form of Red itself.

To explain the redness of the Form of Red, in Extreme Realist fashion, we will have to say that the Form of Red participates in a Form. After all, a fundamental tenet of Extreme Realism is that possession of a quality always results from participation in a Form. Presumably, a Form cannot participate in itself. Therefore, if the redness of the Form of Red is to be explained, we’ll need to say that the Form of Red participates in a higher-order Form, Red2. Moreover, participation in Red2 will explain the redness of the original Form of Red (call it Red1) only if the higher-order Form, Red2, is itself red. Of course, now we will have to explain the redness of the Form of Red2, and that will require us to introduce yet another Form, in this case, the Form of Red3, which the Form of Red2 participates in to get its redness.

It is clear that this will go on indefinitely. So it seems that we will never have an explanation of why or how the Form of Red is actually red. That means we’ll never be able to explain why our original apple is red. That was what we wanted initially, and so it seems that Plato’s theory is unable to provide an answer. This has led many to reject Plato’s theory. (There is, not surprisingly, a large body of secondary literature which explores whether Plato’s theory can survive this objection and what Plato himself thought about it, since, as we’ve mentioned, it was Plato himself who first raised the objection.)

The Third Man Argument threatens only Extreme Realism. Strong Realists do not rely on independently existing Forms to explain the redness of individuals, and so they need not explain why an independent existent – the Form of Red – is itself red. Instead, Strong Realists can simply note that the universal present in each apple is itself red, and the red of this universal explains the red of each apple, and also their similarity with respect to color.

However, the objection to which we now turn threatens all variants of Realism. This final objection is not so much an argument that Realism is intrinsically flawed, but rather that Realism is unnecessary. A general principle governing many metaphysical debates is that, other things being equal, the fewer types or kinds of entities in one’s ontology, the better. Those opposed to Realism argue that they can meet the explanatory demands we’ve discussed without relying on universals. If qualitative resemblance and identity can be accounted for without universals, and if any other work done with universals can be done as well without them, then, the opponents of Realism argue, we should do without them. We will then have fewer categories in our ontology, which, other things being equal, is to be preferred.

For this reason, opponents of Realism try to solve the Problem of Universals without universals. The question we will track is whether such solutions are in fact adequate. If not, perhaps commitment to universals, however unpalatable, is necessary.

3. Versions of Anti-Realism

We’ll call any proposed solution to the Problem of Universals that doesn’t endorse universals a version of “Anti-Realism.” Anti-Realists divide into two camps: Nominalists and Conceptualists. Nominalists maintain that only individuals exist. They argue that the Problem of Universals can be solved through proper thinking about individuals, and by appeal to nothing more than the natures of, and relations among, individuals. Conceptualists, in contrast, deny that individuals suffice to solve the Problem, but they also resist appealing to mind-independent universals. Instead, qualitative identity and resemblance are explained by reference to concepts or ideas. We will explore this Conceptualist strategy at the conclusion of our discussion of Anti-Realism. First we will survey a range of Nominalist theories.

a. Predicate Nominalism

How can we explain the qualitative identity of distinct individuals without relying on universals? One strategy begins by giving an account of what makes a single individual, which we will call “Tom,” red. A minimal, but perhaps sufficient answer is to say that Tom is red because the predicate “is red” can be truly said of Tom. As for the predicate “is red” itself, it is just a particular string of words on a page (or this screen), or else a string of spoken sounds. Expanding this strategy we get the view that two individuals, say Tom and Bob, are red simply because the linguistic expression, the predicate “is red,” is truly said of both. We account for commonality in nature by reference to individuals—in this case the individuals Bob and Tom, and also linguistic expressions such as the predicate “is red.”

On this view then, all that exist are individuals and words for talking about those individuals. This seems metaphysically innocuous, but many philosophers charge that Predicate Nominalism ignores the Problem of Universals, and does not solve it. Why is it true to say that both Bob and Tom are red, for instance, and not green or blue? What is it about the world, the individuals, that explains why they are that way and not some other way? What explains their similarity? Predicate Nominalists just leave it as a brute fact that some things are red (or blue, or green). More precisely, what they leave brute is the fact that, for any given individual, some predicates correctly apply and others don’t. But when it comes to explaining these facts, Predicate Nominalism will go no further. This refusal to take the Problem of Universals seriously has even landed Predicate Nominalism the label “Ostrich Nominalism.”

b. Resemblance Nominalism

Another Nominalist strategy is to collect individuals into sets based on resemblance relations, and then account for qualitative identity and resemblance by appeal to commonalities of set membership. An individual’s redness, for example, is explained by the fact that it belongs to the set of red things. The fact that two individuals are both red is explained by their both belonging to the same set of red things. A given set, such as the set of red things, is constructed by adding to it individuals that resemble each other more closely than they resemble any nonmembers, that is, the individuals that aren’t red. In this way, Resemblance Nominalists explain individuals’ supposed shared qualities by talking only about resemblance relations. Things that resemble each other belong to a common set. Membership in a certain set defines what it is to have a certain property, and two members of a set can be said to share a property, or be qualitatively identical, in virtue of simply belonging to the same set of resembling individuals.

In the course of trying to account for two distinct properties, however, Resemblance Nominalists can end up constructing the same set twice, and when two distinct properties pick out the same set, a serious problem follows. For instance, it is thought that everything that has a heart also has a kidney. If so, the set of individuals constructed for the property “has a heart” will have the same members as the set constructed for the property “has a kidney.” Two sets with the same members are really just one set, not two, by the very definition of “set,” so Resemblance Nominalists are forced to say that having a heart is one and the same property as having a kidney. But that is clearly false.
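The point about set identity can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions: the toy domain, the individuals' names, and the attribute flags are all hypothetical. It builds one set per predicate and shows that, because a set is fixed by its members alone, the two constructions yield a single set.

```python
# Toy domain: every name and attribute flag here is a hypothetical illustration.
creatures = {
    "tom": {"has_heart": True, "has_kidney": True},
    "bob": {"has_heart": True, "has_kidney": True},
    "rock": {"has_heart": False, "has_kidney": False},
}

def extension(predicate):
    """Return the set of individuals of which the predicate is true."""
    return frozenset(name for name, attrs in creatures.items() if attrs[predicate])

with_heart = extension("has_heart")
with_kidney = extension("has_kidney")

# Sets are identified by their members alone, so these two are the same set.
same_set = with_heart == with_kidney
print(same_set)
```

If properties simply are such sets, nothing in the sketch is left over to mark the difference between having a heart and having a kidney.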

A second problem for the Resemblance Nominalist arises when we wonder about the method of set construction. Accounting for an individual’s redness requires building a set with that individual and other resembling individuals as members. But, unfortunately for Resemblance Nominalism, some members of the red-set actually turn out to not be red at all. To explain, remember that the construction of the set proceeds by grouping particulars that resemble each other, and, importantly, things can resemble each other in various respects. Our red apple resembles other red apples, red stop signs, and red books, and all those things would thus get into the set. But our red apple also resembles a green apple, of the same type, which isn’t ripe yet. So that green apple would go in the set. Other things, too, will resemble our apple, but not by being red. As such, it seems that Resemblance Nominalism “explains” our individual’s being red by reference to a set containing non-red things, which is just to say it doesn’t explain it at all.

The tempting reply here is, “Sure, the green apple does resemble our red apple, but not in the right way. If you stop building sets with the wrong kinds of resemblance, you won’t let non-red members into the set.” The problem with this reply is that the only way to stop these “bad” resemblances is to include in the set only things that are red. But remember, being red is what the Nominalist is trying to explain in the first place, and so we can’t use being red to guide set construction. To do so would be circular.
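The set-construction worry can also be sketched in a few lines. What follows is a minimal illustration, assuming a hypothetical toy domain in which each ordinary object has just two qualities, and assuming (for the sake of the example only) that "resembles" means "matches in at least one respect."

```python
# Hypothetical toy individuals; each ordinary object has more than one quality.
things = {
    "red_apple": {"color": "red", "kind": "apple"},
    "stop_sign": {"color": "red", "kind": "sign"},
    "green_apple": {"color": "green", "kind": "apple"},
    "blue_book": {"color": "blue", "kind": "book"},
}

def resembles(a, b):
    """Two things resemble each other if they match in at least one respect."""
    return any(things[a][k] == things[b][k] for k in things[a])

def resemblance_set(seed):
    """Collect everything that resembles the seed in some respect or other."""
    return {name for name in things if resembles(seed, name)}

candidate_red_set = resemblance_set("red_apple")
# The green apple is admitted, because it resembles the seed in kind.
print(sorted(candidate_red_set))
```

Restricting the comparison to the "color" key would keep the green apple out, but selecting the relevant respect in advance is the very move the circularity objection targets.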

A third objection arises when we consider the resemblance relation itself. Resemblance Nominalism cannot succeed without this relation; it bears most of the explanatory load. Arguably, then, the position is committed to the existence of resemblance relations. This seems to generate a serious problem. Individuals resemble one another, of course, but resemblance itself is not an individual. So, if the position is committed to resemblance relations, and if resemblance relations are not individuals, then it seems that Resemblance Nominalism is a misnomer. Upon close inspection, the position looks to be a kind of Realism. Suppose three things (a, b, and c) resemble one another, and belong in the same set. We have three individuals in this case, but what about the instances of resemblance that hold among those individuals? Are they the same kind of resemblance? They had better be, if the previous objection is to be avoided! Resemblance Nominalists, then, need to posit instances of, and kinds of, resemblance, all of which suggests we actually have a universal here—namely, the resemblance relation that holds between a and b, between b and c, and between a and c. If resemblance itself is a universal, Resemblance Nominalists are committed to at least one universal. Perhaps they should make life easier (if not simpler) and let them all in!

The above objections have moved some Nominalists to develop alternative accounts. Many have turned to Trope Nominalism, which we will discuss next. Trope Nominalism is committed to a new kind of entity, tropes. This may seem surprising, since Nominalists insist on ontological simplicity. But while Nominalists allow only individuals into their ontology, this doesn’t preclude explanatory appeals to tropes. For tropes, as we will see, are a class of individuals. Perhaps with this innovation Nominalists will fare better.

c. Trope Nominalism

Though they were known to Medieval philosophers, tropes are relatively new to contemporary metaphysics, and have been called on to address a number of very different philosophical issues, including the Problem of Universals. Trope theory can be understood, somewhat paradoxically, as making properties into particulars. Tropes are a type of individual. While ordinary individuals are qualitatively complex, a trope is qualitatively simple, and is, in fact, a particular property instance. The blue of the sky is a particular trope numerically distinct from the blue-trope of your T-shirt, even if the two tropes are qualitatively identical.

For the tropist, ordinary individual objects can be conceived as bundles or collections of tropes; and an ordinary object, which is a complex particular, has a certain quality in virtue of having, as a member of the complex, a particular trope, which is that particular character. An apple thus is a complex of tropes—a red trope plus an apple-shape trope, plus a sweet trope, plus a crisp trope, and so forth. If the apple is red, that is because there is a red trope, a red individual, that is a member of that bundle or complex. Red is not a property the trope has; rather, the red trope is the red itself. (Instead of treating an ordinary object as nothing more than a bundle of tropes, another option is to treat an individual as a substance that possesses a bundle of tropes. For simplicity, we will set that option aside. Whether an object is, or instead has, a bundle of tropes, the coming points hold.)

Trope Nominalism explains qualitative identity between two distinct ordinary individuals by saying that the first individual has a constituent trope that is qualitatively identical to, but numerically distinct from, a trope had as constituent by the second individual. Two apples are red, for instance, because each has a red trope “in” them, and these tropes themselves are individuals that exactly resemble each other. Importantly, because this is a version of Nominalism, we don’t say the tropes resemble each other because they share a universal. Instead, they simply resemble each other. If we like, we can expand on the claim that red tropes resemble each other by constructing sets of resembling individuals. In this case, we would have a set of red tropes, the members of which resemble each other more closely than they resemble any other tropes. In summary, then, by appeal to qualitatively identical, but numerically distinct tropes, we can explain qualitative similarities among ordinary objects, all without reliance on universals.

How is this better than Resemblance Nominalism? Remember that Resemblance Nominalism was vulnerable because it explained qualitative identity of individuals by reference to sets of resembling individuals. The trouble was that the individuals collected into sets are ordinary objects, ones that have many properties, so they can resemble each other in many ways. For this reason, no noncircular criterion of set construction could exclude members with the wrong property. Tropes, however, have only one property, so if individual tropes are collected into sets, there won’t be members that don’t belong. The set of red tropes will have only red tropes in it. Trope Nominalists can now make unproblematic appeal to “resemblance among individuals.” This has convinced many that Trope Nominalism is a serious contender against Realism.

As well, recall that Resemblance Nominalism faced the charge that only a resemblance universal could account for resemblance relations among individuals. Trope Nominalism has a reply here too. (As always, in any complex philosophical discussion, there are various ways to reply to objections, just as there are many objections. We outline here just one of the ways Trope theories have responded to this objection.) Whereas Resemblance Nominalists seemed forced to countenance a resemblance universal, Trope Nominalists can appeal to resemblance tropes! Should we have, for example, three identical red tropes (a, b, and c), then there will be a resemblance relation between a and b, a similar relation between b and c, and a similar relation between a and c. Trope Nominalism can treat each of these resemblances as distinct tropes. When three red tropes are mutually resembling, then, in addition to the red tropes themselves, there are three resemblance tropes. And just as the resemblance among the three red individuals is a basic fact, so too is the resemblance among these resemblance relations. Not all resemblances are alike, of course, but in this case they are. All properties are tropes, and properties include not just ones like “red,” but also ones like “resembles.”

But there are still problems, perhaps, for Trope Nominalism. Recall that we began by wondering how distinct ordinary things could be said to be qualitatively identical without introducing a universal common to both. Tropists instruct us to view ordinary particulars as complexes of tropes, and allow that there can be qualitatively similar but numerically distinct tropes present in different complexes. Qualitative similarity among ordinary objects is explained by the qualitative similarities of their constituent tropes. Finally, the qualitative similarity among distinct tropes is explained by the fact that some (for example, red) tropes resemble each other more closely than other (for example, non-red) tropes. The last point is the crucial one. We are told that it is simply a brute fact that some tropes resemble each other, and that others don’t. That is just the way things are, and there is no further explanation to be given. But tropes were meant to do explanatory work; so, at the level of tropes, we want and expect an account of generality. If trope theories are presented as a solution to the Problem of Universals, they should explain how there can be truths to explain the appearance of generality in reality. What we end up with, though, is brute and ungrounded qualitative identity among distinct tropes. In essence then, the tropist dismisses, but does not solve, a question about the nature of generality, by making generality a brute fact. Unlike the Predicate Nominalist, the tropist goes to great lengths to develop a theory, but in the end seems to offer no more explanation of generality. We know that our original objects resemble each other. Why? Because they have tropes that resemble each other. But the latter resemblance is not explained. And so it seems we’ve not gone very far in explaining our original resemblance. What we want is an explanation of qualitative similarity. Accounting for it in terms of qualitative similarity, now at the level of tropes, does no more than relocate the question. The very relation we sought to understand reappears as our answer.

Again, qualitative similarity across ordinary particulars is explained by the relation of qualitative similarity holding among the tropes that constitute those particulars. But that seems either to postpone answering the question, or to answer it by appealing to the very fact we wanted explained. At best, this explanation is unsatisfying; at worst, it is circular. We are left with qualitative identity as a brute, unexplained phenomenon, triggering the reasonable question: What then have we really gained with trope theories?

d. Conceptualism

A final strategy for avoiding universals comes by making generality not a feature of reality, but instead a feature of our minds and the concepts or ideas in minds. Conceptualism thus seeks a third way, as its proponents see it, between the excesses of Realism, and the unilluminating resemblance relations of Nominalism. Because many individuals can fall under the same concept, Conceptualism hopes to accommodate the intuition that qualitative identity and resemblance are grounded in the sharing of something, but in a way that doesn’t appeal to dubious items such as universals. According to this view, individuals a and b are red because the concept of redness applies to both. The concept red is general, not because it denotes a real non-individual, but only because many diverse particulars fall under, or conform to, that concept.

As tidy as this seems, it too suffers from problems. To see this, we need to realize that concepts can be misapplied in some cases, such as when we say of a cat that it is a dog. And misapplied concepts explain nothing deep about generality. Conceptualism’s appeal to concept application must concern only correct concept application. As such, it is fair to ask, “What makes it the case that the concept red is rightly applied to both a and b, but not to some third individual, c?” To treat this fact as brute and inexplicable is to revert to problematic Predicate Nominalism. So it seems the Conceptualist must say that the concept red applies to a and b, but not c, because a and b share a common feature, a feature c lacks. Otherwise, the application of red is unconstrained by the individuals to which it applies. But simply noting that a and b resemble each other isn’t going to help, because that just is the fact we originally sought to explain, put differently. The Conceptualist might now say that a and b share a property. But if this isn’t to amount to a restatement of the original datum, it must now be interpreted as the claim that some entity is in both a and b. That, of course, turns our supposed Conceptualist strategy back into Realism.

Critics say Conceptualism solves no problems on its own. In trying to ground our right to predicate the concept red of a and b, we are driven back to facts about a and b themselves, and that leaves Conceptualism as an unstable position. It teeters back and forth between Realism, on the one hand, and Nominalism, on the other.

4. Concluding Thoughts

As with many issues in philosophy, we started with a fairly simple question and found it difficult to reach a satisfactory answer. Qualitative similarity is a seemingly undeniable feature of our experience of the world. And there seems to be every reason to expect an explanation for this common fact. But upon closer inspection we find that we must either accept some rather unusual items into our world view, or go through some fairly elaborate theorizing to reach an answer. And that elaborate theorizing itself seems full of problems.

Perhaps this explains why the Problem of Universals has had such a hold on philosophers for all these years. We sense that there must be an adequate solution to be found, but our failure to find one prods our reason and imagination. Of course, we’ve only skimmed the surface of this debate in this essay, and nearly every move we’ve discussed has been debated, reformulated, argued for and against, analyzed, accepted as obviously true and rejected as obviously false. A consensus does seem to be emerging though, as one of the main contributors to the debate in recent decades has articulated, that two genuine contenders are left: Strong Realism and Trope Nominalism. As always, there is much work to be done on this issue, despite its distinguished heritage. We hope this introduction to the problem has inspired you to seek a new path, to find a flaw in our reasoning, to note what hasn’t been noted before. You might turn out to be the next Plato.

5. References and Further Reading

  • Armstrong, D.M. Universals: An Opinionated Introduction (Boulder: Westview Press, 1989).
    • An excellent survey of nearly every position in the debate over universals, by one of the most important contributors to this century’s version of the debate.
  • Armstrong, D.M. What is a Law of Nature? (Cambridge: Cambridge University Press, 1983).
    • An overview of the debate over the laws of nature, with a defense of universals as the required elements in an adequate account.
  • Campbell, K. Abstract Particulars (Oxford: Basil Blackwell Ltd., 1990).
    • An important introduction to the theory of tropes, showing the versatility and potential of this metaphysical category.
  • Loux, M. Metaphysics: A Contemporary Introduction (London: Routledge, 1998).
    • Covers foundational debates on a number of areas, with particular attention to the Problem of Universals.
  • Simons, P. “Particulars in Particular Clothing: Three Trope Theories of Substance,” Philosophy and Phenomenological Research 54 (1994), pp. 553-75.
    • A sophisticated exploration of various trope theories with important proposals for advancing this theory. Reveals the potential power of this position as an alternative to Realism.
  • Spade, P.V. (trans.) Five Texts on the Mediaeval Problem of Universals (Indianapolis: Hackett Publishing Co., 1994).
    • Indispensable collection of important Medieval texts with useful guides and comments.
  • Vlastos, G., “The Third Man Argument in the Parmenides,” Philosophical Review 63 (1954), pp. 319-49.
    • A landmark article on Plato’s Third Man Argument, one that rekindled widespread interest in Plato’s metaphysics.

Author Information

Mary C. MacLeod
Email: mmacleod@iup.edu
Indiana University of Pennsylvania
U. S. A.

and

Eric M. Rubenstein
Email: erubenst@iup.edu
Indiana University of Pennsylvania
U. S. A.

Johann Gottlieb Fichte (1762-1814)

Johann Gottlieb Fichte is one of the major figures in German philosophy in the period between Kant and Hegel. Initially considered one of Kant’s most talented followers, Fichte developed his own system of transcendental philosophy, the so-called Wissenschaftslehre. Through technical philosophical works and popular writings Fichte exercised great influence over his contemporaries, especially during his years at the University of Jena. His influence waned towards the end of his life, and Hegel’s subsequent dominance relegated Fichte to the status of a transitional figure whose thought helped to explain the development of German idealism from Kant’s Critical philosophy to Hegel’s philosophy of Spirit. Today, however, Fichte is more correctly seen as an important philosopher in his own right, as a thinker who carried on the tradition of German idealism in a highly original form.

Table of Contents

  1. Fichte’s Beginnings (1762-1794)
    1. Early Life
    2. Fichte’s Sudden Rise to Prominence
  2. The Jena Period (1794-1799)
    1. Fichte’s Philosophical Vocation
    2. Fichte’s System, the Wissenschaftslehre
    3. Background to the Wissenschaftslehre
    4. Foundations of the Entire Wissenschaftslehre
    5. Working Out the Wissenschaftslehre and the End of the Jena Period
  3. The Berlin Period (1800-1814)
    1. The Eclipse of Fichte’s Career
    2. Popular Writings from the Berlin Period
    3. Fichte’s Return to the University and his Final Years
  4. Conclusion
  5. References and Further Reading
    1. Fichte’s Writings in German
    2. Fichte’s Writings in English Translation
    3. Other Philosophers’ Writings in English Translation
    4. Suggested Secondary Literature in English, French, and German

1. Fichte’s Beginnings (1762-1794)

a. Early Life

Fichte was born on May 19, 1762 to a family of ribbon makers. Early in life he impressed everyone with his great intelligence, but his parents were too poor to pay for his schooling. Through the patronage of a local nobleman, he was able to attend the Pforta school, which prepared students for a university education, and then the universities of Jena and Leipzig. Unfortunately, little is known about this period of Fichte’s life, but we do know that he intended to obtain a degree in theology, and that he had to break off his studies for financial reasons around 1784, without obtaining a degree of any sort. Several years of earning his living as an itinerant tutor ensued, during which time he met Johanna Rahn, his future wife, while living in Zurich.

In the summer of 1790, while living in Leipzig and once again in financial distress, Fichte agreed to tutor a university student in the Kantian philosophy, about which he knew very little at the time. His immersion in Kant’s writings, according to his own testimony, revolutionized his thinking and changed his life, turning him away from a deterministic view of the world at odds with human freedom towards the doctrines of the Critical philosophy and its reconciliation of freedom and determinism.

b. Fichte’s Sudden Rise to Prominence

More wandering and frustration followed. Fichte decided to travel to Königsberg to meet Kant himself, and on July 4, 1791 the disciple had his first interview with the master. Unfortunately for Fichte, things did not go well, and Kant was not especially impressed by his visitor. In order to prove his expertise in the Critical philosophy, Fichte quickly composed a manuscript on the relation of the Critical philosophy to the question of divine revelation, an issue that Kant had yet to address in print. This time, Kant was justifiably impressed by the results and arranged for his own publisher to bring out the work, which appeared in 1792 under the title An Attempt at a Critique of all Revelation.

In this fledgling effort Fichte adhered to many of Kant’s claims about morality and religion by thoughtfully extending them to the concept of revelation. In particular, he took over Kant’s idea that all religious belief must ultimately withstand critical scrutiny if it is to make a legitimate claim on us. For Fichte, any alleged revelation of God’s activity in the world must pass a moral test: namely, no immoral command or action, i.e., nothing that violates the moral law, can be attributed to Him. Although Fichte himself did not explicitly criticize Christianity by appealing to this test, such a restriction on the content of a possible revelation, if consistently imposed, would overturn some aspects of orthodox Christian belief, including, for example, the doctrine of original sin, which states that everyone is born guilty as a result of Adam and Eve’s disobedience in the Garden of Eden. This element of Christian theology, which is said to be grounded in the revelations contained in the Bible, is hardly compatible with the view of justice underwritten by the moral law. Attentive readers should have instantly gleaned Fichte’s radical views from the placid Kantian prose.

For reasons that are still mysterious, Fichte’s name and preface were omitted from the first edition of An Attempt at a Critique of all Revelation, and thus the book, which displayed an extensive and subtle appreciation of Kant’s thought, was taken to be the work of Kant himself. Once it became known that Fichte was the author, he instantly became a philosophical figure of importance; no one whose work had been mistaken for Kant’s, however briefly, could be rightfully denied fame and celebrity in the German philosophical world.

Fichte continued working as a tutor while attempting to fashion his philosophical insights into a system of his own. He also anonymously published two political works, “Reclamation of the Freedom of Thought from the Princes of Europe, Who Have Oppressed It Until Now” and Contribution to the Rectification of the Public’s Judgment of the French Revolution. It became widely known that he was their author; consequently, from the very beginning of his public career, he was identified with radical causes and views.

In October 1793 he married his fiancée, and shortly thereafter unexpectedly received a call from the University of Jena to take over the chair in philosophy that Karl Leonhard Reinhold (1757-1823), a well-known exponent and interpreter of the Kantian philosophy, had recently vacated. Fichte arrived in Jena in May 1794.

2. The Jena Period (1794-1799)

a. Fichte’s Philosophical Vocation

In his years at Jena, which lasted until 1799, Fichte published the works that established his reputation as one of the major figures in the German philosophical tradition. Fichte never exclusively saw himself as an academic philosopher addressing the typical audience of fellow philosophers, university colleagues, and students. Instead, he considered himself a scholar with a wider role to play beyond the confines of academia, a view eloquently expressed in “Some Lectures Concerning the Scholar’s Vocation,” which were delivered to an overflowing lecture hall shortly after his much anticipated arrival in Jena. One of the tasks of philosophy, according to these lectures, is to offer rational guidance towards the ends that are most appropriate for a free and harmonious society. The particular role of the scholar — that is, of individuals such as Fichte himself, regardless of their particular academic discipline — is to be a teacher of mankind and a superintendent of its never-ending progress towards perfection.

Throughout his career Fichte alternated between composing, on the one hand, philosophical works for scholars and students of philosophy and, on the other hand, popular works for the general public. This desire to communicate to the wider public — to bridge the gap, so to speak, between theory and praxis — inspired his writings from the start. In fact, Fichte’s passion for the education of society as a whole should be seen as a necessary consequence of his philosophical system, which continues the Kantian tradition of placing philosophy in the service of enlightenment, i.e., the eventual liberation of mankind from its self-imposed immaturity. To become mature, according to Kant’s way of thinking, which Fichte had adopted, is to overcome our willing refusal to think for ourselves, and thus to accept responsibility for failing to think and act independently of the guidance of external authority.

b. Fichte’s System, the Wissenschaftslehre

Fichte called his philosophical system the Wissenschaftslehre. The usual English translations of this term, such as “science of knowledge,” “doctrine of science,” or “theory of science,” can be misleading, since today these phrases carry connotations that can be excessively theoretical or too reminiscent of the natural sciences. Therefore, many English-language commentators and translators prefer to use the German term as the untranslated proper name that designates Fichte’s system as a whole.

Another potential source of confusion is that Fichte’s book from 1794/95, whose full title is Foundations of the Entire Wissenschaftslehre, is sometimes simply referred to as the Wissenschaftslehre. Strictly speaking, this is incorrect, since this work, as its title indicates, was meant as the foundations of the system as a whole; the other parts of the system were to be written afterwards. Much of Fichte’s work in the remainder of the Jena period attempted to complete the system as it was envisioned in the 1794/95 Foundations.

c. Background to the Wissenschaftslehre

Before moving to Jena, and while he was living in the house of his father-in-law in Zurich, Fichte wrote two short works that presaged much of the Wissenschaftslehre that he devoted the rest of his life to developing. The first of these was a review of a skeptical critique of Kantian philosophy in general and Reinhold’s so-called Elementarphilosophie (“Elementary Philosophy”) in particular. The work under review, an anonymously published polemic called Aenesidemus, which was later discovered to have been written by Gottlob Ernst Schulze (1761-1833), and which appeared in 1792, greatly influenced Fichte, causing him to revise many of his views, but did not lead him to abandon Reinhold’s concept of philosophy as rigorous science, an interpretation of the nature of philosophy that demanded that philosophical principles be systematically derived from a single foundational principle known with certainty.

Reinhold had argued that this first principle was what he called the “principle of consciousness,” namely, the proposition that “in consciousness representation is distinguished through the subject from both object and subject and is related to both.” From this principle Reinhold attempted to deduce the contents of Kant’s Critical philosophy. He claimed that the principle of consciousness was a reflectively known fact of consciousness, and argued that it could lend credence to various Kantian views, including the distinction between the faculties of sensibility and understanding and the existence of things in themselves. Schulze responded by offering skeptical objections against the legitimacy of Kant’s (and thus Reinhold’s) concept of the thing in itself (construed as the causal origin of our representations) and by arguing that the principle of consciousness was neither a fundamental principle (since it was subject to the laws of logic, in that it had to be free of contradiction) nor one known with certainty (since it originated in merely empirical reflection on the contents of consciousness, which reflection Schulze, following David Hume, persuasively argued could not yield a principle grounded on indubitable evidence).

Fichte, to his consternation, found himself in agreement with much of Schulze’s critique. Although he was still eager to support the Kantian system, Fichte, as a result of reading Schulze, came to the conclusion that the Critical philosophy needed new foundations. Yet the search for new foundations, in Fichte’s mind, was never equivalent to a repudiation of the Kantian philosophy. As Fichte would frequently claim, he remained true to the spirit, if not the letter, of Kant’s thought. His review of Schulze’s Aenesidemus provides one especially tantalizing hint about how he would subsequently attempt to remain within the spirit of Kant’s thought while attempting to reconstruct it from the ground up: philosophy, he says, must begin with a first principle, as Reinhold maintained, but not with one that expresses a mere fact, a Tatsache; instead, Fichte countered, it must begin with a fact/act, a Tathandlung, that is not known empirically, but rather with self-evident certainty. The meaning and purpose of this new first principle would not become clear to his readers until the publication of the 1794/95 Foundations.

In addition to his review of the Schulze book, and still prior to his arrival in Jena, Fichte sketched out the nature and methodology of the Wissenschaftslehre in an essay entitled “Concerning the Concept of the Wissenschaftslehre,” which was intended to prepare his expectant audience for his classes and lectures. Here Fichte sets out his conception of philosophy as the science of science, i.e., as Wissenschaftslehre. The Wissenschaftslehre is devoted to establishing the foundation of individual sciences such as geometry, whose first principle is said to be the task of limiting space in accordance with a rule. Thus the Wissenschaftslehre seeks to justify the cognitive task of the science of geometry, i.e., its systematic efforts at spatial construction in the form of theorems validly deduced from axioms known with self-evident certainty. The Wissenschaftslehre, which itself is a science in need of a first principle, is said to be grounded on the Tathandlung first mentioned in the Aenesidemus review. The precise nature of this fact/act, with which the Wissenschaftslehre is supposed to begin, is much debated, even today. Yet it is the essential core of the Jena Wissenschaftslehre in general and the 1794/95 Foundations in particular.

d. Foundations of the Entire Wissenschaftslehre

In the 1794/95 Foundations Fichte expresses the content of the Tathandlung in its most general form as “the I posits itself absolutely.” Fichte is suggesting that the self, which he typically refers to as “the I,” is not a static thing with fixed properties, but rather a self-producing process. Yet if it is a self-producing process, then it also seems that it must be free, since in some as yet unspecified fashion it owes its existence to nothing but itself. This admittedly obscure starting point is subject to much scrutiny and qualification as the Wissenschaftslehre proceeds. In more modern language, and as a first approximation of its meaning, we can understand the Tathandlung as expressing the concept of a rational agent that constantly interprets itself in light of normative standards that it imposes on itself, in both the theoretical and practical realms, in its efforts to determine what it ought to believe and how it ought to act. (Fichte’s indebtedness to the Kantian notion of autonomy in the form of self-imposed lawfulness should be obvious to anyone familiar with the Critical philosophy.)

Unfortunately, given the difficulty of the notion, Fichte’s Tathandlung has perplexed his readers from its first appearance. The principle of the self-positing I was initially interpreted along the lines of Berkeley’s idealism, and thus as claiming that the world as a whole is somehow the product of an infinite mind. This interpretation is surely mistaken, even though one can find passages that seem to support it. More important, though, is the question of the epistemic status of the principle. Is it known with the self-evident certainty that Fichte, following Reinhold, claims must ground any attempt at systematic knowledge? Furthermore, how does it serve as a basis for deducing the rest of the Wissenschaftslehre?

Fichte’s method is sometimes said to be phenomenological, restricting itself to what we can discover by means of reflection. Yet Fichte does not claim that we simply find the fully formed Tathandlung residing somewhere within us; instead, we construct it in order to explain ourselves to ourselves, to render intelligible to ourselves our normative nature as finite rational beings. Thus the requisite reflection is not empirical but transcendental, i.e., an experimental postulate adopted for philosophical purposes. That is, the principle is presupposed as true in order to make sense of the conditions for the possibility of our ordinary experience.

Such a method leaves open the possibility of other explanations of our experience. Fichte claims, however, that the alternatives can actually take only one form. Either, he says, we can begin (as he does) with the I as the ground of all possible experience, or we can begin with the thing in itself outside of our experience. This dilemma involves, as he puts it, choosing between idealism and dogmatism. The former is transcendental philosophy; the latter, a naturalistic approach to experience that explains it solely in causal terms. As Fichte famously said in the first introduction to the Wissenschaftslehre from 1797, the choice between the two depends on the kind of person one is, because they are said to be mutually exclusive yet equally possible approaches.

If, however, such a choice between starting points is possible, then the principle of the self-positing I lacks the self-evident certainty that Fichte attributed to it in his earlier essay on the concept of the Wissenschaftslehre. There are, in fact, those who do not find it at all self-evident, namely, the dogmatists. Fichte clearly thinks that they are mistaken in their dogmatism, yet he offers no direct refutation of their position, claiming only that they cannot demonstrate what they hope to demonstrate, namely, that the ground of all experience lies solely in objects existing independently of the I. The dogmatist position, Fichte implies, ignores the normative aspects of our experience, e.g., warranted and unwarranted belief, correct and incorrect action, and thus attempts to account for our experience entirely in terms of our causal interaction with the world around us. Presumably, however, those who begin with a disavowal of normativity — as the dogmatists do, because they are that kind of person — can never be brought to agree with the idealists. There is thus an argumentative impasse between the two camps.

Fichte’s remarks about systematic form and certainty in “Concerning the Concept of the Wissenschaftslehre” give the impression that he intends to demonstrate the entirety of the Wissenschaftslehre from the principle of the self-positing I through a chain of logical inferences that merely set out the implications of the initial principle in such a way that the certainty of the first principle is transferred to the claims inferred from it. (The method of Spinoza’s Ethics comes to mind, but this time with only a single premise from which to begin the proofs.) Yet this hardly seems to be Fichte’s actual method, since he constantly introduces new concepts that cannot be plausibly interpreted as the logical consequences of the previous ones. In other words, the deductions in the Foundations of the Entire Wissenschaftslehre are more than merely analytical explications of the consequences of the original premise. Instead, they both articulate and refine the initial principle of the self-positing I in accordance with the demands made on the idealist who is attempting to clarify the nature of the self-positing I by means of reflection.

After Fichte postulates the self-positing I as the explanatory ground of all experience, he then begins to complicate the web of concepts required to make sense of this initial postulate, thereby carrying out the aforementioned construction of the self-positing I. The I posits itself insofar as it is aware of itself, not only as an object but also as a subject, and finds itself subject to normative constraints in both the theoretical and practical realms, e.g., that it must be free of contradiction and that there must be adequate reasons for what it believes and does. Furthermore, the I posits itself as free, since these constraints are ones that it imposes on itself. Next, by means of further reflection, the I becomes aware of a difference between “representations accompanied by a feeling of necessity” and “representations accompanied by a feeling of freedom,” that is, a difference between representations of what purports to be an objective world existing apart from our representations of it and representations that are merely the product of our own mental activity. To recognize this distinction in our representations, however, is to posit a distinction between the I and the not-I, i.e., the self and whatever exists independently of it. In other words, the I comes to posit itself as limited by something other than itself, even though it initially posits itself as free, for in the course of reflecting on its own nature the I discovers limitations on its activity.

Our understanding of the nature of this limitation is made increasingly complex through further acts of reflection. First, the I posits a check, an Anstoß, on its theoretical and practical activity, in that it encounters resistance whenever it thinks or acts. This check is then developed into more refined forms of limitation: sensations, intuitions, and concepts, all united in the experience of the things of the natural world, i.e., the spatio-temporal realm ruled by causal laws. Moreover, this world is found to contain other finite rational beings. They too are free yet limited, and the recognition of their freedom places further constraints on our activity. In this way the I posits the moral law and restricts its treatment of others to actions that are consistent with respect for their freedom. Thus, by the end of Fichte’s deductions, the I posits itself as free yet limited by natural necessity and the moral law: its freedom becomes an infinite task in which it seeks to make the world conform to its normative standards, but only by doing so in an appropriately moral fashion that allows other free beings to do the same for themselves.

e. Working Out the Wissenschaftslehre and the End of the Jena Period

Fichte’s writings during the rest of the Jena period attempt to fill out and refine the entire system. The Foundations of Natural Right Based on the Wissenschaftslehre (1796/97) and The System of Ethical Theory Based on the Wissenschaftslehre (1798) concern themselves with political philosophy and moral philosophy, respectively. The task of the former work is to characterize the legitimate constraints that can be placed on individual freedom in order to produce a community of maximally free individuals who simultaneously respect the freedom of others. The task of the latter work is to characterize the specific duties of rational agents who freely produce objects and actions in the pursuit of their goals. These duties follow from our general obligation to determine ourselves freely, i.e., from the categorical imperative.

Besides filling out projected portions of the system, Fichte also began to revise the foundations themselves. Since he considered the mode of presentation of the Foundations of the Entire Wissenschaftslehre unsatisfactory, he began drawing up a new version in his lectures, which were given three times between 1796 and 1799, but which he never managed to publish. These lectures, which in some respects are superior to the Foundations of the Entire Wissenschaftslehre, were published posthumously and are now known as the Wissenschaftslehre nova methodo.

Prior to publishing any systematic presentation of his philosophy of religion, Fichte became embroiled in what is now known as the Atheismusstreit, the atheism controversy. In an essay from 1798 entitled “On the Basis of Our Belief in a Divine Governance of the World” Fichte argued that religious belief could be legitimate only insofar as it arose from properly moral considerations — a view clearly indebted to his book on revelation from 1792. Furthermore, he claimed that God has no existence apart from the moral world order. Because neither view was orthodox at the time, Fichte was accused of atheism and ultimately forced to leave Jena.

Two open letters, both from 1799 and written by philosophers whom Fichte fervently admired, compounded his troubles. First, Kant disavowed the Wissenschaftslehre for mistakenly having tried to infer substantive philosophical knowledge from logic alone. Such an inference, he claimed, was impossible, since logic abstracted from the content of knowledge and thus could not produce a new object of knowledge. Second, Friedrich Heinrich Jacobi accused the Wissenschaftslehre of nihilism: that is, of producing reality out of mere mental representations, and thus in effect from nothingness. Whether or not these criticisms were just (and Fichte certainly denied that they were), they further damaged Fichte’s philosophical reputation.

3. The Berlin Period (1800-1814)

a. The Eclipse of Fichte’s Career

In 1800 Fichte settled in Berlin and continued to philosophize. He was no longer a professor, because there was no university in Berlin at the time of his arrival. To earn a living, he published new works and gave private lectures. The Berlin years, while productive, represent a decline in Fichte’s fortunes, since he never regained the degree of influence among philosophers that he had enjoyed during the Jena years, although he remained a popular author among non-philosophers. His first major Berlin publication was a popular presentation of the Wissenschaftslehre designed to answer his critics on the question of atheism. Known as The Vocation of Man, it appeared in 1800 and is probably Fichte’s greatest literary production. (It seems, although this is never explicitly stated anywhere in the book, that much of it was inspired by the personally stinging critique of Jacobi’s open letter.)

Fichte continued to revise the Wissenschaftslehre, yet he published very little of the material developed in these renewed efforts to perfect his system, mostly because he feared being misunderstood as he had been during the Jena years. His reluctance to publish gave his contemporaries the false impression that he was more or less finished as an original philosopher. Except for a cryptic outline that appeared in 1810, his Berlin lectures on the Wissenschaftslehre, of which there are numerous versions, only appeared posthumously. In these manuscripts Fichte typically speaks of the absolute and its appearances, i.e., a philosophically suitable stand-in for a more traditional notion of God and the community of finite rational beings whose existence is grounded in the absolute. As a result, Fichte is sometimes said to have taken a religious turn in the Berlin period.

b. Popular Writings from the Berlin Period

In 1806 Fichte published two lecture series that were well received by his contemporaries. The first, The Characteristics of the Present Age, employs the Wissenschaftslehre for the purposes of the philosophy of history. According to Fichte, there are five stages of history in which the human race progresses from the rule of instinct to the rule of reason. The present age, he says, is the third age, an epoch of liberation from instinct and external authority, out of which humanity will ultimately progress until it makes itself and the world it inhabits into a fully self-conscious representative of the life of reason. The second, The Way Towards the Blessed Life, which is sometimes said to be a mystical work, treats of morality and religion in a popular format.

Another famous series of lectures, Addresses to the German Nation, given in 1808 during the French occupation, was intended as a continuation of The Characteristics of the Present Age, but exclusively for a German audience. Here Fichte envisions a new form of national education that would enable the German nation, not yet in existence, to reach the fifth and final age outlined in the earlier lecture series. Once again, Fichte demonstrated his interest in larger matters, and in a manner perfectly consistent with his earlier insistence from the Jena period that the scholar has a cultural role to play.

c. Fichte’s Return to the University and his Final Years

When the newly founded Prussian university in Berlin opened in 1810, Fichte was made the head of the philosophy faculty; in 1811 he was elected the first rector of the university. He continued his philosophical work until the very end of his life, lecturing on the Wissenschaftslehre and writing on political philosophy and other subjects. When the War of Liberation broke out in 1813, Fichte canceled his lectures and joined the militia. His wife Johanna, who was serving as a volunteer nurse in a military hospital, contracted a life-threatening fever. She recovered, but Fichte fell ill with the same ailment. He died on January 29, 1814.

4. Conclusion

Although Fichte’s importance for the history of German philosophy is undisputed, the nature of his legacy is still very much debated. He has sometimes been seen as a mere transitional figure between Kant and Hegel, as little more than a philosophical stepping stone along Spirit’s path to absolute knowledge. This understanding of Fichte was encouraged by Hegel himself, and no doubt for self-serving reasons. Nowadays, however, Fichte is studied more and more for his own sake, in particular for his theory of subjectivity, i.e., the theory of the self-positing I, which is rightly seen as a sophisticated elaboration of Kant’s claim that finite rational beings are to be interpreted in theoretical and practical terms. The level of detail that Fichte provides on these matters exceeds that found in Kant’s writings. This fact alone would make Fichte’s work worthy of our attention. Yet perhaps the most persuasive testament to Fichte’s greatness as a philosopher is to be found in his relentless willingness to begin again, to start the Wissenschaftslehre anew, and never to rest content with any prior formulation of his thought. Although this leaves his readers perpetually dissatisfied and desirous of a definitive statement of his views, Fichte, true to his publicly declared vocation, makes them into better philosophers through his own example of restless striving for the truth.

5. Suggestions for Further Reading

a. Fichte’s Writings in German

  • Gesamtausgabe der Bayerischen Akademie der Wissenschaften. Ed. R. Lauth, H. Jacobs, and H. Gliwitzky. Stuttgart-Bad Cannstatt: Frommann, 1964ff.
  • Fichtes Werke, 11 vols. Ed. Immanuel Hermann Fichte. Berlin: Walter de Gruyter & Co., 1971.
    • Reprint of the 19th century edition of Fichte’s writings.

b. Fichte’s Writings in English Translation

(Publication dates during Fichte’s lifetime are given in brackets.)

  • Fichte: Early Philosophical Writings [1790-1799]. Trans. and ed. Daniel Breazeale. Ithaca: Cornell University Press, 1988.
    • Includes “Review of Aenesidemus,” “Concerning the Concept of the Wissenschaftslehre,” and “Some Lectures Concerning the Scholar’s Vocation.”
  • Attempt at a Critique of All Revelation [1st ed. 1792, 2nd ed. 1793]. Trans. Garrett Green. Cambridge: Cambridge University Press, 1978.
  • “Reclamation of the Freedom of Thought from the Princes of Europe, Who Have Oppressed It Until Now” [1793]. Trans. Thomas E. Wartenberg. In What is Enlightenment? Eighteenth-Century Answers and Twentieth-Century Questions, ed. James Schmidt. Berkeley: University of California Press, 1996.
  • “On the Spirit and the Letter in Philosophy” [1794]. Trans. Elizabeth Rubenstein. In German Aesthetic and Literary Criticism: Kant, Fichte, Schelling, Schopenhauer, Hegel, ed. David Simpson. Cambridge: Cambridge University Press, 1984.
  • Foundations of the Entire Science of Knowledge [1794/95]. In The Science of Knowledge, trans. and ed. Peter Heath and John Lachs. Cambridge: Cambridge University Press, 1982.
    • Also includes the two introductions to the Wissenschaftslehre from 1797.
  • “On the Linguistic Capacity and the Origin of Language” [1795]. In Language and German Idealism: Fichte’s Linguistic Philosophy, trans. and ed. Jere Paul Surber. Atlantic Highlands, NJ: Humanities Press, 1996.
  • Foundations of Transcendental Philosophy (Wissenschaftslehre) nova methodo (1796/99). Trans. and ed. Daniel Breazeale. Ithaca: Cornell University Press, 1992.
    • Posthumously published lectures given between 1796 and 1799.
  • Foundations of Natural Right [1796/97]. Trans. Michael Baur, ed. Frederick Neuhouser. Cambridge: Cambridge University Press, 2000.
  • Introductions to the Wissenschaftslehre and Other Writings [1797-1800]. Trans. and ed. Daniel Breazeale. Indianapolis: Hackett Publishing Company, 1994.
    • Includes the two introductions to the Wissenschaftslehre from 1797 as well as “On the Basis of Our Belief in a Divine Governance of the World” from 1798.
  • The Science of Ethics as Based on the Science of Knowledge [1798]. Trans. A. E. Kroeger. London: Kegan Paul, 1897.
    • German title would be better translated as The System of Ethical Theory Based on the Wissenschaftslehre. An unreliable translation.
  • The Vocation of Man [1800]. Trans. Peter Preuss. Indianapolis: Hackett Publishing Company, 1987.
  • “A Crystal Clear Report to the General Public Concerning the Actual Essence of the Newest Philosophy: An Attempt to Force the Reader to Understand” [1801]. Trans. John Botterman and William Rasch. In Philosophy of German Idealism, ed. Ernst Behler. New York: Continuum, 1987.
  • The Characteristics of the Present Age and The Way Towards the Blessed Life [1806]. In The Popular Works of Johann Gottlieb Fichte, 2 vols., trans. and ed. William Smith. London: Chapman, 1848/49. Reprint: London: Thoemmes Press, 1999.
  • Addresses to the German Nation [1808]. Trans. R. F. Jones and G. H. Turnbull. Chicago: Open Court, 1922. Reprint: Westport, CT: Greenwood Press, Inc., 1979.
  • “The Science of Knowledge in its General Outline” [1810]. Trans. Walter E. Wright. Idealistic Studies 6 (1976): 106-117.

c. Other Philosophers’ Writings in English Translation

  • Di Giovanni, George and H. S. Harris, eds. Between Kant and Hegel: Texts in the Development of Post-Kantian Idealism. Albany: State University of New York Press, 1985. Revised edition: Indianapolis, Indiana: Hackett Publishing Company, Inc., 2000.
    • Includes excerpts from Reinhold’s The Foundation of Philosophical Knowledge and Schulze’s Aenesidemus.
  • Jacobi, Friedrich Heinrich. The Main Philosophical Writings and the Novel Allwill. Trans. and ed. George di Giovanni. Montreal: McGill-Queen’s University Press, 1994.
    • Includes Jacobi to Fichte.

d. Suggested Secondary Literature in English, French, and German

  • Baumanns, Peter. J. G. Fichte: Kritische Gesamtdarstellung seiner Philosophie. Freiburg/München: Verlag Karl Alber, 1990.
  • Beiser, Frederick C. German Idealism: The Struggle Against Subjectivism, 1781-1801. Cambridge, Massachusetts: Harvard University Press, 2002.
    • Part II interprets the Wissenschaftslehre from the point of view of Fichte’s critique of subjectivism.
  • Bowman, Curtis. “Johann Gottlieb Fichte: Foundations of the Entire Science of Knowledge.” In Central Works of Philosophy (Volume 3: The Nineteenth Century), ed. John Shand. Chesham: Acumen Publishing Limited, 2005.
    • An interpretation of Fichte’s best known book, suitable for first-time readers.
  • Breazeale, Daniel. “Fichte and Schelling: The Jena Period.” In The Age of German Idealism (Routledge History of Philosophy, Volume VI), ed. Robert C. Solomon and Kathleen M. Higgins. London: Routledge, 1993.
  • Breazeale, Daniel. “Fichte, Johann Gottlieb.” In Routledge Encyclopedia of Philosophy, vol. 3. London: Routledge, 1998.
  • Breazeale, Daniel and Tom Rockmore, eds. Fichte: Historical Contexts/Contemporary Controversies. Atlantic Highlands, New Jersey: Humanities Press, 1994.
  • Breazeale, Daniel. New Essays in Fichte’s Foundation of the Entire Doctrine of Scientific Knowledge. Amherst, New York: Humanity Books, 2001.
  • Breazeale, Daniel. New Essays on Fichte’s Later Jena Wissenschaftslehre. Evanston, Illinois: Northwestern University Press, 2002.
  • Breazeale, Daniel. New Perspectives on Fichte. New Jersey: Humanities Press, 1996.
  • Henrich, Dieter. “Fichte’s Original Insight.” Trans. David Lachterman. Contemporary German Philosophy 1 (1982): 15-53.
  • Jacobs, Wilhelm G. Johann Gottlieb Fichte. Reinbek bei Hamburg: Rowohlt, 1984.
    • A brief illustrated biography.
  • La Vopa, Anthony J. Fichte: The Self and the Calling of Philosophy, 1762-1799. Cambridge: Cambridge University Press, 2001.
    • Intellectual biography of Fichte’s early life and the Jena period.
  • Martin, Wayne. Idealism and Objectivity: Understanding Fichte’s Jena Project. Stanford: Stanford University Press, 1997.
  • Neuhouser, Frederick. Fichte’s Theory of Subjectivity. Cambridge: Cambridge University Press, 1990.
  • Philonenko, Alexis. L’œuvre de Fichte. Paris: Librairie Philosophique J. Vrin, 1984.
  • Pinkard, Terry. German Philosophy, 1760-1860: The Legacy of Idealism. Cambridge: Cambridge University Press, 2002.
    • Chapter 5 is devoted to Fichte.
  • Rohs, Peter. Johann Gottlieb Fichte. Munich: C. H. Beck, 1991.
  • Seidel, George. Fichte’s Wissenschaftslehre of 1794: A Commentary on Part I. West Lafayette, Indiana: Purdue University Press, 1993.
  • Zöller, Günter. Fichte’s Transcendental Philosophy: The Original Duplicity of Intelligence and Will. Cambridge: Cambridge University Press, 1998.

Author Information

Curtis Bowman
Email: cbhome@earthlink.net
U. S. A.

Introspection

Introspection is the process by which someone comes to form beliefs about her own mental states. We might form the belief that someone else is happy on the basis of perception – for example, by perceiving her behavior. But a person typically does not have to observe her own behavior in order to determine whether she is happy. Rather, one makes this determination by introspecting.

When compared to other beliefs that we have, the beliefs that we acquire through introspection seem epistemically special. What exactly this amounts to is discussed in the first part of this essay. The second part addresses the nature of introspection. Though the term “introspection” literally means “looking within” (from the Latin “specere” meaning “to look” and “intro” meaning “within”), whether introspecting should be treated analogously to looking – that is, whether introspection is a form of inner perception – is debatable. Philosophers have offered both observational and non-observational accounts of introspection. Following the discussion of these various issues about the epistemology and nature of introspection, the third section of this essay addresses an important use to which introspection has been put in philosophical discussions, namely, to draw metaphysical conclusions about the nature of mind.

Table of Contents

  1. The Epistemic “Specialness” of Introspection
    a. Infallibility
    b. Self-intimation
    c. Self-warrant
    d. Immediacy
  2. The Nature of Introspection
    a. Observational Models
    b. Non-Observational Models
    c. Skepticism about Introspection
  3. Introspection and the Nature of Mind
    a. Introspectibility as a Mark of the Mental
    b. Introspective Arguments for Dualism
  4. References and Further Reading

1. The Epistemic “Specialness” of Introspection

We form beliefs about our own mental states by introspection. How exactly introspection works will be discussed in the next section. But however it works, philosophers have long taken note of the fact that each individual’s introspective capacity seems to place her in a unique position to form beliefs, and gain knowledge, of her own mental states. An individual’s introspective beliefs about her own mental states seem in some way more secure than her beliefs about the external world, including her beliefs about the mental states of other people. Correspondingly, her introspective beliefs about her own mental states seem more secure than the beliefs that anyone else could form about her mental states. In these ways, there seems to be something epistemically special about the beliefs that we form on the basis of introspection. Typically, this specialness has been referred to as the privileged access that we have to our own mental states.

To say that an individual has privileged access to her own mental states is to say that she is in a better position than anyone else to acquire knowledge (or perhaps, justified beliefs) about them. But what exactly does privileged access amount to? This section discusses four of the numerous claims that philosophers have made in this regard. (See Alston 1971 for a particularly comprehensive discussion of these and similar claims.)

a. Infallibility

In the Meditations on First Philosophy, Descartes worries that he may be deceived by an evil demon. As a result, all of his beliefs about the external world may well be false. But however powerful the demon may be, Descartes claims that it cannot deceive him about the contents of his own mind. Though it might not be true that he is seeing, hearing and feeling what he thinks he is, it is nonetheless true, he says, that “I certainly seem to see, to hear, and to be warmed. This cannot be false.” (Descartes 1641/1986)

This passage has been commonly interpreted in terms of infallibility. As such, it gives us one of the strongest claims that philosophers have made about the epistemic specialness of our self-knowledge: One cannot have a false belief about one’s own mental states. In this way, I am in a privileged position to make judgments about my mental states, since other people can have false beliefs about my mental states. But, necessarily, if I believe that I am in a particular mental state, then I am in that mental state.

Before discussing this thesis, it is worth noting that there has been some unfortunate terminological messiness in this area. Sometimes the terms “incorrigibility” or “indubitability” have been used as a synonym for what has just been referred to as “infallibility.” For example, when Armstrong (1963) asks whether introspective knowledge is incorrigible, he has in mind the claim that it is logically impossible for someone to be mistaken when she makes a sincere introspective report. He then explicitly uses the words “incorrigible” and “indubitable” interchangeably. (See also Shoemaker 1963, who uses the term “incorrigible” to refer to any sincere introspective report in which “it does not make sense to suppose, and nothing could be accepted as showing, that [the individual] is mistaken, i.e., that what he says is false.”) However, the terms “incorrigibility” and “indubitability” are also often distinguished from one another, and from “infallibility,” to pick out related, but different, kinds of epistemic specialness. On this usage, an individual’s introspective belief is said to be incorrigible when no one else can have grounds for correcting it; an individual’s introspective belief is said to be indubitable when she herself can have no grounds for rejecting it. (See Alston 1971 and Gallois 1996.) Note that these three kinds of epistemic specialness can clearly come apart. For example, we can conceive (at least in principle) of cases in which an individual’s introspective report was false even though no one else had grounds for correcting it, or in which the individual herself has no grounds to reject it. It thus seems best to keep separate the terms “infallibility,” “incorrigibility,” and “indubitability.” This essay reserves the term “infallibility” for the claim discussed above that it is not possible for me to believe that I am in a given mental state unless I am in that mental state.

One further qualification is also needed. As stated above, the infallibility thesis concerns our self-knowledge generally, rather than just our introspective knowledge, and is thus overly broad. Suppose that in the course of a polite disagreement, a friend accuses me of being angry at her. In fact, she is lying to cover her own anger at me. But, because she is normally reliable, I might take her accusation at face value and become convinced that I am angry at her. This case, in which I have the belief that I am angry even though I am not, shows that we can have fallible self-knowledge. (See Gertler 2003b for some similar examples.) The case does not show, however, that we can have fallible introspective knowledge. In fact, one might suppose that my belief in the case above is mistaken precisely because it was not formed on the basis of introspection, but rather on the basis of my friend’s testimony. Proponents of infallibility undoubtedly intend the infallibility thesis to apply only to introspective knowledge and not to self-knowledge more generally. To make this clear, we can insert the following qualification in the statement of the infallibility claim: Necessarily, if I believe on the basis of introspection that I am in a particular mental state, then I am in that mental state.

Thus understood, the infallibility thesis enjoys some intuitive support, particularly when it comes to certain types of mental states like sensations. How can I be wrong that I am in pain right now? (See Shoemaker 1990 for an attempt to flesh out the inherent plausibility of the infallibility thesis.) Nonetheless, it is now almost uniformly rejected by philosophers and psychologists alike. Some obvious counterexamples come from our assessments of our emotional states and character traits. Individuals are notoriously poor judges of whether they are feeling jealous, for example. And of course there are widespread examples from literature and cinema where it is plain to everyone but the bickering hero and heroine themselves that, despite their protestations to the contrary, they are really in love.

Arguing against the infallibility thesis, Churchland (1988) suggests that we make mistakes in our introspective judgments because of expectation, presentation, and memory effects, three phenomena that are familiar from the case of perception. As an example where expectations come into play, he offers the case of a captured spy whose interrogators have repeatedly tortured him by briefly pressing a hot iron against his back. What would happen if, after 19 times with the hot iron, the torturers surreptitiously use an ice cube instead? Since the spy strongly expects to feel pain, Churchland suggests that the spy’s immediate reaction to the ice cube will not differ significantly from the reactions that he had to the hot iron, i.e., he will mistakenly think he is feeling pain. (See also Warner 1993.) Likewise, Churchland argues that when a sensation is presented to us for a very short duration of time, mistakes are not just likely but inevitable. Finally, he asks us to consider someone who suffered neural damage at a young age and has subsequently not felt pain or any other tactile sensation for 50 years. Then suppose that her neural deficits were somehow overcome. In such a situation, Churchland argues that it would be quite implausible to suppose that she would be able instantly and infallibly to discriminate and identify all of her newly regained sensations.

Churchland’s criticisms of the infallibility thesis in some ways echo worries raised by James almost a century earlier. As James noted, “Even the writers who insist upon the absolute veracity of our immediate inner apprehension of a conscious state have to contrast with this the fallibility of our memory or observation of it, a moment later.” He concludes that “introspection is difficult and fallible; and that the difficulty is simply that of all observation of whatever kind.” (James 1890/1950)

Another line of objection to the claim of infallibility derives from some remarks of Wittgenstein (1958). In the course of offering his private language argument, he worries about how an individual in isolation would be able to develop a language to refer to her own sensations. The problem is that in such cases there “is no criterion of correctness. One would like to say: whatever is going to seem right to me is right. And that only means that here we can’t talk about ‘right.’” Armstrong (1963) fleshes out the objection as follows (see also Wright 1989):

If introspective mistake is ruled out by logical necessity, then what sense can we attach to the notion of gaining knowledge by introspection? We can speak of gaining knowledge only in cases where it makes sense to speak of thinking wrongly that we have gained knowledge. In the words of the slogan: ‘If you can’t be wrong, then you can’t be right either.’ If failure is logically impossible, then talk of success is meaningless.

In the empirical domain, work in a variety of areas provides important evidence for the fallibility of introspection. Influential studies by Nisbett and Wilson (1977) suggest that we often misdescribe our own reasoning processes. In one study, subjects were presented with four pairs of stockings and asked to indicate which pair had the highest quality. The leftmost pair was preferred by a factor of almost four to one. However, unbeknownst to the subjects, all four pairs of stockings were identical. Though position effects were clearly playing a role in the subjects’ choice, none of them identified position when asked to explain their reasoning, and those who were asked explicitly whether position played any role in their reasoning process all denied it. The evidence from this and other studies thus suggests that people often form mistaken beliefs about what reasoning processes they are utilizing; as Nisbett and Wilson conclude, the evidence is “consistent with the most pessimistic view concerning people’s ability to report accurately about their cognitive processes.”

However interesting this result, Nisbett and Wilson’s work might not seem especially threatening to most proponents of infallibility, since it concerns introspective access only to higher-order reasoning processes, and in particular, the ability to recognize outside influences on those processes. But who would have ever thought that we were infallible with respect to that? In contrast, empirical work on “change blindness,” which calls into question our introspective access to our current perceptual states, seems to pose a deeper threat. According to work done by Kevin O’Regan (who works, ironically, at the Université René Descartes in France), subjects typically fail to notice even large changes to objects in their visual field, as long as the change occurs simultaneously with some other “disruption,” such as a blink or a mudsplash on a windshield. (See, e.g., O’Regan et al. 1999.)

One might try to qualify the infallibility thesis to address some of the above objections. For example, one might restrict the infallibility thesis only to those judgments that are made after careful reflection. Alternatively, one might restrict the infallibility thesis to a subclass of mental states. For example, Jackson (1973) defends a limited infallibility thesis, claiming that we are infallible only about our current phenomenal states. However, Schwitzgebel (2005) adduces numerous considerations to suggest that we should reject even these attenuated infallibility theses. According to Schwitzgebel, we are prone to gross error even in introspective judgments that are often taken to be epistemically the most secure, namely, those about currently ongoing visual experience. Though we typically assume that visual experience consists of a broad stable field with imprecision or haziness only at the borders, introspective experiments that force us to direct our attention away from the focal center reveal that a surprisingly small portion of one’s visual field has any real clarity and precision. (See also Dennett 1991.)

b. Self-intimation

Another account of our privileged access stems from the doctrine of self-intimation. A mental state is self-intimating if it is impossible for a person to be in that mental state and not know that she is in that mental state. This doctrine is sometimes referred to as omniscience (see Alston 1971); if whenever an individual is in a mental state she has knowledge of that mental state, then that individual is omniscient with respect to her own mental life. This doctrine is also sometimes referred to as the transparency thesis – the claim that whatever happens within a mind is completely transparent to it. (See Shoemaker 1990.) As such, the doctrine is closely associated with the Cartesian conception of the mind. But though Descartes himself seemed to endorse both infallibility and self-intimation, it is useful to note that they can come apart. An individual might be infallible about her mental states without the mental states being self-intimating; in such a case, whatever beliefs she has about her mental states will be true, but there may nonetheless be some mental states about which she has no beliefs. Likewise, even if mental states are self-intimating, we might still have false introspective beliefs. Self-intimation requires that whenever an individual is in a mental state she will form the belief that she is in that mental state, but it does not rule out her falsely forming the very same belief when she is not in that mental state.

Like the infallibility thesis, the self-intimation thesis enjoys some inherent plausibility. In fact, self-intimation may even seem to follow from the very notion of a mental state. If what it is for an individual to have a mental state is for her to be conscious of it, how could self-intimation be denied? Insofar as we think of the mental in terms of the conscious, and insofar as we think of being conscious of a mental state as being aware of it, the self-intimation thesis seems like a truism.

Unfortunately for the proponent of self-intimation, however, there are two obvious problems with this line of reasoning. First, as the work of Freud has suggested, we should not limit the mental to the conscious. Second, the claim that consciousness should be analyzed in terms of awareness is itself highly controversial. (See e.g., Armstrong 1981; Block 1995.)

This second point relates to Armstrong’s case (1981) of the distracted truck driver, which is often offered as an objection to the self-intimation thesis. When driving for long periods of time at night, a truck driver may suddenly “come to” and realize that he has been driving for quite some time without being aware of what he has been doing. Though the truck driver was clearly in a conscious state while he was driving (after all, he was engaging in a fairly sophisticated activity), he had no introspective awareness of that state.

The self-intimation thesis also falls victim to many of the same objections that plague the infallibility thesis. Just as we can have false beliefs about many of our mental states, we may also fail to form beliefs about many of our mental states. Even if the jealous lover does not falsely believe that she is not jealous, she might nonetheless fail to recognize her feelings of jealousy. In fact, the only way that we are able to explain much of human behavior is to assume that individuals often lack knowledge of their own mental states. Why do the hero and the heroine bicker so much, to return to an example from above? Presumably this occurs because they are unaware of their true feelings for one another.

The proponent of the self-intimation thesis may be able to sidestep some of these objections by limiting the scope of the thesis in an appropriate way. Chisholm (1981) offers a self-intimation thesis limited to conscious states about which an individual reflects, i.e., whenever an individual who is in a conscious state reflects on whether she is in such a state, she will form a justified belief that she is in such a state. In recent years Shoemaker has also championed a limited version of the self-intimation thesis: “it is implicit in the nature of certain mental states that any subject of such states that has the capacity to conceive of itself as having them will be aware of having them when it does, or at least will become aware of this under certain conditions (e.g. if it reflects on the matter).” (Shoemaker 1988; see also Shoemaker 1995.) The mental states that Shoemaker has in mind are beliefs and desires. Shoemaker argues for his version of the self-intimation thesis by invoking considerations of Moore’s Paradox. Named for G.E. Moore, the paradox concerns assertions of the form “P, but I don’t believe that P” (e.g. “It is raining but I don’t believe that it is raining.”) In short, Shoemaker argues that any rational individual who has the first-order belief P will be able to avoid holding Moore-paradoxical beliefs. Thus, assuming rationality, the mere possession of a belief is enough to ensure that an individual will believe that he has that belief. We will return to Shoemaker’s view in our discussion of the nature of introspection in Section 2.

c. Self-warrant

A third account of privileged access can be found in the notion of self-warrant. As Alston (1976) defines the notion, “a self-warranted belief enjoys an immunity from lack of justification; it cannot be the belief it is and fail to be justified.” If privileged access is to be understood in terms of self-warrant, then that would mean that whenever an individual has a belief about her own mental states, she is justified in holding that belief. As was the case with the infallibility claim, for this claim to be plausible it must presumably be limited to beliefs formed by introspection: if an individual believes on the basis of introspection that she is in a particular mental state, then her belief is justified.

Importantly, in contrast to the proponent of infallibility, the proponent of self-warrant does not claim that the relevant belief must be true. Self-warrant leaves open the possibility of error. As such, it is a considerably weaker claim than either of the two claims previously considered. Moreover, there is something intuitively plausible about it. Suppose that, on the basis of introspection, I form the belief that I intend to go to the faculty meeting this afternoon. Granted, I might be wrong, and perhaps other people could supply me with evidence that would convince me that I am wrong. But that said, I have no reason to reject the belief. And that alone – when introspective beliefs are in question – seems to justify me in holding the belief. This point generalizes to our introspective beliefs about other conscious mental states as well. Typically, nothing is required to justify an introspective belief about one’s own conscious mental state other than the fact that it is a belief about one’s own conscious mental state. As Alston (1976) argues, if someone were to report to us that she presently is imagining a blue jay, or that she is thinking about lunch, or that she has an itch on her left leg, then we take it for granted that these reports are justified; “We would unhesitatingly brand as absurd a request for justification such as ‘Why do you believe that?’, ‘What reason do you have for supposing that?’, or ‘How do you know that?’”

Against this, Gallois (1996) argues that invoking self-warrant cannot provide an adequate explanation of the epistemic distinctiveness of our introspective beliefs. Gallois suggests that ultimately there is no way of understanding self-warrant except in terms of non-evidential justification; any other analysis will lead to the implausible conclusion that all beliefs are self-warranted. But that means that what is really doing the work to explain the distinctive epistemic nature of our introspective knowledge is the fact that it is non-evidentially justified – the notion of self-warrant itself does no explanatory work. Non-evidential justification will be discussed in connection with the notion of immediacy, below.

d. Immediacy

An additional claim that is often made about an individual’s introspective access to her own mental states is that it is immediate or direct. To claim that introspective access is immediate is to claim that our introspective beliefs are non-inferential and non-evidentially based. In this respect, our introspective beliefs are significantly different from perceptual beliefs (and perhaps, from all of our other beliefs as well).

Immediacy is often linked with infallibility. One reason that introspective beliefs might be thought to be infallible is that they are immediate; the fact that they are not inferred from any other beliefs or based on any other evidence bestows on them an immunity from error. This position is often associated with Russell, and in particular, his distinction between knowledge by description and knowledge by acquaintance: “We shall say that we have acquaintance with anything of which we are directly aware, without the intermediary of any process of inference or any knowledge of truths.” (Russell 1912) For Russell, the only things with which we have such acquaintance are our current mental particulars, and when we are acquainted with some such particular – when our access to it is immediate – our judgments about it cannot be wrong:

At any given moment, there are certain things of which a man is ‘aware,’ certain things which are ‘before his mind.’ … If I describe these objects, I may of course describe them wrongly, hence I cannot with certainty communicate to another what are the things of which I am aware. But if I speak to myself, and denote them by what may be called ‘proper names,’ rather than by descriptive words, I cannot be in error. (Russell 1910)

Leaving aside the question of whether Russell is right to connect immediacy with infallibility, a further question remains: can immediacy provide us with an adequate understanding of privileged access? Many philosophers have argued that it cannot. For example, Alston (1971) complains that the notion of immediate awareness is not well-understood. It will not help to try to comprehend the notion in causal or spatial terms, since we do not have a good sense of how these notions apply to mental states. He suggests further that even once the notion is clarified, it still will not serve to explain our privileged access (Alston 1976). The primary problem concerns the following question: What, exactly, are we supposed to have immediate awareness of? Alston notes that we can have awareness of particulars (my sensation of this patch of color) or facts (that this patch of color is red). But since we do not enjoy privileged access with respect to all of our beliefs about the particular, it looks as if immediate awareness of particulars cannot do the work that it is supposed to do. The problem does not arise if our immediate awareness is of a particular fact about the particular – an immediate awareness of the fact that this patch of color is red can explain why a belief in that fact would be epistemically privileged. However, here we have merely traded one problem for another, since it is not at all clear what sense it makes to say that facts can be immediately apprehended.

Heil (1988) offers an additional reason to deny that immediacy or directness gives us a sufficient explanation of privileged access. According to Heil, a mental state’s being one’s own is neither necessary nor sufficient for it to be knowable directly. It is possible, in principle, that I might fail to know many of my mental states directly, and it might further be possible that I might know someone else’s mental states directly. (Suppose, for example, that Anne could be wired in such a way that she is connected to Emily’s nervous system. In this case, Anne might know Emily’s mental states directly.) As he concludes, “a characterization of my privileged access based exclusively on what is directly known is anemic, hence unsatisfactory.”

2. The Nature of Introspection

However we are to understand the special epistemic status of our introspective judgments, we might naturally think that this status owes to the nature of introspection. But what is the nature of our introspective capacity? Philosophers who have attempted to answer this question fall, broadly speaking, into two camps: those who give observational models of introspection, and those who give non-observational models of introspection. In what follows, we address each of these accounts in turn. We will also briefly consider the skeptical view of an additional camp of philosophers, those who deny that there is any special introspective capacity for which to account.

a. Observational Models

One of the most common accounts of introspection is modeled on perception: just as our perceptual capacity enables us to observe the outer world, our introspective capacity enables us to observe the inner world. As such, introspection can be thought of as an inner sense. This view is often thought to have originated with Locke, who claimed that one source of our ideas is:

the Perception of the Operations of our own Minds within us …. This Source of Ideas, every Man has wholly in himself: And though it be not Sense, as having nothing to do with external Objects; yet it is very like it, and might properly enough be call’d internal Sense. (Locke 1689/1975)

Armstrong (1968, 1981) is probably the main contemporary advocate of the inner sense view. In the course of advocating a materialist theory of mind, Armstrong advances a view of introspection as a self-scanning process in the brain. According to Armstrong, the scanning state and the state scanned must be distinct states: “although they are both mental states, it is impossible that the introspecting and the thing introspected should be one and the same mental state. A mental state cannot be aware of itself, any more than a man can eat himself up.” (Armstrong 1968, 324) Having offered this consideration, which is often referred to as the distinct existences argument, Armstrong also argues that the relationship between the two states is causal.

Given this picture of introspection, it is no surprise that proponents of the inner sense view typically reject several of the claims discussed in Section 1 above. Since they view the introspective state and the introspected state as distinct states, they claim that it must be possible for one to occur without the other. Thus, they reject the self-intimation claim. Since it seems possible that the scanning mechanism could malfunction, they also reject the infallibility claim.

This does not mean, however, that the inner sense view of introspection should be seen as deflationary. Lycan (1996), who offers a version of Armstrong’s self-scanning view, emphasizes the importance of introspection to our mental lives: “Introspective consciousness is no accident … As a matter of engineering, if we did not have the devices of introspection, there would be no we to argue about, or to do the arguing.” Here Lycan stresses the evolutionary advantages conferred by our capacity for introspection. The complexity of our sensory, cognitive and motor systems demands that we be able to engage in an internal monitoring of these systems.

In recent years, Shoemaker has been one of the most persistent critics of the inner sense model of introspection. According to Shoemaker (1994), if introspection were to conform to a perceptual model, even one broadly construed, then it would have to satisfy two conditions. The first is what he calls the “causal condition” – introspective beliefs about one’s own mental states are caused by those mental states, by a reliable belief-producing mechanism. The second is what he calls the “independence condition” – the existence of mental states is independent of any introspective beliefs about them. Shoemaker’s main concern with the inner sense model is that introspection fails to satisfy this second condition. His arguments here relate to his arguments for the self-intimation thesis, discussed above. According to Shoemaker, rationality demands that a creature be sensitive to her own mental states, and thus it is of the essence of mental states to reveal themselves to introspection. (See also Falvey 2000.)

Many of the additional criticisms of the inner sense view stem from alleged disanalogies with “outer” sense. For example, there is no organ of introspection the way that there are organs of sense perception. Armstrong (1968) dismisses this criticism by noting that even one of the outer senses – namely, proprioception – proceeds without a sensory organ. Lormand (2000) makes the further point that there are mental processes such as imagination, dreaming, and hallucination that we think of as “sensory” even though they do not proceed by way of organs of perception.

Another disanalogy arises from the fact that introspecting lacks any distinctive phenomenology. Lyons (1986) takes this to show that it cannot literally be a form of inner perception. Each of our other senses has a distinct phenomenology; think, for example, of the phenomenology of tasting or of touching. However, the phenomenology of introspecting seems to derive wholly from what is being introspected; in and of itself, there is nothing that it is like to introspect.

This last criticism relates to the so-called diaphanousness or transparency of experience (not to be confused with the epistemic transparency claim discussed above that is associated with the Cartesian conception of mind). Experience is said to be transparent in the sense that we ‘see’ right through it to the object of that experience, analogously to the way we see through a pane of glass to whatever is on the other side of it. For example, when I am having an experience of a red tomato, and I try to focus on the experience, there seems to be nothing on which I can train my focus except the tomato itself. If experience is transparent in this way, then introspection is not a matter of “looking within.”

Moved by considerations of experiential transparency, some philosophers – most notably Dretske (1995, 1999) – have offered a perceptual model of introspection that differs dramatically from the inner sense view. Dretske claims that all mental states are representational states. But this means that there is no longer any need, or any use, for the sort of internal scanning mechanism posited by proponents of the inner sense view. Instead:

One becomes aware of representational facts by an awareness of physical objects. One learns that A looks longer than B, not by an awareness of the experience that represents A as longer than B, but by an awareness of A and B, the objects the experience is an experience of. On a representational theory of the mind, introspection becomes an instance of displaced perception—knowledge of internal (mental) facts via an awareness of external (physical) facts. (Dretske 1995)

On this displaced perception view, then, not only should we reject the infallibility thesis and the self-intimation thesis, but we should also reject the immediacy thesis. Introspective knowledge for someone like Dretske will be inferential knowledge – inferred from our knowledge of the external world.

In addition to the displaced perception view, there are other views that are, at least broadly speaking, observational views of introspection, yet deny that introspection should be construed along the lines of the traditional inner sense view. For example, Nichols and Stich (2003a, 2003b; see also Nichols 2000) have offered a view of introspection that works by way of a “monitoring mechanism.” The input to the mechanism is one’s own mental state; the output is a belief that one has that mental state. As stated, the monitoring mechanism sounds very much like Armstrong’s self-scanning mechanism, and thus looks like a version of the inner sense model of introspection. However, the view proposed by Nichols and Stich differs from standard versions of the inner sense view in its explicit denial that the monitoring mechanism detects the presence of the inputted mental state by way of phenomenological features.

b. Non-Observational Models

In the previous section, we saw Shoemaker’s criticisms of the inner-sense model of introspection. Having developed these criticisms, Shoemaker (1988, 1990, 1994) offers his own view of how introspection works. This view is not observational. Rather, on Shoemaker’s view, there is a constitutive connection between being in a mental state and having introspective knowledge about that state: “Our minds are so constituted, or our brains are so wired, that for a wide range of mental states, one’s being in a certain mental state produces in one, under certain conditions, the belief that one is in that mental state.” (Shoemaker 1994)

For Shoemaker, this constitutive connection owes to the fact that we are rational creatures. It is an essential part of being rational that a being has the capacity for introspection. Shoemaker argues for this primarily by invoking considerations of Moore’s Paradox (see above; section 1c). This argument aims to show that ‘self-blindness’ is not possible; in order to explain an individual’s possession of an introspective belief about a given mental state, we need only to invoke the fact that the individual has the relevant mental state plus normal intelligence, rationality, and conceptual capacity.

A similar account is offered by Gallois (1996), who argues that whenever I have a justified belief, I am entitled to infer from what I believe to the fact that I so believe it. This non-evidential inference will be made by any rational creature, since it is the only way that we can make sense of the world around us; in the absence of such an inference, an individual will not be able to contrast her beliefs about the world with the world as it actually is. What would result, according to Gallois, is an irrational view of the world around us. Thus, rationality demands the self-attribution of beliefs. Gallois then offers related considerations to show that rationality also demands the self-attribution of other mental states. For example, unless we attribute perceptual states to ourselves, we will be unable to contrast how the world appears to us with how it actually is.

Obviously, the plausibility of the sort of non-observational account that Shoemaker and Gallois offer will depend on the notion of rationality involved. Additionally, proponents of this sort of non-observational account must defend themselves against charges of circularity. Briefly put, the charge of circularity arises since it might naturally be thought that an adequate account of rationality will have to make reference to our introspective capacity. (See Kind 2003 and Siewert 2003 for criticisms of Shoemaker’s account.)

The Theory Theory of self-awareness (TTSA) offers a very different kind of non-observational model. TTSA derives directly from the “Theory Theory,” a view which claims that an individual’s network of commonsense folk-psychological beliefs constitutes a theory which she uses to explain and predict the behavior of others. Typically, this inferential, theory-based understanding that we achieve of others’ mental states is contrasted with the direct, non-inferential understanding that we can have of our own mental states. Recent results from developmental psychology, however, call this contrast into question. For example, Gopnik (1993; see also Gopnik and Meltzoff 1994) presents evidence that very young children make errors about their own psychological states parallel to the kinds of errors that they make about others’ psychological states. These errors are not easily explained if we assume a sharp divide between the way we come to know about our own mental states and the way we come to know about others’ mental states. Gopnik thus concludes that the child’s theory of mind applies not only to others but to herself as well:

The important point is that the theoretical constructs themselves, and particularly the idea of intentionality, are not the result of some direct first-person apprehension that is then applied to others. Rather, they are the result of a cognitive construction. The child constructs a theory that explains a wide variety of facts about the child’s experience and behavior and about the behavior and language of others.

Recent research on autism and schizophrenia is also often cited by proponents of TTSA. For example, Carruthers (1996b) discusses experimental results suggesting that autistic individuals lack introspective access to many of their own current mental states. If we think of autism as a kind of “mind-blindness,” then these results are exactly what would be predicted by TTSA.

In developing his own version of TTSA, however, Carruthers (1996a) departs from Gopnik’s claim that self-knowledge is inferential. Rather, Carruthers thinks that mental states should be thought of as akin to the theoretical entities of physics; they are the theoretical entities of folk psychology. Introspection should likewise be thought of as akin to the kind of theory-laden perception that often goes on in the physical sciences. For example, armed with the appropriate background information, a physicist might sometimes simply see that electrons are being emitted by the substance that she is studying. Likewise, claims Carruthers, each of us can sometimes simply see – “that is, know intuitively and non-inferentially” – what mental states we have. Depending on what sense we make of Carruthers’ invocation of “seeing” here, this version of the TTSA might be best classified as an observational model of introspection (though obviously one that is quite different from the traditional inner-sense view).

Opponents of this view typically raise two very different sorts of criticisms. First, they criticize the data for the theory, suggesting that the research from developmental psychology does not in fact support the conclusions that proponents of TTSA want to draw. For example, Nichols (2000) argues that there are developmental asynchronies between a child’s ability to attribute knowledge and ignorance to herself and her ability to attribute knowledge and ignorance to others. Were TTSA to be true, however, we should expect these abilities to develop in parallel. Second, they criticize the theory itself. For example, Nichols and Stich (2003b) argue that the theory is underdescribed in one very critical respect. For TTSA to be plausible, the proponent has to allow that there is special information available in the first-person case that is not available in the third-person case. But proponents of TTSA have no plausible account of what this special information might be. Consider Gopnik’s remark that “we may well be equipped to detect certain kinds of internal cognitive activity in a vague and unspecified way, what we might call ‘the Cartesian buzz’.” (Gopnik 1993) Nichols and Stich reasonably note that the postulation of some mysterious ‘buzz’ does not offer much help in this regard.

c. Skepticism about Introspection

Many philosophers who take a skeptical view towards introspection were influenced by the views of Wittgenstein. Wittgenstein is often associated with a view called expressivism about introspection, i.e., the claim that what appear to be introspective reports of our mental states are in fact not reports at all, but rather mere expressions of those mental states. Saying “I am in pain” is akin to saying “ouch.” As expressions, rather than reports, of one’s pain, neither of these utterances has any propositional content. Such expressions, in other words, are non-cognitive. This view parallels expressivism in ethics, where utterances like “Giving money to charity is morally right” and “Killing an innocent person is wrong” are interpreted as expressions of approval and disapproval. Whether Wittgenstein actually was an expressivist about introspection is, as is often the case with Wittgensteinian interpretation, a complicated and controversial exegetical question. But certainly some of his remarks are at least suggestive of expressivism, as for example when he says: “the verbal expression of pain replaces crying and does not describe it.” (Wittgenstein 1958)

It is worth noting that some philosophers have recently embraced expressivism without embracing skepticism about introspection. The basic line is to divorce expressivism from non-cognitivism, i.e., to deny that mental state self-ascriptions are reports without denying that such self-ascriptions can be judged true or false. In this spirit, Falvey (2000) argues that the denial that mental state self-ascriptions are reports amounts only to the denial of the observational model of introspection. Mental state self-ascriptions can be truth-apt even if they are mere expressions. His subsequent account of self-knowledge hinges on the notion of sincerity of utterance. According to Falvey, when an individual sincerely self-ascribes a mental state, the sincerity of her utterance will guarantee that she is in that mental state. Although Falvey recognizes that in general the sincerity of an utterance is not sufficient for the truth of that utterance, he argues that mental state self-ascriptions are special in that the gap between sincerity and truth collapses. Moreover, the absence of this gap is what explains privileged access. (See Bar-On 2005 for a different version of neo-expressivism.)

An additional source of skepticism about introspection comes from the rejection of the Cartesian picture of the mind. Cartesianism encourages us to think of the mind like a theater in which the ongoing show can be viewed by only one individual, the person whose mind it is. Critics of Cartesianism suggest that this picture seduces us into falsely positing a faculty for viewing the show, i.e., a faculty of introspection. Along with the rejection of Cartesianism, they urge the rejection of any commitment to a faculty of introspection.

One such critic is Ryle, who argues that the standard philosophical view of introspection is a logical mess. (Ryle 1949) His primary criticism takes the form of a regress argument. On the standard view, self-knowledge consists in a higher-order attention to some lower-order state. But this entails that we would also have to attend to the higher-order state. And the situation is actually even worse than this, since the state of attending to that higher-order state would itself have to be attended to, and so on, leading to a vicious infinite regress.

Importantly, in rejecting introspection, Ryle does not deny that we can attain self-knowledge. We can achieve self-knowledge exactly the same way that we can achieve knowledge of other people, namely, by drawing inductive conclusions on the basis of observed behavior. As this suggests, skepticism about introspection goes along with a rejection of privileged access. On Ryle’s view, there is nothing epistemically special about our judgments about our own mental states. In fact, not only do we typically fail to be in a better position to make judgments about our own mental states than about others’ mental states, or than the position others are in with respect to one’s own mental states, but we might on occasion be in a worse position. After all, one is often inclined to view one’s self with a considerable lack of objectivity.

In a similar spirit to Ryle’s account of introspection is Lyons’ (1986) “replay” account of introspection, according to which introspection is simply a process of perceptual replay. For example, if someone introspects in order to determine whether she is angry at her colleague, Lyons claims that what she will do is call to mind the things that she did when she was last with the colleague: what she said, how she reacted, etc. In sum, for Lyons introspection “is not a special and privileged executive monitoring process, over and above the more plebeian processes of perception, memory, and imagination; it is those processes put to a certain use.”

Dennett, one of Ryle’s most famous students, is also skeptical of standard views of introspection. According to Dennett, in many instances where we think we are introspecting, we are actually theorizing. (Dennett 1991) Moreover, since we are notoriously bad at this theorizing, our first-person access to our own mental states is considerably less privileged than is commonly thought.

3. Introspection and the Nature of Mind

Having discussed the epistemic status and the nature of introspection, we now turn briefly to two claims about introspection which have played significant roles in discussions of the nature of mind. First, we discuss whether introspection can provide a criterion of mentality. Second, we discuss whether introspection can provide support for a dualist answer to the mind-body problem. Both of these claims are associated with Descartes, and both have come under fire in recent discussions of philosophy of mind.

a. Introspectibility as a Mark of the Mental

In claiming that the mind is transparent, Descartes was in essence making a claim about the scope of introspection: the introspective capacity has complete access to all of the contents of the mind. This gives rise to a further claim associated with a Cartesian conception of mind, namely, that introspectibility is the mark of the mental. For Descartes, there is nothing to the mind but that which is accessible to introspection.

In making this claim, Descartes should not be seen as committed to the implausibly strong view that a state must actually be introspected in order to count as a mental state. An individual can have mental states that, at any particular moment, are not present to her consciousness. For example, of the many beliefs that an individual holds, only a very few are occurrent at any point in time. Most of them are non-occurrent – they are standing beliefs that are recalled to consciousness only when needed. Take your belief that 6+7=13; presumably, before reading the previous sentence, that belief was not present to your consciousness. But the fact that it was not then being introspected does not incline us to deny that you then held the belief.

The accessibility that Descartes has in mind is accessibility in principle. Although prior to reading the sentence above you were not introspectively accessing your belief that 6+7=13, you could in principle have introspectively accessed that belief at any time. A belief remains introspectively accessible in principle even if there are many moments in time in which the belief is not being introspected. You might have some mental states to which it is more difficult to gain introspective access. In some cases it might require careful reflection; in other cases, it might even require some kind of psychoanalysis. But as long as the state can, in principle, be brought to consciousness, Descartes counts the relevant state as mental.

The problem, however, is that there are some states that we intuitively think of as mental states but that seem even in principle inaccessible to introspection. At least since the work of Freud, we have recognized the existence of mental states that are deeply unconscious. There can be some desires, for example, that are so deeply repressed that they cannot be made available to introspection even with the best psychoanalysis that money can buy. Such states, in other words, are not even in principle accessible to introspection.

With some slight tweaking to our accessibility-in-principle claim, it might be possible to avoid this problem. For example, Brook and Stainton (2000) offer the following suggestion. Consider some deeply unconscious states that we are assuming are not even introspectively accessible in principle. In other words, no matter how hard you were to try, you could not bring them to introspective awareness. Nonetheless: “were you to become aware of them, (directly aware of them, not aware of them by inferring them from behavior or something else), it would be by becoming able to introspect them.” The only way you could have direct access to such states, in other words, would be through introspection.

Even this suggestion, however, is not enough to save the claim that introspectibility is the mark of the mental. First of all, it is not clear how we should evaluate the above counterfactual conditional, given that the mental states in question are ex hypothesi inaccessible to introspection. Second of all, there is another class of mental states for which it is even harder to make sense of the supposition that we could become aware of them directly. Consider here any states that are typically thought to be at the “sub-personal” level. For example, if we accept Chomsky’s theory of language acquisition, each of us mentally represents all sorts of basic linguistic rules. These representations, however, are inaccessible in principle to introspection. Moreover, these states – unlike the sorts of repressed desires just considered – do not even seem to be suitable targets for introspective awareness.

For these reasons, it is unlikely that we will be able to use introspectibility as a criterion of the mental. Perhaps introspectibility can serve as a sufficient condition for a state’s being a mental state, but it cannot provide us with a necessary condition. Despite what Descartes thought, our mental life seems to outrun our introspective capacity.

b. Introspective Arguments for Dualism

In the Second Meditation, Descartes (1641) presents the famous line of reasoning often referred to as the Cogito – I think, therefore I am. Even if a powerful demon were to deceive me about the external world, “he will never bring it about that I am nothing so long as I think that I am something.” And so Descartes concludes that he can be certain that he exists.

Having achieved certainty about his existence, however, Descartes does not yet have any certain knowledge about what kind of being he is. He then goes on to examine the nature of the human mind. The course of this examination has suggested the following argument:

  1. Descartes cannot doubt that he (his mind) exists.
  2. Descartes can doubt that his body exists.
  3. Therefore, Descartes’ mind is not the same thing as Descartes’ body, i.e., dualism is true.

Whether Descartes intended the reflections of the Second Meditation to offer this argument for dualism is a thorny exegetical question that we sidestep here. For our purposes, the question is whether these considerations do support dualism. More specifically, we are interested in closely related considerations that specifically invoke introspection:

  1. Mental states are known by introspection.
  2. Brain states are not known by introspection.
  3. Therefore, mental states are not identical to brain states.

According to Leibniz’ Law, if a has a property that b lacks, then a is not identical to b. Here we seem to have found a property that mental states have that brain states lack, namely, that they are known by introspection. Unfortunately for the dualist, however, this argument commits an intensional fallacy. For Leibniz’ Law to apply, the property in question must be extensional, that is, it must apply to an object independently of how we refer to that object. In this case, the property “is known by” fails to be extensional.
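
The logical shape of this objection can be sketched as follows. The notation is an illustrative assumption and is not drawn from the sources cited: $m$ and $b$ stand for a mental state and a brain state, $F$ ranges over properties, and $K_I$ abbreviates “is known by introspection.”

```latex
\begin{align*}
&\text{Leibniz' Law (as stated above):} && \forall F\,\bigl[(F a \land \lnot F b) \rightarrow a \neq b\bigr] \\
&\text{Premise 1:} && K_I(m) \\
&\text{Premise 2:} && \lnot K_I(b) \\
&\text{Conclusion:} && m \neq b
\end{align*}
```

The step to the conclusion goes through only if $K_I$ is among the properties over which $F$ ranges. Because whether something “is known by” a subject can vary with how that thing is described, $K_I$ is not extensional, and so it is not a legitimate instance of $F$.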

Faced with this objection, the dualist might offer the following amended argument:

  1. Mental states are knowable by introspection.
  2. Brain states are not knowable by introspection.
  3. Therefore, mental states are not identical to brain states.

The dualist can plausibly claim that the property invoked by this argument – being knowable by introspection – is a genuine, extensional property, and thus he can avoid the intensional fallacy committed by the previous argument. But this argument falls victim to a related objection, as explicated by Churchland (1985). According to Churchland, the materialist has no reason to accept premise 2: “if mental states are indeed identical with brain states, then it is really brain states that we have been introspecting all along, though without appreciating their fine-grained nature.” The fact that temperature is identical to mean molecular kinetic energy means that we can sense mean molecular kinetic energy by feeling, whether we realize that’s what we’re sensing by feeling or not. The fact that we don’t realize that we can introspect brain states does not mean that mental states are not identical to brain states.

In contemporary discussions of the mind-body problem, the above argument from introspection has not played much of a role. However, related considerations from introspection are still in play. For example, Chalmers (1996) offers an argument from “epistemic asymmetry” to show that consciousness cannot be reductively explained. According to this argument:

Our grounds for belief in consciousness derive solely from our own experience of it. Even if we knew every last detail about the physics of the universe … that information would not lead us to postulate the existence of conscious experience. My knowledge of consciousness, in the first instance, comes from my own case, not from any external observations. It is my first-person experience of consciousness that forces the problem on me.

Although this passage (and Chalmers’ discussion of the argument) does not specifically mention introspection, it seems clear that the way one gains first-person experience of consciousness is through introspection.

More generally, many of the contemporary arguments offered in discussions of the mind-body problem rely on premises that can only be supported by introspection, or by introspective projection. Consider, for example, Jackson’s Knowledge Argument. Mary, who is locked in a black and white room and has never had any color sensations, learns every physical fact there is about color. Nonetheless, claims Jackson, when she leaves the room and sees a ripe tomato for the first time, she will learn some new fact about the color red. Thus, there are facts that escape the physicalist story (Jackson 1982). Whether or not this argument succeeds in establishing the falsity of physicalism is hotly debated, but for our purposes, what’s most important is the following question: how can we judge the truth of Jackson’s claim that Mary learns (or even seems to learn) a new fact about color when she leaves the room? What we must do, it seems, is to imagine ourselves in Mary’s position and judge what we think our epistemic position would be upon exiting the room. In other words, we engage in a sort of introspective projection. In this way, introspection continues to play a key role in this and many other arguments relating to the mind-body problem.

4. References and Further Reading

  • Armstrong, D. 1963. “Is Introspective Knowledge Incorrigible?” The Philosophical Review 72: 417-432.
  • Armstrong, D. 1968. A Materialist Theory of Mind. Humanities Press.
  • Armstrong, D. 1981. The Nature of Mind and Other Essays. Cornell University Press.
  • Alston, W. 1971. “Varieties of Privileged Access.” American Philosophical Quarterly 8: 223-241.
  • Alston, W. 1976. “Self-Warrant: A Neglected Form of Privileged Access.” American Philosophical Quarterly 13: 257-272.
  • Bar-On, D. 2005. Speaking My Mind: Expression and Self-Knowledge. Oxford University Press.
  • Bermudez, J. 2003. “The Elusiveness Thesis, Immunity to Error through Misidentification, and Privileged Access.” In Gertler 2003b: 213-231.
  • Block, N. 1995. “On a Confusion About a Function of Consciousness.” Behavioral and Brain Sciences 18: 227-247.
  • Brook, A. and Stainton, R. 2000. Knowledge and Mind. The MIT Press.
  • Carruthers, P. 1996a. “Simulation and Self-Knowledge: A Defence of Theory-Theory.” In Theories of Theories of Mind, ed. P. Carruthers and P. Smith. Cambridge University Press, 1996.
  • Carruthers, P. 1996b. “Autism as Mind-Blindness: An Elaboration and Partial Defence.” In Theories of Theories of Mind, ed. P. Carruthers and P. Smith. Cambridge University Press, 1996.
  • Cassam, Q. (ed.) 1994. Self-Knowledge. Oxford University Press.
  • Chalmers, D. 1996. The Conscious Mind. Oxford University Press.
  • Chisholm, R. 1981. The First Person. University of Minnesota Press.
  • Churchland, P. M. 1988. Matter and Consciousness. The MIT Press.
  • Dennett, D. 1991. Consciousness Explained. Little, Brown & Company.
  • Descartes, R. 1641. Meditations on First Philosophy. In The Philosophical Writings of Descartes, trans. J. Cottingham, R. Stoothoff and D. Murdoch. Cambridge University Press, 1985.
  • Dretske, F. 1995. Naturalizing the Mind. The MIT Press.
  • Dretske, F. 1999. “The Mind’s Awareness of Itself.” Philosophical Studies 95: 103-124.
  • Falvey, K. 2000. “The Basis of First-Person Authority.” Philosophical Topics 28: 69-99.
  • Gallois, A. 1996. The Mind Within, The World Without. Cambridge University Press.
  • Gertler, B. 2000. “The Mechanics of Self-Knowledge.” Philosophical Topics 28: 125-146.
  • Gertler, B. 2001. “Introspecting Phenomenal States.” Philosophy and Phenomenological Research 63: 305-328.
  • Gertler, B. (ed.) 2003a. Privileged Access: Philosophical Accounts of Self-Knowledge. Ashgate Press.
  • Gertler, B. 2003b. “Introduction: Philosophical Issues about Self-Knowledge.” In Gertler 2003a.
  • Gopnik, A. 1993. “How We Know Our Own Minds: The Illusion of First-Person Knowledge of Intentionality.” Behavioral and Brain Sciences 16: 1-14.
  • Gopnik, A. and Meltzoff, A. 1994. “Minds, Bodies, and Persons: Young Children’s Understanding of the Self and Others As Reflected in Imitation and Theory of Mind Research.” In Self-Awareness in Animals and Humans, ed. S. Parker, R. Mitchell and M. Boccia. Cambridge University Press, 1994.
  • Heil, J. 1988. “Privileged Access.” Mind 97: 238-251. Reprinted in Externalism and Self-Knowledge, ed. P. Ludlow and N. Martin. CSLI Publications, 1998.
  • Jackson, F. 1973. “Is There a Good Argument Against the Incorrigibility Thesis?” Australasian Journal of Philosophy 51: 51-62.
  • Jackson, F. 1982. “Epiphenomenal Qualia.” Philosophical Quarterly 32: 127-36.
  • James, W. 1890/1950. The Principles of Psychology. Dover Publications.
  • Kind, A. 2003. “Shoemaker, Self-Blindness and Moore’s Paradox.” The Philosophical Quarterly 53: 39-48.
  • Kornblith, H. 1998. “What Is It Like To Be Me?” Australasian Journal of Philosophy 76: 48-60.
  • Locke, J. 1689/1975. Essay Concerning Human Understanding, ed. P. Nidditch. Clarendon Press.
  • Lormand, E. 2000. “Shoemaker and ‘Inner Sense’.” Philosophical Topics 28.
  • Lycan, W. 2003. “Dretske’s Ways of Introspecting.” In Gertler 2003a.
  • Lycan, W. 1996. Consciousness. The MIT Press.
  • Lyons, W. 1986. The Disappearance of Introspection. The MIT Press.
  • O’Regan, J., Rensink, R. and Clark, J. 1999. “Blindness To Scene Changes Caused By ‘Mudsplashes’.” Nature 398: 34.
  • Nichols, S. 2000. “The Mind’s ‘I’ and the Theory of Mind’s ‘I’: Introspection and Two Concepts of Self.” Philosophical Topics 28: 171-199.
  • Nichols, S. and Stich, S. 2003a. “How to Read Your Own Mind: A Cognitive Theory of Self-Consciousness.” In Consciousness: New Philosophical Perspectives, ed. Q. Smith and A. Jokic. Oxford University Press.
  • Nichols, S. and Stich, S. 2003b. Mindreading. Oxford University Press.
  • Nisbett, R. and Wilson, T. 1977. “Telling More than we can Know: Verbal Reports on Mental Processes.” Psychological Review 84: 231-259.
  • Russell, B. 1912. The Problems of Philosophy. Oxford University Press.
  • Russell, B. 1910. “Knowledge by Acquaintance and Knowledge by Description.” Proceedings of the Aristotelian Society 11: 108-128.
  • Schwitzgebel, E. 2005. “The Unreliability of Naïve Introspection.” (Unpublished manuscript).
  • Shoemaker, S. 1963. Self-Knowledge and Self-Identity. Cornell University Press.
  • Shoemaker, S. 1988. “On Knowing One’s Own Mind.” In Philosophical Perspectives 2: Epistemology, ed. J. Tomberlin. Ridgeview Publishing Company. Reprinted in Shoemaker 1996.
  • Shoemaker, S. 1990. “First Person Access.” In Philosophical Perspectives 4: Action Theory and Philosophy of Mind, ed. J. Tomberlin. Ridgeview Publishing Company. Reprinted in Shoemaker 1996.
  • Shoemaker, S. 1994. “Self-Knowledge and ‘Inner-Sense’.” Philosophy and Phenomenological Research 54: 249-314. Reprinted in Shoemaker 1996.
  • Shoemaker, S. 1995. “Moore’s Paradox and Self-Knowledge.” Philosophical Studies 77: 211-228. Reprinted with revisions in Shoemaker 1996.
  • Shoemaker, S. 1996. The First-Person Perspective and Other Essays. Cambridge University Press.
  • Siewert, C. 2003. “Self-Knowledge and Rationality: Shoemaker on Self-Blindness.” In Gertler 2003a.
  • Warner, R. 1993. “Incorrigibility.” In Objections to Physicalism, ed. H. Robinson. Clarendon Press, 1993.
  • Wittgenstein, L. 1958. Philosophical Investigations, trans. G.E.M. Anscombe. Macmillan Publishing Co.
  • Wright, C. 1989. “Wittgenstein’s Later Philosophy of Mind: Sensation, Privacy, and Intention.” The Journal of Philosophy 86: 622-634.
  • Wright, C., Smith, B., and Macdonald, C. (eds.) 1998. Knowing Our Own Minds. Clarendon Press.

Author Information

Amy Kind
Email: amy.kind@claremontmckenna.edu
Claremont McKenna College
U. S. A.

John Rawls (1921—2002)

John Rawls was arguably the most important political philosopher of the twentieth century. He wrote a series of highly influential articles in the 1950s and ’60s that helped refocus Anglo-American moral and political philosophy on substantive problems about what we ought to do. His first book, A Theory of Justice [TJ] (1971), revitalized the social-contract tradition, using it to articulate and defend a detailed vision of egalitarian liberalism. In Political Liberalism [PL] (1993), he recast the role of political philosophy, accommodating it to the effectively permanent “reasonable pluralism” of religious, philosophical, and other comprehensive doctrines or worldviews that characterize modern societies. He explains how philosophers can characterize public justification and the legitimate, democratic use of collective coercive power while accepting that pluralism.

Although most of this article will be devoted to TJ, the exposition of that work will take account of Political Liberalism and other later works of Rawls. TJ sets out and defends the principles of Justice as Fairness. Rawls takes the basic structure of society as his subject matter and utilitarianism as his principal opponent. Part One of TJ designs a social-contract-type thought experiment, the Original Position (OP), and argues that parties in the OP will prefer Justice as Fairness to utilitarianism and various other views. In order to understand the argument from the OP, one must pay special attention to the motivation of the parties to the OP, which is philosophically stipulated and provided with a Kantian interpretation. Part Two of TJ checks the fit between the principles of Justice as Fairness and our more concrete considered views about just institutions, thereby helping move us towards a reflective equilibrium that supports those principles. Part Three of TJ addresses the stability of a society organized around Justice as Fairness, arguing that there will be an important congruence in such a society between people’s views about justice and what they value. By the time he wrote Political Liberalism, however, Rawls had decided that an inconsistency in TJ called for recasting the argument for stability. In other ways, the argument of TJ rested on important simplifications, which had the effect of setting aside questions about international justice, disability, and familial justice. Rawls turned to these “problems of extension,” as he called them, at the end of his career.

Table of Contents

  1. Biographical Sketch
  2. Rawls’s Mature Work: A Theory of Justice (1971)
    a. The Basic Structure of Society
    b. Utilitarianism as the Principal Opponent
    c. The Original Position
      i. The Conditions and Purpose of the Original Position
      ii. The Motivations of the Parties to the Original Position
      iii. Kantian Influence and Interpretation of the Original Position
    d. The Principles of Justice as Fairness
    e. The Argument from the Original Position
    f. Reflective Equilibrium
    g. Just Institutions
    h. Stability
    i. Congruence
  3. Recasting the Argument for Stability: Political Liberalism (1993)
  4. Problems of Extension
  5. References and Further Reading

1. Biographical Sketch

John Bordley Rawls was born and schooled in Baltimore, Maryland, USA. Although his family was of comfortable means, his youth was twice marked by tragedy. In two successive years, his two younger brothers contracted an infectious disease from him—diphtheria in one case and pneumonia in the other—and died. Rawls’s vivid sense of the arbitrariness of fortune may have stemmed in part from this early experience. His remaining, older brother attended Princeton for undergraduate studies and was a great athlete. Rawls followed his brother to Princeton. Although Rawls played baseball, he was, in later life at least, excessively modest about his success at that or at any other endeavor.

Rawls continued at Princeton for his Ph.D. studies and came under the influence of the first of a series of Wittgensteinian friends and mentors, Norman Malcolm. From them, he learned to avoid entanglement in metaphysical controversies when possible. Rawls’s doctoral dissertation (1950) already showed, however, that he would not be content to deconstruct our impulse to ask metaphysical questions; instead, he devoted himself to constructive philosophical tasks. Turning away from the then-influential program of attempting to analyze the meaning of the moral concepts, he replaced it with what was, for a philosopher, a more practically oriented task: that of characterizing a general method of moral decision making. Part of this dissertation work was the basis of his first published article, “Outline of a Decision Procedure for Ethics” (1951). This was an early attempt to tackle the central question of Rawls’s mature theory: what sort of decision procedure can we imagine that would help us resolve disputed claims in a fair way?

Of equal significance to Rawls’s turn away from conceptual analysis and towards a more practical conception of moral philosophy was his encounter, during a year (1952-3) as a Fulbright Fellow in Oxford, with exciting, substantive work in legal and political philosophy, especially that of H.L.A. Hart and Isaiah Berlin. Hart had made progress in legal philosophy by connecting the idea of social practices with the institutions of the law. Rawls’s second published essay, “Two Concepts of Rules” (1955), uses a conception of social practices influenced by Hart to explore a kind of rule-utilitarianism. Compare TJ at 48n. In Isaiah Berlin, Rawls met a brilliant historian of political thought, someone who, by his own account, had been driven away from philosophy by the aridity of mid-century conceptual analysis. Berlin influentially traced the historical careers of competing, large-scale values, such as liberty (which he distinguished as either negative or positive) and equality. Not long after his time in Oxford, Rawls embarked on what was to become a life-long project of finding a coherent and attractive way of combining freedom and equality into one conception of political justice. Cf. PL at 327. This project first took the form of a series of widely discussed articles about justice published between 1958 and 1969.

After teaching at Cornell and MIT, Rawls took up a position in the philosophy department at Harvard in 1962. There he remained, being named a University Professor in 1979. Throughout his career, he devoted considerable attention to his teaching. In his lectures on moral and political philosophy, Rawls focused meticulously on great philosophers of the past—Locke, Hume, Rousseau, Leibniz, Kant, Hegel, Marx, Mill, and others—always approaching them deferentially and with an eye to what we could learn from them. Mentor to countless graduate students over the years, Rawls inspired many who have become influential interpreters of these philosophers.

The initial publication of A Theory of Justice in 1971 brought Rawls considerable renown. This complex book, which reveals Rawls’s thorough study of economics as well as his internalization of themes from the philosophers covered in his teaching, has since been translated into 27 languages. While there are those who would claim a greater originality for Political Liberalism, TJ remains the cornerstone of Rawls’s reputation.

2. Rawls’s Mature Work: A Theory of Justice (1971)

a. The Basic Structure of Society

The subject matter of Rawls’s theory is societal practices and institutions. Some social institutions can provoke envy and resentment. Others can foster alienation and exploitation. Is there a way of organizing society that can keep these problems within livable limits? Can society be organized around fair principles of cooperation in a way that people would stably accept?

Rawls’s original thought is that equality, or a fair distribution of advantages, is to be addressed as a background matter by constitutional and legal provisions that structure social institutions. While fair institutions will influence the life chances of everyone in society, they will leave individuals free to exercise their basic liberties as they see fit within this fair set of rules. To carry out this central idea, Rawls takes as the subject matter of TJ “the basic structure of society,” defined (as he later put it) as “the way in which the major social institutions fit together into one system, and how they assign fundamental rights and duties and shape the division of advantages that arises through social cooperation.” PL at 258. Rawls’s suggestion is, in effect, that we should put all our effort into seeing to it that “the rules of the game” are fair. Once society has been organized around a set of fair rules, people can set about freely “playing” the game, without interference.

b. Utilitarianism as the Principal Opponent

Rawls explains in the Preface to the first edition of TJ that one of the book’s main aims is to provide a “workable and systematic moral conception to oppose” utilitarianism. TJ at xvii. Utilitarianism comes in various forms. Classical utilitarianism, the nineteenth-century theory of Jeremy Bentham and John Stuart Mill, is the philosophy of “the greatest good of the greatest number.” The more modern version is average utilitarianism, which asks us not to maximize the amount of good or happiness, but rather its average level in society. The utilitarian idea, as Rawls confronts it, is that society is to be arranged so as to maximize (the total or average) aggregate utility or expected well-being. Utilitarianism historically dominated the landscape of moral philosophy, often being “refuted,” but always rising again from the ashes. Rawls’s view was that until a sufficiently complete and systematic alternative is put on the table to compete with utilitarianism, its recurrence will be eternal. In addition to developing that constructive alternative, however, Rawls also offered some highly influential criticisms of utilitarianism. His critique of average utilitarianism will be described below. About classical utilitarianism, he famously complains that it “adopt[s] for society as a whole the principle of choice for one man.” In so doing, he suggests, it fails to “take seriously the distinction between persons.” TJ at 24.
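The difference between the two forms of utilitarianism can be seen in a small numerical sketch. Everything in it is an illustrative assumption rather than anything found in TJ: the welfare figures are invented, the function and variable names are hypothetical, and treating well-being as a single number per person is a deliberate simplification.

```python
def total_utility(welfare):
    """Classical criterion (as simplified here): the sum of everyone's well-being."""
    return sum(welfare)


def average_utility(welfare):
    """Average criterion: the mean level of well-being in the society."""
    return sum(welfare) / len(welfare)


# Hypothetical societies; each number is one person's well-being.
small_society = [8, 8, 8]   # few people, each well off
large_society = [3] * 10    # many people, each less well off

for name, society in [("small", small_society), ("large", large_society)]:
    print(name, "total:", total_utility(society),
          "average:", average_utility(society))
```

On these invented figures the total criterion ranks the large society higher, while the average criterion ranks the small society higher, so the two versions can recommend different arrangements.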

c. The Original Position

Recognizing that social institutions distort our views (by sometimes generating envy, resentment, alienation, or false consciousness) and bias matters in their own favor (by indoctrinating and habituating those who grow up under them), Rawls saw the need for a justificatory device that would give us critical distance from them. The original position (OP) is his “Archimedean Point,” the fulcrum he uses to obtain critical leverage. TJ at 230-32. The OP is a thought experiment that asks: what principles of social justice would be chosen by parties thoroughly knowledgeable about human affairs in general but wholly deprived—by the “veil of ignorance”—of information about the particular person or persons they represent?

i. The Conditions and Purpose of the Original Position

The OP, as Rawls designs it, self-consciously builds on the long social-contract tradition in Western political philosophy. In classic presentations, such as John Locke’s Second Treatise of Civil Government (1690), the social contract was sometimes described as if it were an actual historical event. By contrast, Rawls’s social-contract device, like his earlier decision procedure, is frankly and completely hypothetical. While Rawls is most emphatic about this in his later work, for example, PL at 75, it is clear already in TJ. He insists there that it is up to the theorist to construct the social-contract thought-experiment in the way that makes the most sense given its task of helping us select principles of justice. Especially because of its frankly hypothetical nature, Rawls’s OP “carries to a higher level of abstraction the familiar theory of the social contract as found, say in Locke, Rousseau, and Kant.” TJ at 10.

The idea is to help justify a set of principles of social justice by showing that they would be selected in the OP. The OP is accordingly set up to build in the moral conditions deemed necessary for the resulting choice to be fair and to insulate the results from the influence of the extant social order. The veil of ignorance plays a crucial role in this set-up. TJ at sec. 23. It assures that each party to the choice is equally or symmetrically situated, with none enjoying greater power (or “threat advantage”) than any other. TJ at 116, 121. It also isolates the parties’ choice from the contingencies (the sheer luck) underlying the variations in people’s natural abilities and talents, their social backgrounds, and their particular society’s historical circumstances. About their society, Rawls has the parties simply assume that it is characterized by the “circumstances of justice,” which principally include (a) the fact that material goods are scarce, but only moderately so, and (b) the fact that there is, within society, a plurality of worldviews, or “conceptions of the good”: moral, religious, and secular. TJ at sec. 22.

It would be too fanciful to think of the parties to the OP as having the capacity to invent principles. The point of the thought experiment, rather, is to see which principles would be chosen in a fair set-up. To use the OP this way, we must offer the parties a menu of principles to choose from. Rawls offers them various principles to consider. Among them are his own principles (to be described below) and the two versions of utilitarianism, classical and average. The crux of Rawls’s appeal to the OP is whether he can show that the parties will prefer his principles to average utilitarianism.

Would rational parties behind a veil of ignorance choose average utilitarianism? The economist John Harsanyi argues that they would, because it would be rational for parties lacking any other information to maximize their expectation of well-being (Harsanyi 1953). Since they do not know who they will be, they will therefore want to maximize the average level of well-being in society. Given Rawls’s opposition to utilitarianism, it would be ironic if Rawls’s thought experiment supported it. Because Rawls’s OP differs from Harsanyi’s choice situation in important ways, however, its parties will not prefer average utilitarianism to Rawls’s competing principles. The most crucial difference concerns the motivation that is attributed to the parties by stipulation. The veil deprives the parties of any knowledge of the values (the conception of the good) of the person into whose shoes they are to imagine stepping. What, then, are they to prefer? Since Harsanyi refuses to supply his parties with any definite motivation, his answer is somewhat mysterious. Cf. TJ at 152. Rawls instead defines the parties as having a determinate set of motivations.
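The step from maximizing expected well-being to maximizing average well-being can be spelled out in one line. This is a sketch under stated assumptions: the symbols n (the number of members of society) and u_i (the well-being of member i) are hypothetical notation introduced here, and the chooser is assumed to assign the same probability, 1/n, to turning out to be each member.

```latex
% Requires amssymb (for \mathbb). n and u_i are hypothetical notation.
% Assumption: probability 1/n of turning out to be each of the n members.
\[
  \mathbb{E}[u] \;=\; \sum_{i=1}^{n} \frac{1}{n}\, u_i
  \;=\; \frac{1}{n} \sum_{i=1}^{n} u_i \;=\; \bar{u}
\]
```

Under that equal-probability assumption, the expected well-being of a chooser who does not know who they will be simply is the average level of well-being, so maximizing the one amounts to maximizing the other.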

ii. The Motivations of the Parties to the Original Position

The parties in the hypothetical OP are to choose on behalf of persons in society, for whom they are, in effect, trustees. PL at 76, 106. The veil of ignorance, however, prevents the parties from knowing anything particular about the preferences, likes or dislikes, commitments or aversions of those persons. They also know nothing particular about the society for which they are choosing. On what basis, then, can the parties choose? To ascribe to them a full theory of the human good would fly in the face of the facts of pluralism, for such theories are deeply controversial. Instead, Rawls suggests, we should ascribe to them a “thinner” or less controversial set of commitments. At the core of these are what he calls the “primary goods”: rights, liberties, and opportunities; income and wealth; and the social bases of self-respect. To give the parties a definite basis on which to reason, Rawls postulates that the parties “normally prefer more primary goods rather than less.” TJ at 123. This is the only motivation that TJ ascribes to the parties.

In their pursuit of the primary goods, the parties are defined as being “mutually disinterested”: each is motivated to obtain as many primary goods as he or she can and does not care if others attain primary goods. TJ at 12. The parties are motivated neither by benevolence nor by envy or spite. Many commentators think that this assumption of the parties’ mutual disinterest reflects an unattractively individualistic view of human nature, but, as with the motivations ascribed to the parties, the ascription of mutual disinterest is not intended to mirror human nature. The assumption of mutual disinterest reflects Rawls’s development of, and reaction against, both the sympathetic-spectator tradition in ethics, exemplified by David Hume and Adam Smith, and the more recent ideal-observer theory. The former tradition attempts to imagine the point of view of a fully benevolent spectator of the human scene who reacts impartially and sympathetically to all human travails and successes. The ideal-observer theory typically imagines a somewhat more dispassionate or impersonal, but still omniscient, observer of the human scene. Each of these approaches asks us to imagine what such a spectator or observer would morally approve.

Against these theories, Rawls raises a number of objections, which can be boiled down to this: either they involve neglecting the separateness of persons (in roughly the same way that utilitarianism does when it adds up everyone’s happiness), TJ at 164, or, if they seek to avoid utilitarian aggregation, they will find that “benevolence is at sea as long as its many loves are in opposition in the persons of its many objects.” TJ at 166. In other words, all difficult questions of human conflict will be simply reproduced within the sympathetic spectator’s breast. Rawls was determined to get beyond this impasse. He suggests that the OP should combine the mutual-disinterest assumption with the veil of ignorance. This combination, he argues, will achieve the rough moral equivalence of universal benevolence without either neglecting the separateness of persons or sacrificing definiteness of results. TJ at 128.

As we will see, the definite positive motivations that Rawls ascribes to the parties are crucial to explaining why they will prefer his principles to average utilitarianism. Because the parties’ motivations are essential to the arguments bearing on this central philosophical contest, it is important to attend to Rawls’s rationale for giving this motivation to the parties.

The primary goods are supposed to be uncontroversially worth seeking, albeit not for their own sakes. Initially, TJ presented the primary goods simply as goods that “normally have a use whatever a person’s plan of life.” TJ at 54. Although this claim seems quite modest, philosophers rebutted it by describing life plans or worldviews for which one or another of the primary goods is not useful. These counterexamples revealed the need for a different rationale for the primary goods. At roughly the same time, Rawls began to develop further the Kantian strand in his view. These Kantian ideas ended up providing a new rationale for the primary goods.

iii. Kantian Influence and Interpretation of the Original Position

Rawls had long admired Immanuel Kant’s moral philosophy, making it central to his teaching of the subject. See CP essays 13, 16, 23. TJ aims to build on Kant’s central ideas and to improve on them in certain respects. TJ at sec. 41. By insisting, as against utilitarianism, on the “separateness of persons,” Rawls carries on Kant’s theme of respect for persons. Kant held that the true principles of morality are not imposed on us by our psyches or by eternal conceptual relations that hold true independently of us; rather, Kant argued, the moral law is a law that our reason gives to itself. It is, in this sense, self-chosen or autonomous law. Kant’s position is not that morality requires whatever Ms. Smith or Mr. Jones chooses to believe it does. Rather, his claim is that the rational (or vernünftig) nature that each person shares shapes a single moral law, valid for all: “the categorical imperative.”

Rawls suggests that the OP well models Kant’s central ideas. The OP is set up so that the parties reflect our nature as “reasonable and rational”—Rawls’s dual way of rendering the Kantian adjective vernünftig. Once it is so set up the parties are to choose principles. Their task of choosing principles thus models the idea of autonomy. In designing the OP, Rawls also aimed to resolve what he took to be two crucial difficulties with Kant’s moral theory: the danger of empty abstractness early stressed by Hegel and the difficulty of assuring that the moral law’s dictates adequately express, as Kant thought they must, our nature as free and equal reasonable and rational beings. Rawls addresses the issue of abstractness in many ways—perhaps most fundamentally by dropping Kant’s aim of finding an a priori basis for morality. Although Rawls’s use of the veil of ignorance keeps particular facts at a distance, he insists, as against Kant, that “moral theory must be free to use contingent assumptions and general facts as it pleases.” TJ at 44. Another feature that reduces the abstractness of Rawls’s view is his focus on institutions—on the basic structure of society. In this light, we can see his institutional focus as carrying forward Hegel’s insight that the idea of human freedom can achieve an adequately concrete realization only by a unified social structure of a certain kind.

The OP also addresses the second problem with Kant’s moral theory: the problem of expression. The OP, Rawls suggests, “may be viewed … as a procedural interpretation of Kant’s conception of autonomy and the categorical imperative within the framework of an empirical theory.” TJ at 226. To be autonomous, for Kant, is to act on a law that one gives oneself, a law adequate to one’s nature as a free and equal, reasonable and rational person. The parties to the OP, in selecting principles, implement this idea of autonomy. How they represent equality and rationality is obvious, for they are equally situated and are rational by definition. Reasonableness enters the OP not principally by the rationality of the parties but by the constraints on them, most especially the veil of ignorance. They are also constrained in ways not yet mentioned and that we shall not discuss further, such as “the formal constraints of the concept of right.” TJ at sec. 23. The veil also expresses (or “models”) a crucial aspect of our freedom, namely our freedom to endorse principles in a way that is not controlled by the historical contingencies of the society into which we are born. TJ at 225.

Rawls’s attempt to solve the problem of expression also led him towards a fuller articulation of the parties’ motivations, ascribing to them certain “highest-order interests.” An intermediate step in this direction is his characterization of our three highest-order powers, the “moral powers” that persons have as reasonable and rational beings. “The rational” corresponds to Kant’s “hypothetical imperative” with its directive to take effective means to one’s ends; “the reasonable” corresponds to Kant’s categorical imperative, the moral law that demands that we do the right thing, irrespective of what our ends are. To conceive of persons as reasonable and rational, then, is to conceive of them as having certain higher-order powers. On the side of the rational, there is, first, the power to frame our ends (our “conception of the good”) and to pursue them by selecting effective means of satisfying them. Second, we can also revise our ends when we see reason to do so. Third, on the side of the reasonable, we have the power or capacity to act from “an effective sense of justice”: we can do the right thing.

This Kantian conception of the powers of reasonable and rational persons directly supports Rawls’s later account of the motivations of the parties. The parties are conceived as having highest-order interests that correspond directly to these highest-order powers. Although the account of the moral powers was present in TJ, it is only in his later works that Rawls uses this idea to defend and elaborate the motivation of the parties in the OP.

Rawls’s account of the moral powers explains why it makes sense to postulate that the parties are motivated to secure the primary goods. In various, complicated ways, in his later work, Rawls defends the primary goods as being required for free and equal citizens to promote and protect their three moral powers. This is to cast the primary goods as items objectively needed by moral persons occupying the role of free and equal citizens. While the list of primary goods may not be a perfect or complete account of what is needed to support this aspect of moral personality, Rawls claims that it is the “best available” account that we can muster in the face of the fact of reasonable pluralism. PL at 188-9.

In addition to providing a new rationale for the primary goods, Rawls’s account of the moral powers also became, in his later work, a basis for elaborating the motivations ascribed to the parties. In Political Liberalism, Rawls describes the motivation as: “The parties in the original position have no direct interests except an interest in the person each of them represents and they assess principles of justice in terms of primary goods. In addition, they are concerned with securing for the person they represent the higher-order interests we have in developing and exercising our … moral powers and in securing the conditions under which we can further our determinate conceptions of the good, whatever it is.” PL at 105-6. Here, the motivation of the parties is importantly extended by postulating that these hypothetical beings care about the moral powers of persons in society and also, by extension, about those persons’ ability to pursue what they particularly care about or are committed to.

Rawls’s assumptions about the motivations of the parties involve frankly moral content and are justified on openly moral grounds, as he had always avowed. His aim remains, nonetheless, to assemble in the OP a series of relatively uncontroversial, relatively fixed points among our considered moral judgments and to build an argument on that basis for the superiority of some principles of justice over others.

d. The Principles of Justice as Fairness

“Justice as Fairness” is Rawls’s name for the set of principles he defends in TJ. He refers to “the two principles of Justice as Fairness,” but the second has two parts. These principles address two different aspects of the basic structure of society: the “First Principle” addresses the essentials of the constitutional structure. It holds that society must assure each citizen “an equal claim to a fully adequate scheme of equal basic rights and liberties, which scheme is compatible with the same scheme for all.” PL at 5. The second principle addresses instead those aspects of the basic structure that shape the distribution of opportunities, offices, income, wealth, and in general social advantages. The first part of the second principle holds that the social structures that shape this distribution must satisfy the requirements of “fair equality of opportunity.” The second part of the second principle is the famous—or infamous—“Difference Principle.” It holds that “social and economic inequalities … are to be to the greatest benefit of the least advantaged members of society.” PL at 6. Each of these three centrally addresses a different set of primary goods: the First Principle concerns rights and liberties; the principle of Fair Equality of Opportunity concerns opportunities; and the Difference Principle primarily concerns income and wealth. (That the view adequately secures the social basis of self-respect is something that Rawls argues more holistically.) TJ at 477-8.

e. The Argument from the Original Position

The argument that the parties in the OP will prefer Justice as Fairness to utilitarianism and to the various other alternative principles with which they are presented divides into two parts. There is, first, the question whether the parties will insist upon securing a scheme of equal basic liberties and upon giving them top priority. Secondly, assuming that they will, there remains the question whether social inequalities should be governed by Rawls’s “second principle,” comprising Fair Equality of Opportunity and the Difference Principle, or else should be addressed in a utilitarian way. Making the latter choice, and so inserting utilitarianism into a position subordinate to the First Principle, yields what Rawls calls a “mixed conception.” TJ at 107.

Each of these parts of the argument from the OP is considerably aided by the clarified account of the primary goods that emerges in Rawls’s later work and that has been set out above in the section on the motivation of the parties to the OP. Regarding the first part of the argument from the OP, the crucial point is that the parties are stipulated to care about rights and liberties. They further know, as a general fact about human beings, that the determinate persons on whose behalf they are choosing are likely to have firmly and deeply-held “religious, philosophical, and moral views.” PL at 311. They also have a higher-order interest in protecting these persons’ abilities to advance these conceptions. Accordingly, “they cannot take chances by permitting a lesser liberty of conscience to minority religions, say, on the possibility that those they represent espouse a majority or dominant religion.” PL at 311. Rawls admits that persons’ deeply-held views are not always set in stone, but he insists that not all circumstances in which they may change are morally acceptable. He argues that protecting one’s ability to exercise one’s highest-order power to change one’s mind about such things requires an adequate scheme of basic liberties. PL at 312-3. In addition, he argues that securing the First Principle importantly serves the higher-order interest in an effective sense of justice—and does so better than the pure utilitarian alternative—by better promoting social stability, mutual respect, and social unity. PL at 317-24.

The second part of the argument from the OP takes the First Principle for granted and addresses the matter of social inequalities. Its sticking point has always been the Difference Principle, which strikingly and influentially articulates a liberal-egalitarian socioeconomic position. While there are questions about Rawls’s precise formulation and implementation of the principle of Fair Equality of Opportunity, it is far less controversial, both in theory and in practice. It is the Difference Principle that would most clearly demand deep reforms in existing societies. The set-up of the OP suggests the following informal argument for the Difference Principle: because equality is an ideal fundamentally relevant to the idea of fair cooperation, the OP situates the parties symmetrically and deprives them of information that could distinguish them or allow one to gain bargaining advantage over another. Given this set-up, the parties will consider the situation of equal distribution a reasonable starting point in their deliberations. Since they know all the general facts about human societies, however, the parties will realize that society might depart from this starting point by instituting a system of social rules that differentially reward the especially productive and could achieve results that are better for everyone than are the results under rules guaranteeing full equality. This is the kind of inequality that the Difference Principle allows and requires: departures from full equality that make some better off and no one worse off.

While this is the intuitive idea behind the Difference Principle, Rawls’s statement of the principle is more careful and precise. Three main refinements are worth noting. First, because the principle pertains to the basic structure of society and because the parties are comparing different societies organized around different principles, the expectations that matter are not those of particular people but those of representative members of broad social classes. Second, to make his exposition a little simpler, Rawls makes some technical assumptions that let him focus only on the expectations of the least-well-off representative class in a given society. (These assumptions—of “close-knitness” and “chain-connection”—enable him to ignore, for instance, the possibility of increasing the inequality between the rich and the middle-class without affecting those on the bottom. For those who find these simplifying assumptions too restrictive, Rawls offers a multi-tiered, or “lexical,” version of the Difference Principle. TJ at 72.) Allowed by these simplifying assumptions to focus only on the least-well-off representative persons, the Difference Principle thus holds that social rules allowing for inequalities in income and wealth are acceptable just in case those who are least well off under those rules are better off than the least-well-off representative persons under any alternative sets of social rules. This formulation already takes account of the third refinement, which recognizes that the people who are the worst off under one set of social arrangements may not be the same people as those who are worst off under some other set of social arrangements. Cf. PL at 7n.
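The comparison that the Difference Principle calls for can be made concrete with a small sketch. The Python below is an illustration only: the scheme names and the numbers standing in for each representative class’s expectations of income and wealth are hypothetical, and reading the “lexical” version as a bottom-up tie-break across ranked positions is an assumption about how that variant is commonly formalized, not a formula quoted from TJ.

```python
# Hypothetical expectations (an index of income and wealth) for the
# representative classes under three alternative sets of social rules.
# All names and numbers are invented for illustration.
schemes = {
    "strict_equality": [10, 10, 10],
    "incentives_a": [12, 20, 40],
    "incentives_b": [12, 25, 35],
}


def simple_choice(schemes):
    """Simple form: pick a scheme whose least-well-off class does best.

    Ties are broken by dictionary order (max returns the first maximum).
    """
    return max(schemes, key=lambda name: min(schemes[name]))


def lexical_choice(schemes):
    """Assumed lexical form: compare the worst-off position first, then
    the next worst-off, and so on up the ranking.

    Sorting ranks positions by how well off they are, so the worst-off
    group under one scheme need not be the same group as under another.
    """
    return max(schemes, key=lambda name: sorted(schemes[name]))


print(simple_choice(schemes))
print(lexical_choice(schemes))
```

On these invented numbers both incentive schemes beat strict equality for the least well off, and only the lexical comparison distinguishes between them, by looking at the next position up.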

The Difference Principle requires society to look out for the least well off. But would the parties to the OP prefer the Difference Principle to a utilitarian principle of distribution? Here, Rawls’s interpretation of the OP matters. It took a while for commentators to grasp the degree to which Rawls’s characterization of the OP departed from the much simpler one favored by Harsanyi, from the point of view of which Rawls’s argument for the Difference Principle appeared to be a plain mistake. For parties like Harsanyi’s, it would be irrational to choose the Difference Principle. Harsanyi’s parties lack any determinate motivation: as Rawls puts it, they are “bare-persons.” TJ at 152. With nothing but the bare idea of rationality to guide them, they will naturally choose any principle that will maximize their utility expectation. Since this is what the principle of Average Utilitarianism does, they will choose it. Yet as we have seen, Rawls departs from Harsanyi’s version of the thought experiment by attributing a determinate motivation to the parties, while denying that an index of the primary goods provides an interpretation of what the parties conceive to be good. Rawls never defends the primary goods as goods in themselves. Rather, he defends them as versatile means. In the later theory, the primary goods are defended as facilitating the pursuit and revision, by the persons the parties represent, of their conceptions of the good. While the parties do not know what those conceptions of the good are, they do care about whether the persons they represent can pursue and revise them.

With this departure from Harsanyi in mind, we may finally explain why the parties in the OP will prefer the principles of Justice as Fairness, including the Difference Principle, to average utilitarianism. In laying out the reasoning that favors the Difference Principle, Rawls argues that the parties will have reason to use the “maximin” rule. The maximin rule is a general rule for making choices under conditions of uncertainty. It is markedly different from the rule of maximizing expected value, the more “averaging” sort of rule that Harsanyi’s parties employ. The maximin rule directs one to select that alternative where the minimum place is higher (on whatever the relevant measure is) than the minimum place in any other alternative. Applied to the theory of social justice, maximin is an approach “a person would choose for the design of a society in which his enemy is to assign him his place.” TJ at 133.
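The contrast between the two decision rules can be sketched in a few lines of Python. This is a hedged illustration rather than anything found in Rawls or Harsanyi: the society names and payoff numbers are hypothetical, and the averaging rule is modeled under the assumption that every social position is equally likely.

```python
# Hypothetical payoffs: for each alternative, the value of each social
# position a person might end up occupying. Numbers are invented.
alternatives = {
    "society_x": [1, 50, 90],
    "society_y": [20, 30, 40],
}


def maximin(alternatives):
    """Select the alternative whose minimum place is highest."""
    return max(alternatives, key=lambda name: min(alternatives[name]))


def max_expected_value(alternatives):
    """Select the alternative with the highest average payoff, treating
    every position as equally likely (an assumption of this sketch)."""
    def average(name):
        values = alternatives[name]
        return sum(values) / len(values)
    return max(alternatives, key=average)


print(maximin(alternatives))
print(max_expected_value(alternatives))
```

With these numbers the two rules come apart: the averaging rule favors the society with the higher mean despite its very low bottom position, while maximin favors the society whose worst place is best.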

The parties to Rawls’s OP are not “bare-persons” but “determinate-persons.” TJ at 152. They care about the primary goods and the highest-order moral powers, but they also know, in effect, that the primary goods that they are motivated to seek are not what the persons they represent ultimately care about. Accordingly, the parties will give special importance to protecting the persons they represent against social allocations of primary goods that might frustrate those persons’ ability to pursue their determinate conceptions of the good. If the parties knew they had in hand an adequate sketch of the good, they might use that to assess the gamble they face, choosing in a maximizing way like Harsanyi’s parties. But Rawls’s parties instead know that the primary goods that they are motivated to seek do not adequately match anyone’s conception of the good. Accordingly, it is rational for them to take a cautious approach. They must do what they can to assure that the persons they represent have a sufficient supply of primary goods to be able to pursue whatever it is that they do take to be good.

f. Reflective Equilibrium

Although the OP attempts to collect and express a set of crucial constraints that are appropriate to impose on the choice of principles of justice, Rawls recognized from the beginning that we could never just hand over the endorsement of those principles to this hypothetical device. Rather, he foresaw the need to “work from both ends,” pruning and adjusting things as we go. TJ at 18. That is, we need to stop and consider whether, on reflection, we can endorse the results of the OP. If those results clash with some of our more concrete considered judgments about justice, then we have reason to think about modifying the OP.

Alternatively—and this is what Rawls means by working “from both ends”—instead of modifying the OP, we might decide that the argument from the OP gives us good reason to modify the considered judgments of justice with which its conclusions clash. Eventually, we may hope that this process reaches a “reflective equilibrium.” If it does, Rawls wrote, “we shall find a description of the initial situation that matches our considered judgments duly pruned and adjusted.” Ibid.

Reflective equilibrium has been an immensely influential idea about moral justification. It is not a full theory of justification. When it was introduced, however, it suggested an approach to justifying moral theories different from the one then commonly pursued. The idea of reflective equilibrium takes two steps away from the sort of conceptual analysis that was then prevalent. First, working on the basis of considered judgments suggests that it is not necessary to build moral theories on necessary or a priori premises. What matters, rather, is whether the premises are ones that “we do, in fact, accept.” TJ at 19. Rawls characterizes considered judgments as simply judgments reached under conditions where our sense of justice is likely to operate without distortion. TJ at 42. Second, the sort of pruning and adjusting that Rawls assumes will be involved in the search for reflective equilibrium implies that theories need not aim for a perfect fit with theory-independent “data.” Whereas the practitioners of conceptual analysis had raised to a fine art the method of generating counterexamples to a general theory, Rawls writes that “objections by way of counterexample are to be made with care.” TJ at 45. Checking a theory’s fit with one’s more concrete considered judgments is only a way-station on the route to reflective equilibrium. Reaching it might involve revising some of those more concrete judgments. A third novel idea about justification thus emerges from this picture: it involves arguments built in various different directions at once. The resulting justification, as Rawls puts it, “is a matter of the mutual support of many considerations.” TJ at 19, 507.

Eventually, the hope is that each person will reach a reflective equilibrium that coincides with every other person’s. Since it is up to each person, however, to determine which arguments are most compelling, Rawls stresses that the reader must make up his or her own mind, rather than trying to predict or anticipate what everyone else will think. TJ at 44.

g. Just Institutions

Part Two of TJ aims to show that Justice as Fairness fits our considered judgments on a whole range of more concrete topics in moral and political philosophy, such as the idea of the rule of law, the problem of justice between generations, and the justification of civil disobedience. Consistent with the idea of reflective equilibrium, Rawls suggests pruning and adjusting those judgments in a number of places. One of the thorniest such issues, that of tolerating the intolerant, recurs in PL. In addition to serving its main purpose of facilitating reflective equilibrium on Justice as Fairness, Part Two also offers a treasure trove of influential and insightful discussion of these and other topics in political philosophy. There is hardly space here even to summarize all the worthwhile points that Rawls makes about these topics. A summary of his controversial and influential discussion of the idea of desert (that is, getting what one deserves), however, will illustrate how he proceeds.

As we have seen, Rawls was deeply aware of the moral arbitrariness of fortune. He held that no one deserves the social position into which he or she is born or the physical characteristics with which he or she is endowed from birth. He also held that no one deserves the character traits he or she is born with, such as his or her capacity for hard work. As he wrote, “The natural distribution is neither just nor unjust; nor is it unjust that persons are born into society at some particular position. These are simply natural facts. What is just and unjust is the way that institutions deal with these facts.” TJ at 87.

In Part Two, Rawls sets out to square this stance on the moral arbitrariness of fortune with our considered judgments about desert, which do hold that desert is relevant to distributive claims. For instance, we tend to think that people who work harder deserve to be rewarded for their effort. We may also think that the talented deserve to be rewarded for the use of their talents, whether or not they deserved those talents in the first place. With these common-sense precepts of justice, Rawls does not disagree; but he clarifies them by responding to them dialectically. TJ at sec. 48. He questions whether these common-sense claims are meant to stand independently of any assumptions about whether or not the basic institutions of society—especially those institutions of property law, contract law, and taxation that, in effect, define the property claims and transfer rules that make up the marketplace—are just. It is unreasonable, Rawls argues, to say that desert is a direct basis for distributional claims even if the socio-economic system is unfair. It is much more reasonable to hold, he suggests, that whether one deserves the compensation one can command in the job marketplace, for instance, depends on whether the basic social institutions are fair. Are they set up so as to assure, among other things, an appropriate relationship between effort and reward? It is this justice of the basic structure that is Rawls’s topic.

Rawls’s alternative proposal is that the common-sense precepts about desert generally presuppose that the basic structure of society is itself fair. When they are qualified in line with this presupposition, Rawls supports them. To prevent the unqualified and the qualified claims from being confused with each other, however, he uses the term “legitimate expectations” as a term of art to express the claims of desert appropriately so qualified. A crucial idea of Justice as Fairness is that fundamental principles of justice must be respected for the rules of social cooperation to be fair, and that when they are, we should allow the free operation of the market largely to determine people’s legitimate expectations. (This dialectical clarification of the moral import of desert, however, did not satisfy all commentators. See Robert Nozick (1974).)

h. Stability

In pursuing his novel topic of the justice of the basic structure of society, Rawls posed novel questions. One set of questions concerned what he calls the “stability” of those societies whose institutions live up to the requirements of a given set of principles of justice. The stability of the institutions called for by a given set of principles of justice—their ability to endure over time and to re-establish themselves after temporary disturbances—is a quality those principles must have if they are to serve their purposes. TJ at 398-400. Unstable institutions would not secure the liberties, rights, and opportunities that the parties care about. If any set of institutions realizing a given set of principles were inherently unstable, that would suggest a need to revise those principles. Accordingly, Rawls argues, in Part Three of TJ, that institutions embodying Justice as Fairness would be stable, indeed more stable than institutions embodying the utilitarian principle.

In addressing the question of stability, Rawls never leaves behind the perspective of moral justification. Stability of a kind might be achieved by arranging a stand-off of opposing but equal armies. The results of such a balance of power are not of interest to Rawls. Rather, the stability question he asks concerns whether, in a society that conforms to the principles, citizens can wholeheartedly accept those principles. Wholeheartedness will require, for instance, that the reasons on the basis of which the citizens accept the principles are reasons affirmed by those very principles. PL at xlii. If stability can be grounded on such wholeheartedly moral reasons—as opposed to ulterior reasons—then it is “stability for the right reasons.” PL at xxxix. In TJ, the account of stability for the right reasons involved imagining that this wholeheartedness arose from individuals being thoroughly educated, along Kantian lines, to think of fairness in terms of the principles of Justice as Fairness. Cf. PL at lxii. As we will see, he later came to think that this account violated the assumption of pluralism.

The imaginative exercise of assessing the comparative stability of different principles would be useless and unfair if one were to compare, say, an enlightened and ideally-run set of institutions embodying Justice as Fairness with the stupidest possible set of institutions compatible with the utilitarian principle. In order to standardize the terms of comparison, Rawls discusses only the “well-ordered societies” corresponding to each of the rival sets of principles. His notion of a well-ordered society is complex. See CP at 232-5. The gist of it is that the relevant principles of justice are publicly accepted by everyone and that the basic social institutions are publicly known (or believed with good reason) to satisfy those principles.

Assessing the comparative stability of alternative well-ordered societies requires a complex imaginative effort at tracing likely phenomena of social psychology. As Rawls comments, “One conception of justice is more stable than another if the sense of justice that it tends to generate is stronger and more likely to override disruptive inclinations and if the institutions it allows foster weaker impulses and temptations to act unjustly.” TJ at 398. In order to address the first of these issues, about the strength of the sense of justice, Chapter VIII of TJ develops a rich and somewhat original account of moral education. Drawing upon empirical research in developmental psychology, Rawls describes the gradual development of individuals’ senses of justice as involving three stages: the morality of authority, which is fostered in families; the morality of association; and the morality of principles. He argues that each of these stages of moral education will work more effectively under Justice as Fairness than it will under utilitarianism. TJ at chap. 8. He also argues that a society organized around the two principles of Justice as Fairness will be less prone to the disruptive effects of envy than will a utilitarian society. TJ at secs. 80-81.

i. Congruence

As we have seen, the veil of ignorance disconnects the argument from the OP from any given individual’s full conception of the good. The final question addressed by TJ attempts to reconnect justice to each individual’s good, not in general, but within the well-ordered society of Justice as Fairness. A stable society is one that generates attitudes, such as are encapsulated in an effective sense of justice, that support the just institutions of that society. If, in the well-ordered society, having those attitudes is also a good for the persons who have them, then there is a “match between justice and goodness” that Rawls calls “congruence.” TJ at 350.

In order to address this question of congruence, TJ develops an account of the good for individuals. Chapter VII of TJ, in fact, develops a quite general theory of goodness—called “goodness as rationality”—and then applies it to the special case of the good of an individual over a complete life. Rawls starts from the suggestion that “A is a good X if and only if A has the properties (to a higher degree than the average or standard X) which it is rational to want in an X, given what X’s are used for, or expected to do, and the like (whichever rider is appropriate).” TJ at 350-1. This idea, developed in dialogue with the leading alternatives from the middle of the 20th century, still repays attention. To work out this suggestion for the case of the good for persons, Rawls influentially developed and deployed the notion of a “life plan.” A rational plan of life for an individual, he argued, is answerable to certain principles of “deliberative rationality.” These Rawls sets out in a low-key way that masks the power and originality of his formulations. TJ at 359-72.

Rawls’s argument for congruence—that having an effective sense of justice built around the principles of Justice as Fairness will be a good for each individual—is a complex and philosophically deep one. It appeals to at least four types of intermediate good, each of which may be presumed to be of value to just about everyone: (i) the development and exercise of complex talents (which Rawls’s “Aristotelian Principle” presumes to be a good for human beings), TJ at 374, (ii) autonomy, (iii) community, and (iv) the unity of the self. Rawls’s argument for congruence is spread out across many sections of TJ. Some of its main threads are pulled together by Samuel Freeman in his contribution to The Cambridge Companion to Rawls. Freeman (2003). With regard to autonomy, to supplement the positive argument flowing from the Kantian interpretation of the OP, Rawls argues that the type of objectivity claimed for the principles of Justice as Fairness is not at odds with the idea of the autonomous establishment of principles. TJ at sec. 78. He further argues that Justice as Fairness supports the kind of tightly-knit community he calls “a social union of social unions,” marked by the shared purpose or “common aim of cooperating together to realize their own and another’s nature in ways allowed by the principles of justice.” TJ at 462. If Rawls is right about the congruence of goodness and justice, these “ways” are hardly trivial. (Not long after TJ was published, it came under attack by a set of critics who identified themselves as “communitarians,”  see for example MacIntyre (1984) and Sandel (1998). Ironically, the communitarian critique focused largely on Parts One and Two of TJ, giving short shrift to the powerful articulation of this ideal of community in Part Three.) 
Finally, regarding the unity of the self, Rawls criticizes the Procrustean sort of unity that could come from attaching oneself to a single “dominant end.” He notes the advantages of a conception of the unity of the self that hangs, instead, on the regulative status of principles of justice. TJ at secs. 83-85. The cumulative effect of these appeals to the development of talent, autonomy, community, and the unity of the self is to support the claim of Justice as Fairness to congruence. In a well-ordered society corresponding to Justice as Fairness, Rawls concludes, an effective sense of justice is a good for the individual who has it. In TJ, this congruence between justice and goodness is the main basis for concluding that individual citizens will wholeheartedly accept the principles of justice as fairness.

3. Recasting the Argument for Stability: Political Liberalism (1993)

Rawls has the parties to the OP assume that the society for which they are choosing principles is in the “circumstances of justice,” which include the presence of a plurality of irreconcilable moral, religious, and philosophical doctrines. But his argument for the comparative stability and the congruence of Justice as Fairness imagines a well-ordered society in which everyone is brought up in ways deeply informed by the adherence by all adults to the same principles of justice. Accordingly, his discussion of stability and congruence in Part Three of TJ is at odds with the assumption of pluralism. In his second book, Political Liberalism [PL], he set out to rectify this “serious problem.” PL at xvii.

PL clarifies that the only acceptable way to rectify the problem is to modify the account of stability and congruence, because pluralism is no mere theoretical posit. Rather, pluralism has been endemic among the liberal democracies since the 16th century wars of religion. Moreover, pluralism is a permanent feature of liberal or non-repressive societies. It does not rest on irrationality. On the contrary, within a wide range such pluralism is “reasonable” and will not be erased by people’s attempts to cooperate reasonably. That is because a series of intractable “burdens of judgment” all but preclude reasoned convergence on fundamental and comprehensive principles about how to live. PL at 54-8. Accordingly, Rawls takes it as a fact that the kind of uniformity in fundamental moral and political beliefs that he imagined in Part Three of TJ can be maintained only by the oppressive use of state force. He calls this “the fact of oppression.” PL at 37. Since he also—unsurprisingly—holds that oppression is illegitimate, he refrains from offering fundamental and comprehensive principles of how to live. In this way, his insistence on the fact of oppression prompts a marked scaling back of the traditional aims of political philosophy.

The seminal idea of PL is “overlapping consensus.” In an overlapping consensus, each citizen—no matter which of society’s many “comprehensive conceptions” he or she endorses—ends up endorsing the same limited, “political conception” of justice, each for his or her own reasons. The principal role of the overlapping consensus is to replace TJ’s description of wholehearted acceptance. Unlike TJ’s description, the overlapping consensus conceptually reconciles wholehearted acceptance with the fact of reasonable pluralism.

Part of this newer approach is the distinction between “comprehensive conceptions,” which address all questions about how to live, and “political conceptions,” which address only political questions. This distinction has proven somewhat troublesome. The “domain of the political,” as Rawls calls it, is not completely distinct from morality. In concerning himself only with the political, he is not setting aside all moral principles and turning instead to mere strategy or Realpolitik. On the contrary, a political conception “is, of course, a moral conception,” but it is a moral conception that concerns itself only with the basic structure of society. PL at 11. Further, a political conception is one that may be developed in a “freestanding” way, drawing only upon the “very great values” of the political, rather than being presented as deriving from any more comprehensive moral or religious doctrine. PL at 139. A corollary of this approach is that such a political liberalism is not wholly neutral about the good. PL at 191-3. While Justice as Fairness is one such political conception, in PL Rawls makes a point of stressing that it is just one member of the broader family of views he refers to as the “reasonable liberal political conceptions.”

Armed with the idea of an overlapping consensus on a reasonable political conception, Rawls could have contented himself with describing the historical and sociological grounds for hoping that a reasonable overlapping consensus on a political liberalism might be reached. Hope is indeed the leitmotif of PL. E.g., PL at 40, 65, 172, 246, 252, 392. But because Rawls never drops his role as an advocate of political liberalism, he must go beyond such disinterested sociological speculation. He must find and describe ways of advocating this view that are compatible with his full, late recognition of the fact of reasonable pluralism. This attempt is what makes PL so rich, difficult, and interesting.

The difficulty is this: to advocate Justice as Fairness or any other political liberalism as true would be to clash with many comprehensive religious and moral doctrines, including those that simply deny that truth or falsity apply to claims of political morality, as well as those that insist that political-moral truths derive only from some divine revelation. To preserve the possibility of an overlapping consensus on political liberalism, it might be thought that its defenders must deny that political liberalism is simply true, severely hampering their ability to defend it. To cope with this difficulty, Rawls pioneered a stance in political philosophy that mirrored his general personal modesty: a stance of avoidance. Using the “method of avoidance,” Rawls neither asserts nor denies such truth claims. CP at 395. “The central idea,” he writes, “is that political liberalism moves within the category of the political and leaves philosophy as it is.” PL at 375. Perhaps defending political liberalism as the most reasonable political conception is to defend it as true; but, again, Rawls neither asserts nor denies that this is so.

Developing a compelling freestanding presentation of political morality may be possible if we may draw upon a shared set of relevant moral ideas implicit in the “background culture” of democratic societies. PL at 14. Foremost among such shared ideas is the idea of fair cooperation among free and equal citizens. Much of PL is accordingly devoted to recasting the earlier argument for Justice as Fairness in terms that are “political, not metaphysical.” Many of the revisions concern the arguments for various features of the OP. Although these revisions occupy much of PL, they need not be covered further here, as most of them have already been anticipated in the above exposition of TJ. To have structured the exposition in this way is to have sided with those who see considerable unity in Rawls’s work, for example, Wenar (2004). One important change, however, is that PL goes to considerably greater lengths to show that the values to which the view appeals are political, rather than being tied up in any particular comprehensive doctrine. For instance, that citizens are thought of as free is defended, not by general metaphysical truths about human nature, but rather by our widely shared political convictions. “On the road to Damascus Saul of Tarsus becomes Paul the Apostle. Yet such a conversion implies no change in our public or institutional identity.” PL at 31. On the contrary, our political rights ought not to vary with such changes. To think of political rights in this way is to think of citizens as free, in a relevant, political sense.

Instead of seeing a fundamental unity to Rawls’s work, some commentators emphasize what they take to be PL’s new focus on political legitimacy, as distinct from political justice, for example, Estlund (1998) and Dreben (2003). It is certainly true that Rawls prominently deploys a “liberal principle of legitimacy” that was not present in TJ. This principle states that

[O]ur exercise of political power is proper and hence justifiable only when it is exercised in accordance with a constitution the essentials of which all citizens may reasonably be expected to endorse in the light of principles and ideals acceptable to them as reasonable and rational. PL at 217; cf. 137.

This principle thus appears to connect Rawls’s view to that of others working in political and democratic theory who lean on the notion of “reasons that all can accept,” for example, Gutmann and Thompson (1996). Rawls, however, leans more heavily than most on the notion of reasonableness. This is apparent in a late essay, where he writes that “our exercise of political power is proper only when we … reasonably think that other citizens might also reasonably accept those reasons [on which it is based].” CP at 579.

These further qualifications hint at the relatively limited purpose for which Rawls appeals, within PL, to this principle of legitimacy. The principle is part of his account of “public reason” in pluralist societies. This account answers the question: how can we, in political society, reason with one another so as to set priorities and make political decisions, given the fact of reasonable pluralism and the burdens of judgment that make it permanent? Finding reasons that we reasonably think others might accept is a crucial part of the answer. The demand that we do so makes up the core of the duty of civility that binds citizens acting in any official capacity. Rawls’s limits on public reasoning have been highly controversial, but it is important to remember that they form part of his revised thought experiment about stability. The overall question of PL is similar to that of Part Three of TJ: what grounds do we have for thinking that a political liberalism would be stable? In this context, Rawls’s duty of civility may be seen as contributing to his defense of the following conditional claim: if citizens of a pluralist society would abide by such restraints of civility, and if a political liberalism were the object of an overlapping consensus, then that political liberalism would be stable.

To this observation, some of the critics of Rawls’s account of public reason reply that accepting this kind of restraint on public dialogue would be too high a price to pay for a stable liberalism. See Richardson & Weithman vol. 5 (1999). Yet in his last essay on the subject, “The Idea of Public Reason Revisited” (in LP as well as CP), Rawls introduced qualifications to his duty of civility that have mollified some. To begin with, he emphasizes that this stricture is not meant to restrict public discussion in the “background culture” in any way, but only to constrain certain official interactions. He further introduces a “proviso” that allows one to rely, even in official contexts, on reasons dependent on one or another comprehensive doctrine, so long as “in due course” one provides “properly public reasons.” CP at 584. Even this revised account of civility remains highly debatable. Still, it should make a difference to the debate whether we consider the restriction only as part of a hypothetical consideration of the stability of a given well-ordered society (specifically, one that has reached overlapping consensus on some political liberalism) or rather as a doctrine about what civility requires in our society, here and now.

4. Problems of Extension

The modesty and restraint we have noted in Rawls’s general approach are also revealed in the way he set aside a number of difficult questions that properly arise within his self-assigned topic. Complicated as his view is, he was keenly aware of the many simplifying assumptions made by his argument. “We need to be tolerant of simplifications.” TJ at 45-6. His most prominent simplifications are the following two: the assumption (“for the time being”) that society is “a closed system isolated from other societies,” TJ at 7, and the assumption that “all citizens are fully cooperating members of a society over a complete life.” CP at 332; cf. PL at 20. These simplifications set aside questions about international justice and about justice for the disabled. An additional simplifying assumption, implicit in the account of moral development in Part Three of TJ, is that families are just and caring. Relaxing each of these three simplifying assumptions gives rise to important and challenging “problems of extension” for a Rawlsian view.

In The Law of Peoples [LP] (1999), Rawls relaxes the assumption that society is a closed system that coincides with a nation-state. Once this assumption is dropped, the question that comes to the fore is: upon what principles should the foreign policy of a decent liberal regime be founded? Rawls first looks at this question from the point of view of ideal theory, which supposes that all peoples enjoy a decent liberal-democratic regime. At this level, with reference to a rather thinly-described global original position, Rawls develops basic principles concerning non-intervention, respect for human rights, and assistance for countries lacking the conditions necessary for a decent or just regime to arise. These principles govern one nation in its relations with others. He next discusses the principles that should govern decent liberal societies in their relations with peoples who are not governed by decent liberalisms. He articulates the idea of a “decent consultation hierarchy” to illustrate the sort of non-liberal society that is owed considerable tolerance by the people of a decent liberal society. In a part of the book devoted to non-ideal theory, Rawls impressively defends quite restrictive positions on the right of war and on the moral conduct of warfare. Surprisingly, questions of global distributive justice are confined to one brief section of LP. In that section, Rawls treats quite dismissively two earlier attempts to extend his theoretical framework to questions of international justice, those of Beitz (1979) and Pogge (1994). Drawing on the ideas of TJ, these philosophers had developed quite demanding principles of international distributive justice. In LP, Rawls instead favors a relatively minimal “duty of assistance,” with a definite “target and a cut-off point.” LP at 119.

As to justice for the disabled, Rawls never attempted an extension of his theory. He did direct some brief remarks to the topic in Political Liberalism, noting that the view generates a salient distinction between those whose disabilities permanently prevent them from being able to express their higher-order moral powers as fully cooperating citizens and those whose disabilities do not. PL at 183-6. While Rawls limited himself to this observation, Norman Daniels’ work on justice and health care may be viewed as an attempt to extend Rawls’s view in the direction the observation indicates. Daniels (1985). Nussbaum argues that Rawlsian social-contract theory is a deeply flawed basis for addressing questions of justice for the disabled and cannot be well extended to deal with them. Nussbaum (2005).

Responding to critics, Rawls did briefly address justice within the family in “The Idea of Public Reason Revisited.” CP at 595-601; LP at 156-164. He writes that he had “thought that J. S. Mill’s landmark The Subjection of Women … made clear that a decent liberal conception of justice (including what I have called Justice as Fairness) implied equal justice for women as well as men,” but admits that he “should have been more explicit about this.” CP at 595. He there affirms that “the family is part of the basic structure” and is subject to being regulated by the principles of political justice. The laws defining the rights of marriage, divorce, and the ownership and inheritance of property by families and family members are presumably all part of the basic structure of society, as are provisions of the criminal law protecting the basic rights of family members not to be abused.

In the case of the family as in economic transactions, Rawls’s stance illustrates once more how his focus on institutional justice structures his attempt to reconcile freedom and equality. Egalitarian concerns are addressed at the institutional level by ensuring that the appropriate rights and liberties are protected by the basic structure of society. Freedom is preserved by allowing individuals to pursue their reasonable conceptions of the good, whatever they may be, within those constitutional constraints.

5. References and Further Reading

Principal Works by John Rawls:

  • A Theory of Justice, rev. ed., Harvard University Press, 1999 [cited as TJ].
  • Political Liberalism, rev. ed., Columbia University Press, 1996 [cited as PL].
  • Collected Papers, ed. Samuel Freeman, Harvard University Press, 1999 [cited as CP].
  • The Law of Peoples, Harvard University Press, 1999 [cited as LP].
  • Lectures on the History of Moral Philosophy, ed. Barbara Herman, Harvard University Press, 2000.
  • Justice as Fairness: A Restatement, ed. Erin Kelly, Harvard University Press, 2001.
  • Lectures on the History of Political Philosophy, ed. Samuel Freeman, Harvard University Press, 2007.

Two useful gateways to the voluminous secondary literature on Rawls are the following:

  • Henry S. Richardson and Paul J. Weithman, eds., The Philosophy of Rawls (5 vols., Garland, 1999).
  • Samuel Freeman, ed., The Cambridge Companion to Rawls (Cambridge University Press, 2003).

On Rawls’s Life:

  • Thomas Pogge, “A Brief Sketch of Rawls’s Life,” in Richardson & Weithman, eds., Vol. 1, pp. 1-15.

Other Works Cited:

  • Beitz, Charles. 1979. Political Theory and International Relations. Princeton University Press.
  • Daniels, Norman. 1985. Just Health Care. Cambridge University Press.
  • Dreben, Burton. 2003. On Rawls and Political Liberalism. In Freeman, 2003: 316-346.
  • Estlund, David. 1998. The Insularity of the Reasonable. Ethics 108: 252-75.
  • Gutmann, Amy and Dennis Thompson. 1996. Democracy and Disagreement. Harvard University Press.
  • Harsanyi, John C. 1953. Cardinal Utility in Welfare Economics and in the Theory of Risk-Taking. Journal of Political Economy 61: 453-5.
  • MacIntyre, Alasdair. 1984. After Virtue, 2d ed. (1st ed. 1981). University of Notre Dame Press.
  • Nozick, Robert. 1974. Anarchy, State, and Utopia. NY: Basic Books.
  • Nussbaum, Martha C. 2005. Frontiers of Justice: Disability, Nationality, Species Membership. Harvard University Press.
  • Okin, Susan. 1989. Justice, Gender, and the Family. NY: Basic Books.
  • Pogge, Thomas. 1994. An Egalitarian Law of Peoples. Philosophy and Public Affairs 23: 195-224.
  • Richardson, Henry S. 2006. Rawlsian Social Contract Theory and the Severely Disabled. Journal of Ethics 10: 419-462.
  • Sandel, Michael. 1998. Liberalism and the Limits of Justice, 2d ed. (1st ed. 1982). Cambridge University Press.
  • Urmson, J. O. 1950. On Grading. Mind 59: 526-29.
  • Wenar, Leif. 2004. The Unity of Rawls’s Work. Journal of Moral Philosophy 1: 265-275.

Author Information

Henry S. Richardson
Email: richardh@georgetown.edu
Georgetown University
U.S.A.

The Knowledge Argument Against Physicalism

Frank Jackson

The knowledge argument is one of the main challenges to physicalism, the doctrine that the world is entirely physical. The argument begins with the claim that there are truths about consciousness that cannot be deduced from the complete physical truth. For example, Frank Jackson’s Mary learns all the physical truths from within a black-and-white room. Then she leaves the room, sees a red tomato for the first time, and learns new truths—new phenomenal truths about what it is like to see red. The arguer infers that, contrary to physicalism, the complete physical truth is not the whole truth. The physical truth does not determine or metaphysically necessitate the whole truth about the world.

This article discusses that argument’s structure, compares Jackson’s version with others, compares the knowledge argument with other anti-physicalist arguments, and summarizes the main lines of response. Nine controversial assumptions are identified. These are the assumptions that:

  1. the notion of the physical is coherent;
  2. the complete physical truth is accessible to the pre-release Mary;
  3. upon leaving the room, she learns something;
  4. the kind of knowledge she acquires upon leaving the room is informational knowledge, rather than ability knowledge, acquaintance knowledge, or something else;
  5. she gains new information, rather than old information represented in a new way;
  6. if the complete-knowledge claim and the learning claim are true, then what Mary learns when she leaves the room cannot be a priori deduced (deduced by reason alone, without empirical investigation) from the complete physical truth;
  7. if there are phenomenal truths that cannot be a priori deduced from the complete physical truth, then the complete physical truth does not metaphysically necessitate those phenomenal truths;
  8. the knowledge argument and epiphenomenalism are consistent; and
  9. physicalism entails that the physical metaphysically necessitates the phenomenal.

Various criticisms and defenses of these assumptions are discussed.

Table of Contents

  1. Introduction
  2. The Knowledge Intuition and the Inference to Physicalism’s Falsity
  3. Related Arguments
  4. More Physicalist Responses
  5. Non-physicalist Responses
  6. Other Responses
  7. Jackson’s Retraction
  8. Summary of Assumptions and Criticisms
  9. References and Further Reading

1. Introduction

The knowledge argument aims to refute physicalism, the doctrine that the world is entirely physical. Physicalism (also known as materialism) is widely accepted in contemporary philosophy. But some doubt that phenomenal consciousness—experience, the subjective aspect of the mind—is physical. The knowledge argument articulates one of the main forms this doubt has taken.

Frank Jackson gives the argument its classic statement in (Jackson 1982) and (Jackson 1986). He formulates the argument in terms of Mary, the super-scientist. Her story takes place in the future, when all physical facts have been discovered. These include “everything in completed physics, chemistry, and neurophysiology, and all there is to know about the causal and relational facts consequent upon all this, including of course functional roles” (Jackson 1982, p. 51). She learns all this by watching lectures on a monochromatic television monitor. But she spends her life in a black-and-white room and has no color experiences. Then she leaves the room and sees colors for the first time. Based on this case, Jackson argues roughly as follows. If physicalism were true, then Mary would know everything about human color vision before leaving the room. But intuitively, it would seem that she learns something new when she leaves. She learns what it’s like to see colors, that is, she learns about qualia, the properties that characterize what it’s like. Her new phenomenal knowledge includes knowledge of truths. Therefore, physicalism is false.

In the late 1990s, Jackson changed his mind: he now defends physicalism and rejects the knowledge argument. But others defend the argument, and even those who reject it often disagree about where it goes awry. The knowledge argument has inspired a voluminous literature, which contains insights about consciousness, knowledge, the limits of third-person science, and the nature of the physical. It is also discussed in non-philosophical works, including a book by E. O. Wilson (1998), a work of fiction (Lodge 2001), and a television series (Brainspotting). This article discusses the argument’s structure, compares Jackson’s version with others, compares the knowledge argument with other anti-physicalist arguments, and summarizes the main lines of response.

2. The Knowledge Intuition and the Inference to Physicalism’s Falsity

The knowledge argument has two parts. One says that physical knowledge is not sufficient for phenomenal knowledge. Call this the knowledge intuition (Stoljar and Nagasawa, 2004). The other says that the knowledge intuition entails the falsity of physicalism.

Thus described, the knowledge argument is not new with Jackson. Locke and other 17th-century British empiricists discussed the knowledge intuition. C. D. Broad gave a version of the knowledge argument in 1925. And other versions appear in more recent writings, such as Thomas Nagel’s 1974 “What is it Like to be a Bat?” What is distinctive about Jackson’s contribution?

Daniel Stoljar and Yujin Nagasawa (2004) answer this question in their introduction to a volume of essays on the knowledge argument. As they say, Jackson contributes at least two main ideas: his Mary example illustrates the knowledge intuition better than previous attempts; and he provides distinctive reasons for inferring physicalism’s falsity from the intuition. Let us take these points in order.

The Mary case divides the knowledge intuition into three claims:

  • The complete-knowledge claim: before leaving the room, Mary knows everything physical.
  • The learning claim: upon leaving, she learns something.
  • The non-deducibility claim: if the complete-knowledge claim and the learning claim are true, then what Mary learns when she leaves the room cannot be a priori deduced (deduced by reason alone, without empirical investigation) from the complete physical truth.

Physicalists may deny the knowledge intuition. But the Mary case suggests that doing so requires rejecting the complete-knowledge claim, the learning claim, or the non-deducibility claim.

The cases discussed by Broad, Nagel, and others do not deliver this result. Consider, for example, Broad’s “mathematical archangel,” a logically omniscient creature who knows all the physical truths about various chemical compounds. Broad calls these truths “mechanistic” instead of “physical,” but the point is the same. On his view, the archangel would know all such truths but still lack phenomenal knowledge concerning, for example, “the peculiar smell of ammonia.” And Broad infers that physicalism (“mechanism”) is false. But what if the physicalist denies that the archangel would lack the relevant phenomenal knowledge? We appear to be at an impasse. By contrast, if the physicalist claims that, while in the room, Mary knows what it’s like to see colors, he must explain why she seems to acquire this knowledge when she leaves. The Mary case breaks the deadlock in favor of the knowledge intuition. Other illustrations of the intuition that precede Jackson’s have further drawbacks. For example, Nagel’s claim that humans cannot imagine what it’s like to be a bat raises distracting issues about the limits of human imagination, about which physicalism carries no obvious commitments. Mary’s fame is just.

To explain the second of Jackson’s distinctive contributions, it will be useful to explain some terminology and abbreviations. First, there is the distinction between the a priori and the a posteriori. A priori truths are those that are justifiable by reason alone, without empirical investigation. Logical truths provide clear examples. For example, one can figure out without empirical investigation that the following claim is true: if Socrates is mortal, then either Socrates is mortal or Socrates is fat. Compare the claim that Socrates is mortal. While we believe the latter claim to be true, reason alone does not justify this belief. Instead, we rely on experience—empirical investigation. So, while it is a priori that if Socrates is mortal, then either Socrates is mortal or Socrates is fat, it is a posteriori that Socrates is mortal. We may also speak of truths that are a priori deducible from other truths. For example, although “Socrates is mortal” is a posteriori, that same truth is a priori deducible from two other truths: “All men are mortal” and “Socrates is a man.” In other words, the latter two truths, taken together, a priori entail that Socrates is mortal.
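The two examples above can be written in standard logical notation. The following is only an illustrative sketch: the letters M and F, the predicates Man and Mortal, and the constant s are labels introduced here for convenience, not notation taken from the works under discussion.

```latex
% Assumed labels: M = "Socrates is mortal", F = "Socrates is fat",
% s = Socrates, Man(x) = "x is a man", Mortal(x) = "x is mortal".

% An a priori truth: the conditional can be justified by reason alone.
\[ M \rightarrow (M \lor F) \]

% A priori deducibility: the two premises jointly entail the conclusion,
% although the conclusion taken by itself is a posteriori.
\[ \forall x\,(\mathit{Man}(x) \rightarrow \mathit{Mortal}(x)),\;\; \mathit{Man}(s) \;\vdash\; \mathit{Mortal}(s) \]
```

On this way of putting it, what is a priori in the second case is the step from premises to conclusion, not the conclusion itself.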

Second, there is the notion of metaphysically necessary truths. A necessary truth is a truth that could not have failed to be the case. Logical truths again provide clear examples: “Either Socrates is mortal or it is not the case that Socrates is mortal” is usually regarded as necessary. Contrast that truth with “Socrates is mortal.” The latter is not necessary. Truths that are not necessary are also known as contingent. Philosophers often distinguish between different strengths or kinds of necessity. For example, there is arguably a sense in which it is a necessary truth that pigs cannot fly like birds. But if the laws of nature were different, then perhaps pigs would be able to fly like birds. So, perhaps it is not metaphysically impossible that pigs should be able to fly like birds. A metaphysically necessary truth is a truth that is necessary in the strictest possible sense: a truth that holds not just because of contingent laws of nature. Saul Kripke (1972) famously argues that there are metaphysically necessary truths that are not truths of pure logic. Indeed, he argues that there are metaphysically necessary truths that are not a priori. For example, on his view, that water is H2O is metaphysically necessary but a posteriori. He recognizes that there could have been substances that resemble water (substances that share water’s superficial qualities, such as its taste and visual appearance) but with a different molecular structure. But, he argues, these substances would not be water.
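These contrasts can be summarized in the notation of modal logic. This is a rough sketch only: reading the box as “it is metaphysically necessary that,” writing A for “it is a priori that,” and using the letters M and W are all assumptions made here for illustration.

```latex
% Assumed notation: \Box\varphi = "it is metaphysically necessary that \varphi"
% (\Box needs the amssymb or latexsym package);
% \mathsf{A}\varphi = "it is a priori that \varphi";
% M = "Socrates is mortal"; W = "water is H2O".

% A logical truth, usually regarded as necessary:
\[ \Box(M \lor \neg M) \]

% A contingent truth: true, but not necessary:
\[ M \land \neg\Box M \]

% Kripke's claim: necessary, yet not knowable a priori:
\[ \Box W \land \neg\mathsf{A}\,W \]
```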

Third, let us introduce some abbreviations. On Jackson’s version of the knowledge argument, the assumption that Mary knows the complete physical truth about the world does not guarantee that she will be able to figure out the complete truth about human color vision. His reasoning involves the idea of the complete physical truth. Call the complete physical truth P. P can be seen as a long conjunction of all the particular physical truths, which, according to Jackson, Mary learns from watching science lectures. What about the truths that, according to Jackson, Mary does not learn until she leaves the room? Those would be included in the psychological truths about the world. Call the complete psychological truth Q. Finally, consider what Stoljar and Nagasawa call “the psychophysical conditional”: if P then Q, where P is the complete physical truth and Q is the complete psychological truth. As we will see, part of Jackson’s reasoning can be understood in terms of his view about the psychophysical conditional.

We are now in a position to state the second of Jackson’s distinctive contributions to the discussion of the knowledge argument. This contribution concerns his inference from the knowledge intuition to physicalism’s falsity. His inference assumes that if physicalism is true then the complete truth about human color vision is a priori deducible from the complete physical truth. But here a problem arises: why accept this assumption? Consider the psychophysical conditional, if P then Q (again, P is the complete physical truth and Q is the complete psychological truth). As Jackson conceives of physicalism, physicalism entails that the psychophysical conditional is a priori. If he is right, then all truths about color vision would be deducible from P (the complete physical truth). But here physicalists have a natural, obvious response: why not instead characterize physicalism as a Kripkean a posteriori necessity, akin to “water is H2O”? On this characterization, the psychophysical conditional is metaphysically necessary but not a priori.
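The disagreement just described can be laid out schematically. The sketch below is a simplification under stated assumptions: Phys abbreviates “physicalism is true,” A marks apriority, the box marks metaphysical necessity, and the knowledge intuition is represented as the denial that the psychophysical conditional is a priori. None of these symbols other than P and Q comes from the text.

```latex
% Assumed notation: P = complete physical truth, Q = complete psychological truth,
% \mathit{Phys} = "physicalism is true", \mathsf{A}\varphi = "\varphi is a priori",
% \Box\varphi = "\varphi is metaphysically necessary" (needs amssymb or latexsym).

% Jackson's inference: his conception of physicalism plus the knowledge intuition.
\[ \mathit{Phys} \rightarrow \mathsf{A}(P \rightarrow Q),\;\; \neg\mathsf{A}(P \rightarrow Q) \;\vdash\; \neg\mathit{Phys} \]

% The a posteriori physicalist accepts the second premise but rejects the first,
% holding instead that the conditional is necessary without being a priori:
\[ \Box(P \rightarrow Q) \land \neg\mathsf{A}(P \rightarrow Q) \]
```

Put this way, the dispute concerns the first premise: whether physicalism is committed to the apriority of the conditional or only to its necessity.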

In later work, Jackson criticizes this response. His argument is complex, but the basic idea is simple enough. In a 1995 “Postscript,” he reasons as follows. Consider the argument:

H2O covers most of the planet.
Therefore, water covers most of the planet.

The premise necessitates, but does not a priori entail, the conclusion. But, Jackson asks, why is there no a priori entailment? On his view, there is no such entailment because the argument’s premise gives us only part of the physical story. It is also part of the physical story that H2O does the other things that water does, that is, that H2O plays the water role. Playing the water role includes such things as being a substance that occupies oceans and lakes, looks clear to us, has little or no taste, is referred to as “water”, etc. So, let us add the following premise to the argument displayed above:

H2O plays the water role.

Now, says Jackson, the premises do a priori entail the conclusion. Moral: “a rich enough story about the H2O way things are does enable the a priori deduction of the water way things are” (Jackson 1995, p. 413). Likewise, physicalism entails that “knowing a rich enough story about the physical nature of our world is tantamount to knowing the psychological story about our world” (Jackson 1995, p. 414). But if physicalism is true, P should provide just that: a rich enough story. Thus, Jackson concludes, physicalism entails the apriority of the psychophysical conditional after all.
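Jackson’s water example can be given the same schematic treatment. The symbols below are labels assumed here for illustration, and the formulas report only what Jackson is said above to claim; they are not a reconstruction of the full argument of the “Postscript.”

```latex
% Assumed labels: C_H = "H2O covers most of the planet",
% R = "H2O plays the water role", C_W = "water covers most of the planet",
% \mathsf{A}\varphi = "\varphi is a priori",
% \Box\varphi = "\varphi is metaphysically necessary" (needs amssymb or latexsym).

% With the single premise: necessitation without a priori entailment.
\[ \Box(C_H \rightarrow C_W) \land \neg\mathsf{A}(C_H \rightarrow C_W) \]

% With the added premise about the water role: a priori entailment.
\[ \mathsf{A}\bigl((C_H \land R) \rightarrow C_W\bigr) \]
```

The intended analogy is that P stands to Q as the enriched premise set stands to the conclusion about water: if P is rich enough, then, on Jackson’s view, the psychophysical conditional is a priori.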

Jackson’s argument is controversial. But in developing it, he fills an important lacuna in the knowledge argument and thereby improves on earlier versions. Others, too, have attempted to fill this lacuna. Most notably, David Chalmers (1996, 2003, 2004, and 2006a) has given sophisticated arguments to this end, which are partly inspired by Jackson’s argument.

3. Related Arguments

The knowledge argument is one of several ways to articulate the suspicion that phenomenal consciousness is not physical. Another common way of articulating the doubt is through the conceivability argument. This argument descends from René Descartes’ main argument for mind-body substance dualism. He argued that, since he can clearly and distinctly conceive of his mind without his body and his body without his mind, they can exist without each other and are therefore distinct substances.

Contemporary versions of the conceivability argument usually rely on thought experiments concerning qualia. One such thought experiment involves inverted qualia. It seems conceivable that there be an individual exactly like me, except he and I are red/green inverted. We are physically and functionally identical, but the color experiences he has when viewing a ripe tomato (in normal light, without special contact lenses, and so forth) resemble the color experiences I have when viewing a ripe zucchini, and vice versa. Such a person would be my inverted twin. Likewise, it seems conceivable that there be a world exactly like ours in all physical and functional respects but without phenomenal consciousness. Creatures that lack consciousness but are physically and functionally identical to ordinary human beings are called zombies. If it is conceivable that there be creatures such as my inverted twin or my zombie twin, then, the conceivability argument runs, this supports the metaphysical possibility of such creatures. And most agree that if such creatures are metaphysically possible, then phenomenal consciousness is neither physical nor functional: physicalism is false.

Yet another related argument is the explanatory argument. This argument begins with the premise that physicalist accounts explain only structure (such as spatiotemporal structure) and function (such as causal role). Then it is argued that explaining structure and function does not suffice to explain consciousness, and so physicalist accounts are explanatorily inadequate.

As Chalmers (2003) notes, the knowledge argument, the conceivability argument, and the explanatory argument can be seen as instances of a general, three-step argument. The first step is to establish an epistemic gap between the physical and phenomenal domains. In the case of the knowledge argument, the gap is often put in terms of a priori deducibility: there are phenomenal truths that cannot be a priori deduced from physical truths. In the case of the conceivability argument, the gap is put in terms of conceivability: it is conceivable that there be inverted qualia or zombies. And in the case of the explanatory argument, the point is put in terms of an explanatory gap. After establishing an epistemic gap, these arguments take a second step and infer a corresponding metaphysical gap: a gap in the world, not just in our epistemic relation to it. The knowledge argument infers a difference in type of fact. The conceivability argument infers the metaphysical possibility of inverted qualia or zombies. And the explanatory argument infers that there are phenomena that cannot be physically explained. As a third step, all three results appear to conflict with physicalism. There are important differences among the arguments, and it is not obvious that they stand or fall together. Nevertheless, it is worth noting that they follow a single abstract pattern.

4. More Physicalist Responses

Most physicalist responses to the knowledge argument fall into three categories: those that reject the inference to physicalism’s falsity and thus deny the metaphysical gap; those that reject the knowledge intuition and thus deny the epistemic gap; and those that derive an absurdity from Jackson’s reasoning.

We have already noted one way of rejecting the inference from the knowledge intuition to physicalism’s falsity: one could defend a version of physicalism on which the psychophysical conditional is necessary but not a priori. There are other ways of rejecting the inference. One is to reject the assumption that phenomenal knowledge is propositional knowledge—knowledge of truths or information. That is, one could argue that the type of knowledge Mary gains when she leaves the room is non-propositional. The most popular version of this view is based on the ability hypothesis, the claim that to know what it’s like is to possess certain abilities, such as the ability to imagine, recognize, and remember experiences. On this view, Mary’s learning consists in her acquiring abilities rather than learning truths. As the view is sometimes put, she gains know-how, not knowledge-that. There are other versions, including the view that upon leaving the room Mary acquires only non-propositional acquaintance knowledge (Conee 1994, Bigelow and Pargetter 1990). On this version, her learning consists, not in acquiring information or abilities, but in becoming directly acquainted with the phenomenal character of color experiences, in the way that one can become acquainted with a city by visiting it.

These views allow the physicalist to accept the knowledge intuition without facing objections that Jackson, Chalmers, and others bring against a posteriori physicalism. But other problems arise. Regarding the ability hypothesis, some doubt that Mary’s learning could consist only in acquiring abilities. Her new knowledge appears to have characteristic marks of propositional knowledge because its content can be embedded in conditionals such as “if seeing red is like this, then it is not like that” (Loar 1990/97). And some philosophers question the significance of the distinction between know-how and knowledge-that on which the strategy of the ability-hypothesis seems to rely (Alter 2023, Stanley and Williamson 2001).

The idea that Mary acquires only acquaintance knowledge has similar difficulties. It is not clear that all she acquires is acquaintance knowledge or that the requisite distinction between acquaintance knowledge and propositional knowledge is tenable. Also, there is a danger of trading on an ambiguity: sometimes “acquaintance” refers to knowledge, sometimes to experience. On the former, epistemic interpretation, it is unclear that Mary’s new “acquaintance knowledge” includes no factual component. And on the latter, experiential interpretation, the acquaintance hypothesis trivializes the learning claim: no one denies that when Mary leaves the room she has new experiences.

Another way to reject the inference to physicalism’s falsity is to argue that Mary’s learning consists in acquiring new ways to represent facts she knew before leaving the room (Loar 1990, 1997, Lycan 1996, Horgan 1984, McMullen 1985, Pereboom 1994, Tye 2002). This view is often combined with an appeal to a posteriori necessity (see section 2 above). But it need not be: one could argue that while the psychophysical conditional is a priori knowable by those who possess the relevant phenomenal concepts, Mary lacks those concepts before leaving the room. The main challenge for this view concerns the status of her new concepts. It is not enough to say that she gains some new concept or other: her conceptual gain must explain her gain in knowledge. The concern is that any concepts adequate to the task—such as the concept having an experience with phenomenal feel f—might incorporate a non-physical component (Chalmers 2006b).

Philosophers have also devised ways to reject the knowledge intuition. Some believe that intuitions based on hypothetical cases should be given little or no weight. Also, specific strategies for rejecting the knowledge intuition have been developed. One is to reject the learning claim: to argue that on reflection Mary does not learn anything when she leaves the room. Some defend this position by arguing that we simply underestimate the power of complete physical knowledge. Suppose we try to fool Mary by greeting her when she leaves the room with a blue banana. Would she be fooled into thinking that seeing yellow is what we would describe as seeing blue? Not necessarily. She could use a brain scanner (perhaps a descendant of a PET device) to examine her own brain processes. She would notice that her brain processes correspond to those of people having blue experiences, and thereby evade our trap. Maybe our intuition that she learns something fails to take this sort of consideration into account (Dennett 1991, 2006). But other philosophers doubt that the intuition derives from any such error.

Another way to reject the knowledge intuition is to challenge the complete-knowledge claim: to argue that not all physical facts about seeing colors can be learned by watching black-and-white lectures. On this view, a fact might be physical but not discursively learnable. How could this be?

Some (for example, Horgan 1984) use “physical” broadly, so that the physical truths include high-level truths necessitated by the microphysical truths. These physicalists argue that phenomenal truths are themselves high-level physical truths, and that it is question-begging to assume that Mary knows all the physical truths simply because she watches lectures on chemistry, physics, etc. Chalmers (2004, 2006a) suggests a natural response to this move: use “physical” narrowly, so that the physical truths include only the microphysical truths (or those plus the truths in chemistry or some other specified domains). It is harder to deny that such truths would be accessible to the pre-release Mary. Of course, this entails that high-level biological truths, for example, will count as non-physical, and thus the existence of non-physical truths will not itself defeat physicalism. But if Jackson’s reasoning is sound, then there are phenomenal truths that are not metaphysically necessitated by the narrowly physical truths, and that result would defeat physicalism.

On another version of the view that the complete-knowledge claim is false, Mary’s science lectures allow her to deduce the truths involving structural-dynamical properties of physical phenomena, but not their intrinsic properties. The knowledge argument does not appear to refute this view. If this view can reasonably be called a physicalist view, then there is at least one version of physicalism that the knowledge argument appears to leave unchallenged. However, it is unclear that this is a significant deficiency. Arguably, on the view in question, consciousness (or protoconsciousness) is a fundamental feature of the universe—or at least no less fundamental than the properties describable in the language of physics, chemistry, etc. That sounds like the sort of view the knowledge argument should be used to establish, not refute.

5. Non-physicalist Responses

If we accept the knowledge argument, then how should we understand the relationship between consciousness and the physical world? Jackson (1982) defends epiphenomenalism, on which phenomenal properties or qualia are caused by but do not cause physical phenomena. But epiphenomenalism is only one non-physicalist view that the knowledge argument leaves open. For example, the knowledge argument is also consistent with interactionist dualism, on which there is two-way causal interaction between the mental and the physical. The knowledge argument is also consistent with Russellian monism, on which phenomenal properties (or protophenomenal properties) are the categorical, intrinsic bases of physical properties, which are at bottom dispositional and relational.

All of these views have significant costs and benefits. For example, interactionist dualism is commonsensical but hard to reconcile with the popular view that the physical world is causally closed, that is, the view that every physical event has a sufficient physical cause. To take another example: epiphenomenalism preserves causal closure but seems to conflict with the widespread naturalistic assumption that consciousness is an integrated part of the natural world.

Historically, epiphenomenalism is associated with Huxley (1874), interactionist dualism with Descartes (1641), and Russellian monism with Russell (1927). For later versions, see Jackson (1982) and Robinson (1982b, 1988) for epiphenomenalism; see Popper and Eccles (1977), Hart (1988), Foster (1991), and Hodgson (1991) for interactionist dualism; and see Rosenberg (2004), Chalmers (2013), Alter and Nagasawa (2015), and Goff (2017) for Russellian monism.

6. Other Responses

Some claim that Jackson’s position is internally inconsistent (Watkins 1989, Campbell 2003). The argument runs roughly as follows. On the knowledge argument, Mary acquires knowledge when she leaves the room because she has states with new qualia. But this is impossible if, as Jackson (1982) suggests, epiphenomenalism is true: on epiphenomenalism, qualia are causally inefficacious; so, how can qualia produce an increase in knowledge? So, Jackson cannot consistently maintain both epiphenomenalism and the learning claim.

However, the sort of epiphenomenalism Jackson defends implies, not that phenomenal features are inefficacious, but only that they have no effects on physical phenomena. He might therefore reply that phenomenal knowledge is not a physical phenomenon, and thus qualia may indeed cause Mary to acquire it. Also, he can reasonably complain that the objection assumes a causal theory of knowledge that is not appropriate for phenomenal knowledge (Nagasawa 2010).

Despite the availability of these replies, there is a serious problem in the vicinity of the inconsistency objection. We should expect physical or functional explanations of our judgments about qualia. But if the knowledge argument is sound, then qualia would seem to be explanatorily irrelevant to these judgments—including the judgment that qualia cannot be explained in physical or functional terms. This is what David Chalmers calls “the paradox of phenomenal judgment” (Chalmers 1996, chapter 5). It appears to be a real problem, which arises for any non-physicalist theory of consciousness.

Another important response to the knowledge argument should be noted. The argument seems to assume that “physical” has a clear meaning. But whether this notion can be adequately defined is not obvious. One problem is “Hempel’s dilemma” (Hempel 1966, Montero 1999). Arguably, we should not define the physical in terms of current physics, because current physics will be extended and presumably revised in substantial ways. We could define it in terms of ideal physics. But who knows what ideal physics will look like? Future physics may involve novel concepts that we cannot begin to imagine. If “physical” is defined in terms of such unknown concepts, then how can we judge whether Mary could learn all the physical facts from black-and-white lectures? And how else should we define the notion except by appeal to (current or ideal) physics?

Some take such considerations to show that the debate over whether consciousness is physical is misguided or meaningless (Chomsky 1980, 1988, Crane and Mellor 1990). But the difficulty may be surmountable (Wilson 2006). On one view, ideal physics will not be wholly unrecognizable: like today’s physics, it will be concerned entirely with structure and dynamics. And one may be able to argue that any structural/dynamical properties can in principle be imparted by black-and-white lectures.

7. Jackson’s Retraction

As we noted earlier, Jackson (1998, 2003, 2007, 2019) has come to embrace physicalism and reject the knowledge argument. More specifically, he rejects the claim that Mary learns new truths when she leaves the room. He argues that this claim derives from a mistaken conception of sensory experience—a conception that he thinks should be replaced with representationalism, the view that phenomenal states are representational states. Interestingly, he combines this view with the ability hypothesis. He writes,

Those who resist accounts in terms of ability acquisition tend to say things like “Mary acquires a new piece of propositional knowledge, namely, that seeing red is like this”, but for the representationalist there is nothing suitable to be the referent of the demonstrative.

We have ended up agreeing with Laurence Nemirow and David Lewis [the authors of the ability-hypothesis strategy] on what happens to Mary on her release. But, for the life of me, I cannot see how we could have known they were right without going via representationalism. (Jackson 2003, p. 439)

It is unclear why Jackson’s representationalism leads him to embrace the ability hypothesis. Despite his commitments to physicalism and the apriority of the psychophysical conditional, he has other options. For example, instead of explaining Mary’s epistemic progress in terms of newly acquired abilities, he might argue that her “progress” is an illusion; in other words, he might reject the learning claim. Moreover, it may be possible to formulate a representationalist version of the knowledge argument that inherits the force of the original (Alter, 2023).

8. Summary of Assumptions and Criticisms

As we have seen, the knowledge argument depends on several controversial assumptions. It will be useful to summarize some of these assumptions and some criticisms of them. I will also mention some sources for relevant arguments.

Assumption 1: The coherence of the notion of the physical: physicalism is a substantive doctrine with non-trivial content.

Criticism 1: The notion of the physical is not well defined, and there is no substantive issue of whether physicalism is true (Chomsky 1980, 1988, Crane and Mellor 1990; cf. Montero 1999). For replies, see Chalmers (1996, 2004), Stoljar (2000), and Wilson (2006).

Assumption 2: The complete-knowledge claim (“truths” version): before leaving the room, Mary knows all physical truths.

Criticism 2a: Pre-release Mary does not know all the physical truths, because high-level physical truths cannot in general be a priori deduced from low-level physical truths (Horgan 1984, van Gulick 2004, Block and Stalnaker 1999). For replies, see Chalmers (2004) and Chalmers and Jackson (2001).

Criticism 2b: Pre-release Mary does not know all the physical truths, because truths about the intrinsic properties of physical phenomena cannot be discursively learned (Stoljar 2000, Howell 2013). For replies, see Chalmers (2004).

Assumption 3: The learning claim: upon leaving the room, Mary learns something.

Criticism 3a: We think Mary learns something because we fail to appreciate the implications of knowing all physical truths (Foss 1989, Stemmer 1989, Dennett 1991, 2004). For replies, see Chalmers (1996), Alter (2023), Robinson (1993), and Jacquette (1995).

Criticism 3b: We think Mary learns something because we fail to recognize that phenomenal properties are just representational properties (Jackson 2003, 2007, 2019). For a reply, see Alter (2023).

Criticism 3c: Mary gains only unjustified beliefs (Beisecker 2000).

Assumption 4: The non-deducibility claim: if Mary learns new phenomenal truths when she leaves the room, then those truths cannot be a priori deduced from the complete physical truth.

Criticism 4: Mary cannot deduce certain phenomenal truths from the complete physical truth only because she lacks the relevant concepts, such as the concept of phenomenal redness. Thus, even though Mary cannot deduce Q from P, the psychophysical conditional is a priori for those who have the relevant concepts (Kirk 2005, Montero 2007). For replies, see Alter (2023), Chalmers (2004) and Stoljar (2005).

Assumption 5: The propositional-knowledge claim: the kind of knowledge Mary gains upon leaving the room is propositional or factual—knowledge of information or truths.

Criticism 5a: Mary gains only abilities (Lewis 1983, 1988, Nemirow 1990, Mellor 1993, Meyer 2001). For replies, see Jackson (1986), Bigelow and Pargetter (1990), Loar (1990/97), Coleman (2009), Conee (1994), Nida-Rümelin (1995), Lycan (1996), Alter (2023), Gertler (1999), Tye (2002, chapter 1), Raymont (1999), and Papineau (2002). For counter-replies, see Tye (2002, chapter 1) and Nemirow (2007).

Criticism 5b: Mary gains only acquaintance knowledge (Conee 1994, Tye 2009, Pitt 2019). For replies, see Alter (2023) and Gertler (1999).

Criticism 5c: Mary gains non-propositional knowledge that does not fit easily into folk categories (Churchland 1985, 1989).

Assumption 6: The new-information claim: the information Mary gains upon leaving the room is genuinely new to her.

Criticism 6: Mary merely comes to know truths she already knew under new, phenomenal representations. This view is sometimes called the old-fact/new-representation view. It comes in at least two versions. On one, phenomenal knowledge is assimilated to indexical knowledge: Mary’s “learning” is comparable to the absent-minded U.S. historian’s learning that today is July 4th, America’s Independence Day (McMullen 1985). For replies, see Chalmers (1996, 2004). Another version attaches the old-fact/new-representation view to a posteriori physicalism. Advocates of this version include Loar (1990/97), Lycan (1996), Horgan (1984), and Pereboom (1994). For replies, see Alter (2023), Chalmers (1996, 2003, 2004), and Stoljar (2000).

Assumption 7: The claim that the knowledge intuition entails non-necessitation: if there are phenomenal truths that cannot be a priori deduced from the complete physical truth, then the complete physical truth does not metaphysically necessitate those phenomenal truths.

Criticism 7: Physicalism is an a posteriori necessity and is therefore compatible with the claim that the phenomenal truths are not deducible from the complete physical truth. For references, see the second version of criticism 6 above.

Assumption 8: The consistency claim: the knowledge argument and non-physicalism are consistent.

Criticism 8: The assumption that Mary gains knowledge is inconsistent with epiphenomenalism (Watkins 1989, Campbell 2003). For replies, see Nagasawa (2010).

Assumption 9: The assumption that physicalism entails that the physical metaphysically necessitates the phenomenal.

Criticism 9: Physicalism might be true even if the physical did not necessitate the chemical or the biological. Likewise, physicalism might be true even if the physical did not necessitate the phenomenal (Montero 2013, Montero and Brown 2018, Zhong 2021). For a reply, see Alter (2023).

The knowledge argument rests on other assumptions. For example, one is that if Mary gains new, non-physical information, then there are non-physical properties. Another is that if there are truths that are not metaphysically necessitated by the complete physical truth, then physicalism is false. For a detailed analysis and defense of the knowledge argument, see Alter (2023).

Some critics combine elements of different criticisms. For example, Michael Pelczar’s (2005) criticism appears to contain elements of the acquaintance hypothesis and the old-fact/new-representation view; Jackson both rejects the learning claim and endorses the ability hypothesis (Jackson 2003); and Robert van Gulick (2004) argues that the various physicalist criticisms of the knowledge argument can be seen as parts of a single, coherent reply. Those who endorse the knowledge argument (in addition to Jackson, before he changed his mind) include Robinson (1982a), Nida-Rümelin (1995), Chalmers (1996, 2004), Alter (2023), and Gertler (1999).

William Lycan (2003) writes, “Someday there will be no more articles written about the ‘Knowledge Argument’… That is beyond dispute. What is less certain is, how much sooner that day will come than the heat death of the universe.” At least for now, however, the knowledge argument continues to inspire fruitful reflection on the nature of consciousness and its place in the natural world.

9. References and Further Reading

  • Alter, Torin. 2023. The Matter of Consciousness: From the Knowledge Argument to Russellian Monism. Oxford: Oxford University Press.
  • Alter, Torin, and Nagasawa, Yujin (eds.) 2015. Consciousness in the Physical World: Perspectives on Russellian Monism. New York: Oxford University Press.
  • Alter, Torin, and Walter, Sven (eds.) 2007. Phenomenal Concepts and Phenomenal Knowledge: New Essays on Consciousness and Physicalism. New York: Oxford University Press.
  • Beisecker, David. 2000. “There’s Something about Mary: Phenomenal Consciousness and Its Attributions”, Southwest Philosophy Review, 16, 143-52.
  • Block, N. & Stalnaker, R. 1999. Conceptual Analysis, Dualism, and the Explanatory Gap. Philosophical Review 108: 1-46.
  • Brainspotting. 1994. U.K. television series.
  • Broad, C. D. 1925. The Mind and its Place in Nature, London: Routledge and Kegan Paul.
  • Bigelow, John, and Robert Pargetter. 1990. “Acquaintance with Qualia”, Theoria, 61, 129-47.
  • Campbell, Neil 2003. “An Inconsistency in the Knowledge Argument”, Erkenntnis, 58, 261-66.
  • Chalmers, David J. 1996. The Conscious Mind: In Search of a Fundamental Theory, New York: Oxford University Press.
  • Chalmers, David J. 2003. “Consciousness and its Place in Nature” in S. Stich and T. Warfield (eds.), The Blackwell Guide to the Philosophy of Mind, Oxford: Blackwell. Reprinted in D. Chalmers (ed.), The Philosophy of Mind: Classical and Contemporary Readings, (2002): 247–272, New York: Oxford University Press.
  • Chalmers, David J. 2004. “Phenomenal Concepts and the Knowledge Argument.” In Ludlow, et al. (2004), pp. 269-98.
  • Chalmers, David J. 2007. “Phenomenal Concepts and the Explanatory Gap”. In T. Alter and S. Walter 2007, pp. 167-94.
  • Chalmers, David J. 2013. “Panpsychism and panprotopsychism.” Amherst Lecture in Philosophy: http://www.amherstlecture.org/index.html. Also in T. Alter and Y. Nagasawa 2015, pp. 246-76.
  • Chalmers, David J. and Jackson, Frank (2001). Conceptual Analysis and Reductive Explanation. Philosophical Review 110: 315-61.
  • Chomsky, Noam. 1980. Rules and Representations, New York: Columbia University Press.
  • Chomsky, Noam. 1988. Language and Problems of Knowledge: The Managua Lectures, Cambridge, MA: MIT Press.
  • Churchland, Paul. 1985. “Reduction, Qualia, and the Direct Introspection of Brain States”, Journal of Philosophy, 82, 8-28.
  • Churchland, Paul. 1989. “Knowing Qualia: A Reply to Jackson”, in A Neurocomputational Perspective, Cambridge: MIT Press, 67-76.
  • Coleman, Sam. (ed.) 2019. The Knowledge Argument. Cambridge: Cambridge University Press.
  • Coleman, Sam. 2009. “Why the ability hypothesis is best forgotten.” Journal of Consciousness Studies 16: 74-97.
  • Conee, Earl. 1994. “Phenomenal Knowledge”, Australasian Journal of Philosophy, 72, 136-50.
  • Crane, Tim and Hugh Mellor 1990. “There is no question of physicalism”, Mind, 99, 185-206.
  • Dennett, Daniel C. 1991. Consciousness Explained, Boston: Little Brown and Company.
  • Dennett, Daniel C. 2005. Sweet Dreams: Philosophical Obstacles to a Science of Consciousness. Cambridge, MA: MIT Press.
  • Dennett, Daniel C. 2006. “What RoboMary Knows”. In T. Alter and S. Walter 2007, pp. 15-31.
  • Descartes, René. Meditations on First Philosophy. 1641.
  • Foster, J. 1991. The Immaterial Self: A Defense of the Cartesian Dualist Conception of Mind. Routledge.
  • Foss, Jeff. 1989. “On the Logic of What It Is Like to be a Conscious Subject”, Australasian Journal of Philosophy 67, pp. 305-20.
  • Gertler, Brie 1999. “A Defense of the Knowledge Argument”, Philosophical Studies, 93, 317-36.
  • Goff, Philip. 2017. Consciousness and Fundamental Reality. New York: Oxford University Press.
  • Hart, W. D. 1988.  Engines of the Soul. Cambridge: Cambridge University Press.
  • Hempel, Carl. 1966. Philosophy of Natural Science. Englewood Cliffs, New Jersey: Prentice Hall.
  • Hodgson, D. 1991. The Mind Matters: Consciousness and Choice in a Quantum World. New York: Oxford University Press.
  • Horgan, Terence 1984. “Jackson on Physical Information and Qualia”, Philosophical Quarterly, 34, 147-52.
  • Howell, R. J. 2013. Consciousness and the Limits of Objectivity: The Case for Subjective Physicalism. Oxford: Oxford University Press.
  • Huxley, Thomas H. 1874. “On the Hypothesis that Animals are Automata, and its History”. In D. Chalmers (ed.) The Philosophy of Mind. New York: Oxford University Press, 2002, 24-30.
  • Jackson, Frank. 1982. “Epiphenomenal Qualia”, Philosophical Quarterly, 32, 127-36.
  • Jackson, Frank. 1986. “What Mary Didn’t Know”, Journal of Philosophy, 83, 291-5.
  • Jackson, Frank. 1995. “Postscript”, in Contemporary Materialism, ed. by Paul K. Moser and J. D. Trout, New York: Routledge, 184-9.
  • Jackson, Frank. 1998. “Postscript on Qualia.” In his Mind, Method, and Conditionals: Selected Essays: 76-79. London: Routledge.
  • Jackson, Frank. 2003. “Mind and Illusion”, in  Minds and Persons: Royal Institute of Philosophy Supplement 53, ed. by Anthony O’Hear, Cambridge: Cambridge University Press, 251-271.
  • Jackson, Frank. 2007. “The knowledge argument, diaphanousness, representationalism.” In T. Alter and S. Walter 2007, pp. 52-64.
  • Jackson, Frank. 2019. “The knowledge argument meets representationalism about colour experience.” In S. Coleman 2019, pp. 102-17.
  • Jacquette, Dale. 1995. “The Blue Banana Trick: Dennett on Jackson’s Color Scientist,” Theoria 61, pp. 217-30.
  • Kirk, Robert. 2005. Zombies and Consciousness. New York: Oxford University Press.
  • Kripke, Saul. 1972. “Naming and Necessity”. In The Semantics of Natural Language. Ed. G. Harman and D. Davidson. Dordrecht: Reidel. Reprinted as Naming and Necessity. Cambridge, MA: Harvard University Press, 1980.
  • Lewis, David. 1983. “Postscript to ‘Mad Pain and Martian Pain.’” In his Philosophical Papers, vol. 1. New York: Oxford University Press, pp. 130-32.
  • Lewis, David. 1988. “What Experience Teaches”, Proceedings of the Russellian Society (University of Sydney). Reprinted in Lycan (1999); Block, Flanagan and Güzeldere (1997).
  • Loar, Brian. 1990. “Phenomenal States”, in Philosophical Perspectives IV: Action Theory and the Philosophy of Mind, ed. by James Tomberlin, Atascadero: Ridgeview Publishing, 81-108.
  • Loar, Brian. 1997. “Phenomenal States (Revised Version)”, in The Nature of Consciousness, ed. by Ned Block, Owen Flanagan and Güven Güzeldere, Cambridge, MA: MIT Press, 597-616.
  • Locke, John. 1690.  An Essay Concerning Human Understanding.
  • Lodge, David. 2001.  Thinks… London: Secker and Warburg Random House.
  • Ludlow, P., Y. Nagasawa, and D. Stoljar, eds. 2004. There’s Something about Mary: Essays on Phenomenal Consciousness and Frank Jackson’s Knowledge Argument. Cambridge: MIT Press.
  • Lycan, William G. 1996. Consciousness and Experiences, Cambridge, MA: MIT Press.
  • Lycan, William G. 2003. “Perspectival Representation and the Knowledge Argument,” in Q. Smith and A. Jokic (eds.), Consciousness: New Philosophical Essays (Oxford: Oxford University Press).
  • McMullen, C. (1985), “‘Knowing What It’s Like’ and the Essential Indexical”, Philosophical Studies 48, pp. 211-33.
  • Mellor, D. H. 1993. “Nothing Like Experience.” Proceedings of the Aristotelian Society 93, pp. 1-16.
  • Meyer, Uwe. 2001. “The Knowledge Argument, Abilities, and Metalinguistic Beliefs”, Erkenntnis, 55, 325-47.
  • Montero, Barbara Gail. 1999. “The Body Problem,” Noûs, Vol. 33, No. 3 (1999) p. 183-20.
  • Montero, B. G. 2007. “Physicalism could be true even if Mary learns something new.” The Philosophical Quarterly 57: 176-89.
  • Montero, Barbara Gail, and Brown, Christopher Devlin. 2018. “Making room for a this-worldly physicalism.” Topoi 37 (3): 523-32.
  • Nagasawa, Y. 2010. The knowledge argument and epiphenomenalism. Erkenntnis 72: 37–56.
  • Nagel, Thomas. 1974. “What Is it Like to Be a Bat?”, Philosophical Review, 83, 435-50.
  • Nemirow, Laurence. 1990. “Physicalism and the Cognitive Role of Acquaintance”, in Mind and Cognition: A Reader, ed. by William G. Lycan, Oxford: Blackwell, 490-9.
  • Nemirow, Laurence. 2007. “So This is What it’s Like: a Defense of the Ability Hypothesis.” In T. Alter and S. Walter 2007, pp. 32-51.
  • Nida-Rümelin, Martine 1995. “What Mary Couldn’t Know: Belief About Phenomenal States”, in Conscious Experience, ed. by Thomas Metzinger, Exeter: Imprint Academic, 219-41.
  • Papineau, David. 2002. Thinking about Consciousness. New York: Oxford University Press.
  • Pelczar, Michael. 2005. “Enlightening the Fully Informed.” Philosophical Studies 126: 29-56.
  • Pereboom, Derk 1994. “Bats, Brain Scientists, and the Limitations of Introspection”, Philosophy and Phenomenological Research, 54, 315-29.
  • Pitt, D. 2019. “Acquaintance and phenomenal concepts.” In S. Coleman 2019, pp. 87-101.
  • Popper, K. and Eccles, J. 1977. The Self and its Brain: An Argument for Interactionism. Springer.
  • Raymont, Paul. 1999. “The Know-How response to Jackson’s Knowledge Argument”, Journal of Philosophical Research, 24, 113-26.
  • Robinson, Howard. 1982a.  Matter and Sense. Cambridge: Cambridge University Press.
  • Robinson, Howard. 1993. “Dennett on the Knowledge Argument”, Analysis, 53, 174-7.
  • Robinson, W. S. (1982b) “Causation, Sensations and Knowledge”, Mind 91, 524-40.
  • Rosenberg, Gregg. 2004.  A Place for Consciousness: Probing the Deep Structure of the Natural World. New York: Oxford University Press.
  • Russell, Bertrand. 1927.  The Analysis of Matter. London: Kegan Paul.
  • Stanley, Jason and Timothy Williamson. 2001. “Knowing How,” Journal of Philosophy v. 98 (8), pp. 411-444.
  • Stemmer, Nathan (1989), “Physicalism and the Argument from Knowledge” Australasian Journal of Philosophy 67, pp. 84-91.
  • Stoljar, Daniel 2000. “Physicalism and the Necessary A Posteriori,” Journal of Philosophy, 97, 33-54.
  • Stoljar, Daniel 2005. “Physicalism and Phenomenal Concepts”, Mind and Language 20: 469-94.
  • Stoljar, Daniel and Yujin Nagasawa. 2004. Introduction to Ludlow, et al. (2004).
  • Tye, Michael. 2000. Consciousness, Color, and Content, Cambridge, MA: MIT Press.
  • Tye, Michael. 2009. Consciousness Revisited: Materialism without Phenomenal Concepts. Cambridge, MA: MIT Press.
  • Van Gulick, Robert. 2004. “So Many Ways of Saying No to Mary”. In Ludlow, et al. (2004), pp. 365-406.
  • Watkins, Michael 1989. “The Knowledge Argument against the Knowledge Argument,” Analysis, 49, 158-60.
  • Wilson, Edward O.  Consilience: the Unity of Knowledge. New York: Alfred A. Knopf.
  • Wilson, Jessica M. 2006. “On characterizing the physical.” Philosophical Studies 131: 61-99.
  • Zhong, L. 2021. “Physicalism without supervenience.” Philosophical Studies 178 (5):1529-44.

 

Author Information

Torin Alter
Email: talter@ua.edu
The University of Alabama
U. S. A.

Events

Events are particular happenings, occurrences or changes, such as Rob’s drinking the strong espresso at noon, the 1864 re-election of Abraham Lincoln in the US, and so on. At least at first blush, events all seem to have something in common, metaphysically speaking, and some philosophers have inquired into what this common nature is. The main aim of a theory of events is to propose and defend an identity condition on events; that is, a condition under which two events are identical. For example, if Brutus kills Caesar by stabbing him, are there two events, the stabbing and the killing, or only one event?

Each of the leading theories of events is surveyed in this article. According to Jaegwon Kim, events are basically property instantiations. In contrast, Donald Davidson attempts to individuate events by their causes and effects. However, Davidson eventually rejects this view and, together with W.V.O. Quine, individuates events with respect to their location in spacetime. According to David Lewis, an event is a property of a spatiotemporal region. The selection of a theory of events is not a matter which one decides independently of one’s other metaphysical interests and commitments. This article discusses the relative strengths and weaknesses of several theories of events which can help to guide the reader’s own selection of a theory of events. Further philosophical developments may yield a theory of events which is more attractive than the approaches discussed here.

Table of Contents

  1. Kim’s Property-Exemplification Account of Events
    1. Constitutive Object or Region
    2. Properties
    3. Excessive Fine-Grainedness
      1. The Official Line
      2. The Fallback Position
    4. Is the Constitutive Object (Time, Property) Essential?
  2. Davidson’s Theories of Events
    1. The Causal Criterion
    2. The Spatiotemporal Criterion
    3. Events or Objects?
    4. Davidson and Ontological Commitment to Events
  3. David Lewis’s Theory of Events
    1. Preliminaries
    2. The Details of Lewis’ Theory
      1. A Non-Duplication Principle
      2. Regions
      3. Event Essences
      4. Fine-Grainedness and Logical Relations Between Events
  4. Conclusion
  5. References and Further Reading

1. Kim’s Property-Exemplification Account of Events

Events, according to Kim, are structured: they are constituted by an object (or number of objects), a property or relation, and a time (or an interval of time). For simplicity, the discussion will be restricted to monadic events, that is, events with a monadic property exemplified by a single object at a time. Kim’s theory of events consists of two basic principles: the first states the conditions under which any given event exists, and the second gives the conditions under which events are identical. In stating the principles, Kim represents events by expressions of the form

[x, P, t]

where the operator “[. . .]” is intended to be a special case of the description operator, read “the unique event, x’s having P at t.” Kim’s two principles are the following:

Existence Condition: [x, P, t] exists iff object x exemplifies the n-adic property P at time t.

That is, the unique event of object x’s having property P at time t exists if and only if the object x has P at a given time.

The second principle is the following:

Identity Condition: [x, P, t] = [y, Q, t’] iff x = y, P = Q, and t = t’.

This principle reads: the unique event, x’s having P at a given time t, is identical to the unique event, y’s having Q at a given time t’, if and only if x is identical to y, P is identical to Q, and t is identical to t’. It is sometimes also called the “non-duplication principle.”

According to Kim, (i) events are non-repeatable, concrete particulars, including not only changes but also states and conditions. (ii) Each event has a spatiotemporal location. (iii) Although events may exemplify any number of properties, only one property, the constitutive property, individuates the event. The constitutive properties are not exemplified by the event, but are exemplified by the constitutive substance:

Events themselves have (exemplify) properties; Brutus’ stabbing Caesar has the property of occurring in Rome, it was intentional, it led to the death of Caesar and caused grief in Calpurnia, and so on…. The properties an event exemplifies must be sharply distinguished from its constitutive property (which is exemplified, not by an event, but by the constitutive substance of the event)…. (Kim, 1993, p. 170).

With this in mind we might call attention to the difference between an event’s exemplifying a property and an event’s being an exemplification of a property. According to Kim, the event is an exemplification of only the constitutive property, while the event exemplifies any number of non-constitutive properties. (iv) Kim gets a type-token relation for events by regarding the constitutive property as the generic event. Particular exemplifications of the constitutive property by a constitutive object are tokens of the generic event. (v) Kimean events are not just ordered triples of the form <x, P, t>. Consider the event of Oedipus’ marrying Jocasta at t. A triple exists when Oedipus, t, and marrying Jocasta exist. But the triple can exist while the event does not; namely, Oedipus may fail to have the property, marrying Jocasta, at t.
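
As a rough illustration (not part of Kim’s own presentation), the two conditions and the contrast with bare triples can be sketched in Python. The class name KimEvent, the FACTS set, and the use of plain strings for objects, properties, and times are all assumptions made for the example. In particular, treating property identity as string equality sidesteps the open question of when two properties are the same.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KimEvent:
    """[x, P, t]: object x's having property P at time t (toy model)."""
    obj: str
    prop: str
    time: str


# Hypothetical toy world: the exemplifications that actually obtain.
FACTS = {("Brutus", "stabbing Caesar", "t1")}


def exists(event, facts=FACTS):
    """Existence Condition: [x, P, t] exists iff x exemplifies P at t."""
    return (event.obj, event.prop, event.time) in facts


stabbing = KimEvent("Brutus", "stabbing Caesar", "t1")
killing = KimEvent("Brutus", "killing Caesar", "t1")
marrying = KimEvent("Oedipus", "marrying Jocasta", "t1")

# The ordered triple can always be written down, whether or not
# the corresponding event occurs in the toy world.
triple = ("Oedipus", "marrying Jocasta", "t1")

# Identity Condition: dataclass equality compares x, P and t field by field.
same_event = stabbing == KimEvent("Brutus", "stabbing Caesar", "t1")
distinct_events = stabbing != killing
```

Here dataclass equality plays the role of the Identity Condition and membership in FACTS plays the role of the Existence Condition: the triple for Oedipus is available even though exists(marrying) is false in this toy world.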

What follows are the main criticisms of Kim’s theory of events.

a. Constitutive Object or Region

Myles Brand criticizes the account for not being able to accommodate the intuition many have that an event might not have a constitutive object: “Leaving aside the controversial case of mental events, there are changing weather conditions, changing light conditions, changing fields, and so on.” (Brand, 1997, p. 335) Brand suggests that Kim modify his account by taking spatiotemporal regions as the constituents of events, rather than saying that objects are the constituents. So, if an event involves a flash of lightning or a magnetic field increasing in strength, the event occupies (at minimum) the space in which the flash or field increase occurs. It is certainly open to Kim to modify his theory accordingly.

b. Properties

Since, on Kim’s view, events are property exemplifications, a natural question to ask is: what sorts of properties are acceptable as constitutive properties (and thereby as event types)? Kim provides little specification of what sort of view of properties the theory is to be wedded to. Indeed, Kim’s discussion of events does not even specify whether such properties are universals, tropes (as non-repeatables), natural classes, or something else. (Readers unfamiliar with the different views on the nature of properties should see Oliver, 1996). And we might ask whether the properties are sparse (such as Armstrong’s theory of universals) or abundant, corresponding to every predicate (or nearly every predicate). (Again, see Oliver, 1996). The following passage gives us a rough idea how Kim would answer this latter question:

. . . [T]he basic generic events may be best picked out relative to a scientific theory, whether the theory is a common-sense theory of the behavior of middle-sized objects or a highly sophisticated physical theory. They are among the important properties, relative to the theory, in terms of which lawful regularities can be discovered, described, and explained. The basic parameters in terms of which the laws of the theory are formulated would, on this view, give us our basic generic events, and the usual logical, mathematical, and perhaps other types of operations on them would yield complex, defined generic events. We commonly recognize such properties as motion, colors, temperatures, weights, pushing, and breaking, as generic events and states, but we must view this against the background of our common-sense explanatory and predictive scheme of the world around us. I think it highly likely that we cannot pick out generic events completely a priori. (Kim, 1993, p. 37)

So Kim would like a theory of events that provides a framework for developing theories of causation and explanation, and for exploring the mind-body problem and the relation between micro and macro events more generally (Kim, 1993, p. 36). Such desiderata seem reasonable and, at least at first blush, Kim’s rough gesturing at a notion of properties seems suitable to such desiderata.

This passage also tells us that Kim is open to the view that an answer to the question, “What properties are there?” might involve an a posteriori element, left to scientific theory. But there are further issues that a proponent of the exemplification theory should eventually address. For instance, which properties can be constitutive of events? (i) If the account of properties selected allows that (purported) properties like being equal to the square root of two are in fact properties, such properties do not seem to be properties that can constitute events (Brand, 1997, p. 335). (ii) If walking is a property constitutive of events, is walking slowly? (We will turn to (ii) shortly.)

Myles Brand has criticized the property exemplification view because it lacks a criterion for property identity. (Brand, 1997, p. 335) The problem is that Kim’s account is incomplete because we cannot determine when events are identical. We cannot do this because we do not know when they have the same properties. This objection may strike one as weak because it seems to require too much of the property exemplification account. As Brand notes “. . . solutions to a number of central philosophical problems — for instance the mind-body problem, scientific theory reduction and meaning change — also require identity conditions for properties.” (Brand, 1997, p.335) It seems excessive to require Kim to solve such problems to give a viable theory of events. We now turn to more serious criticisms of the theory.

c. Excessive Fine-Grainedness

Although the passage from Kim quoted above gives us a better idea of what sorts of properties constitute events, it does not answer the following question: if “F” is a predicate or verb designating some generic event (e.g., walking), and “M” is a predicate modifier (e.g., slowly), does “M(F)” name a new generic event (walking slowly), or does the modifier indicate that the generic event (the walk) exemplifies the property (being slow)? If Sebastian strolls leisurely through the streets of Bologna at t, is the stroll the same event as the leisurely stroll? Most people’s intuition is that they are the same event. Let’s call the need for a satisfactory answer to this question “The Problem of Predicate Modification.”

Indeed, the most serious criticism of Kim’s theory is that it yields events that are too fine-grained. That is, events are regarded as being distinct that, intuitively, are the same event. There are two basic types of prolificacy that worry critics. (i) First, there is the Problem of Predicate Modification. (ii) Second, there are sorts of prolificacy not arising from the M(F) operation but from the question: if S does x by doing y, is S’s doing x the same event as S’s doing y? Let’s begin with a discussion of type (ii) cases.

Type (ii) prolificacy. To employ a well-known example, on Kim’s view the stabbing of Caesar is a different event from the killing of Caesar, because the properties of being a stabbing and being a killing are, by any reasonable account of property individuation, distinct (Bennett, 1991). The criticism begins by noting that it is a historical fact that the method of killing was a stabbing. The critic interprets this as saying that the properties were co-instantiated. To this the critic adds that co-instantiation is sufficient for event identity, although not, of course, sufficient for property identity. On this criticism, Kim’s account turns events into property tokens, getting the nature of events wrong (Bennett, 1991).

Jonathan Bennett provides a detailed objection along these lines, adding a further, informative element to his claim that Kimean events are too fine-grained. First, his general claim:

Kim maintains that two nominals can pick out a single event only if (roughly speaking) their predicative parts are equivalent: so it cannot be true that the kick he gave her was the assault he made on her. I argue against this, contending that most of Kim’s prima facie evidence for it depends on his running events together with facts. It is beyond dispute that his kicking her is not the same as his assaulting her, these being different facts. (Bennett, 1991, p. 626)

His contention that Kim conflates events and facts is fueled by an informative distinction between imperfect and perfect nominals, which he links to a distinction between fact language and event language, respectively:

Following Vendler, I take it that these [event names] will be perfect and not imperfect nominals. Quisling’s betrayal of Norway (perfect) was an event; Quisling’s betraying Norway (imperfect) is a fact, namely the fact that Quisling betrayed Norway. Quisling’s betraying Norway is different from his doing Norway a disservice; these are two facts. His betrayal of Norway was his disservice to Norway; there was only the one event. (Bennett, 1991, p. 625)

Perfect nominals, according to Vendler’s research, are our main device for event talk, passing all of the tests for being an event sortal. (Bennett, 1988, p. 6) (However, it should be noted that not all perfect nominals name events. For discussion of this see Bennett, 1988, p. 7.) In contrast, imperfect nominals never refer to events because they “. . . don’t behave syntactically as though they were applicable to located particulars: they don’t take articles or attributive adjective, they don’t have plural forms, and so on. Their semantic behavior is wrong too: they don’t go comfortably into contexts about being observed, occurring at stated times or lasting for stated periods, and so on.” (Bennett, 1988, p. 7) Instead of naming events, imperfect nominals name facts (that is, states of affairs that obtain) and, more generally, states of affairs. Vendler and Bennett provide the following argument for the conclusion that imperfect nominals name facts. First, they claim that there is a sort of imperfect nominal that contains a complete sentence, the sentence nominal, which functions as a noun phrase and passes all the tests for being an imperfect nominal. Bennett takes it that such constructions name facts. He calls these “that [S] constructions.” Bennett further claims:

I contend that any sentence using an imperfect gerundial nominal is synonymous with one in which that gerundial nominal’s work is done instead by a “that [S]” nominal. Test this, and if you find no counterexamples you will agree that imperfect gerundial nominals are basically interchangeable with “that [S]” nominals and are therefore names of facts. If you do find counterexamples, Vendler and I must back off, saying merely that many gerundial imperfect nominals name facts and that none name events, and it will be a further problem to know what marks off the fact names from the rest. But I shall stay with the stronger claim until it is refuted. (Bennett, 1988, p. 8)

Bennett uses his claim that perfect nominals are our main device for referring to event kinds, while imperfect nominals always refer to facts, to help settle the dispute concerning type (ii) prolificacy cases. He illustrates how the distinction bears on this issue by calling our attention to the following interchange between Kim and Davidson:

It is not at all absurd to say that Brutus’ killing Caesar is not the same as Brutus’ stabbing Caesar. Further, to explain Brutus’ killing Caesar (why Brutus killed Caesar) is not the same as to explain Brutus’ stabbing Caesar (why Brutus stabbed Caesar). (Kim, 1993, p. 232)

Davidson remarks:

I turn. . . to Kim’s remark that it is not absurd to say that Brutus’ killing Caesar is not the same as Brutus’ stabbing Caesar. The plausibility of this is due, I think, to the undisputed fact that not all stabbings are killings. . . . But [this does not show] that this particular stabbing was not a killing. Brutus’ stabbing of Caesar did result in Caesar’s death so it was in fact, though not of course necessarily, identical with Brutus’ killing of Caesar. (Davidson, 1980, p. 171)

It does appear that, as Bennett aptly puts it, while Kim is saying true things about facts, Davidson is saying true things about events. The provisional conclusion that I draw about the criticism that Kim’s theory turns events into facts, getting the nature of events wrong, is the following: if one is impressed by the view that the stabbing of Caesar and the killing of Caesar are distinct events, then one must make sure that this is not merely because one finds it plausible that Brutus’ killing Caesar and Brutus’ stabbing Caesar are distinct. For Bennett has given us reason to believe that such plausibility derives from the plausible distinctness of facts. (Further, Kim has not disputed Bennett’s distinction or its application to the type (ii) prolificacy dispute.) On the other hand, perhaps a proponent of the property-exemplification view would like to dispute the linguistic data, or, instead, claim that while the data capture our ordinary event concept, a philosophical theory of events should not seek to satisfy the ordinary event concept, but should instead engage in a conceptual revision.

Type (i) prolificacy. Although Kim is not interested in renouncing the prolificacy of type (ii), he believes that it is a more serious matter that his view might allow adverbial modification to give rise to distinct generic events (e.g., Sebastian’s strolling and Sebastian’s strolling leisurely are distinct events). That is, he takes such cases as being more plausible examples of excessive fine-grainedness: “it is more plausible to deny identity in cases like it (the stabbing case) than in cases like Sebastian’s stroll and Sebastian’s leisurely stroll (where we suppose Sebastian did stroll leisurely).” (Kim, 1993, p. 44) Kim does not say why the stabbing case is a more plausible case of distinct events, but he is certainly in line with most people’s intuitions in this regard. He offers two ways to deal with the Problem of Predicate Modification, advancing one as “the official line” and the other as a fallback position.

i. The Official Line

Kim’s strategy is to regard the events as being different, but not entirely distinct, by claiming that the leisurely stroll includes the stroll. Kim does not explain the sort of inclusion that he is appealing to. It is certainly a different sort from the type of inclusion that we might normally apply to events: for example, we might conceive of a war as an extended event consisting of a number of battles, or of the buying of a book as consisting of standing at the register, handing over the money, and so forth. In each of these cases the extended event has the events of shorter duration as temporal parts. We might say that Sebastian’s stroll is like these by stipulating that there was a temporal part of the stroll that was not leisurely (say, he leapt over a puddle). But this would be missing the point, as one could just specify a different case in which the entire stroll was leisurely. Kim offers the following point to motivate the non-standard sort of event inclusion that he has in mind:

Take this table: the top of the table is not the same thing as the table. So there are two things, but of course one table — in fact, there are lots of things here if you include the legs, the molecules, the atoms, etc., making up the table. (Kim,1993, p. 46)

One can construct individuals, counterintuitive to the layperson as they may be, from the mereological sum of any spatiotemporal parts. But given a particular table, it would be quite odd to claim that the mereological sum of all of its parts is a new individual, and not, instead, that very same individual. Since the stroll and the leisurely stroll occupy the same space-time worm, the analogy with physical objects will not go through: for a physical object x to include a distinct physical object y, x must have all of y’s parts as proper parts and must also have at least one proper part that y lacks. There is no proper part (space-time region) occupied by the leisurely stroll that is not also occupied by the stroll. We thus have motivation for turning to the second option that Kim provides for dealing with the Problem of Predicate Modification.
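
The inclusion requirement just described can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions: parts are modelled as labels in a set, the part labels (top, leg1, r1, and so on) are hypothetical, and "x includes a distinct y" is read as y's parts forming a strict subset of x's parts.

```python
def includes(x_parts, y_parts):
    """x includes a distinct y: every part of y is a part of x,
    and x has at least one part that y lacks (strict subset test)."""
    return y_parts < x_parts


# Kim's table analogy: the top is properly included in the table.
table = {"top", "leg1", "leg2", "leg3", "leg4"}
table_top = {"top"}

# The stroll and the leisurely stroll occupy the same space-time regions.
stroll = {"r1", "r2", "r3"}
leisurely_stroll = {"r1", "r2", "r3"}

table_case = includes(table, table_top)
stroll_case = includes(leisurely_stroll, stroll)
```

On this reading the table case comes out as inclusion while the stroll case does not, because the two strolls share every region and so neither has a part the other lacks.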

ii. The Fallback Position

The remaining option is to deny that modifiers, or at least a certain class of them, give rise to new generic events; instead, they indicate properties of the generic events. (For example, strolling leisurely is not a generic event, but being leisurely is exemplified by Sebastian’s stroll.) Kim views this option as bringing with it a major drawback: it compromises his original motivation for supplying a theory of events in the first place, namely that events be the sort of entities that enter into causal relations and are objects of explanation. As he puts it: “But it is clear that we may want to explain not only why Sebastian strolled, i.e., Sebastian’s stroll, but also why he strolled leisurely, i.e., his leisurely stroll. Under the approach being considered, the second explanation would be of why Sebastian’s stroll was leisurely; we would be explaining why a certain event had a certain property, not why a certain event occurred.” (Kim, 1993, p. 45)

d. Is the Constitutive Object (Time, Property) Essential?

A second major challenge to the property-exemplification view is the claim that it relies on dubious claims about the essential properties of events. Consider the constitutive object: could the very same event, the changing of the guard, have occurred if a guard had been a different person? Could it have been the same event if, instead, it had occurred slightly earlier? Both of these questions raise plausible possibilities.

Kim agrees that the time is not an essential feature of certain events: “it seems correct to say that the stroll could have occurred a little earlier or later than it actually did.” (Kim, 1993, p. 48) Kim is also sympathetic to the claim that the property is not essential, although his concern is limited to cases in which modifiers give rise to new generic events. (Kim, 1993, p. 47) However, Kim rejects the view that the constitutive substance is not essential.

The fact that someone other than Sebastian could have taken a stroll in his place does not make it the case that the very stroll that Sebastian took could have been taken by someone else. If Mario had been chosen to stroll that night, then there would have been another stroll, namely Mario’s. (Kim, 1993, p. 48)

One natural reaction is to disagree with this assessment because it seems plausible that in the changing of the guard case, the very same event, the changing of the guard, could have occurred if a guard had been a different person. But perhaps it is better not to haggle over intuitions; the real issue is how Kim, of all people, can be sympathetic to challenges to the essentiality of the constitutive time and property. Doesn’t he have to reject such challenges? The matter hinges on whether it is plausible, as Kim seems to believe, that the following claims be held in tandem:

(1) The constitutive time and (in cases of modification) the constitutive property are non-essential

(2) Both of the following are true:

Identity Condition: [x, P, t] = [y, Q, t’] iff x = y, P = Q, and t = t’.

Existence Condition: [x, P, t] exists (occurs) iff object x exemplifies the n-adic property P at time t.

Begin with the Identity Condition. Identity conditions do not need to entirely specify an entity’s nature. As Kim notes: “It is at least a respectable identity criterion for physical objects that they are the same just in case they are completely coincident in space and time. From this it does not follow that a physical object is essentially where and when it in fact is.” (Kim, 1993, p. 48) Now consider the Existence Condition. It tells us something about the modal character of events: events are necessarily exemplifications of properties by objects at times. Kim agrees: “There is an essentialist consequence I am willing to accept: events are, essentially, structured complexes of the sort the theory says they are. Thus, events could not be substances, properties, and so on.” (Kim, 1993, p. 49) But it doesn’t tell us about the modal character of the event in the following sense: it doesn’t say whether the event can occur without any, or even all, of the constitutive entities. Hence, it doesn’t tell us whether any of the constitutive entities are essential. From these observations one can conclude that (1) and (2) are consistent. Consistent, but informative? Although our brief discussion concludes with the observation that Kim avoids a serious criticism, the discussion has also raised the point that Kim has only given a partial specification of the nature of events. To fully specify the nature of events, more needs to be said about the modal character of the constitutive entities. Here, intuition haggling comes into play. As Kim comments, on this score, “the general problem is still open.” (Kim, 1993, p. 49)

2. Davidson’s Theories of Events

Kim defends a relatively fine-grained theory of events, but Davidson individuates events in a rather coarse way. Davidson has advanced two conditions. Initially, he proposed the principle that no two events can have exactly the same causes and effects. Then, after discarding this principle, he proposed that no two events can occur in exactly the same space-time zone, a view which Quine also advanced. The following sections evaluate both non-duplication principles. The discussion of Davidson’s work on events concludes with some general remarks about his influential argument for the existence of events from the use of action sentences.

a. The Causal Criterion

In “The Individuation of Events,” Davidson sets himself the task of determining a criterion for the sameness and difference of events, where events are understood as particular, non-repeatable occurrences. After considering and rejecting various proposals Davidson settles on the following:

(DT1) events are identical iff they have exactly the same causes and effects

Noting “an air of circularity” about this suggestion, he formulates (DT1) as the following:

(DT1′) (∀x)(∀y)[x = y iff (∀z)((z caused x iff z caused y) and (x caused z iff y caused z))]

He then writes: “No identities appear on the right of the biconditional.” (Davidson, 1980, p. 179) Well, this is true, but (DT1′) is nonetheless circular because, of course, x, y, and z are events. The circularity is not excisable either, for the gist of Davidson’s suggestion is that events can be individuated by their causes and effects, but what is a cause or effect, for Davidson, if not an event? Davidson claims (inter alia) that events e and e’ are identical only if e and e’ have all the same causes. But causes are events, and to determine if e and e’ have the same causes we need to determine whether each of e’s causes has all the same effects as some cause that e’ has. And among these effects are e and e’, the very events we are trying to distinguish or, alternately, identify. (Lombard, 1998)
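
One way to picture the circularity is to treat the causal criterion as a constraint on candidate identity relations rather than as a procedure for computing one. The Python sketch below is an illustration under stated assumptions, not Davidson’s or Lombard’s formulation: the labels c1, c2, e1, e2 and the CAUSED relation are hypothetical, a candidate identity relation is modelled as a partition of the labels, and causal relations between blocks of a partition are lifted from the labels they contain.

```python
from itertools import combinations

# Hypothetical toy data: four event descriptions and who caused what.
CAUSED = {("c1", "e1"), ("c2", "e2")}


def block_of(partition, name):
    """Return the block (candidate event) that a description belongs to."""
    return next(block for block in partition if name in block)


def profile(partition, block):
    """Causes and effects of a block, expressed as blocks of the same partition."""
    causes = frozenset(block_of(partition, c) for c, e in CAUSED if e in block)
    effects = frozenset(block_of(partition, e) for c, e in CAUSED if c in block)
    return causes, effects


def satisfies_causal_criterion(partition):
    """True if no two distinct blocks share all their causes and effects."""
    return all(profile(partition, a) != profile(partition, b)
               for a, b in combinations(partition, 2))


def make(*groups):
    """Build a partition from groups of labels."""
    return frozenset(frozenset(g) for g in groups)


all_distinct = make({"c1"}, {"c2"}, {"e1"}, {"e2"})
all_merged = make({"c1", "c2"}, {"e1", "e2"})
half_merged = make({"c1", "c2"}, {"e1"}, {"e2"})
```

In this toy case both all_distinct and all_merged satisfy the constraint, while half_merged does not. Whether e1 and e2 count as one event depends on whether c1 and c2 do, and vice versa, so the criterion by itself does not settle the question.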

Davidson later concedes that (DT1′) is indeed circular and, in light of this, moves to a theory that he had previously rejected in his discussion of Lemmon’s proposal (Davidson, 1980, p. 178).

b. The Spatiotemporal Criterion

Lemmon’s proposal was:

(DT2) events are identical iff they occur in the same space at the same time

Davidson had previously rejected (DT2) because “. . . I thought one might want to hold that two different events used up the same portion of space-time. . .” (Davidson, 1985, p. 175) Davidson’s discussion of Lemmon’s proposal will come back to haunt him. In particular, Davidson provided an intriguing example. This example, many believe, is decisive against (DT2), the proposal that Davidson himself continued to favor.

Doubt comes easily in the case of events, for it seems natural to say that two different changes can come over the whole of a substance at the same time. For example, if a metal ball becomes warmer during a certain minute, and during the same minute rotates through 35 degrees, must we say that these are the same event? (Davidson, 1980, p.178)

There are two ways of interpreting the example, which discussions of it sometimes fail to distinguish. Let us begin with one interpretation, which we will discard as not even superficially challenging the view that there cannot be different events in the same spacetime location.

(i) The rotation, although occurring during the same minute, temporally precedes the warming. This interpretation takes “at the same time” in the above passage as meaning “during the same minute.” This could happen if both events occur at, say, 2:51 and the rotating precedes the warming by, say, ten seconds. This seems to be Simon Evnine’s interpretation of the case. (Evnine, 1991, p. 29) On this reading the problem is much easier to solve because the events would be (at least partly) spatiotemporally distinct. Evnine’s interpretation was probably encouraged by the fact that rotating an object will cause the object to warm slightly; in such cases the rotating will precede the warming. It doesn’t seem useful to conceive of the example in this way because it is not, even at first blush, a potential counterexample to the sufficiency of spacetime location for sameness of event: the spacetime locations obviously differ, although they partly overlap.

(ii) It may be suggested that we forget that rotating causes slight warming, and suppose, for the sake of argument, that some additional warming of the object occurs at the same time as the object rotates. Although Davidson does not note this, we can fairly construe his puzzle as being about the additional warming and its having the same spatiotemporal location as the rotating. We have the strong intuition that the additional warming (hereafter “warming”) and the rotating are different events; this is the interesting interpretation of the case because it raises a potential counterexample to (DT2).

Construed in this way, the matter is quite tricky. First, a general observation. When things warm up their molecules randomly jiggle about. This is a different sort of molecular motion than is involved in a thing’s rotating. Given this observation, it might seem like (DT2) is not challenged by the example, after all. One might have the belief that, given this observation, there should be some way to prove that different, (but not completely distinct), spacetime regions are involved. It is natural to be skeptical that such a maneuver is available, however. The same molecules that are randomly jiggling about, because of the heating, are also revolving. Similarly, one cannot assign different spacetime regions to Joe’s Northeasterly walk, although it is, in a sense, both a Northerly walk and an Easterly walk. So this appears to be a counterexample to Davidson’s proposal.

Now, assuming that one is a proponent of DT2, how should one respond to Davidson’s own example of a metal ball’s rotating and warming? The proponent could swallow the unintuitive result that the rotating and the warming are the very same event, saying that DT2 is still in the running as a non-duplication principle because other leading theories of events also have counterintuitive results in some cases. For recall that Kim holds that

(KT) [x, P, t] exists (occurs) iff object x exemplifies the n-adic property P at time t.

On this view the stabbing of Caesar is a different event from the killing of Caesar because the properties are distinct (according to any plausible property theory). This strikes many as being too fine-grained; the killing and the stabbing are not distinct events, although being a killing and being a stabbing are distinct properties. Selecting a theory of events involves an all-things-considered judgment that weighs the various strengths and weaknesses of the competing theories. If other non-duplication principles have equally counterintuitive results, then, ceteris paribus, DT2 is still in the running.
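
The contrast between the two non-duplication principles can be sketched in Python using Davidson’s metal ball. This is a toy model under stated assumptions: the Occurrence class, the string labels, and the representation of a spacetime region as a set of cell labels are all hypothetical, and property identity is again approximated by string equality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    obj: str
    prop: str
    time: str
    region: frozenset  # hypothetical labels for the spacetime cells occupied


def same_by_kim(a, b):
    """Kim's Identity Condition: same object, same property, same time."""
    return (a.obj, a.prop, a.time) == (b.obj, b.prop, b.time)


def same_by_dt2(a, b):
    """(DT2): identical iff they occupy exactly the same spacetime region."""
    return a.region == b.region


# The ball warms and rotates throughout the same minute, in the same place.
ball_region = frozenset({"ball@minute-1"})
warming = Occurrence("ball", "warming", "minute-1", ball_region)
rotating = Occurrence("ball", "rotating", "minute-1", ball_region)

kim_verdict = same_by_kim(warming, rotating)
dt2_verdict = same_by_dt2(warming, rotating)
```

Under these assumptions the fine-grained criterion counts the warming and the rotating as two events, while (DT2) counts them as one, which is the coarse-grained result that many find counterintuitive.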

c. Events or Objects?

Any critical evaluation of Davidson’s theory of events should (at least briefly) consider the other influential objection to DT2. A common view is that objects are identical if and only if they occupy the same space-time location. And this is precisely DT2, causing some to believe that it gets the nature of events wrong. The objector’s intuition that events are not objects is grounded in the view that events are occurrences and objects are not. So, by Leibniz’ Law, events and objects are distinct. For Davidson’s position to be convincing he needs to explain away the strong intuition that events are occurrences and objects are not. Davidson is concerned with the conflation, and in light of it offers the following suggestion:

. . . events and objects may be related to locations in spacetime in different ways; it may be, for example, that events occur at a time in a place while objects occupy places at times.

Occupying the same portion of spacetime, event and object differ. One is an object which remains the same object through changes, the other a change in an object or objects. Spatiotemporal areas do not distinguish them, but our predicates, our basic grammar, our ways of sorting do. Given my interest in the metaphysics implicit in our language, this is a distinction I do not want to give up. (Davidson, 1980, p. 176)

It does seem correct that when we conceive of events, we generally think of changes, or occurrences. This feature seems to rest at the kernel of our event-concept.

Evnine’s reaction to Davidson’s claim is that “this attempt to resist the assimilation of events to objects will only work if we are able to make a convincing distinction between occurring and occupying which does not itself rely on the distinction between events and objects.” (Evnine, 1991, p. 31) If Evnine is suggesting that an account of events would be circular should it fail to cash out the notion of occurring in a way that doesn’t presuppose eventhood this is not an entirely decisive objection — the concept of an occurrence could simply be taken as primitive in an analysis. However, some would find it unattractive that an unexplained notion, and one that seems so close to the concept of an event, that of an occurrence, is doing all the work in dividing objects from events.

The following, more decisive objection to Davidson’s suggestion may occur to one: there is an intuitive distinction between occurring and occupying (we see events unfold and objects occupy spaces), but it is important to note that many, including Lewis and Kim, consider events, as a metaphysical category, to include some non-happenings or non-occurrences as well as all happenings. And Bennett notes that Davidson himself has “never said that events must be changes and . . . did once express tolerance for the idea of such movements as ‘standing fast.’” (Bennett, 1988, p. 176, quoting Davidson) Davidson’s manner of distinguishing events from objects, in so far as it involves the claim that events are essentially occurrences, seems, at least at first blush, incompatible with the view that some events are non-occurrences. If Davidson believes that some non-occurrences are events then, in order to preserve his original point, in addition to telling us more about his occurrence/occupation distinction he needs to answer the question: if non-occurrences can be events, why are such non-occurrences not objects? Perhaps the only manner of preserving the idea that events are an ontological kind is by renouncing the view that some non-occurrences are events.

At this point it is not clear if Davidson would be interested in doing so. Here I can only gesture in the direction of a possible difficulty. In his chapter on adverbial modification, Bennett suggests that Davidson needs to consider unchanging events in order to

. . . smooth the way for applying his theory to many uses of adverbs to modify not verbs but adjectives. ‘Marvin was icily silent’ entails ‘Marvin was silent’ and it would be uncomfortable for a Davidsonian to have to exclude such entailments from the scope of his theory. It would be better for him to say that the former sentence had the form: For some x: x was an episode of silence, and Marvin was the subject of x, and x was icy. (Bennett, 1988, p.76)

It is likely that the Davidsonian would be interested in applying his theory to uses of adverbs that modify adjectives. This attractive feature will have to be balanced against any desire to distinguish events from objects.

d. Davidson and Ontological Commitment to Events

Finally, in our discussion thus far the existence of events has been taken for granted, the issue being how to individuate them. But Davidson’s work on events is not limited to a defense of a non-duplication principle; indeed, he argues that we need to posit events (inter alia) to explain the meanings of statements employing adverbial modifiers. In “The Individuation of Events” he writes:

. . . without events it does not seem possible to give a natural and acceptable account of the logical form of certain sentences of the most common sorts; it does not seem possible, that is, to show how the meanings of such sentences depend upon their composition. The situation may be sketched as follows. It is clear that the sentence ‘Sebastian strolled through the streets of Bologna at 2 a.m.’ entails ‘Sebastian strolled through the streets of Bologna’, and does so by virtue of its logical form. This requires, it would seem, that the patent syntactical fact that the entailed sentence is contained in the entailing sentence be reflected in the logical form we assign to each sentence. Yet the usual way of formalizing these sentences does not show any such feature: it directs us to consider the first sentence as containing an irreducibly three-place predicate ‘x strolled through y at t’ while the second contains the unrelated predicate ‘x strolled through y.’ (Davidson, 1980, pp. 166-7)

Davidson suggests that we solve this puzzle by accepting the intuitive idea that “there are things like falls, devourings, and strolls for sentences such as these to be about.” The sentence

Sebastian strolled through the streets of Bologna at 2 a.m.

has the following logical form:

There is an event x such that Sebastian strolled x, x took place in the streets of Bologna, and x was going on at 2 a.m.

This logical form yields the problematic entailments. Davidson’s view is that this correct logical form for action sentences motivates the ontological commitment to events because quantification over a kind of entity involves an ontological commitment to the existence of entities of that kind.
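The entailment can be displayed in first-order notation. What follows is only a sketch: the predicate names Strolled, In, and At, and the constants s (Sebastian), b (the streets of Bologna), and t (2 a.m.), are labels introduced here for illustration and are not Davidson’s own symbols.

```latex
\begin{align*}
\text{(1)}\quad & \exists x\,\bigl(\mathrm{Strolled}(s, x) \land \mathrm{In}(x, b) \land \mathrm{At}(x, t)\bigr) \\
\text{(2)}\quad & \exists x\,\bigl(\mathrm{Strolled}(s, x) \land \mathrm{In}(x, b)\bigr)
\end{align*}
```

On this rendering, (2) follows from (1) by dropping a conjunct within the scope of the existential quantifier, so the inference holds in virtue of form alone. No such step is available when (1) is written with an unanalyzed three-place predicate.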

Is Davidson’s argument plausible? (i) Although it is plausible in standard cases, it is unclear how Davidson’s account can be extended to manage various sorts of nonstandard modifiers. How, for instance, will Davidson analyze (S) “Sebastian almost strolled” to reveal that (S) entails “Sebastian didn’t stroll”? (ii) Terry Horgan objects that Davidson’s account is counterintuitive because most adverb constructions do not contain explicit quantification over events. (Horgan, 1978, p. 47) Horgan is correct, and in light of this, Davidson’s argument is significantly weakened if there is an equally attractive or (more damaging yet) superior alternate account of adverbial modification available that does not involve quantification over events. In light of (i) we can add that a competing account would be even more attractive if it could handle non-standard cases of modification that Davidson’s theory, as it stands, does not.

Indeed, Horgan has formulated an alternate account that does not involve quantification over events. (Horgan, 1978) Romane Clark has proposed an extension of standard first order quantification theory to handle predicate modification. (Clark, 1970) Horgan’s alternate account involves modifying Clark’s proposal in such a way that it does not appeal to states of affairs, which are frequently taken as ontological kinds that are either ontologically equivalent to events or include events as a subcategory. Instead, Horgan appeals to set theory, which is already appealed to in formal semantics. In light of this, should we apply Occam’s Razor and deny the existence of events? This move would be premature. Davidson has provided a number of other reasons to quantify over events: “I do not believe we can give a cogent account of action, of explanation, of causality, or of the relation between the mental and the physical unless we accept events as individuals.” (Davidson, 1980, p. 165) If any of these other considerations are apt, then quantification over events would be in order even if the Horgan/Clark proposal is superior, on balance, to Davidson’s account. Should all of the other considerations fail, the issue will turn on the problem of adverbial modification, and any decision on this matter surely requires a detailed treatment of the relative advantages and disadvantages of each of the proposals.

This concludes the treatment of Davidson’s extensive work on events. Davidson obviously takes events very seriously, going as far as arguing that there are a number of reasons to quantify over them. Lewis, in contrast, has a rather opportunistic approach to events: he fashions a theory of events primarily to suit the theoretical needs of his theories of explanation and causation. Nonetheless, Lewis’s theory is regarded by many as being important in its own right.

3. David Lewis’s Theory of Events

The core conception of Lewis’ theory of events is that an event is a property of spatiotemporal regions. (Lewis, 1986, p. 244) Properties, like events, are not basic to Lewis’ ontological scheme. Lewis writes: “By a property I mean simply a class. To have the property is to belong to the class. All the things that have the property, whether actual or merely possible, belong… The property that corresponds to an event, then, is the class of all regions, at most one per world – where the event occurs.” (Lewis, 1986, p. 244) This being said, Lewis proposes the following necessary condition for some entity e’s being an event:

(LT) e is an event only if it is a class of spatio-temporal regions, both thisworldly (assuming it occurs in the actual world) and otherworldly.

(LT) is a rough, first approximation of a theory of events. It only tells us which entities are formally eligible to be events — only such entities that are a class of thisworldly (assuming it occurs) and otherworldly spacetime regions. Any member of the class that is the event “occurs”; the event, itself, understood as the class, doesn’t occur. This would be a kind of category mistake because classes, as abstract entities, don’t occur, although they can exist.

We can get an intuitive grip on (LT) by noting a certain commonality with the previously discussed Quine-Davidson account of events, which holds that events are individuated by their spacetime locations: no two events can occupy the same spacetime location. Recall that one criticism of this theory of events is that it treats the simultaneous rotating and the heating of the sphere as being, counterintuitively, the same event. It can be noted that, in contrast, this is not a drawback for Lewis’ account. The Quine-Davidson account identified an event with a certain spacetime region; Lewis, in contrast, can say that an occurrence of an event can be located in the same region that another event is, claiming that the events, as classes, are nonetheless distinct because there will be some member that is in class A that is not in class B, namely, an occurrence in some region that is a member of A and not B. (Here, it is important to bear in mind that the different regions may be at different possible worlds). So it is available to Lewis to characterize the well known case of the sphere that both heats and spins at t as involving two distinct events because the rotating includes otherworldly regions that the heating does not include. So far, so good for Lewis.

But before going further into the strengths and weaknesses of the theory, it is necessary to say more about the theory.

a. Preliminaries

A few preliminary remarks about the process of evaluating Lewis’ theory are useful to keep in mind. As noted, Lewis is an event-opportunist, if you will, letting his interest in explanation and especially, his counterfactual analysis of causation dictate his theory of events; Bennett captures Lewis’ route nicely:

There remains the less ambitious course of basing judgments about the essences of events on the counterfactual analysis of event causation: start with our firm beliefs about what causes what, put them into their counterfactual form in accordance with the analysis and draw conclusions about what the essences of events must be like if we are not to be convicted of too much error in our causal beliefs. That is the third of my three approaches, and it is the one that Lewis adopts. (Bennett, 1988, p.61)

At least at first blush, there seems to be nothing objectionable about this route into events. After all, even Bennett has urged that our ordinary notion of events is not a notion that leads to a useful philosophical theory of events. Why not, then, begin elsewhere? Perhaps Lewis is less ambitious, but commendably more realistic.

Given this route of entry into a theory of events it is natural that those who are interested in Lewis’ influential counterfactual theory of causation would have a particular interest in his theory of events. (Lewis, 1970) Of course, even with a strong antecedent interest in Lewis’s theories of causation and explanation, one might nonetheless turn away from Lewis’s theory if it entails Modal Realism. Those who reject Modal Realism would agree that the following desideratum is a requirement that Lewis’ theory of events must satisfy:

D1: That the theory of events be formulable within the ersatz framework.

As is well known, Lewis is operating with a controversial notion of “possible world” according to which possible worlds are as real as this world, some containing flesh and blood creatures, solid mountains and planets, and so forth. Such worlds are non-actual in the sense that they are not in our world, but they are just as real as our world is. A world, according to Lewis, is a big object containing all objects that exist there as parts. (Lewis, 1986, p. 69) So a world with a talking donkey is a world that has a talking donkey as a literal part.

Ersatzers attempt to avoid commitment to Lewis’ possible worlds, reducing possible worlds to other, more acceptable (but in at least some cases still controversial) sorts of entities. (Armstrong, 1989; Loux, 1980; Plantinga, 1976) Ersatz views hold that instead of a plurality of worlds in the modal realist’s sense, there is only one concrete world, with various abstract entities representing ways that our world might have been. Such theories are actualist; they hold that the actual entities represent (in some sense of the word) possibilia. Ersatz views take the abstract-concrete distinction as being well understood, taking the world and the entities that occupy it as being concrete, and taking the representations of the concrete entities as being abstract. There is one correct abstract representation and there are many misrepresentations; the former represents the concrete world, the misrepresentations of the actual world represent the various ways the concrete world might have been.

There are many ersatzers, although not all of the same variety; in contrast, there was only one modal realist — Lewis himself. (For a variety of ersatz theories see Loux, 1980) So it seems fair to say that D1 must be met by Lewis’ theory of events. If it turns out otherwise, even if the theory is clear and consistent, there will be very few adherents to the account of events. The following section lays out the basic details of the theory, then attention focuses on whether D1 is indeed satisfied.

b. The Details of Lewis’ Theory

We have investigated Lewis’s claim that some entity is an event only if it is a class of spatio-temporal regions, both thisworldly and otherworldly. We now turn to four more features of the theory.

i. A Non-Duplication Principle

From (LT), the axiom of extensionality (which holds that two sets are identical if and only if they have all the same members), and predicate logic, we can derive a non-duplication principle for Lewis events. Recalling that Lewis events are classes, we can say:

(NP) (x)(y)(where x and y are events, x and y are different events if and only if there is at least one member of x that is not a member of y, (or vice versa)).

It is important to note that although (NP) may judge two events to be different, it is consistent with (NP) that they not be entirely distinct in the sense that one event may be a proper subset of another. Here the term “different” is used in the sense of “non-identical.” Think of “different” as meaning, “at least partly distinct.”

ii. Regions

Lewis outlines several features of the operative notion of spacetime regions: “An event occurs in a particular spatiotemporal region. Its region might be small or large; there are collisions of point particles and there are condensations of galaxies, but even the latter occupy regions small by astronomical standards.” (Lewis, 1983, p. 243) Still, there are certain specifications on what can count as a region, namely, that no event occur in two different regions of a world and that an event occupy an entire region; in other words, an event can’t occur in any proper part of a region, although parts of it can. (Lewis, 1983, p. 243) Lewis leaves it open whether any region is a region in which an event can occur, writing: “A smallish, connected, convex region may seem a more likely candidate than a widely scattered part of spacetime. But I leave this question unsettled, for lack of clear cases.” (Lewis, 1983, p. 243) It is not clear that there really aren’t cases that decide the issue: consider the televising of the Super Bowl, which seems scattered throughout the regions of multiple homes, bars, and so forth. This seems a clear case of a very scattered event, although every part of the event is spatially connected.

Lewis says that his theory of events relies on the following assumption:

(A) Regions are individuals that are parts of possible worlds.

He admits that this is controversial but says that he need not defend (A) in his present discussion of events. Given the aforementioned Modal Realist view of possible worlds, we can appreciate the controversial nature of (A).

Now let us ask, can assumption (A) be recast in terms of an ersatz conception of possible worlds? It appears so; indeed, we will now see that desideratum (D1) can be met. That is, the theory of events, including assumption (A), can be recast in terms of an ersatz theory of possible worlds. This point will be illustrated by using a version of linguistic ersatzism.

“Linguistic ersatzism” (LE) is the generic name for the family of modal theories that takes worlds as being constructions out of words of a language; in broad strokes, possibilities are represented via the meanings that words are given. For instance, a typical LE view takes worlds as being maximal consistent sets of sentences (where a set S is maximal iff for every sentence B, S contains either it or its negation, and S is consistent iff it is possible for all the members of S to be true together). Notice that in contrast to Modal Realism, the building blocks of this typical linguistic ersatz view involve relatively uncontroversial entities (sentences and sets of things). (Of course, someone who appeals to ersatz worlds will have her own ontological scheme that is to account for such uncontroversial entities. The particular details will differ – the important thing is that ersatzism, unlike Modal Realism, does not prima facie require anything metaphysically ornate). So let us assume an ersatz theory along the above, generic, lines. As Lewis suggests, the linguistic ersatzer can take a possible individual as a maximal consistent set of open sentences. (Lewis, 1986, p.149) For instance consider what open sentences correspond to Ersatz Hunter Thompson:

Ersatz Thompson: {x is 6′ tall, x is the author of Fear and Loathing in Las Vegas, x is in LA on 4/5/77, …}

The set is consistent because it is possible for there to be an object such that all of the open sentences are true of it. It is maximal because for every open sentence with only “x” as the free variable the set contains either the sentence or its negation. We could do the same for regions. In this way regions are not mereological parts of possible worlds but are instead, subsets of ersatz possible worlds taken as sets. So (A) can be modified this way:

(A) Regions are individuals that are subsets of possible worlds.

where “possible worlds” refers to ersatz worlds, e.g., on the view considered here, maximally consistent sets of sentences. We can also understand the following condition

(LT) e is an event only if it is a set of spatio-temporal regions, both thisworldly (assuming it occurs in the actual world) and otherworldly

as involving sets that contain sets (regions, according to (A)) as members.
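The definitions of maximality and consistency used by the linguistic ersatzer can be illustrated with a very small language. The sketch below assumes a toy language of three logically independent atomic sentences and their negations; the atoms and the helper names (`negate`, `is_consistent`, `is_maximal`, `ersatz_worlds`) are hypothetical, and real ersatz constructions would need a far richer language and a substantive account of consistency.

```python
from itertools import product

# Hypothetical toy language: three atomic sentences and their negations.
# The atoms are assumed to be logically independent of one another.
ATOMS = ["donkeys_talk", "pigs_fly", "snow_is_white"]

def negate(sentence):
    """Return the negation of a sentence, cancelling a double negation."""
    return sentence[4:] if sentence.startswith("not ") else "not " + sentence

def is_consistent(sentences):
    """In this toy language a set is consistent iff it never contains
    both a sentence and its negation."""
    return all(negate(s) not in sentences for s in sentences)

def is_maximal(sentences):
    """Maximal iff, for every atomic sentence, the set contains it or its negation."""
    return all(a in sentences or negate(a) in sentences for a in ATOMS)

def ersatz_worlds():
    """Enumerate every maximal consistent set: one truth value per atom."""
    worlds = []
    for values in product([True, False], repeat=len(ATOMS)):
        world = frozenset(a if v else negate(a) for a, v in zip(ATOMS, values))
        worlds.append(world)
    return worlds

WORLDS = ersatz_worlds()
```

Each element of `WORLDS` plays the role of one ersatz world. A set such as `{"pigs_fly"}` is consistent but not maximal, so it represents no complete way the world might have been.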

Finally, we can note that although Lewis reduces events to properties, and properties to classes of actual and otherworldly regions, the ersatzer need not adopt Lewis’ conception of properties to adopt (LT), but can just skip the intermediate step of Lewis’ reduction, taking events as classes of regions. Why is this important? First, the ersatzer may reject Lewis’ account of properties. Second, doing so avoids circularity worries for the proponents of ersatz theories that employ properties in constructing possible worlds (e.g., Armstrong, Plantinga). (Armstrong, 1989; Plantinga, 1976) In any case, even if one wants or needs to adopt Lewis’ conception of properties, perhaps the direction of explanation could still be preserved if one adopts two conceptions of properties as Lewis does, one sparse and one abundant, employing the sparse conception in formulating the modal theory. But the ersatzer need not even do this.

iii. Event Essences

Lewis rejects the view that events are structured entities constituted by an essential time, object and property. Consider the nominalization “the death of Socrates at t”: while we may pick out the event, Socrates’ death, by this nominalization, it is conceivable that the very same death happened sooner. Now it might be reasonable, in this case, to say that the very same event couldn’t have had a different (so-called) “constitutive” individual or property, but there seem to be other cases suggesting that the “constitutive” property and individual are also problematic: e.g., the firing squad shooting was done by Ned but it could have been done by Ted; the strolling could have been a striding. But this is not to say that Lewis is claiming that events don’t have essences; it is just that events aren’t structured in a Kimean way. Rather, the essences are read off from the similarity between the members. This latter point is perhaps best illustrated by way of example: according to Lewis an event is essentially a change if and only if for each region something changes in it; an event essentially involves Socrates if and only if Socrates (more specifically, a temporal segment of Socrates’ counterpart) is present in each region; an event essentially occurs in spacetime region R if and only if each member is either R or a counterpart of R, and so on. (Lewis, 1983, pp. 248-9)

Essences are not to be mainly extrinsic, such as, for instance, an event that is essentially a widowing, nor are they to be overly varied disjuncts, that is, essences like, “an event that is essentially a walking and another that is essentially a talking.” Lewis’ rationale for these requirements stems from his interests in tailoring a theory of events to his accounts of explanation and counterfactuals. For instance, he rules out mainly extrinsic events, using the (purported) event of the widowing of Xanthippe as an example, on the following grounds:

They offend our sense of economy. We would seem to count the death of Socrates twice over in our inventory of events. . .(2) they stand in relations of non-causal counterfactual dependence to those genuine events in virtue of which they occur. Without the death of Socrates the widowing of Xanthippe would not have occurred. (She might still have been widowed sooner or later. But recall that the widowing of Xanthippe, as I defined it, had its time essentially.). . .(3) They also stand in relations of non-causal counterfactual dependence to other genuine events, events logically independent of them. Without the widowing of Xanthippe, the subsequent cooling of Socrates’ body would not have occurred. (For in that case he would not have died when he did.) (Lewis, 1983, p.263)

iv. Fine-Grainedness and Logical Relations Between Events

The needs of Lewis’ counterfactual analysis of causation motivate Lewis to adopt a fine-grained notion of event. Suppose that John greets someone, and being rather tense, he says hello loudly. If he wasn’t tense he would have merely said hello softly. Lewis claims that two events of greeting occur:

John says “Hello.” He says it rather too loudly. Arguably there is one event that occurs which is essentially a saying “hello” and only accidentally loud; it would have occurred even if John had spoken softly. Arguably there is a second event that implies, but is not implied by, the first. This event is essentially a saying “Hello” loudly, and it would not have occurred if John had said “Hello” but said it softly. Both events actually occur, but the second could not have occurred without the first. (Lewis, 1983, p.255)

On this view two events of greeting occur, one with a richer essence than the other. The richer event, call it e1, is essentially a loud greeting and would not have occurred if the greeting had been soft; e2 is essentially a greeting and is only accidentally loud. It would have occurred if the greeting had been soft. As with Kim’s theory, many of those interested in a theory of events that tracks our ordinary event concept would find this result too fine-grained. From the vantage point of Lewis’ interests in his theory this unintuitive result is not a serious problem; again, capturing our ordinary event concept is not Lewis’ stated project. Lewis makes his motivation for the fine-grainedness clear in the following passage:

The real reason why we need both events. . . is that they differ causally. An adequate causal account of what happens cannot limit itself to either one of the two. The first event (the weak one) caused Fred to greet John in return. The second one (the strong one) didn’t. If the second one had not occurred — if John hadn’t said “Hello” so loudly — the first one still might have, in which case Fred still would have returned John’s greeting. Also there is a difference on the side of causes: the second event was, and the first wasn’t, caused inter alia by John’s state of tension. (Lewis, 1983, p. 255)

The rather counterintuitive fine-grainedness seems to be a necessary evil. As it happens, the events are regarded as being different in order that the theory of events can satisfy the needs of Lewis’s theory of causation. But doing so raises a problem: to regard the events as being distinct, when coupled with Lewis’ counterfactual theory of causation, would yield the undesirable result that the first event causes the second. This would be undesirable because the one event implies the other and intuitively, logically related events do not stand in causal relations with each other. Lewis’ way of handling this case is to regard the two events as being different, but not distinct, and to claim that non-distinct events do not stand in causal relations.

4. Conclusion

The selection of a theory of events is not a matter which one decides independently of one’s other metaphysical interests and commitments. In the context of our discussion, we have noted a number of relative strengths and weaknesses which can help to guide the reader’s own selection of a theory of events. Of course, further philosophical developments may yield a theory of events which is more attractive than the approaches discussed here. And there are some truly worthwhile, although less-influential, theories of events that have not been discussed in this article.

5. References and Further Reading

  • Armstrong, David. A Combinatorial Theory of Possibility, Cambridge: Cambridge University Press, 1989.
  • Bennett, Jonathan Francis. Events and Their Names. Indianapolis: Hackett Pub. Co., 1988.
  • Bennett, Jonathan Francis. “Precis of Events and Their Names,” Philosophy and Phenomenological Research 51 (1991): 625-628.
  • Brand, Myles. “Identity Conditions for Events.” American Philosophical Quarterly 14 (1977): 329-337.
  • Casati, Roberto, Varzi, Achille, “Events,” The Stanford Encyclopedia of Philosophy (Fall 2002 Edition), Edward N. Zalta (ed.), URL = http://plato.stanford.edu/archives/fall2002/entries/events/.
  • Clark, Romane. “Concerning the Logic of Predicate Modifiers,” Noûs. 4 (1970): 311-335.
  • Davidson, Donald. Essays on Actions and Events. New York: Oxford University Press, 1980.
  • Davidson, Donald. “Reply to Quine on Events,” in Actions and Events: Perspectives on the Philosophy of Donald Davidson, eds. E. Lepore and B. McLaughlin. Oxford: Basil Blackwell, 1985, pp. 172-176.
  • Evnine, Simon. Donald Davidson. Stanford: Stanford Univ. Press, 1991.
  • Horgan, Terence. “The Case Against Events,” Philosophical Review 87 (1978): 28-47.
  • Kim, Jaegwon. Supervenience and Mind: Selected Philosophical Essays. New York: Cambridge University Press, 1993.
  • Lewis, David. Counterfactuals. Oxford: Blackwell, 1973.
  • Lewis, David. “New Work for a Theory of Universals,” Australasian Journal of Philosophy 61 (1983): 343-377.
  • Lewis, David. Philosophical Papers. Oxford: Oxford University Press, 1983.
  • Lewis, David. On the Plurality of Worlds. Oxford: Basil Blackwell, 1986.
  • Lombard, Lawrence. “Ontologies of Events,” in Contemporary Readings in the Foundations of Metaphysics, eds. Cynthia Macdonald and Stephen Laurence. Oxford: Blackwell, 1998.
  • Loux, Michael. The Possible and the Actual: Readings in the Metaphysics of Modality. New York: Cornell University Press, 1980.
  • Oliver, Alex. “The Metaphysics of Properties,” Mind 105 (1996): 1-80.
  • Plantinga, Alvin. “Actualism and Possible Worlds,” Theoria 42 (1976).
  • Quine, W.V.O. “Events and Reification,” in Actions and Events: Perspectives on the Philosophy of Donald Davidson, eds. E. Lepore and B. McLaughlin. Oxford: Basil Blackwell, 1985, pp. 162-171.

Author Information

Susan Schneider
Email: Susan@moravian.edu
Moravian College
U. S. A.

Charles Sanders Peirce: Architectonic Philosophy

The subject matter of architectonic is the structure of all human knowledge. The purpose of providing an architectonic scheme is to classify different types of knowledge and explain the relationships that exist between these classifications. The architectonic system of C. S. Peirce (1839-1914) divides knowledge according to its status as a “science” and then explains the interrelation of these different scientific disciplines. His belief was that philosophy must be placed within this systematic account of knowledge as science. Peirce adopts his architectonic ambitions of structuring all knowledge, and organizing philosophy within it, from his great philosophical hero, Kant. This systematizing approach became crucial for Peirce in his later work. However, his belief in a structured philosophy related systematically to all other scientific disciplines was important to him throughout his philosophical life.

Table of Contents

  1. The Architectonic System
  2. Mathematics and Philosophy
  3. Philosophy
    1. Phenomenology
    2. The Normative Sciences
      1. Aesthetics and Ethics
      2. Logic
    3. Metaphysics
  4. The Importance of the Systematic Interpretation
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. The Architectonic System

In later work, Peirce began to organize and systematize his philosophy in terms of its relation to other areas of knowledge. More crucially for his philosophy, though, this enabled him to make explicit the structure and interrelation of different areas of his philosophical thought. In work like his 1902 Carnegie Institute Application, letters to friends, and more conventional writings, Peirce placed his philosophy within a hierarchical classification of sciences. Within this systematization of sciences, “science” is a broad term meaning any organization of human knowledge. The result is that disciplines like history, biographical study and art criticism count as “science.” The sheer number of sciences involved in Peirce’s classification, then, meant that he needed to sub-divide them further. The basis of Peirce’s sub-divisions is not altogether clear or straightforward, but he seems to count Philosophy as a “formal science of discovery.” What Peirce means by this is that Philosophy is concerned with discovering the formal or necessary conditions for the objects with which it concerns itself. Whether this is an accurate classification of philosophy is hard to say, but the idea is that philosophy shares some formal (i.e. quest for necessary conditions) concerns with mathematics and shares a concern for discovering knowledge with the empirical or physical sciences, like chemistry or physics; hence philosophy is a “formal science of discovery.” The hierarchical classification of sciences in relation to philosophy and the hierarchical structure of philosophy itself, then, looks, roughly, as follows:

1) MATHEMATICS

2) PHILOSOPHY, which consists of:

    a) Phenomenology

    b) Normative Science, which consists of:

        i) AESTHETICS

        ii) ETHICS

        iii) LOGIC, which consists of:

            a) Philosophical Grammar

            b) Critical Logic

            c) Methodeutic

    c) Metaphysics

3) PHYSICAL SCIENCE

Figure 1

In creating a systematized classification of science, Peirce hoped to make the connection between different areas of his thought clear, not only to others, but also to himself. If Peirce was able to see how his pragmatism, say, was related to other areas of his philosophy, and how his philosophy in general related to other sciences, he might be able to gain insights into the theory of pragmatism as a consequence. Peirce was, however, aware that a systematic classification of sciences is, to some extent, an abstraction that simplifies the relations between sciences. For the most part, though, he found that it accurately represented his thoughts on philosophy and was a useful tool for organizing his theories.

As suggested already, the sciences and philosophy are organized in a hierarchical fashion. So, from Figure 1, we can see that Mathematics is super-ordinate to philosophy, and philosophy super-ordinate to the physical sciences. Similar relations of super and sub-ordinacy also exist within philosophy and within particular branches of philosophy. The first thing to clarify is that the sub-ordinacy of philosophy to mathematics, or metaphysics to phenomenology, is not sub-ordinacy in the sense of embeddedness, i.e., philosophy is not a sub-branch of mathematics. Of course, embedded sub-ordinacy does occur in Peirce’s classification where, for instance, aesthetics is a sub-branch of Normative Science, just as ethics and logic are. However, ethics and logic are not sub-branches of aesthetics, even though they are sub-ordinate to it. So, what is the nature of the non-embedded sub-ordinacy of, say, philosophy to mathematics?

Non-embedded sub-ordinacy is more a notion of linear priority than topical subsumption. This is because Peirce is organizing sciences in a fashion popularized by Auguste Comte in the nineteenth century, whereby super-ordinate sciences provide general laws or principles for sub-ordinate sciences which provide concrete, realized cases of those general principles. Super-ordinacy, then, is meant to be linear priority in terms of prior provision of general principles, and sub-ordinacy, the posterior realization of those general principles. A contrived example of how this works may go something as follows:

Psychology provides general principles that suggest that the emotional states of human beings are manipulable through sound, i.e., human emotion is susceptible to auditory suggestion. Using that principle, musicians can discover that musical arrangements in minor keys, particularly D minor, evoke sadness amongst listeners. Wagner, for instance, discovered that all chords have a corresponding chord which “resolves” the sequence, leaving the listener satisfied. By consistently refusing to “resolve” chords in his music, Wagner was able to induce tension and anxiety amongst his listeners wherever he wished to do so. These cases of actual musical practice provide concrete, confirming phenomena of the general psychological principle. Psychology, then, is super-ordinate to music, in the sense that it provides general principles for musical practice.

Applied to the hierarchy in Figure 1, mathematics provides general laws, which Peirce often calls guiding or leading principles, for philosophy. Philosophy, in turn, provides concrete or confirming cases of those laws. Similar relations exist within philosophy itself, and between philosophy and the empirical sciences. Peirce is not always forthcoming with explicit examples of guiding principles, but, as we shall examine in more detail below, in the case of philosophy and its super-ordinate science, mathematics, he gives us a good indication of what he has in mind.
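The two kinds of sub-ordinacy can be made concrete with a small sketch. The encoding below is an illustrative assumption, not anything Peirce wrote: the classification in Figure 1 is stored as an ordered tree, embedded sub-ordinacy is read as "is a sub-branch of," and non-embedded sub-ordinacy is read as "comes later in the ordering without being a sub-branch." The names `CLASSIFICATION`, `path_to`, `is_embedded_in` and `draws_principles_from` are hypothetical.

```python
# Hypothetical encoding of Figure 1: each entry is (name, sub-branches), listed
# in order of priority, so earlier entries are super-ordinate to later ones.
CLASSIFICATION = [
    ("Mathematics", []),
    ("Philosophy", [
        ("Phenomenology", []),
        ("Normative Science", [
            ("Aesthetics", []),
            ("Ethics", []),
            ("Logic", [
                ("Philosophical Grammar", []),
                ("Critical Logic", []),
                ("Methodeutic", []),
            ]),
        ]),
        ("Metaphysics", []),
    ]),
    ("Physical Science", []),
]


def path_to(name, branches=CLASSIFICATION, prefix=()):
    """Return the index path to a science, or None if it is not listed."""
    for i, (label, children) in enumerate(branches):
        here = prefix + (i,)
        if label == name:
            return here
        found = path_to(name, children, here)
        if found is not None:
            return found
    return None


def is_embedded_in(sub, sup):
    """Embedded sub-ordinacy: `sub` is a sub-branch of `sup`."""
    p_sub, p_sup = path_to(sub), path_to(sup)
    return len(p_sub) > len(p_sup) and p_sub[:len(p_sup)] == p_sup


def draws_principles_from(sub, sup):
    """Non-embedded sub-ordinacy: `sup` comes earlier in the ordering
    and `sub` is not one of its sub-branches."""
    p_sub, p_sup = path_to(sub), path_to(sup)
    return p_sup < p_sub and not is_embedded_in(sub, sup)


print(is_embedded_in("Aesthetics", "Normative Science"))   # sub-branch relation
print(draws_principles_from("Ethics", "Aesthetics"))       # linear priority only
print(draws_principles_from("Philosophy", "Mathematics"))  # linear priority only
```

On this reading, ethics draws its principles from aesthetics without being a part of it, which is the distinction the preceding paragraphs draw.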

2. Mathematics and Philosophy

Peirce divides mathematics into three areas that correspond roughly to discrete mathematics, mathematics of the infinite, and mathematical or formal logic. We now think of Peirce’s groundbreaking work in mathematical logic as belonging to logic proper rather than being a branch of mathematics. More important though is the role of mathematics as the provider of guiding principles for subsequent sciences, and particularly philosophy. Following his father, Peirce treated mathematics as “the science which draws necessary conclusions.” What Peirce means is that mathematics is free from existential concerns about its constructs. In this sense, it is hypothetical and abstract. Peirce, for instance, states that mathematics “makes constructions in the imagination according to abstract precepts, and then observes these imaginary objects, finding in them relations of parts not specified in the precept of construction.” What Peirce means is that mathematics creates hypothetical constructions, i.e., constructions which are abstracted and not necessarily actual, and then derives logically necessary connections between them and about them. These “necessary conclusions” about mathematical constructs provide general laws or principles for deriving logically necessary connections between and about all constructs, imaginary or actual. In short, the kinds of reasoning employed in mathematics provide general rules of reasoning, and function as principles to guide our reasoning in subsequent science, particularly philosophy.

For example, we can see the provision of guiding or leading principles from mathematics through the following story about irrational numbers. An irrational number is a number which cannot be expressed as the ratio of two integers. That is, an irrational number has a non-terminating, non-repeating decimal expansion. How did our number systems develop to include numbers other than the rationals? One thought is that Pythagoras realized that there necessarily exists no pair of positive integers such that the square of one is twice the square of the other. The way he came to this conclusion is by noting that in a square whose sides measure one unit in length, the diagonal measures neither one unit nor two units, nor any ratio of whole units. Consequently, there must exist some other kind of non-rational number which enables us to explain the length of a square’s diagonal in relation to its sides. Now, the way in which Pythagoras came to this conclusion was to note certain features about some diagram (of a square), abstract important features from that particular case, and draw a more general conclusion. These methods of abstraction and generalization are precisely the kind of thing that Peirce has in mind when he says that mathematics, as a super-ordinate science, provides guiding principles for philosophy.
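The necessity involved in the diagonal example can be spelled out. What follows is the standard modern proof by contradiction, offered as a sketch of the kind of necessary reasoning at issue rather than as a reconstruction of Pythagoras’s own argument. The symbols p, q and k are introduced here for illustration. For a unit square the diagonal d satisfies d² = 1² + 1² = 2.

```latex
Suppose $\sqrt{2} = p/q$ with $p, q$ positive integers sharing no common factor.
\begin{align*}
  p^2 &= 2q^2 && \text{(squaring $\sqrt{2} = p/q$)} \\
  p &= 2k && \text{($p^2$ is even, so $p$ is even)} \\
  4k^2 &= 2q^2 \;\Longrightarrow\; q^2 = 2k^2 && \text{(so $q$ is even too)}
\end{align*}
Both $p$ and $q$ are then divisible by $2$, contradicting the assumption that
they share no common factor. Hence no such pair exists and $\sqrt{2}$ is irrational.
```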

3. Philosophy

Philosophy is divided into three orders: phenomenology, or the science of how things appear to us; the normative sciences, which study how we ought to act; and metaphysics, the study of what is real. Philosophy takes from mathematics the principles of drawing necessary consequences from hypotheses. Further, the three branches of philosophy have hierarchical relationships. Phenomenology uses the principles of mathematics and theorizes on the necessary qualities that all phenomena must have. After this, the normative and metaphysical sciences use, reflect and provide concrete cases of these phenomenological findings.

Similar divisions occur within the branches of philosophy but the most interesting of these is the division within normative science between aesthetics, ethics and logic. Logic within normative science is conceived as semiotics, or the study of signs, and is strongly epistemological in its concern with the structure of knowledge and understanding. As the hierarchy suggests, logic is dependent upon ethics and ethics upon aesthetics. All of these are dependent upon the principles of phenomenology and, more broadly still, upon mathematics. Further, they are all super-ordinate to metaphysics. This is largely because metaphysics concerns itself with the reality and place within nature of these objects. Metaphysics, as the science of what is real, is most similar to the physical sciences and is in many ways meant to be a bridging discipline between philosophy and natural science. As should be clear, the hierarchy moves from abstract disciplines to those whose study involves phenomena that are more concrete.

We know how the three philosophical sub-disciplines are meant to relate to each other in terms of the hierarchy. However, we have yet to examine Peirce’s theories of phenomenology, normative science, and metaphysics in any detail. In the following sections, though, we shall examine each of the three sub-disciplines, and in the case of normative science its sub-sub-disciplines, and look a little more closely at what Peirce takes these topics to concern.

a. Phenomenology

The first and most abstract of philosophy’s sub-disciplines is phenomenology. For Peirce, phenomenology is the science of appearances and is abstract in the sense that its subject matter is still general and hypothetical, just as the constructs of mathematics are. However, whereas the general hypothetical subject of mathematics and mathematical reasoning is any theoretical construct, for phenomenology the constructs are those of experience, considered in generalized terms.

In his discussion of phenomenology, Peirce divides all our experience into three general, universal categories and names them firstness, secondness, and thirdness. Peirce’s categories are notoriously hard to understand. Indeed, Peirce thought phenomenology to be a science of which we could gain only a hazy grasp until we discovered the categories for ourselves in the course of our own experiences. The major problem with the categories, though, is that they are general and therefore difficult to explain in readily comprehensible terms. The best way to understand the categories, then, is to look at concrete examples that, in some way, exemplify firstness, secondness, or thirdness.

Peirce usually attempts to explain firstness, in general terms, as quality or feeling. It is perhaps more intuitive to grasp firstness this way: think of William James, Charles Peirce and Karl Marx; they all share the quality of being bearded. Let us abstract “beardedness” from this group of men and, when we consider that abstraction in and of itself, we are considering a firstness which those philosophers all share. Of course, the general concept of firstness is purer than this; “beardedness” is just an exemplification of it. Another example might come from Wittgenstein’s discussion in the Philosophical Investigations of how we attend to shapes and colors of some objects. When I try to observe the shape of a vase, in separation from its color, size, etc., by squinting my eyes and tilting my head, I am attempting to observe a firstness of that object.

Resistance, existence and otherness are all examples of secondness. Peirce often uses the scholastic concept of haecceity, or “thisness,” to explain our experience of secondness. The idea is that when we experience some thing, we experience it as separate from other phenomena and as a brute thing of existence. It is this brute confrontational singularity that a thing experienced must have that Peirce thinks exemplifies secondness. It is our experience of an object as a thing separate to others within the universe that is an experience of secondness. A rather strange example might prove helpful in coming to understand what our experience of secondness might be like. Some historical commentaries on the first landings of the Spanish Conquistadors in South America report how the natives were in awe of these strange four-legged, two-armed, two-headed God-like creatures. It seems that the Spanish rode ashore on horseback. Having never seen horses or white men before (let alone white men riding horses), the natives assumed that this was one creature. This seems like a rather strange case, but it perhaps provides a startling example of how we must re-organize our understanding when our experience fails to distinguish two instances of secondness. Of course, the minute the Conquistadors dismounted, the natives experienced the invader as separate to his horse, thereby experiencing his secondness.

Our experiences of mediation, intelligibility or understanding are examples of thirdness. When we place some experience within the structure of our understanding, when we assimilate an experience, we are experiencing thirdness. In many ways, thirdness is similar to the Hegelian notion of “synthesis” and captures the notions of development and growth. When we experience thirdness, we experience some sense of bringing phenomena into order with our knowledge. Principal exemplifiers of thirdness, then, are phenomena like laws, habits, conventions, reason, etc. Extending our previous example of the Conquistador, when the native saw him dismounted and experienced him as separate from his horse, he might also have come to understand that this stranger was, in fact, a man. This experience of understanding how this phenomenon fits into the world is, according to Peirce, meant to be an experience of thirdness.

The three categories are present in all experience but to differing degrees. Consequently, an experience of a quality like redness has firstness, secondness and thirdness; but it has firstness to a greater extent and so exemplifies that category. To see this, we should at least be clear that, as a quality, “redness” is a firstness just as “beardedness” is. However, our experience of the “redness” as existing means that it has secondness. Otherwise, we would be unable to experience it. And the fact that we are able to understand our experience of “redness” as just such an experience, means that it must also have an element of thirdness, otherwise we would be unable to assimilate that experience. So, our experience of “redness” has all three categories to some extent. However, the actual qualitative aspects of the experience, the very reason we call this an experience of “redness,” are what predominate, and this is why we classify “redness” as a first, even though all of the categories are present to some extent.

Furthermore, despite the abstract nature of phenomenology, i.e., the hypothetical status of its constructs, it is not at odds with Peirce’s scientific and experiential approach. As suggested earlier, Peirce maintains that phenomenology is something that we each must carry out and confirm for ourselves in our own experience. So, despite the initially abstract and theoretical appearance of phenomenology, it remains grounded in practice.

Finally, the universal categories are ever present in Peirce’s work. In some respects, the categories are already present in the antecedent science of mathematics where Peirce describes them in terms of relations. The mathematical equivalent of firstness is one-place relational predicates like, “x is bearded”; of secondness is two-place relational predicates like, “x is the barber of y”; and of thirdness is three-place relational predicates like “x shaves y with z.” The explanation of the categories in terms of relational predicates is an early attempt to explain firstness, secondness and thirdness on Peirce’s part and as such should not be taken as reflecting upon the phenomenological account we are looking at here. It is, however, instructive to see one of Peirce’s alternative attempts at explaining the universal categories. The phenomenological derivation of the categories that we are looking at here is a later development in Peirce’s work, and reflects thought about categories that Peirce had always harbored, and is crucial to his systematic vision of philosophy.
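The relational reading of the categories can be illustrated with a short sketch. It assumes, purely for illustration, that a relation is stored as a set of tuples and that its number of places (its arity) picks out the corresponding category. The barber "Figaro" and the razor are invented facts, and the names `arity`, `category` and `CATEGORY_BY_ARITY` are hypothetical.

```python
# Hypothetical facts, invented for illustration only.
BEARDED = {("Peirce",), ("James",), ("Marx",)}   # x is bearded
BARBER_OF = {("Figaro", "Peirce")}               # x is the barber of y
SHAVES_WITH = {("Figaro", "Peirce", "razor")}    # x shaves y with z

CATEGORY_BY_ARITY = {1: "firstness", 2: "secondness", 3: "thirdness"}


def arity(relation):
    """Number of places in a relation stored as a set of equal-length tuples."""
    lengths = {len(row) for row in relation}
    if len(lengths) != 1:
        raise ValueError("a relation must have a single, fixed number of places")
    return lengths.pop()


def category(relation):
    """Map a one-, two- or three-place relation to its category."""
    return CATEGORY_BY_ARITY[arity(relation)]


for name, relation in [("bearded", BEARDED),
                       ("barber of", BARBER_OF),
                       ("shaves with", SHAVES_WITH)]:
    print(name, "->", category(relation))
```

This captures only the early relational account described above. The later phenomenological account, on which every experience exhibits all three categories to some degree, is not something a count of places can express.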

b. The Normative Sciences

The normative sciences study the norms of worldly interaction. As Phenomenology studies the necessary qualities of experience, the normative sciences prescribe our response to those experiences. Further, there are three sub-areas within the normative sciences: aesthetics, ethics and logic. Aesthetics is the most abstract of the three normative sciences and provides foundational aims for the other prescriptive disciplines. Ethics explores these aims in relation to conduct, and logic explores those aims in relation to reasoning, a particular form of conduct.

i. Aesthetics and Ethics

Peirce’s theories of aesthetics and ethics are not well developed. In many respects, Peirce self-consciously developed them for his system in order to provide foundations for logic. Consequently, his theories of aesthetics and ethics do not look too much like traditional theories. They are aesthetical and ethical in the sense of being theories of what is unconditionally admirable, and what is of value in human conduct, but they are not systematic or extensive. The two disciplines hold the usual hierarchical relations, with the super-ordinate science of aesthetics providing a general, guiding principle for its sub-ordinate science ethics, which in turn provides realized cases of that principle.

The only guiding principle from aesthetics to ethics that Peirce hints at is what he calls the “ultimate aesthetic ideal.” The ultimate aesthetic ideal is, for Peirce, the growth of reason or rationality. He calls this the “growth of concrete reasonableness.” For instance, the discovery that our solar system is heliocentric and not geocentric marks a growth in concrete reasonableness, i.e., an increase in our grasp upon reality. Ethics, then, must take this general aesthetic ideal of the unconditionally admirable and ask, “What is admirable in the way of human conduct?” This makes ethics, for Peirce, a question of what kind of conduct is likely to see the growth of reason or rationality. The right action will take us towards achieving the aesthetic ideal, the wrong action will not.

Right conduct, then, is conduct that is self-controlled and deliberate. Further, it is self-controlled and deliberate in an attempt to achieve the aesthetic ideal. What is more, this self-controlled conduct is not simply about action for the individual in isolation; it is also about setting a precedent and providing an example for a community. For instance, I decide that I will never act without reflection upon rumors. I try, through self-controlled and deliberate response, to reflect upon the content and plausibility of the rumors I hear and to find out whether they are truthful or not. Only when I have done this do I act. Here is a case of adopting a particular kind of conduct with the aim of seeing the world become a more reasoned and rational place. However, when I die, my contribution to concrete reasonableness passes with me, unless I can spread this deliberate conduct further. This is precisely what Peirce thinks our ethical conduct should do; not by being purely about individual conduct, but by contributing habits, tendencies and general principles in conduct that others can see and adopt. Our contribution to achieving the aesthetic ideal, then, is not just the adoption of self-controlled conduct, but also establishing such conduct as a communal habit or convention. The growth of concrete reasonableness requires more than just action; it requires continued action.

Peirce has very little more to say about aesthetics and ethics. It appears the notions of the ultimate aesthetic ideal and what is unconditionally admirable in the way of human conduct are only interesting to Peirce as general guiding principles for the sub-ordinate discipline of logic.

ii. Logic

The third of the normative sciences, logic, takes the aim of aesthetics and the principles of ethics and applies them to reasoning. Logic, then, is self-controlled reasoning aimed at the growth of concrete reasonableness. It is as a form of conduct that logic takes a sub-ordinate position to ethics in the philosophical hierarchy.

Logic itself has three branches: Philosophical Grammar, Critical Logic and Methodeutic. Philosophical Grammar, often called Speculative Grammar, is a theoretical explanation and exploration of the nature of signs. This is the area within the hierarchy for Peirce’s famous theory of Semiotics. It is located within logic conceived as the self-controlled conduct of reasoning because Peirce takes all thought, and so all reasoning, to occur through the use of signs. Philosophical Grammar, then, studies the nature of the basic phenomena of reasoning: signs. Signs are essentially triadic phenomena on Peirce’s account, consisting of a sign vehicle, an object and an interpretant or interpreting thought which takes the sign to stand for its object. For instance, a fever is a sign of illness, which I understand as requiring treatment with medicine. The fever is the sign, the illness is its object, and my understanding of this connection is the interpretant. Peirce continually developed complex classifications for signs depending on the inter-relation between the sign, the object and the interpretant.

In many ways, we can see the sign as a concrete case of a general principle from phenomenology, which tells us that each experience will have firstness, secondness and thirdness. Indeed, Peirce sees the sign-vehicle as a firstness, the object as a secondness and the interpretant as a thirdness. However, after 1903, Peirce did not press this reflection of the phenomenological categories in his semiotic too far, even though he remained convinced that it existed.

The second branch of logic is Critical Logic, which studies types of argument. However, Peirce discusses more than just deductive arguments or reasoning within this branch of logic. He also includes discussion of inductive and abductive reasoning. Inductive reasoning, for Peirce, is quantitative reasoning and bears close resemblance to statistical analysis. On Peirce’s analysis, induction is reasoning or argument to a general rule for a population based upon a sample from it. For instance, my sample of the metals in coins leads me to conclude that the pennies in current circulation have approximately 30% copper content. I have induced a general rule about the copper content of all pennies from a random sample of, say, 5% of the pennies in circulation. The more sampling I do the more accurate my general rule will become.
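The claim that more sampling yields a more accurate rule can be sketched numerically. The figures below are invented: a hypothetical population of 10,000 pennies with copper fractions scattered around 30%, from which a small and a larger random sample are drawn. The standard error of the mean is used here as a modern stand-in for "accuracy"; it is an assumption of this sketch, not Peirce’s own measure.

```python
import random
import statistics


def induce_rule(sample):
    """Estimate the population mean from a sample, with its standard error."""
    mean = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / len(sample) ** 0.5
    return mean, std_error


# Hypothetical population: copper fraction of 10,000 pennies, roughly 30% on average.
rng = random.Random(0)
population = [rng.choice([0.28, 0.30, 0.32]) for _ in range(10_000)]

small = rng.sample(population, 50)    # a 0.5% sample
large = rng.sample(population, 500)   # a 5% sample, as in the example above

small_mean, small_error = induce_rule(small)
large_mean, large_error = induce_rule(large)

print(f"n=50:  mean={small_mean:.4f}, standard error={small_error:.5f}")
print(f"n=500: mean={large_mean:.4f}, standard error={large_error:.5f}")
```

Both samples support the same general rule (about 30% copper), but the larger sample supports it with a narrower margin of error.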

Abductive reasoning is similar to inference to the best explanation and provides conjectures for general rules by proffering some explanatory hypothesis based on some phenomena that we already know. A quick and simple way to grasp how Peirce thinks that abduction and induction are argument forms is to look at their structure in relation to the standard deductive syllogism. Consider the deductively valid argument: all felines are furry; all lions are felines; so all lions are furry. We can recast this to reflect the inductive form of argument like this: all lions are furry; all lions are felines; so all felines are furry. This is obviously a probabilistic argument based on sampling from a general population. We take what we know of some sample population – in this case, that lions as a sample of the general feline population are furry – and conclude that this is present in the population as a whole.

Again, we can recast the structure of the deductive argument to reflect abductive reasoning like this: all felines are furry; all lions are furry; so all lions are felines. Here we are taking two phenomena, the furriness of felines and the furriness of lions, and providing a conjecture that attempts to explain both phenomena with a single general rule.
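The recasting of the syllogism can be laid out mechanically. The sketch below labels the three propositions "rule," "case" and "result"; those labels are an assumption added here for illustration and do not appear in the text above. Each form takes two of the three propositions as premises and yields the third as its conclusion.

```python
from typing import NamedTuple


class Argument(NamedTuple):
    premises: tuple
    conclusion: str


def three_forms(rule, case, result):
    """Arrange the same three propositions as deduction, induction and abduction."""
    return {
        "deduction": Argument((rule, case), result),
        "induction": Argument((result, case), rule),
        "abduction": Argument((rule, result), case),
    }


forms = three_forms(
    rule="all felines are furry",
    case="all lions are felines",
    result="all lions are furry",
)

for name, argument in forms.items():
    print(f"{name}: {'; '.join(argument.premises)}; so {argument.conclusion}")
```

Only the first arrangement guarantees its conclusion. The other two offer a generalization and an explanatory conjecture respectively.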

Obviously, neither induction nor abduction is deductively valid, but Peirce still considers them to be important forms of reasoning and devotes discussion to them within the Critical Logic. Critical Logic also explains, through a discussion of how these arguments are useful, what counts as good or bad reasoning. Consequently, it further explains the purpose of the normative discipline of logic considered as a form of self-controlled conduct.
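That the inductive and abductive forms are not deductively valid can be checked by searching for counterexamples. The sketch below makes two assumptions that are not in the text: "all A are B" is read in the modern way as set inclusion (so it is vacuously true when there are no As), and the search covers only a two-member universe. Finding a counterexample there is enough to show invalidity. Finding none is not by itself a proof of validity, though the deductive form is in fact valid because inclusion is transitive. The names `UNIVERSE`, `subsets`, `all_are` and `counterexample` are hypothetical.

```python
from itertools import combinations, product

UNIVERSE = (0, 1)  # two individuals suffice to expose the invalid forms


def subsets(items):
    """All subsets of `items`, as frozensets."""
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def all_are(xs, ys):
    """'All xs are ys', read as set inclusion (vacuously true if xs is empty)."""
    return xs <= ys


def counterexample(premises, conclusion):
    """Return the first (lions, felines, furry) assignment that makes the premises
    true and the conclusion false, or None if there is none over UNIVERSE."""
    for lions, felines, furry in product(subsets(UNIVERSE), repeat=3):
        if premises(lions, felines, furry) and not conclusion(lions, felines, furry):
            return lions, felines, furry
    return None


deduction = counterexample(
    lambda l, f, u: all_are(f, u) and all_are(l, f),
    lambda l, f, u: all_are(l, u))
induction = counterexample(
    lambda l, f, u: all_are(l, u) and all_are(l, f),
    lambda l, f, u: all_are(f, u))
abduction = counterexample(
    lambda l, f, u: all_are(f, u) and all_are(l, u),
    lambda l, f, u: all_are(l, f))

print("deduction has a counterexample:", deduction is not None)
print("induction has a counterexample:", induction is not None)
print("abduction has a counterexample:", abduction is not None)
```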

The third branch of logic is Methodeutic. Methodeutic is home to Peirce’s theories of truth and inquiry and his pragmatic maxim. It concerns the use of signs and argument to create habits and forms of conduct conducive to achieving the logical take on the aesthetic ideal, a steady state of doubt-resistant beliefs. For Peirce, the aim of logic or reasoning is to achieve a settled state of belief. The growth of this steady state comes from our desire to eradicate doubt, which causes considerable consternation according to Peirce. Whenever we encounter some phenomenon that casts doubt upon a belief of ours, we feel compelled to find the cause of the recalcitrant experience and settle our beliefs once more. This leads to a steady growth in our body of recalcitrance-proof beliefs. Methodeutic, then, is the study of inquiry, or growth through reasoning in action.

c. Metaphysics

The final branch of philosophy is Metaphysics, the study of what is real. As phenomenology studies the necessary qualities of our experience, and the normative sciences prescribe our response to them, Metaphysics studies whether or not the objects of experience are real.

The first thing to note about Peirce’s metaphysics is that it is still a distinctly “hands on” affair. Peirce’s metaphysics, commonly labeled “scientific metaphysics,” attempts to explain the reality of the phenomenological categories and of the methods and principles of inquiry as expounded in the normative sciences. This is all in contra-distinction to “Ontological Metaphysics,” or metaphysics conducted by a priori reasoning. Peirce’s pragmatism means that he is at odds with this kind of metaphysical endeavor. Since a concept’s meaning relies upon its practical bearings, and the bulk of a priori metaphysics makes no difference to practice or experience, the bulk of a priori metaphysics is meaningless. Again, this is similar to the verificationist’s anti-metaphysical arguments, but where the logical positivists take this to mean the death of metaphysics, Peirce takes this to mean that a worthwhile metaphysics must be scientific, fallible, cautiously approached, and sub-ordinate to logic.

As with the normative sciences, Peirce makes various distinctions within the branch of metaphysics. Most interesting are his discussions of the reality of his phenomenological categories of firstness, secondness and thirdness and his evolutionary cosmology. In his discussion of the reality of the phenomenological categories, Peirce returns to the subject of his first philosophical discipline, phenomenology, where he identifies the three categories of firstness, secondness and thirdness as general features of all experience. Here, in his metaphysical work, Peirce turns his discussion to the reality of these phenomenological categories. His concern is to ask whether all or any of those categories are real independently of you or me. Does thirdness, for instance, really exist? If it does, then, on Peirce’s view, “possibles” exist.

Peirce places himself with Aristotle, Kant and the Scottish Common-sense philosopher Thomas Reid in taking all three of the phenomenological categories to be real. However, since he takes his own “three category realism” to most strongly reflect the work of John Duns Scotus, Peirce labels himself a “scholastic realist.” Peirce also characterizes other theories and philosophers depending on their own commitments to the reality of the phenomenological categories. For instance, in Peirce’s opinion, “nominalism” does not take the category of thirdness to be real. Although the term “nominalism” is more normally part of mediaeval debate on the existence (or not) of universals, Peirce uses the term to refer to any theory that seems too hardheadedly committed to the explanation of phenomena in terms of concrete existent particulars. It is for this reason that Peirce labels as nominalist any theory which does not take the real existence of laws, generalities, possibilities, etc. seriously (i.e., is not committed to the existence of thirds or thirdness). Of course, it is possible, in Peirce’s opinion, to move too far in the opposite direction. According to Peirce, Hegel’s philosophy, for instance, places too much emphasis on thirdness at the expense of the other categories. Peirce’s own commitment to a three-category realism, though, is the source of the acute anti-nominalism which affects much of his other philosophical work.

Peirce’s cosmological metaphysics is perhaps the most interesting of his metaphysical writings. Where his general metaphysics discusses the reality of the phenomenological categories, his cosmological work studies the reality and relation to the universe of his work in the normative sciences. The cosmological metaphysics looks at the aesthetic ideal (the growth of concrete reasonableness) and its attainment through growth and habit in the universe at large. In Peirce’s cosmology, the universe grows from a state of nothingness to chaos, or all-pervasive firstness. From the state of chaos, it develops to a state in which time and space exist, or a state of secondness, and from there to a state where it is governed by habit and law, i.e., a state of thirdness. The universe does this, not in a mechanistic or deterministic way, but by tending towards habit and a law-like nature through chance and spontaneous transition. This chance-like transition towards thirdness is the growth of concrete reasonableness, i.e., the attainment of the aesthetic ideal through the spontaneous development of habit.

Peirce’s evolutionary cosmology has left many commentators uneasy about its relation to the rest of his work. His development of it during his own lifetime led some of his friends to fear for his sanity. Indeed, Peirce’s turn towards cosmological metaphysics is often attributed to a mystical experience and crisis of faith in the 1890s. In truth, Peirce takes his cosmological work to be the logical upshot of the normative sciences and logic, which show the nature and desirability of the growth of reason. Cosmological metaphysics merely shows how the growth of concrete reasonableness occurs in the universe at large.

4. The Importance of the Systematic Interpretation

Traditionally, the systematic background to Peirce’s theories of, say, pragmatism, inquiry, or the categories is ignored. This has led to a failure to appreciate its significance to the detail of individual theories. Instead, the assessment of Peirce’s philosophy is often made on an issue-by-issue basis. Take, for instance, Peirce’s pragmatism. Its relation to the broader system enables Peirce to state his pragmatism and show how it need not lapse into nominalism, which is generally the outcome of pragmatic or verificationist principles. Understanding Peirce’s devout anti-nominalism requires some grasp of his system and the place of the pragmatic maxim within it.

This, of course, is not to say that Peirce’s philosophy must live and die by the systematic view. It is possible to take Peirce’s views on individual topics and find much of value in them. However, interpreting Peirce’s philosophy without any appreciation of the systematic background faces the danger of making serious mistakes about the import and intent of Peirce’s work. Returning again to Peirce’s account of pragmatism, without the systematic background to provide some sense of Peirce’s commitment to anti-nominalism and belief in the possibility of a scientific metaphysics, his pragmatism looks like a simple forerunner of the Logical Positivist’s verification principle. Although common, such an interpretation fails to reflect the nuances of Peirce’s theory. Reaching a full understanding of Peirce’s work on individual topics, then, is always best achieved with an eye on the systematic background.

5. References and Further Reading

a. Primary Sources

  • Peirce, C.S. 1931-58. The Collected Papers of Charles Sanders Peirce, eds. C. Hartshorne, P. Weiss (Vols. 1-6) and A. Burks (Vols. 7-8). (Cambridge MA: Harvard University Press).
    • The first widespread presentation of Peirce’s work both published and unpublished; its topical arrangement makes it misleading but it is still the first source for most people.
  • Peirce, C.S. 1982-. The Writings of Charles S. Peirce: A Chronological Edition, eds. M. Fisch, C. Kloesel, E. Moore, N. Houser et al. (Bloomington IN: Indiana University Press).
    • The ongoing vision of the late Max Fisch and colleagues to produce an extensive presentation of Peirce’s views on a par with The Collected Papers, but without its idiosyncrasies. Currently published in eight volumes (of thirty) up to 1884, it is rapidly superseding its predecessor.
  • Peirce, C.S. 1992-94. The Essential Peirce, eds. N. Houser and C. Kloesel (Vol. 1) and the Peirce Edition Project (Vol. 2). (Bloomington IN: Indiana University Press).
    • A crucial two-volume reader of the cornerstone works of Peirce’s writings. Equally important are the introductory commentaries, particularly by Nathan Houser in Volume 1.

b. Secondary Sources

  • Anderson, D. 1995. The Strands of System. (West Lafayette, IN: Purdue University Press).
    • A systematic reading of Peirce’s thought which, in its introduction, makes an in-depth breakdown of the elements of the system and their relation to each other. Its main body reproduces two important papers by Peirce with accompanying commentary.
  • Hookway, C.J. 1985. Peirce. (London: Routledge and Kegan Paul).
    • Important treatment of Peirce as a systematic philosopher but with emphasis on Peirce’s Kantian inheritance and later rejection of the transcendental approach to truth, logic and inquiry.

Author Information:

Albert Atkin
Email: pip99aka@sheffield.ac.uk
University of Sheffield
United Kingdom

Jean-Jacques Rousseau (1712—1778)

Jean-Jacques Rousseau was one of the most influential thinkers during the Enlightenment in eighteenth-century Europe. His first major philosophical work, A Discourse on the Sciences and Arts, was the winning response to an essay contest conducted by the Academy of Dijon in 1750. In this work, Rousseau argues that the progression of the sciences and arts has caused the corruption of virtue and morality. This discourse won Rousseau fame and recognition, and it laid much of the philosophical groundwork for a second, longer work, The Discourse on the Origin of Inequality. The second discourse did not win the Academy’s prize, but like the first, it was widely read and further solidified Rousseau’s place as a significant intellectual figure. The central claim of the work is that human beings are basically good by nature, but were corrupted by the complex historical events that resulted in present day civil society. Rousseau’s praise of nature is a theme that continues throughout his later works as well, the most significant of which include his comprehensive work on the philosophy of education, the Emile, and his major work on political philosophy, The Social Contract, both published in 1762. These works caused great controversy in France and were immediately banned by Paris authorities. Rousseau fled France and settled in Switzerland, but he continued to find difficulties with authorities and quarrel with friends. The end of Rousseau’s life was marked in large part by his growing paranoia and his continued attempts to justify his life and his work. This is especially evident in his later books, The Confessions, The Reveries of the Solitary Walker, and Rousseau: Judge of Jean-Jacques.

Rousseau greatly influenced Immanuel Kant’s work on ethics. His novel Julie or the New Heloise impacted the late eighteenth century’s Romantic Naturalism movement, and his political ideals were championed by leaders of the French Revolution.

Table of Contents

  1. Life
    1. Traditional Biography
    2. The Confessions: Rousseau’s Autobiography
  2. Background
    1. The Beginnings of Modern Philosophy and the Enlightenment
    2. The State of Nature as a Foundation for Ethics and Political Philosophy
  3. The Discourses
    1. Discourse on the Sciences and Arts
    2. Discourse on the Origin of Inequality
    3. Discourse on Political Economy
  4. The Social Contract
    1. Background
    2. The General Will
    3. Equality, Freedom, and Sovereignty
  5. The Emile
    1. Background
    2. Education
    3. Women, Marriage, and Family
    4. The Profession of Faith of the Savoyard Vicar
  6. Other Works
    1. Julie or the New Heloise
    2. Reveries of the Solitary Walker
    3. Rousseau: Judge of Jean Jacques
  7. Historical and Philosophical Influence
  8. References and Further Reading
    1. Works by Rousseau
    2. Works about Rousseau

1. Life

a. Traditional Biography

Jean-Jacques Rousseau was born to Isaac Rousseau and Suzanne Bernard in Geneva on June 28, 1712. His mother died only a few days later on July 7, and his only sibling, an older brother, ran away from home when Rousseau was still a child. Rousseau was therefore brought up mainly by his father, a clockmaker, with whom at an early age he read ancient Greek and Roman literature such as the Lives of Plutarch. His father got into a quarrel with a French captain, and at the risk of imprisonment, left Geneva for the rest of his life. Rousseau stayed behind and was cared for by an uncle who sent him along with his cousin to study in the village of Bossey. In 1725, Rousseau was apprenticed to an engraver and began to learn the trade. Although he did not detest the work, he thought his master to be violent and tyrannical. He therefore left Geneva in 1728, and fled to Annecy. Here he met Louise de Warens, who was instrumental in his conversion to Catholicism, which forced him to forfeit his Genevan citizenship (in 1754 he would return to Geneva and publicly convert back to Calvinism). Rousseau’s relationship to Mme. de Warens lasted for several years and eventually became romantic. During this time he earned money through secretarial, teaching, and musical jobs.

In 1742 Rousseau went to Paris to become a musician and composer. After two years spent serving a post at the French Embassy in Venice, he returned in 1745 and met a linen-maid named Therese Levasseur, who would become his lifelong companion (they eventually married in 1768). They had five children together, all of whom were left at the Paris orphanage. It was also during this time that Rousseau became friendly with the philosophers Condillac and Diderot. He worked on several articles on music for Diderot and d’Alembert’s Encyclopedie. In 1750 he published the Discourse on the Sciences and Arts, a response to the Academy of Dijon’s essay contest on the question, “Has the restoration of the sciences and arts tended to purify morals?” This discourse is what originally made Rousseau famous, as it won the Academy’s prize. The work was widely read and was controversial. To some, Rousseau’s condemnation of the arts and sciences in the First Discourse made him an enemy of progress altogether, a view quite at odds with that of the Enlightenment project. Music was still a major part of Rousseau’s life at this point, and several years later, his opera, Le Devin du Village (The Village Soothsayer), was a great success and earned him even more recognition. But Rousseau attempted to live a modest life despite his fame, and after the success of his opera, he promptly gave up composing music.

In the autumn of 1753, Rousseau submitted an entry to another essay contest announced by the Academy of Dijon. This time, the question posed was, “What is the origin of inequality among men, and is it authorized by the natural law?” Rousseau’s response would become the Discourse on the Origin of Inequality Among Men. Rousseau himself thought this work to be superior to the First Discourse because the Second Discourse was significantly longer and more philosophically daring. The judges were irritated by its length as well as its bold and unorthodox philosophical claims; they never finished reading it. However, Rousseau had already arranged to have it published elsewhere and, like the First Discourse, it was also widely read and discussed.

In 1756, a year after the publication of the Second Discourse, Rousseau and Therese Levasseur left Paris after being invited to a house in the country by Mme. D’Epinay, a friend to the philosophes. His stay here lasted only a year and involved an affair with a woman named Sophie d’Houdetot, the mistress of his friend Saint-Lambert. In 1757, after repeated quarrels with Mme. D’Epinay and her other guests including Diderot, Rousseau moved to lodgings near the country home of the Duke of Luxemburg at Montmorency.

It was during this time that Rousseau wrote some of his most important works. In 1761 he published a novel, Julie or the New Heloise, which was one of the best-selling novels of the century. Then, just a year later in 1762, he published two major philosophical treatises: in April his definitive work on political philosophy, The Social Contract, and in May a book detailing his views on education, Emile. Paris authorities condemned both of these books, primarily for claims Rousseau made in them about religion, which forced him to flee France. He settled in Switzerland and in 1764 he began writing his autobiography, his Confessions. A year later, after encountering difficulties with Swiss authorities, he spent time in Berlin and Paris, and eventually moved to England at the invitation of David Hume. However, due to quarrels with Hume, his stay in England lasted only a year, and in 1767 he returned to the southeast of France incognito.

After spending three years in the southeast, Rousseau returned to Paris in 1770 and copied music for a living. It was during this time that he wrote Rousseau: Judge of Jean-Jacques and the Reveries of the Solitary Walker, which would turn out to be his final works. He died on July 3, 1778. His Confessions were published several years after his death; and his later political writings, in the nineteenth century.

b. The Confessions: Rousseau’s Autobiography

Rousseau’s own account of his life is given in great detail in his Confessions, the same title that Saint Augustine gave his autobiography over a thousand years earlier. Rousseau wrote the Confessions late in his career, and it was not published until after his death. Incidentally, two of his other later works, the Reveries of the Solitary Walker and Rousseau: Judge of Jean-Jacques, are also autobiographical. What is particularly striking about the Confessions is the almost apologetic tone that Rousseau takes at certain points to explain the various public as well as private events in his life, many of which caused great controversy. It is clear from this book that Rousseau saw the Confessions as an opportunity to justify himself against what he perceived as unfair attacks on his character and misunderstandings of his philosophical thought.

His life was filled with conflict, first when he was apprenticed, later in academic circles with other Enlightenment thinkers like Diderot and Voltaire, with Parisian and Swiss authorities and even with David Hume. Although Rousseau discusses these conflicts, and tries to explain his perspective on them, it is not his exclusive goal to justify all of his actions. He chastises himself and takes responsibility for many of these events, such as his extra-marital affairs. At other times, however, his paranoia is clearly evident as he discusses his intense feuds with friends and contemporaries. And herein lies the fundamental tension in the Confessions. Rousseau is at the same time trying both to justify his actions to the public so that he might gain its approval, and to affirm his own uniqueness as a critic of that same public.

2. Background

a. The Beginnings of Modern Philosophy and the Enlightenment

Rousseau’s major works span the mid to late eighteenth century. As such, it is appropriate to consider Rousseau, at least chronologically, as an Enlightenment thinker. However, there is dispute as to whether Rousseau’s thought is best characterized as “Enlightenment” or “counter-Enlightenment.” The major goal of Enlightenment thinkers was to give a foundation to philosophy that was independent of any particular tradition, culture, or religion: one that any rational person would accept. In the realm of science, this project has its roots in the birth of modern philosophy, in large part with the seventeenth century philosopher, René Descartes. Descartes was very skeptical about the possibility of discovering final causes, or purposes, in nature. Yet this teleological understanding of the world was the very cornerstone of Aristotelian metaphysics, which was the established philosophy of the time. And so Descartes’ method was to doubt these ideas, which he claims can only be understood in a confused way, in favor of ideas that he could conceive clearly and distinctly. In the Meditations, Descartes claims that the material world is made up of extension in space, and this extension is governed by mechanical laws that can be understood in terms of pure mathematics.

b. The State of Nature as a Foundation for Ethics and Political Philosophy

The scope of modern philosophy was not limited only to issues concerning science and metaphysics. Philosophers of this period also attempted to apply the same type of reasoning to ethics and politics. One approach of these philosophers was to describe human beings in the “state of nature.” That is, they attempted to strip human beings of all those attributes that they took to be the results of social conventions. In doing so, they hoped to uncover certain characteristics of human nature that were universal and unchanging. If this could be done, one could then determine the most effective and legitimate forms of government.

The two most famous accounts of the state of nature prior to Rousseau’s are those of Thomas Hobbes and John Locke. Hobbes contends that human beings are motivated purely by self-interest, and that the state of nature, which is the state of human beings without civil society, is the war of every person against every other. Hobbes does say that while the state of nature may not have existed all over the world at one particular time, it is the condition in which humans would be if there were no sovereign. Locke’s account of the state of nature is different in that it is an intellectual exercise to illustrate people’s obligations to one another. These obligations are articulated in terms of natural rights, including rights to life, liberty and property. Rousseau was also influenced by the modern natural law tradition, which attempted to answer the challenge of skepticism through a systematic approach to human nature that, like Hobbes, emphasized self-interest. Rousseau therefore often refers to the works of Hugo Grotius, Samuel von Pufendorf, Jean Barbeyrac, and Jean-Jacques Burlamaqui. Rousseau would give his own account of the state of nature in the Discourse on the Origin and Foundations of Inequality Among Men, which will be examined below.

Also influential were the ideals of classical republicanism, which Rousseau took to be illustrative of virtues. These virtues allow people to escape vanity and an emphasis on superficial values that he thought to be so prevalent in modern society. This is a major theme of the Discourse on the Sciences and Arts.

3. The Discourses

a. Discourse on the Sciences and Arts

This is the work that originally won Rousseau fame and recognition. The Academy of Dijon posed the question, “Has the restoration of the sciences and arts tended to purify morals?” Rousseau’s answer to this question is an emphatic “no.” The First Discourse won the Academy’s prize as the best essay. The work is perhaps the greatest example of Rousseau as a “counter-Enlightenment” thinker. For the Enlightenment project was based on the idea that progress in fields like the arts and sciences does indeed contribute to the purification of morals on individual, social, and political levels.

The First Discourse begins with a brief introduction addressing the academy to which the work was submitted. Aware that his stance against the contribution of the arts and sciences to morality could potentially offend his readers, Rousseau claims, “I am not abusing science…I am defending virtue before virtuous men.” (First Discourse, Vol. I, p. 4). In addition to this introduction, the First Discourse is comprised of two main parts.

The first part is largely an historical survey. Using specific examples, Rousseau shows how societies in which the arts and sciences flourished more often than not saw the decline of morality and virtue. He notes that it was after philosophy and the arts flourished that ancient Egypt fell. Similarly, ancient Greece was once founded on notions of heroic virtue, but after the arts and sciences progressed, it became a society based on luxury and leisure. The one exception to this, according to Rousseau, was Sparta, which he praises for pushing the artists and scientists from its walls. Sparta is in stark contrast to Athens, which was the heart of good taste, elegance, and philosophy. Interestingly, Rousseau here discusses Socrates, as one of the few wise Athenians who recognized the corruption that the arts and sciences were bringing about. Rousseau paraphrases Socrates’ famous speech in the Apology. In his address to the court, Socrates says that the artists and philosophers of his day claim to have knowledge of piety, goodness, and virtue, yet they do not really understand anything. Rousseau’s historical inductions are not limited to ancient civilizations, however, as he also mentions China as a learned civilization that suffers terribly from its vices.

The second part of the First Discourse is an examination of the arts and sciences themselves, and the dangers they bring. First, Rousseau claims that the arts and sciences are born from our vices: “Astronomy was born from superstition; eloquence from ambition, hate, flattery, and falsehood; geometry from avarice, physics from vain curiosity; all, even moral philosophy, from human pride.” (First Discourse, Vol. I, p. 12). The attack on sciences continues as Rousseau articulates how they fail to contribute anything positive to morality. They take time from the activities that are truly important, such as love of country, friends, and the unfortunate. Philosophical and scientific knowledge of subjects such as the relationship of the mind to the body, the orbit of the planets, and physical laws that govern particles fail to genuinely provide any guidance for making people more virtuous citizens. Rather, Rousseau argues that they create a false sense of need for luxury, so that science becomes simply a means for making our lives easier and more pleasurable, but not morally better.

The arts are the subject of similar attacks in the second part of the First Discourse. Artists, Rousseau says, wish first and foremost to be applauded. Their work comes from a sense of wanting to be praised as superior to others. Society begins to emphasize specialized talents rather than virtues such as courage, generosity, and temperance. This leads to yet another danger: the decline of military virtue, which is necessary for a society to defend itself against aggressors. And yet, after all of these attacks, the First Discourse ends with the praise of some very wise thinkers, among them, Bacon, Descartes, and Newton. These men were carried by their vast genius and were able to avoid corruption. However, Rousseau says, they are exceptions; and the great majority of people ought to focus their energies on improving their characters, rather than advancing the ideals of the Enlightenment in the arts and sciences.

b. Discourse on the Origin of Inequality

The Second Discourse, like the first, was a response to a question put forth by the Academy of Dijon: “What is the origin of inequality among men; and is it authorized by the natural law?” Rousseau’s response to this question, the Discourse on the Origin of Inequality, is significantly different from the First Discourse for several reasons. First, in terms of the Academy’s response, the Second Discourse was not nearly as well received. At four times the length of the first, it exceeded the desired length, and it made very bold philosophical claims; unlike the First Discourse, it did not win the prize. However, as Rousseau was now a well-known and respected author, he was able to have it published independently. Secondly, if the First Discourse is indicative of Rousseau as a “counter-Enlightenment” thinker, the Second Discourse, by contrast, can rightly be considered to be representative of Enlightenment thought. This is primarily because Rousseau, like Hobbes, attacks the classical notion of human beings as naturally social. Finally, in terms of its influence, the Second Discourse is now much more widely read, and is more representative of Rousseau’s general philosophical outlook. In the Confessions, Rousseau writes that he himself sees the Second Discourse as far superior to the first.

The Discourse on the Origin of Inequality is divided into four main parts: a dedication to the Republic of Geneva, a short preface, a first part, and a second part. The scope of Rousseau’s project is not significantly different from that of Hobbes in the Leviathan or Locke in the Second Treatise of Government. Like them, Rousseau understands society to be an invention, and he attempts to explain the nature of human beings by stripping them of all of the accidental qualities brought about by socialization. Thus, understanding human nature amounts to understanding what humans are like in a pure state of nature. This is in stark contrast to the classical view, most notably that of Aristotle, which claims that the state of civil society is the natural human state. As with Hobbes and Locke, however, it is doubtful that Rousseau meant his readers to understand the pure state of nature that he describes in the Second Discourse as a literal historical account. In its opening, he says that it must be denied that men were ever in the pure state of nature, citing revelation as a source which tells us that God directly endowed the first man with understanding (a capacity that he will later say is completely undeveloped in natural man). However, it seems in other parts of the Second Discourse that Rousseau is positing an actual historical account. Some of the stages in the progression from nature to civil society, Rousseau will argue, are empirically observable in so-called primitive tribes. And so the precise historicity with which one ought to regard Rousseau’s state of nature is the matter of some debate.

Part one is Rousseau’s description of human beings in the pure state of nature, uncorrupted by civilization and the socialization process. And although this way of examining human nature is consistent with other modern thinkers, Rousseau’s picture of “man in his natural state,” is radically different. Hobbes describes each human in the state of nature as being in a constant state of war against all others; hence life in the state of nature is solitary, poor, nasty, brutish, and short. But Rousseau argues that previous accounts such as Hobbes’ have all failed to actually depict humans in the true state of nature. Instead, they have taken civilized human beings and simply removed laws, government, and technology. For humans to be in a constant state of war with one another, they would need to have complex thought processes involving notions of property, calculations about the future, immediate recognition of all other humans as potential threats, and possibly even minimal language skills. These faculties, according to Rousseau, are not natural, but rather, they develop historically. In contrast to Hobbes, Rousseau describes natural man as isolated, timid, peaceful, mute, and without the foresight to worry about what the future will bring.

Purely natural human beings are fundamentally different from the egoistic Hobbesian view in another sense as well. Rousseau acknowledges that self-preservation is one principle of motivation for human actions, but unlike Hobbes, it is not the only principle. If it were, Rousseau claims that humans would be nothing more than monsters. Therefore, Rousseau concludes that self-preservation, or more generally self-interest, is only one of two principles of the human soul. The second principle is pity; it is “an innate repugnance to see his fellow suffer.” (Second Discourse, Vol. II, p. 36). It may seem that Rousseau’s depiction of natural human beings is one that makes them no different from other animals. However, Rousseau says that unlike all other creatures, humans are free agents. They have reason, although in the state of nature it is not yet developed. But it is this faculty that makes the long transition from the state of nature to the state of civilized society possible. He claims that if one examines any other species over the course of a thousand years, they will not have advanced significantly. Humans can develop when circumstances arise that trigger the use of reason.

Rousseau’s praise of humans in the state of nature is perhaps one of the most misunderstood ideas in his philosophy. Although the human being is naturally good and the “noble savage” is free from the vices that plague humans in civil society, Rousseau is not simply saying that humans in nature are good and humans in civil society are bad. Furthermore, he is not advocating a return to the state of nature, though some commentators, even his contemporaries such as Voltaire, have attributed such a view to him. Human beings in the state of nature are amoral creatures, neither virtuous nor vicious. After humans leave the state of nature, they can enjoy a higher form of goodness, moral goodness, which Rousseau articulates most explicitly in the Social Contract.

Having described the pure state of nature in the first part of the Second Discourse, Rousseau’s task in the second part is to explain the complex series of historical events that moved humans from this state to the state of present day civil society. Although he does not mark them off explicitly, Rousseau sees this development as occurring in a series of stages. From the pure state of nature, humans begin to organize into temporary groups for the purposes of specific tasks like hunting an animal. Very basic language in the form of grunts and gestures comes to be used in these groups. However, the groups last only as long as the task takes to be completed, and then they dissolve as quickly as they came together. The next stage involves more permanent social relationships including the traditional family, from which arise conjugal and paternal love. Basic conceptions of property and feelings of pride and competition develop in this stage as well. However, at this stage they are not developed to the point that they cause the pain and inequality that they do in present day society. If humans could have remained in this state, they would have been happy for the most part, primarily because the various tasks that they engaged in could all be done by each individual. The next stage in the historical development occurs when the arts of agriculture and metallurgy are discovered. Because these tasks required a division of labor, some people were better suited to certain types of physical labor, others to making tools, and still others to governing and organizing workers. Soon, distinct social classes and strict notions of property emerge, creating conflict and ultimately a state of war not unlike the one that Hobbes describes. Those who have the most to lose call on the others to come together under a social contract for the protection of all. But Rousseau claims that the contract is specious, and that it was no more than a way for those in power to keep their power by convincing those with less that it was in their interest to accept the situation. And so, Rousseau says, “All ran to meet their chains thinking they secured their freedom, for although they had enough reason to feel the advantages of political establishment, they did not have enough experience to foresee its dangers.” (Second Discourse, Vol. II, p. 54).

The Discourse on the Origin of Inequality remains one of Rousseau’s most famous works, and lays the foundation for much of his political thought as it is expressed in the Discourse on Political Economy and the Social Contract. Ultimately, the work is based on the idea that by nature, humans are essentially peaceful, content, and equal. It is the socialization process that has produced inequality, competition, and the egoistic mentality.

c. Discourse on Political Economy

The Discourse on Political Economy originally appeared in Diderot and d’Alembert’s Encyclopedia. In terms of its content the work seems to be, in many ways, a precursor to the Social Contract, which would appear in 1762. And whereas the Discourse on the Sciences and Arts and the Discourse on the Origin of Inequality look back on history and condemn what Rousseau sees as the lack of morality and justice in his own present day society, this work is much more constructive. That is, the Discourse on Political Economy explains what he takes to be a legitimate political regime.

The work is perhaps most significant because it is here that Rousseau introduces the concept of the “general will,” a major aspect of his political thought which is further developed in the Social Contract. There is debate among scholars about how exactly one ought to interpret this concept, but essentially, one can understand the general will in terms of an analogy. A political society is like a human body. A body is a unified entity though it has various parts that have particular functions. And just as the body has a will that looks after the well-being of the whole, a political state also has a will which looks to its general well-being. The major conflict in political philosophy occurs when the general will is at odds with one or more of the individual wills of its citizens.

With the conflict between the general and individual wills in mind, Rousseau articulates three maxims which supply the basis for a politically virtuous state: (1) Follow the general will in every action; (2) Ensure that every particular will is in accordance with the general will; and (3) Public needs must be satisfied. Citizens follow these maxims when there is a sense of equality among them, and when they develop a genuine respect for law. This again is in contrast to Hobbes, who says that laws are only followed when people fear punishment. That is, the state must make the penalty for breaking the law so severe that people do not see breaking the law to be of any advantage to them. Rousseau claims, instead, that when laws are in accordance with the general will, good citizens will respect and love both the state and their fellow citizens. Therefore, citizens will see the intrinsic value in the law, even in cases in which it may conflict with their individual wills.

4. The Social Contract

a. Background

The Social Contract is, like the Discourse on Political Economy, a work that is more philosophically constructive than either of the first two Discourses. Furthermore, the language used in the first and second Discourses is crafted in such a way as to make them appealing to the public, whereas the tone of the Social Contract is not nearly as eloquent and romantic. Another more obvious difference is that the Social Contract was not nearly as well-received; it was immediately banned by Paris authorities. And although the first two Discourses were, at the time of their publication, very popular, they are not philosophically systematic. The Social Contract, by contrast, is quite systematic and outlines how a government could exist in such a way that it protects the equality and character of its citizens. But although Rousseau’s project is different in scope in the Social Contract than it was in the first two Discourses, it would be a mistake to say that there is no philosophical connection between them. For the earlier works discuss the problems in civil society as well as the historical progression that has led to them. The Discourse on the Sciences and Arts claims that society has become such that no emphasis is put on the importance of virtue and morality. The Discourse on the Origin of Inequality traces the history of human beings from the pure state of nature through the institution of a specious social contract that results in present day civil society. The Social Contract does not deny any of these criticisms. In fact, chapter one begins with one of Rousseau’s most famous quotes, which echoes the claims of his earlier works: “Man was/is born free; and everywhere he is in chains.” (Social Contract, Vol. IV, p. 131). But unlike the first two Discourses, the Social Contract looks forward, and explores the potential for moving from the specious social contract to a legitimate one.

b. The General Will

The concept of the general will, first introduced in the Discourse on Political Economy, is further developed in the Social Contract although it remains ambiguous and difficult to interpret. The most pressing difficulty that arises is in the tension that seems to exist between liberalism and communitarianism. On one hand, Rousseau argues that following the general will allows for individual diversity and freedom. But at the same time, the general will also encourages the well-being of the whole, and therefore can conflict with the particular interests of individuals. This tension has led some to claim that Rousseau’s political thought is hopelessly inconsistent, although others have attempted to resolve the tension in order to find some type of middle ground between the two positions. Despite these difficulties, however, there are some aspects of the general will that Rousseau clearly articulates. First, the general will is directly tied to Sovereignty: but not Sovereignty merely in the sense of whoever holds power. Simply having power, for Rousseau, is not sufficient for that power to be morally legitimate. True Sovereignty is directed always at the public good, and the general will, therefore, speaks always infallibly to the benefit of the people. Second, the object of the general will is always abstract, or for lack of a better term, general. It can set up rules, social classes, or even a monarchical government, but it can never specify the particular individuals who are subject to the rules, members of the classes, or the rulers in the government. This is in keeping with the idea that the general will speaks to the good of the society as a whole. It is not to be confused with the collection of individual wills which would put their own needs, or the needs of particular factions, above those of the general public. This leads to a related point.
Rousseau argues that there is an important distinction to be made between the general will and the collection of individual wills: “There is often a great deal of difference between the will of all and the general will. The latter looks only to the common interest; the former considers private interest and is only a sum of private wills. But take away from these same wills the pluses and minuses that cancel each other out, and the remaining sum of the differences is the general will.” (Social Contract, Vol. IV, p. 146). This point can be understood in an almost Rawlsian sense, namely that if the citizens were ignorant of the groups to which they would belong, they would inevitably make decisions that would be to the advantage of the society as a whole, and thus be in accordance with the general will.

c. Equality, Freedom, and Sovereignty

One problem that arises in Rousseau’s political theory is that the Social Contract purports to describe a legitimate state, legitimate in one sense because it frees human beings from their chains. But if the state is to protect individual freedom, how can this be reconciled with the notion of the general will, which looks always to the welfare of the whole and not to the will of the individual? This criticism, although not unfounded, is also not devastating. To answer it, one must return to the concepts of Sovereignty and the general will. True Sovereignty, again, is not simply the will of those in power, but rather the general will. Sovereignty does have the proper authority to override the particular will of an individual or even the collective will of a particular group of individuals. However, as the general will is infallible, it can only do so when intervening will be to the benefit of the society. To understand this, one must take note of Rousseau’s emphasis on the equality and freedom of the citizens. Proper intervention on the part of the Sovereign is therefore best understood as that which secures the freedom and equality of citizens rather than that which limits them. Ultimately, the delicate balance between the supreme authority of the state and the rights of individual citizens is based on a social contract that protects society against factions and gross differences in wealth and privilege among its members.

5. The Emile

a. Background

The Emile or On Education is essentially a work that details Rousseau’s philosophy of education. It was originally published just several months after the Social Contract. Like the Social Contract, the Emile was immediately banned by Paris authorities, which prompted Rousseau to flee France. The major point of controversy in the Emile was not in his philosophy of education per se, however. Rather, it was the claims in one part of the book, the Profession of Faith of the Savoyard Vicar, in which Rousseau argues against traditional views of religion, that led to the banning of the book. The Emile is unique in one sense because it is written as part novel and part philosophical treatise. Rousseau would use this same form in some of his later works as well. The book is written in the first person, with the narrator as the tutor, and describes his education of a pupil, Emile, from birth to adulthood.

b. Education

The basic philosophy of education that Rousseau advocates in the Emile, much like his thought in the first two Discourses, is rooted in the notion that human beings are good by nature. The Emile is a large work, which is divided into five Books, and Book One opens with Rousseau’s claim that the goal of education should be to cultivate our natural tendencies. This is not to be confused with Rousseau’s praise of the pure state of nature in the Second Discourse. Rousseau is very clear that a return to the state of nature once human beings have become civilized is not possible. Therefore, we should not seek to be noble savages in the literal sense, with no language, no social ties, and an underdeveloped faculty of reason. Rather, Rousseau says, someone who has been properly educated will be engaged in society, but relate to his or her fellow citizens in a natural way.

At first glance, this may seem paradoxical: If human beings are not social by nature, how can one properly speak of more or less natural ways of socializing with others? The best answer to this question requires an explanation of what Rousseau calls the two forms of self-love: amour-propre and amour de soi. Amour de soi is a natural form of self-love in that it does not depend on others. Rousseau claims that by our nature, each of us has this natural feeling of love toward ourselves. We naturally look after our own preservation and interests. By contrast, amour-propre is an unnatural self-love that is essentially relational. That is, it comes about in the ways in which human beings view themselves in comparison to other human beings. Without amour-propre, human beings would scarcely be able to move beyond the pure state of nature Rousseau describes in the Discourse on Inequality. Thus, amour-propre can contribute positively to human freedom and even virtue. Nevertheless, amour-propre is also extremely dangerous because it is so easily corruptible. Rousseau often describes the dangers of what commentators sometimes refer to as ‘inflamed’ amour-propre. In its corrupted form, amour-propre is the source of vice and misery, and results in human beings basing their own self-worth on their feeling of superiority over others. While not developed in the pure state of nature, amour-propre is still a fundamental part of human nature. Therefore, the goal of Emile’s natural education is in large part to keep him from falling into the corrupted form of this type of self-love.

Rousseau’s philosophy of education, therefore, is not geared simply at particular techniques that best ensure that the pupil will absorb information and concepts. It is better understood as a way of ensuring that the pupil’s character be developed in such a way as to have a healthy sense of self-worth and morality. This will allow the pupil to be virtuous even in the unnatural and imperfect society in which he lives. The character of Emile begins learning important moral lessons from his infancy, through childhood, and into early adulthood. His education relies on the tutor’s constant supervision. The tutor must even manipulate the environment in order to teach sometimes difficult moral lessons about humility, chastity, and honesty.

c. Women, Marriage, and Family

As Emile’s is a moral education, Rousseau discusses in great detail how the young pupil is to be brought up to regard women and sexuality. He introduces the character of Sophie, and explains how her education differs from Emile’s. Hers is not as focused on theoretical matters, as men’s minds are more suited to that type of thinking. Rousseau’s view on the nature of the relationship between men and women is rooted in the notion that men are stronger and therefore more independent. They depend on women only because they desire them. By contrast, women both need and desire men. Sophie is educated in such a way that she will fill what Rousseau takes to be her natural role as a wife. She is to be submissive to Emile. And although Rousseau advocates these very specific gender roles, it would be a mistake to take the view that Rousseau regards men as simply superior to women. Women have particular talents that men do not; Rousseau says that women are cleverer than men, and that they excel more in matters of practical reason. These views are continually discussed among both feminist and Rousseau scholars.

d. The Profession of Faith of the Savoyard Vicar

The Profession of Faith of the Savoyard Vicar is part of the fourth Book of the Emile. In his discussion of how to properly educate a pupil about religious matters, the tutor recounts a tale of an Italian who thirty years before was exiled from his town. Disillusioned, the young man was aided by a priest who explained his own views of religion, nature, and science. Rousseau then writes in the first person from the perspective of this young man, and recounts the Vicar’s speech.

The priest begins by explaining how, after a scandal in which he broke his vow of celibacy, he was arrested, suspended, and then dismissed. In his woeful state, the priest began to question all of his previously held ideas. Doubting everything, the priest attempts a Cartesian search for truth by doubting all things that he does not know with absolute certainty. But unlike Descartes, the Vicar is unable to come to any kind of clear and distinct ideas that could not be doubted. Instead, he follows what he calls the “Inner Light” which provides him with truths so intimate that he cannot help but accept them, even though they may be subject to philosophical difficulties. Among these truths, the Vicar finds that he exists as a free being with a free will that is distinct from his body and is not subject to physical, mechanical laws of motion. To the problem of how his immaterial will moves his physical body, the Vicar simply says “I cannot tell, but I perceive that it does so in myself; I will to do something and I do it; I will to move my body and it moves, but if an inanimate body, when at rest, should begin to move itself, the thing is incomprehensible and without precedent. The will is known to me in its action, not in its nature.” (Emile, p. 282). The discussion is particularly significant in that it marks the most comprehensive metaphysical account in Rousseau’s thought.

The Profession of Faith also includes the controversial discussion of natural religion, which was in large part the reason why Emile was banned. The doctrine is controversial because it is categorically opposed to orthodox Christian views, specifically the claim that Christianity is the one true religion. The Vicar claims instead that knowledge of God is found in the observation of the natural order and one’s place in it. And so, any organized religion that correctly identifies God as the creator and preaches virtue and morality is true in this sense. Therefore, the Vicar concludes, each citizen should dutifully practice the religion of his or her own country so long as it is in line with the religion, and thus morality, of nature.

6. Other Works

a. Julie or the New Heloise

Julie or the New Heloise remains one of Rousseau’s popular works, though it is not a philosophical treatise, but rather a novel. The work tells the story of Julie d’Etange and St. Preux, who were one-time lovers. Later, at the invitation of her husband, St. Preux unexpectedly comes back into Julie’s life. Although not a work of philosophy per se, Julie or the New Heloise is still unmistakably Rousseau’s. The major tenets of his thought are clearly evident: the struggle of the individual against societal norms, emotions versus reason, and the goodness of human nature are all prevalent themes.

b. Reveries of the Solitary Walker

Rousseau began writing the Reveries of the Solitary Walker in the fall of 1776. By this time, he had grown increasingly distressed over the condemnation of several of his works, most notably the Emile and the Social Contract. This public rejection, combined with rifts in his personal relationships, left him feeling betrayed and even as though he was the victim of a great conspiracy. The work is divided into ten “walks” in which Rousseau reflects on his life, what he sees as his contribution to the public good, and how he and his work have been misunderstood. It is interesting that Rousseau returns to nature, which he had always praised throughout his career. One also recognizes in this praise the recognition of God as the just creator of nature, a theme so prevalent in the Profession of Faith of the Savoyard Vicar. The Reveries of the Solitary Walker, like many of Rousseau’s other works, is part story and part philosophical treatise. The reader sees in it, not only philosophy, but also the reflections of the philosopher himself.

c. Rousseau: Judge of Jean Jacques

The most distinctive feature of this late work, often referred to simply as the Dialogues, is that it is written in the form of three dialogues. The characters in the dialogues are “Rousseau” and an interlocutor identified simply as a “Frenchman.” The subject of these characters’ conversations is the author “Jean-Jacques,” who is the actual historical Rousseau. This somewhat confusing arrangement serves the purpose of Rousseau judging his own career. The character “Rousseau,” therefore, represents Rousseau had he not written his collected works but instead had discovered them as if they were written by someone else. What would he think of this author, represented in the Dialogues as the character “Jean-Jacques?” This self-examination makes two major claims. First, like the Reveries, it makes clearly evident the fact that Rousseau felt victimized and betrayed, and shows perhaps even more so than the Reveries, Rousseau’s growing paranoia. And second, the Dialogues represent one of the few places that Rousseau claims his work is systematic. He claims that there is a philosophical consistency that runs throughout his works. Whether one accepts that such a system is present in Rousseau’s philosophy or not is a question that was not only debated during Rousseau’s time, but is also continually discussed among contemporary scholars.

7. Historical and Philosophical Influence

It is difficult to overestimate Rousseau’s influence, both in the Western philosophical tradition, and historically. Perhaps his greatest directly philosophical influence is on the ethical thought of Immanuel Kant. This may seem puzzling at first glance. For Kant, the moral law is based on rationality, whereas in Rousseau, there is a constant theme of nature and even the emotional faculty of pity described in the Second Discourse. This theme in Rousseau’s thought is not to be ignored, and it would be a mistake to understand Rousseau’s ethics merely as a precursor to Kant; certainly Rousseau is unique and significant in his own respect. But despite these differences, the influence on Kant is undeniable. The Profession of Faith of the Savoyard Vicar is one text in particular that illustrates this influence. The Vicar claims that the correct view of the universe is to see oneself not at the center of things, but rather on the circumference, with all people realizing that we have a common center. This same notion is expressed in Rousseau’s political theory, particularly in the concept of the general will. In Kant’s ethics, one of the major themes is the claim that moral actions are those that can be universalized. Morality is something separate from individual happiness: a view that Rousseau undoubtedly expresses as well.

A second major influence is Rousseau’s political thought. Not only is he one of the most important figures in the history of political philosophy, later influencing Karl Marx among others, but his works were also championed by the leaders of the French Revolution. And finally, his philosophy was largely instrumental in the late-eighteenth-century Romantic Naturalism movement in Europe, thanks in large part to Julie or the New Heloise and the Reveries of the Solitary Walker.

Contemporary Rousseau scholarship continues to discuss many of the same issues that were debated in the eighteenth century. The tension in his political thought between individual liberty and totalitarianism continues to be an issue of controversy among scholars. Another aspect of Rousseau’s philosophy that has proven to be influential is his view of the family, particularly as it pertains to the roles of men and women.

8. References and Further Reading

a. Works by Rousseau

Below is a list of Rousseau’s major works in chronological order. The titles are given in the original French as well as the English translation. Following the title is the year of the work’s first publication and, for some works, a brief description:

  • Discours sur les Sciences et les Arts (Discourse on the Sciences and Arts), 1750.
    • Often referred to as the “First Discourse,” this work was a submission to the Academy of Dijon’s essay contest, which it won, on the question, “Has the restoration of the sciences and arts tended to purify morals?”
  • Le Devin du Village (The Village Soothsayer), 1753.
    • Rousseau’s opera: it was performed in France and widely successful.
  • Narcisse ou l’amant de lui-même (Narcissus or the lover of himself), 1753.
    • A play written by Rousseau.
  • Lettre sur la musique française (Letter on French music), 1753.
  • Discours sur l’origine et les fondements de l’inégalité (Discourse on the Origin and Foundations of Inequality), 1755.
    • Often referred to as the “Second Discourse,” this was another submission to an essay contest sponsored by the Academy of Dijon, though unlike the First Discourse, it did not win the prize. The Second Discourse is a response to the question, “What is the Origin of Inequality Among Men and is it Authorized by the Natural Law?”
  • Discours sur l’Économie politique (Discourse on Political Economy), 1755.
    • Sometimes called the “Third Discourse,” this work originally appeared in the Encyclopédie of Diderot and d’Alembert.
  • Lettre à d’Alembert sur les Spectacles (Letter to d’Alembert on the Theater), 1758.
  • Julie ou la Nouvelle Héloïse (Julie or the New Heloise), 1761.
    • A novel that was widely read and successful immediately after its publication.
  • Du Contrat Social (The Social Contract), 1762.
    • Rousseau’s most comprehensive work on politics.
  • Émile ou de l’Éducation (Émile or On Education), 1762.
    • Rousseau’s major work on education. It also contains the Profession of Faith of the Savoyard Vicar, which documents Rousseau’s views on metaphysics, free will, and his controversial views on natural religion for which the work was banned by Parisian authorities.
  • Lettre à Christophe de Beaumont, Archevêque de Paris (Letter to Christopher de Beaumont, Archbishop of Paris), 1763.
  • Lettres écrites de la Montagne (Letters Written from the Mountain), 1764.
  • Dictionnaire de Musique (Dictionary of Music), 1767.
  • Émile et Sophie ou les Solitaires (Émile and Sophie or the Solitaries), 1780.
    • A short sequel to the Émile.
  • Considérations sur le gouvernement de la Pologne (Considerations on the Government of Poland), 1782.
  • Les Confessions (The Confessions), Part I 1782, Part II 1789.
    • Rousseau’s autobiography.
  • Rousseau juge de Jean-Jacques, Dialogues (Rousseau judge of Jean-Jacques, Dialogues), First Dialogue 1780, Complete 1782.
  • Les Rêveries du Promeneur Solitaire (Reveries of the Solitary Walker), 1782.

b. Works about Rousseau

The standard original language edition is Oeuvres complètes de Jean-Jacques Rousseau, eds. Bernard Gagnebin and Marcel Raymond, Paris: Gallimard, 1959-1995. The most comprehensive English translation of Rousseau’s works is the Collected Writings of Rousseau, series eds. Roger Masters and Christopher Kelly, Hanover: University Press of New England, 1990-1997. References are given by the title of the work, the volume number (in Roman Numerals), and the page number. The Collected Writings do not include the Emile. References to this work are from Emile, trans. Barbara Foxley, London: Everyman, 2000. The following is a brief list of widely available secondary texts.

  • Cooper, Laurence D. Rousseau and Nature: The Problem of the Good Life. Penn State UP, 1999.
  • Cranston, Maurice. Jean-Jacques: The Early Life and Work of Jean-Jacques Rousseau, 1712-1754. University of Chicago Press, 1991.
  • Cranston, Maurice. The Noble Savage: Jean-Jacques Rousseau, 1754-1762. University of Chicago Press, 1991.
  • Cranston, Maurice. The Solitary Self: Jean-Jacques Rousseau in Exile and Adversity. University of Chicago Press, 1997.
  • Dent, N.J.H. Rousseau. Blackwell, 1988.
  • Gourevitch, Victor. Rousseau: The ‘Discourses’ and Other Early Political Writings. Cambridge UP, 1997.
  • Gourevitch, Victor. Rousseau: The ‘Social Contract’ and Other Later Political Writings. Cambridge UP, 1997.
  • Melzer, Arthur. The Natural Goodness of Man: On the Systems of Rousseau’s Thought. University of Chicago Press, 1990.
  • Neuhouser, Frederick. Rousseau’s Theodicy of Self-Love: Evil, Rationality, and the Drive for Recognition. Oxford University Press, 2008.
  • O’Hagan, Timothy. Rousseau. Routledge, 1999.
  • Riley, Patrick, ed. The Cambridge Companion to Rousseau. Cambridge UP, 2001.
  • Reisert, Joseph. Jean-Jacques Rousseau: A Friend of Virtue. Cornell UP, 2003.
  • Rosenblatt, Helena. Rousseau and Geneva. Cambridge: Cambridge UP, 1997.
  • Starobinski, Jean. Jean-Jacques Rousseau: Transparency and Obstruction. Chicago: University of Chicago Press, 1988.
  • Wokler, Robert. Rousseau. Oxford: Oxford UP, 1995.
  • Wokler, Robert, ed. Rousseau and Liberty. Manchester: Manchester UP, 1995.

Author Information

James J. Delaney
Email: jdelaney@niagara.edu
Niagara University
U. S. A.

The Classical Theory of Concepts

The classical theory of concepts is one of the five primary theories of concepts, the other four being prototype or exemplar theories, atomistic theories, theory-theories, and neoclassical theories. The classical theory implies that every complex concept has a classical analysis, where a classical analysis of a concept is a proposition giving metaphysically necessary and jointly sufficient conditions for being in the extension across possible worlds for that concept. That is, a classical analysis for a complex concept C gives a set of individually necessary conditions for being a C (or conditions that must be satisfied in order to be a C) that together are sufficient for being a C (or are such that something’s satisfying every member of that set of necessary conditions entails its being a C). The classical view also goes by the name of “the definitional view of concepts,” or “definitionism,” where a definition of a concept is given in terms of necessary and jointly sufficient conditions.
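
The form of a classical analysis described above can be put schematically. What follows is only a sketch under stated assumptions: the predicate letters F_1, ..., F_n (standing for the conditions offered by the analysis) and the box operator for metaphysical necessity are notation introduced here for illustration, and are not taken from the classical theorists themselves.

```latex
% Schematic form of a classical analysis of a complex concept C.
% Assumed notation: F_1, ..., F_n are the analysing conditions;
% \Box expresses metaphysical necessity (truth across possible worlds).
\[
  \Box\,\forall x\,\bigl(Cx \leftrightarrow (F_1x \land F_2x \land \dots \land F_nx)\bigr)
\]
% Left-to-right direction: each F_i is individually necessary for being a C.
% Right-to-left direction: the F_i together are jointly sufficient for being a C.
```

Read this way, a proposed analysis fails if there is a possible case of a C that lacks some F_i, or a possible case that has every F_i without being a C.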

This article provides information on the classical theory of concepts as present in the historical tradition, on concepts construed most generally, on the nature of classical conceptual analysis, and on the most significant of the objections raised against the classical view.

Table of Contents

  1. Historical Background and Advantages of the Classical View
  2. Concepts in General
    1. Concepts as Semantic Values
    2. Concepts as Universals
    3. Concepts as Mind-Dependent or Mind-Independent
    4. Concepts as the Targets of Analysis
    5. The Classical View and Concepts in General
  3. Classical Analyses
    1. Necessary and Sufficient Conditions
    2. Logical Constitution
    3. Other Conditions on Classical Analyses
    4. Testing Candidate Analyses
    5. Apriority and Analyticity with respect to Classical Analyses
  4. Objections to the Classical View
    1. Plato’s Problem
    2. The Argument from Categorization
    3. Arguments from Vagueness
    4. Quine’s Criticisms
    5. Scientific Essentialist Criticisms
  5. References and Further Reading

1. Historical Background and Advantages of the Classical View

The classical view can be traced back to at least the time of Socrates, for in many of Plato’s dialogues Socrates is clearly seeking a classical analysis of some notion or other. In the Euthyphro, for instance, Socrates seeks to know the nature of piety. Yet what he seeks is not given in terms of, for example, a list of pious people or actions, nor is piety to be identified with what the gods love. Instead, Socrates seeks an account of piety in terms of some specification of what is shared by all things pious, or what makes pious things pious; that is, he seeks a specification of the essence of piety itself. The Socratic elenchus is a method of finding out the nature or essence of various kinds of things, such as friendship (discussed in the Lysis), courage (the Laches), knowledge (the Theaetetus), and justice (the Republic). That method of considering candidate definitions and seeking counterexamples to them is the same method one uses to test candidate analyses by seeking possible counterexamples to them, and thus Socrates is in effect committed to something very much like the classical view of concepts.

One sees the same sort of commitment throughout much of the Western tradition in philosophy from the ancient Greeks through the present. Clear examples include Aristotle’s notion of a definition as “an account [or logos] that signifies the essence” (Topics I) by way of a specification of essential attributes, as well as his account of definitions for natural kinds in terms of genus and difference. Particular examples of classical-style analyses abound after Aristotle: For instance, Descartes (in Meditation VI) defines body as that which is extended in both space and time, and mind as that which thinks. Locke (in the Essay Concerning Human Understanding, Ch. 21) defines being free with respect to doing an action A as choosing/willing to do A where one’s choice is part of the cause of one’s actually doing A. Hume defines a miracle (in Enquiry Concerning Human Understanding, §X) as an event that is both a violation of the laws of nature and caused by God. And so on. The classical view looks to be a presumption of the early analytic philosophers as well (with Wittgenstein being a notable exception). The classical view is present in the writings of Frege and Russell, and the view receives its most explicit treatment by that time in G.E. Moore’s Lectures on Philosophy and other writings. Moore gives a classical analysis of the very notion of a classical analysis, and from then on the classical view (or some qualified version of it) has been one of the pillars of analytic philosophy itself.

One reason the classical view has had such staying power is that it provides the most obvious grounding for the sort of inquiry within philosophy that Socrates began. If one presumes that there are answers to What is F?-type questions, where such questions ask for the nature of knowledge, mind, goodness, etc., then that entails that there is such a thing as the nature of knowledge, mind, goodness, etc. The nature of knowledge, for example, is that which is shared by all cases of knowledge, and a classical analysis of the concept of knowledge specifies the nature of knowledge itself. So the classical view fits neatly with the reasonable presumption that there are legitimate answers to philosophical questions concerning the natures or essences of things. As at least some other views of concepts reject the notion that concepts have metaphysically necessary conditions, accepting such other views is tantamount to rejecting (or at least significantly revising) the legitimacy of an important part of the philosophical enterprise.

The classical view also serves as the ground for one of the most basic tools of philosophy: the critical evaluation of arguments. For instance, one ground of contention in the abortion debate concerns whether fetuses have the status of moral persons or not. If they do, then since moral persons have the right not to be killed, generally speaking, it would seem to follow that abortion is immoral. The classical view grounds the natural way to address the main contention here, for part of the task at hand is to find a proper analysis of the concept of being a moral person. If that analysis specifies features such that not all of them are had by fetuses, then fetuses are not moral persons, and the argument against the moral permissibility of abortion fails. But without there being analyses of the sort postulated by the classical view, it is far from clear how such critical analysis of philosophical arguments is to proceed. So again, the classical view seems to underpin an activity crucial to the practice of philosophy itself.

In contemporary philosophy, J. J. Katz (1999), Frank Jackson (1994, 1998), and Christopher Peacocke (1992) are representative of those who hold at least some qualified version of the classical view. There are others as well, though many philosophers have rejected the view (at least in part due to the criticisms to be discussed in section 4 below). The view is almost universally rejected in contemporary psychology and cognitive science, due to both theoretical difficulties with the classical view and the arrival of new theories of concepts over the last quarter of the twentieth century.

2. Concepts in General

The issue of the nature of concepts is important in philosophy generally, but most perspicuously in philosophy of language and philosophy of mind. Most generally, concepts are thought to be among those things that count as semantic values or meanings (along with propositions). There is also reason to think that concepts are universals (along with properties, relations, etc.), and what general theory of universals applies to concepts is thus a significant issue with respect to the nature of concepts. Whether concepts are mind-dependent or mind-independent is another such issue. Finally, concepts tend to be construed as the targets of analysis. If one then treats analysis as classical analysis, and holds that all complex concepts have classical analyses, then one accepts the classical view. Other views of concepts might accept the thesis that concepts are targets of analysis, but differ from the classical view over the sort of analysis that all complex concepts have.

a. Concepts as Semantic Values

As semantic values, concepts are the intensions or meanings of sub-sentential verbal expressions such as predicates, adjectives, verbs, and adverbs. Just as the sentence “The sun is a star” expresses the proposition that the sun is a star, the predicate “is a star” expresses the concept of being a star (or [star], to introduce notation to be used in what follows). Further, just as the English sentence “Snow is white” expresses the proposition that snow is white, and so does the German sentence “Schnee ist weiss,” the predicates “is white” in English and “ist weiss” in German both express the same concept, the concept of being white (or [white]). The intension or meaning of a sentence is a proposition. The intensions or meanings of many sub-sentential entities are concepts.

b. Concepts as Universals

Concepts are also generally thought to be universals. The reasons for this are threefold:

(1) A given concept is expressible using distinct verbal expressions. This can occur in several different ways. My uttering “Snow is white” and your uttering “Snow is white” are distinct utterances, and their predicates are distinct expressions of the same concept [white]. My uttering “Snow is white” and your uttering “Schnee ist Weiss” are distinct sentences with their respective predicates expressing the same concept ([white], again). Even within the same language, my uttering “Grisham is the author of The Firm” and your uttering “Grisham is The Firm’s author” are distinct sentences with distinct predicates, yet their respective predicates express the same concept (the concept [the author of The Firm], in this case).

(2) Second, different agents can possess, grasp, or understand the same concept, though such possession might come in degrees. Most English speakers possess the concept [white], and while many possess [neutrino], not many possess that concept to such a degree that one knows a great deal about what neutrinos themselves are.

(3) Finally, concepts typically have multiple exemplifications or instantiations. Many distinct things are white, and thus there are many exemplifications or instances of the concept [white]. There are many stars and many neutrinos, and thus there are many instances of [star] and [neutrino]. Moreover, distinct concepts can have the very same instances. The concepts [renate] and [cardiate] have all the same actual instances, as far as we know, and so do [human] and [rational animal]. Distinct concepts can also have necessarily all of the same instances: For instance, the concepts [triangular figure] and [trilateral figure] must have the same instances, yet the predicates “is a triangular figure” and “is a trilateral figure” seem to have different meanings.

As universals, concepts may be treated under any of the traditional accounts of universals in general. Realism about concepts (considered as universals) is the view that concepts are distinct from their instances, and nominalism is the view that concepts are nothing over and above, or distinct from, their instances. Ante rem realism (or platonism) about concepts is the view that concepts are ontologically prior to their instances—that is, concepts exist whether they have instances or not. In re realism about concepts is the view that concepts are in some sense “in” their instances, and thus are not ontologically prior to their instances. Conceptualism with respect to concepts holds that concepts are mental entities, being either immanent in the mind itself as a sort of idea, as constituents of complete thoughts, or somehow dependent on the mind for their existence (perhaps by being possessed by an agent or by being possessible by an agent). Conceptualist views also include imagism, the view (dating from Locke and others) that concepts are a sort of mental image. Finally, nominalist views of concepts might identify concepts with classes or sets of particular things (with the concept [star] being identified with the set of all stars, or perhaps the set of all possible stars). Linguistic nominalism identifies concepts with the linguistic expressions used to express them (with [star] being identified with the predicate “is a star,” perhaps). Type linguistic nominalism identifies concepts with types of verbal expressions (with [star] identified with the type of verbal expression exemplified by the predicate “is a star”).

c. Concepts as Mind-Dependent or Mind-Independent

On many views, concepts are things that are “in” the mind, or “part of” the mind, or at least are dependent for their existence on the mind in some sense. Other views deny such claims, holding instead that concepts are mind-independent entities. Conceptualist views are examples of the former, and platonic views are examples of the latter. The issue of whether concepts are mind-dependent or mind-independent carries great weight with respect to the clash between the classical view and other views of concepts (such as prototype views and theory-theories). If concepts are immanent in the mind as mental particulars, for instance, then various objections to the classical view have more force; if concepts exist independently of one’s ideas, beliefs, capacities for categorizing objects, etc., then some objections to the classical view have much less force.

d. Concepts as the Targets of Analysis

Conceptual analysis is of concepts, and philosophical questions of the form What is F? (such as “What is knowledge?,” “What is justice?,” “What is a person?,” etc.) are questions calling for conceptual analyses of various concepts (such as [knowledge], [justice], [person], etc.). Answering the further question “What is a conceptual analysis?” is yet another way to distinguish among different views of concepts. For instance, the classical view holds that all complex concepts have classical analyses, where a complex concept is a concept having an analysis in terms of other concepts. Alternatively, prototype views analyze concepts in terms of typical features or in terms of a prototypical or exemplary case. For instance, such a view might analyze the concept of being a bird in terms of such typical features as being capable of flight, being small, etc., which most birds share, even if not all of them do. A second sort of prototype theory (sometimes called “the exemplar view”) might analyze the concept of being a bird in terms of a most exemplary case (a robin, say, for the concept of being a bird). So-called theory-theories analyze a concept in terms of some internally represented theory about the members of the extension of that concept. For example, one might have an overall theory of birds, and the concept one expresses with one’s use of ‘bird’ is then analyzed in terms of the role that concept plays in that internally represented theory. Neoclassical views of concepts preserve one element of the classical view, namely the claim that all complex concepts have metaphysically necessary conditions (in the sense that, for example, being unmarried is necessary for being a bachelor), but reject the claim that all complex concepts have metaphysically sufficient conditions. Finally, atomistic views reject all notions of analysis just mentioned, denying that concepts have analyses at all.

e. The Classical View and Concepts in General

The classical view claims simply that all complex concepts have classical analyses. As such, the classical view makes no claims as to the status of concepts as universals, or as being mind-dependent or mind-independent entities. The classical view also is consistent with concepts being analyzable by means of other forms of analysis. Yet some views of universals are more friendly to the classical view than others, and the issue of the mind-dependence or mind-independence of concepts is of some importance to whether the classical view is correct or not. For instance, if concepts are identical to ideas present in the mind (as would be true on some conceptualist views), then if the contents of those ideas fail to have necessary and sufficient defining conditions, then the classical view looks to be false (or at least not true for all concepts). Alternatively, on platonic views of concepts, such a lack of available necessary and jointly sufficient conditions for the contents of our own ideas is of no consequence to the classical view, since ideas are not concepts according to platonic accounts.

3. Classical Analyses

There are two components to an analysis of a complex concept (where a complex concept is a concept that has an analysis in terms of other “simpler” concepts): The analysandum, or the concept being analyzed, and the analysans, or the concept that “does the analyzing.” For a proposition to be a classical analysis, the following conditions must hold:

(I) A classical analysis must specify a set of necessary and jointly sufficient conditions for being in the analysandum’s extension (where a concept’s extension is everything to which that concept could apply). (Other classical theorists deny that all classical analyses specify jointly sufficient conditions, holding instead that classical analyses merely specify necessary and sufficient conditions.)

(II) A classical analysis must specify a logical constitution of the analysandum.

Other suggested conditions on classical analysis are given below.

a. Necessary and Sufficient Conditions

Consider an arbitrary concept [F]. A necessary condition for being an F is a condition such that something must satisfy that condition in order for it to be an F. For instance, being male is necessary for being a bachelor, and being four-sided is necessary for being a square. Such characteristics specified in necessary conditions are shared by, or had in common with, all things to which the concept in question applies.

A sufficient condition for being an F is a condition such that if something satisfies that condition, then it must be an F. Being a bachelor is sufficient for being male, for instance, and being a square is sufficient for being four-sided.

A necessary and sufficient condition for being an F is a condition such that not only must a thing satisfy that condition in order to be an F, but it is also true that if a thing satisfies that condition, then it must be an F. For instance, being a four-sided, regular plane figure is both necessary and sufficient for being a square. That is, a thing must be a four-sided, regular plane figure in order for it to be a square, and if a thing is a four-sided, regular plane figure, then it must be a square. [The word “regular” means that all sides are the same length and all angles are equal.]

Finally, for a concept [F], necessary and jointly sufficient conditions for being an F are a set of necessary conditions such that satisfying all of them is sufficient for being an F. The conditions of being four-sided and of being a regular figure are each necessary conditions for being a square, for instance, and the conjunction of them is a sufficient condition for being a square.
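The four notions just defined can be set side by side in a compact logical form. What follows is only a sketch: the predicate letters $F$, $C$, and $C_1,\dots,C_n$, and the use of the necessity operator $\Box$ to render the "must" in the definitions above, are notational assumptions introduced here and are not the article's own notation.

```latex
\begin{align*}
\text{$C$ is necessary for being an $F$:}
  &\quad \Box\,\forall x\,(Fx \rightarrow Cx)\\
\text{$C$ is sufficient for being an $F$:}
  &\quad \Box\,\forall x\,(Cx \rightarrow Fx)\\
\text{$C$ is necessary and sufficient for being an $F$:}
  &\quad \Box\,\forall x\,(Fx \leftrightarrow Cx)\\
\text{$C_1,\dots,C_n$ are necessary and jointly sufficient for being an $F$:}
  &\quad \Box\,\forall x\,\bigl(Fx \leftrightarrow (C_1x \wedge \dots \wedge C_nx)\bigr)
\end{align*}
```

On this rendering, the last line says both that each $C_i$ is individually necessary and that their conjunction is sufficient, which matches the example of [four-sided] and [regular figure] for [square].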

b. Logical Constitution

A classical analysis also gives a logical constitution of the concept being analyzed, in keeping with Moore’s idea that an analysis breaks a concept up into its components or constituents. What an analysis specifies are the logical constituents of a concept, where a logical constituent of a concept is a concept entailed by that concept. (A concept entails another concept when being in the extension of the former entails being in the extension of the latter.) For instance, [four-sided] is a logical constituent of [square], since something’s being a square entails that it is four-sided.

For a logical constitution specified by a classical analysis, a logical constitution of a concept [F] is a collection of concepts, where each member of that collection is entailed by [F], and where [F] entails all of them taken collectively.

Most complex concepts will have more than one logical constitution, given that there are different ways of analyzing the same concept. For instance, “A square is a four-sided regular figure” expresses an analysis of [square], but so does “A square is a four-sided, closed plane figure having sides all the same length and having neighboring sides orthogonal to one another.” The first analysis gives one logical constitution for [square], and the second analysis seems to give another.

c. Other Conditions on Classical Analyses

In addition to conditions (I) and (II), other conditions on classical analyses have been proposed. Among them are the following:

(III) A classical analysis must not include the analysandum as either its analysans or as part of its analysans. That is, a classical analysis cannot be circular. “A square is a square” does not express an analysis, and neither does “A true sentence is a sentence that specifies a true correspondence between the proposition it expresses and the world.”

(IV) A classical analysis must not have its analysandum be more complex than its analysans. That is, while “A square is a four-sided regular figure” expresses an analysis, “A four-sided regular figure is a square” does not. While the latter sentence is true, it does not express an analysis of [four-sided regular figure]. The concept [four-sided regular figure] analyzes [square], not the other way around.

(V) A classical analysis specifies a precise extension of the concept being analyzed, in the sense of specifying for any possible particular whether it is definitely in or definitely not in that concept’s extension.

(VI) A classical analysis does not include any vague concepts in either its analysandum or its analysans.

The last two conditions concern vagueness. It might be thought that an analysis has to specify in some very precise way what is, and what is not, in that concept’s extension (condition (V)), and also that an expression of an analysis itself cannot include any vague terms (condition (VI)).

d. Testing Candidate Analyses

In seeking a correct analysis for a concept, one typically considers some number of so-called candidate analyses. A correct analysis will have no possible counterexamples, where such counterexamples might show a candidate analysis to be either too broad or too narrow. For instance, let

“A square is a four-sided, closed plane figure”

express a candidate analysis for the concept of being a square. This candidate analysis is too broad, since it would count as squares some things that are not squares. Counterexamples include any trapezoid, and any rectangle that is not itself a square.

On the other hand, the candidate analysis expressed by

“A square is a red four-sided regular figure”

is too narrow, since it rules out some genuine squares: it is at least possible for there to be squares other than red ones. Assuming for the sake of illustration that squares are the sorts of things that can be colored at all, a blue square counts as a counterexample to this candidate analysis, since it fails the stated condition that a square be red.
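The two ways a candidate analysis can fail can be illustrated with a small sketch. Everything in it is a hypothetical toy model rather than anything from the article: the `Figure` record, its fields, the function names, and the list of cases are all assumptions, and `is_square` simply stands in for one's judgments about which cases really are squares. A finite list of cases can expose counterexamples, but it cannot show that a candidate has no possible counterexamples.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Figure:
    """Hypothetical toy description of a figure (an assumption, not the article's)."""
    name: str
    sides: int
    closed_plane: bool
    equal_sides: bool
    equal_angles: bool
    color: str


Condition = Callable[[Figure], bool]


def is_square(f: Figure) -> bool:
    # Stand-in for the concept [square]; "regular" is taken here to mean
    # equal sides and equal angles.
    return f.sides == 4 and f.closed_plane and f.equal_sides and f.equal_angles


def candidate_too_broad(f: Figure) -> bool:
    # "A square is a four-sided, closed plane figure"
    return f.sides == 4 and f.closed_plane


def candidate_too_narrow(f: Figure) -> bool:
    # "A square is a red four-sided regular figure"
    return f.color == "red" and is_square(f)


def counterexamples(
    candidate: Condition, concept: Condition, cases: List[Figure]
) -> Tuple[List[str], List[str]]:
    """Return (cases wrongly included, cases wrongly excluded) by the candidate."""
    wrongly_included = [c.name for c in cases if candidate(c) and not concept(c)]
    wrongly_excluded = [c.name for c in cases if concept(c) and not candidate(c)]
    return wrongly_included, wrongly_excluded


# A handful of illustrative cases; a real test would range over all possible cases.
CASES = [
    Figure("red square", 4, True, True, True, "red"),
    Figure("blue square", 4, True, True, True, "blue"),
    Figure("non-square rectangle", 4, True, False, True, "red"),
    Figure("trapezoid", 4, True, False, False, "red"),
]

if __name__ == "__main__":
    for label, cand in [("first", candidate_too_broad), ("second", candidate_too_narrow)]:
        included, excluded = counterexamples(cand, is_square, CASES)
        print(label, "candidate: wrongly included", included, "wrongly excluded", excluded)
```

A candidate with a non-empty first list is too broad, and one with a non-empty second list is too narrow; a correct analysis would leave both lists empty for every possible case.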

One might wonder why a correct analysis must have no possible counterexamples, rather than meeting only the less stringent condition of having no actual counterexamples. The reason is that analyses are put forth as necessary truths. An analysis of a concept like the concept of being a mind, for instance, is a specification of what is shared by all possible minds, not just what is in common among those minds that actually happen to exist. Similarly, in seeking an analysis of the concept of justice or piety (as Socrates sought), what one seeks is not a specification of what is in common among all just actions or all pious actions that are actual. Instead, what one seeks is the nature of justice or piety, and that is what is in common among all possible just actions or pious actions.

e. Apriority and Analyticity with respect to Classical Analyses

Classical analyses are commonly thought to be both a priori and analytic. They look to be a priori since there is no empirical component essential to their justification, and in that sense classical analyses are knowable by reason alone. In fact, the method of seeking possible counterexamples to a candidate analysis is a paradigmatic case of justifying a proposition a priori. Classical analyses also appear to be analytic, since on the rough construal of analytic propositions as those propositions “true by meaning alone,” classical analyses are indeed that sort of proposition. For instance, “A square is a four-sided regular figure” expresses an analysis, and if “square” and “four-sided regular figure” are identical in meaning, then the analysis is true by meaning alone. On an account of analyticity where analytic propositions are those propositions where what is expressed by the predicate expression is “contained in” what is expressed by the subject expression, classical analyses turn out to be analytic. If what is expressed by “four-sided regular figure” is contained in what is expressed by “square,” then “A square is a four-sided regular figure” is such that the meaning of its predicate expression is contained in what its subject expresses. Finally, on an account of analyticity treating analytic propositions as those where substitution of codesignating terms yields a logical truth, classical analyses turn out to be analytic propositions once more. Since “square” and “four-sided regular figure” have the same possible-worlds extension, substituting “square” for “four-sided regular figure” in “A square is a four-sided regular figure” yields “A square is a square,” which is a logical truth. (For a contrary view holding that analyses are synthetic propositions, rather than analytic, see Ackerman 1981, 1986, and 1992.)

4. Objections to the Classical View

Despite its history and natural appeal, in many circles the classical view has long since been rejected for one reason or another. Even in philosophy, many harbor at least some skepticism of the thesis that all complex concepts have classical analyses with the character described above. A much more common view is that some complex concepts follow the classical model, but not all of them. This section considers five fairly common objections to the classical view.

a. Plato’s Problem

Plato’s problem is that after over two and a half millennia of seeking analyses of various philosophically important concepts, few if any classical analyses of such concepts have ever been discovered and widely agreed upon as fact. If there are classical analyses for all complex concepts, the critics claim, then one would expect a much higher rate of success in finding such analyses given the effort expended so far. In fact, aside from ordinary concepts such as [bachelor] and [sister], along with some concepts in logic and mathematics, there seems to be no consensus on analyses for any philosophically significant concepts. Socrates’ question “What is justice?,” for instance, has received a monumental amount of attention since Socrates’ time, and while there has been a great deal of progress made with respect to what is involved in the nature of justice, there still is not a consensus view as to an analysis of the concept of justice. The case is similar with respect to questions such as “What is the mind?,” “What is knowledge?,” “What is truth?,” “What is freedom?,” and so on.

One might think that such an objection holds the classical view to too high a standard. After all, even in the sciences there is rarely universal agreement with respect to a particular scientific theory, and progress is ongoing in furthering our understanding of entities such as electrons and neutrinos, as well as events like the Big Bang—there is always more to be discovered. Yet it would be preposterous to think that the scientific method is flawed in some way simply because such investigations are ongoing, and because there is not universal agreement with respect to various theories in the sciences. So why think that the method of philosophical analysis, with its presumption that all complex concepts have classical analyses, is flawed in some way because of the lack of widespread agreement with respect to completed or full analyses of philosophically significant concepts?

Yet while there are disagreements in the sciences, especially in cases where a given scientific theory is freshly proposed, such disagreements are not nearly as common as they are in philosophy. For instance, while there are practicing scientists who claim to be suspicious of quantum mechanics, of the general theory of relativity, or of evolution, such detractors are extremely rare compared to what is nearly a unanimous opinion that those theories are correct or nearly correct. In philosophy, however, there are widespread disagreements concerning even the most basic questions. For instance, take the questions “Are we free?” and “Does being free require somehow being able to do otherwise?” The first question asks for an analysis of what is meant by “free,” and the second asks whether being able to do otherwise is a necessary condition on being free. Much attention has been paid to such basic questions, and the critics of the classical view claim that one would expect some sort of consensus as to the answers to them if the concept of freedom really has a classical analysis. So there is not mere disagreement with respect to the answers to such questions; the disagreements are both widespread and concern quite fundamental issues. As a result, the difficulty in finding classical analyses has led many to reject the classical view.

b. The Argument from Categorization

There are empirical objections to the classical view as well. The argument from categorization takes as evidence various data with respect to our sorting or categorizing things into various categories, and infers that such behavior shows that the classical view is false. The evidence shows that we tend not to use any set of necessary and sufficient conditions to sort things into one category or another, where such sorting behavior is construed as involving the application of various concepts. It is not as if one uses a classical analysis to sort things into the bird category, for instance. Instead, it seems that things are categorized according to typical features of members of the category in question, and the evidence for this is that more typical members of a given category are sorted into that category more quickly than less typical members of that same category. Robins are sorted into the bird category more quickly than eagles, for instance, and eagles are sorted into the bird category more quickly than ostriches. What this suggests is that if concepts are used for acts of categorization, and classical analyses are not used in all such categorization tasks, then the classical view is false.

One presumption of the argument is that when one sorts something into one category or another, one uses one’s understanding of a conceptual analysis to accomplish the task. Yet classical theorists might complain that this need not be the case. One might use a set of typical features to sort things into the bird category, even if there is some analysis not in terms of typical features that gives the essential features shared by all birds. In other words (as Rey (1983) points out), there is a difference between what it is to look like a bird and what it is to be a bird. An analysis of a concept gives the conditions under which something is an instance of that concept, and it would seem that a concept can have an analysis (classical or otherwise) even if agents use some other set of conditions in acts of categorization.

Whether this reply to the argument from categorization rebuts the argument remains to be seen, but many researchers in cognitive psychology have taken the empirical evidence from acts of categorization to be strong evidence against the classical view. For such evidence also serves as evidence in favor of a view of concepts in competition with the classical view: the so-called prototype view of concepts. According to the prototype view, concepts are analyzed not in terms of necessary and jointly sufficient conditions, but in terms of lists of typical features. Such typical features are not shared by all instances of a given concept, but are shared by at least most of them. For instance, a typical bird flies, is relatively small, and is not carnivorous. Yet none of these features is shared by all birds. Penguins don’t fly, albatrosses are quite large, and birds of prey are carnivores. Such a view of concepts fits much more neatly with the evidence concerning our acts of categorization, so such critics reject the classical view.
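The contrast between sorting by typical features and sorting by a definition can be made concrete with a small sketch. It is only an illustration under stated assumptions: the scoring rule (the fraction of typical features an instance has), the threshold of one half, and the abstract instance names are all hypothetical, and the prototype view itself is not committed to any of them. Only the three typical features are taken from the text above.

```python
from typing import Dict, FrozenSet

# The three typical features of birds named in the text; none is treated as necessary.
TYPICAL_BIRD_FEATURES: FrozenSet[str] = frozenset(
    {"flies", "relatively small", "not carnivorous"}
)


def typicality(features: FrozenSet[str]) -> float:
    """Fraction of the typical features that an instance has (assumed scoring rule)."""
    return len(features & TYPICAL_BIRD_FEATURES) / len(TYPICAL_BIRD_FEATURES)


def sorted_as_bird(features: FrozenSet[str], threshold: float = 0.5) -> bool:
    """Prototype-style sorting: having most typical features is enough.

    The threshold is a hypothetical parameter chosen for illustration.
    """
    return typicality(features) > threshold


# Abstract, stipulated instances: each atypical bird lacks exactly one typical feature.
INSTANCES: Dict[str, FrozenSet[str]] = {
    "typical bird": frozenset({"flies", "relatively small", "not carnivorous"}),
    "flightless bird": frozenset({"relatively small", "not carnivorous"}),
    "large bird": frozenset({"flies", "not carnivorous"}),
    "bird of prey": frozenset({"flies", "relatively small"}),
    "non-bird example": frozenset({"relatively small"}),
}

if __name__ == "__main__":
    for name, features in INSTANCES.items():
        print(f"{name}: typicality {typicality(features):.2f}, "
              f"sorted as bird: {sorted_as_bird(features)}")
```

In this toy model each of the three typical features can be missing from something that is still sorted as a bird, so none of them functions as a necessary condition, which is the point of contrast with a classical analysis.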

c. Arguments from Vagueness

Vagueness has also been seen as problematic for the classical view. For one might think that in virtue of specifying necessary and jointly sufficient conditions, a classical analysis thus specifies a precise extension for the concept being analyzed (where a concept C has a precise extension if and only if for all x, x is either definitely in the extension of C or definitely not in the extension of C). Yet most complex concepts seem not to have such precise extensions. Terms like “bald,” “short,” and “old” all seem to have cases where it is unclear whether the term applies or not. That is, it seems that the concepts expressed by those terms are such that their extensions are unclear. For instance, it seems that there is no precise boundary between the bald and the non-bald, the short and the non-short, and the old and the non-old. But if there are no such precise boundaries to the extensions for many concepts, and a classical analysis specifies such precise boundaries, then there cannot be classical analyses for what is expressed by vague terms.

Two responses deserve note. One reply on behalf of the classical view is that vagueness is not part of the world itself, but instead is a matter of our own epistemic shortcomings. We find unclear cases simply because we don’t know where the precise boundaries for various concepts lie. There could very well be a precise boundary between the bald and the non-bald, for instance, but we find “bald” to be vague simply because we do not know where that boundary lies. Such an epistemic view of vagueness would seem to be of assistance to the classical view, though such a view of vagueness needs a defense, particularly given the presence of other plausible views of vagueness. The second response is that one might admit the presence of unclear cases, and admit the presence of vagueness or “fuzziness” as a feature of the world itself, but hold that such fuzziness is mirrored in the analyses of the concepts expressed by vague terms. For instance, the concept of being a black cat might be analyzed in terms of [black] and [cat], even if “black” and “cat” are both vague terms. So classical theorists might reply that if the vagueness of a term can be mirrored in an analysis in such a way, then the classical view can escape the criticisms.

d. Quine’s Criticisms

A family of criticisms of the classical view is based on W.V.O. Quine’s (1953/1999, 1960) extensive attack on analyticity and the analytic/synthetic distinction. According to Quine, there is no philosophically clear account of the distinction between analytic and synthetic propositions, and as such there is either no such distinction at all or it does no useful philosophical work. Yet classical analyses would seem to be paradigmatic cases of analytic propositions (for example, [bachelors are unmarried males], [a square is a four-sided regular figure]), and if there are no analytic propositions then it seems there are no classical analyses. Furthermore, if there is no philosophically defensible distinction between analytic and synthetic propositions, then there is no legitimate criterion by which to delineate analyses from non-analyses. Those who hold that analyses are actually synthetic propositions face the same difficulty. If analyses are synthetic, then one still needs a principled difference between analytic and synthetic propositions in order to distinguish between analyses and non-analyses.

The literature on Quine’s arguments is vast, and suffice it to say that criticism of Quine’s arguments and of his general position is widespread as well. Yet even among those philosophers who reject Quine’s arguments, most admit that there remains a great deal of murkiness concerning the analytic/synthetic distinction, despite its philosophical usefulness. With respect to the classical view of concepts, the options available to classical theorists are at least threefold: Either meet Quine’s arguments in a satisfactory way, reject the notion that all analyses are analytic (or that all are synthetic), or characterize classical analysis in a way that is neutral with respect to the analytic/synthetic distinction.

e. Scientific Essentialist Criticisms

Scientific essentialism is the view that the members of natural kinds (like gold, tiger, and water) have essential properties at the microphysical level of description, and that identity statements between natural kind terms and descriptions of such properties are metaphysically necessary and knowable only a posteriori. Some versions of scientific essentialism include the thesis that such identity statements are synthetic. That such statements are a posteriori and synthetic looks to be problematic for the classical view. For sake of illustration, let “Water is H2O” express an analysis of what is meant by the natural kind term “water.” According to scientific essentialism, such a proposition is metaphysically necessary in that it is true in all possible worlds, but it is a necessary truth discovered via empirical science. As such, it is not discovered by the a priori process of seeking possible counterexamples, revising candidate analyses in light of such counterexamples, and so on. But if water’s being H2O is known a posteriori, this runs counter to the usual position that all classical analyses are a priori. Furthermore, given that what is expressed by “Water is H2O” is a posteriori, this entails that it is synthetic, rather than analytic as the classical view would normally claim.

Again, the literature is vast with respect to scientific essentialism, identity statements involving natural kind terms, and the epistemic and modal status of such statements. For classical theorists, short of denying the basic theses of scientific essentialism, some options that save some portion of the classical view include holding that the classical view holds for some concepts (such as those in logic and mathematics) but not others (such as those expressed by natural kind terms), or characterizing classical analysis in a way that is neutral with respect to the analytic/synthetic distinction. How successful such strategies would be remains to be seen, and such a revised classical view would have to be weighed against other theories of concepts that handle all complex concepts with a unified treatment.

5. References and Further Reading

  • Ackerman, D. F. 1981. “The Informativeness of Philosophical Analysis.” In P. French, et al. (Eds.), Midwest Studies in Philosophy, vol. 6. Minneapolis, Minnesota: University of Minnesota Press, 313-320.
  • Ackerman, D. F. 1986. “Essential Properties and Philosophical Analysis.” In P. French, et al. (Eds.), Midwest Studies in Philosophy, vol. 11. Minneapolis, Minnesota: University of Minnesota Press, 304-313.
  • Ackerman, D. F. 1992. “Analysis and Its Paradoxes.” In E. Ullman-Margalit (Ed.), The Scientific Enterprise: The Israel Colloquium Studies in History, Philosophy, and Sociology of Science, vol. 4. Norwell, Massachusetts: Kluwer.
  • Bealer, George. 1982. Quality and Concept. Oxford: Clarendon Press.
  • Bealer, George. 1996. “A Priori Knowledge and the Scope of Philosophy.” Philosophical Studies 81, 121-142.
  • Bonjour, Laurence. 1998. In Defense of Pure Reason: A Rationalist Account of A Priori Justification. Cambridge: Cambridge University Press.
  • Chalmers, David J. and Jackson, Frank. 2001. “Conceptual Analysis and Reductive Explanation” [On-line]. Available: http://www.u.arizona.edu/~chalmers/papers/analysis.html
  • Donnellan, Keith. 1983. “Kripke and Putnam on Natural Kind Terms.” In C. Ginet and S. Shoemaker (Eds.), Knowledge and Mind. Oxford: Oxford University Press, 84-104.
  • Fodor, Jerry A. 1998. Concepts: Where Cognitive Science Went Wrong. Oxford: Clarendon Press.
  • Fodor, Jerry A., Garrett, M. F., Walker, E. C. T., and Parkes, C. H. 1980/1999. “Against Definitions.” In Margolis and Laurence 1999, 491-512.
  • Grice, H. P. and Strawson, P. F. 1956. “In Defense of a Dogma.” The Philosophical Review 65 (2), 141-158.
  • Hanna, Robert. 1998. “A Kantian Critique of Scientific Essentialism.” Philosophy and Phenomenological Research 58 (3), 497-528.
  • Harman, Gilbert. 1999. “Doubts About Conceptual Analysis.” In Gilbert Harman, Reasoning, Meaning, and Mind. Oxford: Oxford University Press, 138-143.
  • Jackson, Frank. 1994. “Armchair Metaphysics.” In M. Michael and J. O’Leary-Hawthorne (Eds.), Philosophy in Mind. Dordrecht: Kluwer.
  • Jackson, Frank. 1998. From Metaphysics to Ethics: A Defence of Conceptual Analysis. Oxford: Clarendon Press.
  • Katz, J. J. 1999.
  • Keefe, Rosanna and Smith, Peter (Eds.). 1999. Vagueness: A Reader. Cambridge, Massachusetts: M.I.T. Press.
  • King, Jeffrey C. 1998. “What is a Philosophical Analysis?” Philosophical Studies 90, 155-179.
  • Kripke, Saul A. 1980. Naming and Necessity. Cambridge, Massachusetts: Harvard University Press.
  • Kripke, Saul A. 1993. “Identity and Necessity.” In A. W. Moore (Ed.), Meaning and Reference. Oxford: Oxford University Press, 162-191.
  • Langford, C. H. 1968. “The Notion of Analysis in Moore’s Philosophy.” In Schilpp 1968, 321-342.
  • Laurence, Stephen and Margolis, Eric. 1999. “Concepts and Cognitive Science.” In Margolis and Laurence 1999, 3-81.
  • Margolis, Eric and Laurence, Stephen (Eds.). 1999. Concepts: Core Readings. M.I.T. Press.
  • Moore, G. E. 1966. Lectures on Philosophy. Ed. C. Lewy. London: Humanities Press.
  • Moore, G. E. 1968. “A Reply to My Critics.” In Schilpp 1968, 660-677.
  • Murphy, Gregory L. 2002. The Big Book of Concepts. Cambridge: M.I.T. Press.
  • Peacocke, Christopher. 1992. A Study of Concepts. Cambridge: M.I.T. Press.
  • Pitt, David. 1999. “In Defense of Definitions.” Philosophical Psychology 12 (2), 139-156.
  • Plato. 1961a. The Collected Dialogues of Plato. Ed. Edith Hamilton and Huntington Cairns. Princeton, New Jersey: Princeton University Press.
  • Plato. 1961b. Euthyphro. Trans. L. Cooper. In Plato 1961a, 169-185.
  • Plato. 1961c. Laches. Trans. L. Cooper. In Plato 1961a, 123-144.
  • Plato. 1961d. Lysis. Trans. L. Cooper. In Plato 1961a, 145-168.
  • Plato. 1961e. Theaetetus. Trans. L. Cooper. In Plato 1961a, 845-919.
  • Plato. 1992. Republic. Trans. G. M. A. Grube. Indianapolis, Indiana: Hackett.
  • Prinz, Jesse J. 2002. Furnishing the Mind: Concepts and Their Perceptual Basis. Cambridge: M.I.T. Press.
  • Putnam, Hilary. 1962. “It Ain’t Necessarily So.” Journal of Philosophy 59 (22), 658-671.
  • Putnam, Hilary. 1966. “The Analytic and the Synthetic.” In H. Feigl and G. Maxwell (Eds.), Minnesota Studies in the Philosophy of Science, vol. III. Minneapolis, Minnesota: University of Minnesota Press, 358-397.
  • Putnam, Hilary. 1970. “Is Semantics Possible?” In H. Kiefer and M. Munitz (Eds.), Language, Belief, and Metaphysics. New York: State University of New York Press, 50-63.
  • Putnam, Hilary. 1975. “The Meaning of ‘Meaning’.” In Keith Gunderson (Ed.), Minnesota Studies in the Philosophy of Science, vol. VII. Minneapolis, Minnesota: University of Minnesota Press, 131-193.
  • Putnam, Hilary. 1983. “‘Two Dogmas’ Revisited.” In Hilary Putnam, Realism and Reason: Philosophical Papers, Volume 3. Cambridge: Cambridge University Press, 87-97.
  • Putnam, Hilary. 1990. “Is Water Necessarily H2O?” In James Conant (Ed.), Realism with a Human Face. Cambridge: Harvard University Press, 54-79.
  • Quine, W. V. O. 1953/1999. “Two Dogmas of Empiricism.” In Margolis and Laurence 1999, 153-170.
  • Quine, W. V. O. 1960. Word and Object. Cambridge: The M.I.T. Press.
  • Ramsey, William. 1992. “Prototypes and Conceptual Analysis.” Topoi 11, 59-70.
  • Rey, Georges. 1983. “Concepts and Stereotypes.” Cognition 15, 237-262.
  • Rey, Georges. 1985. “Concepts and Conceptions: A Reply to Smith, Medin and Rips.” Cognition 19, 297-303.
  • Rosch, Eleanor. 1999. “Principles of Categorization.” In Margolis and Laurence 1999, 189-206.
  • Schilpp, P. (Ed.). 1968. The Philosophy of G. E. Moore. LaSalle, Illinois: Open Court.
  • Smith, Edward E. 1989. “Three Distinctions About Concepts and Categorization.” Mind and Language 4 (1, 2), 57-61.
  • Smith, Edward E., and Medin, Douglas L. 1981. Categories and Concepts. Cambridge: Harvard University Press.
  • Smith, Edward E. 1999. “The Exemplar View.” In Margolis and Laurence 1999, 207-221.
  • Smith, Edward E., Medin, Douglas L., and Rips, Lance J. 1984. “A Psychological Approach to Concepts: Comments on Rey’s ‘Concepts and Stereotypes.’” Cognition 17, 265-274.
  • Sosa, Ernest. 1983. “Classical Analysis.” Journal of Philosophy 80 (11), 695-710.
  • Stalnaker, Robert. 2001. “Metaphysics Without Conceptual Analysis.” Philosophy and Phenomenological Research 62 (3), 631-636.
  • Williamson, Timothy. 1994. Vagueness. New York: Routledge.
  • Williamson, Timothy. 1999. “Vagueness and Ignorance.” In Keefe and Smith 1999, 265-280.

Author Information

Dennis Earl
Email: dearl@coastal.edu
Coastal Carolina University
U. S. A.

Fallibilism

Fallibilism is the epistemological thesis that no belief (theory, view, thesis, and so on) can ever be rationally supported or justified in a conclusive way. Always, there remains a possible doubt as to the truth of the belief. Fallibilism applies that assessment even to science’s best-entrenched claims and to people’s best-loved commonsense views. Some epistemologists have taken fallibilism to imply skepticism, according to which none of those claims or views are ever well justified or knowledge. In fact, though, it is fallibilist epistemologists (which is to say, the majority of epistemologists) who tend not to be skeptics about the existence of knowledge or justified belief. Generally, those epistemologists see themselves as thinking about knowledge and justification in a comparatively realistic way — by recognizing the fallibilist realities of human cognitive capacities, even while accommodating those fallibilities within a theory that allows perpetually fallible people to have knowledge and justified beliefs. Still, although that is the aim of most epistemologists, the question arises of whether it is a coherent aim. Are they pursuing a coherent way of thinking about knowledge and justification? Much current philosophical debate is centered upon that question. Epistemologists generally seek to understand knowledge and justification in a way that permits fallibilism to be describing a benign truth about how we can gain knowledge and justified beliefs. One way of encapsulating that project is by asking whether it is possible for a person ever to have fallible knowledge and justification.

Table of Contents

  1. Introduction
  2. Formulating Fallibilism: Preliminaries
  3. Formulating Fallibilism: A Thesis about Justification
  4. Formulating Fallibilism: Necessary Truths
  5. Empirical Evidence of Fallibility
  6. Philosophical Sources of Fallibilism: Hume
  7. Philosophical Sources of Fallibilism: Descartes
  8. Implications of Fallibilism: No Knowledge?
  9. Implications of Fallibilism: Knowing Fallibly?
  10. Implications of Fallibilism: No Justification?
  11. References and Further Reading

1. Introduction

The term “fallibilism” comes from the nineteenth-century American philosopher Charles Sanders Peirce, although the basic idea behind the term long predates him. According to that basic idea, no beliefs (or opinions or views or theses, and so on) are so well justified or supported by good evidence or apt circumstances that they could not be false. Fallibilism tells us that there is no conclusive justification and no rational certainty for any of our beliefs or theses. That is fallibilism in its strongest form, being applied to all beliefs without exception. In principle, it is also possible to be a restricted fallibilist, accepting a fallibilism only about some narrower class of beliefs. For example, we might be fallibilists about whatever beliefs we gain through the use of our senses, even while remaining convinced that we possess the ability to reason in ways that can, at least sometimes, manifest infallibility. Thus, one special case of this possible selectivity would have us being fallibilists about empirical science even while exempting mathematical reasoning from that verdict. For simplicity, though (and because it represents the thinking of most epistemologists), in what follows I will generally discuss fallibilism in its unrestricted form. (The exception will be section 6, where a particularly significant, but seemingly narrower, form of fallibilism will be presented.)

Fallibilism is an epistemologically pivotal thesis, and our initial priority must be to formulate it carefully. Almost all contemporary epistemologists will say that they are fallibilists. Yet the vast majority of them also wish not to be skeptics. They would rather not embrace principles about the nature of knowledge and justification which commit them to denying that there can be any knowledge or justified belief. This desire coexists, nonetheless, with the belief that fallibility is rampant. Many epistemological debates, it transpires, can be understood in terms of how they try to balance these two epistemologically central commitments. So, can we find a precise philosophical understanding of ourselves as being perpetually fallible even though reassuringly rational and, for the most part, knowledgeable?

2. Formulating Fallibilism: Preliminaries

An initial statement of fallibilism might be this:

All beliefs are fallible. (No belief is infallible.)

But what, exactly, is that saying? Here are three claims it is not making.

(1) Fallible people. It is not saying just that all believers — all people — are fallible. A person as such is fallible if, at least sometimes, he is capable of forming false beliefs. But that is compatible with the person’s often — on some other occasions — believing infallibly. And that is not a state of affairs which is compatible with fallibilism.

(2) Actually false beliefs. Nor is fallibilism the thesis that in fact all beliefs are false. That possibility is allowed — but it is not required — by fallibilism. Hence, it is false to portray fallibilism — as commentators on science, in particular, sometimes do — in these terms:

All scientific beliefs are false. This includes all scientific theories, of course. (After all, even scientific theories are only theories. So they are fallible — and therefore false.)

Regardless of whether or not that is a correct claim about scientific beliefs and theories, it is not an accurate portrayal of what fallibilism means to say. The key term in fallibilism, as we have so far formulated it, is “fallible.” And this conveys — through its use of “-ible” — only some kind of possibility of falsity, rather than the definite presence of actual falsity.

(3) Contingent truths. Take the belief that there are currently at least one thousand kangaroos alive in Australia. That belief is true, although it need not have been. It could have been false, in that the world need not have been such as to make it true. So, the belief is only contingently true (as philosophers say). By definition, any contingent truth could have failed to be true. But even if we were to accept that all truths are only contingently true, we would not be committed to fallibilism. The recognition that contingent truths exist is not what underlies fallibilism. The claim that any contingent truth could instead have been false is not the fallibilist claim, because fallibilism is not a thesis about truths in themselves. Instead, it is about our attempts to accept or believe truths. It concerns a kind of fundamental limitation first and foremost upon our powers of rational thought and representation. And although a truth’s being contingent means that it did not have to be true, this does not mean that it will, or even that it can, alter its truth-value (by becoming false) in such a way as to deceive you. For instance, the truth that there are now at least one thousand kangaroos alive in Australia is not made false even by there being only five kangaroos alive in Australia two days from now.

3. Formulating Fallibilism: A Thesis about Justification

Given section 2’s details, a better (and routine) expression of fallibilism is this:

F: All beliefs are only, at best, fallibly justified.

F’s main virtue, as a formulation of fallibilism, is its locating the culprit fallibility as arising within the putative justification that is present on behalf of a given belief. The kind of justification in question is called “epistemic justification” by epistemologists. And the suggested formulation, F, of fallibilism is saying that there is never conclusive justification for the truth of a given belief.

There are competing epistemological theories of what, exactly, epistemic justification is. Roughly speaking, though, it is whatever would make a belief more, rather than less, rationally well supported or established. This sort of rationality is meant to be truth-directed. For example (as Conee and Feldman 2004 would argue), whenever some evidence is providing epistemic support — justification — for a belief, this is a matter of its supporting the truth of that belief. In that sense, the evidence provides good reason to adopt the belief — to adopt it as true. Or (to take another example, such as would be approved of by the kind of theory from Goldman 1979) a believer might have formed her belief within some circumstance or in some way that — regardless of whether she can notice this — makes her belief likely to be true. (And when are these kinds of justificatory support present? In particular, are they only ever present if they are guaranteeing that the belief being supported is true? Are any actually false beliefs ever justified? Section 10 will focus on the question of whether fallible justification is ever present, either for true or for false beliefs.)

Just as there are competing interpretations of the nature of epistemic justification, epistemologists exercise care in how they read F. Perhaps the most natural reading of it says that no one is ever so situated — even when possessing evidence in favor of the truth of a particular belief — that, if she were to be rational in the sense of respecting and understanding and responding just to that evidence, she could not proceed to doubt that the belief is true. More generally, the idea behind F is that, no matter how good one’s justification is in support of a particular belief’s being true, that justification is never so good as to be conclusive — leaving no room for anyone who might be rationally attending to that justification not to have the belief it is supporting. At any stage, according to F, doubt could sensibly (in some relevant sense of “sensibly”) arise as to the truth of the particular belief.

Often, therefore, this kind of possible doubt is called a rational doubt. This is not to say that, necessarily, the most rational reaction is to be swayed by the doubt, accepting it as decisive; whether one should react like that is a separate issue, probably deserving to be decided only after some subtle argument. The term “rational doubt” is meant only to distinguish this sort of actual or possible doubt from a patently irrational one — a doubt that is psychologically, but not even prima facie rationally, available. How might a doubt that is not even prima facie rational arise? Here is one possible way. Imagine a person who is attending to evidence for the truth of a particular belief, yet who refuses to accept the belief’s being true. Suppose that this refusal is due either (i) to her misunderstanding the evidence or (ii) to some psychological quirk such as a general lack of respect for evidence at all or such as mere obstinacy (without her supplying counter-reasons disputing the truth or power of the evidence). There is no accounting for why some people will in fact doubt a given belief: psychologically, doubt could be an option even in the face of rationally conclusive evidence. Nevertheless, fallibilism is not a thesis about that psychological option. The option it describes concerns rationality. Fallibilism is about what it claims to be the ever-present availability of rational doubt.

Accordingly, one possible way of misinterpreting F would involve confusing the concept of a rational doubt with that of a subjectively felt doubt or, maybe more generally, a psychologically present doubt. Rational doubts need not be psychologically actual doubts, just as psychologically actual ones need not be rational. In theory, a person might have or feel some doubt as to whether a particular claim is true, a doubt which she should not have or feel. (Perhaps she is misevaluating the strength of the evidence she has in support of that claim.) Equally, someone might have or feel no doubt as to the truth of a belief he has, when he should have or feel some such doubt. (Perhaps he, too, is misevaluating the strength of the evidence he has in support of his belief.) In either case, the way in which the person is in fact reacting (by having, or by not having, an actual doubt) does not determine whether his or her evidence is in fact providing rationally conclusive support. That is because a particular reaction, of doubting or of not doubting, might not be as justified or rational in itself as is possible. (By analogy, we may keep in mind the case, unfortunately all too common, of a brutal tyrant who claims, sincerely, to have a clear conscience at the end of his life. The morality of his actions is more obviously to be explicated in terms of what his conscience should be telling him rather than of what it is telling him.)

In effect, F is saying that no matter what evidence you have, no matter how carefully you have accumulated it, and no matter how rationally you use and evaluate it, you can never thereby have conclusive justification for a belief which you wish to support via all that evidence. Equally, F is saying that no matter what circumstance you occupy, and no matter how you are forming a particular belief, no guarantee is thereby being provided of your belief being true. In those respects (according to F), any justification you have is fallible, and it will remain so, no matter what you do with it, no matter how assiduously you attend to it, no matter what the circumstances are in which you are operating. The problem will also remain, no matter how you might supplement or try to improve your evidence or circumstances. Any possible addition or alteration that you might make will continue leaving open at least a possibility (one to which a careful and rational thinker would in principle respond respectfully if she were to notice it) of your belief’s being false.

In that way, fallibilism, as a thesis about justification, travels more deeply into the human cognitive condition than it would do if it were a point merely about logic, say. It is not saying that no belief is ever supported by evidence whose content logically entails that belief’s content. An example of that situation would be provided by a person’s having, as evidence, the belief that he is a living, breathing Superman, from which he infers that he is alive. The evidence’s content (“I am a living, breathing Superman”) does logically entail the truth of the inferred content (“I am alive”). (This attribution of logical validity or entailment means, in standard deductive logic, that it is impossible for the first content to be true without the second one also being true.) But the justification being supplied is fallible, because (obviously) the person will have, at best, inconclusive justification for thinking that he is a living, breathing Superman in the first place. The putative justification is the belief (about being Superman) and its history, not only its content and the associated logical relations. Yet fallibilism says that, even when all such further features are taken into account, some potential will remain for rational doubt to be present.
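
The notion of entailment at work in the Superman example can be made concrete with a small truth-table check. The following Python sketch is offered only as an illustration: the function name `entails` and the encoding of the example as three atomic claims (alive, breathing, Superman) are assumptions introduced here, not part of any epistemological theory. It suggests why the inference counts as valid, and it also shows what validity leaves untouched: the check never asks whether the premise is true or well supported.

```python
from itertools import product

def entails(premises, conclusion, num_atoms):
    """True if no truth assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=num_atoms):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # counterexample: premises true, conclusion false
    return True

# Hypothetical encoding of the example as three atomic claims.
def living_breathing_superman(alive, breathing, superman):
    return alive and breathing and superman

def is_alive(alive, breathing, superman):
    return alive

# "I am a living, breathing Superman" entails "I am alive".
superman_valid = entails([living_breathing_superman], is_alive, 3)

# The converse inference is not valid: one can be alive without being Superman.
converse_valid = entails([is_alive], living_breathing_superman, 3)

print(superman_valid, converse_valid)
```

On this encoding the first inference passes the check and its converse fails it. Nothing in the check bears on whether the person has good reason to accept the premise, which is where F locates the fallibility.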

4. Formulating Fallibilism: Necessary Truths

Nevertheless, a modification of F (in section 3) is required, it seems, if fallibilism is to apply to beliefs like mathematical ones or to beliefs reporting theses of pure logic, for instance. Most philosophers would accept that it is possible to be fallible in holding such a belief — and that this is so, even given that there is a sense in which such a belief, when true, could not ever be false. Thus, perhaps mathematical believing is a fallible process, able to lead to false beliefs. Perhaps this is so, even if mathematical truths themselves never “just happen” to be true — never depending upon changeable surrounding circumstances for their truth, hence never being susceptible to being rendered false by some change in those surrounding circumstances. How should we modify F, therefore, so as to understand the way in which fallibility can nonetheless be present in such a case? More generally, how should we modify F, so as to understand the prospect of a person ever having fallible beliefs (let alone only fallible ones) in what philosophers call necessary truths?

By definition, any truth which is not contingent is necessary. The class of necessary truths is the class of propositions or contents which, necessarily, are true. They could not have failed to be true. And that class will generally be thought to contain — maybe most significantly — mathematical truths. Consider, then, the belief that 2 + 2 = 4. In itself (almost every philosopher will concur), there is no possibility of that belief’s being false. However, if it is impossible for that belief to be false, then there is also no possible evidence on the basis of which — in coming to believe that 2 + 2 = 4 — a person could be forming a false belief. In this way, no belief that 2 + 2 = 4 could be merely fallibly justified — at least as this phenomenon has been portrayed in F. Yet it is clear — or so most epistemologists will aver — that mathematical believing can be fallible. Indeed, if fallibilism is true, all mathematical beliefs will be subject to some sort of fallibility: even mathematical beliefs would, at best, be only fallibly justified. How, therefore, is this to be understood?

Here is one suggestion — F* — which modifies F by drawing upon some standard epistemological thinking. The aim in moving from F to F* would be to allow for the possibility of having a fallible belief in a necessary truth:

F*: All beliefs are, at best, only fallibly justified. (And a belief is fallibly justified when — even if the belief, considered in itself, could not be false — the justification for it exemplifies or reflects some more general way or process of thinking or forming beliefs, a way or process which is itself fallible due to its capacity to result in false beliefs.)

Sections 5 and 7 will describe a few possible reasons for a fallibilist to regard your belief that 2 + 2 = 4 as being fallible. In the meantime, we need only note schematically how F* would accommodate those possible reasons. The basic approach would be as follows. Although your belief that 2 + 2 = 4 cannot be false (once it is present), your supposed justification for it is fallible. This could be so in a few ways. For a start, maybe you are merely repeating by rote something you were told many years ago by a somewhat unreliable school teacher. (Imagine the teacher having been poor at making accurate claims within most other areas of mathematics. Even with respect to the elements of mathematics about which she was accurate, she might have been merely repeating by rote what she had been told by her own early — and similarly unreliable — teachers.) The fallibility of memory is also relevant: over the years, one forgets much. Still, your current belief that 2 + 2 = 4 seems accurate. And it need not be present only because of your fallible memory of what your fallible teacher told you. Suppose that you are now very sophisticated in your mathematical thinking: in particular, your justification for your belief that 2 + 2 = 4 is purely mathematical in content. That justification involves clever representation, via precisely defined symbols, of abstract ideas. Nevertheless, even such purely mathematical reasoning can mislead you (no matter that it has not done so on this occasion). Really proving that 2 + 2 = 4 is quite difficult; and when people are seeking to grasp and to implement such proofs, human fallibility may readily intrude. Actual attempts to establish mathematical truths need not always lead to accurate or true beliefs.

At any rate, that is how a fallibilist might well analyze the case.

5. Empirical Evidence of Fallibility

How can we ascertain which of our ways of thinking are fallible? Both ordinary observation and sophisticated empirical research are usually regarded as able to help us here, by revealing some of the means by which fallibility enters our cognitive lives. I will list several of the seemingly fallible means of belief-formation and belief-maintenance that have been noticed.

(1) Misusing evidence. Apparently, people often misevaluate the strength of their evidence. By taking it to be stronger or weaker support than in fact it is for the truth of a particular belief, a person could easily be led to adopt or retain a false, rather than true, belief. Indeed, there are many possible ways not to use evidence properly. For example, people do not always notice, let alone compare and resolve, conflicting pieces of evidence. They might overlook some of the evidence available to them. There can be inattention to details of their evidence. And so forth.

(2) Unreliable senses. How many of us have wholly reliable — always accurate — senses? Shortsightedness is not so rare. The same is true of long-sightedness. People can have poor hearing, not to mention less-than-perfectly discerning senses of smell, taste, and so on. Sensory illusions and hallucinations affect us, too. The road seems to ripple under the heat of the sun; the stick appears to bend as it enters the glass of water; and so forth. In such cases we will think, upon reflection, that what we seem to sense is something we only seem to sense.

(3) Unreliable memory. At times, people suffer lapses of memory; and they can realize this, experiencing “blanks” as they endeavor to recall something. They can also feel as though they are remembering something, when actually this feeling is inaccurate. (A “false memory” is like that. The event which a person seems to recall, for instance, never actually happened.)

(4) Reasoning fallaciously. To reason in a logically invalid way is to reason in a way which, even given the truth of one’s premises or evidence, can lead to falsity. It is thereby to reason fallibly. Do we often reason like that? Seemingly, yes. Of course, often we and others realize that we are doing so. And we and those others might generally be satisfied with our admittedly fallible reasoning. (But should we ever regard it with satisfaction? Section 10 will consider this kind of question.) There are times, though, when we and others do not notice the fallibility in our reasoning. On those occasions, we are — without realizing this about ourselves — reasoning fallaciously. That is, we are reasoning in ways which are logically invalid but which most people mistakenly, albeit routinely, regard as being logically valid.

(5) Intelligence limitations. Is each of us so intelligent as never to make mistakes which a more intelligent person would be less likely (all else being equal) to make? Presumably none of us escape that limitation. Do we notice people making mistakes due to their exercising (and perhaps possessing) less intelligence than was needed not to make those mistakes? We appear to do so. Sometimes (often too late), we observe this in ourselves, too.

(6) Representational limitations. We use language and thought to represent or describe reality — hopefully, to do this accurately. But people have often, we believe, made mistakes about the world around them because of inadequacies in their representational or descriptive resources. For example, they can have been applying misleading and clumsily constructed concepts — ones which could well be replaced within an improved science. (And this sort of problem — at least to judge by the apparent inescapability of disputes among its practitioners — might be even more acute within such areas of thought as philosophy.)

(7) Situational limitations. It is not uncommon for people to make mistakes of fact because they have biases or prejudices that impede their ability to perceive or represent or reflect accurately upon those facts. Such mistakes may be made when people are manifesting an insufficiently developed awareness of pertinent aspects of the world. Maybe a person’s early upbringing, and how she has subsequently lived her life, has not exposed her to a particularly wide range of ideas. Perhaps she has not encountered what are, as it happens, more accurate ideas or principles than the ones she is applying in her attempts to understand the world. All of this might well prevent her even noticing some relevant aspects of the world. (When both I and a doctor gaze at an X-ray, only one of us notices much of medical relevance.)

That list of realistically possible sources of fallibility — philosophers will suspect — could be continued indefinitely. And its scope is disturbingly expansive. Thus, even when you do not feel as though a belief of yours has been formed or maintained in some way that manifests any of those failings, you could be mistaken about that. This is a factual matter; or so most philosophers will say. On any given occasion, it is an empirical question as to whether in fact you are being fallible in one of those ways. (Notably, it is not simply a matter of whether you are feeling fallible.) Accordingly, many epistemologists have paid attention to pertinent empirical research by psychiatrists, neurologists, biologists, anthropologists, and the like, into actual limitations upon human cognitive powers. Data uncovered so far have unveiled the existence of much fallibility. (See, for example, Nisbett and Ross 1980; Kahneman, Slovic, and Tversky 1982.)

Some epistemologists have found this to be worrying in itself. Still, has enough fallibility thereby been uncovered to justify an acceptance of fallibilism? (Remember that fallibilism, in its most general form, is the thesis that all of our beliefs are fallible.) This, too, is at least partly an empirical question. It is the question of just how fallible people are as a group — and, naturally, of just how much a given individual ever manages to transcend such limitations upon people in general. How fallibly, as it happens, do people ever form and maintain beliefs? Is every single one of us fallible enough to render every single one of our beliefs fallible?

It is difficult, perhaps impossible, to use personal observations and empirical research to answer those questions conclusively. (And fallibilism would deny that this is possible anyway.) For presumably such fallibilities would also afflict people as observers and as scientific inquirers. Hence, this would occur even when theorists — let alone casual observers — are investigating those fallibilities. The history of science reveals that many scientific theories which were at one time considered to be true have subsequently been supplanted, with later theories deeming the earlier ones to have been false.

Is science therefore especially fallible as a way of forming beliefs about the world? That is a matter of some philosophical dispute. Empirical science is performed by fallible people, often involving much fallible coordination among themselves. It relies on the fallible process of observation. And it can generate quite complicated theories and beliefs — with that complexity affording scope for marked fallibility. Yet in spite of these sources of fallibility nestling within it (when it is conceived of as a method), science might well (when it is conceived of as a body of theses and doctrines) encompass the most cognitively impressive store of knowledge that humans have ever amassed. Even if not all of its theories and beliefs are true (and therefore not all of them are knowledge), a significant percentage of them seem to have a strong case for being knowledge. Is that compatible with science’s fallibility, even its inherent fallibility, as a method? Or are none of its theories and beliefs knowledge, simply because (as later scientists will realize) some of them are not? Alternatively, are none of them knowledge, because none of them are conclusively justified? That depends on what kind of knowledge scientific knowledge would be. This is a subtle matter, asking us first to consider in general whether there can be inconclusively justified knowledge at all. Section 9 will indicate how epistemologists might take a step towards answering that question. It will do so by discussing the idea of fallible knowledge. (And section 10 will comment on science and fallible justification.)

6. Philosophical Sources of Fallibilism: Hume

Section 5 indicated some empirical grounds on which fallibilism might be thought to be true. Epistemologists have also provided non-empirical arguments for fallibilism, both in its strongest form and in important-but-weaker forms. This section and the next will present two of those arguments.

One of them comes from the eighteenth-century Scottish philosopher David Hume’s classic invention of what is now called inductive skepticism. (For a succinct version of his argument, see his 1902 [1748], sec. IV. For some sense of the philosophical and historical dimensions of that notion, see Buckle 2001: part 2, ch. 4.) At the core of his skeptical argument was an important-even-if-possibly-not-wholly-general fallibilism. Hume’s argument showed, at the very least, the inescapable fallibility of an extremely significant kind of belief: any belief which either is or could be an inductive extrapolation from observational data. According to Hume, no beliefs about what is yet to be observed (by a particular person or some group) can be infallibly established on the basis of what has been observed (by that person or that group). Consider any use of present and past observations, perhaps to derive, and at least to support, some view that aims to describe aspects of the world that have not yet been observed. (Standard examples include people’s seeking to justify the belief that the sun will rise tomorrow by using past observations of its having risen, and people’s many observations of black ravens supposedly justifying the belief that all ravens are black.) Hume noticed that observations can never provide conclusive assurance (a proof) that the world is not about to change from what it has thus far been observed to be like. Even if all observed Fs have been Gs, say, this does not entail that any, let alone all, of the currently unobserved Fs are also Gs. No such guarantee can be given by the past observations. And this is so, no matter how many observations of Fs have been made (short of having observed all of them, while realizing that this has occurred).

Hume presents his argument as one that uncovers a limitation upon the power or reach of reason — that is, upon how much can be revealed to us by reason as such. Possibly, this is in part because that is the non-trivial aspect of his argument. Overall, his argument is describing a limitation upon the power or reach both of reason and of observation — upon how far these faculties or capacities can take us towards proving the truth of various beliefs which, inevitably, we find ourselves having. But that limitation reflects both a point that is non-trivially true (about reason) and one that is trivially true (about observation). Hume combines those two points (as follows) to attain his fallibilism. (1) It is trivially true that any observations that have been made at and before a given time have not been of what, at that time, is yet to be observed. (2) It is true (although not trivially so) that our powers of reason face a limitation of their own, one that leaves them unable to overcome (1)’s limitation upon observation. Our capacity to reason — our powers simply of reflection — must concede that, regardless of however unlikely this might seem at the time, the unobserved Fs could be different in a relevant way from those that have been observed. Hence, in particular, whatever powers of reason we might use in seeking to move beyond our observations will be unable to eliminate the possibility that the presently unobserved Fs are quite different (as regards being Gs) from the Fs that have been observed. Our powers of reason must concede — again, even if this seems unlikely at the time — that continued observations of Fs might be about to begin giving results that are quite different to what such observations have previously revealed about Fs being Gs. Obviously, the past observations of Fs (all of which, we are supposing, were Gs) do not tell us that this is likely to occur, let alone that it is about to do so. 
But, crucially, pure reason tells us that it could be about to occur. (3) Consequently, if we combine (1) and (2), we reach this result:

Neither observation nor reason can reveal with rational certainty anything about the nature of any of the Fs that are presently unobserved.

In other words, there is always a “logical gap” between the observations of Fs that have been made (either by some individual or a group) and any conclusion regarding Fs that have not yet been observed (by either that individual or that group).
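The “logical gap” just described can be put schematically. The following is only a sketch in first-order notation, not Hume’s own formulation: the set O (the Fs observed so far) and the individual a (an F outside O) are labels assumed here for illustration, and the symbol for non-entailment assumes the amssymb package.

```latex
% O: the set of Fs observed so far; a: any individual not in O.
% Both labels are illustrative assumptions, not notation from the article.
\[
\forall x\,\bigl(x \in O \rightarrow (Fx \rightarrow Gx)\bigr)
\;\nvDash\;
(Fa \rightarrow Ga)
\qquad \text{for any } a \notin O .
\]
% Countermodel: let O = \{b\} with Fb and Gb, and let a \neq b with Fa and \neg Ga.
% The premise holds (every observed F is a G) while the conclusion fails,
% so the observations do not entail anything about the unobserved F.
```

The countermodel is what the Humean thought about the world being “about to change” dramatizes: it is a consistent description, however unlikely it may seem.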

Our appreciation of that gap’s existence is made specific — even dramatic — by the Humean thought that the world could be about to change in the relevant respect. We thus see that fallibility cannot be excluded from any justification which we might think is present for a belief that either is or could be an extrapolation from some observations. Such a belief could be about the future (“The sun will rise tomorrow”), the presently unobserved past (“Dinosaurs used to live here”), populations (“The cats in this neighborhood are vicious”), and so on. Beliefs like that are pivotal in our mental lives, it seems.

Indeed, as some philosophers argue, they can be all-but-ubiquitous — even surprisingly so. When you believe that you are seeing a cat, is this an extrapolation from observations? At first glance, it seems straightforwardly observational itself. Yet maybe it is an extrapolation in a less obvious way. Perhaps it is an extrapolation from both your present sensory experience and similar ones that you have had in the past. Perhaps it is implicitly a prediction that the object in front of you is not about to begin looking and acting like a dog, and that it will continue looking and acting like a cat. (Is this part of what it means to say that the object is a cat — a genuine-flesh-and-blood-physical-object cat?) Are even simple observational beliefs therefore concealed or subtle extrapolations? If they are to be justified, will this need to be inductive justification?

If so, the Humean verdict (when formulated in contemporary epistemological language) remains that, even at best, such beliefs are only fallibly justified. Any justification for them would need to be observations from which they might have been extrapolated (even if in fact this is not, psychologically speaking, how they were reached). And no such justification could ever rationally eliminate the possibility that any group of apparently supportive observations is misleading as to what the world would be found to be like if further observations were to be made.

That is Hume’s inductive fallibilism — a fallibilism about all actual or possible inductive extrapolations from observations. Many interpreters believe that his argument established — or at least that Hume meant it to establish — more than a kind of fallibilism. This is why it is generally called an argument for inductive skepticism, not just for inductive fallibilism. (On Hume’s transition from fallibilism to skepticism, see Stove 1973.) Accordingly, his conclusion is sometimes presented more starkly, as saying that observations never rationally show or establish or support or justify at all any extrapolations beyond observational data, even ones that purport only to describe a likelihood of some observed pattern’s being perpetuated. At its most combative, his conclusion might be said — and sometimes is, especially by non-philosophers — to reveal that predictions are rationally useless or untenable, or that any beliefs “going beyond” observational reports are, rationally speaking, nothing more than guesses. Whether or not that skeptical thesis is true depends, for a start, upon whether there can be such a thing as fallible justification — or whether, once fallibility is present, justification departs. Section 10 will consider that issue.

In any case, Hume’s fallibilism is generally considered by philosophers (for instance, see Quine 1969; Miller 1994: 2-13; Howson 2000: ch. 1) to have struck a serious blow against the otherwise beguiling picture of science as delivering conclusive knowledge of the inner continuing workings of the world. It is not uncommon for people to react to this interpretation of Hume’s result by inferring that therefore science — with its reliance upon observations as data, with which it supports its predictions and more general principles and posits — never really gives us knowledge of a world beyond those observations. The appropriateness of that skeptical inference depends on whether or not there can be such a thing as fallible knowledge — or whether, once fallibility is present, knowledge departs. Section 9 will consider that issue.

7. Philosophical Sources of Fallibilism: Descartes

Does Hume’s reasoning (described in section 6) support fallibilism in its most general form? It does, if all beliefs depend for their justification upon extrapolations from observational experience. And section 6 also indicated briefly how there can be more beliefs like that than we might realize. Nevertheless, the usual philosophical reading of Hume’s argument does not assume that the argument shows that all beliefs are to be supported either fallibly or not at all. We should therefore pay attention to another equally famous philosophical argument, one whose conclusion is definitely that no beliefs at all are conclusively justified.

This argument comes to us from the seventeenth-century French philosopher René Descartes. In his seminal Meditations on First Philosophy (1911 [1641]), Descartes ended Meditation I skeptically, denying himself all knowledge. How was that skeptical conclusion derived? It was based upon a fallibilism — a wholly general fallibilism. And his argument for that fallibilism — the Evil Genius (or Evil Demon) argument, as it is often called — may be presented in this way:

Any beliefs you have about … well, anything … could be present within you merely because some evil genius or demon has installed them there. And they might have been installed so as to deceive you: maybe any or all of them are false. Admittedly, you do not feel as if this has happened within you. Nonetheless, it could have done so. Note that the evil genius is not simply some other person, even an especially clever one. Rather, it would be God-like in pertinent powers although malevolent in accompanying intent — mysteriously able to implant any false beliefs within you so that their presence will feel natural to you, leaving you unaware that any of your beliefs are bedeviled by this untoward causal origin. You will never notice the evil genius’s machinations. All will seem normal to you within your mind. It will feel just as it would if you were observing and thinking carefully and insightfully.

Is that state of affairs possible? Indeed it is (said Descartes, and most epistemologists have since agreed with him about that). Moreover, if it is always present as a possibility, then one pressing part of it — your being mistaken — is always present as a possibility. This is always present, as a possibility afflicting each of your beliefs. What is true of you in this respect, too, is true of everyone. The evil genius could be manipulating all of our minds. Hence, any belief could be false, no matter who has it and no matter how much evidence they have on its behalf. Even the evidence, after all, could have been installed and controlled by an evil genius.

Interestingly, the reference to an evil genius as such, provocative though it is, was not essential even to Descartes’ own reasoning. In Meditation I, he had already — immediately prior to outlining the Evil Genius argument — presented a sufficiently fallibilist worry. It concerned the possibility of his having been formed or created in some way — whatever way that might be — which would leave him perpetually fallible. He wanted to believe that God was his creator. However (he wondered), would God create him as a being who constantly makes mistakes, or who is at least always liable to do so? God would be powerful enough to do this. But (Descartes also thought) surely God would have had no reason to allow him to make even some mistakes. Yet manifestly Descartes does make them. So (he inferred), he could not take for granted at this early stage of his inquiry (as it is portrayed in his Meditations) that he has actually been formed or created by a perfect God. The evidence of his fallibility opens the door to the possibility that he does not have that causal background. So (he continues), maybe his causal origins are something less than perfect, as of course they would be if anything less than a perfect God were involved in them. In that event, however, he is even more likely to make mistakes than he would be if God was his creator. In one way or the other, therefore (concludes Descartes), fallibility is unavoidable for him: no belief of his is immune from the possibility of being mistaken. Thus, fallibilism is thrust upon Descartes by this reasoning. (He realizes, nonetheless, that it is subtle reasoning. He might not retain it in his thinking. He might overlook his fallibility, if he is not mentally vigilant. Hence, he proceeds to describe the evil genius possibility to himself, as a graphic way of holding the fallibilism fast in his mind. The Evil Genius argument is, in effect, a philosophical mnemonic for him.)

Descartes himself did not remain a fallibilist. He believed that (in his Meditation II) he had found a convincing answer to that fallibilist argument. This answer was his Cogito, one of philosophy’s emblematic moments, and it arose via the following reasoning. Descartes thought that if ever in fact he is being deceived by an evil genius, at least he will thereby be in existence at these moments. (It is impossible to be an object of deception without existing.) The deception would be inflicted upon him while he exists as a thinker — specifically, as someone thinking whatever false thoughts are being controlled within him by the evil genius. But this entails (reasoned Descartes) that there is a kind of thought about which he cannot be deceived, even by an evil genius. Because he can know that he is having a particular thought, he can know that he exists at that time. And so he thought, “I think, therefore I am.” (This is the usual translation into English of the “Cogito, ergo sum” from Latin. The latter version is from Descartes’ Discourse on Method.) He would thereby know that much, at any rate (inferred Descartes). He need not — and at this point in his inquiry he does not think that he can — know which, if any, of his beliefs about the wider world are true. Nonetheless, he has knowledge of his inner world — knowledge of his own thinking. He would know not only that he is thinking, but even what it is that he is thinking. These beliefs about his mental life are conclusively supported, too, because — as he has just argued — they are beyond the relevant reach of any evil genius. No evil genius can give him these thoughts (that he is thinking and hence existing) and thereby be deceiving him.

But most subsequent epistemologists have been more swayed by the fallibilism emerging from the Evil Genius argument than by Descartes’ reply to that argument. (For a discussion of these issues in Descartes’ project, see Curley 1978; Wilson 1978.) One common epistemological objection to his use of the Cogito is as follows. How could Descartes have known that it was he in particular who was thinking? Shouldn’t he have rested content with the more cautious and therefore less dubitable thought, “There is some thinking occurring” — instead of inferring the less cautious and therefore more dubitable thought, “I am thinking”? That objection was proposed by Georg Lichtenberg in the eighteenth century. (For a criticism of it, see Williams 1978: ch. 3.) An advocate of it might call upon such reasoning as this:

In order to know that it is his own thinking, as against just some thinking or other, Descartes has to know already — on independent grounds — that he exists. However, in that event he would not know of his existing, only through his knowing of the thinking actually occurring: he would have some other source of knowledge of his existence. Yet his Cogito had been relied upon by him because he was assuming that his knowing of the thinking actually occurring was (in the face of the imagined evil genius) the only way for him to know of his existence.

That reasoning would claim to give us the following results. (1) Descartes does not know that he is thinking, because he would have to know already that he exists (in order to be the subject of the thinking which is noticed), and because he can know that he exists only if he already knows that he is thinking (the latter knowledge being all that is claimed to be invulnerable to the Evil Genius argument). (2) Similarly, Descartes does not know that he exists, because he would have to know already that he is thinking (this being all that is claimed to be invulnerable to the Evil Genius argument), and because he could know that he is thinking only by already knowing that he exists (thereby being able to be the subject of the thinking that is being noticed). (3) And once we combine those two results, (1) and (2), what do we find? The objection’s conclusion is that Descartes knows of his thinking and of his existence all at once, or not at all. In short, he is not entitled, as a knower, to the “therefore” in his “I think, therefore I am.”

That is one possible objection to the Cogito. Still, even if it succeeds on its own terms, it leaves open the following question. Can Descartes have all of that knowledge (the knowledge of his thinking and the knowledge of his existence) all at once? This depends on whether, once he has doubted as strongly and widely as he has done, he can have knowledge even of what is in his own mind. In the mid-twentieth century, the Austrian philosopher Ludwig Wittgenstein mounted a deep challenge to anything like the Cogito as a way of grounding our thought and knowledge. Was Descartes legitimately using words at all so as to form clearly known thoughts, such as “I am thinking”? How could he know what these even mean, unless he is applying some understood language? Wittgenstein argued that no one could genuinely think thoughts that do not depend upon immersion in a “public” language, presumably a language shared by other speakers, certainly one already built up over time. In that case, Descartes would be mistaken in believing that, even if the possibility of an evil genius imperils all of his other knowledge, he could retain the knowledge of his own thinking. For even that thinking would have its content only by using terms borrowed from a public language. Hence, Descartes would have to be presupposing some knowledge of that public world, even when supposedly retreating to the inner comfort and security of knowing just what he is thinking. (It should be noted that Wittgenstein himself did not generally direct his reasoning, which came to be called his Private Language argument, specifically against Descartes by name. For Wittgenstein’s reasoning, see his 1978 [1953], secs. 243-315, 348-412.)

Of course, even if the Cogito does in fact succeed, epistemologists all-but-unite in denying that such conclusiveness would be available for many, or perhaps any, other beliefs. Accordingly, we would still confront an all-but-universal fallibilism, with Descartes having provided an easy way to remember our all-but-inescapable fallibility. In any case, it remains possible that the Cogito does not succeed, and that instead the Evil Genius argument shows that no belief is ever conclusively justified. Descartes’ argument is not the only one for such a fallibilism. But most epistemologists still refer to it routinely and with some respect, as being a paradigm argument for the most general form of fallibilism.

8. Implications of Fallibilism: No Knowledge?

If we were to accept that fallibilism is true, to what else would we thereby be committed? In particular, what further philosophical views must we hold (all else being equal) if we hold fallibilism?

Probably the most significant idea that arises, in response to that question, is the suggestion that any fallibilist about justification has to be a skeptic about the existence of knowledge. (There is also the proposal that she must be a skeptic about the existence of justification. Section 10 will discuss that proposal.) This potential implication has made fallibilism particularly interesting to many philosophers. Should we accept the skeptical thesis that because (as fallibilists claim) no one is ever holding a belief infallibly, no one ever has a belief which amounts to being knowledge? In this section and the next, we will consider that question — first (in this section) by examining how one might argue for the skeptical thesis, next (in section 9) by seeing how one might argue against it.

That hypothesized skeptic is reasoning along these lines:

  1. Any belief, if it is to be knowledge, needs to be conclusively justified.
  2. No belief is conclusively justified. [Fallibilism tells us this.]
  3. Hence, no belief is knowledge. [This follows from 1-plus-2.]

Fallibilism gives us 2; deductive logic gives us 3 (as following from 1 and 2); and in this section we are not asking whether fallibilism is true. (We are assuming, for the sake of argument, that it is.) So, our immediate challenge is to ask whether 1 is true. Is it a correct thesis about knowledge? Does knowledge require infallibility (as 1 claims it does)? The rest of this section will evaluate what are probably the two most commonly encountered arguments for the claim that knowledge is indeed like that.
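For readers who find a schematic version helpful, the skeptic’s three-step argument can be sketched as follows. The predicate letters are assumptions introduced only for this illustration (K for “is knowledge”, C for “is conclusively justified”), and the layout assumes the amsmath and amssymb packages.

```latex
% K(b): belief b is knowledge; C(b): belief b is conclusively justified.
% These predicate letters are illustrative assumptions, not the article's notation.
\begin{align*}
&1.\quad \forall b\,\bigl(K(b) \rightarrow C(b)\bigr) && \text{(knowledge requires conclusive justification)} \\
&2.\quad \forall b\,\neg C(b) && \text{(fallibilism)} \\
&3.\quad \therefore\ \forall b\,\neg K(b) && \text{(from 1 and 2, by modus tollens on each } b\text{)}
\end{align*}
```

The argument is valid in this form, which is why the philosophical question falls entirely on whether premise 1 is true.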

(1) Impossibility. Many people say this about knowledge:

If you have knowledge of some aspect of the world, it is impossible for you to be mistaken about that aspect. (An example: “If you know that it’s a dog, you can’t be mistaken about its being one.”)

We may call that the Impossibility of Mistake thesis. Its advocates might infer, from the conjunction of it with fallibilism, that no one ever has any knowledge. Their reasoning would be like this:

Because no one ever has conclusive justification for a belief, mistakes are always possible within one’s beliefs. Hence, no beliefs attain the rank of knowledge. (We would just think — mistakenly — that often knowledge is present.)

But almost all epistemologists would regard that sort of inference as reflecting a misunderstanding of what the Impossibility of Mistake thesis is actually saying. More specifically, they will say that there is a misunderstanding of how the term “impossible” is being used in that thesis. Here are two possible claims that the Impossibility of Mistake thesis could be thought to be making:

Any instance of knowledge is, and indeed must be, directed at what is true. (Knowledge entails truth.)

Any instance of knowledge has as its content what, in itself, could not possibly be false. (Knowledge entails necessary truth.)

The first of those two interpretations of the Impossibility of Mistake thesis says that knowledge, in itself, has to be knowledge of what is true. The second of the two possible interpretations says that knowledge is of what, in itself, has to be true. The two claims will be correlatively different in what they imply.
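One standard way of marking the difference between these two interpretations is by the scope of the necessity operator. The following is a sketch in modal notation that is assumed here for illustration and is not the article’s own (Kp for “p is known”, the box for “necessarily”; the box symbol assumes the amssymb package).

```latex
% Kp: p is known; \Box: "it is necessary that".
% Notation is an illustrative assumption, not taken from the article.
\begin{align*}
\text{(a)}\quad & \Box\,(Kp \rightarrow p) && \text{necessarily, if $p$ is known then $p$ is true} \\
\text{(b)}\quad & Kp \rightarrow \Box\,p && \text{if $p$ is known, then $p$ is necessarily true}
\end{align*}
```

On this rendering, (a) places the necessity on the link between knowing and truth, while (b) places it on the known proposition itself. In standard modal logics (a) does not entail (b), which fits the point that the two claims differ in what they imply.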

Epistemologists will insist that the first possible interpretation (which could be called the Necessarily, Knowledge Is of What Is True thesis) is manifestly true — but that it does not join together with fallibilism to entail skepticism. Recall (from (2) in section 2) that fallibilism does not deny that there can be truths among our claims and thoughts. It denies only that we are ever conclusively justified in any specific claim or thought as to which claims or thoughts are true. So, while the Necessarily, Knowledge Is of What Is True thesis entails that any case of knowledge would be knowledge of a truth, fallibilism — because it does not deny that there are truths — does not entail that there is no knowledge.

Epistemologists will also deny that the second possible interpretation (which may be called the Knowledge Is of What Is Necessarily True thesis), even if it is true, entails skepticism. Recall (this time from (3) in section 2) that fallibilism is not a thesis which denies that knowledge could ever be of contingent truths. So, while the Knowledge Is of What Is Necessarily True thesis entails that any case of knowledge would be knowledge of a necessary truth, fallibilism — because it does not, in itself, deny that there is knowledge of contingent truths — does not entail that there is no knowledge. (But most epistemologists, incidentally, will deny that the Knowledge Is of What Is Necessarily True thesis is true. They believe that — if there can be knowledge at all — there can be knowledge of contingent truths, not only of necessary ones.)

(2) Linguistic oddity. Another way in which people are sometimes led to deny that a wholly general fallibilism is compatible with people ever having knowledge is by their reflecting on some supposed linguistic infelicities. Imagine saying or thinking something like this:

“I know that’s true, even though I could be mistaken about its being true.” (An example: “I know that it’s raining, even though I could be mistaken in thinking that it is.”)

That is indeed an odd way to speak or think. Let us refer to it as The Self-Doubting Knowledge Claim. Epistemologists also refer to such claims as concessive knowledge-attributions — for short, as CKAs. Should we infer, from that claim’s being so linguistically odd, that no instance of knowledge can allow the possibility (corresponding to the “could” in The Self-Doubting Knowledge Claim) of being mistaken? Would this imply the incompatibility of fallibilism with anyone’s ever having knowledge? Does this show that, whenever one’s evidence in support of a belief does not provide a conclusive proof, the belief fails to be knowledge?

Few epistemologists will think so. They are yet to agree on what, exactly, the oddity of a sentence like The Self-Doubting Knowledge Claim reflects. (Very roughly: there is some oddity in that claim’s expressed mixture of confidence and caution.) But few of them believe that the oddity, however it is ultimately to be understood, will imply that knowledge cannot ever be fallible. Their usual view is that the oddity will be found to reside only in the talking or the thinking, that is, in someone’s actively using any such sentence. And this could be so (they continue) without the sentence’s also actually being false, even when it is being used. Some sentences which clearly are internally logically consistent, and hence which in some sense could be true, cannot be used without a similar linguistic oddity being manifested. Try saying, for example, “It’s raining, but I don’t believe that it is.” As the twentieth-century English philosopher G. E. Moore remarked (and his observation has come to be called Moore’s Paradox), something is amiss in any utterance of that kind of sentence. (For more on Moore’s Paradox, see Sorensen 1988: ch. 1; Baldwin 1990: 226-32.) This particular sentence, “It’s raining, but I don’t believe that it is,” is manifestly odd, seemingly in a similar way to any utterance of The Self-Doubting Knowledge Claim. Yet this does not entail the sentence’s being false. For each half of it could well be true; and they could be true together. The fact that it is raining is logically consistent with the speaker’s not believing that it is. (She could be quite unaware of the weather at the time.) So, the sentence could be true within itself, even though it cannot sensibly be uttered. That is, its content (what it reports) could be true, even if it cannot sensibly be asserted, as a case of reporting, in living-and-breathing speech or thought.
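The consistency of the Moorean sentence can be displayed schematically. This is only a sketch, with symbols assumed for illustration: p stands for “it is raining” and B_s p for “the speaker s believes that p”.

```latex
% p: it is raining; B_s p: speaker s believes that p.
% Symbols are illustrative assumptions, not the article's notation.
\[
p \,\wedge\, \neg B_s\,p
\]
% Satisfiable: take a situation in which it is raining (p is true)
% and s has formed no belief about the weather (B_s p is false).
% Both conjuncts then hold together, so the sentence is not contradictory.
```

The oddity therefore has to be located in the act of asserting the conjunction, not in any logical conflict between its two halves.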

And the same is true (epistemologists will generally concur) of The Self-Doubting Knowledge Claim, the analogous sentence about knowledge and the possibility of being mistaken. Are they correct about that? The next section engages with that question.

9. Implications of Fallibilism: Knowing Fallibly?

The question with which section 8 ended amounts to this: is it possible for there to be fallible knowledge? If The Self-Doubting Knowledge Claim could ever be true, this would be because at least some beliefs are capable of being knowledge even when there is an accompanying possibility of their being mistaken. Any such belief, it seems, would thereby be both knowledge and fallible.

Many epistemologists, probably the majority, wish to accept that there can be fallible knowledge (although they do not always call it this). Few of them are skeptics about knowledge: almost all epistemologists believe that everyone has much knowledge. But what do they believe about the nature of such knowledge? When an epistemologist attributes knowledge, what — more fully — is being attributed? In general, epistemologists also accept that (for reasons such as those outlined in sections 5 through 7) knowledge is rarely, if ever, based upon infallible justification: they believe that there is little, if any, infallible justification. Hence, most epistemologists, it seems, accept that when people do gain knowledge, this usually, maybe always, involves fallibility.

Epistemologists generally regard this fallibilist approach as more likely to generate a realistic conception of knowledge, too. Their aim is to be tolerant of the cognitive fallibilities that people have as inquirers, while nevertheless according people knowledge (usually a great deal of it). The knowledge would therefore be gained in spite of the fallibility. And, significantly, it would be a kind of knowledge which somehow reflects and incorporates the fallibility. Indeed, it would thereby be fallible knowledge. (It would not be infallible knowledge coexisting with fallibility existing only elsewhere in people’s thinking.) With this strategy in mind, then, epistemologists who are fallibilists tend not to embrace skepticism.

Nor (if section 8 is right) should they do so. That section reported (i) the two reasons most commonly thought to show that fallibility in one’s support for a belief is not good enough if the belief is to be knowledge, along with (ii) the explanations of why (according to most epistemologists) those reasons mentioned in (i) are not good enough to entail their intended result. Given (ii), therefore, (i) will at least fail to give us infallible justification for thinking that fallible knowledge is not possible. Accordingly, perhaps such knowledge is possible. But if it is, then what form would it take?

Almost all epistemologists will adopt this generic conception of it:

Any instance of fallible knowledge is a true belief which is at least fallibly (and less than infallibly) justified.

(And remember that F*, in section 4, gave us some sense of what fallible justification is.) Let us call this the Fallible Knowledge Thesis. It is an application, to fallible knowledge in particular, of what is commonly called the Justified-True-Belief Analysis of Knowledge. (For an overview of that sort of analysis, see Hetherington 1996.) As stated, the Fallible Knowledge Thesis is quite general, in that it says almost nothing about what specific forms the justification within knowledge might take; all that it does require is that the justification would provide only fallible support.
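The Fallible Knowledge Thesis can be sketched schematically as follows. All of the symbols are assumptions introduced for illustration, and reading “less than infallibly justified” as “the justification does not entail the belief’s truth” is one common gloss, not a definition taken from the article (the non-entailment symbol assumes the amssymb package).

```latex
% S: a believer; p: a proposition; e: S's justification (evidence) for p.
% B_S p: S believes that p; J(e, p): e justifies p; FK_S p: S fallibly knows that p.
% All symbols are illustrative assumptions.
\[
FK_S\,p \;\leftrightarrow\; p \,\wedge\, B_S\,p \,\wedge\, J(e, p) \,\wedge\, (e \nvDash p)
\]
% The first three conjuncts are the justified-true-belief conditions;
% the last encodes fallibility: e supports p without guaranteeing it.
```

On this sketch, truth is a separate conjunct from justification, so the evidence alone never settles whether the whole condition is met.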

Nonetheless, generic though it is, the question still arises of whether the Fallible Knowledge Thesis is ever satisfiable, let alone actually satisfied. And that question readily leads into this more specific one: Can a true belief ever be knowledge without having its truth entailed by the justification which is contributing to making the belief knowledge? (Sometimes this talk of justification is replaced by references to warrant, where this designates the justification and/or anything else that is being said to be needed if a particular true belief is to be knowledge. For that use of the term “warrant,” see Plantinga 1993.) Section 8 has disposed of some objections to there being any fallible knowledge; and the previous paragraph has gestured at how — via the Justified-True-Belief Analysis — one might conceive of fallible knowledge. Nonetheless, there could be residual resistance to accepting that there can be fallible knowledge like that. Undoubtedly, some people will think, “There just seems to be something wrong with allowing a belief or claim to be knowledge when it could be mistaken.”

That residual resistance is not clearly decisive, though. It could well owe its existence to a failure to distinguish between two significantly different kinds of question. The first asks whether a particular belief, given the justification supporting it, is true (and thereby fallible knowledge). The other question asks whether, given that belief’s being true, there is enough supporting justification in order for it to be (fallible) knowledge. The former question is raised from “within” a particular inquiry into the truth of a particular belief. The latter question arises from “outside” that inquiry into that belief’s being true (even if this question is arising within another inquiry, perhaps an epistemological one). There is no epistemologically standard way of designating the relevant difference between those kinds of question. Perhaps the following is a helpful way to clarify that difference.

(1) The not-necessarily-epistemological question as to whether a belief is true. Imagine trying to ascertain whether some actual or potential belief or claim is true. You ask yourself, say, “Do I know whether I passed that exam?” Suppose that you have good — fallibly good — evidence in favor of your having passed the exam. (You studied well. You concentrated hard. You felt confident. Your earlier marks in similar exams have been good.) And now suppose that you recall the Justified-True-Belief Analysis. You apply it to your case. What does it tell you? It tells you just that if your actual or possible belief (namely, the belief that you passed the exam) is true, then — given your having fallibly good evidence supporting the belief — the belief is or would be knowledge, albeit fallible knowledge. But does this reasoning tell you whether the belief is knowledge? It does not. All that you have been given is this conditional result: If your belief is true, then (given the justification you have in support of it) the belief is also knowledge. You have no means other than your justification, though, of determining whether the belief is true; and because the justification is fallible, it gives you no guarantee of the belief’s being true (and thereby of being knowledge). Moreover, if fallibilism is true, then any justification which you might have, no matter how extensive or detailed it is, would not save you from that plight. Thus (given fallibilism), you are trapped in the situation of being able to reach, at best, the following conclusion: “Because my evidence provides fallible justification for my belief, the belief is fallible knowledge if it is true.” At which point, most probably, you will wonder, “Is it true? That’s what I still don’t know. (I have no other way of knowing it to be true.)” And so — right there and then — you are denying that your belief is knowledge, because you are denying that you know it to be true. 
The fallibility in your justification leaves you dissatisfied, as an inquirer into the truth of a particular belief, at the idea of allowing that it could be knowledge, even fallible knowledge. When still inquiring into the truth of a particular belief, it is natural for you to deny that (even if, as it happens, the belief is true) your having fallible justification is enough to make the belief knowledge.

(2) The epistemological question as to whether a belief is knowledge. But the epistemologist’s question (asked at the start of this section) as to whether there can be fallible knowledge is not asked from the sort of inquirer’s perspective described in (1). The epistemologist is not asking whether your particular belief is true (while noting the justification you have for the belief). That is the question you are restricted to asking, when you are proceeding as the inquirer in (1). The epistemological question is subtly different. It does not imagine a fallibly justified belief — before asking, without making any actual or hypothetical commitment as to the belief’s truth, whether the belief is knowledge. Rather, the epistemologist’s question considers the conceptual combination of the belief plus the justification for it plus the belief’s being true — which is to say, the whole package that, in this case, is deemed by the Justified-True-Belief Analysis to be knowledge — before proceeding to ask whether this entirety is an instance of knowledge. To put that observation more simply, this epistemological question asks whether a belief which is fallibly justified, and which is true, is (fallible) knowledge. This is the question of whether your belief is knowledge, given (even if only for argument’s sake) that it is true. In (1), your focus was different to that. In wondering whether you had passed the exam, you were asking whether the belief is true: you were still leaving open the issue of whether or not the belief is true. And, as you realized, your fallible justification was also leaving open that question. For it left open the possibility of the belief’s falsity.

Consequently, from (1), it is obvious why an inquirer might want infallibility in her justification for a belief’s truth. Infallibility would mean her not having to leave open the question of the belief’s truth. Without infallibility, the possibility is left open by her justification (which is her only indication of whether her belief is true) of her belief being false — and hence not knowledge. (This is so, even if we demand that, in order for an inquirer’s belief to be knowledge, she has to know that it is. That demand is called the KK-thesis (with its most influential analysis and defense coming from Hintikka 1962: ch. 5) — because one’s having a piece of knowledge is taken to require one’s Knowing that one has that Knowledge. Yet even satisfying that demand does not remove the rational doubt described in (1). If the extra knowledge — the knowledge of the initial belief’s being knowledge — is not required to be infallible itself, then scope for doubt will remain as to whether the initial belief really is knowledge.) But if we can either (i) know or (ii) suppose (for the sake of another kind of inquiry) that the belief is true, then we may switch our perspective, so as to be asking a different question. That is what the epistemologist is doing in (2), by adopting the latter, (ii), of these two options. She supposes, for the sake of argument, that the belief is true; then she can ask, “Would the belief’s being both true and fallibly justified suffice for it to be knowledge?” She can do this without knowing at all, let alone infallibly, whether the belief is true. (She will also not know infallibly, at least not via this questioning, whether the belief is knowledge. Yet what else is to be expected if fallibilism is true?)

It is also obvious, from (1), why an inquirer might want infallibility in her justification, insofar as she is wondering whether to say or claim that some actual or potential belief of hers is knowledge. Nonetheless, this does not entail her needing such justification if her belief is to be knowledge. Remember — from (2) in section 8 — that whether one has a specific piece of knowledge could be quite a different matter to whether one may properly claim to have it. Similarly, most epistemologists will advise us not to confuse what makes a belief knowledge with what rationally assures someone that her belief is knowledge. For example, it is possible — according to fallibilist epistemologists in general — for a person to have some fallible knowledge, even if she does not know infallibly which of her beliefs attain that status.

This section began by asking the epistemological question of whether there can be fallible knowledge. And with our having seen — in this section’s (2) — what that question is actually asking, along with — in this section’s (1) — what it is not asking, we should end the section by acknowledging that, in asking that epistemological question, we need not be crediting epistemological observers with having a special insight into whether, in general, people’s beliefs are true. The question of whether those beliefs are true is not the question being posed by the epistemological observer. She is asking whether a particular belief is knowledge, given (even if only for argument’s sake) that it is true and fallibly justified. She is asking this from “above” or “outside” the various “lower level” or “inner” attempts to know whether the given beliefs are true. The other (“lower level”) inquirers, in contrast, are asking whether their fallibly justified beliefs are true. There is fallibility in each of those processes of questioning; they just happen to have somewhat different subject-matters and methods.

We should not leave a discussion of the Fallible Knowledge Thesis without observing that, even if it is correct in its general thrust, epistemologists have faced severe challenges in their attempts to complete its details — to make it more precise and less generic. Over the past forty or so years, there have been many such attempts. But these have encountered one problem after another, mostly as epistemologists have struggled to solve what is often called the Gettier Problem, stemming from a 1963 article by Edmund Gettier.

A very brief word on that problem is in order here. It has become the epistemological challenge of defining knowledge precisely, so as to understand all actual or possible cases of knowledge — where one of the project’s guiding assumptions has been that it is possible for instances of knowledge to involve justification which supplies only fallible support. In other words, the project has striven to find a precise analysis of what the Fallible Knowledge Thesis would deem to be fallible knowledge; and, unfortunately, the Gettier Problem is generally thought by epistemologists still to be awaiting a definitive solution. Such a solution would determine wholly and exactly how fallible a particular justified true belief can be, and in what specific ways it can be fallible, without that justified true belief failing to be knowledge. In the meantime (while awaiting that sort of solution), epistemologists incline towards accepting the Justified-True-Belief Analysis — represented here in the Fallible Knowledge Thesis — as being at least approximately correct. Certainly in practice, most epistemologists treat the analysis as being correct enough — so that it functions well as giving us a concept of knowledge that is adequate to whatever demands we would place upon a concept of knowledge within most of the contexts where we need a concept of knowledge at all. Such epistemologists take the difficulties that have been encountered in the attempts to ascertain exactly how a fallibly justified true belief can manage to be knowledge as being difficulties of mere (and maybe less important) detail, not ones of insuperable and vital principle. Those epistemologists tend to assume that eventually the needed details will emerge, that these will be agreed upon by epistemologists, and hence that the basic idea behind the Fallible Knowledge Thesis will finally and definitively be vindicated. (For more on the history of that epistemological project, see Shope 1983 and Hetherington 2016.)

But again, that definitive vindication is yet to be achieved. And, of course, it will not eventuate if we should be answering “No” to the question (discussed earlier in this section) of whether a true belief which is less than infallibly justified is able to be knowledge. When there is fallibility in the justification for a particular true belief, is this fact already sufficient to prevent that belief from being knowledge? Few epistemologists wish to believe so. What we have found in this section is that they are at least not obviously mistaken in that optimistic interpretation.

10. Implications of Fallibilism: No Justification?

Sometimes epistemologists believe that fallibilism opens the door upon an even more striking worry than the one discussed in section 9 (namely, the possibility of there being no knowledge, due to the impossibility of knowledge’s ever being fallible). Sometimes they infer, from the presence of fallibility, that even justification (let alone knowledge) is absent. That is, once fallibility enters, even justification — all justification — departs. Consequently, those epistemologists — once they accept that a universal fallibilism obtains — are skeptics even about the existence of justification. (For an example of such an approach, see Miller 1994: ch. 3.)

How would that interpretation of the impact of fallibilism be articulated? In effect, the idea is that if evidence, say, is to provide even good (let alone very good or excellent or perfect) guidance as to which beliefs are true, it is not allowed to be fallible. No justification worthy of the name is able to be merely fallible. And from that viewpoint, of course, skepticism beckons insofar as no one is ever capable of having any infallible justification. If fallibility is rampant, yet infallibility is required if evidence or the like is ever to be supplying real justification, then no real justification is ever supplied. In short, no beliefs are ever justified.

That is a wholly general skepticism about justification, emerging from a wholly general fallibilism. A possible example of that form of skepticism would be the one with which Descartes ended his Meditation I. Cartesian evil genius skepticism would say that, because there is always the possibility of Descartes’ evil genius (in section 7) controlling our minds, any evidence or reasoning that one ever has could be a result just of the evil genius’s hidden intrusion into one’s mind. The evil genius — by making everything within one’s mind false and misleading — could render false all of one’s evidence, along with all of one’s ideas as to what is good reasoning. None of one’s evidence, and none of one’s beliefs as to how to use that evidence, would be true. However, if there were no truth anywhere in one’s thinking (with one never realizing this), then no components of one’s thinking would be truth-indicative or truth-conducive. No part of one’s thinking would ever lead one to have an accurate belief. Continually, one would both begin and end with falsity. And there are many epistemologists in whose estimation this would mean that no part of one’s thinking is ever really justifying some other part of one’s thinking. For justification is usually supposed to have some relevant link to truth. And presumably there would be no such link, if every single element in one’s thinking is misleading — as would be the case if an evil genius was at work. Is that possible, then? Moreover, is it so dramatic a possibility that if we are forever unable to prove that it is absent, then our minds will never contain real justification for even some of our beliefs?

A potentially less general skepticism about justification would be a Humean inductive skepticism (mentioned in section 6). The thinking behind this sort of skepticism infers — from the inherent fallibility of any inductive extrapolations that could be made from some observations — that no such extrapolation is ever even somewhat rational or justifying. Again, the skeptical interpretation of Humean inductive fallibilism is that, given that all possible extrapolations from observations are fallible, neither logic nor any other form of reason can favor one particular extrapolation over another. The fallibilism implies that there is fallibility within any extrapolation: none are immune. And the would-be skeptic infers from this that, once there is such widespread fallibility, there may as well be a complete absence of any pretence at rationality. The fallibility will be inescapable, even as we seek to defend the rationality of one extrapolation over another. Why is that? Well, we could mount such a defense only by pointing to one sort of extrapolation’s possessing a better past record of predictive success, say. But we would be pointing to that better past record, only in order to infer that such an extrapolation is more trustworthy on the present occasion. And that inference would itself be an inductive extrapolation. It, too, is therefore fallible. Accordingly, if there was previously a need to overcome inductive fallibility (with this need being the reason for consulting the past records of success in the first place), then there remains such a need, even after past records of success have been consulted. In this way, it is the fallibility’s inescapability that generates the skepticism.

Yet, as we noted earlier, most epistemologists would wish to evade or undermine skeptical arguments such as those: arguments that seek to convert a kind of fallibilism into a corresponding skepticism. How might this non-skeptical maneuver be achieved? There has been a plethora of attempts, too many to mention here. (For one survey, see Rescher 1980.) Moreover, no consensus has developed on how to escape skeptical arguments like these. That issue is beyond the scope of this article.

What may usefully (even if generically) be described here, however, is a fundamental choice as to how to interpret the force of fallibilism within our cognitive lives. Any response to the skeptical challenges will make that choice (even if usually implicitly and in some more specific way). The basic choice will be between the following two underlying pictures of what a wholly general fallibilism would tell us about ourselves:

(A) The inescapable fallibility of one’s cognitive efforts would be like the inescapable limits — whatever, precisely, these are — upon one’s bodily muscles. These limit what one’s body is capable of — while nonetheless being part of how it achieves whatever it does achieve. Inescapable fallibility would thus be like a background limitation — always present, sometimes a source of frustration, but rarely a danger. When used appropriately, muscles strengthen themselves in accomplished yet limited ways. Would the constant presence of fallibility be like a (fallibly) self-correcting mechanism?

(B) Inescapable fallibility would be like a debilitating illness which “feeds upon” itself. It would become ever more dangerous, as its impact is compounded by repeated use. This would badly lower the quality of one’s thinking. (For a model of that process, notice how easily instances of minor fallibility can interact so as to lead to major fallibility. For example, a sequence in which one slightly fallible piece of evidence after another is used as support for the next can end up providing very weak, overly fallible, support: [80%-probabilification × 80%-probabilification × 80%-probabilification × 80%-probabilification] comes to just under 41%-probabilification, since 0.8 × 0.8 × 0.8 × 0.8 = 0.4096.)
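
The compounding described in (B) can be made concrete with a small calculation. What follows is only a sketch, resting on a simplifying assumption that is not argued for in this article: that each link's "probabilification" can be treated as a probability and that the links multiply along the chain. The function name chained_support is hypothetical, introduced purely for illustration.

```python
# Illustrative sketch only. Assumption: each link's "probabilification" is
# treated as a probability in [0, 1], and support along a chain of evidence
# is modelled as the product of the individual links.

def chained_support(link_strengths):
    """Return the product of the per-link support values."""
    support = 1.0
    for strength in link_strengths:
        if not 0.0 <= strength <= 1.0:
            raise ValueError("each link strength must lie in [0, 1]")
        support *= strength
    return support


# Four links, each conferring 80% support on the next.
four_links = chained_support([0.8, 0.8, 0.8, 0.8])
print(f"{four_links:.4f}")  # prints 0.4096
```

On this simple model, four individually respectable links already leave the final claim with less than even odds, and every further link lowers the figure again. Whether real chains of evidence behave in this multiplicative way is, of course, part of what separates model (A) from model (B).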

How are we to choose between (A) and (B) — between the Limited Muscles model of fallibilism and the Debilitating Illness model of it?

Because most epistemologists are non-skeptics, they favor (A) — the Limited Muscles model. This is not to insist that thinking in an (A)-influenced way is bound to succeed against skeptical arguments. The point right now is simply that this way of thinking is one possible goal for an epistemologist. It is the goal of finding some means of successfully understanding and defending an instance of the Limited Muscles model. What is described by that model would be such a theorist’s desired way to conceive, if this is possible, of the general idea of inescapable fallibility. She will seek to conceive of inescapable fallibility as being manageable, even useful. Hence, the Limited Muscles model is a framework which — in extremely general terms — she will hope allows her to understand — in more specific terms — the nature and significance of fallibilism. Perhaps the most influential modern example of this approach was Quine’s (1969), centered upon a famous metaphor from Neurath (1959 [1932/33], sec. 201). That metaphor portrays human cognitive efforts as akin to a boat, afloat at sea. The boat has its own sorts of fallibility. It is subject to stresses and cracks. And how worrying is that? Must the boat sink whenever those weaknesses manifest themselves? No, because that is not how boats usually function. In general, repairs can be made. This may occur even while the boat is still at sea. Structurally, it is strong enough to support repairs to itself, even as it continues being used, even while making progress towards its destination. Neurath regarded cognitive progress as being like that — as did Quine, who further developed Neurath’s model. On what Quine called his “naturalized” conception of epistemology (a conception that many subsequent thinkers have sought to make more detailed and to apply more widely), human observation and reason make cognitive progress in spite of their fallibility. 
They do so, even when discovering their own fallibility — finding their own stresses and cracks. Must they then sink, floundering in futility? No. They continue being used, often while repairing their own stresses and cracks — reliably correcting their own deliverances and predictions. Section 5 asked whether science is an especially fallible method. As was also noted, though, science provides impressive results. Indeed, it was Quine’s favored example of large-scale cognitive progress. How can that occur? How can scientific claims — including so many striking ones — be justified, in spite of the fallibility that remains? Maybe science is like a ship that carries within it some skilled and imaginative artisans (carpenters, welders, electricians, and the like). Not only can it survive; it can become more grand and capable when being repaired at sea. (Even so, is such cognitive progress best described in probabilistic terms? On that possibility, implied by Humean fallibilism, see Howson 2000.)

Naturally, in contrast to that optimistic model for thinking about fallible justification, skeptics will prefer (B) — the Debilitating Illness model. We have examined (in sections 6 and 7) a couple of specific ways in which they might try to instantiate that general model. We have also seen (in sections 8 through 10) some reasons why those skeptics might not be right. Perhaps they overstate the force of fallibilism — inferring too much from the facts of fallibility. In any case, the present point is that skeptics (like non-skeptics) seek specific arguments in pursuit of a successful articulation and defense of an underlying picture of inescapable fallibility. Both skeptics and non-skeptics thereby search for an understanding of fallibilism’s nature and significance. They simply reach for opposed conceptions of what fallibilism implies about people’s ability to observe and to reason justifiably.

So, there is a substantial choice to be made; and each of us makes it, more or less carefully and consciously, when reflecting upon these topics. Which of those two basic interpretive directions, then, should we follow? The intellectual implications of this difficult choice are exhilaratingly deep.

11. References and Further Reading

  • Baldwin, T. G. E. Moore. London: Routledge, 1990. 226-32.
    • On Moore’s paradox.
  • Buckle, S. Hume’s Enlightenment Tract: The Unity and Purpose of An Enquiry Concerning Human Understanding. Oxford: Oxford University Press, 2001. Part 2, chapter 4.
    • On Hume’s famous skeptical reasoning in his first Enquiry.
  • Conee, E. and Feldman, R. Evidentialism: Essays in Epistemology. Oxford: Oxford University Press, 2004.
    • A traditional (and popular) approach to understanding the nature of epistemic justification.
  • Curley, E. M. Descartes against the Skeptics. Cambridge, Mass.: Harvard University Press, 1978.
    • On Descartes’ skeptical doubting.
  • Descartes, R. The Philosophical Works of Descartes, Vol. I, (eds. and trans.) E. S. Haldane and G. R. T. Ross. Cambridge: Cambridge University Press, 1911 [1641].
    • Contains both the Discourse and the Meditations. These include both the Evil Genius argument and the Cogito.
  • Feldman, R. “Fallibilism and Knowing That One Knows.” The Philosophical Review 90 (1981): 266-82.
    • On the nature and availability of fallible knowledge.
  • Gettier, E. L. “Is Justified True Belief Knowledge?” Analysis 23 (1963): 121-3.
    • The genesis of the Gettier Problem.
  • Goldman, A. I. “What is Justified Belief?” In G. S. Pappas (ed.), Justification and Knowledge: New Studies in Epistemology. Dordrecht: D. Reidel, 1979.
    • An influential analysis of the nature of epistemic justification.
  • Hetherington, S. Knowledge Puzzles: An Introduction to Epistemology. Boulder, Colo.: Westview Press, 1996.
    • Includes an overview of many of the commonly noticed difficulties posed by the Gettier problem for our attaining a full understanding of fallible knowledge.
  • Hetherington, S. “Knowing Failably.” Journal of Philosophy 96 (1999): 565-87.
    • Describes the genus of which fallible knowledge is a species.
  • Hetherington, S. “Fallibilism and Knowing That One Is Not Dreaming.” Canadian Journal of Philosophy 32 (2002): 83-102.
    • Shows how fallibilism need not lead to skepticism about knowledge.
  • Hetherington, S. “Concessive Knowledge-Attributions: Fallibilism and Gradualism.” Synthese 190 (2013): 2835-51.
    • A fallibilist interpretation of concessive knowledge-attributions (instances of the Self-Doubting Knowledge Claim).
  • Hetherington, S. Knowledge and the Gettier Problem. Cambridge: Cambridge University Press, 2016.
    • A critical analysis of the history of the Gettier Problem.
  • Hintikka, J. Knowledge and Belief: An Introduction to the Logic of the Two Notions. Ithaca, NY: Cornell University Press, 1962. ch. 5.
    • On the KK-thesis — that is, on knowing that one knows.
  • Howson, C. Hume’s Problem: Induction and the Justification of Belief. Oxford: Oxford University Press, 2000.
    • A technically detailed response to Hume’s fallibilist challenge to the possibility of inductively justified belief.
  • Hume, D. An Enquiry Concerning Human Understanding, in Hume’s Enquiries, (ed.) L. A. Selby-Bigge, 2nd edn. Oxford: Oxford University Press, 1902 [1748].
    • This includes, in section IV, the most generally cited version of Hume’s inductive fallibilism and inductive skepticism.
  • Kahneman, D., Slovic, P., and Tversky, A. (eds.). Judgment under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press, 1982.
    • On empirical evidence of people’s cognitive fallibilities.
  • Merricks, T. “More on Warrant’s Entailing Truth.” Philosophy and Phenomenological Research 57 (1997): 627-31.
    • Argues against the possibility of there being fallible knowledge.
  • Miller, D. Critical Rationalism: A Restatement and Defence. Chicago: Open Court, 1994.
    • Discusses many ideas (including a skepticism about epistemic justification) that might arise if fallibilism is true.
  • Morton, A. A Guide through the Theory of Knowledge, 3rd edn. Malden, Mass.: Blackwell, 2003. ch. 5.
    • On the basic idea, plus some possible forms, of fallibilism.
  • Nagel, T. The View from Nowhere. New York: Oxford University Press, 1986.
    • See especially chapters I and V. Discusses the interplay of different perspectives (“inner” and “outer” ones) that a person might seek upon herself, especially as greater objectivity is sought. (This bears upon section 9’s distinction between two possible kinds of question that can be asked about whether a particular belief is fallible knowledge.)
  • Neurath, O. “Protocol Sentences,” in A. J. Ayer (ed.), Logical Positivism. Glencoe, Ill.: The Free Press, 1959 [1932/33].
    • Includes the famous “boat at sea” metaphor.
  • Nisbett, R. and Ross, L. Human Inference: Strategies and Shortcomings of Social Judgment. Englewood Cliffs, NJ: Prentice-Hall, 1980.
    • On empirical evidence of people’s cognitive fallibilities.
  • Peirce, C. S. Collected Papers, (eds.) C. Hartshorne and P. Weiss. Cambridge, Mass.: Harvard University Press, 1931-60.
    • See, for example, 1.120, and 1.141 through 1.175, for some of Peirce’s originating articulation of the concept of fallibilism as such.
  • Plantinga, A. Warrant: The Current Debate. New York: Oxford University Press, 1993.
    • An analysis of some proposals as to what warrant might be within (fallible) knowledge.
  • Quine, W. V. “Epistemology Naturalized,” in Ontological Relativity and Other Essays. New York: Columbia University Press, 1969.
    • A bold and prominent statement of the program of naturalized epistemology, trying to understand fallibility as a part of, rather than a threat to, the justified uses of observation and reason.
  • Reed, B. “How to Think about Fallibilism.” Philosophical Studies 107 (2002): 143-57.
    • An attempt to define fallible knowledge.
  • Rescher, N. Scepticism: A Critical Reappraisal. Oxford: Blackwell, 1980.
    • On fallibilism and many associated skeptical issues about knowledge and justification.
  • Shope, R. K. The Analysis of Knowing: A Decade of Research. Princeton: Princeton University Press, 1983.
    • Presents much of the earlier history of attempts to solve the Gettier problem — and thereby to define fallible knowledge.
  • Sorensen, R. A. Blindspots. Oxford: Oxford University Press, 1988. ch. 1.
    • A philosophical analysis of the kinds of thought or sentence that constitute Moore’s paradox.
  • Stove, D. C. Probability and Hume’s Inductive Scepticism. Oxford: Oxford University Press, 1973.
    • Explains how Hume’s inductive fallibilism gives way to his inductive skepticism.
  • Williams, B. Descartes: The Project of Pure Enquiry. Hassocks: The Harvester Press, 1978.
    • Analysis of Descartes’ skeptical doubts.
  • Wilson, M. D. Descartes. London: Routledge & Kegan Paul, 1978.
    • Includes an account of Descartes’ skeptical endeavors.
  • Wittgenstein, L. Philosophical Investigations, (trans.) G. E. M. Anscombe. Oxford: Blackwell, 1978 [1953]. Sections 243-315, 348-412.
    • Presents the private language argument (against the possibility of anyone’s being able to think in a language which only they could understand).

Author Information

Stephen Hetherington
Email: s.hetherington@unsw.edu.au
University of New South Wales
Australia

Yang Xiong (53 B.C.E.—18 C.E.)

Yang Xiong (Yang Hsiung) was a prolific yet reclusive court poet whose writings and tragic life spanned the collapse of the Former Han dynasty (202 B.C.E.-9 C.E.) and the brief and catastrophic usurpation of the throne by the Imperial Regent Wang Mang (9-23 C.E.). He is best known for his assertion that human nature originally is neither good (as argued by Mencius) nor depraved (as argued by Xunzi) but rather comes into existence as a mixture of both. Yang Xiong’s chief philosophical writings – an abstruse book of divination known as the Tai xuan (The Great Dark Mystery) and his Fa yan (Words to Live By), a collection of aphorisms and dialogues on a variety of historical and philosophical topics – are little known even among Chinese scholars. These works reflect a Daoist concern for cosmology, but may be best described as a product of the intellectual and spiritual syncretism characteristic of the Han dynasty (202 B.C.E.-220 C.E.). As a social critic and classical scholar, he is considered to be the chief representative of the Old Text School (guxue) of Confucianism. Although some think he was one of the most important writers of the late Former Han, he had little influence during his own time and was vilified for his association with the usurper Wang Mang. Consequently, his works have largely been left out of the Confucian canon.

Table of Contents

  1. Life and Writings
  2. Intellectual Context
    1. Han Syncretism and Correlative Cosmology
    2. The Old Text / New Text Controversy
  3. Tai xuan (The Great Dark Mystery)
    1. Date and Significance
    2. The Influence of the Laozi and the Yijing
    3. Correlative Cosmology in the Tai xuan
  4. Fa yan (Words to Live By)
    1. Date and Significance
    2. The Influence of the Lunyu
    3. Syncretism in the Fa yan
    4. Old Text Themes in the Fa yan
    5. Political Philosophy in the Fa yan
    6. View of Human Nature
  5. Poetical Works
  6. References and Further Reading

1. Life and Writings

Yang Xiong was born in 53 B.C.E. in the western city of Chengdu in the province of Shu. His biography in the Qian Han Shu (History of the Former Han) remarks that Yang Xiong was fond of learning, was unconcerned with wealth, office, and reputation, and suffered from a speech impediment and consequently spoke little. As a youth he probably was a student of Zhuang Zun, a reclusive marketplace fortune teller who refused to take office, opting instead to use divination and fortune-telling as a means to encourage virtue among the common people. Before coming to the capital he gained renown for his poetic writings, in particular for his fu, a poetic genre associated with an earlier native of Shu, Sima Xiangru (179-117 B.C.E.). Yang Xiong’s reputation as a poet eventually reached the capital of Chang’an, and around 20 B.C.E. he was summoned to the court of Emperor Cheng. Between 14 and 10 B.C.E., Yang Xiong submitted several poetic pieces commemorating imperial sacrifices and hunts, and finally in 10 B.C.E. he was appointed to the humble office of “Gentleman in Attendance” and “Servitor at the Yellow Gate,” where he would remain until his final days. While not much is known of Yang Xiong’s activities as a lowly official at the Han court, it appears that, as early as 9 B.C.E., Emperor Cheng issued a decree excusing him from direct official service while allowing him to keep his official title, salary, and access to the imperial library.

Shortly after his appointment, Yang Xiong became disillusioned with the rectifying power of his poetry and stopped writing it for the court. Yang Xiong’s decision appears to have coincided with the death of his son, a tragedy which left him despondent and financially impoverished. Over the next two decades he produced his two works on philology: Cang Jie xun zuan (Annotations to the Cang Jie), a compilation of annotations to the Qin dynasty’s official imperial dictionary, and Fang yan (Dialects), a collection of regional expressions. During this period, he also produced his Tai xuan (The Great Dark Mystery), which he completed around 2 B.C.E., and Fa yan (Words to Live By), which he completed in 9 CE – right about the time that the Imperial Regent Wang Mang usurped the throne and established the brief Xin dynasty (9-23 CE).

Yang Xiong’s life and writings were overshadowed by the rise and fall of the notorious Wang Mang (45 B.C.E.-23 CE). A nephew of the wife of Emperor Yuan (who reigned 48-32 B.C.E.), Wang Mang rose to the rank of Imperial Regent. In 9 CE, through a combination of court intrigue, political machinations, manipulation of popular superstitions, and opportunity, he seized the throne from the founding House of Liu and declared himself the rightful possessor of the Mandate of Heaven. His short-lived Xin dynasty marks the dividing line between the Former or Western Han (202 B.C.E.-9 C.E.) and the Later or Eastern Han (25-220 CE) and, due to widespread rebellion and a series of natural catastrophes, is widely considered one of the most calamitous periods in Chinese history.

While little is known of Yang Xiong’s activities during his final years, his biography notes that, shortly after Wang Mang’s usurpation, Yang Xiong attempted suicide when he was named in a scandal involving one of his former students. He survived the attempt. When Wang Mang heard of it, he ordered all charges against Yang Xiong dropped, proclaiming that the poet had never been involved in any political affairs at court. His final work, Ju qin mei xin, appears to have been a controversial memorial presented to Wang Mang around 14 CE; its title is translated by Knechtges as Denigrating Qin and Praising Xin. Yang Xiong died four years later at the age of 71.

2. Intellectual Context

a. Han Syncretism and Correlative Cosmology

The focus of Yang Xiong’s writings during the middle years of his life is commonly seen as reflecting the Han trend toward syncretism and correlative cosmology. While the disunity of the Warring States period (475-221 B.C.E.) provided fertile soil for the flourishing of the “One Hundred Schools of Thought” (baijia), the unification brought about by the Qin (221-206 B.C.E.) and the Former Han dynasties provided the impetus for their coalescence. This combination of diverse views during the Qin and the Han periods can be seen in works such as the Lüshi chunqiu (The Spring and Autumn Annals of Mr. Lü) and the Huainanzi (The Master of Huainan), which blend various streams of ancient Chinese thought, including Daoism, Confucianism, Legalism, Huang-Lao thought, Militarism, Mohism, and yinyang and wuxing (Five Phase) thought.

Though Confucianism became the dominant and official school of thought in the Han, it borrowed heavily from earlier schools, particularly the yinyang and wuxing schools. The former explains all entities and events in terms of the interaction between two interdependent properties, yin (associated with darkness, passivity, and femininity) and yang (associated with light, activity, and masculinity). The latter takes a similar approach to understanding natural phenomena but includes the idea that “Five Phases” (each associated with metal, wood, water, fire, and earth, respectively) succeed one another in a never-ending cyclical process. The amalgamation of Confucianism, yinyang, and wuxing theory is especially evident in the writings of the scholar Dong Zhongshu (179-104 B.C.E.), whose Chunqiu fanlu (Luxuriant Dew of the Spring and Autumn Annals) illustrates a synthesis between Confucian ethics and an amalgam of yinyang and wuxing cosmology. Attempts to develop exhaustive systems of classification (leishu) were also common during this period and can be seen as part of the larger trend toward syncretization. These systems often use a Five Phase cosmological framework in which things are organized analogically on the basis of their relevant associations, rather than on the basis of some discrete essence. As can be seen in Yang Xiong’s Tai xuan, the correlations which form the basis of these classification systems can be bewildering – especially to anyone unfamiliar with the sorts of complex associations found in early Chinese culture.

b. The Old Text / New Text Controversy

Many historians of Chinese philosophy have identified Yang Xiong’s final and best-known work, the Fa yan (Words to Live By), as representative of a more rational and sober-minded form of Confucianism known as the Old Text School (guxue). In contrast to the New Text School, which relied on versions of the classics written in the simpler and officially recognized script of the Han dynasty known as “new script” (jinwen), the Old Text School relied on versions written in the archaic scripts (guwen) and characters of the Zhou dynasty (c. 1100-221 B.C.E.). Legend has it that these latter texts survived the book burnings of the Qin dynasty by lying concealed in the walls of the home of Confucius. Generally speaking, the Old Text School was associated with the simpler, more pragmatic philosophy of Confucius’s native state of Lu, while the New Text school was associated with the often fantastic writings of Zou Yan (305-240 B.C.E.), a native of Qi and founder of the yinyang and wuxing schools of thought.

Through much of the late Former Han dynasty, Confucianism was under the influence of the yinyang and wuxing theories promoted by New Text adherents. During this period, New Text scholars increasingly became interested in esoteric readings of the classics, cosmological speculation, and calamity and portent interpretation. The chief representatives of this period were classical scholars who commonly employed wuxing and yinyang correlations, numerical calculations, and various techniques of divination to fathom the harmony and continuity of humanity, nature, and the ancestral spirits – and to forecast disruptions.

By the reigns of the last Former Han Emperors, the use of yinyang and wuxing theory in interpreting the classics and the progress of history closely paralleled methods found in apocryphal oracle books and commentaries that treated the classics as fortune-telling handbooks and used reports of unusual phenomena not to boldly admonish the Emperor – as did Zou Yan and Dong Zhongshu – but to curry favor with those in power. This trend reached its climax with Wang Mang, whose rise to power and eventual usurpation was associated with, and to a large extent legitimated by, hundreds of favorable omens and the generous rewarding of those who reported them.

While scholars are divided on whether the Old Text School originated from Xunzi’s branch of Confucianism, most characterize this movement as a rational response to the excesses of the New Text school, whose influence had left the Han court and its scholars heavily dependent upon yinyang and wuxing thinking. More broadly, the Old Text school can be seen as a response to the often irrational and superstitious world of the late Former Han – a world that interpreted the classics as containing secret magical formulas and prognostications, was fascinated by talk of immortals, saw itself near the bottom in the historical cycle of rise and decline, and interpreted the passing of each childless Emperor and reports of calamities as portents to be dreaded.

3. Tai xuan (The Great Dark Mystery)

a. Date and Significance

Completed around 2 B.C.E., the Tai xuan is Yang Xiong’s longest and most difficult work. Few scholars have taken time to study it, and those who have often disagree about its import. Some scholars view the main focus of the text to be wuxing theory, others view its main focus to be the Five Constant Virtues (wuchang) of Confucianism, and still others view the Tai xuan as political satire of Wang Mang and other historical figures of the late Former Han. (See Michael Nylan’s 1993 translation and commentary of the Tai xuan.) While the Tai xuan is more a manual of divination than a philosophical treatise, it embodies a number of assumptions about the nature of the world, its cycles of transformation, and the central importance of timeliness in making one’s way in the world. Just as in his earlier poetry, in the Tai xuan Yang Xiong reiterates the view that success and failure do not all come down to individual effort but have much to do with the times and circumstances in which one lives, and that if one does not meet one’s proper time for acting, then one should retire or withdraw and wait for more advantageous times.

b. The Influence of the Laozi and the Yijing

The term xuan in the title is typically used in Chinese literature as a modifier to describe that which is dark, black, mysterious, profound, abstruse or hidden. Yang Xiong, however, uses the term xuan much like the term dao in the Laozi to refer to the hidden fountainhead or initial state out of which things emerge and the mysterious process through which they unfold. While Yang Xiong’s conception of xuan seems to be derived from the Laozi, the text of the Tai xuan is modeled on the Yijing (Book of Changes), certainly the most enigmatic philosophical document in early Chinese literature. Like the Yijing, the Tai xuan is a book of divination based on an evolving sequence of figures that, when taken together, map out the cycles of transformation underlying all things. In both texts, each figure-image-circumstance is articulated through an evolving series of statements that describes and appraises the unfolding of the situation and the meaning of the image. Appended to both the Yijing and the Tai xuan is a set of commentaries that elaborates on the inner meanings of their respective texts.

In some ways, the Tai xuan is even more complex than its model. While the Yijing is made up of 64 hexagrams, the Tai xuan is made up of 81 tetragrams. In the Yijing, each hexagram line can be solid or broken (representing the polarities of yin and yang). In the Tai xuan, each tetragram line can be solid, broken once, or broken twice (representing the triad of heaven, earth, and man), and each of the 81 tetragrams is correlated with, among other things, yin or yang, one of the “Five Phases,” a hexagram from the Yijing, a constellation, days of the calendar, and a musical note.
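The two totals follow directly from the number of states each line can take. As a minimal sketch in Python, assuming six lines per hexagram and four per tetragram (as the names imply) and using illustrative labels that are not drawn from either text:

```python
from itertools import product

# Illustrative labels only; neither text encodes its lines this way.
HEXAGRAM_LINE_STATES = ("solid", "broken")
TETRAGRAM_LINE_STATES = ("solid", "broken once", "broken twice")

def enumerate_figures(line_states, line_count):
    """Return every possible figure as a tuple of line states."""
    return list(product(line_states, repeat=line_count))

hexagrams = enumerate_figures(HEXAGRAM_LINE_STATES, 6)    # 2 ** 6 figures
tetragrams = enumerate_figures(TETRAGRAM_LINE_STATES, 4)  # 3 ** 4 figures

print(len(hexagrams), len(tetragrams))  # 64 81
```

With three possible line states instead of two, four lines already yield more figures than the six-line hexagrams do.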

c. Correlative Cosmology in the Tai xuan

In the Tai xuan, each tetragram is articulated through an evolving series of nine appraisals or judgments (whereas in the Yijing, each hexagram is articulated through a series of six line statements). These line appraisals unfold in a cyclical pattern corresponding to periods of time, the transformations of yin and yang, and a continuous cycle of commencement, maturity and decline. The appraisals can also be divided into those that address the commoner, the noble, and the Emperor.

The often obscure correlative-poetic organization of the images and their associated line appraisals can be seen in the Tai xuan commentary “Numbers of the Dark Mystery,” an example of the Han genre of classificatory works known as leishu. For example, “Numbers of the Dark Mystery” correlates the number five with the earth, the color yellow, fear, wind omens, tumuli, the naked animal (humankind), fur, bottles, weaving, sleeping mats, complying, verticality, glue, sacks, hubs, calves, coffins, bows and arrows, stupidity, and the center courtyard rain well. The basis of these associations is analogical: A is to B as C is to D. The organization scheme is fivefold. The five numerical categories (three and eight, four and nine, two and seven, one and six, and five) correspond to the five directions (east, west, south, north, center), the five phases (wood, metal, fire, water, earth), the seasons (spring, autumn, summer, winter, four seasons), the five colors (green, white, red, black, yellow), the five trades (carpentry, metal smithing, working with fire, waterworks, earth works), and the like.
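The fivefold scheme can be laid out as a simple lookup table. The Python sketch below is only an illustration: the pairing assumes that each list in the preceding paragraph is given in corresponding order, and the names `FIELDS`, `CATEGORIES`, and `correlates_of` are hypothetical.

```python
# Each row follows the order of the lists in the text:
# (numbers, direction, phase, season, color, trade).
# Assumption: the lists are given in corresponding order.
FIELDS = ("numbers", "direction", "phase", "season", "color", "trade")

CATEGORIES = [
    ((3, 8), "east", "wood", "spring", "green", "carpentry"),
    ((4, 9), "west", "metal", "autumn", "white", "metal smithing"),
    ((2, 7), "south", "fire", "summer", "red", "working with fire"),
    ((1, 6), "north", "water", "winter", "black", "waterworks"),
    ((5,), "center", "earth", "four seasons", "yellow", "earth works"),
]

def correlates_of(number):
    """Return the category containing `number` as a field -> value dict."""
    for row in CATEGORIES:
        if number in row[0]:
            return dict(zip(FIELDS, row))
    raise KeyError(f"{number} is not assigned to any of the five categories")

print(correlates_of(5)["color"])  # yellow
```

Looking up any number from one to nine returns its whole row, which mirrors the analogical reading described above: items belong together because they share a category, not because they share an essence.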

4. Fa yan (Words to Live By)

a. Date and Significance

Unlike Yang Xiong’s other works, the dating of the Fa yan is fairly certain. In the final passage of the text, there is a reference to Wang Mang as the Duke of Han. The fact that Wang Mang held this title from 1-9 CE implies that the Fa yan could not have been submitted after 9 CE when he took the title of Emperor. In Fa yan 13:34 there is a reference to the Han dynasty as having ruled for 210 years. If the founding of the Han is taken to be 202 B.C.E., then the passage would have been written no earlier than 8 CE. Whatever the date of completion, there is little doubt that the Fa yan was written during a period when Wang Mang held in his hands the reins of power and the destiny of his sovereign. It remains his best-known work.

b. The Influence of the Lunyu

In his autobiography, Yang Xiong notes that, just as he modeled his Tai xuan on the greatest of the classics, the Yijing, so he modeled his Fa yan on the text he saw as the greatest of the commentaries – the Confucian Lunyu (Analects). Like the Lunyu, the Fa yan consists of a series of aphorisms and dialogues on a wide variety of historical and philosophical topics. Also like the Lunyu, the language of the Fa yan is archaic, its style terse, and its organization puzzling. While the form, language, and style of the Fa yan all seem to be derived from the Lunyu, the two works are most similar in their underlying concerns.

Both the Lunyu and the Fa yan focus on the perennial Confucian theme of self-cultivation while emphasizing the importance of learning, friendship, role models, rites and music, and the human virtues. Both works look back to the ancient sage kings, the ways of the Zhou dynasty, and the teachings of the classics as models for their own troubled times. Each work has been read as a subtle attack on the predominant political powers. Finally, both the Lunyu and the Fa yan can be characterized as works of frustration that lament the political instability of their respective times, the tendency of princes and officials to overstep their roles, and the failure of Confucius (Kongzi) and Yang Xiong to gain recognition or to exercise political influence.

c. Syncretism in the Fa yan

Among the disjointed sayings and dialogues of the Fa yan, one finds a wide variety of topics and themes. As noted, the most central of these are the perennial Confucian themes: self-cultivation, learning, the natural tendencies, the human virtues, the value of the classics, rites and music, the princely person, the sage, ruling, filial responsibility, and so forth. One also finds in the Fa yan discussions of concepts and themes usually associated with Daoism such as dao (way), de (potency), ziran (spontaneity), wuwei (non-coercive action), minimizing desire, and withdrawing from public life. These topics are often explicated through discussions of an unusually broad assortment of historical figures, including poets, philosophers, rhetoricians, rulers, officials, generals, merchants, rebels, assassins, jesters, recluses, and others. These topics are similarly interpreted through discussions of historical events, such as the collapse of the Zhou dynasty, the intrigues of the Warring States, the rise of the Qin dynasty and its rapid fall, the struggle between Xiang Ji (233-202 B.C.E.) and the Han dynastic founder Liu Bang (247-195 B.C.E.), and the founding of the Han dynasty.

Also included among the numerous topics discussed in the Fa yan are more immediate concerns of the late Former Han. These include the assimilation of heterodox teachings and popular superstitions into commentaries and interpretations of the classics, the decline of the ruling house of Han, the popularity of portents and the rise of Wang Mang, and government reforms in taxation, punishment, division of land, and relations with barbarian tribes. Finally, there are sayings and dialogues which address the concerns of scholar officials living not only in the troubled late Former Han, but throughout much of China’s long history – the practicality and viability of the Confucian way of life, the vanity of the desires for wealth, office and renown, and the challenges of surviving and maintaining one’s integrity in a time of disorder.

d. Old Text Themes in the Fa yan

Throughout the Fa yan, Yang Xiong sets the tone for subsequent representatives of the Old Text School by repeatedly poking fun at questions on magic, immortals, spirits, omens and portents, and esoteric interpretations of the classics. Instead he redirects attention toward concerns directly affecting the living: wealth and poverty, gain and loss, glory and disgrace, success and failure, friendship, joy, integrity, the dangers of public office, ruling the Empire, fate and circumstance, fleeing the world, and death. While the Tai xuan might be described as a synthesis of the various schools of early Chinese thought, the Fa yan elevates the Confucian school above all the others. In aphorism after aphorism, the Fa yan praises Confucius and the classics as the standards, stresses the importance of learning, rites and music, the five virtues, the five relations, and filial responsibility, while at the same time offering sardonic remarks on Daoist, Legalist, and yinyang and wuxing thinkers and their doctrines.

e. Political Philosophy in the Fa yan

On governing, the Fa yan can be seen as advancing a Reformist position. While the literary world of the late Former Han is often explicated in terms of the New and the Old Text schools, the political world of this period is similarly explicated in terms of two opposing camps: Modernists who, like earlier Legalists, advocated policies that sought to enrich the wealth and power of the state through conquering border tribes, opening trade routes, and establishing government monopolies, and Reformists who accused Modernists of ignoring the welfare of the people and advocated instead for a more frugal form of government that emphasized retrenchment in foreign policy, abolition of government monopolies, and land reform. In the Fa yan, Yang Xiong aligns himself with the Reformists by speaking out against government monopolies and expensive military campaigns and voices support for an easing of heavy burdens on the populace and the reinstitution of Zhou dynasty practices and policies.

The Reformist tone of the Fa yan gives credence to the association of Yang Xiong with “the Usurper,” Wang Mang, which has become standard throughout generations of Chinese scholarship. While Wang Mang’s rise to power met with opposition and spurred a number of insurrections, he seems to have found support in the ranks of court scholars for his display of Confucian virtue and his attempts to reorganize the social institutions of the Han along the lines of the Zhou dynasty – the system of rites and institutions highly prized by Confucian scholars since the Warring States period. Some have even seen Wang Mang as genuine in his espousal of Confucian ideals and as a sincere believer that reviving the institutions and rites of the Zhou dynasty would lead to a period of great peace and harmony. The more typical view, dating back to the account of Ban Gu (32-92 CE) in the Qian Han Shu (History of the Former Han), portrays Wang Mang as an ambitious, duplicitous, and murderous charlatan who rebelled against his sovereign and left the Empire in ruins.

Little is known of Yang Xiong’s actual political leanings in the face of Wang Mang’s rise to power. Those who portray Yang Xiong as a Wang Mang partisan point to the fact that, when Wang Mang declared himself Emperor, Yang Xiong did not commit suicide or leave court to become a recluse as did many other Han officials. Yang Xiong’s defenders, however, point out that, in his earlier poetic works and in the Fa yan, he has a great deal to say – most of it critical – about men who, in the name of principle, committed suicide or fled to the mountains. As noted above, it appears that Yang Xiong preferred instead to follow his teacher Zhuang Zun – though not as a recluse among men, but as a recluse at court. Although the Fa yan was written during Wang Mang’s rise to power and apparently finished shortly before his usurpation, he is mentioned only once in it. Nonetheless, some read the text as an apology for Wang Mang’s usurpation and the Confucian reforms he attempted to institute. Others read the Fa yan as consisting of a number of cleverly veiled attacks on Wang Mang’s penchant for superstition, his insatiable ambition, and his pretense to being a humble Confucian.

Some passages of the Fa yan have been read as offering neither flattery nor ridicule but bold admonitions, counseling Wang Mang to remember his filial duties and to return the reins of power to the rightful ruler. For example, in Fa yan 8:21, there is a terse passage that reads, “The Red and Black Bows and Arrows do not amount to having it.” Centuries earlier the Imperial house of the Zhou dynasty awarded princes a set of bows and arrows as a symbol of investiture to punish all within their jurisdiction. In an attempt to follow this ancient tradition, a set of red and black bows and arrows was awarded to Wang Mang in 5 CE as part of the “Conferment of the Nine Distinctions” bestowed on him by ministers, officials, and scholars of the Han court. While commentators uniformly read the phrase “red and black bows and arrows” in Fa yan 8:21 as a reference to this award, they are divided over its meaning. While some see 8:21 as flattering praise, others see it as reminding Wang Mang that having been bestowed the honor of the “Red and Black Bows and Arrows” does not amount to the possession of the mandate.

The passage most frequently cited as evidence of Yang Xiong’s political leanings is found in Fa yan 13:34, where Wang Mang is compared to two of the greatest ministers in Chinese history: Zhou Gong (the Duke of Zhou, c. 12th century B.C.E.) and Yi Yin (c. 18th century B.C.E.). Given the location of this passage at the very end of the text, some have considered it to be a forgery. Others have seen it as a flattering endorsement of Wang Mang. The great Neo-Confucian philosopher Zhu Xi (1130-1200 CE), for example, reads this passage as lavish praise of Wang Mang’s achievements and, on the basis of it, dismisses Yang Xiong as “Wang Mang’s Grandee.” Still others have seen it as admonishing Wang Mang to be like Yi Yin and Zhou Gong before him and to return the reins of power to his rightful sovereign. It is important to point out that, like Wang Mang, both Yi Yin and Zhou Gong served as Imperial Regents. Like Yi Yin, Wang Mang stood in the wings through a series of short-lived reigns. As in the case of Yi Yin, it fell on Wang Mang to name a successor to the throne. Both Yi Yin and Wang Mang served as regents while their hand-picked successors lacked maturity. But while Yi Yin and Zhou Gong are remembered for handing back the reins of power, Wang Mang is popularly remembered in the chengyu (proverb) as one who “usurped the Han and named himself Emperor.”

f. View of Human Nature

As Wing-tsit Chan and others have pointed out, the view for which Yang Xiong has become most famous – that human nature is a mixture of good and evil – is articulated only in a single passage of the Fa yan (3:2) and is not elaborated any further:

Human nature is a muddle [hun] of good and evil tendencies. Cultivating the good tendencies makes a person good. Cultivating the evil ones makes a person depraved. This force [qi] – is it not like a horse that drives one towards good or evil?

This hardly amounts to the kind of sustained development of a view of human nature found, for example, in the work of Mencius or Xunzi, who represent opposite poles on the continuum of ancient Chinese views of human nature. Nonetheless, Yang Xiong’s view here, although undefended in philosophical terms, contradicts Mencius’ view that human nature originally is good and can only be warped (but never entirely destroyed) through neglect or negative influences. After Mencius’ view became the orthodox one among Confucians, especially during the Neo-Confucian movement of medieval and early modern China, Yang Xiong’s work came in for a great deal of criticism from Confucians. Thus, rather like Xunzi, Yang Xiong may be seen as something of a black sheep among early Confucians because of his deviation from what became Confucian orthodoxy in a later age.

5. Poetical Works

Before being summoned to court, Yang Xiong wrote a number of poetic pieces of which only one – Fan sao (Refuting Sorrow) – survives. As Yang Xiong explains in his autobiography, Fan sao was written in response to Li sao (Encountering Sorrow), a poem by the legendary Warring States poet Qu Yuan (340-278 B.C.E.). According to the Shiji (Historical Records) account, Qu Yuan served as a trusted official to King Huai of Chu, but, after he was slandered by a jealous minister, he fell from favor and was exiled. Qu Yuan desperately wished to return to the service of King Huai, but in the end he gave up hope and, after composing Li sao, drowned himself.

While Yang Xiong’s Fan sao is similar in style to Qu Yuan’s Li sao, its outlook is very different. Qu Yuan saw suicide as the only option left to persons of character living in a corrupt age. Yang Xiong, on the other hand, compares Qu Yuan’s response to failure in the political sphere with the response of Confucius. Unlike Qu Yuan, Confucius did not let his disappointments in searching for rulers who would employ him in “making good government” stop him from living a full life of travel, teaching, and writing. Here and in his later philosophical works, we find Yang Xiong maintaining that success and failure do not all come down to individual effort but have much to do with the times and circumstances in which one lives. If one does not meet one’s proper time for acting, then one should retire or withdraw and, like a snake or dragon, lie submerged or, like a phoenix, remain concealed and wait for more advantageous times.

While at court, Yang Xiong composed a number of primarily autobiographical poetic pieces in which he reflects on his poverty, lowly position, lack of recognition, and the ridicule and difficulties these frustrations have engendered. In Jie chao (Dissolving Ridicule), for example, Yang Xiong portrays himself as ridiculed for his low position and his failure to influence the court. In responding, Yang Xiong reiterates a familiar theme in his writings, arguing that in an age beset with chaos, it is better to remain silent and unknown since, as David R. Knechtges translates it, “those who grab for power die, and those who remain silent survive; those who reach the highest positions endanger their family, while those who maintain themselves intact survive.” In Zhu pin (Expelling Poverty), Yang Xiong expels an unwelcome guest named “Poverty” whose lingering presence in the poet’s life has labored his body and afflicted his health, cut him off from friends, and slowed his promotion in office. After listening to Yang Xiong vent, Poverty humbly agrees to leave, but first reminds Yang Xiong of the virtue of the impoverished sage Shun, warns him of the greed of the tyrants Jie and Zhi, and offers the consolation that it is only because of his privation that the poet is able to bear heat and cold, and to live freely with equanimity. Enlightened, Yang Xiong apologizes to Poverty and welcomes him as an honored guest.

Yang Xiong wrote several pieces in a genre known as fu, a term translated by Knechtges as “rhapsody.” Marked by its florid imagery and ecstatic tone, this genre was commonly employed by Han court officials as a means of offering indirect criticism and admonition to the Emperor. As Knechtges points out, most of the well known early writers of rhapsodies, such as Lu Jia (228-140 B.C.E.) and Jia Yi (200-168 B.C.E.), were not only poets but also scholar-officials who saw it as their duty to offer advice and remonstrance (jian) to rulers and did so through their poetic works. In the rhapsodies of later Former Han writers like Sima Xiangru, however, verbal decoration and entertainment took precedence over instruction and admonition.

In his early years at the court of Emperor Cheng, Yang Xiong submitted a number of rhapsodies. At first glance, these works appear to be little more than ornate, fanciful, and flattering descriptions of Imperial spectacles. In Fa yan (Words to Live By) and in the autobiographical section of his biography, however, Yang Xiong stresses that, like earlier poets, he envisioned the primary purpose of these works to be remonstrance – a dangerous political task widely recognized as one of the most central duties of the Confucian scholar. While, on the surface, Yang Xiong’s rhapsodies heap lavish praise on the Emperor, they also contain stern reprimands and warnings. For example, within the fanciful descriptions of Imperial grandeur found in the Ganquan fu (Sweet Springs Rhapsody), Yang Xiong indirectly admonishes Emperor Cheng to be more solemn in conducting affairs, suggesting through allusion that, as with the lascivious tyrant kings Jie and Zhou, Emperor Cheng’s wanton conduct would lead to his downfall. In the Jiaolie fu (Barricade Hunt Rhapsody) and the Changyang fu (Changyang Palace Rhapsody), both of which commemorate imperial hunts, Yang Xiong indirectly criticizes the hunts as lavish, wasteful spectacles that burden the peasants and destroy their farms and farmlands. In his later writings, Yang Xiong claims that he eventually came to see the ornate style of rhapsody as excessive and, realizing that the moral admonitions he tried to provide had gone unheeded (if not unnoticed), renounced it. He never gave up writing poetry altogether, however.

6. References and Further Reading

There are very few published studies of Yang Xiong in English. Of these, Nylan’s pioneering translation and commentary of the Tai Xuan (1993) is the most complete account of Yang Xiong’s philosophy, while Knechtges’s studies of Yang Xiong’s fu poetry (1976, 1977) and his Qian Han Shu biography (1982) offer superb translations and interpretations of Yang Xiong’s life and literary works. Colvin (2001) provides a translation of the Fa yan and an examination of the seemingly haphazard organization of its aphorisms and dialogues. For a fuller understanding of Yang Xiong’s thought, readers are encouraged to explore the more general accounts of the literary, intellectual, and political contexts of the Former Han dynasty in Bielenstein (1984), Feng (1953), Loewe (1974, 1986), Thomsen (1988), Xiao (1979), and Yu (1967).

  • Bielenstein, Hans. “Han Portents and Prognostications.” Museum of Far Eastern Antiquities 56 (1984): 97-112.
  • Chan, Wing-tsit. “Taoistic Confucianism: Yang Hsiung.” In A Source Book in Chinese Philosophy, ed. Wing-tsit Chan (Princeton: Princeton University Press, 1963), 289-291.
  • Colvin, Andrew. Patterns of Coherence in Yang Xiong’s Fa Yan. Ph.D. dissertation, University of Hawaii at Manoa, 2001.
  • Doeringer, Franklin M. Yang Xiong and his Formulation of a Classicism. Ph.D. dissertation, Columbia University, 1971.
  • Feng, Yulan. A History of Chinese Philosophy, Vol. 2: The Period of Classical Learning. Trans. Derk Bodde. Princeton: Princeton University Press, 1953.
  • Knechtges, David R. The Han Rhapsody: A Study of the Fu of Yang Xiong (53 B.C.–A.D. 18). Cambridge: Cambridge University Press, 1976.
  • Knechtges, David R. “Uncovering the Sauce Jar: A Literary Interpretation of Yang Hsiung’s Chu ch’in mei Hsin.” In Ancient China: Studies in Early Civilization, eds. David T. Roy et al. (Hong Kong: Chinese University Press, 1977), 229-252.
  • Knechtges, David R. “The Liu Hsin/Yang Hsiung Correspondence on the Fang Yen.” Monumenta Serica 33 (1977): 309-325.
  • Knechtges, David R. The Han Shu Biography of Yang Xiong (53 B.C. to A.D. 18). Tempe: Arizona State University Press, 1982.
  • Loewe, Michael. Crisis and Conflict in Han China 104 B.C. to A.D. 9. London: George Allen and Unwin, 1974.
  • Nylan, Michael. The Canon of Supreme Mystery by Yang Xiong: A Translation with Commentary of the T’ai Hsüan Ching. Albany: State University of New York Press, 1993.
  • Nylan, Michael. “Han Classicists Writing in Dialogue about their Own Tradition.” Philosophy East & West 47/2 (1996): 133-188.
  • Thomsen, Rudi. Ambition and Confucianism: A Biography of Wang Mang. Aarhus: Aarhus University Press, 1988.
  • Twitchett, Denis, and Michael Loewe, eds. The Cambridge History of China, Vol. 1: The Ch’in and Han Empires, 221 B.C. – A.D. 220. Cambridge: Cambridge University Press, 1986.
  • Xiao, Gongquan. A History of Chinese Political Thought, Vol. 1: From the Beginnings to the Sixth Century A.D. Trans. F.W. Mote. Princeton: Princeton University Press, 1979.
  • Yu, Yingshi. Trade and Expansion in Han China. Berkeley: University of California Press, 1967.

Author Information

Andrew Colvin
Email: andrew.colvin@sru.edu
Slippery Rock University
U. S. A.

Gettier Problems

Gettier problems or cases are named in honor of the American philosopher Edmund Gettier, who discovered them in 1963. They function as challenges to the philosophical tradition of defining knowledge of a proposition as justified true belief in that proposition. The problems are actual or possible situations in which someone has a belief that is both true and well supported by evidence, yet which (according to almost all epistemologists) fails to be knowledge. Gettier’s original article had a dramatic impact, as epistemologists began trying to ascertain afresh what knowledge is, with almost all agreeing that Gettier had refuted the traditional definition of knowledge. They have made many attempts to repair or replace that traditional definition of knowledge, resulting in several new conceptions of knowledge and of justificatory support. In this respect, Gettier sparked a period of pronounced epistemological energy and innovation, all with a single two-and-a-half page article. There is no consensus, however, that any one of the attempts to solve the Gettier challenge has succeeded in fully defining what it is to have knowledge of a truth or fact. So, the force of that challenge continues to be felt in various ways, and to various extents, within epistemology. Sometimes, the challenge is ignored in frustration at the existence of so many possibly failed efforts to solve it. Often, the assumption is made that somehow it can (and will, one of these days) be solved. Usually, it is agreed that the challenge shows something about knowledge, even if not all epistemologists concur as to exactly what it shows.

Table of Contents

  1. Introduction
  2. The Justified-True-Belief Analysis of Knowledge
  3. Gettier’s Original Challenge
  4. Some other Gettier Cases
  5. The Basic Structure of Gettier Cases
  6. The Generality of Gettier Cases
  7. Attempted Solutions: Infallibility
  8. Attempted Solutions: Eliminating Luck
  9. Attempted Solutions: Eliminating False Evidence
  10. Attempted Solutions: Eliminating Defeat
  11. Attempted Solutions: Eliminating Inappropriate Causality
  12. Attempted Dissolutions: Competing Intuitions
  13. Attempted Dissolutions: Knowing Luckily
  14. Gettier Cases and Analytic Epistemology
  15. References and Further Reading

1. Introduction

Gettier problems or cases arose as a challenge to our understanding of the nature of knowledge. Initially, that challenge appeared in an article by Edmund Gettier, published in 1963. But his article had a striking impact among epistemologists, so much so that hundreds of subsequent articles and sections of books have generalized Gettier’s original idea into a more wide-ranging concept of a Gettier case or problem, where instances of this concept might differ in many ways from Gettier’s own cases. Philosophers swiftly became adept at thinking of variations on Gettier’s own particular cases; and, over the years, this fecundity has been taken to render his challenge even more significant. This is especially so, given that there has been no general agreement on how to solve the challenge posed by Gettier cases as a group — Gettier’s own ones or those that other epistemologists have observed or imagined. (Note that sometimes this general challenge is called the Gettier problem.) What, then, is the nature of knowledge? And can we rigorously define what it is to know? Gettier’s article gave to these questions a precision and urgency that they had formerly lacked. The questions are still being debated — more or less fervently at different times — within post-Gettier epistemology.

2. The Justified-True-Belief Analysis of Knowledge

Gettier cases are meant to challenge our understanding of propositional knowledge. This is knowledge which is described by phrases of the form “knowledge that p,” with “p” being replaced by some indicative sentence (such as “Kangaroos have no wings”). It is knowledge of a truth or fact — knowledge of how the world is in whatever respect is being described by a given occurrence of “p”. Usually, when epistemologists talk simply of knowledge they are referring to propositional knowledge. It is a kind of knowledge which we attribute to ourselves routinely and fundamentally.

Hence, it is philosophically important to ask what, more fully, such knowledge is. If we do not fully understand what it is, will we not fully understand ourselves either? That is a possibility, as philosophers have long realized. Those questions are ancient ones; in his own way, Plato asked them.

And, prior to Gettier’s challenge, different epistemologists would routinely have offered in reply some more or less detailed and precise version of the following generic three-part analysis of what it is for a person to have knowledge that p (for any particular “p”):

  1. Belief. The person believes that p. This belief might be more or less confident. And it might — but it need not — be manifested in the person’s speech, such as by her saying that p or by her saying that she believes that p. All that is needed, strictly speaking, is for her belief to exist (while possessing at least the two further properties that are about to be listed).
  2. Truth. The person’s belief that p needs to be true. If it is incorrect instead, then — no matter what else is good or useful about it — it is not knowledge. It would only be something else, something lesser. Admittedly, even when a belief is mistaken it can feel to the believer as if it is true. But in that circumstance the feeling would be mistaken; and so the belief would not be knowledge, no matter how much it might feel to the believer like knowledge.
  3. Justification. The person’s belief that p needs to be well supported, such as by being based upon some good evidence or reasoning, or perhaps some other kind of rational justification. Otherwise, the belief, even if it is true, may as well be a lucky guess. It would be correct without being knowledge. It would only be something else, something lesser.

Supposedly (on standard pre-Gettier epistemology), each of those three conditions needs to be satisfied, if there is to be knowledge; and, equally, if all are satisfied together, the result is an instance of knowledge. In other words, the analysis presents what it regards as being three individually necessary, and jointly sufficient, kinds of condition for having an instance of knowledge that p.
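The logical shape of that claim can be set out schematically. The notation below is an illustrative assumption rather than part of the traditional analysis itself: K, B, and J abbreviate “S knows that p,” “S believes that p,” and “S is justified in believing that p.”

```latex
\[
  K_S(p) \;\leftrightarrow\; \bigl( B_S(p) \,\wedge\, p \,\wedge\, J_S(p) \bigr)
\]
% Left to right: each of belief, truth, and justification is individually necessary.
% Right to left: the three conditions together are jointly sufficient.
```

Read from left to right, the biconditional says that knowing requires each of the three conditions; read from right to left, it says that nothing further is required once all three are met.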

The analysis is generally called the justified-true-belief form of analysis of knowledge (or, for short, JTB). For instance, your knowing that you are a person would be your believing (as you do) that you are one, along with this belief’s being true (as it is) and its resting (as it does) upon much good evidence. That evidence will probably include such matters as your having been told that you are a person, your having reflected upon what it is to be a person, your seeing relevant similarities between yourself and other persons, and so on.

It is important to bear in mind that JTB, as presented here, is a generic analysis. It is intended to describe a general structuring which can absorb or generate comparatively specific analyses that might be suggested, either of all knowledge at once or of particular kinds of knowledge. It provides a basic outline (a form) of a theory. In practice, epistemologists would suggest further details, while respecting that general form. So, even when particular analyses suggested by particular philosophers at first glance seem different from JTB, these analyses can simply be more specific instances or versions of that more general form of theory.

Probably the most common way for this to occur involves the specific analyses incorporating, in turn, further analyses of some or all of belief, truth, and justification. For example, some of the later sections in this article may be interpreted as discussing attempts to understand justification more precisely, along with how it functions as part of knowledge. In general, the goal of such attempts can be that of ascertaining aspects of knowledge’s microstructure, thereby rendering the general theory JTB as precise and full as it needs to be in order genuinely to constitute an understanding of particular instances of knowing and of not knowing. Steps in that direction by various epistemologists have tended to be more detailed and complicated after Gettier’s 1963 challenge than had previously been the case. Roderick Chisholm (1966/1977/1989) was an influential exemplar of the post-1963 tendency; A. J. Ayer (1956) famously exemplified the pre-1963 approach.

3. Gettier’s Original Challenge

Gettier’s article described two possible situations. This section presents his Case I. (It is perhaps the more widely discussed of the two. The second will be mentioned in the next section.) Subsequent sections will use this Case I of Gettier’s as a focal point for analysis.

The case’s protagonist is Smith. He and Jones have applied for a particular job. But Smith has been told by the company president that Jones will win the job. Smith combines that testimony with his observational evidence of there being ten coins in Jones’s pocket. (He had counted them himself — an odd but imaginable circumstance.) And he proceeds to infer that whoever will get the job has ten coins in their pocket. (As the present article proceeds, we will refer to this belief several times more. For convenience, therefore, let us call it belief b.) Notice that Smith is not thereby guessing. On the contrary; his belief b enjoys a reasonable amount of justificatory support. There is the company president’s testimony; there is Smith’s observation of the coins in Jones’s pocket; and there is Smith’s proceeding to infer belief b carefully and sensibly from that other evidence. Belief b is thereby at least fairly well justified — supported by evidence which is good in a reasonably normal way. As it happens, too, belief b is true — although not in the way in which Smith was expecting it to be true. For it is Smith who will get the job, and Smith himself has ten coins in his pocket. These two facts combine to make his belief b true. Nevertheless, neither of those facts is something that, on its own, was known by Smith. Is his belief b therefore not knowledge? In other words, does Smith fail to know that the person who will get the job has ten coins in his pocket? Surely so (thought Gettier).

That is Gettier’s Case I, as it was interpreted by him, and as it has subsequently been regarded by almost all other epistemologists. The immediately pertinent aspects of it are standardly claimed to be as follows. It contains a belief which is true and justified — but which is not knowledge. And if that is an accurate reading of the case, then JTB is false. Case I would show that it is possible for a belief to be true and justified without being knowledge. Case I would have established that the combination of truth, belief, and justification does not entail the presence of knowledge. In that sense, a belief’s being true and justified would not be sufficient for its being knowledge.
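The structure of Case I can be made concrete with a small model. The following Python sketch is only an illustration: the class name, field names, and the two “worlds” are hypothetical devices for comparing the situation as Smith’s evidence presents it with the situation as the case stipulates it. It shows that belief b comes out true in both, while the person who makes it true differs.

```python
from dataclasses import dataclass

@dataclass
class World:
    """A hypothetical, simplified description of the job situation."""
    job_winner: str
    coins_in_pocket: dict  # person -> number of coins in that person's pocket

def belief_b(world):
    """Belief b: whoever will get the job has ten coins in their pocket."""
    return world.coins_in_pocket[world.job_winner] == 10

def truth_maker(world):
    """The person whose winning the job and coin count settle whether b is true."""
    return world.job_winner

# The situation as Smith's evidence presents it: Jones wins, Jones has ten coins.
as_smith_takes_it = World("Jones", {"Jones": 10})

# The situation as the case stipulates: Smith wins, and Smith also has ten coins.
actual = World("Smith", {"Jones": 10, "Smith": 10})

print(belief_b(as_smith_takes_it), truth_maker(as_smith_takes_it))
print(belief_b(actual), truth_maker(actual))
```

On this toy model, the belief is true in the actual situation, but not because of the facts about Jones on which Smith’s justification rests.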

But if JTB is false as it stands, with what should it be replaced? (Gettier himself made no suggestions about this.) Its failing to describe a jointly sufficient condition of knowing does not entail that the three conditions it does describe are not individually necessary to knowing. And if each of truth, belief, and justification is needed, then what aspect of knowledge is still missing? What feature of Case I prevents Smith’s belief b from being knowledge? What is the smallest imaginable alteration to the case that would allow belief b to become knowledge? Would we need to add some wholly new kind of element to the situation? Or is JTB false only because it is too general — too unspecific? For instance, are only some kinds of justification both needed and enough, if a true belief is to become knowledge? Must we describe more specifically how justification ever makes a true belief knowledge? Is Smith’s belief b justified in the wrong way, if it is to be knowledge?

4. Some other Gettier Cases

Having posed those questions, though, we should realize that they are merely representative of a more general epistemological line of inquiry. The epistemological challenge is not just to discover the minimal repair that we could make to Gettier’s Case I, say, so that knowledge would then be present. Rather, it is to find a failing — a reason for a lack of knowledge — that is common to all Gettier cases that have been, or could be, thought of (that is, all actual or possible cases relevantly like Gettier’s own ones). Only thus will we be understanding knowledge in general — all instances of knowledge, everyone’s knowledge. And this is our goal when responding to Gettier cases.

Sections 7 through 11 will present some attempted diagnoses of such cases. In order to evaluate them, therefore, it would be advantageous to have some sense of the apparent potential range of the concept of a Gettier case. I will mention four notable cases.

The lucky disjunction (Gettier’s second case: 1963). Again, Smith is the protagonist. This time, he possesses good evidence in favor of the proposition that Jones owns a Ford. Smith also has a friend, Brown. Where is Brown to be found at the moment? Smith does not know. Nonetheless, on the basis of his accepting that Jones owns a Ford, he infers — and accepts — each of these three disjunctive propositions:

  • Either Jones owns a Ford, or Brown is in Boston.
  • Either Jones owns a Ford, or Brown is in Barcelona.
  • Either Jones owns a Ford, or Brown is in Brest-Litovsk.

No insight into Brown’s location guides Smith in any of this reasoning. He realizes that he has good evidence for the first disjunct (regarding Jones) in each of those three disjunctions, and he sees this evidence as thereby supporting each disjunction as a whole. Seemingly, he is right about that. (These are inclusive disjunctions, not exclusive. That is, each can, if need be, accommodate the truth of both of its disjuncts. Each is true if even one — let alone both — of its disjuncts is true.) Moreover, in fact one of the three disjunctions is true (albeit in a way that would surprise Smith if he were to be told of how it is true). The second disjunction is true because, as good luck would have it, Brown is in Barcelona — even though, as bad luck would have it, Jones does not own a Ford. (As it happened, the evidence for his doing so, although good, was misleading.) Accordingly, Smith’s belief that either Jones owns a Ford or Brown is in Barcelona is true. And there is good evidence supporting — justifying — it. But is it knowledge?
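The behavior of the three inclusive disjunctions can be checked mechanically. The Python sketch below is an illustration under stated assumptions: the dictionary keys and function names are hypothetical, and the facts are simply those stipulated in the case (Jones does not own a Ford; Brown is in Barcelona).

```python
# Facts as stipulated in the case (the key names are illustrative assumptions).
actual_facts = {
    "jones_owns_ford": False,
    "brown_location": "Barcelona",
}

def disjunction_is_true(facts, city):
    """Inclusive 'or': true if at least one of the two disjuncts is true."""
    return facts["jones_owns_ford"] or facts["brown_location"] == city

def true_disjuncts(facts, city):
    """List which disjunct(s) actually make the disjunction true."""
    makers = []
    if facts["jones_owns_ford"]:
        makers.append("Jones owns a Ford")
    if facts["brown_location"] == city:
        makers.append(f"Brown is in {city}")
    return makers

cities = ["Boston", "Barcelona", "Brest-Litovsk"]
results = {city: disjunction_is_true(actual_facts, city) for city in cities}

for city in cities:
    print(f"Either Jones owns a Ford, or Brown is in {city}:",
          results[city], true_disjuncts(actual_facts, city))
```

Only the Barcelona disjunction is true, and it is made true solely by the second disjunct, about which Smith has no evidence at all; the disjunct his evidence supports is false.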

The sheep in the field (Chisholm 1966/1977/1989). Imagine that you are standing outside a field. You see, within it, what looks exactly like a sheep. What belief instantly occurs to you? Among the many that could have done so, it happens to be the belief that there is a sheep in the field. And in fact you are right, because there is a sheep behind the hill in the middle of the field. You cannot see that sheep, though, and you have no direct evidence of its existence. Moreover, what you are seeing is a dog, disguised as a sheep. Hence, you have a well justified true belief that there is a sheep in the field. But is that belief knowledge?

The pyromaniac (Skyrms 1967). A pyromaniac reaches eagerly for his box of Sure-Fire matches. He has excellent evidence of the past reliability of such matches, as well as of the present conditions — the clear air and dry matches — being as they should be, if his aim of lighting one of the matches is to be satisfied. He thus has good justification for believing, of the particular match he proceeds to pluck from the box, that it will light. This is what occurs, too: the match does light. However, what the pyromaniac did not realize is that there were impurities in this specific match, and that it would not have lit if not for the sudden (and rare) jolt of Q-radiation it receives exactly when he is striking it. His belief is therefore true and well justified. But is it knowledge?

The fake barns (Goldman 1976). Henry is driving in the countryside, looking at objects in fields. He sees what looks exactly like a barn. Accordingly, he thinks that he is seeing a barn. Now, that is indeed what he is doing. But what he does not realize is that the neighborhood contains many fake barns — mere barn facades that look like real barns when viewed from the road. And if he had been looking at one of them, he would have been deceived into believing that he was seeing a barn. Luckily, he was not doing this. Consequently, his belief is justified and true. But is it knowledge?

In none of those cases (or relevantly similar ones), say almost all epistemologists, is the belief in question knowledge. (Note that some epistemologists do not regard the fake barns case as being a genuine Gettier case. There is a touch of vagueness in the concept of a Gettier case.)

5. The Basic Structure of Gettier Cases

Although the multitude of actual and possible Gettier cases differ in their details, some characteristics unite them. For a start, each Gettier case contains a belief which is true and well justified without — according to epistemologists as a whole — being knowledge. The following two generic features also help to constitute Gettier cases:

  1. Fallibility. The justification that is present within each case is fallible. Although it provides good support for the truth of the belief in question, that support is not perfect, strictly speaking. This means that the justification leaves open at least the possibility of the belief’s being false. The justification indicates strongly that the belief is true — without proving conclusively that it is.
  2. Luck. What is most distinctive of Gettier cases is the luck they contain. Within any Gettier case, in fact the well-but-fallibly justified belief in question is true. Nevertheless, there is significant luck in how the belief manages to combine being true with being justified. Some abnormal or odd circumstance is present in the case, a circumstance which makes the existence of that justified and true belief quite fortuitous.

Here is how those two features, (1) and (2), are instantiated in Gettier’s Case I. Smith’s evidence for his belief b was good but fallible. This left open the possibility of belief b being mistaken, even given that supporting evidence. As it happened, that possibility was not realized: Smith’s belief b was actually true. Yet this was due to the intervention of some good luck. Belief b could easily have been false; it was made true only by circumstances which were hidden from Smith. That is, belief b was in fact made true by circumstances (namely, Smith’s getting the job and there being ten coins in his pocket) other than those which Smith had noticed and which his evidence indicated as being a good enough reason for holding b to be true. What Smith thought were the circumstances (concerning Jones) making his belief b true were nothing of the sort. Luckily, though, some facts of which he had no inkling were making his belief true.

Similar remarks pertain to the sheep-in-the-field case. Within it, your sensory evidence is good. You rely on your senses, taking for granted — as one normally would — that the situation is normal. Then, by standard reasoning, you gain a true belief (that there is a sheep in the field) on the basis of that fallible-but-good evidence. Nonetheless, wherever there is fallibility there is a chance of being mistaken — of gaining a belief which is false. And that is exactly what would have occurred in this case (given that you are actually looking at a disguised dog) — if not, luckily, for the presence behind the hill of the hidden real sheep. Only luckily, therefore, is your belief both justified and true. And because of that luck (say epistemologists in general), the belief fails to be knowledge.
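The way fallibility and luck combine in the sheep-in-the-field case can be sketched in a few lines of Python. This is a toy model only: the class, its fields, and the comparison scenario without a hidden sheep are hypothetical constructions for illustration. The point it displays is that the very same sensory evidence is compatible with the belief being true and with its being false.

```python
from dataclasses import dataclass

@dataclass
class Field:
    """A hypothetical, simplified description of what is in the field."""
    visible_animal: str   # what is actually standing in view
    looks_like: str       # how that animal appears to the observer
    hidden_sheep: bool    # whether a real sheep is behind the hill

def evidence(field):
    """The observer's sensory evidence: only the appearance is available."""
    return field.looks_like

def sheep_in_field(field):
    """Truth condition for the belief 'there is a sheep in the field'."""
    return field.visible_animal == "sheep" or field.hidden_sheep

# The case as described: a disguised dog in view, a real sheep behind the hill.
actual = Field("dog", "sheep", hidden_sheep=True)

# A comparison scenario (an assumption for illustration): no sheep behind the hill.
no_hidden_sheep = Field("dog", "sheep", hidden_sheep=False)

print(evidence(actual) == evidence(no_hidden_sheep))  # the evidence is the same
print(sheep_in_field(actual))                         # the belief is true
print(sheep_in_field(no_hidden_sheep))                # the belief would be false
```

Because the evidence does not distinguish the two scenarios, the truth of the belief in the actual scenario depends on a fact (the hidden sheep) that plays no part in the observer’s justification.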

6. The Generality of Gettier Cases

JTB says that any actual or possible case of knowledge that p is an actual or possible instance of some kind of well justified true belief that p — and that any actual or possible instance of some kind of well justified true belief that p is an actual or possible instance of knowledge that p. Hence, JTB is false if there is even one actual or possible Gettier situation (in which some justified true belief fails to be knowledge). Accordingly, since 1963 epistemologists have tried — again and again and again — to revise or repair or replace JTB in response to Gettier cases. The main aim has been to modify JTB so as to gain a ‘Gettier-proof’ definition of knowledge.

How extensive would such repairs need to be? After all, even if some justified true beliefs arise within Gettier situations, not all do so. In practice, such situations are rare, with few of our actual justified true beliefs ever being “Gettiered.” Has Gettier therefore shown only that not all justified true beliefs are knowledge? Correlatively, might JTB be almost correct as it is, in the sense of being accurate about almost all actual or possible cases of knowledge?

On the face of it, Gettier cases do indeed show only that not all actual or possible justified true beliefs are knowledge — rather than that a belief’s being justified and true is never enough for its being knowledge. Nevertheless, epistemologists generally report the impact of Gettier cases in the latter way, describing them as showing that being justified and true is never enough to make a belief knowledge. Why do epistemologists interpret the Gettier challenge in that stronger way?

The reason is that they wish — by way of some universally applicable definition or formula or analysis — to understand knowledge in all of its actual or possible instances and manifestations, not only in some of them. Hence, epistemologists strive to understand how to avoid ever being in a Gettier situation (from which knowledge will be absent, regardless of whether such situations are uncommon). But that goal is, equally, the aim of understanding what it is about most situations that constitutes their not being Gettier situations. If we do not know what, exactly, makes a situation a Gettier case and what changes to it would suffice for its no longer being a Gettier case, then we do not know how, exactly, to describe the boundary between Gettier cases and other situations.

We call various situations in which we form beliefs “everyday” or “ordinary,” for example. In particular, therefore, we might wonder whether all “normally” justified true beliefs are still instances of knowledge (even if in Gettier situations the justified true beliefs are not knowledge). Yet even that tempting idea is not as straightforward as we might have assumed. For do we know what it is, exactly, that makes a situation ordinary? Specifically, what are the details of ordinary situations that allow them not to be Gettier situations — and hence that allow them to contain knowledge? To the extent that we do not understand what it takes for a situation not to be a Gettier situation, we do not understand what it takes for a situation to be a normal one (thereby being able to contain knowledge). Understanding Gettier situations would be part of understanding non-Gettier situations — including ordinary situations. Until we adequately understand Gettier situations, we do not adequately understand ordinary situations — because we would not adequately understand the difference between these two kinds of situation.

7. Attempted Solutions: Infallibility

To the extent that we understand what makes something a Gettier case, we understand what would suffice for that situation not to be a Gettier case. Section 5 outlined two key components — fallibility and luck — of Gettier situations. In this section and the next, we will consider whether removing one of those two components — the removal of which will suffice for a situation’s no longer being a Gettier case — would solve Gettier’s epistemological challenge. That is, we will be asking whether we may come to understand the nature of knowledge by recognizing its being incompatible with the presence of at least one of those two components (fallibility and luck).

There is a prima facie case, at any rate, for regarding justificatory fallibility with concern in this setting. So, let us examine the Infallibility Proposal for solving Gettier’s challenge. There have long been philosophers who doubt (independently of encountering Gettier cases) that allowing fallible justification is all that it would take to convert a true belief into knowledge. (“If you know that p, there must have been no possibility of your being mistaken about p,” they might say.) The classic philosophical expression of that sort of doubt was by René Descartes, most famously in his Meditations on First Philosophy (1641). Contemporary epistemologists who have voiced similar doubts include Keith Lehrer (1971) and Peter Unger (1971). In the opinion of epistemologists who embrace the Infallibility Proposal, we can eliminate Gettier cases as challenges to our understanding of knowledge, simply by refusing to allow that one’s having fallible justification for a belief that p could ever adequately satisfy JTB’s justification condition. Stronger justification than that is required within knowledge (they will claim); infallibilist justificatory support is needed. (They might even say that there is no justification present at all, let alone an insufficient amount of it, given the fallibility within the cases.)

Thus, for instance, an infallibilist about knowledge might claim that because (in Case I) Smith’s justification provided only fallible support for his belief b, this justification was always leaving open the possibility of that belief being mistaken — and that this is why the belief is not knowledge. The infallibilist might also say something similar — as follows — about the sheep-in-the-field case. Because you were relying on your fallible senses in the first place, you were bound not to gain knowledge of there being a sheep in the field. (“It could never be real knowledge, given the inherent possibility of error in using one’s senses.”) And the infallibilist will regard the fake-barns case in the same way, claiming that the potential for mistake (that is, the existence of fallibility) was particularly real, due to the existence of the fake barns. And that is why (infers the infallibilist) there is a lack of knowledge within the case — as indeed there would be within any situation where fallible justification is being used.

So, that is the Infallibility Proposal. The standard epistemological objection to it is that it fails to do justice to the reality of our lives, seemingly as knowers of many aspects of the surrounding world. In our apparently “ordinary” situations, moving from one moment to another, we take ourselves to have much knowledge. Yet we rarely, if ever, possess infallible justificatory support for a belief. And we accept this about ourselves, realizing that we are not wholly — conclusively — reliable. We accept that if we are knowers, then, we are at least not infallible knowers. But the Infallibility Proposal — when combined with that acceptance of our general fallibility — would imply that we are not knowers at all. It would thereby ground a skepticism about our ever having knowledge.

Accordingly, most epistemologists would regard the Infallibility Proposal as being a drastic and mistaken reaction to Gettier’s challenge in particular. In response to Gettier, most seek to understand how we do have at least some knowledge — where such knowledge will either always or almost always be presumed to involve some fallibility. The majority of epistemologists still work towards what they hope will be a non-skeptical conception of knowledge; and attaining this outcome could well need to include their solving the Gettier challenge without adopting the Infallibility Proposal.

8. Attempted Solutions: Eliminating Luck

The other feature of Gettier cases that was highlighted in section 5 is the lucky way in which such a case’s protagonist has a belief which is both justified and true. Is it this luck that needs to be eliminated if the situation is to become one in which the belief in question is knowledge? In general, must any instance of knowledge include no accidentalness in how its combination of truth, belief, and justification is effected? The Eliminate Luck Proposal claims so.

Almost all epistemologists, when analyzing Gettier cases, reach for some version of this idea, at least in their initial or intuitive explanations of why knowledge is absent from the cases. Unger (1968) is one who has also sought to make this a fuller and more considered part of an explanation for the lack of knowledge. He says that a belief is not knowledge if it is true only courtesy of some relevant accident. That description is meant to allow for some flexibility. Even so, further care will still be needed if the Eliminate Luck Proposal is to provide real insight and understanding. After all, if we seek to eliminate all luck whatsoever from the production of the justified true belief (if knowledge is thereby to be present), then we are again endorsing a version of infallibilism (as described in section 7). If no luck is involved in the justificatory situation, the justification renders the belief’s truth wholly predictable or inescapable; in which case, the belief is being infallibly justified. And this would be a requirement which (as section 7 explained) few epistemologists will find illuminating, certainly not as a response to Gettier cases.

What many epistemologists therefore say, instead, is that the problem within Gettier cases is the presence of too much luck. Some luck is to be allowed; otherwise, we would again have reached for the Infallibility Proposal. But too large a degree of luck is not to be allowed. This is why we often find epistemologists describing Gettier cases as containing too much chance or flukiness for knowledge to be present.

Nevertheless, how helpful is that kind of description by those epistemologists? How much luck is too much? That is a conceptually vital question. Yet there has been no general agreement among epistemologists as to what degree of luck precludes knowledge. There has not even been much attempt to determine that degree. (It is no coincidence, similarly, that epistemologists in general are also yet to determine how strong — if it is allowed to be something short of infallibility — the justificatory support needs to be within any case of knowledge.) A specter of irremediable vagueness thus haunts the Eliminate Luck Proposal.

Perhaps understandably, therefore, the more detailed epistemological analyses of knowledge have focused less on delineating dangerous degrees of luck than on characterizing substantive kinds of luck that are held to drive away knowledge. Are there ways in which Gettier situations are structured, say, which amount to the presence of a kind of luck which precludes the presence of knowledge (even when there is a justified true belief)? Most attempts to solve Gettier’s challenge instantiate this form of thinking. In sections 9 through 11, we will encounter a few of the main suggestions that have been made.

9. Attempted Solutions: Eliminating False Evidence

A lot of epistemologists have been attracted to the idea that the failing within Gettier cases is the person’s including something false in her evidence. This would be a problem for her, because she is relying upon that evidence in her attempt to gain knowledge, and because knowledge is itself always true. To the extent that falsity is guiding the person’s thinking in forming the belief that p, she will be lucky to derive a belief that p which is true. And (as section 8 indicated) there are epistemologists who think that a lucky derivation of a true belief is not a way to know that truth. Let us therefore consider the No False Evidence Proposal.

In Gettier’s Case I, for example, Smith includes in his evidence the false belief that Jones will get the job. If Smith had lacked that evidence (and if nothing else were to change within the case), presumably he would not have inferred belief b. He would probably have had no belief at all as to who would get the job (because he would have had no evidence at all on the matter). If so, he would thereby not have had a justified and true belief b which failed to be knowledge. Should JTB therefore be modified so as to say that no belief is knowledge if the person’s justificatory support for it includes something false? JTB would then tell us that one’s knowing that p is one’s having a justified true belief which is well supported by evidence, none of which is false.

That is the No False Evidence Proposal. But epistemologists have noticed a few possible problems with it.

First, as Richard Feldman (1974) saw, there seem to be some Gettier cases in which no false evidence is used. Imagine that (contrary to Gettier’s own version of Case I) Smith does not believe, falsely, “Jones will get the job.” Imagine instead that he believes, “The company president told me that Jones will get the job.” (He could have continued to form the first belief. But suppose that, as it happens, he does not form it.) This alternative belief would be true. It would also provide belief b with as much justification as the false belief provided. So, if all else is held constant within the case (with belief b still being formed), again Smith has a true belief which is well-although-fallibly justified, yet which might well not be knowledge.

Second, it will be difficult for the No False Evidence Proposal not to imply an unwelcome skepticism. Quite possibly, there is always some false evidence being relied upon, at least implicitly, as we form beliefs. Is there nothing false at all — not even a single falsity — in your thinking, as you move through the world, enlarging your stock of beliefs in various ways (not all of which ways are completely reliable and clearly under your control)? If there is even some falsity among the beliefs you use, but if you do not wholly remove it or if you do not isolate it from the other beliefs you are using, then — on the No False Evidence Proposal — there is a danger of its preventing those other beliefs from ever being knowledge. This is a worry to be taken seriously, if a belief’s being knowledge is to depend upon the total absence of falsity from one’s thinking in support of that belief.

Unsurprisingly, therefore, some epistemologists, such as Lehrer (1965), have proposed a further modification of JTB — a less demanding one. They have suggested that what is needed for knowing that p is an absence only of significant and ineliminable (non-isolable) falsehoods from one’s evidence for p’s being true. Here is what that means. First, false beliefs which you are — but need not have been — using as evidence for p are eliminable from your evidence for p. And, second, false beliefs whose absence would seriously weaken your evidence for p are significant within your evidence for p. Accordingly, the No False Evidence Proposal now becomes the No False Core Evidence Proposal. The latter proposal says that if the only falsehoods in your evidence for p are ones which you could discard, and ones whose absence would not seriously weaken your evidence for p, then (with all else being equal) your justification is adequate for giving you knowledge that p. The accompanying application of that proposal to Gettier cases would claim that because, within each such case, some falsehood plays an important role in the protagonist’s evidence, her justified true belief based on that evidence fails to be knowledge. On the modified proposal, this would be the reason for the lack of that knowledge.

One fundamental problem confronting that proposal is obviously its potential vagueness. To what extent, precisely, need you be able to eliminate the false evidence in question if knowledge that p is to be present? How easy, exactly, must this be for you? And just how weakened, exactly, may your evidence for p become — courtesy of the elimination of false elements within it — before it is too weak to be part of making your belief that p knowledge? Such questions still await answers from epistemologists.

10. Attempted Solutions: Eliminating Defeat

Section 9 explored the suggestion that the failing within any Gettier case is a matter of what is included within a given person’s evidence: specifically, some core falsehood is accepted within her evidence. A converse idea has also received epistemological attention — the thought that the failing within any Gettier case is a matter of what is not included in the person’s evidence: specifically, some notable truth or fact is absent from her evidence. This proposal would not simply be that the evidence overlooks at least one fact or truth. Like the unmodified No False Evidence Proposal (with which section 9 began), that would be far too demanding, undoubtedly leading to skepticism. Because there are always some facts or truths not noticed by anyone’s evidence for a particular belief, there would be no knowledge either. No one’s evidence for p would ever be good enough to satisfy the justification requirement that is generally held to be necessary to a belief that p’s being knowledge.

Epistemologists therefore restrict the proposal, turning it into what is often called a defeasibility analysis of knowledge. It can also be termed the No Defeat Proposal. The thought behind it is that JTB should be modified so as to say that what is needed in knowing that p is an absence from the inquirer’s context of any defeaters of her evidence for p. And what is a defeater? A particular fact or truth t defeats a body of justification j (as support for a belief that p) if adding t to j, thereby producing a new body of justification j*, would seriously weaken the justificatory support being provided for that belief that p — so much so that j* does not provide strong enough support to make even the true belief that p knowledge. This means that t is relevant to justifying p (because otherwise adding it to j would produce neither a weakened nor a strengthened j*) as support for p — but damagingly so. In effect, insofar as one wishes to have beliefs which are knowledge, one should only have beliefs which are supported by evidence that is not overlooking any facts or truths which — if left overlooked — function as defeaters of whatever support is being provided by that evidence for those beliefs.

In Case I, for instance, we might think that the reason why Smith’s belief b fails to be knowledge is that his evidence includes no awareness of the facts that he will get the job himself and that his own pocket contains ten coins. Thus, imagine a variation on Gettier’s case, in which Smith’s evidence does include a recognition of these facts about himself. Then either (i) he would have conflicting evidence (with this new evidence supporting his own, and the original evidence supporting Jones’s, being about to get the job), or (ii) he would not have conflicting evidence (if his original evidence about Jones had been discarded, leaving him with only the evidence about himself). But in either of those circumstances Smith would be justified in having belief b, concerning “the person,” whoever it would be, who will get the job. Moreover, in either circumstance he would not obviously be in a Gettier situation, with his belief b still failing to be knowledge. For, on either (i) or (ii), there would be no defeaters of his evidence: no facts which are being overlooked by his evidence, and which would seriously weaken his evidence if he were not overlooking them.

Unfortunately, however, this proposal — like the No False Core Evidence Proposal in section 9 — faces a fundamental problem of vagueness. As we have seen, defeaters defeat by weakening justification: as more and stronger defeaters are being overlooked by a particular body of evidence, that evidence is correlatively weakened. (This is so, even when the defeaters clash directly with one’s belief that p. And it is so, regardless of the believer’s not realizing that the evidence is thereby weakened.) How weak, exactly, can the justification for a belief that p become before it is too weak to sustain the belief’s being knowledge that p? This question — which, in one form or another, arises for all proposals which allow knowledge’s justificatory component to be satisfied by fallible justificatory support — is yet to be answered by epistemologists as a group. In the particular instance of the No Defeat Proposal, it is the question, raised by epistemologists such as William Lycan (1977) and Lehrer and Paxson (1969), of how much — and which aspects — of one’s environment need to be noticed by one’s evidence, if that evidence is to be justification that makes one’s belief that p knowledge. There can be much complexity in one’s environment, with it not always being clear where to draw the line between aspects of the environment which do — and those which do not — need to be noticed by one’s evidence. How strict should we be in what we expect of people in this respect?

11. Attempted Solutions: Eliminating Inappropriate Causality

It has also been suggested that the failing within Gettier situations is one of causality, with the justified true belief being caused — generated, brought about — in too odd or abnormal a way for it to be knowledge. This Appropriate Causality Proposal — initially advocated by Alvin Goldman (1967) — will ask us to consider, by way of contrast, any case of observational knowledge. Seemingly, a necessary part of such knowledge’s being produced is a stable and normal causal pattern’s generating the belief in question. You use your eyes in a standard way, for example. A belief might then form in a standard way, reporting what you observed. That belief will be justified in a standard way, too, partly by that use of your eyes. And it will be true in a standard way, reporting how the world actually is in a specific respect. All of this reflects the causal stability of normal visually-based belief-forming processes. In particular, we realize that the object of the knowledge — that perceived aspect of the world which most immediately makes the belief true — is playing an appropriate role in bringing the belief into existence.

Within Gettier’s Case I, however, that pattern of normality is absent. The aspects of the world which make Smith’s belief b true are the facts of his getting the job and of there being ten coins in his own pocket. But these do not help to cause the existence of belief b. (That belief is caused by Smith’s awareness of other facts — his conversation with the company president and his observation of the contents of Jones’s pocket.) Should JTB be modified accordingly, so as to tell us that a justified true belief is knowledge only if those aspects of the world which make it true are appropriately involved in causing it to exist?

Epistemologists have noticed problems with that Appropriate Causality Proposal, though.

First, some objects of knowledge might be aspects of the world which are unable ever to have causal influences. In knowing that 2 + 2 = 4 (this being a prima facie instance of what epistemologists term a priori knowledge), you know a truth — perhaps a fact — about numbers. And do they have causal effects? Most epistemologists do not believe so. (Maybe instances of numerals, such as marks on paper being interpreted on particular occasions in specific minds, can have causal effects. Yet — it is usually said — such numerals are merely representations of numbers. They are not the actual numbers.) Consequently, it is quite possible that the scope of the Appropriate Causality Proposal is more restricted than is epistemologically desirable. The proposal would apply only to empirical or a posteriori knowledge, knowledge of the observable world — which is to say that it might not apply to all of the knowledge that is actually or possibly available to people. And (as section 6 explained) epistemologists seek to understand all actual or possible knowledge, not just some of it.

Second, to what extent will the Appropriate Causality Proposal help us to understand even empirical knowledge? The problem is that epistemologists have not agreed on any formula for exactly how (if there is to be knowledge that p) the fact that p is to contribute to bringing about the existence of the justified true belief that p. Inevitably (and especially when reasoning is involved), there will be indirectness in the causal process resulting in the formation of the belief that p. But how much indirectness is too much? That is, are there degrees of indirectness that are incompatible with there being knowledge that p? And if so, how are we to specify those critical degrees?

For example, suppose that (in an altered Case I of which we might conceive) Smith’s being about to be offered the job is actually part of the causal explanation of why the company president told him that Jones would get the job. The president, with his mischievous sense of humor, wished to mislead Smith. And suppose that Smith’s having ten coins in his pocket made a jingling noise, subtly putting him in mind of coins in pockets, subsequently leading him to discover how many coins were in Jones’s pocket. Given all of this, the facts which make belief b true (namely, those ones concerning Smith’s getting the job and concerning the presence of the ten coins in his pocket) will actually have been involved in the causal process that brings belief b into existence. Would the Appropriate Causality Proposal thereby be satisfied — so that (in this altered Case I) belief b would now be knowledge? Or should we continue regarding the situation as being a Gettier case, a situation in which (as in the original Case I) the belief b fails to be knowledge? If we say that the situation remains a Gettier case, we need to explain why this new causal ancestry for belief b would still be too inappropriate to allow belief b to be knowledge.

Most epistemologists will regard the altered case as a Gettier case. But in that event they continue to owe us an analysis of what makes a given causal history inappropriate. Often, they talk of deviant causal chains. And that is an evocative phrase. But how clear is it? Once more, we will wonder about vagueness. In particular, we will ask, how deviant can a causal chain (one that results in some belief-formation) become before it is too deviant to be able to be bringing knowledge into existence? As we also found in sections 9 and 10, a conceptually deep problem of vagueness thus remains to be solved.

12. Attempted Dissolutions: Competing Intuitions

Sections 9 through 11 described some of the main proposals that epistemologists have made for solving the Gettier challenge directly. Those proposals accept the usual interpretation of each Gettier case as containing a justified true belief which fails to be knowledge. Each proposal then attempts to modify JTB, the traditional epistemological suggestion for what it is to know that p. What is sought by those proposals, therefore, is an analysis of knowledge which accords with the usual interpretation of Gettier cases. That analysis would be intended to cohere with the claim that knowledge is not present within Gettier cases. And why is it so important to cohere with the latter claim? The standard answer offered by epistemologists points to what they believe is their strong intuition that, within any Gettier case, knowledge is absent. Almost all epistemologists claim to have this intuition about Gettier cases. They treat this intuition with much respect. (It seems that most do so as part of a more general methodology, one which involves the respectful use of intuitions within many areas of philosophy. Frank Jackson [1998] is a prominent proponent of that methodology’s ability to aid our philosophical understanding of key concepts.)

Nonetheless, a few epistemological voices dissent from that approach (as this section and the next will indicate). These seek to dissolve the Gettier challenge. Instead of accepting the standard interpretation of Gettier cases, and instead of trying to find a direct solution to the challenge that the cases are thereby taken to ground, a dissolution of the cases denies that they ground any such challenge in the first place. And one way of developing such a dissolution is to deny or weaken the usual intuition by which almost all epistemologists claim to be guided in interpreting Gettier cases.

One such attempt has involved a few epistemologists (Jonathan Weinberg, Shaun Nichols, and Stephen Stich, 2001) conducting empirical research which (they argue) casts doubt upon the evidential force of the usual epistemological intuition about the cases. When epistemologists claim to have a strong intuition that knowledge is missing from Gettier cases, they take themselves to be representative of people in general (specifically, in how they use the word “knowledge” and its cognates such as “know,” “knower,” and the like). That intuition is therefore taken to reflect how “we” (people in general) conceive of knowledge. It is thereby assumed to be an accurate indicator of pertinent details of the concept of knowledge, which is to say, “our” concept of knowledge. Yet what is it that gives epistemologists such confidence in their being representative of how people in general use the word “knowledge”? Mostly, epistemologists test this view of themselves upon their students and upon other epistemologists. The empirical research by Weinberg, Nichols, and Stich asked a wider variety of people (including ones from outside of university or college settings) about Gettier cases. And that research has reported encountering a wider variety of reactions to the cases. When people who lack much, or even any, prior epistemological awareness are presented with descriptions of Gettier cases, will they unhesitatingly say (as epistemologists do) that the justified true beliefs within those cases fail to be knowledge? The empirical evidence gathered so far suggests some intriguing disparities in this regard, including ones that might reflect varying ethnic ancestries or backgrounds. In particular, respondents of East Asian or Indian sub-continental descent were found to be more open than were European Americans (of “Western” descent) to classifying Gettier cases as situations in which knowledge is present. A similar disparity seemed to be correlated with respondents’ socio-economic status.

Those data are preliminary. (And other epistemologists have not sought to replicate those surveys.) Nonetheless, the data are suggestive. At the very least, they constitute some empirical evidence that does not simply accord with epistemologists’ usual interpretation of Gettier cases. Hence, a real possibility has been raised that epistemologists, in how they interpret Gettier cases, are not so accurately representative of people in general. Their shared, supposedly intuitive, interpretation of the cases might be due to something distinctive in how they, as a group, think about knowledge, rather than being merely how people as a whole regard knowledge. In other words, perhaps the apparent intuition about knowledge (as it pertains to Gettier situations) that epistemologists share with each other is not universally shared. Maybe it is at least not shared with as many other people as epistemologists assume is the case. And if so, then the epistemologists’ intuition might not merit the significance they have accorded it when seeking a solution to the Gettier challenge. (Indeed, that challenge itself might not be as distinctively significant as epistemologists have assumed it to be. This possibility arises once we recognize that the prevalence of that usual putative intuition among epistemologists has been important to their deeming, in the first place, that Gettier cases constitute a decisive challenge to our understanding of what it is to know that p.)

Epistemologists might reply that people who think that knowledge is present within Gettier cases are not evaluating the cases properly — that is, as the cases should be interpreted. The question thus emerges of whether epistemologists’ intuitions are particularly trustworthy on this topic. Are they more likely to be accurate (than are other people’s intuitions) in what they say about knowledge — in assessing its presence in, or its absence from, specific situations? Presumably, most epistemologists will think so, claiming that when other people do not concur that in Gettier cases there is a lack of knowledge, those competing reactions reflect a lack of understanding of the cases — a lack of understanding which could well be rectified by sustained epistemological reflection.

Potentially, that disagreement has methodological implications about the nature and point of epistemological inquiry. For we should wonder whether those epistemologists, insofar as their confidence in their interpretation of Gettier cases rests upon their more sustained reflection about such matters, are really giving voice to intuitions as such about Gettier cases when claiming to be doing so. Or are they instead applying some comparatively reflective theories of knowledge? The latter alternative need not make their analyses mistaken, of course. But it would make more likely the possibility that the analyses of knowledge which epistemologists develop in order to understand Gettier cases are not based upon a directly intuitive reading of the cases. This might weaken the strength and independence of the epistemologists’ evidential support for those analyses of knowledge.

For example, maybe the usual epistemological interpretation of Gettier cases is manifesting a commitment to a comparatively technical and demanding concept of knowledge, one that only reflective philosophers would use and understand. Even if the application of that concept feels intuitive to them, this could be due to the kind of technical training that they have experienced. It might not be a coincidence, either, that epistemologists tend to present Gettier cases by asking the audience, “So, is this justified true belief within the case really knowledge?” — thereby suggesting, through this use of emphasis, that there is an increased importance in making the correct assessment of the situation. The audience might well feel a correlative caution about saying that knowledge is present. They could feel obliged to take care not to accord knowledge if there is anything odd — as, clearly, there is — about the situation being discussed. When that kind of caution and care are felt to be required, then — as contextualist philosophers such as David Lewis (1996) have argued is appropriate — we are more likely to deny that knowledge is present.

Hence, if epistemologists continue to insist that the nature of knowledge is such as to satisfy one of their analyses (where this includes knowledge’s being such that it is absent from Gettier cases), then there is a correlative possibility that they are talking about something — knowledge — that is too difficult for many, if any, inquirers ever to attain. How should people — as potential or actual inquirers — react to that possibility? Mark Kaplan (1985) has argued that insofar as knowledge must conform to the demands of Gettier cases (and to the usual epistemological interpretation of them), knowledge is not something about which we should care greatly as inquirers. And the fault would be knowledge’s, not ours. Kaplan advocates our seeking something less demanding and more realistically attainable than knowledge is if it needs to cohere with the usual interpretation of Gettier cases. (An alternative thought which Kaplan’s argument might prompt us to investigate is that of whether knowledge itself could be something less demanding — even while still being at least somewhat worth seeking. Section 13 will discuss that idea.)

Those pivotal issues are currently unresolved. In the meantime, their presence confirms that, by thinking about Gettier cases, we may naturally raise some substantial questions about epistemological methodology — about the methods via which we should be trying to understand knowledge. Those questions include the following ones. What evidence should epistemologists consult as they strive to learn the nature of knowledge? Should they be perusing intuitions? If so, whose? Their own? How should competing intuitions be assessed? And how strongly should favored intuitions be relied upon anyway? Are they to be decisive? Are they at least powerful? Or are they no more than a starting-point for further debate — a provider, not an adjudicator, of relevant ideas?

13. Attempted Dissolutions: Knowing Luckily

Section 12 posed the question of whether supposedly intuitive assessments of Gettier situations support the usual interpretation of the cases as strongly — or even as intuitively — as epistemologists generally believe is the case. How best might that question be answered? Sections 5 and 8 explained that when epistemologists seek to support that usual interpretation in a way that is meant to remain intuitive, they typically begin by pointing to the luck that is present within the cases. That luck is standardly thought to be a powerful — yet still intuitive — reason why the justified true beliefs inside Gettier cases fail to be knowledge.

Nevertheless, a contrary interpretation of the luck’s role has also been proposed, by Stephen Hetherington (1998; 2001). It aims to reinstate the sufficiency of JTB, thereby dissolving Gettier’s challenge. That contrary interpretation could be called the Knowing Luckily Proposal, and it analyzes Gettier’s Case I along the following lines.

This alternative interpretation concedes (in accord with the usual interpretation) that, in forming his belief b, Smith is lucky to be gaining a belief which is true. More fully: He is lucky to do so, given the evidence by which he is being guided in forming that belief, and given the surrounding facts of his situation. In that sense (we might say), Smith came close to definitely lacking knowledge. (For in that sense he came close to forming a false belief; and a belief which is false is definitely not knowledge.) But to come close to definitely lacking knowledge need not be to lack knowledge. It might merely be to almost lack knowledge. So (as we might also say), it could be to know, albeit luckily so. Smith would have knowledge, in virtue of having a justified true belief. (We would thus continue to regard JTB as being true.) However, because Smith would only luckily have that justified true belief, he would only luckily have that knowledge.

Most epistemologists will object that this sounds like too puzzling a way to talk about knowing. Their reaction is natural. Even this Knowing Luckily Proposal would probably concede that there is very little (if any) knowledge which is lucky in so marked or dramatic a way. And because there is so little (if any) such knowledge, our everyday lives leave us quite unused to thinking of some knowledge as being present within ourselves or others quite so luckily: we would actually encounter little (if any) such knowledge. To the extent that the kind of luck involved in such cases reflects the statistical unlikelihood of such circumstances occurring, therefore, we should expect at least most knowledge not to be present in that lucky way. (Otherwise, this would be the normal way for knowledge to be present. It would not in fact be an unusual way. Hence, strictly speaking, the knowledge would not be present only luckily.)

But even if the Knowing Luckily Proposal agrees that, inevitably, at least most knowledge will be present in comparatively normal ways, the proposal will deny that this entails the impossibility of there ever being at least some knowledge which is present more luckily. Ordinarily, when good evidence for a belief that p accompanies the belief’s being true (as it does in Case I), this combination of good evidence and true belief occurs (unlike in Case I) without any notable luck being needed. Ordinary knowledge is thereby constituted, with that absence of notable luck being part of what makes instances of ordinary knowledge ordinary in our eyes. What is ordinary to us will not strike us as being present only luckily. Again, though, is it therefore impossible for knowledge ever to be constituted luckily? The Knowing Luckily Proposal claims that such knowledge is possible even if uncommon. The proposal will grant that there would be a difference between knowing that p in a comparatively ordinary way and knowing that p in a comparatively lucky way. Knowing comparatively luckily that p would be (i) knowing that p (where this might remain one’s having a justified true belief that p), even while also (ii) running, or having run, a greater risk of not having that knowledge that p. In that sense, it would be to know that p less securely or stably or dependably, more fleetingly or unpredictably.

There are many forms that the lack of stability — the luck involved in the knowledge’s being present — could take. Sometimes it might include the knowledge’s having one of the failings found within Gettier cases. The knowledge — the justified true belief — would be present in a correspondingly lucky way. One interpretive possibility — from Hetherington (2001) — is that of describing this knowledge that p as being of a comparatively poor quality as knowledge that p. Normally, knowledge that p is of a higher quality than this — being less obviously flawed, by being less luckily present. The question persists, though: Must all knowledge that p be, in effect, normal knowledge that p — being of a normal quality as knowledge that p? Or could we sometimes — even if rarely — know that p in a comparatively poor and undesirable way? The Knowing Luckily Proposal allows that this is possible — that this is a conceivable form for some knowledge to take.

That proposal is yet to be widely accepted among epistemologists. Their main objection to it has been what they have felt to be the oddity of talking of knowledge in that way. Accordingly, the epistemological resistance to the proposal partly reflects the standard adherence to the dominant (“intuitive”) interpretation of Gettier cases. Yet this section and the previous one have asked whether epistemologists should be wedded to that interpretation of Gettier cases. So, this section leaves us with the following question: Is it conceptually coherent to regard the justified true beliefs within Gettier cases as instances of knowledge which are luckily produced or present? And how are we to answer that question anyway? With intuitions? Whose? Once again, we encounter section 12’s questions about the proper methodology for making epistemological progress on this issue.

14. Gettier Cases and Analytic Epistemology

Since the initial philosophical description in 1963 of Gettier cases, the project of responding to them (so as to understand what it is to know that p) has often been central to the practice of analytic epistemology. Partly this recurrent centrality has been due to epistemologists’ taking the opportunity to think in detail about the nature of justification — about what justification is like in itself, and about how it is constitutively related to knowledge. But partly, too, that recurrent centrality reflects the way in which, epistemologists have often assumed, responding adequately to Gettier cases requires the use of a paradigm example of a method that has long been central to analytic philosophy. That method involves the considered manipulation and modification of definitional models or theories, in reaction to clear counterexamples to those models or theories.

Thus (we saw in section 2), JTB purported to provide a definitional analysis of what it is to know that p. JTB aimed to describe, at least in general terms, the separable-yet-combinable components of such knowledge. Then Gettier cases emerged, functioning as apparently successful counterexamples to one aspect — the sufficiency — of JTB’s generic analysis. That interpretation of the cases’ impact rested upon epistemologists’ claims to have reflective-yet-intuitive insight into the absence of knowledge from those actual or possible Gettier circumstances. These claims of intuitive insight were treated by epistemologists as decisive data, somewhat akin to favored observations. The claims were to be respected accordingly; and, it was assumed, any modification of the theory encapsulated in JTB would need to be evaluated for how well it accommodated them. So, the entrenchment of the Gettier challenge at the core of analytic epistemology hinged upon epistemologists’ confident assumptions that (i) JTB failed to accommodate the data provided by those intuitions — and that (ii) any analytical modification of JTB would need (and would be able) to be assessed for whether it accommodated such intuitions. That was the analytical method which epistemologists proceeded to apply, vigorously and repeatedly.

Nevertheless, the history of post-1963 analytic epistemology has also contained repeated expressions of frustration at the seemingly insoluble difficulties that have accompanied the many attempts to respond to Gettier’s disarmingly simple paper. Precisely how should the theory JTB be revised, in accord with the relevant data? Exactly which data are relevant anyway? We have seen in the foregoing sections that there is much room for dispute and uncertainty about all of this. For example, we have found a persistent problem of vagueness confronting various attempts to revise JTB. This might have us wondering whether a complete analytical definition of knowledge that p is even possible.

That is especially so, given that vagueness itself is a phenomenon, the proper understanding of which is yet to be agreed upon by philosophers. There is much contemporary discussion of what it even is (see Keefe and Smith 1996). On one suggested interpretation, vagueness is a matter of people in general not knowing where to draw a precise and clearly accurate line between instances of X and instances of non-X (for some supposedly vague phenomenon of being X, such as being bald or being tall). On that interpretation of vagueness, such a dividing line would exist; we would just be ignorant of its location. To many philosophers, that idea sounds regrettably odd when the vague phenomenon in question is baldness, say. (“You claim that there is an exact dividing line, in terms of the number of hairs on a person’s head, between being bald and not being bald? I find that claim extremely hard to believe.”) But should philosophers react with such incredulity when the phenomenon in question is that of knowing, and when the possibility of vagueness is being prompted by discussions of the Gettier problem? For most epistemologists remain convinced that their standard reaction to Gettier cases reflects, in part, the existence of a definite difference between knowing and not knowing. But where, exactly, is that dividing line to be found? As we have observed, the usual epistemological answers to this question seek to locate and to understand the dividing line in terms of degrees and kinds of justification or something similar. Accordingly, the threats of vagueness we have noticed in some earlier sections of this article might be a problem for many epistemologists. Possibly, those forms of vagueness afflict epistemologists’ knowing that a difference between knowledge and non-knowledge is revealed by Gettier cases. Epistemologists continue regarding the cases in that way. Are they right to do so? Do they have that supposed knowledge of what Gettier cases show about knowledge?

The Gettier challenge has therefore become a test case for analytically inclined philosophers. The following questions have become progressively more pressing with each failed attempt to convince epistemologists as a group that, in a given article or talk or book, the correct analysis of knowledge has finally been reached. Will an adequate understanding of knowledge ever emerge from an analytical balancing of various theories of knowledge against relevant data such as intuitions? Must any theory of the nature of knowledge be answerable to intuitions prompted by Gettier cases in particular? And must epistemologists’ intuitions about the cases be supplemented by other people’s intuitions, too? What kind of theory of knowledge is at stake? What general form should the theory take? And what degree of precision should it have? If we are seeking an understanding of knowledge, must this be a logically or conceptually exhaustive understanding? (The methodological model of theory-being-tested-against-data suggests a scientific parallel. Yet need scientific understanding always be logically or conceptually exhaustive if it is to be real understanding?)

The issues involved are complex and subtle. No analysis has received general assent from epistemologists, and the methodological questions remain puzzling. Debate therefore continues. There is uncertainty as to whether Gettier cases — and thereby knowledge — can ever be fully understood. There is also uncertainty as to whether the Gettier challenge can be dissolved. Have we fully understood the challenge itself? What exactly is Gettier’s legacy? As epistemologists continue to ponder these questions, it is not wholly clear where their efforts will lead us. Conceptual possibilities still abound.

15. References and Further Reading

  • Ayer, A. J. (1956). The Problem of Knowledge (London: Macmillan), ch. 1.
    • Presents a well-regarded pre-Gettier JTB analysis of knowledge.
  • Chisholm, R. M. (1966/1977/1989). Theory of Knowledge (any of the three editions). (Englewood Cliffs, NJ: Prentice Hall).
    • Includes the sheep-in-the-field Gettier case, along with attempts to repair JTB.
  • Descartes, R. (1911 [1641]). The Philosophical Works of Descartes, Vol. I, (eds. and trans.) E. S. Haldane and G. R. T. Ross. (Cambridge: Cambridge University Press).
    • Contains the Meditations, which develops and applies Descartes’s conception of knowledge as needing to be infallible.
  • Feldman, R. (1974). “An Alleged Defect in Gettier Counterexamples.” Australasian Journal of Philosophy 52: 68-9. Reprinted in Moser (1986).
    • Presents a Gettier case in which, it is claimed, no false evidence is used by the believer.
  • Gettier, E. L. (1963). “Is Justified True Belief Knowledge?” Analysis 23: 121-3. Reprinted in Roth and Galis (1970) and Moser (1986).
  • Goldman, A. I. (1967). “A Causal Theory of Knowing.” Journal of Philosophy 64: 357-72. Reprinted, with revisions, in Roth and Galis (1970).
    • The initial presentation of a No Inappropriate Causality Proposal.
  • Goldman, A. I. (1976). “Discrimination and Perceptual Knowledge.” Journal of Philosophy 73: 771-91. Reprinted in Pappas and Swain (1978).
    • Includes the fake-barns Gettier case.
  • Hetherington, S. (1996). Knowledge Puzzles: An Introduction to Epistemology (Boulder, Colo.: Westview Press).
    • Includes an introduction to the justified-true-belief analysis of knowledge, and to several responses to Gettier’s challenge.
  • Hetherington, S. (1998). “Actually Knowing.” Philosophical Quarterly 48: 453-69.
    • Includes a version of the Knowing Luckily Proposal.
  • Hetherington, S. (2001). Good Knowledge, Bad Knowledge: On Two Dogmas of Epistemology (Oxford: Oxford University Press).
    • Extends the Knowing Luckily Proposal, by explaining the idea of having qualitatively better or worse knowledge that p.
  • Jackson, F. (1998). From Metaphysics to Ethics: A Defence of Conceptual Analysis (Oxford: Oxford University Press).
    • Includes discussion of Gettier cases and the role of intuitions and conceptual analysis.
  • Kaplan, M. (1985). “It’s Not What You Know That Counts.” Journal of Philosophy 82: 350-63.
    • Argues that, given Gettier cases, knowledge is not what inquirers should seek.
  • Keefe, R. and Smith, P. (eds.) (1996). Vagueness: A Reader (Cambridge, Mass.: The MIT Press).
    • Contains both historical and contemporary analyses of the nature and significance of vagueness in general.
  • Kirkham, R. L. (1984). “Does the Gettier Problem Rest on a Mistake?” Mind 93: 501-13.
    • Argues that the usual interpretation of Gettier cases depends upon applying an extremely demanding conception of knowledge to the described situations, a conception with skeptical implications.
  • Lehrer, K. (1965). “Knowledge, Truth and Evidence.” Analysis 25: 168-75. Reprinted in Roth and Galis (1970).
    • Presents a No Core False Evidence Proposal.
  • Lehrer, K. (1971). “Why Not Scepticism?” The Philosophical Forum 2: 283-98. Reprinted in Pappas and Swain (1978).
    • Outlines a skepticism based on an Infallibility Proposal about knowledge.
  • Lehrer, K., and Paxson, T. D. (1969). “Knowledge: Undefeated Justified True Belief.” Journal of Philosophy 66: 225-37. Reprinted in Pappas and Swain (1978).
    • Presents a No Defeat Proposal.
  • Lewis, D. (1996). “Elusive Knowledge.” Australasian Journal of Philosophy 74: 549-67.
    • Includes a much-discussed response to Gettier cases which pays attention to nuances in how people discuss knowledge.
  • Lycan, W. G. (1977). “Evidence One Does not Possess.” Australasian Journal of Philosophy 55: 114-26.
    • Discusses potential complications in a No Defeat Proposal.
  • Lycan, W. G. (2006). “On the Gettier Problem Problem.” In Epistemology Futures, (ed.) S. Hetherington. (Oxford: Oxford University Press).
    • A recent overview of the history of attempted solutions to the Gettier problem.
  • Moser, P. K. (ed.) (1986). Empirical Knowledge: Readings in Contemporary Epistemology (Totowa, NJ: Rowman & Littlefield).
    • Contains some influential papers on Gettier cases.
  • Pappas, G. S., and Swain, M. (eds.) (1978). Essays on Knowledge and Justification (Ithaca, NY: Cornell University Press).
    • A key anthology, mainly on the Gettier problem.
  • Plato. Meno 97a-98b.
    • For what epistemologists generally regard as being an early version of JTB.
  • Plato. Theaetetus 200d-210c.
    • For seminal philosophical discussion of some possible instances of JTB.
  • Roth, M. D., and Galis, L. (eds.) (1970). Knowing: Essays in the Analysis of Knowledge (New York: Random House).
    • Includes some noteworthy papers on Gettier’s challenge.
  • Shope, R. K. (1983). The Analysis of Knowing: A Decade of Research (Princeton: Princeton University Press).
    • Presents many Gettier cases; discusses several proposed analyses of them.
  • Skyrms, B. (1967). “The Explication of ‘X Knows that p’.” Journal of Philosophy 64: 373-89. Reprinted in Roth and Galis (1970).
    • Includes the pyromaniac Gettier case.
  • Unger, P. (1968). “An Analysis of Factual Knowledge.” Journal of Philosophy 65: 157-70. Reprinted in Roth and Galis (1970).
    • Presents an Eliminate Luck Proposal.
  • Unger, P. (1971). “A Defense of Skepticism.” The Philosophical Review 80: 198-218. Reprinted in Pappas and Swain (1978).
    • Defends and applies an Infallibility Proposal about knowledge.
  • Weinberg, J., Nichols, S., and Stich, S. (2001). “Normativity and Epistemic Intuitions.” Philosophical Topics 29: 429-60.
    • Includes empirical data on competing (‘intuitive’) reactions to Gettier cases.
  • Williamson, T. (2000). Knowledge and Its Limits (Oxford: Oxford University Press), Intro., ch. 1.
    • Includes arguments against responding to Gettier cases with an analysis of knowledge.

Author Information

Stephen Hetherington
Email: s.hetherington@unsw.edu.au
University of New South Wales
Australia

Aristotle: Politics

In his Nicomachean Ethics, Aristotle (384-322 B.C.E.) describes the happy life intended for man by nature as one lived in accordance with virtue, and, in his Politics, he describes the role that politics and the political community must play in bringing about the virtuous life in the citizenry.

The Politics also provides analysis of the kinds of political community that existed in his time and shows where and how these cities fall short of the ideal community of virtuous citizens.

Although in some ways we have clearly moved beyond his thought (for example, his belief in the inferiority of women and his approval of slavery in at least some circumstances), there remains much in Aristotle’s philosophy that is valuable today.

In particular, his views on the connection between the well-being of the political community and that of the citizens who make it up, his belief that citizens must actively participate in politics if they are to be happy and virtuous, and his analysis of what causes and prevents revolution within political communities have been a source of inspiration for many contemporary theorists, especially those unhappy with the liberal political philosophy promoted by thinkers such as John Locke and John Stuart Mill.

Table of Contents

  1. Biography and History
  2. The Texts
  3. Challenges of the Texts
  4. Politics and Ethics
  5. The Importance of Telos
  6. The Text of the Politics
  7. The Politics, Book I
    1. The Purpose of the City
    2. How the City Comes Into Being
    3. Man, the Political Animal
    4. Slavery
    5. Women
  8. The Politics, Book II
    1. What Kind of Partnership Is a City?
    2. Existing Cities: Sparta, Crete, Carthage
  9. The Politics, Book III
    1. Who Is the Citizen?
    2. The Good Citizen and the Good Man
    3. Who Should Rule?
  10. The Politics, Book IV
    1. Polity: The Best Practical Regime
    2. The Importance of the Middle Class
  11. The Politics, Book V
    1. Conflict Between the Rich and the Poor
    2. How to Preserve Regimes
  12. The Politics, Book VI
    1. Varieties of Democracy
    2. The Best Kind of Democracy
    3. The Role of Wealth in a Democracy
  13. The Politics, Book VII
    1. The Best Regime and the Best Men
    2. Characteristics of the Best City
  14. The Politics, Book VIII
    1. The Education of the Young
  15. References and Further Reading

1. Biography and History

Aristotle’s life was primarily that of a scholar. However, as with the other ancient philosophers, his was not the stereotypical ivory tower existence. His father was court physician to Amyntas III of Macedon, so Aristotle grew up in a royal household. Aristotle also knew Philip of Macedon (son of Amyntas III) and there is a tradition that says Aristotle tutored Philip’s son Alexander, who would later be called “the Great” after expanding the Macedonian Empire all the way to what is now India. Clearly, Aristotle had significant firsthand experience with politics, though scholars disagree about how much influence, if any, this experience had on Aristotle’s thought. There is certainly no evidence that Alexander’s subsequent career was much influenced by Aristotle’s teaching, which is uniformly critical of war and conquest as goals for human beings and which praises the intellectual, contemplative lifestyle. It is noteworthy that although Aristotle praises the politically active life, he spent most of his own life in Athens, where he was not a citizen and would not have been allowed to participate directly in politics (although of course anyone who wrote as extensively and well about politics as Aristotle did was likely to be politically influential).

Aristotle studied under Plato at Plato’s Academy in Athens, and eventually opened a school of his own (the Lyceum) there. As a scholar, Aristotle had a wide range of interests. He wrote about meteorology, biology, physics, poetry, logic, rhetoric, and politics and ethics, among other subjects. His writings on many of these interests remained definitive for almost two millennia. They remained, and remain, so valuable in part because of the comprehensiveness of his efforts. For example, in order to understand political phenomena, he had his students collect information on the political organization and history of 158 different cities. The Politics makes frequent reference to political events and institutions from many of these cities, drawing on his students’ research. Aristotle’s theories about the best ethical and political life are drawn from substantial amounts of empirical research. These studies, and in particular the Constitution of Athens, will be discussed in more detail below (Who Should Rule?). The question of how these writings should be unified into a consistent whole (if that is even possible) is an open one and beyond the scope of this article. This article will not attempt to organize all of Aristotle’s work into a coherent whole, but will draw on different texts as they are necessary to complete one version of Aristotle’s view of politics.

2. The Texts

The most important text for understanding Aristotle’s political philosophy, not surprisingly, is the Politics. However, it is also important to read Nicomachean Ethics in order to fully understand Aristotle’s political project. This is because Aristotle believed that ethics and politics were closely linked, and that in fact the ethical and virtuous life is only available to someone who participates in politics, while moral education is the main purpose of the political community. As he says in Nicomachean Ethics at 1099b30, “The end [or goal] of politics is the best of ends; and the main concern of politics is to engender a certain character in the citizens and to make them good and disposed to perform noble actions.” Most people living today in Western societies like the United States, Canada, Germany, or Australia would disagree with both parts of that statement. We are likely to regard politics (and politicians) as aiming at ignoble, selfish ends, such as wealth and power, rather than the “best end”, and many people regard the idea that politics is or should be primarily concerned with creating a particular moral character in citizens as a dangerous intrusion on individual freedom, in large part because we do not agree about what the “best end” is. In fact, what people in Western societies generally ask from politics and the government is that they keep each of us safe from other people (through the provision of police and military forces) so that each of us can choose and pursue our own ends, whatever they may be. This has been the case in Western political philosophy at least since John Locke. Development of individual character is left up to the individual, with help from family, religion, and other non-governmental institutions. More will be said about this later, but the reader should keep in mind that this is an important way in which our political and ethical beliefs are not Aristotle’s. 
The reader is also cautioned against immediately concluding from this that Aristotle was wrong and we are right. This may be so, but it is important to understand why, and the contrast between Aristotle’s beliefs and ours can help to bring the strengths and weaknesses of our own beliefs into greater clarity.

The reference above to “Nicomachean Ethics at 1099b30” makes use of what is called Bekker pagination. This refers to the location of the beginning of the cited text in the edition of Aristotle’s works produced by Immanuel Bekker in Berlin in 1831 (in this case, it begins on page 1099, column b, line 30). Scholars make use of this system for all of Aristotle’s works except the Constitution of Athens (which was not rediscovered until after 1831) and fragmentary works in order to be able to refer to the same point in Aristotle’s work regardless of which edition, translation, or language they happen to be working with. This entry will make use of the Bekker pagination system, and will also follow tradition and refer to Nicomachean Ethics as simply Ethics. (There is also a Eudemian Ethics which is almost certainly by Aristotle (and which shares three of the ten books of the Nicomachean Ethics) and a work on ethics titled Magna Moralia which has been attributed to him but which most scholars now believe is not his work. Regardless, most scholars believe that the Nicomachean Ethics is Aristotle’s fullest and most mature expression of his ethical theory.) The translation is that of Martin Ostwald; see the bibliography for full information. In addition to the texts listed above, the student with an interest in Aristotle’s political theory may also wish to read the Rhetoric, which includes observations on ethics and politics in the context of teaching the reader how to be a more effective speaker, and the Constitution of Athens, a work attributed to Aristotle, but which may be by one of his students, which describes the political history of the city of Athens.

3. Challenges of the Texts

Any honest attempt to summarize and describe Aristotle’s political philosophy must include an acknowledgment that there is no consensus on many of the most important aspects of that philosophy. Some of the reasons for this should be mentioned from the outset.

One set of reasons has to do with the text itself and the transmission of the text from Aristotle’s time to ours. The first thing that can lead to disagreement over Aristotle’s beliefs is the fact that the Politics and Ethics are believed by many scholars to be his lecture notes, for lectures which were intended to be heard only by his own students. (Aristotle did write for general audiences on these subjects, probably in dialogue form, but only a few fragments of those writings remain.) This is also one reason why many students have difficulty reading his work: no teacher’s lecture notes ever make complete sense to anyone else (their meaning can even elude their author at times). Many topics in the texts are discussed less fully than we would like, and many things are ambiguous which we wish were more straightforward. But if Aristotle was lecturing from these writings, he could have taken care of these problems on the fly as he lectured, since presumably he knew what he meant, or he could have responded to requests for clarification or elaboration from his students.

Secondly, most people who read Aristotle are not reading him in the original Attic Greek but are instead reading translations. This leads to further disagreement, because different authors translate Aristotle differently, and the way in which a particular word is translated can be very significant for the text as a whole. There is no way to definitively settle the question of what Aristotle “really meant to say” in using a particular word or phrase.

Third, the Aristotelian texts we have are not the originals, but copies, and every time a text gets copied errors creep in (words, sentences, or paragraphs can get left out, words can be changed into new words, and so forth). For example, imagine someone writing the sentence “Ronald Reagan was the last competent president of the United States.” It is copied by hand, and the person making the copy accidentally writes (or assumes that the author must have written) “Ronald Reagan was the least competent president of the United States.” If the original is then destroyed, so that only the copy remains, future generations will read a sentence that means almost exactly the opposite of what the author intended. It may be clear from the context that a word has been changed, but then again it may not, and there is always hesitation in changing the text as we have it. In addition, although nowadays it is unacceptable to modify someone else’s work without clearly denoting the changes, this is a relatively recent development and there are portions of Aristotle’s texts which scholars believe were added by later writers. This, too, complicates our understanding of Aristotle.

Finally, there are a number of controversies related to the text of the Politics in particular. These controversies cannot be discussed here, but should be mentioned. For more detail consult the works listed in the “Suggestions for further reading” below. First, there is disagreement about whether the books of the Politics are in the order that Aristotle intended. Carnes Lord and others have argued based on a variety of textual evidence that books 7 and 8 were intended by Aristotle to follow book 3. Rearranging the text in this way would have the effect of joining the early discussion of the origins of political life and the city, and the nature of political justice, with the discussion of the ideal city and the education appropriate for it, while leaving together books 4-6 which are primarily concerned with existing varieties of regimes and how they are preserved and destroyed and moving them to the conclusion of the book. Second, some authors, notably Werner Jaeger, have argued that the different focus and orientation of the different portions of the Politics is a result of Aristotle writing them at different times, reflecting his changing interests and orientation towards Plato’s teachings. The argument is that at first Aristotle stuck very closely to the attitudes and ideas of his teacher Plato, and only later developed his own more empirical approach. Thus any difficulties that there may be in integrating the different parts of the Politics arise from the fact that they were not meant to be integrated and were written at different times and with different purposes. Third, the Politics as we have it appears to be incomplete; Book 6 ends in the middle of a sentence and Book 8 in the middle of a discussion. There are also several places in the Politics where Aristotle promises to consider a topic further later but does not do so in the text as we have it (for example, at the end of Book II, Chapter 8).
It is possible that Aristotle never finished writing it; more likely there is material missing as a result of damage to the scrolls on which it was written. The extent and content of any missing material is a matter of scholarly debate.

Fortunately, the beginning student of Aristotle will not need to concern themselves much with these problems. It is, however, important to get a quality translation of the text, which provides an introduction, footnotes, a glossary, and a bibliography, so that the reader is aware of places where, for example, there seems to be something missing from the text, or a word can have more than one meaning, or there are other textual issues. These will not always be the cheapest or most widely available translations, but it is important to get one of them, from a library if need be. Several suggested editions are listed at the end of this article.

4. Politics and Ethics

In Book Six of the Ethics Aristotle says that all knowledge can be classified into three categories: theoretical knowledge, practical knowledge, and productive knowledge. Put simply, these kinds of knowledge are distinguished by their aims: theoretical knowledge aims at contemplation, productive knowledge aims at creation, and practical knowledge aims at action. Theoretical knowledge involves the study of truth for its own sake; it is knowledge about things that are unchanging and eternal, and includes things like the principles of logic, physics, and mathematics (at the end of the Ethics Aristotle says that the most excellent human life is one lived in pursuit of this type of knowledge, because this knowledge brings us closest to the divine). The productive and practical sciences, in contrast, address our daily needs as human beings, and have to do with things that can and do change. Productive knowledge means, roughly, know-how; the knowledge of how to make a table or a house or a pair of shoes or how to write a tragedy would be examples of this kind of knowledge. This entry is concerned with practical knowledge, which is the knowledge of how to live and act. According to Aristotle, it is the possession and use of practical knowledge that makes it possible to live a good life. Ethics and politics, which are the practical sciences, deal with human beings as moral agents. Ethics is primarily about the actions of human beings as individuals, and politics is about the actions of human beings in communities, although it is important to remember that for Aristotle the two are closely linked and each influences the other.

The fact that ethics and politics are kinds of practical knowledge has several important consequences. First, it means that Aristotle believes that mere abstract knowledge of ethics and politics is worthless. Practical knowledge is only useful if we act on it; we must act appropriately if we are to be moral. He says at Ethics 1103b25: “The purpose of the present study [of morality] is not, as it is in other inquiries, the attainment of theoretical knowledge: we are not conducting this inquiry in order to know what virtue is, but in order to become good, else there would be no advantage in studying it.”

Second, according to Aristotle, only some people can beneficially study politics. Aristotle believes that women and slaves (or at least those who are slaves by nature) can never benefit from the study of politics, and also should not be allowed to participate in politics, about which more will be said later. But there is also a limitation on political study based on age, as a result of the connection between politics and experience: “A young man is not equipped to be a student of politics; for he has no experience in the actions which life demands of him, and these actions form the basis and subject matter of the discussion” (Ethics 1095a2). Aristotle adds that young men will usually act on the basis of their emotions, rather than according to reason, and since acting on practical knowledge requires the use of reason, young men are unequipped to study politics for this reason too. So the study of politics will only be useful to those who have the experience and the mental discipline to benefit from it, and for Aristotle this would have been a relatively small percentage of the population of a city. Even in Athens, the most democratic city in Greece, no more than 15 percent of the population was ever allowed the benefits of citizenship, including political participation. Athenian citizenship was limited to adult males who were not slaves and who had one parent who was an Athenian citizen (sometimes citizenship was further restricted to require both parents to be Athenian citizens). Aristotle does not think this percentage should be increased – if anything, it should be decreased.

Third, Aristotle distinguishes between practical and theoretical knowledge in terms of the level of precision that can be attained when studying them. Political and moral knowledge does not have the same degree of precision or certainty as mathematics. Aristotle says at Ethics 1094b14: “Problems of what is noble and just, which politics examines, present so much variety and irregularity that some people believe that they exist only by convention and not by nature….Therefore, in a discussion of such subjects, which has to start with a basis of this kind, we must be satisfied to indicate the truth with a rough and general sketch: when the subject and the basis of a discussion consist of matters that hold good only as a general rule, but not always, the conclusions reached must be of the same order.” Aristotle does not believe that the noble and the just exist only by convention, any more than, say, the principles of geometry do. However, the principles of geometry are fixed and unchanging. The definition of a point, or a line, or a plane, can be given precisely, and once this definition is known, it is fixed and unchanging for everyone. However, the definition of something like justice can only be known generally; there is no fixed and unchanging definition that will always be correct. This means that unlike philosophers such as Hobbes and Kant, Aristotle does not and in fact cannot give us a fixed set of rules to be followed when ethical and political decisions must be made. Instead he tries to make his students the kind of men who, when confronted with any particular ethical or political decision, will know the correct thing to do, will understand why it is the correct choice, and will choose to do it for that reason. Such a man will know the general rules to be followed, but will also know when and why to deviate from those rules. 
(I will use “man” and “men” when referring to citizens so that the reader keeps in mind that Aristotle, and the Greeks generally, excluded women from political participation. In fact it is not until the mid-19th century that organized attempts to gain the right to vote for women really get underway, and even today in the 21st century there are still many countries which deny women the right to vote or participate in political life).

5. The Importance of Telos

I have already noted the connection between ethics and politics in Aristotle’s thought. The concept that most clearly links the two is that which Aristotle called telos. A discussion of this concept and its importance will help the reader make sense of what follows. Aristotle himself discusses it in Book II, Chapter 3 of the Physics and Book I, Chapter 3 of the Metaphysics.

The word telos means something like purpose, or goal, or final end. According to Aristotle, everything has a purpose or final end. If we want to understand what something is, it must be understood in terms of that end, which we can discover through careful study. It is perhaps easiest to understand what a telos is by looking first at objects created by human beings. Consider a knife. If you wanted to describe a knife, you would talk about its size, and its shape, and what it is made out of, among other things. But Aristotle believes that you would also, as part of your description, have to say that it is made to cut things. And when you did, you would be describing its telos. The knife’s purpose, or reason for existing, is to cut things. And Aristotle would say that unless you included that telos in your description, you wouldn’t really have described – or understood – the knife. This is true not only of things made by humans, but of plants and animals as well. If you were to fully describe an acorn, you would include in your description that it will become an oak tree in the natural course of things – so acorns too have a telos. Suppose you were to describe an animal, like a thoroughbred foal. You would talk about its size, say it has four legs and hair, and a tail. Eventually you would say that it is meant to run fast. This is the horse’s telos, or purpose. If nothing thwarts that purpose, the young horse will indeed become a fast runner.

Here we are not primarily concerned with the telos of a knife or an acorn or a foal. What concerns us is the telos of a human being. Just like everything else that is alive, human beings have a telos. What is it that human beings are meant by nature to become in the way that knives are meant to cut, acorns are meant to become oak trees, and thoroughbred foals are meant to become racehorses? According to Aristotle, we are meant to become happy. This is nice to hear, although it isn’t all that useful. After all, people find happiness in many different ways. However, Aristotle says that living happily requires living a life of virtue. Someone who is not living a life that is virtuous, or morally good, is also not living a happy life, no matter what they might think. They are like a knife that will not cut, an oak tree that is diseased and stunted, or a racehorse that cannot run. In fact they are worse, since they have chosen the life they lead in a way that a knife or an acorn or a horse cannot.

Someone who does live according to virtue, who chooses to do the right thing because it is the right thing to do, is living a life that flourishes; to borrow a phrase, they are being all that they can be by using all of their human capacities to their fullest. The most important of these capacities is logos – a word that means “speech” and also means “reason” (it gives us the English word “logic”). Human beings alone have the ability to speak, and Aristotle says that we have been given that ability by nature so that we can speak and reason with each other to discover what is right and wrong, what is good and bad, and what is just and unjust.

Note that human beings discover these things rather than creating them. We do not get to decide what is right and wrong, but we do get to decide whether we will do what is right or what is wrong, and this is the most important decision we make in life. So too is the happy life: we do not get to decide what really makes us happy, although we do decide whether or not to pursue the happy life. And this is an ongoing decision. It is not made once and for all, but must be made over and over again as we live our lives. Aristotle believes that it is not easy to be virtuous, and he knows that becoming virtuous can only happen under the right conditions. Just as an acorn can only fulfill its telos if there is sufficient light, the right kind of soil, and enough water (among other things), and a horse can only fulfill its telos if there is sufficient food and room to run (again, among other things), an individual can only fulfill their telos and be a moral and happy human being within a well-constructed political community. The community brings about virtue through education and through laws which prescribe certain actions and prohibit others.

And here we see the link between ethics and politics in a different light: the role of politics is to provide an environment in which people can live fully human, ethical, and happy lives, and this is the kind of life which makes it possible for someone to participate in politics in the correct way. As Aristotle says at Ethics 1103a30: “We become just by the practice of just actions, self-controlled by exercising self-control, and courageous by performing acts of courage….Lawgivers make the citizens good by inculcating [good] habits in them, and this is the aim of every lawgiver; if he does not succeed in doing that, his legislation is a failure. It is in this that a good constitution differs from a bad one.” This is not a view that would be found in political science textbooks today, but for Aristotle it is the central concern of the study of politics: how can we discover and put into practice the political institutions that will develop virtue in the citizens to the greatest possible extent?

6. The Text of the Politics

Having laid out the groundwork for Aristotle’s thought, we are now in a position to look more closely at the text of the Politics. The translation we will use is that of Carnes Lord, which can be found in the list of suggested readings. This discussion is by no means complete; there is much of interest and value in Aristotle’s political writings that will not be considered here. Again, the reader is encouraged to investigate the list of suggested readings. However, the main topics and problems of Aristotle’s work will be included. The discussion will, to the extent possible, follow the organization of the Politics.

7. The Politics, Book I

a. The Purpose of the City

Aristotle begins the Politics by defining its subject, the city or political partnership. Doing so requires him to explain the purpose of the city. (The Greek word for city is polis, which is the word that gives us English words like “politics” and “policy”). Aristotle says that “It is clear that all partnerships aim at some good, and that the partnership that is most authoritative of all and embraces all the others does so particularly, and aims at the most authoritative good of all. This is what is called the city or the political partnership” (1252a3) (See also III.12). In Greece in Aristotle’s time the important political entities were cities, which controlled surrounding territories that were farmed. It is important to remember that the city was not subordinate to a state or nation, the way that cities are today; it was sovereign over the territory that it controlled. To convey this, some translations use the word “city-state” in place of the word “polis.” Although none of us today lives in a polis, we should not be too quick to dismiss Aristotle’s observations on the way of life of the polis as irrelevant to our own political partnerships.

Notice that Aristotle does not define the political community in the way that we generally would, by the laws that it follows or by the group that holds power or as an entity controlling a particular territory. Instead he defines it as a partnership. The citizens of a political community are partners, and as with any other partnership they pursue a common good. In the case of the city it is the most authoritative or highest good. The most authoritative and highest good of all, for Aristotle, is the virtue and happiness of the citizens, and the purpose of the city is to make it possible for the citizens to achieve this virtue and happiness. When discussing the ideal city, he says “[A] city is excellent, at any rate, by its citizens’ – those sharing in the regime – being excellent; and in our case all the citizens share in the regime” (1332a34). In achieving the virtue that is individual excellence, each of them will fulfill his telos. Indeed, it is the shared pursuit of virtue that makes a city a city.

As I have already noted at the beginning of this text, he says in the Ethics at 1099b30: “The end of politics is the best of ends; and the main concern of politics is to engender a certain character in the citizens and to make them good and disposed to perform noble actions.” As has been mentioned, most people today would not see this as the main concern of politics, or even a legitimate concern. Certainly almost everyone wants to see law-abiding citizens, but it is questionable that changing the citizens’ character or making them morally good is part of what government should do. Doing so would require far more governmental control over citizens than most people in Western societies are willing to allow.

Having seen Aristotle’s definition of the city and its purpose, we then get an example of Aristotle’s usual method of discussing political topics. He begins by examining opinions which are “generally accepted,” which means, as he says in the Topics at 100b21, “are accepted by everyone or by the majority or by the philosophers – i.e. by all, or by the majority, or by the most notable and illustrious of them” on the grounds that any such opinions are likely to have at least some truth to them. These opinions (the Greek word is endoxa), however, are not completely true. They must be systematically examined and modified by scholars of politics before the truths that are part of these opinions are revealed. Because Aristotle uses this method of examining the opinions of others to arrive at truth, the reader must be careful to pay attention to whether a particular argument or belief is Aristotle’s or not. In many cases he is setting out an argument in order to challenge it. It can be difficult to tell when Aristotle is arguing in his own voice and when he is considering the opinions of others, but the reader must carefully make this distinction if they are to understand Aristotle’s teachings. (It has also been suggested that Aristotle’s method should be seen as an example of how political discussion ought to be conducted: a variety of viewpoints and arguments are presented, and the final decision is arrived at through a consideration of the strengths and weaknesses of these viewpoints and arguments). For a further discussion of Aristotle’s methodology, see his discussion of reasoning in general and dialectical reasoning in particular in the Topics. Further examples of his approach can be found in Ethics I.4 and VII.1.

In this case, Aristotle takes up the popular opinion that political rule is really the same as other kinds of rule: that of kings over their subjects, of fathers over their wives and children, and of masters over their slaves. This opinion, he says, is mistaken. In fact, each of these kinds of rule is different. To see why, we must consider how the city comes into being, and it is to this that Aristotle next turns in Book I, Chapter 2.

b. How the City Comes Into Being

Here Aristotle tells the story of how cities have historically come into being. The first partnerships among human beings would have been between “persons who cannot exist without one another” (1252a27). There are two pairs of people for whom this is the case. One pair is that of male and female, for the sake of reproduction. This seems reasonable enough to the modern reader. The other pair, however, is that of “the naturally ruling and ruled, on account of preservation” (1252a30). Here Aristotle is referring to slavery. By “preservation” he means that the naturally ruling master and naturally ruled slave need each other if they are to preserve themselves; slavery is a kind of partnership which benefits both master and slave. We will see how later. For now, he simply says that these pairs of people come together and form a household, which exists for the purpose of meeting the needs of daily life (such as food, shelter, clothing, and so forth). The family is only large enough to provide for the bare necessities of life, sustaining its members’ lives and allowing for the reproduction of the species.

Over time, the family expands, and as it does it will come into contact with other families. Eventually a number of such families combine and form a village. Villages are better than families because they are more self-sufficient. Because villages are larger than families, people can specialize in a wider array of tasks and can develop skills in things like cooking, medicine, building, soldiering, and so forth which they could not develop in a smaller group. So the residents of a village will live more comfortable lives, with access to more goods and services, than those who only live in families.

The significant change in human communities, however, comes when a number of villages combine to form a city. A city is not just a big village, but is fundamentally different: “The partnership arising from [the union of] several villages that is complete is the city. It reaches a level of full self-sufficiency, so to speak; and while coming into being for the sake of living, it exists for the sake of living well” (1252b27). Although the founders of cities create them for the sake of more comfortable lives, cities are unique in making it possible for people to live well. Today we tend to think of “living well” as living a life of comfort, family satisfaction, and professional success, surrounded by nice things. But this is not what Aristotle means by “living well”. As we have seen, for Aristotle “living well” means leading a life of happiness and virtue, and by so doing fulfilling one’s telos. Life in the city, in Aristotle’s view, is therefore necessary for anyone who wishes to be completely human. (His particular concern is with the free men who are citizens). “He who is without a city through nature rather than chance is either a mean sort or superior to man,” Aristotle says (1253a3), and adds “One who is incapable of participating or who is in need of nothing through being self-sufficient is no part of a city, and so is either a beast or a god” (1253a27). Humans are not capable of becoming gods, but they are capable of becoming beasts, and in fact the worst kind of beasts: “For just as man is the best of the animals when completed, when separated from law and adjudication he is the worst of all” (1253a30). Outside of the context of life in a properly constructed city, human happiness and well-being is impossible. Even here at the very beginning of the Politics Aristotle is showing the link between ethics and politics and the importance of a well-constructed city in making it possible for the citizens to live well.

There is therefore a sense in which the city “is prior by nature to the household and to each of us” (1253a19). He compares the individual’s relationship with the city to the relationship of a part of the body to the whole body. The destruction of the whole body would also mean the destruction of each of its parts; “if the whole [body] is destroyed there will not be a foot or a hand” (1253a20). And just as a hand is not able to survive without being attached to a functioning body, so too an individual cannot survive without being attached to a city. Presumably Aristotle also means to imply that the reverse is not true; a body can survive the loss of a foot or a hand, although not without consequence. Thus the individual needs the city more than the city needs any of its individual citizens; as Aristotle says in Book VIII before beginning his discussion of the desirable education for the city’s children, “one ought not even consider that a citizen belongs to himself, but rather that all belong to the city; for each individual is a part of the city” (1337a26).

If the history that he has described is correct, Aristotle points out, then the city is natural, and not purely an artificial human construction, since we have established that the first partnerships which make up the family are driven by natural impulses: “Every city, therefore, exists by nature, if such also are the first partnerships. For the city is their end….[T]he city belongs among the things that exist by nature, and…man is by nature a political animal” (1252b30-1253a3). From the very first partnerships of male and female and master and slave, nature has been aiming at the creation of cities, because cities are necessary for human beings to express their capacities and virtues at their best, thus fulfilling their potential and moving towards such perfection as is possible for human beings. While most people today would not agree that nature has a plan for individual human beings, a particular community, or humanity as a whole (although many people would ascribe such a plan to a god or gods), Aristotle believes that nature does indeed have such a plan, and human beings have unique attributes that when properly used make it possible for us to fulfill that plan. What are those attributes?

c. Man, the Political Animal

That man is much more a political animal than any kind of bee or any herd animal is clear. For, as we assert, nature does nothing in vain, and man alone among the animals has speech….[S]peech serves to reveal the advantageous and the harmful and hence also the just and unjust. For it is peculiar to man as compared to the other animals that he alone has a perception of good and bad and just and unjust and other things of this sort; and partnership in these things is what makes a household and a city (1253a8).

Like bees and herd animals, human beings live together in groups. Unlike bees or herd animals, humans have the capacity for speech – or, in the Greek, logos. As we have seen, logos means not only speech but also reason. Here the linkage between speech and reason is clear: the purpose of speech, a purpose assigned to men by nature, is to reveal what is advantageous and harmful, and by doing so to reveal what is good and bad, just and unjust. This knowledge makes it possible for human beings to live together, and at the same time makes it possible for us to pursue justice as part of the virtuous lives we are meant to live. Other animals living in groups, such as bees, goats, and cows, do not have the ability to speak or to reason as Aristotle uses those terms. Of course, they do not need this ability. They are able to live together without determining what is just and unjust or creating laws to enforce justice among themselves. Human beings, for better or worse, cannot do this.

Although nature brings us together – we are by nature political animals – nature alone does not give us all of what we need to live together: “[T]here is in everyone by nature an impulse toward this sort of partnership. And yet the one who first constituted [a city] is responsible for the greatest of goods” (1253a29). We must figure out how to live together for ourselves through the use of reason and speech, discovering justice and creating laws that make it possible for human community to survive and for the individuals in it to live virtuous lives. A group of people that has done this is a city: “[The virtue of] justice is a thing belonging to the city. For adjudication is an arrangement of the political partnership, and adjudication is judgment as to what is just” (1253a38). And in discovering and living according to the right laws, acting with justice and exercising the virtues that allow human society to function, we make possible not only the success of the political community but also the flourishing of our own individual virtue and happiness. Without the city and its justice, human beings are the worst of animals, just as we are the best when we are completed by the right kind of life in the city. And it is the pursuit of virtue rather than the pursuit of wealth or security or safety or military strength that is the most important element of a city: “The political partnership must be regarded, therefore, as being for the sake of noble actions, not for the sake of living together” (1281a1).

d. Slavery

Having described the basic parts of the city, Aristotle returns in Chapter 3 of Book I to a discussion of the household, beginning with the matter of slavery, including the question of whether slavery is just (and hence an acceptable institution) or not. This, for most contemporary readers, is one of the two most offensive portions of Aristotle’s moral and political thought (the other is his treatment of women, about which more will be said below). For most people today, of course, the answer to this is obvious: slavery is not just, and in fact is one of the greatest injustices and moral crimes that it is possible to commit. (Although it is not widely known, there are still large numbers of people held in slavery throughout the world at the beginning of the 21st century. It is easy to believe that people in the “modern world” have put a great deal of moral distance between themselves and the less enlightened people in the past, but it is also easy to overestimate that distance).

In Aristotle’s time most people – at least the ones that were not themselves slaves – would also have believed that this question had an obvious answer, if they had asked the question at all: of course slavery is just. Virtually every ancient Mediterranean culture had some form of the institution of slavery. Slaves were usually of two kinds: either they had at one point been defeated in war, and the fact that they had been defeated meant that they were inferior and meant to serve, or else they were the children of slaves, in which case their inferiority was clear from their inferior parentage. Aristotle himself says that the sort of war that involves hunting “those human beings who are naturally suited to be ruled but [are] unwilling…[is] by nature just” (1256b25). What is more, the economies of the Greek city-states rested on slavery, and without slaves (and women) to do the productive labor, there could be no leisure for men to engage in more intellectual lifestyles. The greatness of Athenian plays, architecture, sculpture, and philosophy could not have been achieved without the institution of slavery. Therefore, as a practical matter, regardless of the arguments for or against it, slavery was not going to be abolished in the Greek world. Aristotle’s willingness to consider the justice of slavery, however we might see it, was in fact progressive for the time. It is perhaps also worth noting that Aristotle’s will specified that his slaves should be freed upon his death. This is not to excuse Aristotle or those of his time who supported slavery, but it should be kept in mind so as to give Aristotle a fair hearing.

Before considering Aristotle’s ultimate position on the justness of slavery – for whom, and under what circumstances, slavery is appropriate – it must be pointed out that there is a great deal of disagreement about what that position is. That Aristotle believes slavery to be just and good for both master and slave in some circumstances is undeniable. That he believes that some people who are currently enslaved are not being held in slavery according to justice is also undeniable (this would apparently also mean that there are people who should be enslaved but currently are not). How we might tell which people belong in which group, and what Aristotle believes the consequences of his beliefs about slavery ought to be, are more difficult problems.

Remember that in his discussion of the household, Aristotle has said that slavery serves the interest of both the master and the slave. Now he tells us why: “those who are as different [from other men] as the soul from the body or man from beast – and they are in this state if their work is the use of the body, and if this is the best that can come from them – are slaves by nature….For he is a slave by nature who is capable of belonging to another – which is also why he belongs to another – and who participates in reason only to the extent of perceiving it, but does not have it” (1254b16-23). Notice again the importance of logos – reason and speech. Those who are slaves by nature do not have the full ability to reason. (Obviously they are not completely helpless or unable to reason; in the case of slaves captured in war, for example, the slaves were able to sustain their lives into adulthood and organize themselves into military forces. Aristotle also promises a discussion of “why it is better to hold out freedom as a reward for all slaves” (1330a30) which is not in the Politics as we have it, but if slaves were not capable of reasoning well enough to stay alive it would not be a good thing to free them). They are incapable of fully governing their own lives, and require other people to tell them what to do. Such people should be set to labor by the people who have the ability to reason fully and order their own lives. Labor is their proper use; Aristotle refers to slaves as “living tools” at I.4. Slaves get the guidance and instructions that they must have to live, and in return they provide the master with the benefits of their physical labor, not least of which is the free time that makes it possible for the master to engage in politics and philosophy.

One of the themes running through Aristotle’s thought that most people would reject today is the idea that a life of labor is demeaning and degrading, so that those who must work for a living are not able to be as virtuous as those who do not have to do such work. Indeed, Aristotle says that when the master can do so he avoids labor even to the extent of avoiding the oversight of those who must engage in it: “[F]or those to whom it is open not to be bothered with such things [i.e. managing slaves], an overseer assumes this prerogative, while they themselves engage in politics or philosophy” (1255b35).

This would seem to legitimate slavery, and yet there are two significant problems.

First, Aristotle points out that although nature would like us to be able to differentiate between who is meant to be a slave and who is meant to be a master by making the difference in reasoning capacity visible in their outward appearances, it frequently does not do so. We cannot look at people’s souls and distinguish those who are meant to rule from those who are meant to be ruled – and this will also cause problems when Aristotle turns to the question of who has a just claim to rule in the city.

Second, in Chapter 6, Aristotle points out that not everyone currently held in slavery is in fact a slave by nature. The argument that those who are captured in war are inferior in virtue cannot, as far as Aristotle is concerned, be sustained, and the idea that the children of slaves are meant to be slaves is also wrong: “[T]hey claim that from the good should come someone good, just as from a human being comes a human being and a beast from beasts. But while nature wishes to do this, it is often unable to” (1255b3). We are left with the position that while some people are indeed slaves by nature, and that slavery is good for them, it is extremely difficult to find out who these people are, and that therefore it is not the case that slavery is automatically just either for people taken in war or for children of slaves, though sometimes it is (1256b23). In saying this, Aristotle was undermining the legitimacy of the two most significant sources of slaves. If Aristotle’s personal life is relevant, while he himself owned slaves, he was said to have freed them upon his death. Whether this makes Aristotle’s position on slavery more acceptable or less so is left to the reader to decide.

In Chapter 8 of Book I Aristotle says that since we have been talking about household possessions such as slaves we might as well continue this discussion. The discussion turns to “expertise in household management.” The Greek word for “household” is oikos, and it is the source of our word “economics.” In Aristotle’s day almost all productive labor took place within the household, unlike today, in modern capitalist societies, when it mostly takes place in factories, offices, and other places specifically developed for such activity.

Aristotle uses the discussion of household management to make a distinction between expertise in managing a household and expertise in business. The former, Aristotle says, is important both for the household and the city; we must have supplies available of the things that are necessary for life, such as food, clothing, and so forth, and because the household is natural so too is the science of household management, the job of which is to maintain the household. The latter, however, is potentially dangerous. This, obviously, is another major difference between Aristotle and contemporary Western societies, which respect and admire business expertise, and encourage many of our citizens to acquire and develop such expertise. For Aristotle, however, expertise in business is not natural, but “arises rather through a certain experience and art” (1257a5). It is on account of expertise in business that “there is held to be no limit to wealth and possessions” (1257a1). This is a problem because some people are led to pursue wealth without limit, and the choice of such a life, while superficially very attractive, does not lead to virtue and real happiness. It leads some people to “proceed on the supposition that they should either preserve or increase without limit their property in money. The cause of this state is that they are serious about living, but not about living well; and since that desire of theirs is without limit, they also desire what is productive of unlimited things” (1257b38).

Aristotle does not entirely condemn wealth – it is necessary for maintaining the household and for providing the opportunity to develop one’s virtue. For example, generosity is one of the virtues listed in the Ethics, but it is impossible to be generous unless one has possessions to give away. But Aristotle strongly believes that we must not lose sight of the fact that wealth is to be pursued for the sake of living a virtuous life, which is what it means to live well, rather than for its own sake. (So at 1258b1 he agrees with those who object to the lending of money for interest, upon which virtually the entire modern global economy is based). Someone who places primary importance on money and the bodily satisfactions that it can buy is not engaged in developing their virtue and has chosen a life which, however it may seem from the outside or to the person living it, is not a life of true happiness.

This is still another difference between Aristotle and contemporary Western societies. For many if not most people in such societies, the pursuit of wealth without limit is seen as not only acceptable but even admirable. At the same time, many people reject the emphasis Aristotle places on the importance of political participation. Many liberal democracies fail to get even half of their potential voters to cast a ballot at election time, and jury duty, especially in the United States, is often looked on as a burden and waste of time, rather than a necessary public service that citizens should willingly perform. In Chapter 11, Aristotle notes that there is a lot more to be said about enterprise in business, but “to spend much time on such things is crude” (1258b35). Aristotle believes that we ought to be more concerned with other matters; moneymaking is beneath the attention of the virtuous man. (In this Aristotle is in agreement with the common opinion of Athenian aristocrats). He concludes this discussion with a story about Thales the philosopher using his knowledge of astronomy to make a great deal of money, “thus showing how easy it is for philosophers to become wealthy if they so wish, but it is not this they are serious about” (1259a16). Their intellectual powers, which could be turned to wealth, are being used in other, better ways to develop their humanity.

In the course of discussing the various ways of life open to human beings, Aristotle notes that “If, then, nature makes nothing that is incomplete or purposeless, nature must necessarily have made all of these [i.e. all plants and animals] for the sake of human beings” (1256b21). Though not a directly political statement, it does emphasize Aristotle’s belief that there are many hierarchies in nature, as well as his belief that those who are lower in the natural hierarchy should be under the command of those who are higher.

e. Women

In Chapter 12, after the discussion of business expertise has been completed, Aristotle returns to the subject of household rule, and takes up the question of the proper forms of rule over women and children. As with the master’s rule over the slave, and humanity’s rule over plants and other animals, Aristotle defines these kinds of rule in terms of natural hierarchies: “[T]he male, unless constituted in some respect contrary to nature, is by nature more expert at leading than the female, and the elder and complete than the younger and incomplete” (1259a41). This means that it is natural for the male to rule: “[T]he relation of male to female is by nature a relation of superior to inferior and ruler to ruled” (1254b12). And just as with the rule of the master over the slave, the difference here is one of reason: “The slave is wholly lacking the deliberative element; the female has it but it lacks authority; the child has it but it is incomplete” (1260a11).

There is a great deal of scholarly debate about what the phrase “lacks authority” means in this context. Aristotle does not elaborate on it. Some have suggested that it means not that women’s reason is inferior to that of men but that women lack the ability to make men do what they want, either because of some innate psychological characteristic (they are not aggressive and/or assertive enough) or because of the prevailing culture in Greece at the time. Others suggest that it means that women’s emotions are ultimately more influential in determining their behavior than reason is so that reason lacks authority over what a woman does. This question cannot be settled here. I will simply point out the vicious circle in which women were trapped in ancient Greece (and still are in many cultures). The Greeks believed that women are inferior to men (or at least those Greeks who wrote philosophy, plays, speeches, and so forth did. These people, of course, were all men. What Greek women thought of this belief is impossible to say). This belief means that women are denied access to certain areas of life (such as politics). Denying them access to these spheres means that they fail to develop the knowledge and skills to become proficient in them. This lack of knowledge and skills then becomes evidence to reinforce the original belief that they are inferior.

What else does Aristotle have to say about the rule of men over women? He says that the rule of the male over the female and that of the father over children are different in form from the rule of masters over slaves. Aristotle places the rule of male over female in the household in the context of the husband over the wife (female children who had not yet been married would have been ruled by their father. Marriage for girls in Athens typically took place at the age of thirteen or fourteen). Aristotle says at 1259a40 that the wife is to be ruled in political fashion. We have not yet seen what political rule looks like, but here Aristotle notes several of its important features, one of which is that it usually involves “alternation in ruling and being ruled” (1259b2), and another is that it involves rule among those who “tend by their nature to be on an equal footing and to differ in nothing” (1259b5). In this case, however, the husband does not alternate rule with the wife but instead always rules. Apparently the husband is to treat his wife as an equal to the degree that it is possible to do so, but must retain ultimate control over household decisions.

Women have their own role in the household, preserving what the man acquires; they do not participate in politics, since their reason lacks the authority that would allow them to do so. In order to properly fulfill her household role the wife must pursue her own telos. This is not the same as that of a man, but as with a man nature intends her to achieve virtues of the kind that are available to her: “It is thus evident that…the moderation of a woman and a man is not the same, nor their courage or justice…but that there is a ruling and a serving courage, and similarly with the other virtues” (1260a19). Unfortunately Aristotle has very little to say about what women’s virtues look like, how they are to be achieved, or how women should be educated. But it is clear that Aristotle believes that as with the master’s superiority to the slave, the man’s superiority to a woman is dictated by nature and cannot be overcome by human laws, customs, or beliefs.

Aristotle concludes the discussion of household rule, and the first book of the Politics, by stating that the discussion here is not complete and “must necessarily be addressed in the [discourses] connected with the regimes” (1260b11). This is the case because both women and children “must necessarily be educated looking to the regime, at least if it makes any difference with a view to the city’s being excellent that both its children and its women are excellent. But it necessarily makes a difference…” (1260b14). “Regime” is one of the ways to translate the Greek word politeia, which is also often translated as “constitution” or “political system.” Although there is some controversy about how best to translate this word, I will use the word “regime” throughout this article. The reader should keep in mind that if the word “constitution” is used this does not mean a written constitution of the sort that most contemporary nation-states employ. Instead, Aristotle uses politeia (however it is translated) to mean the way the state is organized, what offices there are, who is eligible to hold them, how they are selected, and so forth. All of these things depend on the group that holds political power in the city. For example, sometimes power is held by one man who rules in the interest of the city as a whole; this is the kind of regime called monarchy. If power is held by the wealthy who rule for their own benefit, then the regime is an oligarchy.

We will have much more to say later on the topic of regimes. Here Aristotle is introducing another important idea which he will develop later: the idea that the people living under a regime, including the women and children, must be taught to believe in the principles that underlie that regime. (In Book II, Chapter 9, Aristotle severely criticizes the Spartan regime for its failure to properly educate the Spartan women and shows the negative consequences this has had for the Spartan regime). For a monarchy to last, for example, the people must believe in the rightness of monarchical rule and the principles which justify it. Therefore it is important for the monarch to teach the people these principles and beliefs. In Books IV-VI Aristotle develops in much more detail what the principles of the different regimes are, and the Politics concludes with a discussion of the kind of education that the best regime ought to provide its citizens.

8. The Politics, Book II

“Cities…that are held to be in a fine condition”

In Book II, Aristotle changes his focus from the household to the consideration of regimes that are “in use in some of the cities that are said to be well managed and any others spoken about by certain persons that are held to be in a fine condition” (1260b30). This examination of existing cities must be done both in order to find out what those cities do properly, so that their successes can be imitated, and to find out what they do improperly so that we can learn from their mistakes. This study and the use of the knowledge it brings remains one of the important tasks of political science. Merely imitating an existing regime, no matter how excellent its reputation, is not sufficient. This is the case “because those regimes now available are in fact not in a fine condition” (1260b34). In order to create a better regime we must study the imperfect ones found in the real world. Aristotle will do this again on a more theoretical level in Books IV-VI. We should also examine the ideal regimes proposed by other thinkers. As it turns out, however fine these regimes are in theory, they cannot be put into practice, and this is obviously reason enough not to adopt them. Nevertheless, the ideas of other thinkers can assist us in our search for knowledge. Keep in mind that the practical sciences are not about knowledge for its own sake: unless we put this knowledge to use in order to improve the citizens and the city, the study engaged in by political science is pointless. We will not consider all the details of the different regimes Aristotle describes, but some of them are important enough to examine here.

a. What Kind of Partnership Is a City?

Aristotle begins his exploration of these regimes with the question of the degree to which the citizens in a regime should be partners. Recall that he opened the Politics with the statement that the city is a partnership, and in fact the most authoritative partnership. The citizens of a particular city clearly share something, because it is sharing that makes a partnership. Consider some examples of partnerships: business partners share a desire for wealth; philosophers share a desire for knowledge; drinking companions share a desire for entertainment; the members of a hockey team share a desire to win their game.

So what is it that citizens share? This is an important question for Aristotle, and he chooses to answer this question in the context of Socrates’ imagined community in Plato’s dialogue The Republic. Aristotle has already said that the regime is a partnership in adjudication and justice. But is it enough that the people of a city have a shared understanding of what justice means and what the laws require, or is the political community a partnership in more than these things? Today the answer would probably be that these things are sufficient – a group of people sharing territory and laws is not far from how most people would define the modern state. In the Republic, Socrates argues that the city should be unified to the greatest degree possible. The citizens, or at least those in the ruling class, ought to share everything, including property, women, and children. There should be no private families and no private property. But this, according to Aristotle, is too much sharing. While the city is clearly a kind of unity, it is a unity that must derive from a multitude. Human beings are unavoidably different, and this difference, as we saw earlier, is the reason cities were formed in the first place, because difference within the city allows for specialization and greater self-sufficiency. Cities are preserved not by complete unity and similarity but by “reciprocal equality,” and this principle is especially important in cities where “persons are free and equal.” In such cities “all cannot rule at the same time, but each rules for a year or according to some other arrangement or period of time. In this way, then, it results that all rule…” (1261a30). This topic, the alternation of rule in cities where the citizens are free and equal, is an important part of Aristotle’s thought, and we will return to it later.

There would be another drawback to creating a city in which everything is held in common. Aristotle notes that people value and care for what is their own: “What belongs in common to the most people is accorded the least care: they take thought for their own things above all, and less about things common, or only so much as falls to each individually” (1261b32). (Contemporary social scientists call this a problem of “collective goods”). Therefore to hold women and property in common, as Socrates proposes, would be a mistake. It would weaken attachments to other people and to the common property of the city, and this would lead to each individual assuming that someone else would care for the children and property, with the end result being that no one would. For a modern example, many people who would not throw trash on their own front yard or damage their own furniture will litter in a public park and destroy the furniture in a rented apartment or dorm room. Some in Aristotle’s time (and since) have suggested that holding property in common will lead to an end to conflict in the city. This may at first seem wise, since the unequal distribution of property in a political community is, Aristotle believes, one of the causes of injustice in the city and ultimately of civil war. But in fact it is not the lack of common property that leads to conflict; instead, Aristotle blames human depravity (1263b20). And in order to deal with human depravity, what is needed is to moderate human desires, which can be done among those “adequately educated by the laws” (1266b31). Inequality of property leads to problems because the common people desire wealth without limit (1267b3); if this desire can be moderated, so too can the problems that arise from it. Aristotle also includes here the claim that the citizens making up the elite engage in conflict because of inequality of honors (1266b38). In other words, they engage in conflict with the other citizens because of their desire for an unequal share of honor, which leads them to treat the many with condescension and arrogance. Holding property in common, Aristotle notes, will not remove the desire for honor as a source of conflict.

b. Existing Cities: Sparta, Crete, Carthage

In Chapters 9-11 of Book II, Aristotle considers existing cities that are held to be excellent: Sparta in Chapter 9, Crete in Chapter 10, and Carthage (which, notably, was not a Greek city) in Chapter 11. It is noteworthy that when Athens is considered following this discussion (in Chapter 12), Aristotle takes a critical view and seems to suggest that the city has declined since the time of Solon. Aristotle does not anywhere in his writings suggest that Athens is the ideal city or even the best existing city. It is easy to assume the opposite, and many have done so, but there is no basis for this assumption. We will not examine the particulars of Aristotle’s view of each of these cities. However, two important points should be noted here. The first is a general point about evaluating existing regimes: when judging whether a particular piece of legislation is good or not, it must be compared not only to the best possible set of arrangements but also to the set of arrangements that actually prevails in the city. If a law does not fit well with the principles of the regime, although it may be an excellent law in the abstract, the people will not believe in it or support it and as a result it will be ineffective or actually harmful (1269a31). The second is that Aristotle is critical of the Spartans because of their belief that the most important virtue to develop and the one that the city must teach its citizens is the kind of virtue that allows them to make war successfully. But war is not itself an end or a good thing; war is for the sake of peace, and the inability of the Spartans to live virtuously in times of peace has led to their downfall. (See also Book VII, Chapter 2, where Aristotle notes the hypocrisy of a city whose citizens seek justice among themselves but “care nothing about justice towards others” (1324b35) and Book VII, Chapter 15).

9. The Politics, Book III

a. Who Is the Citizen?

In Book III, Aristotle takes a different approach to understanding the city. Again he takes up the question of what the city actually is, but here his method is to understand the parts that make up the city: the citizens. “Thus who ought to be called a citizen and what the citizen is must be investigated” (1274b41). For Americans today this is a legal question: anyone born in the United States or born to American citizens abroad is automatically a citizen. Other people can become citizens by following the correct legal procedures for doing so. However, this rule is not acceptable for Aristotle, since slaves are born in the same cities as free men but that does not make them citizens. For Aristotle, there is more to citizenship than living in a particular place or sharing in economic activity or being ruled under the same laws. Instead, citizenship for Aristotle is a kind of activity: “The citizen in an unqualified sense is defined by no other thing so much as by sharing in decision and office” (1275a22). Later he says that “Whoever is entitled to participate in an office involving deliberation or decision is, we can now say, a citizen in this city; and the city is the multitude of such persons that is adequate with a view to a self-sufficient life, to speak simply” (1275b17). And this citizen is a citizen “above all in a democracy; he may, but will not necessarily, be a citizen in the others” (1275b4). We have yet to talk about what a democracy is, but when we do, this point will be important to defining it properly. When Aristotle talks about participation, he means that each citizen should participate directly in the assembly – not by voting for representatives – and should willingly serve on juries to help uphold the laws. Note again the contrast with modern Western nation-states where there are very few opportunities to participate directly in politics and most people struggle to avoid serving on juries.

Participation in deliberation and decision making means that the citizen is part of a group that discusses the advantageous and the harmful, the good and bad, and the just and unjust, and then passes laws and reaches judicial decisions based on this deliberative process. This process requires that each citizen consider the various possible courses of action on their merits and discuss these options with his fellow citizens. By doing so the citizen is engaging in reason and speech and is therefore fulfilling his telos, engaged in the process that enables him to achieve the virtuous and happy life. In regimes where the citizens are similar and equal by nature – which in practice is all of them – all citizens should be allowed to participate in politics, though not all at once. They must take turns, ruling and being ruled in turn. Note that this means that citizenship is not just a set of privileges, it is also a set of duties. The citizen has certain freedoms that non-citizens do not have, but he also has obligations (political participation and military service) that they do not have. We will see shortly why Aristotle believed that the cities existing at the time did not in fact follow this principle of ruling and being ruled in turn.

b. The Good Citizen and the Good Man

Before looking more closely at democracy and the other kinds of regimes, there are still several important questions to be discussed in Book III. One of the most important of these from Aristotle’s point of view is in Chapter 4. Here he asks the question of “whether the virtue of the good man and the excellent citizen is to be regarded as the same or as not the same” (1276b15). This is a question that seems strange, or at least irrelevant, to most people today. The good citizen today is asked to follow the laws, pay taxes, and possibly serve on juries; these are all good things the good man (or woman) would do, so that the good citizen is seen as being more or less subsumed into the category of the good person. For Aristotle, however, this is not the case. We have already seen Aristotle’s definition of the good man: the one who pursues his telos, living a life in accordance with virtue and finding happiness by doing so. What is Aristotle’s definition of the good citizen?

Aristotle has already told us that if the regime is going to endure it must educate all the citizens in such a way that they support the kind of regime that it is and the principles that legitimate it. Because there are several different types of regime (six, to be specific, which will be considered in more detail shortly), there are several different types of good citizen. Good citizens must have the type of virtue that preserves the partnership and the regime: “[A]lthough citizens are dissimilar, preservation of the partnership is their task, and the regime is [this] partnership; hence the virtue of the citizen must necessarily be with a view to the regime. If, then, there are indeed several forms of regime, it is clear that it is not possible for the virtue of the excellent citizen to be single, or complete virtue” (1276b27).

There is only one situation in which the virtue of the good citizen and excellent man are the same, and this is when the citizens are living in a city that is under the ideal regime: “In the case of the best regime, [the citizen] is one who is capable of and intentionally chooses being ruled and ruling with a view to the life in accordance with virtue” (1284a1). Aristotle does not fully describe this regime until Book VII. For those of us not living in the ideal regime, the ideal citizen is one who follows the laws and supports the principles of the regime, whatever that regime is. That this may well require us to act differently than the good man would act and to believe things that the good man knows to be false is one of the unfortunate tragedies of political life.

There is another element to determining who the good citizen is, and it is one that we today would not support. For Aristotle, remember, politics is about developing the virtue of the citizens and making it possible for them to live a life of virtue. We have already seen that women and slaves are not capable of living this kind of life, although each of these groups has its own kind of virtue to pursue. But there is another group that is incapable of citizenship leading to virtue, and Aristotle calls this group “the vulgar.” These are the people who must work for a living. Such people lack the leisure time necessary for political participation and the study of philosophy: “it is impossible to pursue the things of virtue when one lives the life of a vulgar person or a laborer” (1278a20). They are necessary for the city to exist – someone must build the houses, make the shoes, and so forth – but in the ideal city they would play no part in political life because their necessary tasks prevent them from developing their minds and taking an active part in ruling the city. Their existence, like that of the slaves and the women, is for the benefit of the free male citizens. Aristotle makes this point several times in the Politics: see, for example, VII.9 and VIII.2 for discussions of the importance of avoiding the lifestyle of the vulgar if one wants to achieve virtue, and I.13 and III.4, where those who work with their hands are labeled as kinds of slaves.

The citizens, therefore, are those men who are “similar in stock and free,” (1277b8) and rule over such men by those who are their equals is political rule, which is different from the rule of masters over slaves, men over women, and parents over children. This is one of Aristotle’s most important points: “[W]hen [the regime] is established in accordance with equality and similarity among the citizens, [the citizens] claim to merit ruling in turn” (1279a8). Throughout the remainder of the Politics he returns to this point to remind us of the distinction between a good regime and a bad regime. The correct regime of polity, highlighted in Book IV, is under political rule, while deviant regimes are those which are ruled as though a master was ruling over slaves. But this is wrong: “For in the case of persons similar by nature, justice and merit must necessarily be the same according to nature; and so if it is harmful for their bodies if unequal persons have equal sustenance and clothing, it is so also [for their souls if they are equal] in what pertains to honors, and similarly therefore if equal persons have what is unequal” (1287a12).

c. Who Should Rule?

This brings us to perhaps the most contentious of political questions: how should the regime be organized? Another way of putting this is: who should rule? In Books IV-VI Aristotle explores this question by looking at the kinds of regimes that actually existed in the Greek world and answering the question of who actually does rule. By closely examining regimes that actually exist, we can draw conclusions about the merits and drawbacks of each. Like political scientists today, he studied the particular political phenomena of his time in order to draw larger conclusions about how regimes and political institutions work and how they should work. As has been mentioned above, in order to do this, he sent his students throughout Greece to collect information on the regimes and histories of the Greek cities, and he uses this information throughout the Politics to provide examples that support his arguments. (According to Diogenes Laertius, histories and descriptions of the regimes of 158 cities were written, but only one of these has come down to the present: the Constitution of Athens mentioned above).

Another way he used this data was to create a typology of regimes that was so successful that it ended up being used until the time of Machiavelli nearly 2000 years later. He used two criteria to sort the regimes into six categories.

The first criterion that is used to distinguish among different kinds of regimes is the number of those ruling: one man, a few men, or the many. The second is perhaps a little more unexpected: do those in power, however many they are, rule only in their own interest or do they rule in the interest of all the citizens? “[T]hose regimes which look to the common advantage are correct regimes according to what is unqualifiedly just, while those which look only to the advantage of the rulers are errant, and are all deviations from the correct regimes; for they involve mastery, but the city is a partnership of free persons” (1279a16).

Having established these as the relevant criteria, in Book III Chapter 7 Aristotle sets out the six kinds of regimes. The correct regimes are monarchy (rule by one man for the common good), aristocracy (rule by a few for the common good), and polity (rule by the many for the common good); the flawed or deviant regimes are tyranny (rule by one man in his own interest), oligarchy (rule by the few in their own interest), and democracy (rule by the many in their own interest). Aristotle later ranks them in order of goodness, with monarchy the best, aristocracy the next best, then polity, democracy, oligarchy, and tyranny (1289a38). People in Western societies are used to thinking of democracy as a good form of government – maybe the only good form of government – but Aristotle considers it one of the flawed regimes (although it is the least bad of the three) and you should keep that in mind in his discussion of it. You should also keep in mind that by the “common good” Aristotle means the common good of the citizens, and not necessarily all the residents of the city. The women, slaves, and manual laborers are in the city for the good of the citizens.

Almost immediately after this typology is created, Aristotle clarifies it: the real distinction between oligarchy and democracy is in fact the distinction between whether the wealthy or the poor rule (1279b39), not whether the many or the few rule. Since it is always the case that the poor are many while the wealthy are few, it looks like it is the number of the rulers rather than their wealth which distinguishes the two kinds of regimes (he elaborates on this in IV.4). All cities have these two groups, the many poor and the few wealthy, and Aristotle was well aware that it was the conflict between these two groups that caused political instability in the cities, even leading to civil wars (Thucydides describes this in his History of the Peloponnesian War, and the Constitution of Athens also discusses the consequences of this conflict). Aristotle therefore spends a great deal of time discussing these two regimes and the problem of political instability, and we will focus on this problem as well.

First, however, let us briefly consider with Aristotle one other valid claim to rule. Those who are most virtuous have, Aristotle says, the strongest claim of all to rule. If the city exists for the sake of developing virtue in the citizens, then those who have the most virtue are the most fit to rule; they will rule best, and on behalf of all the citizens, establishing laws that lead others to virtue. However, if one man or a few men of exceptional virtue exist in the regime, we will be outside of politics: “If there is one person so outstanding by his excess of virtue – or a number of persons, though not enough to provide a full complement for the city – that the virtue of all the others and their political capacity is not commensurable…such persons can no longer be regarded as part of the city” (1284a4). It would be wrong for the other people in the city to claim the right to rule over them or share rule with them, just as it would be wrong for people to claim the right to share power with Zeus. The proper thing would be to obey them (1284b28). But this situation is extremely unlikely (1287b40). Instead, cities will be made up of people who are similar and equal, which leads to problems of its own.

The most pervasive of these is that oligarchs and democrats each advance a claim to political power based on justice. For Aristotle, justice dictates that equal people should get equal things, and unequal people should get unequal things. If, for example, two students turn in essays of identical quality, they should each get the same grade. Their work is equal, and so the reward should be too. If they turn in essays of different quality, they should get different grades which reflect the differences in their work. But the standards used for grading papers are reasonably straightforward, and the consequences of this judgment are not that important, relatively speaking – they certainly are not worth fighting and dying for. But the stakes are raised when we ask how we should judge the question of who should rule, for the standards here are not straightforward and disagreement over the answer to this question frequently does lead men (and women) to fight and die.

What does justice require when political power is being distributed? Aristotle says that both groups – the oligarchs and democrats – offer judgments about this, but neither of them gets it right, because “the judgment concerns themselves, and most people are bad judges concerning their own things” (1280a14). (This was the political problem that was of most concern to the authors of the United States Constitution: given that people are self-interested and ambitious, who can be trusted with power? Their answer differs from Aristotle’s, but it is worth pointing out the persistence of the problem and the difficulty of solving it). The oligarchs assert that their greater wealth entitles them to greater power, which means that they alone should rule, while the democrats say that the fact that all are equally free entitles each citizen to an equal share of political power (which, because most people are poor, means that in effect the poor rule). If the oligarchs’ claim seems ridiculous, you should keep in mind that the American colonies had property qualifications for voting; those who could not prove a certain level of wealth were not allowed to vote. And poll taxes, which required people to pay a tax in order to vote and therefore kept many poor citizens (including almost all African-Americans) from voting, were not eliminated in the United States until the mid-20th century. At any rate, each of these claims to rule, Aristotle says, is partially correct but partially wrong. We will consider the nature of democracy and oligarchy shortly.

In Book III Aristotle also argues for a principle that has become one of the bedrock principles of liberal democracy: we ought, to the extent possible, to allow the law to rule. “One who asks the law to rule, therefore, is held to be asking god and intellect alone to rule, while one who asks man adds the beast. Desire is a thing of this sort; and spiritedness perverts rulers and the best men. Hence law is intellect without appetite” (1287a28). This is not to say that the law is unbiased. It will reflect the bias of the regime, as it must, because the law reinforces the principles of the regime and helps educate the citizens in those principles so that they will support the regime. But in any particular case, the law, having been established in advance, is impartial, whereas a human judge will find it hard to resist judging in his own interest, according to his own desires and appetites, which can easily lead to injustice. Also, if this kind of power is left in the hands of men rather than with the laws, there will be a desperate struggle to control these offices and their benefits, and this will be another cause of civil war. So whatever regime is in power should, to the extent possible, allow the laws to rule. Ruling in accordance with one’s wishes at any particular time is one of the hallmarks of tyranny (it is the same way masters rule over slaves), and it is also, Aristotle says, typical of a certain kind of democracy, which rules by decree rather than according to settled laws. In these cases we are no longer dealing with politics at all, “For where the laws do not rule there is no regime” (1292b30). There are masters and slaves, but there are no citizens.

10. The Politics, Book IV

a. Polity: The Best Practical Regime

In Book IV Aristotle continues to think about existing regimes and their limitations, focusing on the question: what is the best possible regime? This is another aspect of political science that is still practiced today, as Aristotle combines a theory about how regimes ought to be with his analysis of how regimes really are in practice in order to prescribe changes to those regimes that will bring them more closely in line with the ideal. It is in Book VII that Aristotle describes the regime that would be absolutely the best, if we could have everything the way we wanted it; here he is considering the best regime that we can create given the kinds of human beings and circumstances that cities today find themselves forced to deal with, “For one should study not only the best regime but also the regime that is [the best] possible, and similarly also the regime that is easier and more attainable for all” (1288b37).

Aristotle also provides advice for those who want to preserve any of the existing kinds of regime, even the defective ones, showing a kind of hard-headed realism that is often overlooked in his writings. In order to do this, he provides a higher level of detail about the varieties of the different regimes than he has previously given us. There are a number of different varieties of democracy and oligarchy because cities are made up of a number of different groups of people, and the regime will be different depending on which of these groups happens to be most authoritative. For example, a democracy that is based on the farming element will be different from a democracy that is based on the element that is engaged in commerce, and similarly there are different kinds of oligarchies. We do not need to consider these in detail except to note that Aristotle holds to his position that in either a democracy or an oligarchy it is best if the law rules rather than the people possessing power. In the case of democracy it is best if the farmers rule, because farmers will not have the time to attend the assembly, so they will stay away and will let the laws rule (VI.4).

It is, however, important to consider polity in some detail, and this is the kind of regime to which Aristotle next turns his attention. “Simply speaking, polity is a mixture of oligarchy and democracy” (1293b32). Remember that polity is one of the correct regimes, and it occurs when the many rule in the interest of the political community as a whole. The problem with democracy as the rule of the many is that in a democracy the many rule in their own interest; they exploit the wealthy and deny them political power. But a democracy in which the interests of the wealthy were taken into account and protected by the laws would be ruling in the interest of the community as a whole, and it is this that Aristotle believes is the best practical regime. The ideal regime to be described in Book VII is the regime that we would pray for if the gods would grant us our wishes and we could create a city from scratch, having everything exactly the way we would want it. But when we are dealing with cities that already exist, their circumstances limit what kind of regime we can reasonably expect to create. Creating a polity is a difficult thing to do, and although he provides many examples of democracies and oligarchies Aristotle does not give any examples of existing polities or of polities that have existed in the past.

One of the important elements of creating a polity is to combine the institutions of a democracy with those of an oligarchy. For example, in a democracy, citizens are paid to serve on juries, while in an oligarchy, rich people are fined if they do not. In a polity, both of these approaches are used, with the poor being paid to serve and the rich fined for not serving. In this way, both groups will serve on juries and power will be shared. There are several ways to mix oligarchy and democracy, but “The defining principle of a good mixture of democracy and oligarchy is that it should be possible for the same polity to be spoken of as either a democracy or an oligarchy” (1294b14). The regime must be said to be both – and neither – a democracy and an oligarchy, and it will be preserved “because none of the parts of the city generally would wish to have another regime” (1294b38).

b. The Importance of the Middle Class

In addition to combining elements from the institutions of democracy and oligarchy, the person wishing to create a lasting polity must pay attention to the economic situation in the city. In Book II of the Ethics Aristotle famously establishes the principle that virtue is a mean between two extremes. For example, a soldier who flees before a battle is guilty of the vice of cowardice, while one who charges the enemy singlehandedly, breaking ranks and getting himself killed for no reason, is guilty of the vice of foolhardiness. The soldier who practices the virtue of courage is the one who faces the enemy, moves forward with the rest of the troops in good order, and fights bravely. Courage, then, is a mean between the extremes of cowardice and foolhardiness. The person who has it neither flees from the enemy nor engages in a suicidal and pointless attack but faces the enemy bravely and attacks in the right way.

Aristotle draws a parallel between virtue in individuals and virtue in cities. The city, he says, has three parts: the rich, the poor, and the middle class. Today we would probably believe that it is the rich people who are the most fortunate of those three groups, but this is not Aristotle’s position. He says: “[I]t is evident that in the case of the goods of fortune as well a middling possession is the best of all. For [a man of moderate wealth] is readiest to obey reason, while for one who is [very wealthy or very poor] it is difficult to follow reason. The former sort tend to become arrogant and base on a grand scale, the latter malicious and base in petty ways; and acts of injustice are committed either through arrogance or through malice” (1295b4). A political community that has extremes of wealth and poverty “is a city not of free persons but of slaves and masters, the ones consumed by envy, the others by contempt. Nothing is further removed from affection and from a political partnership” (1295b22). People in the middle class are free from the arrogance that characterizes the rich and the envy that characterizes the poor. And, since members of this class are similar and equal in wealth, they are likely to regard one another as similar and equal generally, and to be willing to rule and be ruled in turn, neither demanding to rule at all times as the wealthy do nor trying to avoid ruling as the poor do from their lack of resources. “Thus it is the greatest good fortune for those who are engaged in politics to have a middling and sufficient property, because where some possess very many things and others nothing, either [rule of] the people in its extreme form must come into being, or unmixed oligarchy, or – as a result of both of these excesses – tyranny. For tyranny arises from the most headstrong sort of democracy and from oligarchy, but much less often from the middling sorts [of regime] and those close to them” (1295b39).

There can be an enduring polity only when the middle class is able either to rule on its own or in conjunction with either of the other two groups, for in this way it can moderate their excesses: “Where the multitude of middling persons predominates either over both of the extremities together or over one alone, there a lasting polity is capable of existing” (1296b38). Unfortunately, Aristotle says, this state of affairs almost never exists. Instead, whichever group, rich or poor, is able to achieve power conducts affairs to suit itself rather than considering the interests of the other group: “whichever of the two succeeds in dominating its opponents does not establish a regime that is common or equal, but they grasp for preeminence in the regime as the prize of victory” (1296a29). And as a result, neither group seeks equality but instead each tries to dominate the other, believing that it is the only way to avoid being dominated in turn. This is a recipe for instability, conflict, and ultimately civil war, rather than a lasting regime. For the polity (or any other regime) to last, “the part of the city that wants the regime to continue must be superior to the part not wanting this” in quality and quantity (1296b16). He repeats this in Book V, calling it the “great principle”: “keep watch to ensure that the multitude wanting the regime is superior to that not wanting it” (1309b16), and in Book VI he discusses how this can be arranged procedurally (VI.3).

The remainder of Book IV focuses on the kinds of authority and offices in the city and how these can be distributed in democratic or oligarchic fashion. We do not need to concern ourselves with these details, but it does show that Aristotle is concerned with particular kinds of flawed regimes and how they can best operate and function in addition to his interest in the best practical government and the best government generally.

11. The Politics, Book V

a. Conflict between the Rich and the Poor

In Book V Aristotle turns his attention to how regimes can be preserved and how they are destroyed. Since we have seen what kind of regime a polity is, and how it can be made to endure, we are already in a position to see what is wrong with regimes which do not adopt the principles of a polity. We have already seen the claims of the few rich and the many poor to rule. The former believe that because they are greater in material wealth they should also be greater in political power, while the latter claim that because all citizens are equally free political power should also be equally distributed, which allows the many poor to rule because of their superior numbers. Both groups are partially correct, but neither is entirely correct, “And it is for this reason that, when either [group] does not share in the regime on the basis of the conception it happens to have, they engage in factional conflict” which can lead to civil war (1301a37). While the virtuous also have a claim to rule, the very fact that they are virtuous leads them to avoid factional conflict. They are also too small a group to be politically consequential: “[T]hose who are outstanding in virtue do not engage in factional conflict to speak of; for they are few against many” (1304b4). Therefore, the conflict that matters is the one between the rich and poor, and as we have seen, whichever group gets the upper hand will arrange things for its own benefit and in order to harm the other group. The fact that each of these groups ignores the common good and seeks only its own interest is why both oligarchy and democracy are flawed regimes. It is also ultimately self-destructive to try to put either kind of regime into practice: “Yet to have everywhere an arrangement that is based simply on one or the other of these sorts of equality is a poor thing. This is evident from the result: none of these sorts of regimes is lasting” (1302a3). On the other hand, “[O]ne should not consider as characteristic of popular rule or of oligarchy something that will make the city democratically or oligarchically run to the greatest extent possible, but something that will do so for the longest period of time” (1320a1). Democracy tends to be more stable than oligarchy, because democracies only have a conflict between rich and poor, while oligarchies also have conflicts within the ruling group of oligarchs to hold power. In addition, democracy is closer to polity than oligarchy is, and this contributes to its greater stability. And this is an important goal; the more moderate a regime is, the longer it is likely to remain in place.

Why does factional conflict arise? Aristotle turns to this question in Chapter 2. He says: “The lesser engage in factional conflict in order to be equal; those who are equal, in order to be greater” (1302a29). What are the things in which the lesser seek to be equal and the equal to be greater? “As for the things over which they engage in factional conflict, these are profit and honor and their opposites….They are stirred up further by arrogance, by fear, by preeminence, by contempt, by disproportionate growth, by electioneering, by underestimation, by [neglect of] small things, and by dissimilarity” (1302a33). Aristotle describes each of these in more detail. We will not examine them closely, but it is worth observing that Aristotle regards campaigning for office as a potentially dangerous source of conflict. If the city is arranged in such a way that either of the major factions feels that it is being wronged by the other, there are many things that can trigger conflict and even civil war; the regime is inherently unstable. We see again the importance of maintaining a regime which all of the groups in the city wish to see continue.

Aristotle says of democracies that “[D]emocracies undergo revolution particularly on account of the wanton behavior of the popular leaders” (1304b20). Such leaders will harass the property owners, causing them to unify against the democracy, and they will also stir up the poor against the rich in order to maintain themselves in power. This leads to conflict between the two groups and civil war. Aristotle cites a number of historical examples of this. Oligarchies undergo revolution primarily “when they treat the multitude unjustly. Any leader is then adequate [to effect revolution]” (1305a29). Revolution in oligarchical regimes can also come about from competition within the oligarchy, when not all of the oligarchs have a share in the offices. In this case those without power will engage in revolution not to change the regime but to change those who are ruling.

b. How to Preserve Regimes

However, despite all the dangers to the regimes, and the unavoidable risk that any particular regime will be overthrown, Aristotle does have advice regarding the preservation of regimes. In part, of course, we learn how to preserve the regimes by learning what causes revolutions and then avoiding those causes, so Aristotle has already given us useful advice for the preservation of regimes. But he has more advice to offer: “In well-blended regimes, then, one should watch out to ensure there are no transgressions of the laws, and above all be on guard against small ones” (1307b29). Note, again, the importance of letting the laws rule.

It is also important in every regime “to have the laws and management of the rest arranged in such a way that it is impossible to profit from the offices….The many do not chafe as much at being kept away from ruling – they are even glad if someone leaves them the leisure for their private affairs – as they do when they suppose that their rulers are stealing common [funds]; then it pains them both not to share in the prerogatives and not to share in the profits” (1308b32).

And, again, it is beneficial if the group that does not have political power is allowed to share in it to the greatest extent possible, though it should not be allowed to hold the authoritative offices (such as general, treasurer, and so forth). Such men must be chosen extremely carefully: “Those who are going to rule in the authoritative offices ought to have three things: first, affection for the established regime; next, a very great capacity for the work involved in rule; third, virtue and justice – in each regime the sort that is relative to the regime…” (1309a33). It is difficult to find all three of these in many men, but it is important for the regime to make use of the men with these qualities to the greatest degree possible, or else the regime will be harmed, either by sedition, incompetence, or corruption. Aristotle also reminds us of the importance of the middling element for maintaining the regime and making it long-lasting; instead of hostility between the oligarchs and democrats, whichever group has power should be certain always to behave benevolently and justly to the other group (1309b18).

“But the greatest of all the things that have been mentioned with a view to making regimes lasting – though it is now slighted by all – is education relative to the regimes. For there is no benefit in the most beneficial laws, even when these have been approved by all those engaging in politics, if they are not going to be habituated and educated in the regime – if the laws are popular, in a popular spirit, if oligarchic, in an oligarchic spirit” (1310a13). This does not mean that the people living in a democracy should be educated to believe that oligarchs are enemies of the regime, to be oppressed as much as possible and treated unjustly, nor does it mean that the wealthy under an oligarchy should be educated to believe that the poor are to be treated with arrogance and contempt. Instead it means being educated in the principles of moderate democracy and moderate oligarchy, so that the regime will be long-lasting and avoid revolution.

In the remainder of Book V Aristotle discusses monarchy and tyranny and what preserves and destroys these types of regimes. Here Aristotle is not discussing the kind of monarchies with which most people today are familiar, involving hereditary descent of royal power, usually from father to son. A monarch in Aristotle’s sense is one who rules because he is superior to all other citizens in virtue. Monarchy therefore involves individual rule on the basis of merit for the good of the whole city, and the monarch because of his virtue is uniquely well qualified to determine what that means. The tyrant, on the other hand, rules solely for his own benefit and pleasure. Monarchy, therefore, involving the rule of the best man over all, is the best kind of regime, while tyranny, which is essentially the rule of a master over a regime in which all are slaves, is the worst kind of regime, and in fact is really no kind of regime at all. Aristotle lists the particular ways in which both monarchy and tyranny are changed and preserved. We do not need to spend much time on these, for Aristotle says that in his time “there are many persons who are similar, with none of them so outstanding as to match the extent and the claim to merit of the office” that would be required for the rule of one man on the basis of exceptional virtue that characterizes monarchy (1313a5), and tyranny is inherently extremely short-lived and clearly without value. However, those wishing to preserve either of these kinds of regimes are advised, as oligarchs and democrats have been, to pursue moderation, diminishing the degree of their power in order to extend its duration.

12. The Politics, Book VI

a. Varieties of Democracy

Most of Book VI is concerned with the varieties of democracy, although Aristotle also revisits the varieties of oligarchy. Some of this discussion has to do with the various ways in which the offices, laws, and duties can be arranged. This part of the discussion we will pass over. However, Aristotle also includes a discussion of the animating principle of democracy, which is freedom: “It is customarily said that only in this sort of regime do [men] share in freedom, for, so it is asserted, every democracy aims at this” (1317a40). In modern liberal democracies, of course, the ability of all to share in freedom and for each citizen to live as one wants is considered one of the regime’s strengths. However, keep in mind that Aristotle believes that human life has a telos and that the political community should provide education and laws that will lead to people pursuing and achieving this telos. Given that this is the case, a regime that allows people to do whatever they want is in fact flawed, for it is not guiding them in the direction of the good life.

b. The Best Kind of Democracy

He also explains which of the varieties of democracy is the best. In Chapter 4, we discover that the best sort of democracy is the one made up of farmers: “The best people is the farming sort, so that it is possible also to create [the best] democracy wherever the multitude lives from farming or herding. For on account of not having much property it is lacking in leisure, and so is unable to hold frequent assemblies. Because they do not have the necessary things, they spend their time at work and do not desire the things of others; indeed, working is more pleasant to them than engaging in politics and ruling, where there are not great spoils to be gotten from office” (1318b9). This is a reason why the authoritative offices can be in the hands of the wealthy, as long as the people retain control of auditing and adjudication: “Those who govern themselves in this way must necessarily be finely governed. The offices will always be in the hands of the best persons, the people being willing and not envious of the respectable, while the arrangement is satisfactory for the respectable and notable. These will not be ruled by others who are their inferiors, and they will rule justly by the fact that others have authority over the audits” (1318b33). By “adjudication” Aristotle means that the many should make certain that juries are made up of men from their ranks, so that the laws will be enforced with a democratic spirit and the rich will not be able to use their wealth to put themselves above the law. By “authority over the audits” Aristotle refers to an institution which provided that those who held office had to provide an accounting of their activities at regular intervals: where the city’s funds came from, where they went, what actions they took, and so forth. They were liable to prosecution if they were found to have engaged in wrongdoing or mismanagement, and the fear of this prosecution, Aristotle says, will keep them honest and ensure that they act according to the wishes of the democracy.

So we see again that the institutions and laws of a city are important, but equally important is the moral character of the citizens. It is only the character of the farming population that makes the arrangements Aristotle describes possible: “The other sorts of multitude out of which the remaining sorts of democracy are constituted are almost all much meaner than these: their way of life is a mean one, with no task involving virtue among the things that occupy the multitude of human beings who are vulgar persons and merchants or the multitude of laborers” (1319a24). And while Aristotle does not say it here, of course a regime organized in this way, giving a share of power to the wealthy and to the poor, under the rule of law, in the interest of everyone, would in fact be a polity more than it would be a democracy.

c. The Role of Wealth in a Democracy

In Chapter 5 of Book VI he offers further advice that would move the city in the direction of polity when he discusses how wealth should be handled in a democracy. Many democracies offer pay for serving in the assembly or on juries so that the poor will be able to attend. Aristotle advises minimizing the number of trials and length of service on juries so that the cost will not be too much of a burden on the wealthy where there are not sources of revenue from outside the city (Athens, for example, received revenue from nearby silver mines, worked by slaves). Where such revenues exist, he criticizes the existing practice of distributing surpluses to the poor in the form of cash payments, which the poor citizens will take while demanding more. However, poverty is a genuine problem in a democracy: “[O]ne who is genuinely of the popular sort (i.e. a supporter of democracy) should see to it that the multitude is not overly poor, for this is the reason for democracy being depraved” (1320a33). Instead the surplus should be allowed to accumulate until enough is available to give the poor enough money to acquire land or start a trade. And even if there is no external surplus, “[N]otables who are refined and sensible will divide the poor among themselves and provide them with a start in pursuing some work” (1320b8). It seems somewhat unusual for Aristotle to be advocating a form of welfare, but that is what he is doing, on the grounds that poverty is harmful to the character of the poor and this harms the community as a whole by undermining its stability.

13. The Politics, Book VII

a. The Best Regime and the Best Men

It is in Book VII that Aristotle describes the regime that is best without qualification. This differs from the discussion of the best regime in Book IV because in Book IV Aristotle’s concern was the best practical regime, meaning one that it would be possible to bring about from the material provided by existing regimes. Here, however, his interest is in the best regime given the opportunity to create everything just as we would want it. It is “the city that is to be constituted on the basis of what one would pray for” (1325b35). As would be expected, he explicitly ties it to the question of the best way of life: “Concerning the best regime, one who is going to undertake the investigation appropriate to it must necessarily discuss first what the most choiceworthy way of life is. As long as this is unclear, the best regime must necessarily be unclear as well…” (1323a14). We have already discussed the best way of life, as well as the fact that most people do not pursue it: “For [men] consider any amount of virtue to be adequate, but wealth, goods, power, reputation, and all such things they seek to excess without limit” (1323a35). This is, as we have said more than once, a mistake: “Living happily…is available to those who have to excess the adornments of character and mind but behave moderately in respect to the external acquisition of good things” (1323b1). And what is true for the individual is also true for the city. Therefore “the best city is happy and acts nobly. It is impossible to act nobly without acting [to achieve] noble things; but there is no noble deed either of a man or of a city that is separate from virtue and prudence. The courage, justice, and prudence of a city have the same power and form as those human beings share in individually who are called just, prudent, and sound” (1324b30). The best city, like any other city, must educate its citizens to support its principles. The difference between this city and other cities is that the principles that it teaches its citizens are the correct principles for living the good life. It is here, and nowhere else, that the excellent man and the good citizen are the same.

b. Characteristics of the Best City

What would be the characteristics of the best city we could imagine? First of all, we want the city to be the right size. Many people, Aristotle says, are confused about what this means. They assume that the bigger the city is, the better it will be. But this is wrong. It is certainly true that the city must be large enough to defend itself and to be self-sufficient, but “This too, at any rate, is evident from the facts: that it is difficult – perhaps impossible – for a city that is too populous to be well managed” (1326a26). So the right size for the city is a moderate one; it is the one that enables it to perform its function of creating virtuous citizens properly. “[T]he [city] that is made up of too few persons is not self-sufficient, though the city is a self-sufficient thing, while the one that is made up of too many persons is with respect to the necessary things self-sufficient like a nation, but is not a city; for it is not easy for a regime to be present” (1326b3). There is an additional problem in a regime that is too large: “With a view to judgment concerning the just things and with a view to distributing offices on the basis of merit, the citizens must necessarily be familiar with one another’s qualities; where this does not happen to be the case, what is connected with the offices and with judging must necessarily be carried on poorly” (1326b13).

The size of the territory is also an important element of the ideal regime, and it too must be tailored to the purpose of the regime. Aristotle says “[the territory should be] large enough so that the inhabitants are able to live at leisure in liberal fashion and at the same time with moderation” (1326b29). Again Aristotle’s main concern is with life at peace, not life at war. On the other hand, the city and its territory should be such as to afford its inhabitants advantages in times of war; “it ought to be difficult for enemies to enter, but readily exited by [the citizens] themselves,” and not so big that it cannot be “readily surveyable” because only such a territory is “readily defended” (1326b41). It should be laid out in such a way as to be readily defensible (Book VII, Chapters 11-12). It should also be defensible by sea, since proper sea access is part of a good city. Ideally the city will (like Athens) have a port that is several miles away from the city itself, so that contact with foreigners can be regulated. It should also be in the right geographical location.

Aristotle believed that geography was an important factor in determining the characteristics of the people living in a certain area. He thought that the Greeks had the good traits of both the Europeans (spiritedness) and Asians (souls endowed with art and thought) because of the Greek climate (1327b23). While the harsh climate to the north made Europeans hardy and resilient, as well as resistant to being ruled (although Aristotle did not know about the Vikings, they are perhaps the best example of what he is talking about), and the climate of what he called Asia and we now call the Middle East produced a surplus of food that allowed the men the leisure to engage in intellectual and artistic endeavors while robbing them of spiritedness, the Greeks had the best of both worlds: “[I]t is both spirited and endowed with thought, and hence both remains free and governs itself in the best manner and at the same time is capable of ruling all…” (1327b29).

However, despite the necessary attention to military issues, when we consider the ideal city, the principles which we have already elaborated about the nature of the citizens remain central. Even in the ideal city, constructed to meet the conditions for which we would pray, the need for certain tasks, such as farming and laboring, will remain. Therefore there will also be the need for people to do these tasks. But such people should not be citizens, for (as we have discussed) they will lack the leisure and the intellect to participate in governing the city. They are not really even part of the city: “Hence while cities need possessions, possessions are no part of the city. Many animate things (i.e. slaves and laborers) are part of possessions. But the city is a partnership of similar persons, for the sake of a life that is the best possible” (1328a33). The citizens cannot be merchants, laborers, or farmers, “for there is a need for leisure both with a view to the creation of virtue and with a view to political activities” (1329a1). So all the people living in the city who are not citizens are there for the benefit of the citizens. Any goals, wishes, or desires that they might have are irrelevant; in Kant’s terms, they are treated as means rather than ends.

Those that live the lives of leisure that are open to citizens because of the labor performed by the non-citizens (again, including the women) are all similar to one another, and therefore the appropriate political arrangement for them is “in similar fashion to participate in ruling and being ruled in turn. For equality is the same thing [as justice] for persons who are similar, and it is difficult for a regime to last if its constitution is contrary to justice” (1332b25). These citizens will only be able to rule and be ruled in turn if they have had the proper upbringing, and this is the last major topic that Aristotle takes up in the Politics. Most cities make the mistake of neglecting education altogether, leaving it up to fathers to decide whether they will educate their sons at all, and if so what subject matter will be covered and how it will be taught. Some cities have in fact paid attention to the importance of the proper education of the young, training them in the virtues of the regime. Unfortunately, these regimes have taught them the wrong things. Aristotle is particularly concerned with Sparta here; the Spartans devoted great effort to bringing up their sons to believe that the virtues related to war were the only ones that mattered in life. They were successful; but because war is not the ultimate good, their education was not good. (Recall that the Spartan education was also flawed because it neglected the women entirely).

It is important for the person devising the ideal city to learn from this mistake. Such cities do not last unless they constantly remain at war (which is not an end in itself; no one pursues war for its own sake). Aristotle says “Most cities of this sort preserve themselves when at war, but once having acquired [imperial] rule they come to ruin; they lose their edge, like iron, when they remain at peace. The reason is that the legislator has not educated them to be capable of being at leisure” (1334a6). The proper education must be instilled from the earliest stages of life, and even before; Aristotle tells us the ages that are appropriate for marriage (37 for men, 18 for women) in order to bring about children of the finest quality, and insists on the importance of a healthful regimen for pregnant women, specifying that they take sufficient food and remain physically active. He also says that abortion is the appropriate solution when the population threatens to grow too large (1335b24).

14. The Politics, Book VIII

a. The Education of the Young

Book VIII is primarily concerned with the kind of education that the children of the citizens should receive. That this is a crucial topic for Aristotle is clear from its first sentence: “That the legislator must, therefore, make the education of the young his object above all would be disputed by no one” (1337a10). It is so important that it cannot be left to individual families, as was the custom in Greece. Instead, “Since there is a single end for the city as a whole, it is evident that education must necessarily be one and the same for all, and that the superintendence of it should be common and not on a private basis….For common things the training too should be made common” (1337a21). The importance of a common education shaping each citizen so as to enable him to serve the common good of the city recalls the discussion of how the city is prior to the individual in Book I Chapter 2; as has been quoted already in the discussion above, “one ought not even consider that a citizen belongs to himself, but rather that all belong to the city; for each individual is a part of the city” (1337a26).

He elaborates on the content of this education, noting that it should involve the body as well as the mind. Aristotle includes physical education, reading and writing, drawing, and music as subjects which the young potential citizens must learn. The aim of this education is not productive or theoretical knowledge. Instead it is meant to teach the young potential citizens practical knowledge – the kind of knowledge that each of them will need to fulfill his telos and perform his duties as a citizen. Learning the subjects that fall under the heading of productive knowledge, such as how to make shoes, would be degrading to the citizen. Learning the subjects that would fall under the heading of theoretical knowledge would be beyond the ability of most of the citizens, and is not necessary to them as citizens.

15. References and Further Reading

The list below is not intended to be comprehensive. It is limited to works published from 1962 to 2002. Most of these have their own bibliographies and suggested reading lists, and the reader is encouraged to take advantage of these.

Translations of Aristotle

  • Barnes, Jonathan, ed. The Complete Works of Aristotle: The Revised Oxford Translation. Princeton: Princeton University Press, 1984. Two volumes.
    • The standard edition of Aristotle’s complete works.
  • Irwin, Terence, and Gail Fine, eds. Aristotle: Introductory Readings. Indianapolis, IN: Hackett Publishing Company, Inc., 1996.
    • As the title suggests, this book includes excerpts from Aristotle’s writings. Understanding any of Aristotle’s texts means reading it in its entirety, but if you want a book by your side to check cross-references from whichever of his texts you are reading (for example, if the editor of the edition of the Politics you are reading refers to the Ethics), this one should do the trick.
  • Aristotle. Nicomachean Ethics. Translated and edited by Roger Crisp. Cambridge: Cambridge University Press, 2000.
    • This translation lacks the scholarly and critical apparatus of the Rowe translation but is still a fine choice.
  • Aristotle. Nicomachean Ethics. Translated and edited by Terry Irwin. Indianapolis: Hackett Publishing, 1999.
  • Aristotle. Nicomachean Ethics. Translated and with an introduction by Martin Ostwald. New York: Macmillan Publishing Company, 1962.
    • The translation used in preparing this entry. A good basic translation.
  • Aristotle. Nicomachean Ethics. Translated and with an introduction by David Ross. Revised by J.L. Ackrill and J.O. Urmson. Oxford: Oxford University Press, 1980.
    • Updated and revised version of a classic translation from 1925. See also Ross’ book on Aristotle below.
  • Aristotle. Nicomachean Ethics. Translation and historical introduction by Christopher Rowe; philosophical introduction and commentary by Sarah Broadie. Oxford: Oxford University Press, 2002.
    • A very thorough introduction and commentary are included with this translation of the Ethics. A good choice for the beginning student – but remember that the introduction and commentary are not meant to substitute for actually reading the text!
  • Aristotle. The Politics. Translated and with an introduction by Carnes Lord. Chicago: University of Chicago Press, 1984.
    • The translation used in preparing this entry. A useful introduction and very thorough notes, identifying names, places, and terms with which the reader may not be familiar.
  • Aristotle. The Politics. Translated by C.D.C. Reeve. Indianapolis: Hackett Publishing, 1998.
  • Aristotle. The Politics of Aristotle. Translated by Peter Simpson. Chapel Hill: University of North Carolina Press, 1997.
  • Aristotle. The Politics and The Constitution of Athens. Edited by Stephen Everson. Cambridge: Cambridge University Press, 1996.
    • If you’re looking for The Constitution of Athens this is a good place to go – and with the Politics in the same book it’s easy to compare the two books to each other. However, the texts are lacking in footnotes, which is a particular problem with the Constitution since it records Athenian history. So, for example, on page 237 we learn that during the rule of the Thirty Tyrants in Athens the rulers chose “ten colleagues to govern the Peiraeus,” without any indication that the Peiraeus was the Athenian harbor and its surrounding community, five miles from the city (it is also the setting of Plato’s Republic). It would help to have names, places, and concepts defined and explained through footnotes for the beginning student. The more advanced student may wish to consult the four volumes on the Politics in the Oxford University Press’s Clarendon Aristotle Series. Volume I, covering Books I and II of the Politics, is by Trevor Saunders; Volume II, on Books III and IV, is by Richard Robinson; Volume III, on Books V and VI, is by David Keyt, and Volume IV, on Books VII and VIII, is by Richard Kraut.
  • Aristotle. The Rhetoric. In George A. Kennedy, Aristotle On Rhetoric: A Theory of Civic Discourse. Translated and with an introduction by George A. Kennedy. New York: Oxford University Press, 1991.
    • The Rhetoric includes observations on politics and ethics in the context of teaching the reader how to become a rhetorician. Whether or not this requires the student to behave ethically is a matter of some debate. Speaking well in public settings was crucial to attaining political success in the Athenian democracy (and is still valuable today) and much of Aristotle’s practical advice remains useful.

Secondary literature – general works on Aristotle

  • Ackrill, J. L. Aristotle the Philosopher. New York: Oxford University Press, 1981.
  • Adler, Mortimer. Aristotle for Everybody: Difficult Thought Made Easy. New York: Macmillan Publishing Co., Inc., 1978.
    • This is probably the easiest-to-read exposition of Aristotle available; Adler says that it is aimed at “everybody – of any age, from twelve or fourteen years upward.” Obviously the author has had to make some sacrifices in the areas of detail and complexity to accomplish this, and anyone who has spent any time at all with Aristotle will probably wish to start elsewhere. Nevertheless, the author succeeds to a very great degree in delivering on the promise of the subtitle, expressing the basics of Aristotle’s thought in simple language using common examples and straightforward descriptions.
  • Barnes, Jonathan. Aristotle: A Very Short Introduction. New York: Oxford University Press, 2000.
  • Barnes, Jonathan, ed. The Cambridge Companion to Aristotle. Cambridge: Cambridge University Press, 1995.
    • “The Companion is intended for philosophical readers who are new to Aristotle,” Barnes writes in the Introduction, and the book delivers. Chapter Seven, by D.S. Hutchinson, covers Aristotle’s ethical theory; Chapter Eight, by C.C.W. Taylor, his political theory. Barnes himself writes the first chapter on Aristotle’s life and work, as well as an excellent introduction which includes an explanation of why no book (or, I would add, encyclopedia article) can substitute for reading the original Aristotelian texts. It also includes the following: “Plato had an influence second only to Aristotle…. But Plato’s philosophical views are mostly false, and for the most part they are evidently false; his arguments are mostly bad, and for the most part they are evidently bad.” If those remarks provoke any kind of emotional or intellectual response in you, you may as well give up: you are on the way to being a student of philosophy.
  • Guthrie, W.K.C. Aristotle: An Encounter. Cambridge: Cambridge University Press, 1981.
    • Volume 6 of his six volume Cambridge History of Ancient Greek Philosophy written between 1962 and 1981.
  • Robinson, Timothy A. Aristotle in Outline. Indianapolis: Hackett Publishing Company, Inc., 1995.
    • Another short (125 pages) introduction to Aristotle’s thought, with three sections: Wisdom and Science, Aristotle’s Ethics, and Politics. It would be an excellent choice for the beginning student or anyone who just wants to be introduced to Aristotle’s philosophy. Robinson is sympathetic to Aristotle but also to his readers, keeping things easy to read while at the same time offering enough detail about Aristotle’s doctrines to illuminate his entire system and making the interconnections among the various elements of Aristotle’s system clear.
  • Ross, Sir David. Aristotle. With an introduction by John L. Ackrill. Sixth edition. London: Routledge, 1995.
    • This is a classic in the field, now in its sixth edition, having first been published in 1923. Not many books can stay useful for eighty years. “It is not an elementary introduction for the absolute beginner,” the introduction says, and that seems right to me, but neither does it require the reader to be an expert. It covers all of Aristotle’s work, with chapters on Logic, Philosophy of Nature, Biology, Psychology, Metaphysics, Ethics, Politics, and Rhetoric and Poetics.
  • Thompson, Garrett and Marshall Missner. On Aristotle. Belmont, CA: Wadsworth, 2000.
    • Another short (100 page) overview of Aristotle’s thought that is too short to be adequate for any one topic (Chapter Nine, Aristotle’s view of politics, is less than six pages long) but might be useful for the new student of Aristotle interested in a brief look at the breadth of Aristotle’s interests. The book by Barnes included above is to be preferred.

Secondary literature – books on Aristotle’s Politics

  • Keyt, David, and Fred Miller, eds. A Companion to Aristotle’s Politics. London: Blackwell, 1991.
  • Kraut, Richard. Aristotle: Political Philosophy. Oxford: Oxford University Press, 2002.
    • An exceptional work of scholarship. Detailed, insightful, and as close to being comprehensive as anyone is likely to get in one book. The text is clearly broken down by topic and sub-topic, and the bibliography will help steer the Aristotle student in the right direction for future research. Kraut also notes other authors who disagree with his interpretation and why he believes they are wrong; this too is helpful for further research. Highly recommended.
  • Miller, Fred. Nature, Justice and Rights in Aristotle’s Politics. New York: Oxford University Press, 1995.
  • Mulgan, R.G. Aristotle’s Political Theory: An Introduction for Students of Political Theory. Oxford: Clarendon Press, 1977.
    • Mulgan’s book “is intended for students of political theory who are meeting the Politics for the first time and in an English translation.” It is divided into subjects rather than following the topics in the order discussed in the Politics as this article has done, with footnotes to the relevant passages in Aristotle’s texts. It is nicely detailed and offers excellent discussions (and criticisms) of Aristotle’s thought.
  • Simpson, Peter. A Philosophical Commentary on the Politics of Aristotle. Chapel Hill: University of North Carolina Press, 1998.

Author Information:

Edward Clayton
Email: clayt1ew@cmich.edu
Central Michigan University
U. S. A.

Propositional Logic

Propositional logic, also known as sentential logic and statement logic, is the branch of logic that studies ways of joining and/or modifying entire propositions, statements or sentences to form more complicated propositions, statements or sentences, as well as the logical relationships and properties that are derived from these methods of combining or altering statements. In propositional logic, the simplest statements are considered as indivisible units, and hence, propositional logic does not study those logical properties and relations that depend upon parts of statements that are not themselves statements on their own, such as the subject and predicate of a statement. The most thoroughly researched branch of propositional logic is classical truth-functional propositional logic, which studies logical operators and connectives that are used to produce complex statements whose truth-value depends entirely on the truth-values of the simpler statements making them up, and in which it is assumed that every statement is either true or false and not both. However, there are other forms of propositional logic in which other truth-values are considered, or in which there is consideration of connectives that are used to produce statements whose truth-values depend not simply on the truth-values of the parts, but additional things such as their necessity, possibility or relatedness to one another.

Table of Contents

  1. Introduction
  2. History
  3. The Language of Propositional Logic
    1. Syntax and Formation Rules of PL
    2. Truth Functions and Truth Tables
    3. Definability of the Operators and the Languages PL’ and PL”
  4. Tautologies, Logical Equivalence and Validity
  5. Deduction: Rules of Inference and Replacement
    1. Natural Deduction
    2. Rules of Inference
    3. Rules of Replacement
    4. Direct Deductions
    5. Conditional and Indirect Proofs
  6. Axiomatic Systems and the Propositional Calculus
  7. Meta-Theoretic Results for the Propositional Calculus
  8. Other Forms of Propositional Logic
  9. References and Further Reading

1. Introduction

A statement can be defined as a declarative sentence, or part of a sentence, that is capable of having a truth-value, such as being true or false. So, for example, the following are statements:

  • George W. Bush is the 43rd President of the United States.
  • Paris is the capital of France.
  • Everyone born on Monday has purple hair.

Sometimes, a statement can contain one or more other statements as parts. Consider for example, the following statement:

  • Either Ganymede is a moon of Jupiter or Ganymede is a moon of Saturn.

The above compound sentence is itself a statement, because it has a truth-value (it is true). Its two parts, “Ganymede is a moon of Jupiter” and “Ganymede is a moon of Saturn”, are themselves statements as well, because each has a truth-value: the first is true and the second is false.

The term proposition is sometimes used synonymously with statement. However, it is sometimes used to name something abstract that two different statements with the same meaning are both said to “express”. In this usage, the English sentence, “It is raining”, and the French sentence “Il pleut”, would be considered to express the same proposition; similarly, the two English sentences, “Callisto orbits Jupiter” and “Jupiter is orbited by Callisto” would also be considered to express the same proposition. However, the nature or existence of propositions as abstract meanings is still a matter of philosophical controversy, and for the purposes of this article, the phrases “statement” and “proposition” are used interchangeably.

Propositional logic, also known as sentential logic, is that branch of logic that studies ways of combining or altering statements or propositions to form more complicated statements or propositions. Joining two simpler propositions with the word “and” is one common way of combining statements. When two statements are joined together with “and”, the complex statement formed by them is true if and only if both the component statements are true. Because of this, an argument of the following form is logically valid:

Paris is the capital of France and Paris has a population of over two million.
Therefore, Paris has a population of over two million.
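
The validity of this form can be checked mechanically by running through every combination of truth-values for the two component statements. The following is a minimal sketch in Python, not part of any standard presentation; the helper name `is_valid` and the letters `p` and `q` are illustrative assumptions introduced here.

```python
from itertools import product

def is_valid(premises, conclusion, n_vars):
    """Return True if no assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# p: "Paris is the capital of France"
# q: "Paris has a population of over two million"
conjunction = lambda p, q: p and q
second_conjunct = lambda p, q: q

# From "p and q" infer "q": no counterexample exists.
print(is_valid([conjunction], second_conjunct, 2))  # True

# The reverse inference, from "q" alone to "p and q", has a counterexample
# (p false, q true), so it is not valid.
print(is_valid([second_conjunct], conjunction, 2))  # False
```

The second call is included only for contrast: it shows that the check can fail, so the first result is not trivial.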

Propositional logic largely involves studying logical connectives such as the words “and” and “or” and the rules determining the truth-values of the propositions they are used to join, as well as what these rules mean for the validity of arguments, and such logical relationships between statements as being consistent or inconsistent with one another, as well as logical properties of propositions, such as being tautologically true, being contingent, and being self-contradictory. (These notions are defined below.)

Propositional logic also studies ways of modifying statements, such as the addition of the word “not” that is used to change an affirmative statement into a negative statement. Here, the fundamental logical principle involved is that if a given affirmative statement is true, the negation of that statement is false, and if a given affirmative statement is false, the negation of that statement is true.

What is distinctive about propositional logic as opposed to other (typically more complicated) branches of logic is that propositional logic does not deal with logical relationships and properties that involve the parts of a statement smaller than the simple statements making it up. Therefore, propositional logic does not study those logical characteristics of the propositions below in virtue of which they constitute a valid argument:

  1. George W. Bush is a president of the United States.
  2. George W. Bush is a son of a president of the United States.
  3. Therefore, there is someone who is both a president of the United States and a son of a president of the United States.

The recognition that the above argument is valid requires one to recognize that the subject in the first premise is the same as the subject in the second premise. However, in propositional logic, simple statements are considered as indivisible wholes, and those logical relationships and properties that involve parts of statements such as their subjects and predicates are not taken into consideration.

Propositional logic can be thought of as primarily the study of logical operators. A logical operator is any word or phrase used either to modify one statement to make a different statement, or join multiple statements together to form a more complicated statement. In English, words such as “and”, “or”, “not”, “if … then…”, “because”, and “necessarily”, are all operators.

A logical operator is said to be truth-functional if the truth-values (the truth or falsity, etc.) of the statements it is used to construct always depend entirely on the truth or falsity of the statements from which they are constructed. The English words “and”, “or” and “not” are (at least arguably) truth-functional, because a compound statement joined together with the word “and” is true if both the statements so joined are true, and false if either or both are false, a compound statement joined together with the word “or” is true if at least one of the joined statements is true, and false if both joined statements are false, and the negation of a statement is true if and only if the statement negated is false.
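
Because a truth-functional operator is fully determined by the truth-values of its inputs, it can be modeled as an ordinary function on the two values true and false. The sketch below, in Python, simply tabulates the three operators as just described; the function names `conj`, `disj` and `neg` are illustrative labels chosen here.

```python
from itertools import product

def conj(p, q):
    """'and': true only if both joined statements are true."""
    return p and q

def disj(p, q):
    """'or' (inclusive): true if at least one joined statement is true."""
    return p or q

def neg(p):
    """'not': true if and only if the negated statement is false."""
    return not p

# One row per combination of truth-values: (p, q, p and q, p or q)
rows = [(p, q, conj(p, q), disj(p, q))
        for p, q in product([True, False], repeat=2)]
for row in rows:
    print(row)
```

Each row lists the truth-values of the two component statements followed by the values of their conjunction and disjunction.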

Some logical operators are not truth-functional. One example of an operator in English that is not truth-functional is the word “necessarily”. Whether a statement formed using this operator is true or false does not depend entirely on the truth or falsity of the statement to which the operator is applied. For example, both of the following statements are true:

  • 2 + 2 = 4.
  • Someone is reading an article in a philosophy encyclopedia.

However, let us now consider the corresponding statements modified with the operator “necessarily”:

  • Necessarily, 2 + 2 = 4.
  • Necessarily, someone is reading an article in a philosophy encyclopedia.

Here, the first example is true but the second example is false. Hence, the truth or falsity of a statement using the operator “necessarily” does not depend entirely on the truth or falsity of the statement modified.
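
The point can be put in terms of functions: if “necessarily” were truth-functional, the same input truth-value would always yield the same output truth-value. The following Python sketch records the two examples above as input/output pairs and checks that condition; the helper `is_truth_functional` is a hypothetical name introduced for illustration.

```python
# Each pair is (truth-value of the statement,
#               truth-value of "Necessarily, <statement>").
necessarily_observations = [
    (True, True),    # "2 + 2 = 4"
    (True, False),   # "Someone is reading an article in a philosophy encyclopedia"
]

# For comparison, the behavior of "not" on a true and a false statement.
negation_observations = [
    (True, False),
    (False, True),
]

def is_truth_functional(pairs):
    """True if each input truth-value is always paired with the same output."""
    seen = {}
    for inp, out in pairs:
        if seen.setdefault(inp, out) != out:
            return False
    return True

print(is_truth_functional(necessarily_observations))  # False
print(is_truth_functional(negation_observations))     # True
```

The check can only refute truth-functionality from a set of examples; passing it on a few examples, as “not” does here, is consistent with truth-functionality but does not by itself establish it.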

Truth-functional propositional logic is that branch of propositional logic that limits itself to the study of truth-functional operators. Classical (or “bivalent”) truth-functional propositional logic is that branch of truth-functional propositional logic that assumes that there are only two possible truth-values a statement (whether simple or complex) can have: (1) truth, and (2) falsity, and that every statement is either true or false but not both.

Classical truth-functional propositional logic is by far the most widely studied branch of propositional logic, and for this reason, most of the remainder of this article focuses exclusively on this area of logic. In addition to classical truth-functional propositional logic, there are other branches of propositional logic that study logical operators, such as “necessarily”, that are not truth-functional. There are also “non-classical” propositional logics in which such possibilities as (i) a proposition’s having a truth-value other than truth or falsity, (ii) a proposition’s having an indeterminate truth-value or lacking a truth-value altogether, and sometimes even (iii) a proposition’s being both true and false, are considered. (For more information on these alternative forms of propositional logic, consult Section 8 below.)

2. History

The serious study of logic as an independent discipline began with the work of Aristotle (384-322 BCE). Generally, however, Aristotle’s sophisticated writings on logic dealt with the logic of categories and quantifiers such as “all” and “some”, which are not treated in propositional logic. However, in his metaphysical writings, Aristotle espoused two principles of great importance in propositional logic, which have since come to be called the Law of Excluded Middle and the Law of Contradiction. Interpreted in propositional logic, the first is the principle that every statement is either true or false; the second is the principle that no statement is both true and false. These are, of course, cornerstones of classical propositional logic. There is some evidence that Aristotle, or at least his successor at the Lyceum, Theophrastus (d. 287 BCE), did recognize a need for the development of a doctrine of “complex” or “hypothetical” propositions, that is, those involving conjunctions (statements joined by “and”), disjunctions (statements joined by “or”) and conditionals (statements joined by “if… then…”), but their investigations into this branch of logic seem to have been very minor.

More serious attempts to study statement operators such as “and”, “or” and “if… then…” were conducted by the Stoic philosophers in the late 3rd century BCE. Since most of their original works—if indeed, these writings were even produced—are lost, we cannot make many definite claims about exactly who first made investigations into what areas of propositional logic, but we do know from the writings of Sextus Empiricus that Diodorus Cronus and his pupil Philo had engaged in a protracted debate about whether the truth of a conditional statement depends entirely on it not being the case that its antecedent (if-clause) is true while its consequent (then-clause) is false, or whether it requires some sort of stronger connection between the antecedent and consequent—a debate that continues to have relevance for modern discussion of conditionals. The Stoic philosopher Chrysippus (roughly 280-205 BCE) perhaps did the most in advancing Stoic propositional logic, by marking out a number of different ways of forming complex premises for arguments, and for each, listing valid inference schemata. Chrysippus suggested that the following inference schemata are to be considered the most basic:

  1. If the first, then the second; but the first; therefore the second.
  2. If the first, then the second; but not the second; therefore, not the first.
  3. Not both the first and the second; but the first; therefore, not the second.
  4. Either the first or the second [and not both]; but the first; therefore, not the second.
  5. Either the first or the second; but not the second; therefore the first.

Inference rules such as the above correspond very closely to the basic principles in a contemporary system of natural deduction for propositional logic. For example, the first two rules correspond to the rules of modus ponens and modus tollens, respectively. These basic inference schemata were expanded upon by less basic inference schemata by Chrysippus himself and other Stoics, and are preserved in the work of Diogenes Laertius, Sextus Empiricus and later, in the work of Cicero.
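
Read through the lens of classical truth-functional logic, each of the five schemata can be confirmed valid by exhaustively checking truth-value assignments. The Python sketch below rests on two assumptions that are interpretive choices made here rather than claims about Stoic doctrine: “if… then…” is read as the material conditional, and “either… or…” is read exclusively, as the bracketed gloss in the fourth schema suggests. The helper names are illustrative.

```python
from itertools import product

def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b

def xor(a, b):
    """Exclusive disjunction: exactly one of a, b is true."""
    return a != b

def is_valid(premises, conclusion):
    """True if the conclusion holds in every assignment satisfying all premises."""
    return all(conclusion(a, b)
               for a, b in product([True, False], repeat=2)
               if all(p(a, b) for p in premises))

first = lambda a, b: a
second = lambda a, b: b
not_first = lambda a, b: not a
not_second = lambda a, b: not b

# Schema number -> (premises, conclusion), following the list above.
schemata = {
    1: ([implies, first], second),                           # modus ponens
    2: ([implies, not_second], not_first),                   # modus tollens
    3: ([lambda a, b: not (a and b), first], not_second),
    4: ([xor, first], not_second),
    5: ([xor, not_second], first),
}

results = {n: is_valid(prem, concl) for n, (prem, concl) in schemata.items()}
print(results)  # every schema comes out valid under these readings

# Under an inclusive reading of "or", the fourth schema would fail:
inclusive_or = lambda a, b: a or b
print(is_valid([inclusive_or, first], not_second))  # False
```

The final check illustrates why the exclusive reading matters for the fourth schema, while the fifth remains valid on either reading.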

Advances on the work of the Stoics were undertaken in small steps in the centuries that followed. This work was done by, for example, the second century logician Galen (roughly 129-210 CE), the sixth century philosopher Boethius (roughly 480-525 CE) and later by medieval thinkers such as Peter Abelard (1079-1142) and William of Ockham (1288-1347), and others. Much of their work involved producing better formalizations of the principles of Aristotle or Chrysippus, introducing improved terminology and furthering the discussion of the relationships between operators. Abelard, for example, seems to have been the first to clearly differentiate exclusive disjunction from inclusive disjunction (discussed below), and to suggest that inclusive disjunction is the more important notion for the development of a relatively simple logic of disjunctions.

The next major step forward in the development of propositional logic came only much later with the advent of symbolic logic in the work of logicians such as Augustus De Morgan (1806-1871) and, especially, George Boole (1815-1864) in the mid-19th century. Boole was primarily interested in developing a mathematical-style “algebra” to replace Aristotelian syllogistic logic, primarily by employing the numeral “1” for the universal class, the numeral “0” for the empty class, the multiplication notation “xy” for the intersection of classes x and y, the addition notation “x + y” for the union of classes x and y, etc., so that statements of syllogistic logic could be treated in quasi-mathematical fashion as equations; for example, “No x is y” could be written as “xy = 0”. However, Boole noticed that if an equation such as “x = 1” is read as “x is true”, and “x = 0” is read as “x is false”, the rules given for his logic of classes can be transformed into a logic for propositions, with “x + y = 1” reinterpreted as saying that either x or y is true, and “xy = 1” reinterpreted as meaning that x and y are both true. Boole’s work sparked rapid interest in logic among mathematicians. Later, “Boolean algebras” were used to form the basis of the truth-functional propositional logics utilized in computer design and programming.
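
The numerical reading of truth and falsity can be illustrated with ordinary arithmetic on 0 and 1. The Python sketch below tabulates “xy” and “x + y” for each pair of values. It makes one point that goes beyond the text: with ordinary integer addition, “x + y = 1” holds only when exactly one of x and y is 1, so the sketch also includes the expression x + y − xy, which equals 1 whenever at least one of them is 1. That third expression is an assumption added here for comparison and is not attributed to Boole.

```python
from itertools import product

def boole_table():
    """Rows of (x, y, x*y, x+y, x+y-x*y) for x, y in {0, 1}."""
    return [(x, y, x * y, x + y, x + y - x * y)
            for x, y in product([0, 1], repeat=2)]

for x, y, prod_xy, sum_xy, incl in boole_table():
    both_true = (prod_xy == 1)     # "xy = 1": x and y are both true
    exactly_one = (sum_xy == 1)    # "x + y = 1" with ordinary addition
    at_least_one = (incl == 1)     # inclusive variant (assumption of this sketch)
    print(x, y, both_true, exactly_one, at_least_one)
```

Only the row with x = 1 and y = 1 separates the last two columns, which is where the exclusive and inclusive readings of “or” come apart.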

In the late 19th century, Gottlob Frege (1848-1925) presented logic as a branch of systematic inquiry more fundamental than mathematics or algebra, and presented the first modern axiomatic calculus for logic in his 1879 work Begriffsschrift. While it covered more than propositional logic, from Frege’s axiomatization it is possible to distill the first complete axiomatization of classical truth-functional propositional logic. Frege was also the first to systematically argue that all truth-functional connectives could be defined in terms of negation and the material conditional.

In the early 20th century, Bertrand Russell gave a different complete axiomatization of propositional logic, considered on its own, in his 1906 paper “The Theory of Implication”, and later, along with A. N. Whitehead, produced another axiomatization using disjunction and negation as primitives in the 1910 work Principia Mathematica. Proof of the possibility of defining all truth functional operators in virtue of a single binary operator was first published by American logician H. M. Sheffer in 1913, though American logician C. S. Peirce (1839-1914) seems to have discovered this decades earlier. In 1917, French logician Jean Nicod discovered that it was possible to axiomatize propositional logic using the Sheffer stroke and only a single axiom schema and single inference rule.

The notion of a “truth table” is often utilized in the discussion of truth-functional connectives (discussed below). It seems to have been at least implicit in the work of Peirce, W. S. Jevons (1835-1882), Lewis Carroll (1832-1898), John Venn (1834-1923), and Allan Marquand (1853-1924). Truth tables appear explicitly in writings by Eugen Müller as early as 1909. Their use gained rapid popularity in the early 1920s, perhaps due to the combined influence of the work of Emil Post, whose 1921 work makes liberal use of them, and Ludwig Wittgenstein’s 1921 Tractatus Logico-Philosophicus, in which truth tables and truth-functionality are prominently featured.

Systematic inquiry into axiomatic systems for propositional logic and related metatheory was conducted in the 1920s, 1930s and 1940s by David Hilbert, Paul Bernays, Alfred Tarski, Jan Łukasiewicz, Kurt Gödel, Alonzo Church, and others. It was during this period that most of the important metatheoretic results, such as those discussed in Section VII, were discovered.

Complete natural deduction systems for classical truth-functional propositional logic were developed and popularized in the work of Gerhard Gentzen in the mid-1930s, and subsequently introduced into influential textbooks such as that of F. B. Fitch (1952) and Irving Copi (1953).

Modal propositional logics are the most widely studied form of non-truth-functional propositional logic. While interest in modal logic dates back to Aristotle, by contemporary standards the first systematic inquiry into modal propositional logic can be found in the work of C. I. Lewis in 1912 and 1913. Among other well-known forms of non-truth-functional propositional logic, deontic logic began with the work of Ernst Mally in 1926, and epistemic logic was first treated systematically by Jaakko Hintikka in the early 1960s. The modern study of three-valued propositional logic began in the work of Jan Łukasiewicz in 1917, and other forms of non-classical propositional logic soon followed suit. Relevance propositional logic is more recent, dating from the mid-1970s in the work of A. R. Anderson and N. D. Belnap. Paraconsistent logic, while having its roots in the work of Łukasiewicz and others, has blossomed into an independent area of research only recently, mainly due to work undertaken by N. C. A. da Costa, Graham Priest and others in the 1970s and 1980s.

3. The Language of Propositional Logic

The basic rules and principles of classical truth-functional propositional logic are, among contemporary logicians, almost entirely agreed upon, and capable of being stated in a definitive way. This is most easily done if we utilize a simplified logical language that deals only with simple statements considered as indivisible units as well as complex statements joined together by means of truth-functional connectives. We first consider a language called PL for “Propositional Logic”. Later we shall consider two even simpler languages, PL’ and PL”.

a. Syntax and Formation Rules of PL

In any ordinary language, a statement would never consist of a single word, but would always at the very least consist of a noun or pronoun along with a verb. However, because propositional logic does not consider smaller parts of statements, and treats simple statements as indivisible wholes, the language PL uses uppercase letters ‘A‘, ‘B‘, ‘C‘, etc., in place of complete statements. The logical signs ‘\land‘, ‘\lor‘, ‘→’, ‘↔’, and ‘\neg‘ are used in place of the truth-functional operators, “and”, “or”, “if… then…”, “if and only if”, and “not”, respectively. So, consider again the following example argument, mentioned in Section I.

Paris is the capital of France and Paris has a population of over two million.
Therefore, Paris has a population of over two million.

If we use the letter ‘C‘ as our translation of the statement “Paris is the capital of France” in PL, and the letter ‘P‘ as our translation of the statement “Paris has a population of over two million”, and use a horizontal line to separate the premise(s) of an argument from the conclusion, the above argument could be symbolized in language PL as follows:

\( \begin{array}{l} C \land P\\ \hline P \end{array} \)

In addition to statement letters like ‘C‘ and ‘P‘ and the operators, the only other signs that sometimes appear in the language PL are parentheses which are used in forming even more complex statements. Consider the English compound sentence, “Paris is the most important city in France if and only if Paris is the capital of France and Paris has a population of over two million.” If we use the letter ‘I‘ in language PL to mean that Paris is the most important city in France, this sentence would be translated into PL as follows:

I \leftrightarrow (C \land P)

The parentheses are used to group together the statements ‘C‘ and ‘P‘ and differentiate the above statement from the one that would be written as follows:

(I \leftrightarrow C) \land P

This latter statement asserts that Paris is the most important city in France if and only if it is the capital of France, and (separate from this), Paris has a population of over two million. The difference between the two is subtle, but important logically.

It is important to describe the syntax and make-up of statements in the language PL in a precise manner, and give some definitions that will be used later on. Before doing this, it is worthwhile to distinguish the language in which we will be discussing PL, namely, English, from PL itself. Whenever one language is used to discuss another, the language in which the discussion takes place is called the metalanguage, and the language under discussion is called the object language. In this context, the object language is the language PL, and the metalanguage is English, or to be more precise, English supplemented with certain special devices that are used to talk about language PL. It is possible in English to talk about words and sentences in other languages, and when we do, we place the words or sentences we wish to talk about in quotation marks. Therefore, using ordinary English, I can say that “parler” is a French verb, and “I \land C” is a statement of PL. The following expression is part of PL, not English:

(I \leftrightarrow C) \land P

However, the following expression is a part of English; in particular, it is the English name of a PL sentence:

“(I \leftrightarrow C) \land P”

This point may seem rather trivial, but it is easy to become confused if one is not careful.

In our metalanguage, we shall also be using certain variables that are used to stand for arbitrary expressions built from the basic symbols of PL. In what follows, the Greek letters ‘\alpha‘, ‘\beta‘, and so on, are used for any object language (PL) expression of a certain designated form. For example, later on, we shall say that, if \alpha is a statement of PL, then so is \ulcorner \neg \alpha \urcorner. Notice that ‘\alpha‘ itself is not a symbol that appears in PL; it is a symbol used in English to speak about symbols of PL. We will also be making use of so-called “Quine corners”, written ‘\ulcorner‘ and ‘\urcorner‘, which are a special metalinguistic device used to speak about object language expressions constructed in a certain way. Suppose \alpha is the statement “(I \leftrightarrow C)” and \beta is the statement “(P \land C)“; then \ulcorner \alpha \lor \beta \urcorner is the complex statement “(I \leftrightarrow C) \lor (P \land C)“.

Let us now proceed to giving certain definitions used in the metalanguage when speaking of the language PL.

Definition: A statement letter of PL is defined as any uppercase letter written with or without a numerical subscript.

Note: According to this definition, ‘A‘, ‘B‘, ‘B_2‘, ‘C_3‘, and ‘P_{14}‘ are examples of statement letters. The numerical subscripts are used just in case we need to deal with more than 26 simple statements: in that case, we can use ‘P_1‘ to mean something different than ‘P_2‘, and so forth.

Definition: A connective or operator of PL is any of the signs ‘\neg‘, ‘\land‘, ‘\lor‘, ‘→’, and ‘↔’.

Definition: A well-formed formula (hereafter abbreviated as wff) of PL is defined recursively as follows:

  1. Any statement letter is a well-formed formula.
  2. If \alpha is a well-formed formula, then so is \ulcorner \neg \alpha \urcorner.
  3. If \alpha and \beta are well-formed formulas, then so is \ulcorner (\alpha \land \beta) \urcorner.
  4. If \alpha and \beta are well-formed formulas, then so is \ulcorner (\alpha \lor \beta) \urcorner.
  5. If \alpha and \beta are well-formed formulas, then so is \ulcorner (\alpha \rightarrow \beta) \urcorner.
  6. If \alpha and \beta are well-formed formulas, then so is \ulcorner (\alpha \leftrightarrow \beta) \urcorner.
  7. Nothing that cannot be constructed by successive steps of (1)-(6) is a well-formed formula.

Note: According to part (1) of this definition, the statement letters ‘C‘, ‘P‘ and ‘M‘ are wffs. Because ‘C‘ and ‘P‘ are wffs, by part (3), “(C \land P)” is a wff. Because it is a wff, and ‘M‘ is also a wff, by part (6), “(M \leftrightarrow (C \land P))” is a wff. It is conventional to regard the outermost parentheses on a wff as optional, so that “M \leftrightarrow (C \land P)” is treated as an abbreviated form of “(M \leftrightarrow (C \land P))“. However, whenever a shorter wff is used in constructing a more complicated wff, the parentheses on the shorter wff are necessary.

The notion of a well-formed formula should be understood as corresponding to the notion of a grammatically correct or properly constructed statement of language PL. This definition tells us, for example, that “\neg (Q \lor \neg R)” is grammatical for PL because it is a well-formed formula, whereas the string of symbols, “\neg Q \neg \lor ( \leftrightarrow P \land“, while consisting entirely of symbols used in PL, is not grammatical because it is not well-formed.
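The recursive definition above lends itself to a mechanical check. The following is a minimal sketch, not part of PL itself: it assumes hypothetical ASCII spellings for the signs (`~` for negation, `&` for conjunction, `v` for disjunction, `->` and `<->` for the two arrows), and the function names `tokenize`, `parse` and `is_wff` are our own. It applies the definition strictly, so outermost parentheses are required.

```python
import re

# Assumed ASCII spellings (not part of PL): ~ & v -> <->
TOKEN = re.compile(r"\s*([A-Z]\d*|<->|->|[~&v()])")
BINARY = {"&", "v", "->", "<->"}


def tokenize(text):
    """Split text into PL symbols; return None on a foreign character."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            return None
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens, i=0):
    """Return the index just past one wff starting at tokens[i], else None."""
    if i >= len(tokens):
        return None
    tok = tokens[i]
    if re.fullmatch(r"[A-Z]\d*", tok):   # clause 1: a statement letter
        return i + 1
    if tok == "~":                        # clause 2: negation of a wff
        return parse(tokens, i + 1)
    if tok == "(":                        # clauses 3-6: (alpha op beta)
        j = parse(tokens, i + 1)
        if j is None or j >= len(tokens) or tokens[j] not in BINARY:
            return None
        k = parse(tokens, j + 1)
        if k is None or k >= len(tokens) or tokens[k] != ")":
            return None
        return k + 1
    return None                           # clause 7: nothing else is a wff


def is_wff(text):
    """True if the whole string is exactly one well-formed formula."""
    tokens = tokenize(text)
    return tokens is not None and parse(tokens) == len(tokens)
```

Under these assumptions, `is_wff("(M <-> (C & P))")` and `is_wff("~(Q v ~R)")` hold, while the abbreviated form `"M <-> (C & P)"` is rejected because the sketch does not implement the convention of dropping outermost parentheses.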

b. Truth Functions and Truth Tables

So far we have in effect described the grammar of language PL. When setting up a language fully, however, it is necessary not only to establish rules of grammar, but also describe the meanings of the symbols used in the language. We have already suggested that uppercase letters are used as complete simple statements. Because truth-functional propositional logic does not analyze the parts of simple statements, and only considers those ways of combining them to form more complicated statements that make the truth or falsity of the whole dependent entirely on the truth or falsity of the parts, in effect, it does not matter what meaning we assign to the individual statement letters like ‘P‘, ‘Q‘ and ‘R‘, etc., provided that each is taken as either true or false (and not both).

However, more must be said about the meaning, or semantics, of the logical operators ‘\land‘, ‘\lor‘, ‘→’, ‘↔’, and ‘\neg‘. As mentioned above, these are used in place of the English words, ‘and’, ‘or’, ‘if… then…’, ‘if and only if’, and ‘not’, respectively. However, the correspondence is really only rough, because the operators of PL are considered to be entirely truth-functional, whereas their English counterparts are not always used truth-functionally. Consider, for example, the following statements:

  1. If Bob Dole is president of the United States in 2004, then the president of the United States in 2004 is a member of the Republican party.
  2. If Al Gore is president of the United States in 2004, then the president of the United States in 2004 is a member of the Republican party.

For those familiar with American politics, it is tempting to regard the English sentence (1) as true, but to regard (2) as false, since Dole is a Republican but Gore is not. But notice that in both cases, the simple statement in the “if” part of the “if… then…” statement is false, and the simple statement in the “then” part of the statement is true. This shows that the English operator “if… then…” is not fully truth-functional. However, all the operators of language PL are entirely truth-functional, so the sign ‘→’, though similar in many ways to the English “if… then…” is not in all ways the same. More is said about this operator below.

Since our study is limited to the ways in which the truth-values of complex statements depend on the truth-values of the parts, for each operator, the only aspect of its meaning relevant in this context is its associated truth-function. The truth-function for an operator can be represented as a table, each line of which expresses a possible combination of truth-values for the simpler statements to which the operator applies, along with the resulting truth-value for the complex statement formed using the operator.

The signs ‘\land‘, ‘\lor‘, ‘→’, ‘↔’, and ‘\neg‘, correspond, respectively, to the truth-functions of conjunction, disjunction, material implication, material equivalence, and negation. We shall consider these individually.

Conjunction: The conjunction of two statements \alpha and \beta, written in PL as \ulcorner (\alpha \land \beta) \urcorner, is true if both \alpha and \beta are true, and is false if either \alpha is false or \beta is false or both are false. In effect, the meaning of the operator ‘\land‘ can be displayed according to the following chart, which shows the truth-value of the conjunction depending on the four possibilities of the truth-values of the parts:

\( \begin{array}{cc|c}
\alpha & \beta & (\alpha \land \beta)\\ \hline
T & T & T\\
T & F & F\\
F & T & F\\
F & F & F
\end{array} \)

Conjunction using the operator ‘\land‘ is language PL’s rough equivalent of joining statements together with ‘and’ in English. In a statement of the form \ulcorner (\alpha \land \beta) \urcorner, the two statements joined together, \alpha and \beta, are called the conjuncts, and the whole statement is called a conjunction.

Instead of the sign ‘\land‘, some other logical works use the signs ‘\&‘ or ‘\bullet‘ for conjunction.

Disjunction: The disjunction of two statements \alpha and \beta, written in PL as \ulcorner (\alpha \lor \beta) \urcorner, is true if either \alpha is true or \beta is true, or both \alpha and \beta are true, and is false only if both \alpha and \beta are false. A chart similar to that given above for conjunction, modified to show the meaning of the disjunction sign ‘\lor‘ instead, would be drawn as follows:

\( \begin{array}{cc|c}
\alpha & \beta & (\alpha \lor \beta)\\ \hline
T & T & T\\
T & F & T\\
F & T & T\\
F & F & F
\end{array} \)

This is language PL’s rough equivalent of joining statements together with the word ‘or’ in English. However, it should be noted that the sign ‘\lor‘ is used for disjunction in the inclusive sense. Sometimes when the word ‘or’ is used to join together two English statements, we only regard the whole as true if one side or the other is true, but not both, as when the statement “Either we can buy the toy robot, or we can buy the toy truck; you must choose!” is spoken by a parent to a child who wants both toys. This is called the exclusive sense of ‘or’. However, in PL, the sign ‘\lor‘ is used inclusively, and is more analogous to the English word ‘or’ as it appears in a statement such as (for example, said about someone who has just received a perfect score on the SAT), “either she studied hard, or she is extremely bright”, which does not mean to rule out the possibility that she both studied hard and is bright. In a statement of the form \ulcorner (\alpha \lor \beta) \urcorner, the two statements joined together, \alpha and \beta, are called the disjuncts, and the whole statement is called a disjunction.

Material Implication: This truth-function is represented in language PL with the sign ‘→’. A statement of the form \ulcorner (\alpha \rightarrow \beta) \urcorner, is false if \alpha is true and \beta is false, and is true if either \alpha is false or \beta is true (or both). This truth-function generates the following chart:

\( \begin{array}{cc|c}
\alpha & \beta & (\alpha \rightarrow \beta)\\ \hline
T & T & T\\
T & F & F\\
F & T & T\\
F & F & T
\end{array} \)

Because the truth of a statement of the form \ulcorner (\alpha \rightarrow \beta) \urcorner rules out the possibility of \alpha being true and \beta being false, there is some similarity between the operator ‘→’ and the English phrase, “if… then…”, which is also used to rule out the possibility of one statement being true and another false; however, ‘→’ is used entirely truth-functionally, and so, for reasons discussed earlier, it is not entirely analogous with “if… then…” in English. If \alpha is false, then \ulcorner (\alpha \rightarrow \beta) \urcorner is regarded as true, whether or not there is any connection between the falsity of \alpha and the truth-value of \beta. In a statement of the form \ulcorner (\alpha \rightarrow \beta) \urcorner, we call \alpha the antecedent, and we call \beta the consequent, and the whole statement \ulcorner (\alpha \rightarrow \beta) \urcorner is sometimes also called a (material) conditional.

The sign ‘\supset‘ is sometimes used instead of ‘→’ for material implication.

Material Equivalence: This truth-function is represented in language PL with the sign ‘↔’. A statement of the form \ulcorner (\alpha \leftrightarrow \beta) \urcorner is regarded as true if \alpha and \beta are either both true or both false, and is regarded as false if they have different truth-values. Hence, we have the following chart:

\( \begin{array}{cc|c}
\alpha & \beta & (\alpha \leftrightarrow \beta)\\ \hline
T & T & T\\
T & F & F\\
F & T & F\\
F & F & T
\end{array} \)

Since the truth of a statement of the form \ulcorner (\alpha \leftrightarrow \beta) \urcorner requires \alpha and \beta to have the same truth-value, this operator is often likened to the English phrase “…if and only if…”. Again, however, they are not in all ways alike, because ‘↔’ is used entirely truth-functionally. Regardless of what \alpha and \beta are, and what relation (if any) they have to one another, if both are false, \ulcorner (\alpha \leftrightarrow \beta) \urcorner is considered to be true. However, we would not normally regard the statement “Al Gore is the President of the United States in 2004 if and only if Bob Dole is the President of the United States in 2004” as true simply because both simpler statements happen to be false. A statement of the form \ulcorner (\alpha \leftrightarrow \beta) \urcorner is also sometimes referred to as a (material) biconditional.

The sign ‘\equiv‘ is sometimes used instead of ‘↔’ for material equivalence.

Negation: The negation of statement \alpha, simply written \ulcorner \neg \alpha \urcorner in language PL, is regarded as true if \alpha is false, and false if \alpha is true. Unlike the other operators we have considered, negation is applied to a single statement. The corresponding chart can therefore be drawn more simply as follows:

\( \begin{array}{c|c}
\alpha & \neg \alpha\\ \hline
T & F\\
F & T
\end{array} \)

The negation sign ‘\neg‘ bears obvious similarities to the word ‘not’ used in English, as well as similar phrases used to change a statement from affirmative to negative or vice-versa. In logical languages, the signs ‘\sim‘ or ‘-‘ are sometimes used in place of ‘\neg‘.

The five charts together provide the rules needed to determine the truth-value of a given wff in language PL when given the truth-values of the independent statement letters making it up. These rules are very easy to apply in the case of a very simple wff such as “(P \land Q)“. Suppose that ‘P‘ is true, and ‘Q‘ is false; according to the second row of the chart given for the operator, ‘\land‘, we can see that this statement is false.

However, the charts also provide the rules necessary for determining the truth-value of more complicated statements. We have just seen that “(P \land Q)” is false if ‘P‘ is true and ‘Q‘ is false. Consider a more complicated statement that contains this statement as a part, for example, “((P \land Q) \rightarrow \neg R)“, and suppose once again that ‘P‘ is true, and ‘Q‘ is false, and further suppose that ‘R‘ is also false. To determine the truth-value of this complicated statement, we begin by determining the truth-value of the internal parts. The statement “(P \land Q)“, as we have seen, is false. The other substatement, “\neg R“, is true, because ‘R‘ is false, and ‘\neg‘ reverses the truth-value of that to which it is applied. Now we can determine the truth-value of the whole wff, “((P \land Q) \rightarrow \neg R)“, by consulting the chart given above for ‘→’. Here, the wff “(P \land Q)” is our \alpha, and “\neg R” is our \beta, and since their truth-values are F and T, respectively, we consult the third row of the chart, and we see that the complex statement “((P \land Q) \rightarrow \neg R)” is true.
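The step-by-step evaluation just described can be mirrored in a few lines of code. This is only an illustrative sketch: the function names `conj`, `disj`, `impl`, `equiv` and `neg` are our own labels for the five charts, and Python's `True` and `False` stand in for T and F.

```python
# The five truth-functions of PL, written as functions on booleans.
def conj(a, b):
    return a and b          # true only when both parts are true


def disj(a, b):
    return a or b           # inclusive: false only when both are false


def impl(a, b):
    return (not a) or b     # false only when a is true and b is false


def equiv(a, b):
    return a == b           # true when the truth-values match


def neg(a):
    return not a            # reverses the truth-value


# The assignment considered in the text: P true, Q false, R false.
P, Q, R = True, False, False

inner = conj(P, Q)          # truth-value of (P ∧ Q)
negated_r = neg(R)          # truth-value of ¬R
whole = impl(inner, negated_r)  # truth-value of ((P ∧ Q) → ¬R)
```

Working from the innermost parts outward, as here, is the same order in which the charts are consulted by hand.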

We have so far been considering the case in which ‘P‘ is true and ‘Q‘ and ‘R‘ are both false. There are, however, a number of other possibilities with regard to the possible truth-values of the statement letters, ‘P‘, ‘Q‘ and ‘R‘. There are eight possibilities altogether, as shown by the following list:

\( \begin{array}{ccc}
P & Q & R\\ \hline
T & T & T\\
T & T & F\\
T & F & T\\
T & F & F\\
F & T & T\\
F & T & F\\
F & F & T\\
F & F & F
\end{array} \)

Strictly speaking, each of the eight possibilities above represents a different truth-value assignment, which can be defined as a possible assignment of truth-values T or F to the different statement letters making up a wff or series of wffs. If a wff has n distinct statement letters making it up, the number of possible truth-value assignments is 2^n. With the wff, “((P \land Q) \rightarrow \neg R)“, there are three statement letters, ‘P‘, ‘Q‘ and ‘R‘, and so there are 2^3 = 8 truth-value assignments.
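The count of truth-value assignments can be illustrated by generating them. The sketch below uses `itertools.product` from the Python standard library; the helper name `assignments` is our own, and assignments are represented (as an assumption of the sketch) by dictionaries from statement letters to booleans.

```python
from itertools import product


def assignments(letters):
    """Yield every truth-value assignment to the given statement letters."""
    for values in product([True, False], repeat=len(letters)):
        yield dict(zip(letters, values))


# Three statement letters give 2 ** 3 = 8 assignments,
# generated in the same order as the list in the text.
rows = list(assignments(["P", "Q", "R"]))
count = len(rows)
```

Each additional statement letter doubles the number of rows, which is why the count grows as a power of two.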

It then becomes possible to draw a chart showing how the truth-value of a given wff would be resolved for each possible truth-value assignment. We begin with a chart showing all the possible truth-value assignments for the wff, such as the one given above. Next, we write out the wff itself on the top right of our chart, with spaces between the signs. Then, for each truth-value assignment, we repeat the appropriate truth-value, ‘T’ or ‘F’, underneath the statement letters as they appear in the wff. Then, as the truth-values of those wffs that are parts of the complete wff are determined, we write their truth-values underneath the logical sign that is used to form them. The final column filled in shows the truth-value of the entire statement for each truth-value assignment. Given the importance of this column, it is usually highlighted in some way. Here, it is the column beneath the main operator, ‘→’.

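\( \begin{array}{ccc|cccccc}
P & Q & R & ((P & \land & Q) & \rightarrow & \neg & R)\\ \hline
T & T & T & T & T & T & F & F & T\\
T & T & F & T & T & T & T & T & F\\
T & F & T & T & F & F & T & F & T\\
T & F & F & T & F & F & T & T & F\\
F & T & T & F & F & T & T & F & T\\
F & T & F & F & F & T & T & T & F\\
F & F & T & F & F & F & T & F & T\\
F & F & F & F & F & F & T & T & F
\end{array} \)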

Charts such as the one given above are called truth tables. In classical truth-functional propositional logic, a truth table constructed for a given wff in effect reveals everything logically important about that wff. The above chart tells us that the wff “((P \land Q) \rightarrow \neg R)” can only be false if ‘P‘, ‘Q‘ and ‘R‘ are all true, and is true otherwise.
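The final column of such a truth table can also be computed rather than filled in by hand. The following sketch does so for the wff discussed above; the function name `wff` and the variable names are our own, and material implication is rendered (as one conventional choice) by `(not a) or b`.

```python
from itertools import product


def wff(p, q, r):
    """Truth-value of ((P ∧ Q) → ¬R) under one assignment."""
    antecedent = p and q
    consequent = not r
    return (not antecedent) or consequent   # material implication


# One row per truth-value assignment: (P, Q, R, value of the wff).
table = [(p, q, r, wff(p, q, r))
         for p, q, r in product([True, False], repeat=3)]

# The assignments on which the wff comes out false.
falsifying = [(p, q, r) for p, q, r, value in table if not value]

for row in table:
    print(" ".join("T" if cell else "F" for cell in row))
```

The list `falsifying` contains a single assignment, the one on which all three statement letters are true, in agreement with the chart.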

c. Definability of the Operators and the Languages PL’ and PL”

The language PL, as we have seen, contains operators that are roughly analogous to the English operators ‘and’, ‘or’, ‘if… then…’, ‘if and only if’, and ‘not’. Each of these, as we have also seen, can be thought of as representing a certain truth-function. It might be objected, however, that there are other methods of combining statements together in which the truth-value of the statement depends wholly on the truth-values of the parts, or in other words, that there are truth-functions besides conjunction, (inclusive) disjunction, material implication, material equivalence and negation. For example, we noted earlier that the sign ‘\lor‘ is used analogously to ‘or’ in the inclusive sense, which means that language PL has no simple sign for ‘or’ in the exclusive sense. It might therefore be thought that the language PL is incomplete without the addition of an additional symbol, say ‘\veebar‘, such that \ulcorner (\alpha \veebar \beta) \urcorner would be regarded as true if \alpha is true and \beta is false, or \alpha is false and \beta is true, but would be regarded as false if either both \alpha and \beta are true or both \alpha and \beta are false.

However, a possible response to this objection would be to make note that while language PL does not include a simple sign for this exclusive sense of disjunction, it is possible, using the symbols that are included in PL, to construct a statement that is true in exactly the same circumstances. Consider, for example, a statement of the form \ulcorner \neg (\alpha \leftrightarrow \beta) \urcorner. It is easily shown, using a truth table, that any wff of this form would have the same truth-value as a would-be statement using the operator ‘\veebar‘. See the following chart:

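\( \begin{array}{cc|cccc}
\alpha & \beta & \neg & (\alpha & \leftrightarrow & \beta)\\ \hline
T & T & F & T & T & T\\
T & F & T & T & F & F\\
F & T & T & F & F & T\\
F & F & F & F & T & F
\end{array} \)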

Here we see that a wff of the form \ulcorner \neg (\alpha \leftrightarrow \beta) \urcorner is true if either \alpha or \beta is true but not both. This shows that PL is not lacking in any way by not containing a sign ‘\veebar‘. All the work that one would wish to do with this sign can be done using the signs ‘↔’ and ‘\neg‘. Indeed, one might claim that the sign ‘\veebar‘ can be defined in terms of the signs ‘↔’, and ‘\neg‘, and then use the form \ulcorner (\alpha \veebar \beta) \urcorner as an abbreviation of a wff of the form \ulcorner \neg (\alpha \leftrightarrow \beta) \urcorner, without actually expanding the primitive vocabulary of language PL.
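The claim that the two forms agree on every truth-value assignment can be checked exhaustively, since there are only four assignments to consider. In this sketch the function names `exclusive_or` and `negated_biconditional` are our own; the first transcribes the intended meaning of ‘⊻’ given earlier, the second the form built from ‘↔’ and ‘¬’.

```python
from itertools import product


def exclusive_or(a, b):
    """Intended meaning of the would-be sign: exactly one part is true."""
    return (a and not b) or (not a and b)


def negated_biconditional(a, b):
    """The PL form: negation of the material equivalence of a and b."""
    return not (a == b)


# Compare the two on all four truth-value assignments.
agree = all(exclusive_or(a, b) == negated_biconditional(a, b)
            for a, b in product([True, False], repeat=2))
```

A check of this kind is simply a truth table compared column against column.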

The signs ‘\land‘, ‘\lor‘, ‘→’, ‘↔’ and ‘\neg‘, were chosen as the operators to include in PL because they correspond (roughly) to the sorts of truth-functional operators that are most often used in ordinary discourse and reasoning. However, given the preceding discussion, it is natural to ask whether or not some operators on this list can be defined in terms of the others. It turns out that they can. In fact, if for some reason we wished our logical language to have a more limited vocabulary, it is possible to get by using only the signs ‘\neg‘ and ‘→’, and define all other possible truth-functions in virtue of them. Consider, for example, the following truth table for statements of the form \ulcorner \neg (\alpha \rightarrow \neg \beta) \urcorner:

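\( \begin{array}{cc|ccccc}
\alpha & \beta & \neg & (\alpha & \rightarrow & \neg & \beta)\\ \hline
T & T & T & T & F & F & T\\
T & F & F & T & T & T & F\\
F & T & F & F & T & F & T\\
F & F & F & F & T & T & F
\end{array} \)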

We can see from the above that a wff of the form \ulcorner \neg (\alpha \rightarrow \neg \beta) \urcorner always has the same truth-value as the corresponding statement of the form \ulcorner (\alpha \land \beta) \urcorner. This shows that the sign ‘\land‘ can in effect be defined using the signs ‘\neg‘ and ‘→’.

Next, consider the truth table for statements of the form \ulcorner (\neg \alpha \rightarrow \beta) \urcorner:

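\( \begin{array}{cc|cccc}
\alpha & \beta & (\neg & \alpha & \rightarrow & \beta)\\ \hline
T & T & F & T & T & T\\
T & F & F & T & T & F\\
F & T & T & F & T & T\\
F & F & T & F & F & F
\end{array} \)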

Here we can see that a statement of the form \ulcorner (\neg \alpha \rightarrow \beta) \urcorner always has the same truth-value as the corresponding statement of the form \ulcorner (\alpha \lor \beta) \urcorner. Again, this shows that the sign ‘\lor‘ could in effect be defined using the signs ‘→’ and ‘\neg‘.

Lastly, consider the truth table for a statement of the form \ulcorner \neg (( \alpha \rightarrow \beta) \rightarrow \neg (\beta \rightarrow \alpha)) \urcorner:

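\( \begin{array}{cc|ccccccccc}
\alpha & \beta & \neg & ((\alpha & \rightarrow & \beta) & \rightarrow & \neg & (\beta & \rightarrow & \alpha))\\ \hline
T & T & T & T & T & T & F & F & T & T & T\\
T & F & F & T & F & F & T & F & F & T & T\\
F & T & F & F & T & T & T & T & T & F & F\\
F & F & T & F & T & F & F & F & F & T & F
\end{array} \)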

From the above, we see that a statement of the form \ulcorner \neg (( \alpha \rightarrow \beta) \rightarrow \neg (\beta \rightarrow \alpha)) \urcorner always has the same truth-value as the corresponding statement of the form \ulcorner (\alpha \leftrightarrow \beta) \urcorner. In effect, therefore, we have shown that the remaining operators of PL can all be defined in virtue of ‘→’, and ‘\neg‘, and that, if we wished, we could do away with the operators, ‘\land‘, ‘\lor‘ and ‘↔’, and simply make do with those equivalent expressions built up entirely from ‘→’ and ‘\neg‘.
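The three definitions can be confirmed together by brute force. The sketch below takes only `neg` and `impl` as given (our own names for ‘¬’ and ‘→’), builds the three defined forms from them, and compares each against Python's built-in `and`, `or` and `==` on all four assignments.

```python
from itertools import product


def neg(a):
    return not a


def impl(a, b):
    return (not a) or b     # material implication


# The defined forms, built from neg and impl alone.
def and_def(a, b):
    return neg(impl(a, neg(b)))                        # ¬(α → ¬β)


def or_def(a, b):
    return impl(neg(a), b)                             # (¬α → β)


def iff_def(a, b):
    return neg(impl(impl(a, b), neg(impl(b, a))))      # ¬((α → β) → ¬(β → α))


pairs = list(product([True, False], repeat=2))
checks = {
    "and": all(and_def(a, b) == (a and b) for a, b in pairs),
    "or": all(or_def(a, b) == (a or b) for a, b in pairs),
    "iff": all(iff_def(a, b) == (a == b) for a, b in pairs),
}
```

Every entry of `checks` comes out true, matching the three truth tables above.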

Let us call the language that results from this simplification PL’. While the definition of a statement letter remains the same for PL’ as for PL, the definition of a well-formed formula (wff) for PL’ can be greatly simplified. In effect, it can be stated as follows:

Definition: A well-formed formula (or wff) of PL’ is defined recursively as follows:

  1. Any statement letter is a well-formed formula.
  2. If \alpha is a well-formed formula, then so is \ulcorner \neg \alpha \urcorner.
  3. If \alpha and \beta are well-formed formulas, then so is \ulcorner (\alpha \rightarrow \beta) \urcorner.
  4. Nothing that cannot be constructed by successive steps of (1)-(3) is a well-formed formula.

Strictly speaking, then, the language PL’ does not contain any statements using the operators ‘\lor‘, ‘\land‘, or ‘↔’. One could, however, utilize conventions such that, in language PL’, an expression of the form \ulcorner (\alpha \land \beta) \urcorner is to be regarded as a mere abbreviation or short-hand for the corresponding statement of the form \ulcorner \neg (\alpha \rightarrow \neg \beta) \urcorner, and similarly that expressions of the forms \ulcorner (\alpha \lor \beta) \urcorner and \ulcorner (\alpha \leftrightarrow \beta) \urcorner are to be regarded as abbreviations of expressions of the forms \ulcorner (\neg \alpha \rightarrow \beta) \urcorner and \ulcorner \neg (( \alpha \rightarrow \beta) \rightarrow \neg (\beta \rightarrow \alpha)) \urcorner, respectively. In effect, this means that it is possible to translate any wff of language PL into an equivalent wff of language PL’.
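The translation from PL into PL’ is itself recursive, and can be sketched as follows. The representation is an assumption made for illustration: a wff is either a string (a statement letter) or a tuple whose first element is one of the hypothetical tags `"not"`, `"and"`, `"or"`, `"imp"`, `"iff"`. The names `to_pl_prime`, `evaluate` and `operators` are likewise our own.

```python
from itertools import product


def to_pl_prime(wff):
    """Rewrite a PL wff so that it uses only 'not' and 'imp'."""
    if isinstance(wff, str):
        return wff
    if wff[0] == "not":
        return ("not", to_pl_prime(wff[1]))
    a, b = to_pl_prime(wff[1]), to_pl_prime(wff[2])
    if wff[0] == "imp":
        return ("imp", a, b)
    if wff[0] == "and":
        return ("not", ("imp", a, ("not", b)))
    if wff[0] == "or":
        return ("imp", ("not", a), b)
    if wff[0] == "iff":
        return ("not", ("imp", ("imp", a, b), ("not", ("imp", b, a))))
    raise ValueError(f"unknown operator: {wff[0]!r}")


def evaluate(wff, values):
    """Truth-value of a wff under an assignment (dict of letter -> bool)."""
    if isinstance(wff, str):
        return values[wff]
    if wff[0] == "not":
        return not evaluate(wff[1], values)
    a, b = evaluate(wff[1], values), evaluate(wff[2], values)
    return {"and": a and b, "or": a or b,
            "imp": (not a) or b, "iff": a == b}[wff[0]]


def operators(wff):
    """Set of operator tags occurring anywhere in a wff."""
    if isinstance(wff, str):
        return set()
    return {wff[0]}.union(*(operators(part) for part in wff[1:]))


# The earlier example: I ↔ (C ∧ P)
original = ("iff", "I", ("and", "C", "P"))
translated = to_pl_prime(original)

agree = all(
    evaluate(original, dict(zip("ICP", vals)))
    == evaluate(translated, dict(zip("ICP", vals)))
    for vals in product([True, False], repeat=3)
)
```

The translated wff is considerably longer than the original, which is one practical reason for retaining the abbreviations.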

In Section VII, it is proven that not only are the operators ‘\neg‘ and ‘→’ sufficient for defining every truth-functional operator included in language PL, but also that they are sufficient for defining any imaginable truth-functional operator in classical propositional logic.

Nevertheless, the choice of ‘\neg‘ and ‘→’ for the primitive signs used in language PL’ is to some extent arbitrary. It would also have been possible to define all other operators of PL (including ‘→’) using the signs ‘\neg‘ and ‘\lor‘. On this approach, \ulcorner (\alpha \land \beta) \urcorner would be defined as \ulcorner \neg (\neg \alpha \lor \neg \beta) \urcorner, \ulcorner (\alpha \rightarrow \beta) \urcorner would be defined as \ulcorner (\neg \alpha \lor \beta) \urcorner, and \ulcorner (\alpha \leftrightarrow \beta) \urcorner would be defined as \ulcorner \neg (\neg(\neg \alpha \lor \beta) \lor \neg (\neg \beta \lor \alpha)) \urcorner. Similarly, we could instead have begun with ‘\neg‘ and ‘\land‘ as our starting operators. On this way of proceeding, \ulcorner (\alpha \lor \beta) \urcorner would be defined as \ulcorner \neg (\neg \alpha \land \neg \beta) \urcorner, \ulcorner (\alpha \rightarrow \beta) \urcorner would be defined as \ulcorner \neg (\alpha \land \neg \beta) \urcorner, and \ulcorner (\alpha \leftrightarrow \beta) \urcorner would be defined as \ulcorner (\neg (\alpha \land \neg \beta) \land \neg (\beta \land \neg \alpha)) \urcorner.
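Both alternative choices of primitives can be checked in the same exhaustive fashion. In the sketch below, the helper `same_truth_function` and the dictionary layout are our own devices; each lambda transcribes one of the definitions just given, using Python's `not`, `or` and `and` for the chosen primitives.

```python
from itertools import product

PAIRS = list(product([True, False], repeat=2))


def same_truth_function(f, g):
    """True if f and g agree on all four truth-value assignments."""
    return all(f(a, b) == g(a, b) for a, b in PAIRS)


# Target truth-functions.
conj = lambda a, b: a and b
disj = lambda a, b: a or b
impl = lambda a, b: (not a) or b
equiv = lambda a, b: a == b

# Definitions using only negation and disjunction.
from_neg_or = {
    "and": same_truth_function(conj, lambda a, b: not ((not a) or (not b))),
    "imp": same_truth_function(impl, lambda a, b: (not a) or b),
    "iff": same_truth_function(
        equiv,
        lambda a, b: not ((not ((not a) or b)) or (not ((not b) or a)))),
}

# Definitions using only negation and conjunction.
from_neg_and = {
    "or": same_truth_function(disj, lambda a, b: not ((not a) and (not b))),
    "imp": same_truth_function(impl, lambda a, b: not (a and (not b))),
    "iff": same_truth_function(
        equiv,
        lambda a, b: (not (a and (not b))) and (not (b and (not a)))),
}
```

All six comparisons succeed, so either pair of signs would serve as well as ‘¬’ and ‘→’.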

There are, as we have seen, multiple different ways of reducing all truth-functional operators down to two primitives. There are also two ways of reducing all truth-functional operators down to a single primitive operator, but they require using an operator that is not included in language PL as primitive. On one approach, we utilize an operator written ‘|’, and explain the truth-function corresponding to this sign by means of the following chart:

\( \begin{array}{cc|c}
\alpha & \beta & (\alpha | \beta)\\
\hline
T & T & F\\
T & F & T\\
F & T & T\\
F & F & T
\end{array} \)

Here we can see that a statement of the form \ulcorner (\alpha | \beta) \urcorner is false if both \alpha and \beta are true, and true otherwise. For this reason one might read ‘|’ as akin to the English expression, “Not both … and …”. Indeed, it is possible to represent this truth-function in language PL using an expression of the form, \ulcorner \neg (\alpha \land \beta) \urcorner. However, since it is our intention to show that all other truth-functional operators, including ‘\neg‘ and ‘\land‘ can be derived from ‘|’, it is better not to regard the meanings of ‘\neg‘ and ‘\land‘ as playing a part of the meaning of ‘|’, and instead attempt (however counterintuitive it may seem) to regard ‘|’ as conceptually prior to ‘\neg‘ and ‘\land‘.

The sign ‘|’ is called the Sheffer stroke, and is named after H. M. Sheffer, who in 1913 first publicized the result that all truth-functional connectives could be defined in virtue of a single operator.

We can then see that the connective ‘\land‘ can be defined in virtue of ‘|’, because an expression of the form \ulcorner ((\alpha | \beta) | (\alpha | \beta)) \urcorner generates the following truth table, and hence is equivalent to the corresponding expression of the form \ulcorner (\alpha \land \beta) \urcorner:

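\( \begin{array}{cc|ccccccc}
\alpha & \beta & ((\alpha & | & \beta) & | & (\alpha & | & \beta))\\
\hline
T & T & T & F & T & T & T & F & T\\
T & F & T & T & F & F & T & T & F\\
F & T & F & T & T & F & F & T & T\\
F & F & F & T & F & F & F & T & F
\end{array} \)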

Similarly, we can define the operator ‘\lor‘ using ‘|’ by noting that an expression of the form \ulcorner ((\alpha | \alpha) | (\beta | \beta)) \urcorner always has the same truth-value as the corresponding statement of the form \ulcorner (\alpha \lor \beta) \urcorner:

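\( \begin{array}{cc|ccccccc}
\alpha & \beta & ((\alpha & | & \alpha) & | & (\beta & | & \beta))\\
\hline
T & T & T & F & T & T & T & F & T\\
T & F & T & F & T & T & F & T & F\\
F & T & F & T & F & T & T & F & T\\
F & F & F & T & F & F & F & T & F
\end{array} \)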

The following truth table shows that a statement of the form \ulcorner (\alpha | (\beta | \beta)) \urcorner always has the same truth table as a statement of the form \ulcorner (\alpha \rightarrow \beta) \urcorner:

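\( \begin{array}{cc|ccccc}
\alpha & \beta & (\alpha & | & (\beta & | & \beta))\\
\hline
T & T & T & T & T & F & T\\
T & F & T & F & F & T & F\\
F & T & F & T & T & F & T\\
F & F & F & T & F & T & F
\end{array} \)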

Although far from intuitively obvious, the following table shows that an expression of the form \ulcorner (((\alpha | \alpha) | (\beta | \beta)) | (\alpha | \beta)) \urcorner always has the same truth-value as the corresponding wff of the form \ulcorner (\alpha \leftrightarrow \beta) \urcorner:

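\( \begin{array}{cc|ccccccccccc}
\alpha & \beta & (((\alpha & | & \alpha) & | & (\beta & | & \beta)) & | & (\alpha & | & \beta))\\
\hline
T & T & T & F & T & T & T & F & T & T & T & F & T\\
T & F & T & F & T & T & F & T & F & F & T & T & F\\
F & T & F & T & F & T & T & F & T & F & F & T & T\\
F & F & F & T & F & F & F & T & F & T & F & T & F
\end{array} \)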

This leaves only the sign ‘\neg‘, which is perhaps the easiest to define using ‘|’, as clearly \ulcorner (\alpha | \alpha) \urcorner, or, roughly, “not both \alpha and \alpha“, has the opposite truth-value from \alpha itself:

\( \begin{array}{c|ccc}
\alpha & (\alpha & | & \alpha)\\
\hline
T & T & F & T\\
F & F & T & F
\end{array} \)
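The five tables above can be reproduced in a few lines of code. This is a minimal sketch, assuming Python booleans for truth-values; the stroke is given by its chart rather than in terms of ‘\neg‘ and ‘\land‘, in keeping with the idea of treating it as primitive. The names `STROKE`, `s` and `DEFINED` are illustrative.

```python
from itertools import product

ROWS = list(product([True, False], repeat=2))

# The stroke, given directly by its chart.
STROKE = {(True, True): False, (True, False): True,
          (False, True): True, (False, False): True}

def s(a, b):
    return STROKE[(a, b)]

# Each entry pairs the usual operator with its stroke-only definition.
DEFINED = {
    "not a": (lambda a, b: not a, lambda a, b: s(a, a)),
    "a and b": (lambda a, b: a and b, lambda a, b: s(s(a, b), s(a, b))),
    "a or b": (lambda a, b: a or b, lambda a, b: s(s(a, a), s(b, b))),
    "a -> b": (lambda a, b: (not a) or b, lambda a, b: s(a, s(b, b))),
    "a <-> b": (lambda a, b: a == b,
                lambda a, b: s(s(s(a, a), s(b, b)), s(a, b))),
}

def same_column(f, g):
    """True if f and g produce the same final column of the truth table."""
    return [f(a, b) for a, b in ROWS] == [g(a, b) for a, b in ROWS]

checks = {name: same_column(usual, stroke_only)
          for name, (usual, stroke_only) in DEFINED.items()}
print(checks)
```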

If, therefore, we desire a language for use in studying propositional logic that has as small a vocabulary as possible, we might suggest using a language that employs the sign ‘|’ as its sole primitive operator, and defines all other truth-functional operators in virtue of it. Let us call such a language PL”. PL” differs from PL and PL’ only in that its definition of a well-formed formula can be simplified even further:

Definition: A well-formed formula (or wff) of PL” is defined recursively as follows:

  1. Any statement letter is a well-formed formula.
  2. If \alpha and \beta are well-formed formulas, then so is \ulcorner (\alpha | \beta) \urcorner.
  3. Nothing that cannot be constructed by successive steps of (1)-(2) is a well-formed formula.

In language PL”, strictly speaking, ‘|’ is the only operator. However, for reasons that should be clear from the above, any expression from language PL that involves any of the operators ‘\neg‘, ‘\land‘, ‘\lor‘, ‘→’, or ‘↔’ could be translated into language PL” without the loss of any of its important logical properties. In effect, statements using these signs could be regarded as abbreviations or shorthand expressions for wffs of PL” that only use the operator ‘|’.
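The translation from PL into PL” can be sketched as a recursive function that follows the definitions given above. The representation is an assumption made for illustration only: statement letters are strings, compound wffs are nested tuples such as `("->", "P", "Q")`, and the function names are hypothetical.

```python
from itertools import product

def to_stroke(wff):
    """Translate a PL wff into an equivalent wff whose only operator is '|'."""
    if isinstance(wff, str):                 # a statement letter
        return wff
    op, *parts = wff
    parts = [to_stroke(p) for p in parts]
    if op == "not":
        (a,) = parts
        return ("|", a, a)
    a, b = parts
    if op == "and":
        return ("|", ("|", a, b), ("|", a, b))
    if op == "or":
        return ("|", ("|", a, a), ("|", b, b))
    if op == "->":
        return ("|", a, ("|", b, b))
    if op == "<->":
        return ("|", ("|", ("|", a, a), ("|", b, b)), ("|", a, b))
    raise ValueError(f"unknown operator: {op!r}")

def evaluate(wff, assignment):
    """Truth-value of a wff under an assignment to its statement letters."""
    if isinstance(wff, str):
        return assignment[wff]
    op, *parts = wff
    vals = [evaluate(p, assignment) for p in parts]
    if op == "not":
        return not vals[0]
    a, b = vals
    return {"and": a and b, "or": a or b, "->": (not a) or b,
            "<->": a == b, "|": not (a and b)}[op]

def letters(wff):
    """The set of statement letters occurring in a wff."""
    if isinstance(wff, str):
        return {wff}
    return set().union(*(letters(p) for p in wff[1:]))

def operators(wff):
    """The set of operators occurring in a wff."""
    if isinstance(wff, str):
        return set()
    return {wff[0]}.union(*(operators(p) for p in wff[1:]))

example = ("->", ("not", "P"), ("<->", "Q", ("and", "P", "R")))
translated = to_stroke(example)
names = sorted(letters(example))
same = all(
    evaluate(example, dict(zip(names, row))) == evaluate(translated, dict(zip(names, row)))
    for row in product([True, False], repeat=len(names))
)
print(operators(translated), same)
```

The translated wff uses only ‘|’ and agrees with the original on every truth-value assignment, although it is considerably longer.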

Even here, the choice of ‘|’ as the sole primitive is to some extent arbitrary. It would also be possible to reduce all truth-functional operators down to a single primitive by making use of a sign ‘\downarrow‘, treating it as roughly equivalent to the English expression, “neither … nor …”, so that the corresponding chart would be drawn as follows:

\( \begin{array}{cc|c}
\alpha & \beta & (\alpha \downarrow \beta)\\
\hline
T & T & F\\
T & F & F\\
F & T & F\\
F & F & T
\end{array} \)

If we were to use ‘\downarrow‘ as our sole operator, we could again define all the others. \ulcorner \neg \alpha \urcorner would be defined as \ulcorner (\alpha \downarrow \alpha) \urcorner; \ulcorner (\alpha \lor \beta) \urcorner would be defined as \ulcorner ((\alpha \downarrow \beta) \downarrow (\alpha \downarrow \beta)) \urcorner; \ulcorner (\alpha \land \beta) \urcorner would be defined as \ulcorner ((\alpha \downarrow \alpha) \downarrow (\beta \downarrow \beta)) \urcorner; and similarly for the other operators. The sign ‘\downarrow‘ is sometimes also referred to as the Sheffer stroke, and is also called the Peirce/Sheffer dagger.
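The claim that exactly two two-place operators can serve as a sole primitive can be tested by brute force. The sketch below is an illustration rather than a proof in the article's own terms: it represents each two-place truth function by its column of four outputs, and, for each of the sixteen candidates, repeatedly combines whatever has been defined so far until nothing new appears. The names `closure_size` and `sole_sufficient` are hypothetical.

```python
from itertools import product

ROWS = list(product([True, False], repeat=2))   # (T,T), (T,F), (F,T), (F,F)

def closure_size(op):
    """Count the two-place truth functions definable from `op` alone.

    `op` maps each pair of truth-values to a truth-value. A truth function
    is stored as its column of four outputs, one per row in ROWS.
    """
    alpha = tuple(a for a, _ in ROWS)
    beta = tuple(b for _, b in ROWS)
    known = {alpha, beta}
    while True:
        new = {tuple(op[(x[i], y[i])] for i in range(4))
               for x in known for y in known}
        if new <= known:            # nothing further can be defined
            return len(known)
        known |= new

sole_sufficient = []
for outputs in product([True, False], repeat=4):
    op = dict(zip(ROWS, outputs))
    if closure_size(op) == 16:      # all sixteen columns are reachable
        sole_sufficient.append(outputs)

print(sole_sufficient)
```

Only two columns should survive: F, T, T, T (the chart for ‘|’) and F, F, F, T (the chart for ‘\downarrow‘).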

Depending on one’s purposes in studying propositional logic, sometimes it makes sense to use a rich language like PL with more primitive operators, and sometimes it makes sense to use a relatively sparse language such as PL’ or PL” with fewer primitive operators. The advantage of the former approach is that it conforms better with our ordinary reasoning and thinking habits; the advantage of the latter is that it simplifies the logical language, which makes certain interesting results regarding the deductive systems making use of the language easier to prove.

For the remainder of this article, we shall primarily be concerned with the logical properties of statements formed in the richer language PL. However, we shall consider a system making use of language PL’ in some detail in Section VI, and shall also make brief mention of a system making use of language PL”.

4. Tautologies, Logical Equivalence and Validity

Truth-functional propositional logic concerns itself only with those ways of combining statements to form more complicated statements in which the truth-values of the complicated statements depend entirely on the truth-values of the parts. Owing to this, all those features of a complex statement that are studied in propositional logic derive from the way in which their truth-values are derived from those of their parts. These features are therefore always represented in the truth table for a given statement.

Some complex statements have the interesting feature that they would be true regardless of the truth-values of the simple statements making them up. A simple example would be the wff “P \lor \neg P“; that is, “P or not P“. It is fairly easy to see that this statement is true regardless of whether ‘P‘ is true or ‘P‘ is false. This is also shown by its truth table:

\( \begin{array}{c|cccc}
P & P & \lor & \neg & P\\
\hline
T & T & T & F & T\\
F & F & T & T & F
\end{array} \)

There are, however, statements for which this is true but it is not so obvious. Consider the wff, “R \rightarrow ((P \rightarrow Q) \lor \neg (R \rightarrow Q))“. This wff also comes out as true regardless of the truth-values of ‘P‘, ‘Q‘ and ‘R‘.

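\( \begin{array}{ccc|cccccccccc}
P & Q & R & R & \rightarrow & ((P & \rightarrow & Q) & \lor & \neg & (R & \rightarrow & Q))\\
\hline
T & T & T & T & T & T & T & T & T & F & T & T & T\\
T & T & F & F & T & T & T & T & T & F & F & T & T\\
T & F & T & T & T & T & F & F & T & T & T & F & F\\
T & F & F & F & T & T & F & F & F & F & F & T & F\\
F & T & T & T & T & F & T & T & T & F & T & T & T\\
F & T & F & F & T & F & T & T & T & F & F & T & T\\
F & F & T & T & T & F & T & F & T & T & T & F & F\\
F & F & F & F & T & F & T & F & T & F & F & T & F
\end{array} \)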

Statements that have this interesting feature are called tautologies. Let us define this notion precisely.

Definition: a wff is a tautology if and only if it is true for all possible truth-value assignments to the statement letters making it up.

Tautologies are also sometimes called logical truths or truths of logic because tautologies can be recognized as true solely in virtue of the principles of propositional logic, and without recourse to any additional information.

On the other side of the spectrum from tautologies are statements that come out as false regardless of the truth-values of the simple statements making them up. A simple example of such a statement would be the wff “P \land \neg P“; clearly such a statement cannot be true, as it contradicts itself. This is revealed by its truth table:

\( \begin{array}{c|cccc}
P & P & \land & \neg & P\\
\hline
T & T & F & F & T\\
F & F & F & T & F
\end{array} \)

To state this precisely:

Definition: a wff is a self-contradiction if and only if it is false for all possible truth-value assignments to the statement letters making it up.

Another, more interesting, example of a self-contradiction is the statement “\neg (P \rightarrow Q) \land \neg (Q \rightarrow P)“; this is not as obviously self-contradictory. However, we can see that it is when we consider its truth table:

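\( \begin{array}{cc|ccccccccc}
P & Q & \neg & (P & \rightarrow & Q) & \land & \neg & (Q & \rightarrow & P)\\
\hline
T & T & F & T & T & T & F & F & T & T & T\\
T & F & T & T & F & F & F & F & F & T & T\\
F & T & F & F & T & T & F & T & T & F & F\\
F & F & F & F & T & F & F & F & F & T & F
\end{array} \)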

A statement that is neither self-contradictory nor tautological is called a contingent statement. A contingent statement is true for some truth-value assignments to its statement letters and false for others. The truth table for a contingent statement reveals which truth-value assignments make it come out as true, and which make it come out as false. Consider the truth table for the statement “(P \rightarrow Q) \land (P \rightarrow \neg Q)“:

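\( \begin{array}{cc|cccccccc}
P & Q & (P & \rightarrow & Q) & \land & (P & \rightarrow & \neg & Q)\\
\hline
T & T & T & T & T & F & T & F & F & T\\
T & F & T & F & F & F & T & T & T & F\\
F & T & F & T & T & T & F & T & F & T\\
F & F & F & T & F & T & F & T & T & F
\end{array} \)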

We can see that of the four possible truth-value assignments for this statement, two make it come out as true, and two make it come out as false. Specifically, the statement is true when ‘P‘ is false and ‘Q‘ is true, and when ‘P‘ is false and ‘Q‘ is false, and the statement is false when ‘P‘ is true and ‘Q‘ is true and when ‘P‘ is true and ‘Q‘ is false.
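The three-way classification into tautologies, self-contradictions and contingent statements amounts to inspecting the final column of a truth table. A minimal sketch, assuming each wff is written as a Python function of its statement letters (the helper names `implies` and `classify` are illustrative):

```python
from itertools import product

def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b

def classify(wff, n_letters):
    """Classify a wff by its truth-values over all truth-value assignments."""
    column = [wff(*row) for row in product([True, False], repeat=n_letters)]
    if all(column):
        return "tautology"
    if not any(column):
        return "self-contradiction"
    return "contingent"

examples = {
    "P v ~P": (lambda p: p or not p, 1),
    "R -> ((P -> Q) v ~(R -> Q))":
        (lambda p, q, r: implies(r, implies(p, q) or not implies(r, q)), 3),
    "P & ~P": (lambda p: p and not p, 1),
    "~(P -> Q) & ~(Q -> P)":
        (lambda p, q: (not implies(p, q)) and (not implies(q, p)), 2),
    "(P -> Q) & (P -> ~Q)":
        (lambda p, q: implies(p, q) and implies(p, not q), 2),
}

verdicts = {name: classify(wff, n) for name, (wff, n) in examples.items()}
for name, verdict in verdicts.items():
    print(f"{name}: {verdict}")
```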

Truth tables are also useful in studying logical relationships that hold between two or more statements. For example, two statements are said to be consistent when it is possible for both to be true, and are said to be inconsistent when it is not possible for both to be true. In propositional logic, we can make this more precise as follows.

Definition: two wffs are consistent if and only if there is at least one possible truth-value assignment to the statement letters making them up that makes both wffs true.

Definition: two wffs are inconsistent if and only if there is no truth-value assignment to the statement letters making them up that makes them both true.

Whether or not two statements are consistent can be determined by means of a combined truth table for the two statements. For example, the two statements, “P \lor Q” and “\neg (P \leftrightarrow \neg Q)” are consistent:

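\( \begin{array}{cc|ccc|ccccc}
P & Q & P & \lor & Q & \neg & (P & \leftrightarrow & \neg & Q)\\
\hline
T & T & T & T & T & T & T & F & F & T\\
T & F & T & T & F & F & T & T & T & F\\
F & T & F & T & T & F & F & T & F & T\\
F & F & F & F & F & T & F & F & T & F
\end{array} \)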

Here, we see that there is one truth-value assignment, that in which both ‘P‘ and ‘Q‘ are true, that makes both “P \lor Q” and “\neg (P \leftrightarrow \neg Q)” true. However, the statements “(P \rightarrow Q) \land P” and “\neg (Q \lor \neg P)” are inconsistent, because there is no truth-value assignment in which both come out as true.

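\( \begin{array}{cc|ccccc|ccccc}
P & Q & (P & \rightarrow & Q) & \land & P & \neg & (Q & \lor & \neg & P)\\
\hline
T & T & T & T & T & T & T & F & T & T & F & T\\
T & F & T & F & F & F & T & T & F & F & F & T\\
F & T & F & T & T & F & F & F & T & T & T & F\\
F & F & F & T & F & F & F & F & F & T & T & F
\end{array} \)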
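Consistency can be tested by searching a combined truth table for a row on which every wff is true. The following sketch assumes wffs are written as Python functions of ‘P‘ and ‘Q‘; `joint_models` is an illustrative name for a helper that returns all such rows.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def joint_models(wffs, n_letters):
    """All truth-value assignments (as tuples) that make every wff true."""
    return [row for row in product([True, False], repeat=n_letters)
            if all(w(*row) for w in wffs)]

# "P v Q" together with "~(P <-> ~Q)"
pair_1 = [lambda p, q: p or q,
          lambda p, q: not (p == (not q))]

# "(P -> Q) & P" together with "~(Q v ~P)"
pair_2 = [lambda p, q: implies(p, q) and p,
          lambda p, q: not (q or not p)]

models_1 = joint_models(pair_1, 2)
models_2 = joint_models(pair_2, 2)
print("pair 1 consistent:", bool(models_1), models_1)
print("pair 2 consistent:", bool(models_2), models_2)
```

The first pair has exactly one joint model, the row on which both letters are true; the second has none.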

Another relationship that can hold between two statements is that of having the same truth-value regardless of the truth-values of the simple statements making them up. Consider a combined truth table for the wffs “\neg P \rightarrow \neg Q” and “\neg (Q \land \neg P)“:

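\( \begin{array}{cc|ccccc|ccccc}
P & Q & \neg & P & \rightarrow & \neg & Q & \neg & (Q & \land & \neg & P)\\
\hline
T & T & F & T & T & F & T & T & T & F & F & T\\
T & F & F & T & T & T & F & T & F & F & F & T\\
F & T & T & F & F & F & T & F & T & T & T & F\\
F & F & T & F & T & T & F & T & F & F & T & F
\end{array} \)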

Here we see that these two statements necessarily have the same truth-value.

Definition: two statements are said to be logically equivalent if and only if all possible truth-value assignments to the statement letters making them up result in the same resulting truth-values for the whole statements.

The above statements are logically equivalent. However, the truth table given above for the statements “P \lor Q” and “\neg (P \leftrightarrow \neg Q)” shows that they, on the other hand, are not logically equivalent, because they differ in truth-value for three of the four possible truth-value assignments.
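Logical equivalence can be tested in the same manner, by listing the rows on which two wffs differ. A minimal sketch, with `differing_rows` as an illustrative helper name:

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def differing_rows(f, g, n_letters):
    """Truth-value assignments on which the two wffs receive different values."""
    return [row for row in product([True, False], repeat=n_letters)
            if f(*row) != g(*row)]

# "~P -> ~Q" versus "~(Q & ~P)"
diff_1 = differing_rows(lambda p, q: implies(not p, not q),
                        lambda p, q: not (q and not p), 2)

# "P v Q" versus "~(P <-> ~Q)"
diff_2 = differing_rows(lambda p, q: p or q,
                        lambda p, q: not (p == (not q)), 2)

print("first pair equivalent:", not diff_1)
print("second pair differs on", len(diff_2), "rows:", diff_2)
```

Two wffs are logically equivalent exactly when the list of differing rows is empty.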

Finally, and perhaps most importantly, truth tables can be utilized to determine whether or not an argument is logically valid. In general, an argument is said to be logically valid whenever it has a form that makes it impossible for the conclusion to be false if the premises are true. (See the encyclopedia article on “Validity and Soundness“.) In classical propositional logic, we can give this a more precise characterization.

Definition: a wff \beta is said to be a logical consequence of a set of wffs \alpha_1, \alpha_2, ..., \alpha_n, if and only if there is no truth-value assignment to the statement letters making up these wffs that makes all of \alpha_1, \alpha_2, ..., \alpha_n true but does not make \beta true.

An argument is logically valid if and only if its conclusion is a logical consequence of its premises. If an argument whose conclusion is \beta and whose only premise is \alpha is logically valid, then \alpha is said to logically imply \beta.

For example, consider the following argument:

\( \begin{array}{l} P \rightarrow Q\\ \neg Q \rightarrow P\\ \hline Q \end{array} \)

We can test the validity of this argument by constructing a combined truth table for all three statements.

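\( \begin{array}{cc|ccc|cccc|c}
P & Q & P & \rightarrow & Q & \neg & Q & \rightarrow & P & Q\\
\hline
T & T & T & T & T & F & T & T & T & T\\
T & F & T & F & F & T & F & T & T & F\\
F & T & F & T & T & F & T & T & F & T\\
F & F & F & T & F & T & F & F & F & F
\end{array} \)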

Here we see that both premises come out as true in the case in which both ‘P‘ and ‘Q‘ are true, and in which ‘P‘ is false but ‘Q‘ is true. However, in those cases, the conclusion is also true. It is possible for the conclusion to be false, but only if one of the premises is false as well. Hence, we can see that the inference represented by this argument is truth-preserving. Contrast this with the following example:

\( \begin{array}{l} P \rightarrow Q\\ \hline \neg Q \lor \neg P \end{array} \)

Consider the truth-value assignment making both ‘P‘ and ‘Q‘ true. If we were to fill in that row of the truth table for these statements, we would see that “P \rightarrow Q” comes out as true, but “\neg Q \lor \neg P” comes out as false. Even if ‘P‘ and ‘Q‘ are not actually both true, it is possible for them to both be true, and so this form of reasoning is not truth-preserving. In other words, the argument is not logically valid, and its premise does not logically imply its conclusion.

One of the most striking features of truth tables is that they provide an effective procedure for determining the logical truth, or tautologyhood of any single wff, and for determining the logical validity of any argument written in the language PL. The procedure for constructing such tables is purely rote, and while the size of the tables grows exponentially with the number of statement letters involved in the wff(s) under consideration, the number of rows is always finite and so it is in principle possible to finish the table and determine a definite answer. In sum, classical propositional logic is decidable.
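The effective procedure just described can be sketched as an exhaustive search for a counterexample, that is, a row on which every premise is true and the conclusion is not. This is only an illustration, assuming wffs are written as Python functions of ‘P‘ and ‘Q‘; `counterexamples` is a hypothetical helper name.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def counterexamples(premises, conclusion, n_letters):
    """Rows making every premise true and the conclusion false."""
    return [row for row in product([True, False], repeat=n_letters)
            if all(p(*row) for p in premises) and not conclusion(*row)]

# P -> Q, ~Q -> P, therefore Q
first = counterexamples(
    [lambda p, q: implies(p, q), lambda p, q: implies(not q, p)],
    lambda p, q: q, 2)

# P -> Q, therefore ~Q v ~P
second = counterexamples(
    [lambda p, q: implies(p, q)],
    lambda p, q: (not q) or (not p), 2)

print("first argument valid:", not first)
print("second argument valid:", not second, "counterexamples:", second)
```

Because the search always inspects finitely many rows (two to the power of the number of statement letters), it is guaranteed to terminate with an answer.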

5. Deduction: Rules of Inference and Replacement

a. Natural Deduction

Truth tables, as we have seen, can theoretically be used to solve any question in classical truth-functional propositional logic. However, this method has its drawbacks. The size of the tables grows exponentially with the number of distinct statement letters making up the statements involved. Moreover, truth tables are alien to our normal reasoning patterns. Another method for establishing the validity of an argument exists that does not have these drawbacks: the method of natural deduction. In natural deduction an attempt is made to reduce the reasoning behind a valid argument to a series of steps each of which is intuitively justified by the premises of the argument or previous steps in the series.

Consider the following argument stated in natural language:

Either cat fur or dog fur was found at the scene of the crime. If dog fur was found at the scene of the crime, officer Thompson had an allergy attack. If cat fur was found at the scene of the crime, then Macavity is responsible for the crime. But officer Thompson didn’t have an allergy attack, and so therefore Macavity must be responsible for the crime.

The validity of this argument can be made more obvious by representing the chain of reasoning leading from the premises to the conclusion:

  1. Either cat fur was found at the scene of the crime, or dog fur was found at the scene of the crime. (Premise)
  2. If dog fur was found at the scene of the crime, then officer Thompson had an allergy attack. (Premise)
  3. If cat fur was found at the scene of the crime, then Macavity is responsible for the crime. (Premise)
  4. Officer Thompson did not have an allergy attack. (Premise)
  5. Dog fur was not found at the scene of the crime. (Follows from 2 and 4.)
  6. Cat fur was found at the scene of the crime. (Follows from 1 and 5.)
  7. Macavity is responsible for the crime. (Conclusion. Follows from 3 and 6.)

Above, we do not jump directly from the premises to the conclusion, but show how intermediate inferences are used to ultimately justify the conclusion by a step-by-step chain. Each step in the chain represents a simple, obviously valid form of reasoning. In this example, the form of reasoning exemplified in line 5 is called modus tollens, which involves deducing the negation of the antecedent of a conditional from the conditional and the negation of its consequent. The form of reasoning exemplified in line 6 is called disjunctive syllogism, and involves deducing one disjunct of a disjunction on the basis of the disjunction and the negation of the other disjunct. Lastly, the form of reasoning found at line 7 is called modus ponens, which involves deducing the truth of the consequent of a conditional given the truth of both the conditional and its antecedent. “Modus ponens” is Latin for affirming mode, and “modus tollens” is Latin for denying mode.

A system of natural deduction consists in the specification of a list of intuitively valid rules of inference for the construction of derivations or step-by-step deductions. Many equivalent systems of deduction have been given for classical truth-functional propositional logic. In what follows, we sketch one system, which is derived from the popular textbook by Irving Copi (1953). The system makes use of the language PL.

b. Rules of Inference

Here we give a list of intuitively valid rules of inference. The rules are stated in schematic form. Any inference in which any wff of language PL is substituted uniformly for the schematic letters in the forms below constitutes an instance of the rule.

Modus ponens (MP):

\( \begin{array}{l} \alpha \rightarrow \beta, \alpha\\ \hline \beta \end{array} \)

(Modus ponens is sometimes also called “modus ponendo ponens”, “detachment” or a form of “→-elimination”.)

Modus tollens (MT):

\( \begin{array}{l} \alpha \rightarrow \beta, \neg \beta\\ \hline \neg \alpha \end{array} \)

(Modus tollens is sometimes also called “modus tollendo tollens” or a form of “→-elimination”.)

Disjunctive syllogism (DS): (two forms)

\( \begin{array}{l} \alpha \lor \beta, \neg \alpha\\ \hline \beta \end{array} \)

\( \begin{array}{l} \alpha \lor \beta, \neg \beta\\ \hline \alpha \end{array} \)

(Disjunctive syllogism is sometimes also called “modus tollendo ponens” or “\lor-elimination”.)

Addition (Add): (two forms)

\( \begin{array}{l} \alpha\\ \hline \alpha \lor \beta \end{array} \)

\( \begin{array}{l} \beta\\ \hline \alpha \lor \beta \end{array} \)

(Addition is sometimes also called “disjunction introduction” or “\lor-introduction”.)

Simplification (Simp): (two forms)

\( \begin{array}{l} \alpha \land \beta\\ \hline \alpha \end{array} \)

\( \begin{array}{l} \alpha \land \beta\\ \hline \beta \end{array} \)

(Simplification is sometimes also called “conjunction elimination” or “\land-elimination”.)

Conjunction (Conj):

\( \begin{array}{l} \alpha, \beta\\ \hline \alpha \land \beta \end{array} \)

(Conjunction is sometimes also called “conjunction introduction”, “\land-introduction” or “logical multiplication”.)

Hypothetical syllogism (HS):

\( \begin{array}{l} \alpha \rightarrow \beta, \beta \rightarrow \gamma\\ \hline \alpha \rightarrow \gamma \end{array} \)

(Hypothetical syllogism is sometimes also called “chain reasoning” or “chain deduction”.)

Constructive dilemma (CD):

\( \begin{array}{l} (\alpha \rightarrow \gamma) \land (\beta \rightarrow \delta), \alpha \lor \beta\\ \hline \gamma \lor \delta \end{array} \)

Absorption (Abs):

\( \begin{array}{l} \alpha \rightarrow \beta\\ \hline \alpha \rightarrow (\alpha \land \beta) \end{array} \)
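That each of these rules is truth-preserving can be confirmed by checking every assignment of truth-values to the schematic letters. In the sketch below, which is an illustration under assumed names, each rule is a function of four truth-values (for \alpha, \beta, \gamma and \delta) returning the values of its premises and of its conclusion; rules with two forms appear twice.

```python
from itertools import product

def imp(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b

# Each rule returns (values of the premises, value of the conclusion).
RULES = {
    "MP":     lambda a, b, c, d: ([imp(a, b), a], b),
    "MT":     lambda a, b, c, d: ([imp(a, b), not b], not a),
    "DS 1":   lambda a, b, c, d: ([a or b, not a], b),
    "DS 2":   lambda a, b, c, d: ([a or b, not b], a),
    "Add 1":  lambda a, b, c, d: ([a], a or b),
    "Add 2":  lambda a, b, c, d: ([b], a or b),
    "Simp 1": lambda a, b, c, d: ([a and b], a),
    "Simp 2": lambda a, b, c, d: ([a and b], b),
    "Conj":   lambda a, b, c, d: ([a, b], a and b),
    "HS":     lambda a, b, c, d: ([imp(a, b), imp(b, c)], imp(a, c)),
    "CD":     lambda a, b, c, d: ([imp(a, c) and imp(b, d), a or b], c or d),
    "Abs":    lambda a, b, c, d: ([imp(a, b)], imp(a, a and b)),
}

def truth_preserving(rule):
    """True if no assignment makes all premises true and the conclusion false."""
    for row in product([True, False], repeat=4):
        premises, conclusion = rule(*row)
        if all(premises) and not conclusion:
            return False
    return True

report = {name: truth_preserving(rule) for name, rule in RULES.items()}
print(report)
```

By contrast, a fallacious pattern such as inferring \alpha from \ulcorner \alpha \rightarrow \beta \urcorner and \beta fails the same test.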

c. Rules of Replacement

The nine rules of inference listed above represent ways of inferring something new from previous steps in a deduction. Many systems of natural deduction, including those initially designed by Gentzen, consist entirely of rules similar to the above. If the language of a system involves signs introduced by definition, it must also allow the substitution of a defined sign for the expression used to define it, or vice versa. Still other systems, while not making use of defined signs, allow one to make certain substitutions of expressions of one form for expressions of another form in certain cases in which the expressions in question are logically equivalent. These are called rules of replacement, and Copi’s natural deduction system invokes such rules. Strictly speaking, rules of replacement differ from inference rules, because, in a sense, when a rule of replacement is used, one is not inferring something new but merely stating what amounts to the same thing using a different combination of symbols. In some systems, rules for replacement can be derived from the inference rules, but in Copi’s system, they are taken as primitive.

Rules of replacement also differ from inference rules in other ways. Inference rules only apply when the main operators match the patterns given and only apply to entire statements. Inference rules are also strictly unidirectional: one must infer what is below the horizontal line from what is above and not vice-versa. However, replacement rules can be applied to portions of statements and not only to entire statements; moreover, they can be implemented in either direction.

The rules of replacement used by Copi are the following:

Double negation (DN):

\ulcorner \neg \neg \alpha \urcorner is interreplaceable with \alpha

(Double negation is also called “\neg-elimination”.)

Commutativity (Com): (two forms)

\ulcorner \alpha \land \beta \urcorner is interreplaceable with \ulcorner \beta \land \alpha \urcorner
\ulcorner \alpha \lor \beta \urcorner is interreplaceable with \ulcorner \beta \lor \alpha \urcorner

Associativity (Assoc): (two forms)

\ulcorner (\alpha \land \beta) \land \gamma \urcorner is interreplaceable with \ulcorner \alpha \land (\beta \land \gamma) \urcorner
\ulcorner (\alpha \lor \beta) \lor \gamma \urcorner is interreplaceable with \ulcorner \alpha \lor (\beta \lor \gamma) \urcorner

Tautology (Taut): (two forms)

\alpha is interreplaceable with \ulcorner \alpha \land \alpha \urcorner
\alpha is interreplaceable with \ulcorner \alpha \lor \alpha \urcorner

DeMorgan’s Laws (DM): (two forms)

\ulcorner \neg (\alpha \land \beta) \urcorner is interreplaceable with \ulcorner \neg \alpha \lor \neg \beta \urcorner
\ulcorner \neg (\alpha \lor \beta) \urcorner is interreplaceable with \ulcorner \neg \alpha \land \neg \beta \urcorner

Transposition (Trans):

\ulcorner \alpha \rightarrow \beta \urcorner is interreplaceable with \ulcorner \neg \beta \rightarrow \neg \alpha \urcorner

(Transposition is also sometimes called “contraposition”.)

Material Implication (Impl):

\ulcorner \alpha \rightarrow \beta \urcorner is interreplaceable with \ulcorner \neg \alpha \lor \beta \urcorner

Exportation (Exp):

\ulcorner \alpha \rightarrow (\beta \rightarrow \gamma) \urcorner is interreplaceable with \ulcorner (\alpha \land \beta) \rightarrow \gamma \urcorner

Distribution (Dist): (two forms)

\ulcorner \alpha \land (\beta \lor \gamma) \urcorner is interreplaceable with \ulcorner (\alpha \land \beta) \lor (\alpha \land \gamma) \urcorner
\ulcorner \alpha \lor (\beta \land \gamma) \urcorner is interreplaceable with \ulcorner (\alpha \lor \beta) \land (\alpha \lor \gamma) \urcorner

Material Equivalence (Equiv): (two forms)

\ulcorner \alpha \leftrightarrow \beta \urcorner is interreplaceable with \ulcorner (\alpha \rightarrow \beta) \land (\beta \rightarrow \alpha) \urcorner
\ulcorner \alpha \leftrightarrow \beta \urcorner is interreplaceable with \ulcorner (\alpha \land \beta) \lor (\neg \alpha \land \neg \beta) \urcorner

(Material equivalence is sometimes also called “biconditional introduction/elimination” or “↔-introduction/elimination”.)
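Each rule of replacement pairs two forms that are logically equivalent, which can again be confirmed by truth table. A minimal sketch, assuming each rule is written as a function of three truth-values (for \alpha, \beta and \gamma) that returns the values of its two sides; the dictionary keys are informal labels.

```python
from itertools import product

def imp(a, b):
    """Material conditional: false only when a is true and b is false."""
    return not (a and not b)

# Each rule returns (value of the left form, value of the right form).
REPLACEMENTS = {
    "DN":          lambda a, b, c: (not (not a), a),
    "Com (and)":   lambda a, b, c: (a and b, b and a),
    "Com (or)":    lambda a, b, c: (a or b, b or a),
    "Assoc (and)": lambda a, b, c: ((a and b) and c, a and (b and c)),
    "Assoc (or)":  lambda a, b, c: ((a or b) or c, a or (b or c)),
    "Taut (and)":  lambda a, b, c: (a, a and a),
    "Taut (or)":   lambda a, b, c: (a, a or a),
    "DM 1":        lambda a, b, c: (not (a and b), (not a) or (not b)),
    "DM 2":        lambda a, b, c: (not (a or b), (not a) and (not b)),
    "Trans":       lambda a, b, c: (imp(a, b), imp(not b, not a)),
    "Impl":        lambda a, b, c: (imp(a, b), (not a) or b),
    "Exp":         lambda a, b, c: (imp(a, imp(b, c)), imp(a and b, c)),
    "Dist 1":      lambda a, b, c: (a and (b or c), (a and b) or (a and c)),
    "Dist 2":      lambda a, b, c: (a or (b and c), (a or b) and (a or c)),
    "Equiv 1":     lambda a, b, c: (a == b, imp(a, b) and imp(b, a)),
    "Equiv 2":     lambda a, b, c: (a == b, (a and b) or ((not a) and (not b))),
}

def interreplaceable(rule):
    """True if both sides agree on every truth-value assignment."""
    return all(left == right
               for left, right in (rule(*row)
                                   for row in product([True, False], repeat=3)))

equivalences = {name: interreplaceable(rule) for name, rule in REPLACEMENTS.items()}
print(equivalences)
```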

d. Direct Deductions

A direct deduction of a conclusion from a set of premises consists of an ordered sequence of wffs such that each member of the sequence is either (1) a premise, (2) derived from previous members of the sequence by one of the inference rules, or (3) derived from a previous member of the sequence by the replacement of a logically equivalent part according to the rules of replacement, and such that the conclusion is the final step of the sequence.

To be even more precise, a direct deduction is defined as an ordered sequence of wffs, \beta_1, \beta_2, ..., \beta_n, such that for each step \beta_i where i is between 1 and n inclusive, either (1) \beta_i is a premise, (2) \beta_i matches the form given below the horizontal line for one of the 9 inference rules, and there are wffs in the sequence prior to \beta_i matching the forms given above the horizontal line, or (3) there is a previous step in the sequence \beta_j where j < i and \beta_j differs from \beta_i at most by matching or containing a part that matches one of the forms given for one of the 10 replacement rules in the same place in which \beta_i contains the wff of the corresponding form, and such that the conclusion of the argument is \beta_n.

Using line numbers and the abbreviations for the rules of the system to annotate, the chain of reasoning given above in English, when transcribed into language PL and organized as a direct deduction, would appear as follows:

1. C \lor D Premise
2. D \rightarrow O Premise
3. C \rightarrow M Premise
4. \neg O Premise
5. \neg D 2,4 MT
6. C 1,5 DS
7. M 3,6 MP

There is no unique derivation for a given conclusion from a given set of premises. Here is a distinct derivation for the same conclusion from the same premises:

1. C \lor D Premise
2. D \rightarrow O Premise
3. C \rightarrow M Premise
4. \neg O Premise
5. (C \rightarrow M) \land (D \rightarrow O) 2,3 Conj
6. M \lor O 1,5 CD
7. M 4,6 DS
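Because every line of a direct deduction must be justified by a rule and by earlier lines, a deduction can be checked mechanically. The sketch below is deliberately limited: it handles only MP, MT and DS, represents wffs as nested tuples, and spells the statement letters out as words taken from the English argument. All names in it are hypothetical.

```python
def neg(wff):
    return ("not", wff)

def follows(rule, conclusion, cited):
    """True if `conclusion` follows from the two cited wffs by `rule`."""
    for x, y in (cited, cited[::-1]):          # allow either citation order
        if not isinstance(x, tuple):
            continue
        if rule == "MP" and x == ("->", y, conclusion):
            return True
        if rule == "MT" and x[0] == "->" and y == neg(x[2]) and conclusion == neg(x[1]):
            return True
        if rule == "DS" and x[0] == "or":
            if (y, conclusion) in [(neg(x[1]), x[2]), (neg(x[2]), x[1])]:
                return True
    return False

def bad_lines(deduction):
    """Line numbers (1-indexed) whose justification does not check out."""
    bad = []
    for n, (wff, why) in enumerate(deduction, start=1):
        if why == "Premise":
            continue
        rule, i, j = why
        if i >= n or j >= n:                   # may only cite earlier lines
            bad.append(n)
            continue
        cited = (deduction[i - 1][0], deduction[j - 1][0])
        if not follows(rule, wff, cited):
            bad.append(n)
    return bad

macavity = [
    (("or", "cat_fur", "dog_fur"), "Premise"),
    (("->", "dog_fur", "allergy_attack"), "Premise"),
    (("->", "cat_fur", "macavity"), "Premise"),
    (neg("allergy_attack"), "Premise"),
    (neg("dog_fur"), ("MT", 2, 4)),
    ("cat_fur", ("DS", 1, 5)),
    ("macavity", ("MP", 3, 6)),
]
print(bad_lines(macavity))
```

An empty list means every non-premise line was justified by the rule and lines it cites; a line citing the wrong premises would be reported by number.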

Consider next the argument:

\( \begin{array}{l} P \leftrightarrow Q\\ (S \lor T) \rightarrow Q\\ \neg P \lor (\neg T \land R)\\ \hline T \rightarrow U \end{array} \)

This argument has six distinct statement letters, and hence constructing a truth table for it would require 64 rows. The table would have 22 columns, thereby requiring 1,408 distinct T/F calculations. Happily, the derivation of the conclusion from the premises using our inference and replacement rules, while far from simple, is relatively less exhausting:

1. P \leftrightarrow Q Premise
2. (S \lor T) \rightarrow Q Premise
3. \neg P \lor (\neg T \land R) Premise
4. (P \rightarrow Q) \land (Q \rightarrow P) 1 Equiv
5. Q \rightarrow P 4 Simp
6. (S \lor T) \rightarrow P 2,5 HS
7. P \rightarrow (\neg T \land R) 3 Impl
8. (S \lor T) \rightarrow (\neg T \land R) 6,7 HS
9. \neg (S \lor T) \lor (\neg T \land R) 8 Impl
10. (\neg S \land \neg T) \lor (\neg T \land R) 9 DM
11. ((\neg S \land \neg T) \lor \neg T) \land ((\neg S \land \neg T) \lor R) 10 Dist
12. (\neg S \land \neg T) \lor \neg T 11 Simp
13. \neg T \lor (\neg S \land \neg T) 12 Com
14. (\neg T \lor \neg S) \land (\neg T \lor \neg T) 13 Dist
15. \neg T \lor \neg T 14 Simp
16. \neg T 15 Taut
17. \neg T \lor U 16 Add
18. T \rightarrow U 17 Impl
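Although the truth table for this argument is tedious to write out by hand, a short program can enumerate it and confirm what the derivation establishes. A minimal sketch, assuming the six statement letters are passed as booleans in the order P, Q, R, S, T, U:

```python
from itertools import product

def implies(a, b):
    return (not a) or b

premises = [
    lambda p, q, r, s, t, u: p == q,                          # P <-> Q
    lambda p, q, r, s, t, u: implies(s or t, q),              # (S v T) -> Q
    lambda p, q, r, s, t, u: (not p) or ((not t) and r),      # ~P v (~T & R)
]
conclusion = lambda p, q, r, s, t, u: implies(t, u)           # T -> U

rows = list(product([True, False], repeat=6))
counterexamples = [row for row in rows
                   if all(prem(*row) for prem in premises)
                   and not conclusion(*row)]
print(len(rows), len(counterexamples))   # prints: 64 0
```

No row makes all three premises true while making the conclusion false, in agreement with the derivation above.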

e. Conditional and Indirect Proofs

Together the nine inference rules and ten rules of replacement are sufficient for creating a deduction for any logically valid argument, provided that the argument has at least one premise. However, to cover the limiting case of arguments with no premises, and simply to facilitate certain deductions that would be recondite otherwise, it is also customary to allow for certain methods of deduction other than direct derivation. Specifically, it is customary to allow the proof techniques known as conditional proof and indirect proof.

A conditional proof is a derivation technique used to establish a conditional wff, that is, a wff whose main operator is the sign ‘→’. This is done by constructing a sub-derivation within a derivation in which the antecedent of the conditional is assumed as a hypothesis. If, by using the inference rules and rules of replacement (and possibly additional sub-derivations), it is possible to arrive at the consequent, it is permissible to end the sub-derivation and conclude the truth of the conditional statement within the main derivation, citing the sub-derivation as a conditional proof, or ‘CP’ for short. This becomes much clearer on considering the following example argument:

\( \begin{array}{l} P \rightarrow (Q \lor R)\\ P \rightarrow \neg S\\ S \leftrightarrow Q\\ \hline P \rightarrow R \end{array} \)

While a direct derivation establishing the validity of this argument is possible, it is easier to establish the validity of this argument using a conditional derivation.

1. P \rightarrow (Q \lor R) Premise
2. P \rightarrow \neg S Premise
3. S \leftrightarrow Q Premise
4. P Assumption
5. Q \lor R 1,4 MP
6. \neg S 2,4 MP
7. (S \rightarrow Q) \land (Q \rightarrow S) 3 Equiv
8. Q \rightarrow S 7 Simp
9. \neg Q 6,8 MT
10. R 5,9 DS
11. P \rightarrow R 4-10 CP

Here in order to establish the conditional statement “P \rightarrow R“, we constructed a sub-derivation, which is the portion found at lines 4-10. First, we assumed the truth of ‘P‘, and found that with it, we could derive ‘R‘. Given the premises, we therefore had shown that if ‘P‘ were also true, so would be ‘R‘. Therefore, on the basis of the sub-derivation we were justified in concluding “P \rightarrow R“. This is the usual methodology used in logic and mathematics for establishing the truth of a conditional statement.

Another common method is that of indirect proof, also known as proof by reductio ad absurdum. (For a fuller discussion, see the article on reductio ad absurdum in the encyclopedia.) In an indirect proof (‘IP’ for short), our goal is to demonstrate that a certain wff is false on the basis of the premises. Again, we make use of a sub-derivation; here, we begin by assuming the opposite of that which we’re trying to prove, that is, we assume that the wff is true. If on the basis of this assumption, we can demonstrate an obvious contradiction, that is, a statement of the form \ulcorner \alpha \land \neg \alpha \urcorner, we can conclude that the assumed statement must be false, because anything that leads to a contradiction must be false.

For example, consider the following argument:

\( \begin{array}{l} P \rightarrow Q\\ P \rightarrow (Q \rightarrow \neg P)\\ \hline \neg P \end{array} \)

While, again, a direct derivation of the conclusion from the premises is possible, it is somewhat easier to prove that “\neg P” is true indirectly: we assume that ‘P’ is true and show that, given the premises, this assumption leads to an absurdity.

1. P \rightarrow Q Premise
2. P \rightarrow (Q \rightarrow \neg P) Premise
3. P Assumption
4. Q 1,3 MP
5. Q \rightarrow \neg P 2,3 MP
6. \neg P 4,5 MP
7. P \land \neg P 3,6 Conj
8. \neg P 3-7 IP

Here we were attempting to show that “\neg P” was true given the premises. To do this we assumed instead that ‘P’ was true. Since this assumption led to a contradiction, we were justified in concluding that ‘P’ is false, that is, that “\neg P” is true.

When making use of either conditional proof or indirect proof, once a sub-derivation is finished, the lines making it up cannot be used later on in the main derivation or any additional sub-derivations that may be constructed later on.

This completes our characterization of a system of natural deduction for the language PL.

The system of natural deduction just described is formally adequate in the following sense. Earlier, we defined a valid argument as one in which there is no possible truth-value assignment to the statement letters making up its premises and conclusion that makes the premises all true but the conclusion untrue. It is provable that an argument in the language of PL is formally valid in that sense if and only if it is possible to construct a derivation of the conclusion of that argument from the premises using the above rules of inference, rules of replacement and techniques of conditional and indirect proof. Space limitations preclude a full proof of this in the metalanguage, although the reasoning is very similar to that given for the axiomatic Propositional Calculus discussed in Sections VI and VII below.

Informally, it is fairly easy to see that no argument for which a deduction is possible in this system could be invalid according to truth tables. Firstly, the rules of inference are all truth-preserving. For example, in the case of modus ponens, it is fairly easy to see from the truth table for any set of statements of the appropriate form that no truth-value assignment could make both \ulcorner \alpha \rightarrow \beta \urcorner and \alpha true while making \beta false. A similar consideration applies for the others. Moreover, truth tables can easily be used to verify that statements of one of the forms mentioned in the rules of replacement are all logically equivalent with those the rule allows one to swap for them. Hence, the statements could never differ in truth-value for any truth-value assignment. In the case of conditional proof, note that any truth-value assignment must either make the conditional true or make the antecedent true and the consequent false. The antecedent is what is assumed in a conditional proof. So, if the truth-value assignment makes both it and the premises of the argument true, because the other rules are all truth-preserving, it would be impossible to derive the consequent unless it were also true. A similar consideration justifies the use of indirect proof.
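
The claim that the inference rules are truth-preserving can be spot-checked mechanically. The sketch below tests three of the rules used in the derivations above (MP, MT and DS) over every truth-value assignment; the function `truth_preserving` and the `RULES` table are names assumed for this illustration.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def truth_preserving(premises, conclusion, n):
    """True if the conclusion holds on every assignment that makes all premises true."""
    return all(
        conclusion(*values)
        for values in product([True, False], repeat=n)
        if all(prem(*values) for prem in premises)
    )

# Each rule is given as (list of premise forms, conclusion form) over alpha, beta.
RULES = {
    "MP": ([lambda a, b: implies(a, b), lambda a, b: a], lambda a, b: b),
    "MT": ([lambda a, b: implies(a, b), lambda a, b: not b], lambda a, b: not a),
    "DS": ([lambda a, b: a or b, lambda a, b: not a], lambda a, b: b),
}

for name, (prems, concl) in RULES.items():
    print(name, truth_preserving(prems, concl, 2))

# By contrast, affirming the consequent is not truth-preserving.
bad = truth_preserving([lambda a, b: implies(a, b), lambda a, b: b],
                       lambda a, b: a, 2)
print("affirming the consequent", bad)
```

A check of this kind covers only the rules listed in the table; it is not a proof that the whole system is sound.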

This system represents a useful method for establishing the validity of an argument that has the advantage of coinciding more closely with the way we normally reason. (As noted earlier, however, there are many equivalent systems of natural deduction, all coinciding relatively closely to ordinary reasoning patterns.) One disadvantage this method has, however, is that, unlike truth tables, it does not provide a means for recognizing that an argument is invalid. If an argument is invalid, there is no deduction for it in the system. However, the system itself does not provide a means for recognizing when a deduction is impossible.

Another objection that might be made to the system of deduction sketched above is that it contains more rules and more techniques than it needs to. This leads us directly into our next topic.

6. Axiomatic Systems and the Propositional Calculus

The system of deduction discussed in the previous section is an example of a natural deduction system, that is, a system of deduction for a formal language that attempts to coincide as closely as possible to the forms of reasoning most people actually employ. Natural systems of deduction are typically contrasted with axiomatic systems. Axiomatic systems are minimalist systems; rather than including rules corresponding to natural modes of reasoning, they utilize as few basic principles or rules as possible. Since so few kinds of steps are available in a deduction, relatively speaking, an axiomatic system usually requires more steps for the deduction of a conclusion from a given set of premises as compared to a natural deduction system.

Typically, an axiomatic system begins by designating certain wffs as “axioms”. An axiom is something that is taken as a fundamental truth of the system that does not itself require proof. To allow for the deduction of results from the axioms or the premises of an argument, the system typically also includes at least one (and often only one) rule of inference. Usually, an attempt is made to limit the number of axioms to as few as possible, or at least, limit the number of forms axioms can take.

Because axiomatic systems aim to be minimal, typically they employ languages with simplified vocabularies whenever possible. For classical truth-functional propositional logic, this might involve using a simpler language such as PL’ or PL” instead of the full language PL.

For most of the remainder of this section, we shall sketch an axiomatic system for classical truth-functional propositional logic, which we shall dub the Propositional Calculus (or PC for short). The Propositional Calculus makes use of language PL’, described above. That is, the only connectives it uses are ‘→’ and ‘\neg’, and the other operators, if used at all, would be understood as shorthand abbreviations making use of the definitions discussed in Section III(c).

System PC consists of three axiom schemata, which are forms a wff fits if it is an axiom, along with a single inference rule: modus ponens. We make this more precise by specifying certain definitions.

Definition: a wff of language PL’ is an axiom of PC if and only if it is an instance of one of the following three forms:

\alpha \rightarrow (\beta \rightarrow \alpha) (Axiom Schema 1, or AS1)
(\alpha \rightarrow (\beta \rightarrow \gamma)) \rightarrow ((\alpha \rightarrow \beta) \rightarrow (\alpha \rightarrow \gamma)) (Axiom Schema 2, or AS2)
(\neg \alpha \rightarrow \neg \beta) \rightarrow ((\neg \alpha \rightarrow \beta) \rightarrow \alpha) (Axiom Schema 3, or AS3)

Note that according to this definition, every wff of the form \ulcorner \alpha \rightarrow (\beta \rightarrow \alpha) \urcorner is an axiom. This includes an infinite number of different wffs, from simple cases such as “P \rightarrow (Q \rightarrow P)“, to much more complicated cases such as “(\neg R \rightarrow \neg \neg S) \rightarrow (\neg(\neg M \rightarrow N) \rightarrow (\neg R \rightarrow \neg \neg S))“.
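
To make the notion of an "instance" of a schema concrete, here is a minimal Python sketch that tests whether a wff has the shape of AS1. The encoding of wffs as nested tuples, and the helper names `Not`, `Imp` and `is_as1_instance`, are assumptions made for this illustration.

```python
def Not(a):
    return ('not', a)

def Imp(a, b):
    return ('->', a, b)

def is_as1_instance(wff):
    """True if wff has the shape alpha -> (beta -> alpha)."""
    if not (isinstance(wff, tuple) and wff[0] == '->'):
        return False
    alpha, rest = wff[1], wff[2]
    return isinstance(rest, tuple) and rest[0] == '->' and rest[2] == alpha

# The simple case: P -> (Q -> P)
simple = Imp('P', Imp('Q', 'P'))

# The complicated case: alpha is (~R -> ~~S), beta is ~(~M -> N)
alpha = Imp(Not('R'), Not(Not('S')))
beta = Not(Imp(Not('M'), 'N'))
complicated = Imp(alpha, Imp(beta, alpha))

print(is_as1_instance(simple), is_as1_instance(complicated))
print(is_as1_instance(Imp('P', Imp('Q', 'R'))))  # wrong shape
```

Both wffs from the text pass the test, whatever the complexity of the parts standing in for \alpha and \beta.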

An ordered step-by-step deduction constitutes a derivation in system PC if and only if each step in the deduction is either (1) a premise of the argument, (2) an axiom, or (3) derived from previous steps by modus ponens. Once again we can make this more precise with the following (more recondite) definition:

Definition: an ordered sequence of wffs \beta_1, \beta_2, ..., \beta_n is a derivation in system PC of the wff \beta_n from the premises \alpha_1, \alpha_2, ..., \alpha_m if and only if, for each wff \beta_i in the sequence \beta_1, \beta_2, ..., \beta_n, either (1) \beta_i is one of the premises \alpha_1, \alpha_2, ..., \alpha_m, (2) \beta_i is an axiom of PC, or (3) \beta_i follows from previous members of the series by the inference rule modus ponens (that is, there are previous members of the sequence, \beta_j and \beta_k, such that \beta_j takes the form \ulcorner \beta_k \rightarrow \beta_i \urcorner).

For example, consider the following argument written in the language PL’:

\( \begin{array}{l} P\\ (R \rightarrow P) \rightarrow (R \rightarrow (P \rightarrow S))\\ \hline R \rightarrow S \end{array} \)

The following constitutes a derivation in system PC of the conclusion from the premises:

1. P Premise
2. (R \rightarrow P) \rightarrow (R \rightarrow (P \rightarrow S)) Premise
3. P \rightarrow (R \rightarrow P) Instance of AS1
4. R \rightarrow P 1,3 MP
5. R \rightarrow (P \rightarrow S) 2,4 MP
6. (R \rightarrow (P \rightarrow S)) \rightarrow ((R \rightarrow P) \rightarrow (R \rightarrow S)) Instance of AS2
7. (R \rightarrow P) \rightarrow (R \rightarrow S) 5,6 MP
8. R \rightarrow S 4,7 MP
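
The definition of a derivation is mechanical enough to be checked by a short program. The following Python sketch encodes the three axiom schemata and tests each step of the derivation above; the tuple encoding of wffs, the use of lowercase strings as metavariables, and the function names are all assumptions of this illustration.

```python
def Not(x):
    return ('not', x)

def Imp(x, y):
    return ('->', x, y)

# Lowercase strings stand for the metavariables alpha, beta, gamma.
AS1 = Imp('a', Imp('b', 'a'))
AS2 = Imp(Imp('a', Imp('b', 'c')), Imp(Imp('a', 'b'), Imp('a', 'c')))
AS3 = Imp(Imp(Not('a'), Not('b')), Imp(Imp(Not('a'), 'b'), 'a'))

def match(pattern, wff, env):
    """One-way matching: bind metavariables in pattern so that it equals wff."""
    if isinstance(pattern, str):
        if pattern in env:
            return env[pattern] == wff
        env[pattern] = wff
        return True
    return (isinstance(wff, tuple) and len(wff) == len(pattern)
            and wff[0] == pattern[0]
            and all(match(p, w, env) for p, w in zip(pattern[1:], wff[1:])))

def is_axiom(wff):
    return any(match(schema, wff, {}) for schema in (AS1, AS2, AS3))

def is_derivation(steps, premises):
    """Each step must be a premise, an axiom, or follow from earlier steps by MP."""
    for i, wff in enumerate(steps):
        earlier = steps[:i]
        by_mp = any(Imp(k, wff) in earlier for k in earlier)
        if not (wff in premises or is_axiom(wff) or by_mp):
            return False
    return True

premises = ['P', Imp(Imp('R', 'P'), Imp('R', Imp('P', 'S')))]
derivation = [
    premises[0],
    premises[1],
    Imp('P', Imp('R', 'P')),
    Imp('R', 'P'),
    Imp('R', Imp('P', 'S')),
    Imp(Imp('R', Imp('P', 'S')), Imp(Imp('R', 'P'), Imp('R', 'S'))),
    Imp(Imp('R', 'P'), Imp('R', 'S')),
    Imp('R', 'S'),
]
print(is_derivation(derivation, premises))
```

Removing line 3 from the sequence leaves line 4 unjustified, and the checker then rejects the sequence.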

Historically, the original axiomatic systems for logic were designed to be akin to other axiomatic systems found in mathematics, such as Euclid’s axiomatization of geometry. The goal of developing an axiomatic system for logic was to create a system in which to derive truths of logic making use only of the axioms of the system and the inference rule(s). Those wffs that can be derived from the axioms and inference rule alone, that is, without making use of any additional premises, are called theorems or theses of the system. To make this more precise:

Definition: a wff \alpha is said to be a theorem of PC if and only if there is an ordered sequence of wffs, specifically, a derivation, \beta_1, \beta_2, ..., \beta_n such that, \alpha is \beta_n and each wff \beta_i in the sequence \beta_1, \beta_2, ..., \beta_n, is such that either (1) \beta_i is an axiom of PC, or (2) \beta_i follows from previous members of the series by modus ponens.

One very simple theorem of system PC is the wff “P \rightarrow P“. We can show that it is a theorem by constructing a derivation of “P \rightarrow P” that makes use only of axioms and MP and no additional premises.

1. P \rightarrow (P \rightarrow P) Instance of AS1
2. P \rightarrow ((P \rightarrow P) \rightarrow P) Instance of AS1
3. (P \rightarrow ((P \rightarrow P) \rightarrow P)) \rightarrow ((P \rightarrow (P \rightarrow P)) \rightarrow (P \rightarrow P)) Instance of AS2
4. (P \rightarrow (P \rightarrow P)) \rightarrow (P \rightarrow P) 2,3 MP
5. P \rightarrow P 1,4 MP

It is fairly easy to see that not only is “P \rightarrow P” a theorem of PC, but so is any wff of the form \ulcorner \alpha \rightarrow \alpha \urcorner. Whatever \alpha happens to be, there will be a derivation in PC of the same form:

1. \alpha \rightarrow (\alpha \rightarrow \alpha) Instance of AS1
2. \alpha \rightarrow ((\alpha \rightarrow \alpha) \rightarrow \alpha) Instance of AS1
3. (\alpha \rightarrow ((\alpha \rightarrow \alpha) \rightarrow \alpha)) \rightarrow ((\alpha \rightarrow (\alpha \rightarrow \alpha)) \rightarrow (\alpha \rightarrow \alpha)) Instance of AS2
4. (\alpha \rightarrow (\alpha \rightarrow \alpha)) \rightarrow (\alpha \rightarrow \alpha) 2,3 MP
5. \alpha \rightarrow \alpha 1,4 MP

So even if we make \alpha in the above the more complicated wff, for example, “\neg (\neg M \rightarrow N)“, a derivation with the same form shows that “\neg (\neg M \rightarrow N) \rightarrow \neg (\neg M \rightarrow N)” is also a theorem of PC. Hence, we call \ulcorner \alpha \rightarrow \alpha \urcorner a theorem schema of PC, because all of its instances are theorems of PC. From now on, let’s call it “Theorem Schema 1”, or “TS1” for short.
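
Because the derivation has the same form whatever \alpha is, it can be generated by a function of \alpha. The sketch below does this for wffs encoded as nested tuples; the encoding and the name `ts1_derivation` are assumptions of this illustration.

```python
def Not(x):
    return ('not', x)

def Imp(x, y):
    return ('->', x, y)

def ts1_derivation(alpha):
    """Return the five-step derivation of alpha -> alpha for any wff alpha."""
    aa = Imp(alpha, alpha)
    step1 = Imp(alpha, aa)               # instance of AS1
    step2 = Imp(alpha, Imp(aa, alpha))   # instance of AS1
    step3 = Imp(step2, Imp(step1, aa))   # instance of AS2
    step4 = Imp(step1, aa)               # from 2, 3 by MP
    step5 = aa                           # from 1, 4 by MP
    return [step1, step2, step3, step4, step5]

# alpha is ~(~M -> N), the example used in the text
alpha = Not(Imp(Not('M'), 'N'))
steps = ts1_derivation(alpha)

# The two MP steps have the required shape, and the last line is alpha -> alpha.
print(steps[2] == Imp(steps[1], steps[3]))
print(steps[3] == Imp(steps[0], steps[4]))
print(steps[4] == Imp(alpha, alpha))
```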

The following are also theorem schemata of PC:

\alpha \rightarrow \neg \neg \alpha (Theorem Schema 2, or TS2)
\neg \alpha \rightarrow (\alpha \rightarrow \beta) (TS3)
\alpha \rightarrow (\neg \beta \rightarrow \neg (\alpha \rightarrow \beta)) (TS4)
(\alpha \rightarrow \beta) \rightarrow ((\neg \alpha \rightarrow \beta) \rightarrow \beta) (TS5)

You may wish to verify this for yourself by attempting to construct the appropriate proofs for each. Be warned that some require quite lengthy derivations!
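
Before attempting the derivations, a quick sanity check is to confirm that each schema is at least a truth-table tautology. The Python sketch below does only that; it does not construct the derivations themselves, and the names `imp`, `SCHEMATA` and `is_tautology` are assumptions of this illustration.

```python
from itertools import product

def imp(a, b):
    return (not a) or b

SCHEMATA = {
    "TS2": lambda a, b: imp(a, not (not a)),
    "TS3": lambda a, b: imp(not a, imp(a, b)),
    "TS4": lambda a, b: imp(a, imp(not b, not imp(a, b))),
    "TS5": lambda a, b: imp(imp(a, b), imp(imp(not a, b), b)),
}

def is_tautology(form):
    """True if the form is true for every assignment to alpha and beta."""
    return all(form(a, b) for a, b in product([True, False], repeat=2))

for name, form in SCHEMATA.items():
    print(name, is_tautology(form))
```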

It is common to use the notation:

\vdash \beta

to mean that \beta is a theorem. Similarly, it is common to use the notation:

\alpha_1, \alpha_2, ..., \alpha_m \vdash \beta

to mean that it is possible to construct a derivation of \beta making use of \alpha_1, \alpha_2, ..., \alpha_m as premises.

Considered in terms of the number of rules it employs, the axiomatic system PC is far less complex than the system of natural deduction sketched in the previous section. The natural deduction system made use of nine inference rules, ten rules of replacement and two additional proof techniques. The axiomatic system, by contrast, makes use of three axiom schemata, a single inference rule and no additional proof techniques. Yet, the axiomatic system is not lacking in any way.

Indeed, for any argument using language PL’ that is logically valid according to truth tables, it is possible to construct a derivation in system PC for that argument. Moreover, every wff of language PL’ that is a logical truth, that is, a tautology according to truth tables, is a theorem of PC. The reverse of these results is true as well; every theorem of PC is a tautology, and every argument for which a derivation in system PC exists is logically valid according to truth tables. These and other features of the Propositional Calculus are discussed, and some are proven, in the next section.

While the Propositional Calculus is simpler in one way than the natural deduction system sketched in the previous section, in many ways it is actually more complicated to use. For any given argument, a deduction of the conclusion from the premises conducted in PC is likely to be far longer and less psychologically natural than one carried out in a natural deduction system. Such deductions are only simpler in the sense that fewer distinct rules are employed.

System PC is only one of many possible ways of axiomatizing propositional logic. Some systems differ from PC in only very minor ways. For example, we could alter our definition of “axiom” so that a wff is an axiom iff it is an instance of AS1, an instance of AS2, or an instance of the following:

(A3′) (\neg \alpha \rightarrow \neg \beta) \rightarrow (\beta \rightarrow \alpha)

Replacing axiom schema AS3 with (A3′), while altering the way certain deductions must be constructed (making the proofs of many important results longer), has little effect otherwise; the resulting system would have all the same theorems, and every argument for which a deduction is possible in the system above would also have a deduction in the revised system, and vice versa.

We also noted above that, strictly speaking, there are an infinite number of axioms of system PC. Instead of utilizing an infinite number of axioms, we might alternatively have utilized only three axioms, namely, the specific wffs:

(A1*) P \rightarrow (Q \rightarrow P)
(A2*) (P \rightarrow (Q \rightarrow R)) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R))
(A3*) (\neg P \rightarrow \neg Q) \rightarrow ((\neg P \rightarrow Q) \rightarrow P)

Note that (A1*) is just a unique wff; on this approach, the wff “(\neg R \rightarrow \neg \neg S) \rightarrow (\neg(\neg M \rightarrow N) \rightarrow (\neg R \rightarrow \neg \neg S))” would not count as an axiom, even though it shares a common form with (A1*). To such a system it would be necessary to add an additional inference rule, a rule of substitution or uniform replacement. This would allow one to infer, from a theorem of the system, the result of uniformly replacing any given statement letter (for example, ‘P’ or ‘Q’) that occurs within the theorem, with any wff, simple or complex, provided that the same wff replaces all occurrences of the same statement letter in the theorem. On this approach, “(\neg R \rightarrow \neg \neg S) \rightarrow (\neg(\neg M \rightarrow N) \rightarrow (\neg R \rightarrow \neg \neg S))”, while not an axiom, would still be a theorem because it could be derived from (A1*) by applying the rule of uniform replacement twice, that is, by first replacing ‘P’ in (A1*) with “(\neg R \rightarrow \neg \neg S)”, and then replacing ‘Q’ with “\neg(\neg M \rightarrow N)”. The resulting system differs in only subtle ways from our earlier system PC. System PC, strictly speaking, uses only one inference rule, but countenances an infinite number of axioms. This system uses only three axioms, but makes use of an additional rule. System PC, however, avoids this additional inference rule by allowing everything that one could get by substitution in (A1*) to be an axiom. For every theorem \alpha, therefore, if \beta is a wff obtained from \alpha by uniformly substituting wffs for statement letters in \alpha, then \beta is also a theorem of PC, because there would always be a proof of \beta analogous to the proof of \alpha only beginning from different axioms.
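
The rule of uniform replacement is easy to state as a recursive function on wffs. The following Python sketch applies it twice to (A1*) to obtain the longer wff discussed above; the tuple encoding and the name `substitute` are assumptions of this illustration.

```python
def Not(x):
    return ('not', x)

def Imp(x, y):
    return ('->', x, y)

def substitute(wff, letter, replacement):
    """Replace every occurrence of a statement letter in wff with replacement."""
    if isinstance(wff, str):
        return replacement if wff == letter else wff
    return (wff[0],) + tuple(substitute(part, letter, replacement)
                             for part in wff[1:])

A1_STAR = Imp('P', Imp('Q', 'P'))

# First replace 'P' with (~R -> ~~S), then 'Q' with ~(~M -> N).
new_p = Imp(Not('R'), Not(Not('S')))
new_q = Not(Imp(Not('M'), 'N'))
result = substitute(substitute(A1_STAR, 'P', new_p), 'Q', new_q)

print(result == Imp(new_p, Imp(new_q, new_p)))
```

Applying the substitutions one after another is safe here because the wff put in for ‘P’ does not itself contain ‘Q’; in general, a simultaneous substitution would be needed to avoid replacing letters introduced by an earlier step.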

It is also possible to construct even more austere systems. Indeed, it is possible to utilize only a single axiom schema (or a single axiom plus a rule of replacement). One possibility, suggested by C. A. Meredith (1953), would be to define an axiom as any wff matching the following form:

((((\alpha \rightarrow \beta) \rightarrow (\neg \gamma \rightarrow \neg \delta)) \rightarrow \gamma) \rightarrow \epsilon) \rightarrow ((\epsilon \rightarrow \alpha) \rightarrow (\delta \rightarrow \alpha))

The resulting system is as powerful as system PC and has exactly the same set of theorems. However, it is far less psychologically intuitive and straightforward, and deductions even for relatively simple results are often very long.
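
Although the single schema is hard to read, a truth-table check confirms that every instance is at least a tautology. The Python sketch below runs through all thirty-two assignments to the five metavariables; the function names are assumptions of this illustration, and the check says nothing about how long derivations in the system are.

```python
from itertools import product

def imp(a, b):
    return (not a) or b

def meredith(a, b, c, d, e):
    """((((a -> b) -> (~c -> ~d)) -> c) -> e) -> ((e -> a) -> (d -> a))"""
    left = imp(imp(imp(imp(a, b), imp(not c, not d)), c), e)
    right = imp(imp(e, a), imp(d, a))
    return imp(left, right)

print(all(meredith(*v) for v in product([True, False], repeat=5)))  # True
```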

Historically, the first single axiom schema system made use not of language PL’ but of the even simpler language PL”, in which the only connective is the Sheffer stroke, ‘|’, as discussed above. In that case, it is possible to make use only of the following axiom schema:

(\alpha | (\beta | \gamma)) | ((\delta | (\delta | \delta)) | ((\epsilon | \beta) | ((\alpha | \epsilon) | (\alpha | \epsilon))))

The inference rule of MP is replaced with the rule that from wffs of the form \ulcorner \alpha | (\beta | \gamma) \urcorner and \alpha, one can deduce the wff \gamma. This system was discovered by Jean Nicod (1917). Subsequently, a number of possible single axiom systems have been found, some faring better than others in terms of the complexity of the single axiom and in terms of how long deductions for the same results are required to be. (For research in this area, consult McCune et al. 2002.) Generally, however, the more the system allows, the shorter the deductions.
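
Both the axiom schema and the accompanying inference rule can be checked against truth tables, reading ‘|’ as "not both". The Python sketch below is an illustration only; the names `stroke`, `nicod_axiom` and `nicod_rule_is_sound` are assumptions introduced here.

```python
from itertools import product

def stroke(a, b):
    """Sheffer stroke: true unless both a and b are true."""
    return not (a and b)

def nicod_axiom(a, b, c, d, e):
    """(a|(b|c)) | ((d|(d|d)) | ((e|b) | ((a|e)|(a|e))))"""
    return stroke(
        stroke(a, stroke(b, c)),
        stroke(stroke(d, stroke(d, d)),
               stroke(stroke(e, b), stroke(stroke(a, e), stroke(a, e)))))

def nicod_rule_is_sound():
    """From a|(b|c) and a, infer c: check that c is true whenever both premises are."""
    return all(c for a, b, c in product([True, False], repeat=3)
               if stroke(a, stroke(b, c)) and a)

print(all(nicod_axiom(*v) for v in product([True, False], repeat=5)))
print(nicod_rule_is_sound())
```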

Besides axiomatic and natural deduction forms, deduction systems for propositional logic can also take the form of a sequent calculus; here, rather than specifying definitions of axioms and inference rules, the rules are stated directly in terms of derivability or entailment conditions; for example, one rule might state that if (either \alpha \vdash \beta or \alpha \vdash \gamma) then if \gamma, \alpha \vdash \beta then \alpha \vdash \beta. Sequent calculi, like modern natural deduction systems, were first developed by Gerhard Gentzen. Gentzen’s work also suggests the use of tree-like deduction systems rather than linear step-by-step deduction systems, and such tree systems have proven more useful in automated theorem-proving, that is, in the creation of algorithms for the mechanical construction of deductions (for example, by a computer). However, rather than exploring the details of these and other rival systems, in the next section, we focus on proving things about the system PC, the axiomatic system treated at length above.

7. Important Meta-Theoretic Results for the Propositional Calculus

Note: this section is relatively more technical, and is designed for audiences with some prior background in logic or mathematics. Beginners may wish to skip to the next section.

In this section, we sketch informally the proofs given for certain important features of the Propositional Calculus. Our first topic, however, concerns the language PL’ generally.

Metatheoretic result 1: Language PL’ is expressively adequate, that is, within the context of classical bivalent logic, there are no truth-functions that cannot be represented in it.

We noted in Section III(c) that the connectives ‘\land‘, ‘↔’ and ‘\lor‘ can be defined using the connectives of PL’ (‘→’ and ‘\neg‘). More generally, metatheoretic result 1 holds that any statement built using truth-functional connectives, regardless of what those connectives are, has an equivalent statement formed using only ‘→’ and ‘\neg‘. Here’s the proof.

1. Assume that \alpha is some wff built in some language containing any set of truth-functional connectives, including those not found in PL, PL’ or PL”. For example, \alpha might make use of some three or four-place truth-functional connectives, or connectives such as the exclusive or, or the sign ‘\downarrow‘, or any others you might imagine.

2. We need to show that there is a wff \beta formed only with the connectives ‘→’ and ‘\neg‘ that is logically equivalent with \alpha. Because we have already shown that forms equivalent to those built from ‘\land‘, ‘↔’, and ‘\lor‘ can be constructed from ‘→’ and ‘\neg‘, we are entitled to use them as well.

3. In order for it to be logically equivalent to \alpha, the wff \beta that we construct must have the same final truth-value for every possible truth-value assignment to the statement letters making up \alpha, or in other words, it must have the same final column in a truth table.

4. Let p_1, p_2, ..., p_n be the distinct statement letters making up \alpha. For some possible truth-value assignments to these letters, \alpha may be true, and for others \alpha may be false. The only hard case would be the one in which \alpha is contingent. If \alpha were not contingent, it must either be a tautology, or a self-contradiction. Since clearly tautologies and self-contradictions can be constructed in PL’, and all tautologies are logically equivalent to one another, and all self-contradictions are equivalent to one another, in those cases, our job is easy. Let us suppose instead that \alpha is contingent.

5. Let us construct a wff \beta in the following way.

(a) Consider in turn each truth-value assignment to the letters p_1, p_2, ..., p_n. For each truth-value assignment, construct a conjunction made up of those letters the truth-value assignment makes true, along with the negations of those letters the truth-value assignment makes false. For instance, if the letters involved are ‘A‘, ‘B‘ and ‘C‘, and the truth-value assignment makes ‘A‘ and ‘C‘ true but ‘B‘ false, consider the conjunction ‘((A \land \neg B) \land C)‘.

(b) From the resulting conjunctions, form a complex disjunction formed from those conjunctions formed in step (a) for which the corresponding truth-value assignment makes \alpha true. For example, if the truth-value assignment making ‘A’ and ‘C’ true but ‘B’ false makes \alpha true, include it in the disjunction. Suppose, for example, that this truth-value assignment does make \alpha true, as does that assignment in which ‘A’ and ‘B’ and ‘C’ are all made false, but no other truth-value assignment makes \alpha true. In that case, the resulting disjunction would be ‘((A \land \neg B) \land C) \lor ((\neg A \land \neg B) \land \neg C)’.

6. The wff \beta constructed in step 5 is logically equivalent to \alpha. Consider that for those truth-value assignments making \alpha true, one of the conjunctions making up the disjunction \beta is true, and hence the whole disjunction is true as well. For those truth-value assignments making \alpha false, none of the conjunctions making up \beta is true, because each conjunction will contain at least one conjunct that is false on that truth-value assignment.

7. Because \beta is constructed using only ‘\land‘, ‘\lor‘ and ‘\neg‘, and these can in turn be defined using only ‘\neg‘ and ‘→’, and because \beta is equivalent to \alpha, there is a wff built up only from ‘\neg‘ and ‘→’ that is equivalent to \alpha, regardless of the connectives making up \alpha.

8. Therefore, PL’ is expressively adequate.
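
The construction in steps 5 through 7 can be sketched in Python. The code below builds \beta from an arbitrary truth function using only ‘→’ and ‘\neg’, and then confirms that \beta agrees with the function on every assignment. The tuple encoding, the function names, and the particular definitions used for conjunction and disjunction (\neg(\alpha \rightarrow \neg \beta) and \neg \alpha \rightarrow \beta) are assumptions of this illustration; they are standard, but may differ in form from those given in Section III(c).

```python
from itertools import product

def Not(x):
    return ('not', x)

def Imp(x, y):
    return ('->', x, y)

def And(x, y):
    return Not(Imp(x, Not(y)))   # assumed definition of conjunction

def Or(x, y):
    return Imp(Not(x), y)        # assumed definition of disjunction

def evaluate(wff, env):
    """Truth-value of a wff built from 'not' and '->' under the assignment env."""
    if isinstance(wff, str):
        return env[wff]
    if wff[0] == 'not':
        return not evaluate(wff[1], env)
    return (not evaluate(wff[1], env)) or evaluate(wff[2], env)

def build_beta(letters, truth_function):
    """Steps 5(a)-(b): one conjunction per assignment making alpha true, then disjoin."""
    disjuncts = []
    for values in product([True, False], repeat=len(letters)):
        if truth_function(*values):
            literals = [l if v else Not(l) for l, v in zip(letters, values)]
            conj = literals[0]
            for lit in literals[1:]:
                conj = And(conj, lit)
            disjuncts.append(conj)
    if not disjuncts:
        raise ValueError("alpha is a self-contradiction; handled separately in step 4")
    beta = disjuncts[0]
    for d in disjuncts[1:]:
        beta = Or(beta, d)
    return beta

def equivalent(letters, truth_function, beta):
    return all(
        evaluate(beta, dict(zip(letters, values))) == truth_function(*values)
        for values in product([True, False], repeat=len(letters)))

# The example from step 5(b): true when A, C true and B false, or when all are false.
example = lambda a, b, c: (a and not b and c) or (not a and not b and not c)
beta = build_beta(['A', 'B', 'C'], example)
print(equivalent(['A', 'B', 'C'], example, beta))

# A connective not found in PL: exclusive or.
xor = lambda a, b: a != b
print(equivalent(['A', 'B'], xor, build_beta(['A', 'B'], xor)))
```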

Corollary 1.1: Language PL” is also expressively adequate.

The corollary follows at once from metatheoretic result 1, along with the fact, noted in Section III(c), that ‘→’, and ‘\neg‘ can be defined using only ‘|’.
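
The definability claim can be checked directly. The Python sketch below assumes the usual definitions, \neg \alpha as \alpha | \alpha and \alpha \rightarrow \beta as \alpha | (\beta | \beta), which may differ in form from those of Section III(c), and compares them with the ordinary truth tables.

```python
from itertools import product

def stroke(a, b):
    """Sheffer stroke: true unless both a and b are true."""
    return not (a and b)

def neg(a):
    return stroke(a, a)              # assumed definition of negation

def arrow(a, b):
    return stroke(a, stroke(b, b))   # assumed definition of the conditional

neg_ok = all(neg(a) == (not a) for a in [True, False])
arrow_ok = all(arrow(a, b) == ((not a) or b)
               for a, b in product([True, False], repeat=2))
print(neg_ok, arrow_ok)
```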

Metatheoretic result 2 (a.k.a. “The Deduction Theorem”): In the Propositional Calculus, PC, whenever it holds that \alpha_1, ..., \alpha_n \vdash \beta, it also holds that \alpha_1, ..., \alpha_{n-1} \vdash \alpha_n \rightarrow \beta

What this means is that whenever we can prove a given result in PC using a certain number of premises, it is possible, using all the same premises except one, \alpha_n, to prove the conditional statement with the removed premise, \alpha_n, as antecedent and the conclusion of the original derivation, \beta, as consequent. The importance of this result is that, in effect, it shows that the technique of conditional proof, typically found in natural deduction (see Section V), is unnecessary in PC, because whenever it is possible to prove the consequent of a conditional by taking the antecedent as an additional premise, a derivation directly for the conditional can be found without taking the antecedent as a premise.

Here’s the proof:

1. Assume that \alpha_1, ..., \alpha_n \vdash \beta. This means that there is a derivation of \beta in the Propositional Calculus from the premises \alpha_1, ..., \alpha_n. This derivation takes the form of an ordered sequence \gamma_1, \gamma_2, ..., \gamma_m, where the last member of the sequence, \gamma_m, is \beta, and each member of the sequence is either (1) a premise, that is, it is one of \alpha_1, ..., \alpha_n, (2) an axiom of PC, (3) derived from previous members of the sequence by modus ponens.

2. We need to show that there is a derivation of \ulcorner \alpha_n \rightarrow \beta \urcorner, which, while possibly making use of the other premises of the argument, does not make use of \alpha_n. We’ll do this by showing that for each member, \gamma_i, of the sequence of the original derivation: \gamma_1, \gamma_2, ..., \gamma_m, one can derive \ulcorner \alpha_n \rightarrow \gamma_i \urcorner without making use of \alpha_n as a premise.

3. Each step \gamma_i in the sequence of the original derivation was gotten at in one of three ways, as mentioned in (1) above. Regardless of which case we are dealing with, we can get the result that \alpha_1, ..., \alpha_{n-1} \vdash \alpha_n \rightarrow \gamma_i. There are three cases to consider:

Case (a): Suppose \gamma_i is a premise of the original argument. Then \gamma_i is either one of \alpha_1, ..., \alpha_{n-1} or it is \alpha_n itself. In the latter subcase, what we desire to get is that \ulcorner \alpha_n \rightarrow \alpha_n \urcorner can be gotten at without using \alpha_n as a premise. Because \ulcorner \alpha_n \rightarrow \alpha_n \urcorner is an instance of TS1, we can get it without using any premises. In the former subcase, notice that \gamma_i is one of the premises we’re allowed to use in the new derivation. We’re also allowed to introduce the instance of AS1, \ulcorner \gamma_i \rightarrow (\alpha_n \rightarrow \gamma_i) \urcorner. From these, we can get \ulcorner \alpha_n \rightarrow \gamma_i \urcorner by modus ponens.

Case (b): Suppose \gamma_i is an axiom. We need to show that we can get \ulcorner \alpha_n \rightarrow \gamma_i \urcorner without using \alpha_n as a premise. In fact, we can get it without using any premises. Because \gamma_i is an axiom, we can use it in the new derivation as well. As in the last case, we have \ulcorner \gamma_i \rightarrow (\alpha_n \rightarrow \gamma_i) \urcorner as another axiom (an instance of AS1). From these two axioms, we arrive at \ulcorner \alpha_n \rightarrow \gamma_i \urcorner by modus ponens.

Case (c): Suppose that \gamma_i was derived from previous members of the sequence by modus ponens. Specifically, there is some \gamma_j and \gamma_k such that both j and k are less than i, and \gamma_j takes the form \ulcorner \gamma_k \rightarrow \gamma_i \urcorner. We can assume that we have already been able to derive both \ulcorner \alpha_n \rightarrow \gamma_j \urcorner—that is, \ulcorner \alpha_n \rightarrow (\gamma_k \rightarrow \gamma_i) \urcorner—and \ulcorner \alpha_n \rightarrow \gamma_k \urcorner in the new derivation without making use of \alpha_n. (This may seem questionable in the case that either \gamma_j or \gamma_k was itself gotten at by modus ponens. But notice that this just pushes the assumption back, and eventually one will reach the beginning of the original derivation. The first two steps of the sequence, namely, \gamma_1 and \gamma_2, cannot have been derived by modus ponens, since this would require there to have been two previous members of the sequence, which is impossible.) So, in our new derivation, we already have both \ulcorner \alpha_n \rightarrow (\gamma_k \rightarrow \gamma_i) \urcorner and \ulcorner \alpha_n \rightarrow \gamma_k \urcorner.
Notice that \ulcorner (\alpha_n \rightarrow (\gamma_k \rightarrow \gamma_i)) \rightarrow ((\alpha_n \rightarrow \gamma_k) \rightarrow (\alpha_n \rightarrow \gamma_i)) \urcorner is an instance of AS2, and so it can be introduced in the new derivation. By two steps of modus ponens, we arrive at \ulcorner \alpha_n \rightarrow \gamma_i \urcorner, again without using \alpha_n as a premise.

4. If we continue through each step of the original derivation, showing for each such step \gamma_i, we can get \ulcorner \alpha_n \rightarrow \gamma_i \urcorner without using \alpha_n as a premise, eventually, we come to the last step of the original derivation, \gamma_m, which is \beta itself. Applying the procedure from step (3), we get that \ulcorner \alpha_n \rightarrow \beta \urcorner without making use of \alpha_n as a premise. Therefore, the new derivation formed in this way shows that \alpha_1, ..., \alpha_{n-1} \vdash \alpha_n \rightarrow \beta, which is what we were attempting to show.

What’s interesting about this proof for metatheoretic result 2 is that it provides a recipe, given a derivation for a certain result that makes use of one or more premises, for transforming that derivation into one of a conditional statement in which one of the premises of the original argument has become the antecedent. This may be much clearer with an example.

Consider the following derivation for the result that Q \rightarrow R \vdash (P \rightarrow Q) \rightarrow (P \rightarrow R):

1. Q \rightarrow R Premise
2. (Q \rightarrow R) \rightarrow (P \rightarrow (Q \rightarrow R)) AS1
3. P \rightarrow (Q \rightarrow R) 1,2 MP
4. (P \rightarrow (Q \rightarrow R)) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R)) AS2
5. (P \rightarrow Q) \rightarrow (P \rightarrow R) 3,4 MP

It is possible to transform the above derivation into one that uses no premises and that shows that \ulcorner (Q \rightarrow R) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R)) \urcorner is a theorem of PC. The procedure for such a transformation involves looking at each step of the original derivation, and for each one, attempting to derive the same statement, only beginning with “(Q \rightarrow R) \rightarrow ...“, without making use of “(Q \rightarrow R)” as a premise. How this is done depends on whether the step is a premise, an axiom, or a result of modus ponens, and depending on which it is, applying one of the three procedures sketched in the proof above. The result is the following:

1. (Q \rightarrow R) \rightarrow (Q \rightarrow R) TS1
2. (Q \rightarrow R) \rightarrow (P \rightarrow (Q \rightarrow R)) AS1
3. ((Q \rightarrow R) \rightarrow (P \rightarrow (Q \rightarrow R))) \rightarrow ((Q \rightarrow R) \rightarrow ((Q \rightarrow R) \rightarrow (P \rightarrow (Q \rightarrow R)))) AS1
4. (Q \rightarrow R) \rightarrow ((Q \rightarrow R) \rightarrow (P \rightarrow (Q \rightarrow R))) 2,3 MP
5. ((Q \rightarrow R) \rightarrow ((Q \rightarrow R) \rightarrow (P \rightarrow (Q \rightarrow R)))) \rightarrow (((Q \rightarrow R) \rightarrow (Q \rightarrow R)) \rightarrow ((Q \rightarrow R) \rightarrow (P \rightarrow (Q \rightarrow R)))) AS2
6. ((Q \rightarrow R) \rightarrow (Q \rightarrow R)) \rightarrow ((Q \rightarrow R) \rightarrow (P \rightarrow (Q \rightarrow R))) 4,5 MP
7. (Q \rightarrow R) \rightarrow (P \rightarrow (Q \rightarrow R)) 1,6 MP
8. (P \rightarrow (Q \rightarrow R)) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R)) AS2
9. ((P \rightarrow (Q \rightarrow R)) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R))) \rightarrow \big((Q \rightarrow R) \rightarrow ((P \rightarrow (Q \rightarrow R)) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R)))\big) AS1
10. (Q \rightarrow R) \rightarrow ((P \rightarrow (Q \rightarrow R)) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R))) 8,9 MP
11. \big((Q \rightarrow R) \rightarrow ((P \rightarrow (Q \rightarrow R)) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R)))\big) \rightarrow (((Q \rightarrow R) \rightarrow (P \rightarrow (Q \rightarrow R))) \rightarrow ((Q \rightarrow R) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R)))) AS2
12. ((Q \rightarrow R) \rightarrow (P \rightarrow (Q \rightarrow R))) \rightarrow ((Q \rightarrow R) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R))) 10,11 MP
13. (Q \rightarrow R) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R)) 7,12 MP

The procedure for transforming one sort of derivation into another is purely rote. Moreover, the result is quite often not the most elegant or easiest way to show that which you were trying to show. Notice, for example, in the above that lines (2) and (7) are redundant, and more steps were taken than necessary. However, the purely rote procedure is effective.
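Because the procedure is rote, it can be written down as a short program. The following Python sketch is only an illustration: the tuple encoding of wffs, the function names `imp` and `discharge`, and the 0-based step indices are assumptions made here, and TS1 is cited by name rather than expanded into its own derivation. Applied to the five-step derivation above, it produces a thirteen-line derivation with the same shape as the one shown.

```python
# Wffs are nested tuples (a hypothetical encoding): a statement letter is a
# string, and ("->", a, b) stands for the conditional a -> b.
def imp(a, b):
    return ("->", a, b)

def discharge(derivation, alpha):
    """Rewrite a derivation so that its last line becomes alpha -> (last line)
    and alpha is no longer used as a premise.

    Each step is (formula, justification). A justification is "Premise",
    an axiom label such as "AS1", or ("MP", j, k), where step j is the
    minor premise and step k is the conditional (0-based indices).
    """
    out = []     # the new derivation
    where = {}   # original step i -> index in `out` of the line alpha -> gamma_i
    for i, (gamma, just) in enumerate(derivation):
        if just == "Premise" and gamma == alpha:
            # gamma is alpha itself: cite TS1, alpha -> alpha
            out.append((imp(alpha, alpha), "TS1"))
        elif isinstance(just, tuple):
            # gamma came by modus ponens: use AS2, then modus ponens twice
            _, j, k = just
            minor, major = where[j], where[k]
            target = imp(alpha, gamma)
            out.append((imp(out[major][0], imp(out[minor][0], target)), "AS2"))
            out.append((imp(out[minor][0], target), ("MP", major, len(out) - 1)))
            out.append((target, ("MP", minor, len(out) - 1)))
        else:
            # gamma is an axiom or another premise: keep it, then use AS1
            out.append((gamma, just))
            out.append((imp(gamma, imp(alpha, gamma)), "AS1"))
            out.append((imp(alpha, gamma), ("MP", len(out) - 2, len(out) - 1)))
        where[i] = len(out) - 1
    return out

# The five-step derivation of (P -> Q) -> (P -> R) from the premise Q -> R.
Q_R = imp("Q", "R")
P_QR = imp("P", Q_R)
goal = imp(imp("P", "Q"), imp("P", "R"))
original = [
    (Q_R, "Premise"),
    (imp(Q_R, P_QR), "AS1"),
    (P_QR, ("MP", 0, 1)),
    (imp(P_QR, goal), "AS2"),
    (goal, ("MP", 2, 3)),
]
transformed = discharge(original, Q_R)
print(len(transformed))  # 13
```

The sketch makes no attempt to shorten the result; like the rote procedure itself, it emits three lines for every axiom and every use of modus ponens, whether or not they are needed.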

This metatheoretic result is due to Jacques Herbrand (1930).

It is interesting on its own, especially when one reflects on it as a substitution or replacement for the conditional proof technique. However, it is also very useful for proving other metatheoretic results, as we shall see below.

Metatheoretic result 3: If \alpha is a wff of language PL’, and the statement letters making it up are p_1, p_2, ..., p_n, then if we consider any possible truth-value assignment to these letters, and consider the set of premises, \Delta, that contains p_1 if the truth-value assignment makes p_1 true, but contains \ulcorner \neg p_1 \urcorner if the truth-value assignment makes p_1 false, and similarly for p_2, ..., p_n, if the truth-value assignment makes \alpha true, then in PC, it holds that \Delta \vdash \alpha, and if it makes \alpha false, then \Delta \vdash \neg \alpha.

Here’s the proof.

1. By the definition of a wff, \alpha is either itself a statement letter, or ultimately built up from statement letters by the connectives ‘\neg‘ and ‘→’.

2. If \alpha is itself a statement letter, then obviously either it or its negation is a member of \Delta. It is a member of \Delta if the truth-value assignment makes it true. In that case, obviously, there is a derivation of \alpha from \Delta, since a premise may be introduced at any time. If the truth-value assignment makes it false instead, then \ulcorner \neg \alpha \urcorner is a member of \Delta, and so we have a derivation of \ulcorner \neg \alpha \urcorner from \Delta, since again a premise may be introduced at any time. This covers the case in which our wff is simply a statement letter.

3. Suppose that \alpha is built up from some other wff \beta with the sign ‘\neg‘, that is, suppose that \alpha is \ulcorner \neg \beta \urcorner. We can assume that we have already gotten the desired result for \beta. (Either \beta is a statement letter, in which case the result holds by step (2), or is itself ultimately built up from statement letters, so even if verifying this assumption requires making a similar assumption, ultimately we will get back to statement letters.) That is, if the truth-value assignment makes \beta true, then we have a derivation of \beta from \Delta. If it makes it false, then we have a derivation of \ulcorner \neg \beta \urcorner from \Delta. Suppose that it makes \beta true. Since \alpha is the negation of \beta, the truth-value assignment must make \alpha false. Hence, we need to show that there is a derivation of \ulcorner \neg \alpha \urcorner from \Delta. Since \alpha is \ulcorner \neg \beta \urcorner, \ulcorner \neg \alpha \urcorner is \ulcorner \neg \neg \beta \urcorner. If we append to our derivation of \beta from \Delta the derivation of \ulcorner \beta \rightarrow \neg \neg \beta \urcorner, an instance of TS2, we can reach a derivation of \ulcorner \neg \neg \beta \urcorner by modus ponens, which is what was required. If we assume instead that the truth-value assignment makes \beta false, then by our assumption, there is a derivation of \ulcorner \neg \beta \urcorner from \Delta. Since \alpha is the negation of \beta, this truth-value assignment must make \alpha true. Now, \alpha simply is \ulcorner \neg \beta \urcorner, so we already have a derivation of it from \Delta.

4. Suppose instead that \alpha is built up from other wffs \beta and \gamma with the sign ‘→’, that is, suppose that \alpha is \ulcorner \beta \rightarrow \gamma \urcorner. Again, we can assume that we have already gotten the desired result for \beta and \gamma. (Again, either they themselves are statement letters or built up in like fashion from statement letters.) Suppose that the truth-value assignment we are considering makes \alpha true. Because \alpha is \ulcorner \beta \rightarrow \gamma \urcorner, by the semantics for the sign ‘→’, the truth-value assignment must make either \beta false or \gamma true. Take the first subcase. If it makes \beta false, then by our assumption, there is a derivation of \ulcorner \neg \beta \urcorner from \Delta. If we append to this the derivation of the instance of TS3, \ulcorner \neg \beta \rightarrow (\beta \rightarrow \gamma) \urcorner, by modus ponens we arrive at a derivation of \ulcorner \beta \rightarrow \gamma \urcorner, that is, \alpha, from \Delta. If instead, the truth-value assignment makes \gamma true, then by our assumption there is a derivation of \gamma from \Delta. If we add to this derivation the instance of AS1, \ulcorner \gamma \rightarrow (\beta \rightarrow \gamma) \urcorner, by modus ponens, we then again arrive at a derivation of \ulcorner \beta \rightarrow \gamma \urcorner, that is, \alpha, from \Delta. If instead, the truth-value assignment makes \alpha false, then since \alpha is \ulcorner \beta \rightarrow \gamma \urcorner, the truth-value assignment in question must make \beta true and \gamma false. By our assumption, then it is possible to prove both \beta and \ulcorner \neg \gamma \urcorner from \Delta. If we concatenate these two derivations, and add to them the derivation of the instance of TS4, \ulcorner \beta \rightarrow (\neg \gamma \rightarrow \neg (\beta \rightarrow \gamma)) \urcorner, then by two applications of modus ponens, we can derive \ulcorner \neg (\beta \rightarrow \gamma) \urcorner, which is simply \ulcorner \neg \alpha \urcorner, which is what was desired.

From the above we see that the Propositional Calculus PC can be used to demonstrate the appropriate results for a complex wff if given as premises either the truth or falsity of all its simple parts. This is of course the foundation of truth-functional logic, that the truth or falsity of those complex statements one can make in it be determined entirely by the truth or falsity of the simple statements entering in to it. Metatheoretic result 3 is again interesting on its own, but it plays a crucial role in the proof of completeness, which we turn to next.

Metatheoretic result 4 (Completeness): If \alpha is a wff of language PL’ and a tautology, then \alpha is a theorem of the Propositional Calculus.

This feature of the Propositional Calculus is called completeness because it shows that the Propositional Calculus, as a deductive system aiming to capture all the truths of logic, is a success. Every wff true solely in virtue of the truth-functional nature of the connectives making it up is something that one can prove using only the axioms of PC along with modus ponens. Here’s the proof:

1. Suppose that \alpha is a tautology. This means that every possible truth-value assignment to its statement letters makes it true.

2. Let the statement letters making up \alpha be p_1, p_2, ..., p_n, arranged in some order (say alphabetically and by the numerical order of their subscripts). It follows from (1) and metatheoretic result 3, that there is a derivation in PC of \alpha using any possible set of premises that consists, for each statement letter, of either it or its negation.

3. By metatheoretic result 2, we can remove from each of these sets of premises either p_n or \ulcorner \neg p_n \urcorner, depending on which it contains, and make it an antecedent of a conditional in which \alpha is consequent, and the result will be provable without using p_n or \ulcorner \neg p_n \urcorner as a premise. This means that for every possible set of premises consisting of either p_1 or \ulcorner \neg p_1 \urcorner and so on, up until p_{n-1}, we can derive both \ulcorner p_n \rightarrow \alpha \urcorner and \ulcorner \neg p_n \rightarrow \alpha \urcorner.

4. The wff \ulcorner (p_n \rightarrow \alpha) \rightarrow ((\neg p_n \rightarrow \alpha) \rightarrow \alpha) \urcorner is an instance of TS5. Therefore, for any set of premises from which one can derive both \ulcorner p_n \rightarrow \alpha \urcorner and \ulcorner \neg p_n \rightarrow \alpha \urcorner, by two applications of modus ponens, one can also derive \alpha itself.

5. Putting (3) and (4) together, we have the result that \alpha can be derived from every possible set of premises consisting of either p_1 or \ulcorner \neg p_1 \urcorner and so on, up until p_{n-1}.

6. We can apply the same reasoning given in steps (3)-(5) to remove p_{n-1} or its negation from the premise sets by the deduction theorem, arriving at the result that for every set of premises consisting of either p_1 or \ulcorner \neg p_1 \urcorner and so on, up until p_{n-2}, it is possible to derive \alpha. If we continue to apply this reasoning, eventually, we’ll get the result that we can derive \alpha with either p_1 or its negation as our sole premise. Again, applying the deduction theorem, this means that both \ulcorner p_1 \rightarrow \alpha \urcorner and \ulcorner \neg p_1 \rightarrow \alpha \urcorner can be proven in PC without using any premises, that is, they are theorems. Concatenating the derivations of these theorems, along with the instance of TS5, \ulcorner (p_1 \rightarrow \alpha) \rightarrow ((\neg p_1 \rightarrow \alpha) \rightarrow \alpha) \urcorner, and by two applications of modus ponens, it follows that \alpha itself is a theorem, which is what we sought to demonstrate.

The above proof of the completeness of system PC is easier to appreciate when visualized. Suppose, just for the sake of illustration, that the tautology we wish to demonstrate in system PC has three statement letters, ‘P‘, ‘Q‘ and ‘R‘. There are eight possible truth-value assignments to these letters, and since \alpha is a tautology, all of them make \alpha true. We can sketch in at least this much of \alpha‘s truth table:

P Q R | \alpha
T T T | T
T T F | T
T F T | T
T F F | T
F T T | T
F T F | T
F F T | T
F F F | T

Now, given this feature of \alpha, it follows from metatheoretic result 3, that for every possible combination of premises that consists of either ‘P‘ or “\neg P” (but not both), either ‘Q‘ or “\neg Q“, and ‘R‘ or “\neg R“, it is possible from those premises to construct a derivation showing \alpha. This can be visualized as follows:

P, Q, R \vdash \alpha
P, Q, \neg R \vdash \alpha
P, \neg Q, R \vdash \alpha
P, \neg Q, \neg R \vdash \alpha
\neg P, Q, R \vdash \alpha
\neg P, Q, \neg R \vdash \alpha
\neg P, \neg Q, R \vdash \alpha
\neg P, \neg Q, \neg R \vdash \alpha

By the deduction theorem, we can pull out the last premise from each list of premises and make it an antecedent. However, because from the same remaining list of premises we get both \ulcorner R \rightarrow \alpha \urcorner and \ulcorner \neg R \rightarrow \alpha \urcorner, we can get \alpha by itself from those premises according to TS5. Again, to visualize this:

P, Q \vdash R \rightarrow \alpha … and so P, Q \vdash \alpha
P, Q \vdash \neg R \rightarrow \alpha
P, \neg Q \vdash R \rightarrow \alpha … and so P, \neg Q \vdash \alpha
P, \neg Q \vdash \neg R \rightarrow \alpha
\neg P, Q \vdash R \rightarrow \alpha … and so \neg P, Q \vdash \alpha
\neg P, Q \vdash \neg R \rightarrow \alpha
\neg P, \neg Q \vdash R \rightarrow \alpha … and so \neg P, \neg Q \vdash \alpha
\neg P, \neg Q \vdash \neg R \rightarrow \alpha

We can continue this line of reasoning until all the premises are removed.

P, Q \vdash \alpha … and so P \vdash Q \rightarrow \alpha … and so P \vdash \alpha … and so \vdash P \rightarrow \alpha … and so \vdash \alpha
P, \neg Q \vdash \alpha … and so P \vdash \neg Q \rightarrow \alpha
\neg P, Q \vdash \alpha … and so \neg P \vdash Q \rightarrow \alpha … and so \neg P \vdash \alpha … and so \vdash \neg P \rightarrow \alpha
\neg P, \neg Q \vdash \alpha … and so \neg P \vdash \neg Q \rightarrow \alpha

At the end of this process, we see that \alpha is a theorem. Despite only having three axiom schemata and a single inference rule, it is possible to prove any tautology in the simple Propositional Calculus, PC. It is complete in the requisite sense.
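The shrinking lists of premise sets in this visualization can be generated for any number of statement letters. The short Python sketch below is only a bookkeeping aid: the function name `elimination_rounds` and the use of ‘~’ for negation are assumptions made here, and the code does not construct the derivations themselves. It lists, round by round, the premise sets from which \alpha is derivable as the last remaining letter is removed.

```python
from itertools import product

def elimination_rounds(letters):
    """Return one list of premise sets per round, from all letters down to none."""
    rounds = []
    for n in range(len(letters), -1, -1):
        remaining = letters[:n]
        premise_sets = [
            # Each letter appears either plain or negated, never both.
            tuple(p if keep else "~" + p for p, keep in zip(remaining, signs))
            for signs in product([True, False], repeat=n)
        ]
        rounds.append(premise_sets)
    return rounds

rounds = elimination_rounds(["P", "Q", "R"])
for premise_sets in rounds:
    for premises in premise_sets:
        print(", ".join(premises), "|- alpha")
    print()
```

Each round has half as many premise sets as the one before it, because every pair of sets differing only in the last letter collapses into one by the deduction theorem and TS5. The final round contains only the empty premise set, which is the claim that \alpha is a theorem.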

This method of proving the completeness of the Propositional Calculus is due to Kalmár (1935).

Corollary 4.1: If a given wff \beta of language PL’ is a logical consequence of a set of wffs \alpha_1, \alpha_2, ..., \alpha_n, according to their combined truth table, then there is a derivation of \beta with \alpha_1, \alpha_2, ..., \alpha_n as premises in the Propositional Calculus.

Without going into the details of the proof of this corollary, it follows from the fact that if \beta is a logical consequence of \alpha_1, \alpha_2, ..., \alpha_n, then the wff of the form \ulcorner (\alpha_1 \rightarrow (\alpha_2 \rightarrow ... (\alpha_n \rightarrow \beta)...)) \urcorner is a tautology. As a tautology, it is a theorem of PC, and so if one begins with its derivation in PC and appends a number of steps of modus ponens using \alpha_1, \alpha_2, ..., \alpha_n as premises, one can derive \beta.

Metatheoretic result 5 (Soundness): If a wff \alpha is a theorem of the Propositional Calculus (PC), then \alpha is a tautology.

Above, we saw that all tautologies are theorems of PC. The reverse is also true: all theorems of PC are tautologies. Here’s the proof:

1. Suppose that \alpha is a theorem of PC. This means that there is an ordered sequence of steps, each of which is either (1) an axiom of PC, or (2) derived from previous members of the sequence by modus ponens, and such that \alpha is the last member of the sequence.

2. We can show that not only is \alpha a tautology, but so are all the members of the sequence leading to it. The first thing to note is that every axiom of PC is a tautology. To be an axiom of PC, a wff must match one of the axiom schemata AS1, AS2 or AS3. All such wffs must be tautologous; this can easily be verified by constructing truth tables for AS1, AS2 and AS3. (This is left to the reader.)

3. The rule of modus ponens preserves tautologyhood. If \alpha is a tautology and \ulcorner \alpha \rightarrow \beta \urcorner is also a tautology, \beta must be a tautology as well. This is because if \beta were not a tautology, it would be false on some truth-value assignments. However, \alpha, as a tautology, is true for all truth-value assignments. Because a statement of the form \ulcorner \alpha \rightarrow \beta \urcorner is false for any truth-value assignment making \alpha true and \beta false, it would then follow that some truth-value assignment makes \ulcorner \alpha \rightarrow \beta \urcorner false, which is impossible if it too is a tautology.

4. Hence, we see that the axioms with which we begin the sequence, and every step derived from them using modus ponens, must all be tautologies, and consequently, the last step of the sequence, \alpha, must also be a tautology.

This result is called the soundness of the Propositional Calculus; it shows that in it, one cannot demonstrate something that is not logically true.
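The truth-table check of the axiom schemata that step (2) leaves to the reader can also be carried out mechanically. The following Python sketch is offered under some assumptions: the tuple encoding and function names are hypothetical, the schematic letters are treated as if they were statement letters ‘A’, ‘B’ and ‘C’, and the form used for AS3 is the common one, (\neg \beta \rightarrow \neg \alpha) \rightarrow ((\neg \beta \rightarrow \alpha) \rightarrow \beta), which should be compared against the statement of AS3 given earlier in the article.

```python
from itertools import product

def neg(a):
    return ("~", a)

def imp(a, b):
    return ("->", a, b)

def letters(f):
    """Collect the statement letters occurring in a wff."""
    if isinstance(f, str):
        return {f}
    return set().union(*(letters(part) for part in f[1:]))

def value(f, v):
    """Evaluate a wff under the assignment v (a dict from letters to booleans)."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "~":
        return not value(f[1], v)
    return (not value(f[1], v)) or value(f[2], v)

def is_tautology(f):
    """True if every row of the truth table makes f true."""
    names = sorted(letters(f))
    return all(
        value(f, dict(zip(names, row)))
        for row in product([True, False], repeat=len(names))
    )

A, B, C = "A", "B", "C"
AS1 = imp(A, imp(B, A))
AS2 = imp(imp(A, imp(B, C)), imp(imp(A, B), imp(A, C)))
AS3 = imp(imp(neg(B), neg(A)), imp(imp(neg(B), A), B))  # assumed form of AS3

for name, schema in [("AS1", AS1), ("AS2", AS2), ("AS3", AS3)]:
    print(name, is_tautology(schema))
```

Checking the schemata with letters in place of \alpha, \beta and \gamma suffices, because the truth-value of any instance depends only on the truth-values of the wffs substituted for those letters.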

Corollary 5.1: A wff \alpha of language PL’ is a tautology if and only if \alpha is a theorem of system PC.

This follows immediately from metatheoretic results 4 and 5.

Corollary 5.2 (Consistency): There is no wff \alpha of language PL’ such that both \alpha and \ulcorner \neg \alpha \urcorner are theorems of the Propositional Calculus (PC).

Due to metatheoretic result 5, all theorems of PC are tautologies. It is therefore impossible for both \alpha and \ulcorner \neg \alpha \urcorner to be theorems, as this would require both to be tautologies. That would mean that both are true for all truth-value assignments, but obviously, they must have different truth-values for any given truth-value assignment, and cannot both be true for any, much less all, such assignments.

This result is called consistency because it guarantees that no theorem of system PC can be inconsistent with any other theorem.

Corollary 5.3: If there is a derivation of the wff \beta with \alpha_1, \alpha_2, ..., \alpha_n as premises in the Propositional Calculus, then \beta is a logical consequence of the set of wffs \alpha_1, \alpha_2, ..., \alpha_n, according to their combined truth table.

This is the converse of Corollary 4.1. It follows by the reverse reasoning involved in that corollary. If there is a derivation of \beta taking \alpha_1, \alpha_2, ..., \alpha_n as premises, then by multiple applications of the deduction theorem (Metatheoretic result 2), it follows that \ulcorner (\alpha_1 \rightarrow (\alpha_2 \rightarrow ... (\alpha_n \rightarrow \beta)...)) \urcorner is a theorem of PC. By metatheoretic result 5, \ulcorner (\alpha_1 \rightarrow (\alpha_2 \rightarrow ... (\alpha_n \rightarrow \beta)...)) \urcorner must be a tautology. If so, then there cannot be a truth-value assignment making all of \alpha_1, \alpha_2, ..., \alpha_n true while making \beta false, and so \beta is a logical consequence of \alpha_1, \alpha_2, ..., \alpha_n.

Corollary 5.4: There is a derivation of the wff \beta with \alpha_1, \alpha_2, ..., \alpha_n as premises in the Propositional Calculus if and only if \beta is a logical consequence of \alpha_1, \alpha_2, ..., \alpha_n, according to their combined truth table.

This follows at once from corollaries 4.1 and 5.3. In sum, then, the Propositional Calculus method of demonstrating something to follow from the axioms of logic is extensionally equivalent to the truth table method of determining whether or not something is a logical truth. Similarly, the truth-table method for testing the validity of an argument is equivalent to the test of being able to construct a derivation for it in the Propositional Calculus. In short, the Propositional Calculus is exactly what we wanted it to be.
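The link used in Corollaries 4.1 and 5.3, between an argument's validity and the tautologyhood of its nested conditional, can be checked on small cases by brute force. In the Python sketch below, wffs are represented as functions from a truth-value assignment to a boolean; this representation and the function names are assumptions made for illustration. The first example is the argument from Q \rightarrow R to (P \rightarrow Q) \rightarrow (P \rightarrow R) used earlier.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def rows(letters):
    """Yield every truth-value assignment to the given letters."""
    for values in product([True, False], repeat=len(letters)):
        yield dict(zip(letters, values))

def entails(letters, premises, conclusion):
    """Truth-table test: no row makes every premise true and the conclusion false."""
    return all(conclusion(v) for v in rows(letters) if all(p(v) for p in premises))

def conditional_is_tautology(letters, premises, conclusion):
    """Test whether a1 -> (a2 -> ... (an -> b)...) is true on every row."""
    def nested(v):
        result = conclusion(v)
        for p in reversed(premises):
            result = implies(p(v), result)
        return result
    return all(nested(v) for v in rows(letters))

premise = lambda v: implies(v["Q"], v["R"])
conclusion = lambda v: implies(implies(v["P"], v["Q"]), implies(v["P"], v["R"]))
print(entails("PQR", [premise], conclusion))                   # True
print(conditional_is_tautology("PQR", [premise], conclusion))  # True

# An invalid argument fails both tests: from P -> Q, infer Q -> P.
bad_premise = lambda v: implies(v["P"], v["Q"])
bad_conclusion = lambda v: implies(v["Q"], v["P"])
print(entails("PQ", [bad_premise], bad_conclusion))                   # False
print(conditional_is_tautology("PQ", [bad_premise], bad_conclusion))  # False
```

The two tests agree on every argument, which is the semantic half of the equivalence; the corollaries add that derivability in PC agrees with both.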

Corollary 5.5 (Decidability): The Propositional Calculus (PC) is decidable, that is, there is a finite, effective, rote procedure for determining whether or not a given wff \alpha is a theorem of PC.

By Corollary 5.1, a wff \alpha is a theorem of PC if and only if it is a tautology. Truth tables provide a rote, effective, and finite procedure for determining whether or not a given wff is a tautology. They therefore also provide such a procedure for determining whether or not a given wff is a theorem of PC.

8. Forms of Propositional Logic

So far we have focused only on classical, truth-functional propositional logic. Its distinguishing features are (1) that all connectives it uses are truth-functional, that is, the truth-values of complex statements formed with those connectives depend entirely on the truth-values of the parts, and (2) that it assumes bivalence: all statements are taken to have exactly one of two truth-values—truth or falsity—with no statement assigned both truth-values or neither. Classical truth-functional propositional logic is the most widely studied and discussed form, but there are other forms of propositional logic.

Perhaps the most well known form of non-truth-functional propositional logic is modal propositional logic. Modal propositional logic involves introducing operators into the logic involving necessity and possibility, usually along with truth-functional operators such as ‘→’, ‘\land‘, ‘\neg‘, etc. Typically, the sign ‘\Box‘ is used in place of the English operator, “it is necessary that…”, and the sign ‘\Diamond‘ is used in place of the English operator “it is possible that…”. Sometimes both these operators are taken as primitive, but quite often one is defined in terms of the other, since \ulcorner \neg \Box \neg \alpha \urcorner would appear to be logically equivalent with \ulcorner \Diamond \alpha \urcorner. (Roughly, it means the same to say that something is not necessarily not true as it does to say that it is possibly true.)

To see that modal propositional logic is not truth-functional, just consider the following pair of statements:

\Box P
\Box (P \lor \neg P)

The first states that it is necessary that P. Let us suppose in fact that ‘P‘ is true, but might have been false. Since P is not necessarily true, the statement “\Box P” is false. However, the statement “P \lor \neg P” is a tautology and so it could not be false. Hence, the statement “\Box (P \lor \neg P)” is true. Notice that both ‘P‘ and “P \lor \neg P” are true, but different truth-values result when the operator ‘\Box‘ is added. So, in modal propositional logic, the truth-value of a statement does not depend entirely on the truth-values of the parts.
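One common way to make “might have been false” precise is to evaluate statements at several possible worlds. The article does not commit to such a semantics here, so the Python sketch below rests on assumptions: two hypothetical worlds, with every world counted as possible relative to every other, and ‘\Box‘ read as truth in all of them.

```python
# Two hypothetical possible worlds; 'P' is true in the actual world w0
# but false in w1, so 'P' is true but might have been false.
worlds = {"w0": {"P": True}, "w1": {"P": False}}

def box(statement):
    """'It is necessary that ...': true when the statement holds in every world.
    (Assumes every world is accessible from every other.)"""
    return all(statement(v) for v in worlds.values())

p = lambda v: v["P"]
p_or_not_p = lambda v: v["P"] or not v["P"]

actual = worlds["w0"]
print(p(actual), p_or_not_p(actual))  # True True: both operands are true
print(box(p), box(p_or_not_p))        # False True: the boxed statements differ
```

The operands have the same truth-value in the actual world, yet prefixing ‘\Box‘ yields different results, so no truth table for ‘\Box‘ could be given.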

The study of modal propositional logic involves identifying under what conditions statements involving the operators ‘\Box‘ and ‘\Diamond‘ should be regarded as true. Different notions or conceptions of necessity lead to different answers to that question. It also involves discovering what inference rules or systems of deduction would be appropriate given the addition of these operators. Here, there is more controversy than with classical truth-functional logic. For example, in the context of discussions of axiomatic systems for modal propositional logic, very different systems result depending on whether instances of the following schemata are regarded as axiomatic truths, or even truths at all:

\Box \alpha \rightarrow \Box \Box \alpha
\Diamond \alpha \rightarrow \Box \Diamond \alpha

If a statement is necessary, is it necessarily necessary? If a statement is possible, is it necessarily possible? A positive answer to the first question is a key assumption in a logical system known as S4 modal logic. Positive answers to both these questions are key assumptions in a logical system known as S5 modal logic. Other systems of modal logic that avoid such assumptions have also been developed. (For an excellent introductory survey, see Hughes and Cresswell 1996.)

Deontic propositional logic and epistemic propositional logic are two other forms of non-truth-functional propositional logic. The former involves introduction of operators similar to the English operators “it is morally obligatory that…” and “it is morally permissible that…”. Obviously, some things that are in fact true are not morally obligatory, whereas some things that are true are morally obligatory. Again, the truth-value of a statement in deontic logic does not depend wholly on the truth-value of the parts. Epistemic logic involves the addition of operators similar to the English operators “it is known that…” and “it is believed that…”. While everything that is known to be the case is in fact the case, not everything that is the case is known to be the case, so a statement built up with an “it is known that…” operator will not depend entirely on the truth of the proposition it modifies, even if it depends on it to some degree.

Yet another widely studied form of non-truth-functional propositional logic is relevance propositional logic, which involves the addition of an operator ‘Rel‘ used to connect two statements \alpha and \beta to form a statement \ulcorner Rel(\alpha, \beta) \urcorner, which is interpreted to mean that \alpha is related to \beta in theme or subject matter. For example, if ‘P‘ means that Ben loves Jennifer and ‘Q‘ means that Jennifer is a pop star, then the statement “Rel(P, Q)” is regarded as true; whereas if ‘S‘ means that the sun is shining in Tokyo, then “Rel(P, S)” is false, and hence “\neg Rel(P, S)” is true. Obviously, whether or not a statement formed using the connective ‘Rel‘ is true does not depend solely on the truth-value of the propositions involved.

One of the motivations for introducing non-truth-functional propositional logics is to make up for certain oddities of truth-functional logic. Consider the truth table for the sign ‘→’ used in Language PL. A statement of the form \ulcorner \alpha \rightarrow \beta \urcorner is regarded as true whenever its antecedent is false or consequent is true. So if we were to translate the English sentence, “if the author of this article lives in France, then the moon is made of cheese” as “E \rightarrow M“, then strangely, it comes out as true given the semantics of the sign ‘→’ because the antecedent, ‘E‘, is false. In modal propositional logic it is possible to define a much stronger sort of operator to use to translate English conditionals as follows:

\ulcorner \alpha ⥽ \beta \urcorner is defined as \ulcorner \Box (\alpha \rightarrow \beta) \urcorner

If we transcribe the English “if the author of this article lives in France, then the moon is made of cheese” instead as “E ⥽ M”, then it does not come out as true, because presumably, it is possible for the author of this article to live in France without the moon being made of cheese. Similarly, in relevance logic, one could also define a stronger sort of connective as follows:

\ulcorner \alpha \Rightarrow \beta \urcorner is defined as \ulcorner Rel(\alpha, \beta) \land (\alpha \rightarrow \beta) \urcorner

Here too, if we were to transcribe the English “if the author of this article lives in France, then the moon is made of cheese” as “E \Rightarrow M” instead of simply “E \rightarrow M“, it comes out as false, because the author of this article living in France is not related to the composition of the moon.
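The contrast between the three readings of the conditional can be put side by side in a few lines of Python. Everything specific in this sketch is an assumption made for illustration: the two hypothetical worlds, the treatment of necessity as truth in all listed worlds, and the representation of ‘Rel‘ as a set of related pairs that happens not to contain the pair (E, M).

```python
def material(a, b):
    """The truth-functional conditional: false only when a is true and b is false."""
    return (not a) or b

# Hypothetical worlds. In the actual world the author does not live in France
# (E false); in another world the author does, and the moon is still not
# made of cheese (M false in both).
actual = {"E": False, "M": False}
worlds = [actual, {"E": True, "M": False}]

# Hypothetical relatedness facts for 'Rel': E and M share no subject matter.
related = set()

def strict(a, b):
    """Box(a -> b): the material conditional holds in every world."""
    return all(material(w[a], w[b]) for w in worlds)

def relevant(a, b, world):
    """Rel(a, b) and (a -> b), evaluated at the given world."""
    return (a, b) in related and material(world[a], world[b])

print(material(actual["E"], actual["M"]))  # True
print(strict("E", "M"))                    # False
print(relevant("E", "M", actual))          # False
```

Only the material conditional counts the sentence as true; the two stronger connectives reject it for different reasons.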

Besides non-truth-functional logic, other logical systems differ from classical truth-functional logic by allowing statements to be assigned truth-values other than truth or falsity, or to be assigned neither truth nor falsity or both truth and falsity. These sorts of logical systems may still be truth-functional in the sense that the truth-value of a complex statement may depend entirely on the truth-values of the parts, but the rules governing such truth-functionality would be more complicated than for classical logic, because they must consider possibilities that classical logic rejects.

Many-valued or multivalent logics are those that consider more than two truth-values. They may admit anything from three to an infinite number of possible truth-values. The simplest sort of many-valued logic is one that admits three truth-values, for example, truth, falsity and indeterminacy. It might seem, for example, that certain statements such as statements about the future, or paradoxical statements such as “this sentence is not true” cannot easily be assigned either truth or falsity, and so, it might be concluded, must have an indeterminate truth-value. The admission of this third truth-value requires one to expand the truth tables given in Section III(a). There, we gave a truth table for statements formed using the operator ‘→’; in three-valued logic, we have to decide what the truth-value of a statement of the form \ulcorner \alpha \rightarrow \beta \urcorner is when either or both of \alpha and \beta has an indeterminate truth-value. Arguably, if any component of a statement is indeterminate in truth-value, then the whole statement is indeterminate as well. This would lead to the following expanded truth table:

\alpha   \beta   (\alpha \rightarrow \beta)
T        T       T
T        I       I
T        F       F
I        T       I
I        I       I
I        F       I
F        T       T
F        I       I
F        F       T

However, we might wish to retain the feature of classical logic that a statement of the form \ulcorner \alpha \rightarrow \beta \urcorner is always true when its antecedent is false or its consequent is true, and hold that it is indeterminate only when its antecedent is indeterminate and its consequent false or when its antecedent is true and its consequent indeterminate, so that its truth table appears:

\alpha   \beta   (\alpha \rightarrow \beta)
T        T       T
T        I       I
T        F       F
I        T       T
I        I       T
I        F       I
F        T       T
F        I       T
F        F       T

Such details will have an effect on the remainder of the logical system. For example, suppose that an axiomatic or natural deduction system is created, and that a desirable feature is that something be provable from no premises if and only if it is a tautology in the sense of being true (and not just not false) for all possible truth-value assignments. If we make use of the first truth table for ‘→’, then “P \rightarrow P” should not be provable, because it is indeterminate when ‘P‘ is; whereas if we use the second truth table, then “P \rightarrow P” should be provable, since it is a tautology according to that truth table, that is, it is true regardless of which of the three truth-values is assigned to ‘P‘.
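The difference between the two three-valued tables for ‘→’ can be confirmed by evaluating “P \rightarrow P” under each. In the Python sketch below, the function names and the use of the strings "T", "I" and "F" for the three truth-values are assumptions made for illustration; the two functions are meant to reproduce the two tables above row for row.

```python
T, I, F = "T", "I", "F"
VALUES = [T, I, F]

def imp_first(a, b):
    """First table: any indeterminate component makes the whole indeterminate."""
    if I in (a, b):
        return I
    return F if (a == T and b == F) else T

def imp_second(a, b):
    """Second table: true whenever the antecedent is false or the consequent
    is true; indeterminate only for (I, F) and (T, I)."""
    if a == F or b == T:
        return T
    if (a, b) == (T, F):
        return F
    if (a, b) in [(I, F), (T, I)]:
        return I
    return T  # the remaining case is (I, I)

def self_implication_is_tautology(imp):
    """Is 'P -> P' true (not merely not false) for every value of 'P'?"""
    return all(imp(p, p) == T for p in VALUES)

print(self_implication_is_tautology(imp_first))   # False
print(self_implication_is_tautology(imp_second))  # True
```

The only row responsible for the difference is the one in which both antecedent and consequent are indeterminate.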

Here we get just a glimpse at the complications created by admitting more than two truth-values. If more than three are admitted, and possibly infinitely many, then the issues become even more complicated.

Intuitionistic propositional logic results from rejecting the assumption that every statement is true or false, and countenances statements that are neither. The result is a sort of logic very much akin to a three-valued logic, since “neither true nor false”, while strictly speaking the rejection of a truth-value, can be thought of as though it were a third truth-value. In intuitionistic logic, the so-called “law of excluded middle,” that is, the law that all statements of the form \ulcorner \alpha \lor \neg \alpha \urcorner are true is rejected. This is because intuitionistic logic takes truth to coincide with direct provability, and it may be that certain statements, such as Goldbach’s conjecture in mathematics, are neither provably the case nor provably not the case.

Paraconsistent propositional logic is even more radical, in countenancing statements that are both true and false. Again, depending on the nature of the system, semantic rules have to be given that determine what truth-value or truth-values a complex statement has when its component parts are both true and false. Such decisions determine what sorts of new or restricted rules of inference would apply to the logical system. For example, paraconsistent logics, if not trivial, must restrict the rules of inference allowable in classical truth-functional logic, because in systems such as those sketched in Sections V and VI above, from a contradiction, that is, a statement of the form \ulcorner \alpha \land \neg \alpha \urcorner, it is possible to deduce any other statement. Consider, for example, the following deduction in the natural deduction system sketched in Section V.

1. P \land \neg P Premise
2. P 1 Simp
3. \neg P 1 Simp
4. P \lor Q 2 Add
5. Q 3,4 DS

In order to avoid this result, paraconsistent logics must restrict the notion of a valid inference. In order for an inference to be considered valid, not only must it be truth-preserving, that is, it must be impossible to arrive at something untrue when starting with true premises; it must also be falsity-avoiding, that is, it must be impossible, starting with true premises, to arrive at something that is false. In paraconsistent logic, where a statement can be both true and false, these two requirements do not coincide. The inference rule of disjunctive syllogism, while truth-preserving, is not falsity-avoiding. In cases in which its premises are true, its conclusion can still be false; more specifically, provided that at least one of its premises is both true and false, its conclusion can be false.
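The failure of disjunctive syllogism to avoid falsity can be exhibited by a small search. The article does not fix a particular paraconsistent semantics, so the Python sketch below adopts one as an assumption: each statement receives a set of truth-values (true only, false only, or both), a negation is true when its operand is false and false when it is true, and a disjunction is true when either disjunct is true and false when both are false. The names used are hypothetical.

```python
from itertools import product

# Each value is the set of classical truth-values a statement has.
TRUE, FALSE, BOTH = frozenset("t"), frozenset("f"), frozenset("tf")
VALUES = [TRUE, FALSE, BOTH]

def neg(a):
    """A negation is true when its operand is false, and false when it is true."""
    out = set()
    if "f" in a:
        out.add("t")
    if "t" in a:
        out.add("f")
    return frozenset(out)

def disj(a, b):
    """A disjunction is true when either disjunct is true, false when both are false."""
    out = set()
    if "t" in a or "t" in b:
        out.add("t")
    if "f" in a and "f" in b:
        out.add("f")
    return frozenset(out)

def ds_counterexamples():
    """Find values of P and Q where both premises of disjunctive syllogism
    (not-P, and P or Q) are at least true while the conclusion Q is at least false."""
    found = []
    for p, q in product(VALUES, repeat=2):
        premises = [neg(p), disj(p, q)]
        if all("t" in prem for prem in premises) and "f" in q:
            found.append((p, q, premises))
    return found

for p, q, premises in ds_counterexamples():
    print(sorted(p), sorted(q), [sorted(x) for x in premises])
```

Under these assumed rules, every case the search finds has at least one premise that is both true and false, which matches the condition stated above.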

Other forms of non-classical propositional logic, and non-truth-functional propositional logic, continue to be discovered. Obviously any deviance from classical bivalent propositional logic raises complicated logical and philosophical issues that cannot be fully explored here. For more details both on non-classical logic, and on non-truth-functional logic, see the recommended reading section.

9. Suggestions for Further Reading

  • Anderson, A. R. and N. D. Belnap [and J. M. Dunn]. 1975 and 1992. Entailment. 2 vols. Princeton, NJ: Princeton University Press.
  • Bocheński, I. M. 1961. A History of Formal Logic. Notre Dame, Ind.: University of Notre Dame Press.
  • Boole, George. 1847. The Mathematical Analysis of Logic. Cambridge: Macmillan.
  • Boole, George. 1854. An Investigation into the Laws of Thought. Cambridge: Macmillan.
  • Carroll, Lewis. 1958. Symbolic Logic and the Game of Logic. London: Dover.
  • Church, Alonzo. 1956. Introduction to Mathematical Logic. Princeton, NJ: Princeton University Press.
  • Copi, Irving. 1953. Introduction to Logic. New York: Macmillan.
  • Copi, Irving. 1974. Symbolic Logic. 4th ed. New York: Macmillan.
  • da Costa, N. C. A. 1974. “On the Theory of Inconsistent Formal Systems,” Notre Dame Journal of Formal Logic 15: 497-510.
  • De Morgan, Augustus. 1847. Formal Logic. London: Walton and Maberly.
  • Fitch, F. B. 1952. Symbolic Logic: An Introduction. New York: Ronald Press.
  • Frege, Gottlob. 1879. Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Halle: L. Nebert. Published in English as Conceptual Notation, ed. and trans. by Terrell Bynum. Clarendon: Oxford, 1972.
  • Frege, Gottlob. 1923. “Gedankengefüge,” Beiträge zur Philosophie des deutschen Idealismus 3: 36-51. Published in English as “Compound Thoughts,” in The Frege Reader, edited by Michael Beaney. Oxford: Blackwell, 1997.
  • Gentzen, Gerhard. 1934. “Untersuchungen über das logische Schließen” Mathematische Zeitschrift 39: 176-210, 405-31. Published in English as “Investigations into Logical Deduction,” in Gentzen 1969.
  • Gentzen, Gerhard. 1969. Collected Papers. Edited by M. E. Szabo. Amsterdam: North-Holland Publishing.
  • Haack, Susan. 1996. Deviant Logic, Fuzzy Logic. Chicago: University of Chicago Press.
  • Herbrand, Jacques. 1930. “Recherches sur la théorie de la démonstration,” Travaux de la Société des Sciences et des Lettres de Varsovie 33: 133-160.
  • Hilbert, David and William Ackermann. 1950. Principles of Mathematical Logic. New York: Chelsea.
  • Hintikka, Jaakko. 1962. Knowledge and Belief: An Introduction to the Logic of the Two Notions. Ithaca: Cornell University Press.
  • Hughes, G. E. and M. J. Cresswell. 1996. A New Introduction to Modal Logic. London: Routledge.
  • Jevons, W. S. 1880. Studies in Deductive Logic. London: Macmillan.
  • Kalmár, L. 1935. “Über die Axiomatisierbarkeit des Aussagenkalküls,” Acta Scientiarum Mathematicarum 7: 222-43.
  • Kleene, Stephen C. 1952. Introduction to Metamathematics. Princeton, NJ: Van Nostrand.
  • Kneale, William and Martha Kneale. 1962. The Development of Logic. Clarendon: Oxford.
  • Lewis, C. I. and C. H. Langford. 1932. Symbolic Logic. New York: Dover.
  • Łukasiewicz, Jan. 1920. “O logice trójwartościowej,” Ruch Filozoficzny 5: 170-171. Published in English as “On Three-Valued Logic,” in Łukasiewicz 1970.
  • Łukasiewicz, Jan. 1970. Selected Works. Amsterdam: North-Holland.
  • Łukasiewicz, Jan and Alfred Tarski. 1930. “Untersuchungen über den Aussagenkalkül,” Comptes Rendus des séances de la Société des Sciences et des Lettres de Varsovie 32: 30-50. Published in English as “Investigations into the Sentential Calculus,” in Tarski 1956.
  • Mally, Ernst. 1926. Grundgesetze des Sollens: Elemente der Logik des Willens. Graz: Leuschner und Lubensky.
  • McCune, William, Robert Veroff, Branden Fitelson, Kenneth Harris, Andrew Feist and Larry Wos. 2002. “Short Single Axioms for Boolean Algebra,” Journal of Automated Reasoning 29: 1-16.
  • Mendelson, Elliot. 1997. Introduction to Mathematical Logic. 4th ed. London: Chapman and Hall.
  • Meredith, C. A. 1953. “Single Axioms for the Systems (C, N), (C, O) and (A, N) of the Two-valued Propositional Calculus,” Journal of Computing Systems 3: 155-62.
  • Müller, Eugen, ed. 1909. Abriss der Algebra der Logik, by E. Schröder. Leipzig: Teubner.
  • Nicod, Jean. 1917. “A Reduction in the Number of the Primitive Propositions of Logic,” Proceedings of the Cambridge Philosophical Society 19: 32-41.
  • Peirce, C. S. 1885. “On the Algebra of Logic,” American Journal of Mathematics 7: 180-202.
  • Post, Emil. 1921. “Introduction to a General Theory of Propositions,” American Journal of Mathematics 43: 163-185.
  • Priest, Graham, Richard Routley and Jean Norman, eds. 1990. Paraconsistent Logic. Munich: Verlag.
  • Prior, Arthur. 1990. Formal Logic. 2nd ed. Oxford: Oxford University Press.
  • Read, Stephen. 1988. Relevant Logic. New York: Blackwell.
  • Rescher, Nicholas. 1966. The Logic of Commands. London: Routledge and Kegan Paul.
  • Rescher, Nicholas. 1969. Many-Valued Logic. New York: McGraw Hill.
  • Rosser, J. B. 1953. Logic for Mathematicians. New York: McGraw Hill.
  • Russell, Bertrand. 1906. “The Theory of Implication,” American Journal of Mathematics 28: 159-202.
  • Schlesinger, G. N. 1985. The Range of Epistemic Logic. Aberdeen: Aberdeen University Press.
  • Sheffer, H. M. 1913. “A Set of Five Postulates for Boolean Algebras with Application to Logical Constants,” Transactions of the American Mathematical Society 14: 481-88.
  • Smullyan, Raymond. 1961. Theory of Formal Systems. Princeton: Princeton University Press.
  • Tarski, Alfred. 1956. Logic, Semantics and Meta-Mathematics. Oxford: Oxford University Press.
  • Urquhart, Alasdair. 1986. “Many-valued Logic,” in Handbook of Philosophical Logic, vol. 3, edited by D. Gabbay and F. Guenthner. Dordrecht: Reidel.
  • Venn, John. 1881. Symbolic Logic. London: Macmillan.
  • Whitehead, Alfred North and Bertrand Russell. 1910-1913. Principia Mathematica. 3 vols. Cambridge: Cambridge University Press.
  • Wittgenstein, Ludwig. 1922. Tractatus Logico-Philosophicus. London: Routledge and Kegan Paul.

Author Information

Kevin C. Klement
Email: klement@philos.umass.edu
University of Massachusetts, Amherst
U. S. A.

Epsilon Calculi

Epsilon Calculi are extended forms of the predicate calculus that incorporate epsilon terms. Epsilon terms are individual terms of the form ‘εxFx’, being defined for all predicates in the language. The epsilon term ‘εxFx’ denotes a chosen F, if there are any F’s, and has an arbitrary reference otherwise. Epsilon calculi were originally developed to study certain forms of arithmetic, and set theory; also to prove some important meta-theorems about the predicate calculus. Later formal developments have included a variety of intensional epsilon calculi, of use in the study of necessity, and more general intensional notions, like belief. An epsilon term such as ‘εxFx’ was originally read as ‘the first F’, and in arithmetical contexts as ‘the least F’. More generally it can be read as the demonstrative description ‘that F’, when arising either deictically, that is, in a pragmatic context where some F is being pointed at, or in linguistic cross-reference situations, as with, for example, ‘There is a red-haired man in the room. That red-haired man is Caucasian’. The application of epsilon terms to natural language shares some features with the use of iota terms within the theory of descriptions given by Bertrand Russell, but differs in formalising aspects of a slightly different theory of reference, first given by Keith Donnellan. More recently, epsilon terms have been used by a number of writers to formalise cross-sentential anaphora, which would arise if ‘that red-haired man’ in the linguistic case above was replaced with a pronoun such as ‘he’. There is then also the similar application in intensional cases, like ‘There is a red-haired man in the room. Celia believed he was a woman.’

Table of Contents

  1. Introduction
  2. Descriptions and Identity
  3. Rigid Epsilon Terms
  4. The Epsilon Calculus’ Problematic
  5. The Formal Semantics of Epsilon Terms
  6. Some Metatheory
  7. References and Further Reading

1. Introduction

Epsilon terms were introduced by the German mathematician David Hilbert, in Hilbert 1923, 1925, to provide explicit definitions of the existential and universal quantifiers, and resolve some problems in infinitistic mathematics. But it is not just the related formal results and structures which are of interest. In Hilbert’s major book Grundlagen der Mathematik, which he wrote with his collaborator Paul Bernays, epsilon terms were presented as formalising certain natural language constructions, like definite descriptions. And they in fact have a considerably larger range of such applications, for instance in the symbolisation of certain cross-sentential anaphora. Hilbert and Bernays also used their epsilon calculus to prove two important meta-theorems about the predicate calculus. One theorem subsequently led, for instance, to the development of semantic tableaux: it is called the First Epsilon Theorem, and its content and proof will be given later, in section 6 below. A second theorem that Hilbert and Bernays proved, which we shall also look at then, establishes that epsilon calculi are conservative extensions of the predicate calculus, that is, that no more theorems expressible just in the quantificational language of the predicate calculus can be proved in epsilon calculi than can be proved in the predicate calculus itself. But while epsilon calculi do have these further important formal functions, we will not only be concerned to explore them, for we shall also first discuss the natural language structures upon which epsilon calculi have a considerable bearing.

The growing awareness of the larger meaning and significance of epsilon calculi has only come in stages. Hilbert and Bernays introduced epsilon terms for several meta-mathematical purposes, as above, but the extended presentation of an epsilon calculus, as a formal logic of interest in its own right, in fact only first appeared in Bourbaki’s Éléments de Mathématique (although see also Ackermann 1937-8). Bourbaki’s epsilon calculus with identity (Bourbaki, 1954, Book 1) is axiomatic, with Modus Ponens as the only primitive inference or derivation rule. Thus, in effect, we get:

(X ∨ X) → X,
X → (X ∨ Y),
(X ∨ Y) → (Y ∨ X),
(X ∨ Y) → ((Z ∨ X) → (Z ∨ Y)),
Fy → FεxFx,
x = y → (Fx ↔ Fy),
(x)(Fx ↔ Gx) → εxFx = εxGx.

This adds to a basis for the propositional calculus an epsilon axiom schema, then Leibniz’ Law, and a second epsilon axiom schema, which is a further law of identity. Bourbaki, though, used the Greek letter tau rather than epsilon to form what are now called ‘epsilon terms’; nevertheless, he defined the quantifiers in terms of his tau symbol in the manner of Hilbert and Bernays, namely:

(∃x)Fx ↔ FεxFx,
(x)Fx ↔ Fεx¬Fx;

and note that, in his system the other usual law of identity, ‘x = x’, is derivable.
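The two quantifier definitions can be tried out on a small finite domain. The following is a minimal sketch, not Bourbaki’s or Hilbert and Bernays’ own construction: the function `epsilon` is a hypothetical choice operator that returns the first element satisfying a predicate in a fixed ordering, and the first element of the domain when nothing satisfies it. The domain and all function names are assumptions made for illustration.

```python
from itertools import product

DOMAIN = ("a", "b", "c")  # assumed toy domain

def epsilon(pred, domain=DOMAIN):
    """Return a chosen element satisfying pred, or an arbitrary element if none does."""
    for x in domain:
        if pred(x):
            return x
    return domain[0]  # arbitrary referent: here simply the first element

def exists_via_epsilon(pred):
    # (Ex)Fx is rendered as F(eps x Fx).
    return pred(epsilon(pred))

def forall_via_epsilon(pred):
    # (x)Fx is rendered as F(eps x not-Fx).
    return pred(epsilon(lambda x: not pred(x)))

def quantifier_definitions_hold():
    """Compare both definitions with the ordinary quantifiers for every predicate on DOMAIN."""
    for bits in product([False, True], repeat=len(DOMAIN)):
        extension = dict(zip(DOMAIN, bits))
        pred = lambda x, extension=extension: extension[x]
        if exists_via_epsilon(pred) != any(pred(x) for x in DOMAIN):
            return False
        if forall_via_epsilon(pred) != all(pred(x) for x in DOMAIN):
            return False
    return True
```

The universal case works because ‘εx¬Fx’ picks a counterexample to F whenever there is one, so F holds of that term only when there are no counterexamples at all.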

The principal purpose Bourbaki found for his system of logic was in his theory of sets, although through that, in the modern manner, it thereby came to be the foundation for the rest of mathematics. Bourbaki’s theory of sets discriminates amongst predicates those which determine sets: thus some, but only some, predicates determine sets, i.e. are ‘collectivisantes’. All the main axioms of classical Set Theory are incorporated in his theory, but he does not have an Axiom of Choice as a separate axiom, since its functions are taken over by his tau symbol. The same point holds in Bernays’ epsilon version of his set theory (Bernays 1958, Ch VIII).

Epsilon calculi, during this period, were developed without any semantics, but a semantic interpretation was produced by Günter Asser in 1957, and subsequently published in a book by A.C. Leisenring, in 1969. Even then, readings of epsilon terms in ordinary language were still uncommon. A natural language reading of epsilon terms, however, was present in Hilbert and Bernays’ work. In fact the last chapter of book 1 of the Grundlagen is a presentation of a theory of definite descriptions, and epsilon terms relate closely to this. In the better-known theory of definite descriptions by Bertrand Russell (Russell 1905) there are three clauses: with

The king of France is bald

we get, on Russell’s theory, first

there is a king of France,

second

there is only one king of France,

and third

anyone who is king of France is bald.

Russell uses the Greek letter iota to formalise the definite description, writing the whole

BιxKx,

but he recognises the iota term is not a proper individual symbol. He calls it an ‘incomplete symbol’, since, because of the three parts, the whole proposition is taken to have the quantificational analysis,

(∃x)(Kx & (y)(Ky → y = x) & (y)(Ky → By)),

which is equivalent to

(∃x)(Kx & (y)(Ky→ y = x) & Bx).

And that means that it does not have the form ‘Bx’. Russell believed that, in addition to his iota terms, there was another class of individual terms, which he called ‘logically proper names’. These would simply fit into the ‘x’ place in ‘Bx’. He believed that ‘this’ and ‘that’ were in this class, but gave no symbolic characterisation of them.
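The equivalence of the two quantificational analyses can be confirmed by brute force over small finite models. The sketch below is only a check on domains of up to three elements, not a proof; the function names and the representation of ‘K’ and ‘B’ as dictionaries from individuals to truth values are assumptions made for illustration.

```python
from itertools import product

def extensions(domain):
    """Yield every possible extension of a one-place predicate over domain."""
    for bits in product([False, True], repeat=len(domain)):
        yield dict(zip(domain, bits))

def three_clause(domain, K, B):
    # (Ex)(Kx & (y)(Ky -> y = x) & (y)(Ky -> By))
    return any(
        K[x]
        and all((not K[y]) or y == x for y in domain)
        and all((not K[y]) or B[y] for y in domain)
        for x in domain
    )

def two_clause(domain, K, B):
    # (Ex)(Kx & (y)(Ky -> y = x) & Bx)
    return any(
        K[x] and all((not K[y]) or y == x for y in domain) and B[x]
        for x in domain
    )

def forms_agree(max_size=3):
    """Check that the two forms have the same truth value in every small model."""
    for n in range(1, max_size + 1):
        domain = tuple(range(n))
        for K in extensions(domain):
            for B in extensions(domain):
                if three_clause(domain, K, B) != two_clause(domain, K, B):
                    return False
    return True
```

The agreement depends on the uniqueness clause: once x is the only K, saying that every K is B comes to the same thing as saying that x is B.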

Hilbert and Bernays, by contrast, produced what is called a ‘pre-suppositional theory’ of definite descriptions. The first two clauses of Russell’s definition were not taken to be part of the meaning of ‘The King of France is bald’: they were merely conditions under which they took it to be permitted to introduce a complete individual term for ‘the King of France’, which then satisfies

Kx & (y)(Ky → y = x).

Hilbert and Bernays continued to use the Greek letter iota in their individual term, although it has a quite different grammar from Russell’s iota term, since, when Hilbert and Bernays’ term can be introduced, it is provably equivalent to the corresponding epsilon term (Kneebone 1963, p102). In fact it was later suggested by many that epsilon terms are not only complete symbols, but can be seen as playing the same role as the ‘logically proper names’ Russell discussed.

It is at the start of book 2 of the Grundlagen that we find the definition of epsilon terms. There, Hilbert and Bernays first construct a theory of indefinite descriptions in a similar manner to their theory of definite descriptions. They allow, now, an eta term to be introduced as long as just the first of Russell’s conditions is met. That is to say, given

(∃x)Fx,

one can introduce the term ‘ηxFx’, and say

FηxFx.

But the condition for the introduction of the eta term can be established logically, for certain predicates, since

(∃x)((∃y)Fy → Fx),

is a predicate calculus theorem (Copi 1973, p110). It is the eta term this theorem allows us to introduce which is otherwise called an epsilon term, and its logical basis enables entirely formal theories to be constructed, since such individual terms are invariably defined. Thus we may invariably introduce ‘ηx((∃y)Fy → Fx)’, and this is commonly written ‘εxFx’, about which we can therefore say

(∃y)Fy → FεxFx.

Since it is that F which exists if anything is F, Hilbert read the epsilon term in this case ‘the first F’. For instance, in arithmetic, ‘the first’ may be taken to be the least number operator. However, while if there are F’s then the first F is clearly some chosen one of them, if there are no F’s then ‘the first F’ must be a misnomer. And that form of speech only came to be fully understood in the theories of reference which appeared much later, when reference and denotation came to be more clearly separated from description and attribution. Donnellan (Donnellan 1966) used the example ‘the man with martini in his glass’, and pointed out that, in certain uses, this can refer to someone without martini in his glass. In the terminology Donnellan made popular, ‘the first F’, in the second case above works similarly: it cannot be attributive, and so, while it refers to something, it must refer arbitrarily, from a semantic point of view.

With reference in this way separated from attribution it becomes possible to symbolise the anaphoric cross-reference between, for instance, ‘There is one and only one king of France’ and ‘He is bald’. For, independently of whether the former is true, the ‘he’ in the latter is a pronoun for the epsilon term in the former — by a simple extension of the epsilon definition of the existential quantifier. Thus the pair of remarks may be symbolised

(∃x)(Kx & (y)(Ky → y = x)) & Bεx(Kx & (y)(Ky → y = x)).

Furthermore such cross-reference may occur in connection with intensional constructions of a kind Russell also considered, such as

George IV wondered whether the author of Waverley was Scott.

Thus we can say ‘There is an author of Waverley, and George IV wondered whether he was Scott’. But the epsilon analysis of these cases puts intensional epsilon calculi at odds with Russellian views of such constructions, as we shall see later. The Russellian approach, by not having complete symbols for individuals, tends to confuse cases in which assertions are made about individuals and cases in which assertions are made about identifying properties. As we shall see, epsilon terms enable us to make the discrimination between, for instance,

s = εx(y)(Ay ↔ y = x),

(i.e. ‘Scott is the author of Waverley’), and

(y)(Ay ↔ y = s),

(that is, ‘there is one and only one author of Waverley and he is Scott’), and so it enables us to locate more exactly the object of George IV’s thought.

2. Descriptions and Identity

When one starts to ask about the natural language meaning of epsilon terms, it is interesting that Leisenring just mentions the ‘formal superiority’ of the epsilon calculus (Leisenring 1969, p63, see also Routley 1969, Hazen 1987). Leisenring took the epsilon calculus to be a better logic than the predicate calculus, but merely because of the Second Epsilon Theorem. Its main virtue, to Leisenring, was that it could prove all that seemingly needed to be proved, but in a more elegant way. Epsilon terms were just neater at calculating which were the valid theorems of the predicate calculus.

Remembering Hilbert and Bernays’ discussion of definite and indefinite descriptions, clearly there is more to the epsilon calculus than this. And there are, in fact, two specific theorems provable within the epsilon calculus, though not the predicate calculus, which will start to indicate the epsilon calculus’ more general range of application. They concern individuals, since the epsilon calculus is distinctive in providing an appropriate, and systematic means of reference to them.

The need to have complete symbols for individuals became evident some years after Russell’s promotion of incomplete symbols for them. The first major book to allow for this was Rosser’s Logic for Mathematicians, in 1953, although there were precursors. For the classical difficulty with providing complete terms for individuals concerns what to do with ‘non-denoting’ terms, and Quine, for instance, following Frege, often gave them an arbitrary, though specific referent (Marciszewski 1981, p113). This idea is also present in Kalish and Montague (Kalish and Montague 1964, pp242-243), who gave the two rules:

(∃x)(y)(Fy ↔ y = x) ├ FιxFx,
¬(∃x)(y)(Fy ↔ y = x) ├ ιxFx = ιx¬(x = x),

where ‘ιxFx’ is what otherwise might be written ‘εx(y)(Fy ↔ y = x)’. Kalish and Montague believed, however, that the second rule ‘has no intuitive counterpart, simply because ordinary language shuns improper definite descriptions’ (Kalish and Montague 1964, p244). And, at that time, what Donnellan was to publish in Donnellan 1966, about improper definite descriptions, was certainly not well known. In fact ordinary speech does not shun improper definite descriptions, although their referents are not as fixed as the above second rule requires. Indeed the very fact that the descriptions are improper means that their referents are not determined semantically: instead they are just a practical, pragmatic choice.
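The effect of the two Kalish and Montague rules can be pictured with a small sketch over a finite domain. It is an illustration only: the function `iota`, the toy domain, and the choice of 0 as the shared referent of all improper descriptions are assumptions, not anything taken from Kalish and Montague’s text.

```python
DOMAIN = (0, 1, 2, 3)  # assumed toy domain
DEFAULT = 0            # hypothetical referent shared by every improper description

def iota(pred, domain=DOMAIN, default=DEFAULT):
    """Return the unique element satisfying pred, or the fixed default if there is not exactly one."""
    witnesses = [x for x in domain if pred(x)]
    return witnesses[0] if len(witnesses) == 1 else default

# First rule: a proper description denotes the unique thing that satisfies it.
proper = iota(lambda x: x * x == 9)

# Second rule: every improper description denotes what 'the x such that not x = x' denotes.
non_self_identical = iota(lambda x: x != x)
improper_many = iota(lambda x: x % 2 == 1)   # both 1 and 3 are odd
improper_none = iota(lambda x: x > 10)       # nothing in the domain qualifies
```

The fixed default is what makes the referents of improper descriptions more rigid here than they are in ordinary speech, which is the point taken up in the text.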

Stalnaker and Thomason recognised the need to be more liberal when they defined their referential terms, which also had to refer, in the contexts they were concerned with, in more than one possible world (Thomason and Stalnaker 1968, p363):

In contrast with the Russellian analysis, definite descriptions are treated as genuine singular terms; but in general they will not be substance terms [rigid designators]. An expression like ιxPx is assigned a referent which may vary from world to world. If in a given world there is a unique existing individual which has the property corresponding to P, this individual is the referent of ιxPx; otherwise, ιxPx refers to an arbitrarily chosen individual which does not exist in that world.

Stalnaker and Thomason appreciated that ‘A substance term is much like what Russell called a logically proper name’, but they said that an individual constant might or might not be a substance term, depending on whether it was more like ‘Socrates’ or ‘Miss America’ (Thomason and Stalnaker 1968, p362). A more complete investigation of identity and descriptions, in modal and general intensional contexts, was provided in Routley, Meyer and Goddard 1974, and Routley 1977, see also Hughes and Cresswell 1968, Ch 11. And with these writers we get the explicit rendering of definite descriptions in epsilon terms, as in Goddard and Routley 1973, p558, Routley 1980, p277, c.f. Hughes and Cresswell 1968, p203.

Certain specific theorems in the epsilon calculus, as was said before, support these kinds of identification. One theorem demonstrates directly the relation between Russell’s attributive, and some of Donnellan’s referential ideas. For

(∃x)(Fx & (y)(Fy → y = x) & Gx)

is logically equivalent to

(∃x)(Fx & (y)(Fy → y = x)) & Ga,

where a = εx(Fx & (y)(Fy → y = x)). This arises because the latter is equivalent to

Fa & (y)(Fy → y = a) & Ga,

which entails the former. But the former is

Fb & (y)(Fy → y = b) & Gb,

with b = εx(Fx & (y)(Fy → y = x) & Gx), and so entails

(∃x)(Fx & (y)(Fy → y = x)),

and

Fa & (y)(Fy → y = a).

But that means that, from the uniqueness clause,

a = b,

and so

Ga,

meaning the former entails the latter, and therefore the former is equivalent to the latter.
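The equivalence just derived, and its dependence on the uniqueness clause, can be checked on a small finite domain. This is a sketch under stated assumptions rather than a proof: `epsilon` is a hypothetical choice operator that picks the first satisfying element in a given ordering (or the first element of the ordering if nothing satisfies the predicate), and every ordering of the domain is tried so that the result does not hang on one particular choice.

```python
from itertools import permutations, product

DOMAIN = (0, 1, 2)  # assumed toy domain

def epsilon(pred, order):
    """Pick the first element of order satisfying pred, else an arbitrary one (order[0])."""
    for x in order:
        if pred(x):
            return x
    return order[0]

def extensions():
    """Every extension of a one-place predicate over DOMAIN."""
    return [dict(zip(DOMAIN, bits)) for bits in product([False, True], repeat=len(DOMAIN))]

def mismatches(with_uniqueness):
    """List models where the quantified form and the epsilon-term form disagree."""
    bad = []
    for F, G, order in product(extensions(), extensions(), list(permutations(DOMAIN))):
        if with_uniqueness:
            # Fx & (y)(Fy -> y = x)
            cond = lambda x: F[x] and all((not F[y]) or y == x for y in DOMAIN)
        else:
            # Fx alone, with the uniqueness clause dropped
            cond = lambda x: F[x]
        quantified = any(cond(x) and G[x] for x in DOMAIN)      # (Ex)(... & Gx)
        a = epsilon(cond, order)
        with_term = any(cond(x) for x in DOMAIN) and G[a]       # (Ex)(...) & Ga
        if quantified != with_term:
            bad.append((F, G, order))
    return bad

with_uniqueness_failures = mismatches(True)
without_uniqueness_failures = mismatches(False)
```

With the uniqueness clause no disagreement arises; without it, the chosen F need not be one of the F’s that are G, so the two forms come apart.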

The former, of course, gives Russell’s Theory of Descriptions, in the case of ‘The F is G’; it explicitly asserts the first two clauses, to do with the existence and uniqueness of an F. A presuppositional theory, such as we saw in Hilbert and Bernays, would not explicitly assert these two clauses: on such an account they are a precondition before the term ‘the F’ can be introduced. But neither of these theories accommodates improper definite descriptions. Since Donnellan it is more common to allow that we can always use ‘the F’: if the description is improper then the referent of this term is simply found in the term’s practical use.

One detail of Donnellan’s historical account, however, must be treated with some care at this point. Donnellan was himself concerned with definite descriptions which were improper in the sense that they did not uniquely describe what the speaker took to be their referent. So the description might still be ‘proper’ in the above sense, if there still was something to which it uniquely applied, on account of its semantic content. Thus Donnellan allowed ‘the man with martini in his glass’ to identify someone without martini in his glass irrespective of whether there was some sole man with martini in his glass. But if one talks about ‘the man with martini in his glass’ one can be correctly taken to be talking about who this describes, if it does in fact correctly describe someone, as Devitt and Bertolet pointed out in criticism of Donnellan (Devitt 1974, Bertolet 1980). It is this aspect of our language which the epsilon account matches, for an epsilon account allows definite descriptions to refer without attribution of their semantic character, but only if nothing uniquely has that semantic character. Thus it is not the whole of the first statement above, but only the third part of the second statement which makes the remark ‘The F is G’.

The difficulty with Russell’s account becomes more plain if we read the two equivalent statements using relative and personal pronouns. They then become

There is one and only one F, which is G,
There is one and only one F; it is G.

But using just the logic derived from Frege, Russell could formalise the ‘which’, but could not separate out the last clause, ‘it is G’. In that clause ‘it’ is an anaphor for ‘the (one and only) F’, and it still has this linguistic meaning if there is no such thing, since that is just a matter of grammar. But the uniqueness clause is needed for the two statements to be equivalent (without uniqueness there is no equivalence, as we shall see), so ‘which’ is not itself equivalent to ‘it’. Russell, however, because he could not separate out the ‘it’, had to take the whole of the first expression as the analysis of ‘The F is G’: he could not formulate the needed ‘logically proper name’.

But how can something be the one and only F ‘if there is no such thing’? That is where another important theorem provable in the epsilon calculus is illuminating, namely:

(Fa & (y)(Fy → y = a)) → a = εx(Fx & (y)(Fy → y = x)).

The important thing is that there is a difference between the left hand side and the right hand side, i.e. between something being alone F, and that thing being the one and only F. For the left-right implication cannot be reversed. We get from the left to the right when we see that the left as a whole entails

(∃x)(Fx & (y)(Fy → y = x)),

and so also its epsilon equivalent

Fεx(Fx & (y)(Fy → y = x)) & (z)(Fz → z = εx(Fx & (y)(Fy → y = x))).

Given Fa, then from the second clause here we get the right hand side of our original implication. But if we substitute ‘εx(Fx & (y)(Fy → y = x))’ for ‘a’ in that implication then on the right we have something which is necessarily true. But the left hand side is then the same as

(∃x)(Fx & (y)(Fy → y = x)),

and that is in general contingent. Hence the implication cannot generally be reversed. Having the property of being alone F is here contingent, but possessing the identity of the one and only F is necessary.
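The one-way character of this implication can be seen in a small finite survey. The sketch below assumes a hypothetical choice operator `epsilon` that picks the first satisfying element in a given ordering, falling back on the first element of the ordering when nothing satisfies the predicate; the domain and function names are likewise assumptions made for illustration.

```python
from itertools import permutations, product

DOMAIN = (0, 1, 2)  # assumed toy domain

def epsilon(pred, order):
    """Pick the first element of order satisfying pred, else an arbitrary one (order[0])."""
    for x in order:
        if pred(x):
            return x
    return order[0]

def alone_F(F, x):
    # Fx & (y)(Fy -> y = x)
    return F[x] and all((not F[y]) or y == x for y in DOMAIN)

def survey():
    """Collect failures of the implication and of its converse over all small models."""
    forward_failures, converse_failures = [], []
    for bits, order in product(product([False, True], repeat=len(DOMAIN)), permutations(DOMAIN)):
        F = dict(zip(DOMAIN, bits))
        the_F = epsilon(lambda x: alone_F(F, x), order)
        for a in DOMAIN:
            left = alone_F(F, a)      # a is alone F
            right = (a == the_F)      # a is identical with 'the one and only F'
            if left and not right:
                forward_failures.append((F, a))
            if right and not left:
                converse_failures.append((F, a))
    return forward_failures, converse_failures

forward_failures, converse_failures = survey()
```

Every failure of the converse occurs in a model where nothing is alone F, so that the epsilon term has only an arbitrary referent; there the identity holds of that referent although the property does not.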

The distinction is not made in Russell’s logic, since possession of the relevant property is the only thing which can be formally expressed there. In Russell’s theory of descriptions, a’s possession of the property of being alone a king of France is expressed as a quasi identity

a = ιxKx,

and that has the consequence that such identities are contingent. Indeed, in counterpart theories of objects in other possible worlds the idea is pervasive that an entity may be defined in terms of its contingent properties in a given world. Hughes and Cresswell, however, differentiated between contingent identities and necessary identities in the following way (Hughes and Cresswell 1968, p191):

Now it is contingent that the man who is in fact the man who lives next door is the man who lives next door, for he might have lived somewhere else; that is living next door is a property which belongs contingently, not necessarily, to the man to whom it does belong. And similarly, it is contingent that the man who is in fact the mayor is the mayor; for someone else might have been elected instead. But if we understand [The man who lives next door is the mayor] to mean that the object which (as a matter of contingent fact) possesses the property of being the man who lives next door is identical with the object which (as a matter of contingent fact) possesses the property of being the mayor, then we are understanding it to assert that a certain object (variously described) is identical with itself, and this we need have no qualms about regarding as a necessary truth. This would give us a way of construing identity statements which makes [(x = y) → L(x = y)] perfectly acceptable: for whenever x = y is true we can take it as expressing the necessary truth that a certain object is identical with itself.

There are more consequences of this matter, however, than Hughes and Cresswell drew out. For now that we have proper referring terms for individuals to go into such expressions as ‘x = y’, we first see better where the contingency of the properties of such individuals comes from — simply the linguistic facility of using improper definite descriptions. But we also see, because identities between such terms are necessary, that proper referring terms must be rigid, i.e. have the same reference in all possible worlds.

This is not how Stalnaker and Thomason saw the matter. Stalnaker and Thomason, it will be remembered, said that there were two kinds of individual constants: ones like ‘Socrates’ which can take the place of individual variables, and others like ‘Miss America’ which cannot. The latter, as a result, they took to be non-rigid. But it is strictly ‘Miss America in year t’ which is meant in the second case, and that is not a constant expression, even though such functions can take the place of individual variables. It was Routley, Meyer and Goddard who most seriously considered the resultant possibility that all properly individual terms are rigid. At least, they worked out many of the implications of this position, even though Routley was not entirely content with it.

Routley described several rigid intensional semantics (Routley 1977, pp185-186). One of these, for instance, just took the first epsilon axiom to hold in any interpretation, and made the value of an epsilon term itself. On such a basis Routley, Meyer and Goddard derived what may be called ‘Routley’s Formula’, i.e.

L(∃x)Fx → (∃x)LFx.

In fact, on their understanding, this formula holds for any operator and any predicate, but they had in mind principally the case of necessity illustrated here, with ‘Fx’ taken as ‘x numbers the planets’, making ‘εxFx’ ‘the number of the planets’. The formula is derived quite simply, in the following way: from

L(∃x)Fx,

we can get

LFεxFx,

by the epsilon definition of the existential quantifier, and so

(∃x)LFx,

by existential generalisation over the rigid term (Routley, Meyer and Goddard 1974, p308, see also Hughes and Cresswell 1968, pp197, 204). Routley, however, was still inclined to think that a rigid semantics was philosophically objectionable (Routley 1977, p186):

Rigid semantics tend to clutter up the semantics for enriched systems with ad hoc modelling conditions. More important, rigid semantics, whether substitutional or objectual, are philosophically objectionable. For one thing, they make Vulcan and Hephaestus everywhere indistinguishable though there are intensional claims that hold of one but not of the other. The standard escape from this sort of problem, that of taking proper names like ‘Vulcan’ as disguised descriptions we have already found wanting… Flexible semantics, which satisfactorily avoid these objections, impose a more objectual interpretation, since, even if [the domain] is construed as the domain of terms, [the value of a term in a world] has to be permitted, in some cases at least, to vary from world to world.

As a result, while Routley, Meyer and Goddard were still prepared to defend the formula, and say, for instance, that there was a number which necessarily numbers the planets, namely the number of the planets (np), they thought that this was only in fact the same as 9, so that one still could not argue correctly that as L(np numbers the planets), so L(9 numbers the planets). ‘For extensional identity does not warrant intersubstitutivity in intensional frames’ (Routley, Meyer and Goddard 1974, p309). They held, in other words, that the number of the planets was only contingently 9.

This means that they denied ‘(x = y) → L(x = y)’, but, as we shall see in more detail later, there are ways to hold onto this principle, i.e. maintain the invariable necessity of identity.

3. Rigid Epsilon Terms

There is some further work which has helped us to understand how reference in modal and general intensional contexts must be rigid. But it involves some different ideas in semantics, and starts, even, outside our main area of interest, namely predicate logic, in the semantics of propositional logic.

When one thinks of ‘semantics’ one maybe thinks of the valuation of formulas. Since the 1920s a meta-study of this kind was certainly added to the previous logical interest in proof theory. Traditional proof theory is commonly associated with axiomatic procedures, but, from a modern perspective, its distinction is that it is to do with ‘object languages’. Tarski’s theory of truth relies crucially on the distinction between object languages and meta-languages, and so semantics generally seems to be necessarily a meta-discipline. In fact Tarski believed that such an elevation of our interest was forced upon us by the threat of semantic paradoxes like The Liar. If there was, by contrast, ‘semantic closure’, i.e. if truth and other semantic notions were definable at the object level, then there would be contradictions galore (c.f. Priest 1984). In this way truth may seem to be necessarily a predicate of (object-level) sentences.

But there is another way of looking at the matter which is explicitly non-Tarskian, and which others have followed (see Prior 1971, Ch 7, Sayward 1987). This involves seeing ‘it is true that’ as not a predicate, but an object-level operator, with the truth tabulations in Truth Tables, for instance, being just another form of proof procedure. Operators indeed include ‘it is provable that’, and this is distinct from Gödel’s provability predicate, as Gödel himself pointed out (Gödel 1969). Operators are intensional expressions, as in the often discussed ‘it is necessary that’ and ‘it is believed that’, and trying to see such forms of indirect discourse as metalinguistic predicates was very common in the middle of the last century. It was pervasive, for instance, in Quine’s many discussions of modality and intensionality. Wouldn’t someone be believing that the Morning Star is in the sky, but the Evening Star is not, if, respectively, they assented to the sentence ‘the Morning Star is in the sky’, and dissented from ‘the Evening Star is in the sky’? Anyone saying ‘yes’ is still following the Quinean tradition, but after Montague’s and Thomason’s work on operators (e.g. Montague 1963, Thomason 1977, 1980) many logicians are more persuaded that indirect discourse is not quotational. It is open to doubt, that is to say, whether we should see the mind in terms of the direct words which the subject would use.

The alternative involves seeing the words ‘the Morning Star is in the sky’ in such an indirect speech locution as ‘Quine believes that the Morning Star is in the sky’ as words merely used by the reporter, which need not directly reflect what the subject actually says. That is indeed central to reported speech — putting something into the reporter’s own words rather than just parroting them from another source. Thus a reporter may say

Celia believed that the man in the room was a woman,

but clearly that does not mean that Celia would use ‘the man in the room’ for who she was thinking about. So referential terms in the subordinate proposition are only certainly in the mouth of the reporter, and as a result only certainly refer to what the reporter means by them. It is a short step from this thought to seeing

There was a man in the room, but Celia believed that he was a woman,

as involving a transparent intensional locution, with the same object, as one might say, ‘inside’ the belief as ‘outside’ in the room. So it is here where rigid constant epsilon terms are needed, to symbolise the cross-sentential anaphor ‘he’, as in:

(∃x)(Mx & Rx) & BcWεx(Mx & Rx).

To understand the matter fully, however, we must make the shift from meta- to object language we saw at the propositional level above with truth. Routley, Meyer and Goddard realised that a rigid semantics required treating such expressions as ‘BcWx’ as simple predicates, and we must now see what this implies. They derived, as we saw before, ‘Routley’s Formula’

L(∃x)Fx → (∃x)LFx,

but we can now start to spell out how this is to be understood, if we hold to the necessity of identities, i.e. if we use ‘=’ so that

x = y → L(x = y).

Again a clear illustration of the validity of Routley’s Formula is provided by the number of the planets, but now we may respect the fact that some things may lack a number, and also the fact that referential, and attributive senses of terms may be distinguished. Thus if we write ‘(nx)Px’ for ‘there are n P’s’, then εn(ny)Py will be the number of P’s, and it is what numbers them (i.e. ([εn(ny)Py]x)Px) if they have a number (i.e. if (∃n)(nx)Px) — by the epsilon definition of the existential quantifier. Then, with ‘Fx’ as the proper (necessary) identity ‘x = εn(ny)Py’ Routley’s Formula holds because the number in question exists eternally, making both sides of the formula true. But if ‘Fn’ is simply the attributive ‘(ny)Py’ then this is not necessary, since it is contingent even, in the first place, that there is a number of P’s, instead of just some P, making both sides of the formula false.

Hughes and Cresswell argue against the principle saying (Hughes and Cresswell 1968, p144):

…let [Fx] be ‘x is the number of the planets’. Then the antecedent is true, for there must be some number which is the number of the planets (even if there were no planets at all there would still be such a number, namely 0): but the consequent is false, for since it is a contingent matter how many planets there are, there is no number which must be the number of the planets.

But this forgets continuous quantities, where there are no discrete items before the nomination of a unit. The number associated with some planetary material, for instance, numbers only arbitrary units of that material, and not the material itself. So the antecedent of Routley’s Formula is not necessarily true.

Quine also used the number of the planets in his central argument against quantification into modal contexts. He said (Quine 1960, pp195-197):

If for the sake of argument we accept the term ‘analytic’ as predicable of sentences (hence as attachable predicatively to quotations or other singular terms designating sentences), then ‘necessarily’ amounts to ‘is analytic’ plus an antecedent pair of quotation marks. For example, the sentence:

(1) Necessarily 9 > 4

is explained thus:

(2) ‘9 > 4’ is analytic…

So suppose (1) explained as in (2). Why, one may ask, should we preserve the operatorial form as of (1), and therewith modal logic, instead of just leaving matters as in (2)? An apparent advantage is the possibility of quantifying into modal positions; for we know we cannot quantify into quotation, and (2) uses quotation…

But is it more legitimate to quantify into modal positions than into quotation? For consider (1) even without regard to (2); surely, on any plausible interpretation, (1) is true and this is false:

(3) Necessarily the number of major planets > 4.

Since 9 = the number of major planets, we can conclude that the position of ‘9’ in (1) is not purely referential and hence that the necessity operator is opaque.

But here Quine does not separate out the referential ‘the number of the major planets is greater than 4’, i.e. ‘εn(ny)Py > 4’, from the attributive ‘There are more than 4 major planets’, i.e. ‘(∃n)((ny)Py & n > 4)’. If 9 = εn(ny)Py, then it follows that εn(ny)Py > 4, but it does not follow that (∃n)((ny)Py & n > 4). Substitution of identicals in (1), therefore, does yield (3), even though it is not necessary that there are more than 4 major planets.

We can now go into some details of how one gets the ‘x’ in such a form as ‘LFx’ to be open for quantification. For, what one finds in traditional modal semantics (see Hughes and Cresswell 1968, passim) are formulas in the meta-linguistic style, like

V(Fx, i) = 1,

which say that the valuation put on ‘Fx’ is 1, in world i. There should be quotation marks around the ‘Fx’ in such a formula, to make it meta-linguistic, but by convention they are generally omitted. To effect the change to the non-meta-linguistic point of view, we must simply read this formula as it literally is, so that the ‘Fx’ is in indirect speech rather than direct speech, and the whole becomes the operator form ‘it would be true in world i that Fx’. In this way, the term ‘x’ gets into the language of the reporter, and the meta/object distinction is not relevant. Any variable inside the subordinate proposition can now be quantified over, just like a variable outside it, which means there is ‘quantifying in’, and indeed all the normal predicate logic operations apply, since all individual terms are rigid.

An example illustrating this rigidity involves the actual top card in a pack, and the cards which might have been top card in other circumstances (see Slater 1988a). If the actual top card is the Ace of Spades, and it is supposed that the top card is the Queen of Hearts, then clearly what would have to be true for those circumstances to obtain would be for the Ace of Spades to be the Queen of Hearts. The Ace of Spades is not in fact the Queen of Hearts, but that does not mean they cannot be identical in other worlds (c.f. Hughes and Cresswell, 1968, p190). Certainly if there were several cards people variously thought were on top, those cards in the various supposed circumstances would not provide a constant c such that Fc is true in all worlds. But that is because those cards are functions of the imagined worlds: the card a believes is top (εxBaFx) need not be the card b believes is top (εxBbFx), etc. It still remains that there is a constant, c, such that Fc is true in all worlds. Moreover, that c is not an ‘intensional object’, for the given Ace of Spades is a plain and solid extensional object, the actual top card (εxFx).

Routley, Meyer and Goddard did not accept the latter point, wanting a rigid semantics in terms of ‘intensional objects’ (Goddard and Routley, 1973, p561, Routley, Meyer and Goddard, 1974, p309, see also Hughes and Cresswell 1968, p197). Stalnaker and Thomason accepted that certain referential terms could be functional, when discriminating ‘Socrates’ from ‘Miss America’ — although the functionality of ‘Miss America in year t’ is significantly different from that of ‘the top card in y’s belief’. For if this year’s Miss America is last year’s Miss America, still it is only one thing which is identical with itself, unlike with the two cards. Also, there is nothing which can force this year’s Miss America to be last year’s different Miss America, in the way that the counterfactuality of the situation with the playing cards forces two non-identical things in the actual world to be the same thing in the other possible world. Other possible worlds are thus significantly different from other times, and so, arguably, other possible worlds should not be seen from the Realist perspective appropriate for other times — or other spaces.

4. The Epsilon Calculus’ Problematic

It might be said that Realism has delayed a proper logical understanding of many of these things. If you look ‘realistically’ at picturesque remarks like that made before, namely ‘the same object is ‘inside’ the belief as ‘outside’ in the room’, then it is easy for inappropriate views about the mind to start to interfere, and make it seem that the same object cannot be in these two places at once. But if the mind were something like another space or time, then counterfactuality could get no proper purchase — no one could be ‘wrong’, since they would only be talking about elements in their ‘world’, not any objective, common world. But really, all that is going on when one says, for instance,

There was a man in the room, but Celia believed he was a woman,

is that the same term — or one term and a pronominal surrogate for it — appears at two linguistic places in some discourse, with the same reference. Hence there is no grammatical difference between the cross reference in such an intensional case and the cross reference in a non-intensional case, such as

There was a man in the room. He was hungry.

i.e.

(∃x)Mx & HεxMx.

What has been difficult has merely been getting a symbolisation of the cross-reference in this more elementary kind of case. But it just involves extending the epsilon definition of existential statements, using a reiteration of the substituted epsilon term, as we can see.

It is now widely recognised how the epsilon calculus allows us to do this (Purdy 1994, Egli and von Heusinger 1995, Meyer Viol 1995, Ch 6), the theoretical starting point being the theorem about the Russellian theory of definite descriptions proved before, which breaks up what otherwise would be a single sentence into a sequential piece of discourse, enabling the existence and uniqueness clauses to be put in one sentence while the characterising remark is in another. The relationship starts to matter when, in fact, there is no obvious way to formulate a combination of anaphoric remarks in the predicate calculus, as in, for instance,

There is a king of France. He is bald,

where there is no uniqueness clause. This difficulty became a major problem when logicians started to consider anaphoric reference in the 1960s.

Geach, for instance, in Geach 1962, even believed there could not be a syllogism of the following kind (Geach 1962, p126):

A man has just drunk a pint of sulphuric acid.
Nobody who drinks a pint of sulphuric acid lives through the day.
So, he won’t live through the day.

He said that one could only draw the conclusion:

Some man who has just drunk a pint of sulphuric acid won’t live through the day.

Certainly one can only derive

(∃x)(Mx & Dx & ¬Lx)

from

(∃x)(Mx & Dx),

and

(x)(Dx → ¬Lx),

within predicate logic. But one can still derive

¬Lεx(Mx & Dx),

within the epsilon calculus.
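The epsilon calculus derivation just mentioned can be spelled out as follows. This is a sketch only: the abbreviation ‘e’ for ‘εx(Mx & Dx)’ and the line numbering are introduced here for convenience and are not part of the original presentation.

```latex
\begin{align*}
& \text{Let } e \text{ abbreviate } \varepsilon x(Mx \,\&\, Dx). \\
1.\quad & (\exists x)(Mx \,\&\, Dx) && \text{premise} \\
2.\quad & (x)(Dx \rightarrow \neg Lx) && \text{premise} \\
3.\quad & Me \,\&\, De && \text{from 1, epsilon definition of the existential quantifier} \\
4.\quad & De \rightarrow \neg Le && \text{from 2, universal instantiation to } e \\
5.\quad & \neg Le && \text{from 3 and 4, modus ponens}
\end{align*}
```

Line 5 is the formal counterpart of ‘So, he won’t live through the day’, with the pronoun rendered by the reiterated epsilon term.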

Geach likewise was foxed later when he produced his famous case (numbered 3 in Geach 1967):

Hob thinks a witch has blighted Bob’s mare, and Nob wonders whether she (the same witch) killed Cob’s sow,

which is, in epsilon terms

Th(∃x)(Wx & Bxb) & OnKεx(Wx & Bxb)c.

For Geach saw that this could not be (4)

(∃x)(Wx & ThBxb & OnKxc),

or (5)

(∃x)(Th(Wx & Bxb) & OnKxc).

But also a reading of the second clause as (c.f. 18)

Nob wonders whether the witch who blighted Bob’s mare killed Cob’s sow,

in which ‘the witch who blighted Bob’s mare killed Cob’s sow’ is analysed in the Russellian manner, i.e. as (20)

just one witch blighted Bob’s mare and she killed Cob’s sow,

Geach realised does not catch the specific cross-reference — amongst other things because of the uniqueness condition which is then introduced.

This difficulty with the uniqueness clause in Russellian analyses has been widely commented on, although a recent theorist, Neale, has said that Russell’s theory only needs to be modestly modified: Neale’s main idea is that, in general, definite descriptions should just be localised to the context. His resolution of Geach’s troubling cases thus involves suggesting that ‘she’, in the above, might simply be ‘the witch we have been hearing about’ (Neale 1990, p221). Neale might here have said ‘that witch who blighted Bob’s mare’, showing that a Hilbertian account of demonstrative descriptions would have a parallel effect.

A good deal of the ground breaking work on these matters, however, was done by someone again much influenced by Russell: Evans. But Evans significantly broke with Russell over uniqueness (Evans 1977, pp516-517):

One does not want to be committed, by this way of telling the story, to the existence of a day on which just one man and boy walked along a road. It was with this possibility in mind that I stated the requirement for the appropriate use of an E-type pronoun in terms of having answered, or being prepared to answer upon demand, the question ‘He? Who?’ or ‘It? Which?’ In order to effect this liberalisation we should allow the reference of the E-type pronoun to be fixed not only by predicative material explicitly in the antecedent clause, but also by material which the speaker supplies upon demand. This ruling has the effect of making the truth conditions of such remarks somewhat indeterminate; a determinate proposition will have been put forward only when the demand has been made and the material supplied.

It was Evans who gave us the title ‘E-type pronoun’ for the ‘he’ in such expressions as

A Cambridge philosopher smoked a pipe, and he drank a lot of whisky,

i.e., in epsilon terms,

(∃x)(Cx & Px) & Dεx(Cx & Px).

He also insisted (Evans 1977, p516) that what was unique about such pronouns was that this conjunction of statements was not equivalent to

A Cambridge philosopher, who smoked a pipe, drank a lot of whisky,

i.e.

(∃x)(Cx & Px & Dx).

Clearly the epsilon account is entirely in line with this, since it illustrates the point made before about cases without a uniqueness clause. Only the second expression, which contains a relative pronoun, is formalisable in the predicate calculus. To formalise the first expression, which contains a personal pronoun, one at least needs something with the expressive capabilities of the epsilon calculus.
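The non-equivalence Evans insisted on can be checked in a small finite model. The following Python sketch is illustrative only: the two-element domain, the extensions given to ‘C’, ‘P’ and ‘D’, and the rule that the epsilon term picks the first satisfier in domain order are all assumptions made here, not part of the source.

```python
from typing import Callable, Sequence

Pred = Callable[[str], bool]

def epsilon(domain: Sequence[str], pred: Pred) -> str:
    """Return the first element satisfying pred; if none does, return a fixed arbitrary element."""
    for d in domain:
        if pred(d):
            return d
    return domain[0]

def exists(domain: Sequence[str], pred: Pred) -> bool:
    """Ordinary existential quantification over the finite domain."""
    return any(pred(d) for d in domain)

# Hypothetical model: two individuals, both Cambridge philosophers who smoked a pipe,
# but only "b" drank a lot of whisky.
domain = ["a", "b"]
C: Pred = lambda x: True
P: Pred = lambda x: True
D: Pred = lambda x: x == "b"
CP: Pred = lambda x: C(x) and P(x)

# (∃x)(Cx & Px) & Dεx(Cx & Px): the personal-pronoun (E-type) reading.
e_type = exists(domain, CP) and D(epsilon(domain, CP))

# (∃x)(Cx & Px & Dx): the relative-pronoun reading.
relative = exists(domain, lambda x: C(x) and P(x) and D(x))

print(e_type, relative)
```

In this model the relative-pronoun sentence comes out true while the E-type conjunction comes out false, because the individual picked out by the epsilon term is not the whisky drinker. The E-type conjunction does still entail the relative-pronoun sentence, so the difference runs in one direction only.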

5. The Formal Semantics of Epsilon Terms

The semantics of epsilon terms is nowadays more general, but the first interpretations of epsilon terms were restricted to arithmetical cases, and specifically took epsilon to be the least number operator. Hilbert and Bernays developed arithmetic within the epsilon calculus by adding the further epsilon axiom schema (Hilbert and Bernays 1970, Book 2, p85f, c.f. Leisenring 1969, p92):

(εxAx = st) → ¬At,

where ‘s’ is intended to be the successor function, and ‘t’ is any numeral. This constrains the interpretation of the epsilon symbol, but the least number interpretation is not strictly forced, since the axiom only ensures that no number having the property A immediately precedes εxAx.
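The point that the axiom does not force the least number reading can be illustrated on a finite initial segment of the numbers. The sketch below is an assumption-laden illustration: the bound N, the helper names, and the alternative interpretation `greatest_run_start` are all hypothetical and introduced only for this check.

```python
N = 12  # hypothetical bound: only the numerals 0..N-1 are considered

def least(A):
    """Least-number reading: the smallest n with A(n), or 0 if nothing has A."""
    return next((n for n in range(N) if A(n)), 0)

def greatest_run_start(A):
    """A non-least reading: the largest n with A(n) whose predecessor lacks A, or 0 if nothing has A."""
    starts = [n for n in range(N) if A(n) and (n == 0 or not A(n - 1))]
    return max(starts, default=0)

def satisfies_axiom(eps, A):
    """Check (eps A = s t) -> not A(t) for every numeral t in the segment."""
    value = eps(A)
    return all(not A(t) for t in range(N) if value == t + 1)

A = lambda n: n in {3, 4, 7}

print(least(A), satisfies_axiom(least, A))                            # 3 True
print(greatest_run_start(A), satisfies_axiom(greatest_run_start, A))  # 7 True
```

Both interpretations pick a number with the property A and both satisfy the axiom, yet they disagree on which number is chosen. That is all the axiom guarantees: the chosen number is not immediately preceded by another number with the property.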

The new axiom, however, is sufficient to prove mathematical induction, in the form:

(A0 & (x)(Ax → Asx)) → (x)Ax.

For assume the reverse, namely

A0 & (x)(Ax → Asx) & ¬(x)Ax,

and consider what happens when the term ‘εx¬Ax’ is substituted in

t = 0 ∨ t = sn,

which is derivable from the other axioms of number theory which Hilbert and Bernays are using. If we had

εx¬Ax = 0,

then, since it is given that A0, we would have Aεx¬Ax. But since, by the definition of the universal quantifier,

Aεx¬Ax ↔ (x)Ax,

we know, because ¬(x)Ax is also given, that ¬Aεx¬Ax, which means we cannot have εx¬Ax = 0. Hence we must have the other alternative, i.e.

εx¬Ax = sn,

for some n. But from the new axiom

(εx¬Ax = sn) → An,

hence we must have An, although we must also have

An → Asn,

because (x)(Ax → Asx). Taken together, these give Asn, i.e. Aεx¬Ax again, which is impossible. Hence the further epsilon axiom is sufficient to establish the given principle of induction.

The more general link between epsilon terms and choice functions was first set out by Asser, although Asser’s semantics for an elementary epsilon calculus without the second epsilon axiom makes epsilon terms denote rather complex choice functions. Wilfrid Meyer Viol, calling an epsilon calculus without the second axiom an ‘intensional’ epsilon calculus, makes the epsilon terms in such a calculus instead name Skolem functions. Skolem functions are also called Herbrand functions, although they arise in a different way, namely in Skolem’s Theorem. Skolem’s Theorem states that, if a formula in prenex normal form is provable in the predicate calculus, then a certain corresponding formula, with the existential quantifiers removed, is provable in a predicate calculus enriched with function symbols. The functions symbolised are called Skolem functions, although, in another context, they would be Herbrand functions.

Skolem’s Theorem is a meta-logical theorem, about the relation between two logical calculi, but a non-metalogical version is in fact provable in the epsilon calculus from which Skolem’s actual theorem follows, since, for example, we can get, by the epsilon definition, now of the existential quantifier

(x)(∃y)Fxy ↔ (x)FxεyFxy.

As a result, if the left hand side of such an equivalence is provable in an epsilon calculus the right hand side is provable there. But the left hand side is provable in an epsilon calculus if it is provable in the predicate calculus, by the Second Epsilon Theorem; and if the right hand side is provable in an epsilon calculus it is provable in a predicate calculus enriched with certain function symbols — epsilon terms, like ‘εyFxy’. So, by generalisation, we get Skolem’s original result.
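The object-level equivalence used in this argument can be checked exhaustively over a small finite domain. The following Python sketch assumes a three-element domain and reads ‘εyFxy’ as a function of x that returns some witness when there is one and a fixed arbitrary element otherwise; the names `eps_y`, `lhs` and `rhs` are hypothetical.

```python
from itertools import product

domain = [0, 1, 2]  # hypothetical finite domain

def eps_y(F, x):
    """Skolem-style reading of εyFxy: some y with F(x, y) if there is one, else a fixed arbitrary element."""
    return next((y for y in domain if F(x, y)), domain[0])

def lhs(F):
    """(x)(∃y)Fxy"""
    return all(any(F(x, y) for y in domain) for x in domain)

def rhs(F):
    """(x)Fx εyFxy, i.e. F holds between each x and the witness chosen for it."""
    return all(F(x, eps_y(F, x)) for x in domain)

pairs = list(product(domain, repeat=2))

def all_relations():
    """Enumerate every binary relation on the domain as a two-place predicate."""
    for bits in product([False, True], repeat=len(pairs)):
        table = dict(zip(pairs, bits))
        yield lambda x, y, table=table: table[(x, y)]

equivalence_holds = all(lhs(F) == rhs(F) for F in all_relations())
print(equivalence_holds)
```

The check runs over all 512 binary relations on the domain. It is a finite sanity check of the equivalence, not a proof of Skolem’s Theorem, which concerns provability rather than truth in a particular model.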

When we add to an intensional epsilon calculus the second epsilon axiom

(x)(Fx ↔ Gx) → εxFx = εxGx,

the interpretation of epsilon terms is commonly extensional, i.e. in terms of sets, since two predicates ‘F’ and ‘G’ satisfying the antecedent of this second axiom will determine the same set — if they determine sets at all, that is. For that requires the predicates to be collectivisantes, in Bourbaki’s terms, as with explicit set membership statements, like ‘x ∈ y’. In such a case the epsilon term ‘εx(x ∈ y)’ designates a choice function, i.e. a function which selects one from a given set (c.f. Leisenring 1969, p19, Meyer Viol 1995, p42). In the case where there are no members of the set the selection is arbitrary, although for all empty sets it is invariably the same. Thus the second axiom validates, for example, Kalish and Montague’s rule for this case, which they put in the form

εxFx = εx¬(x = x).

Kalish and Montague in fact prove a version of the second epsilon axiom in their system (Kalish and Montague 1964, see T407, p256). The second axiom also holds in Hermes’ system (Hermes 1965), although there one in addition finds a third epsilon axiom,

εx¬(x = x) = εx(x = x),

for which there would seem to be no real justification.
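The extensional reading described above, on which an epsilon term is fixed by the set its predicate determines, can be sketched in a few lines. This is an illustration under stated assumptions: the finite domain, the use of `min` as the selection rule, and the choice of the first domain element for the empty set are all hypothetical.

```python
domain = ["a", "b", "c"]  # hypothetical finite domain

def extension(pred):
    """The set of things in the domain that satisfy pred."""
    return frozenset(d for d in domain if pred(d))

def choice(s):
    """Select one member of a non-empty set; for the empty set return one fixed arbitrary element."""
    return min(s) if s else domain[0]

def eps(pred):
    """Extensional epsilon term: depends only on the extension of pred."""
    return choice(extension(pred))

F = lambda x: x in ("b", "c")
G = lambda x: x != "a"                       # co-extensive with F, differently described
H = lambda x: False                          # nothing is H
not_self_identical = lambda x: not (x == x)  # the predicate ¬(x = x)

print(eps(F) == eps(G))                   # second epsilon axiom: co-extensive predicates agree
print(eps(H) == eps(not_self_identical))  # the empty case always yields the same element
```

Because `eps` looks only at the extension, any two predicates satisfying the antecedent of the second axiom receive the same value, and every empty predicate receives the value assigned to ‘εx¬(x = x)’.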

But the second epsilon axiom itself is curious. One questionable thing about it is that neither Leisenring nor Meyer Viol states that the predicates in question must determine sets before their choice function semantics can apply. That the predicates are collectivisantes is merely presumed in their theories, since ‘εxBx’ is invariably modelled by means of a choice from the presumed set of things which in the model are B. Certainly there is a special clause dealing with the empty set; but there is no consideration of the case where some things are B although those things are not discrete, as with the things which are red, for instance. If the predicate in question is not a count noun then there is no set of things involved, since with mass terms and continuous quantities there are no given elements to be counted (c.f. Bunt 1985, pp262-263 in particular). Of course numbers can still be associated with them, but only given an arbitrary unit. With the cows in a field, for instance, we can associate a determinate number, but with the beef there we cannot, unless we consider, say, the number of pounds of it.

The point, as we saw before, has a formalisation in epsilon terms. Thus if we write ‘(nx)Fx’, for ‘there are n F’s’, then εn(ny)Fy will be the number of F’s, and it is what numbers them if they have a number. But in the reverse case the previously mentioned arbitrariness of the epsilon term comes in. For if ¬(∃n)(nx)Fx, then ¬([εn(ny)Fy]x)Fx, and so, although an arbitrary number exists, it does not number the F’s. In that case, in other words, we do not have a number of F’s, merely some F.

In fact, even when there is a set of things, the second epsilon axiom, as stated above, does not apply in general, since there are intensional differences between properties to consider, as in, for instance ‘There is a red-haired man, and a Caucasian in the room, and they are different’. Here, if there were only red-haired Caucasians in the room, then with the above second axiom, we could not find epsilon substitutions to differentiate the two individuals involved. This may remind us that it is necessary co-extensionality, and not just contingent co-extensionality which is the normal criterion for the identity of properties (c.f. Hughes and Cresswell 1968, pp209-210). So it leads us to see the appropriateness of a modalised second axiom, which uses just an intensional version of the antecedent of the previous second epsilon axiom, in which ‘L’ means ‘it is necessary that’, namely:

L(x)(Fx ↔ Gx) → εxFx = εxGx.

For with this axiom only the co-extensionalities which are necessary will produce identities between the associated epsilon terms. We can only get, for instance,

εxPx = εx(Px ∨ Px),

and

εxFx = εyFy,

and all other identities derivable in a similar way.

However, the original second epsilon axiom is then provable, in the special case where the predicates express set membership. For if necessarily

(x)(x ∈ y ↔ x ∈ z) ↔ y = z,

while necessarily

y = z ↔ L(y = z),

(see Hughes and Cresswell, 1968, p190) then

L(x)(x ∈ y ↔ x ∈ z) ↔ (x)(x ∈ y ↔ x ∈ z),

and so, from the modalised second axiom we can get

(x)(x ∈ y ↔ x ∈ z) → εx(x ∈ y) = εx(x ∈ z).

Note, however, that if one only has contingently

(x)(Fx ↔ x ∈ z),

then one cannot get, on this basis,

εxFx = εx(x ∈ z).

But this is something which is desirable, as well. For we have seen that it is contingent that the number of the planets does number the planets — because it is not necessary that ([εn(ny)Py]x)Px. This makes ‘(9x)Px’ contingent, even though the identity ‘9 = εn(nx)Px’ remains necessary. But also it is contingent that there is the set of planets, p, which there is, since while, say,

(x)(x ∈ p ↔ Px),

where

εn(nx)(x ∈ p) = εn(nx)Px = 9,

it is still possible that, in some other possible world,

(x)(x ∈ p’ ↔ Px),

with p’ the set of planets there, and

¬(εn(nx)(x ∈ p’) = 9).

We could not have this further contingency, however, if the original second epsilon axiom held universally.

It is on this fuller basis that we can continue to hold ‘x = y → L(x = y)’, i.e. the invariable necessity of identity: one merely distinguishes ‘(9x)Px’ from ‘9 = εn(nx)Px’, and from ‘9 = εn(nx)(x ∈ p)’, as above.

Adding the original second epsilon axiom to an intensional epsilon calculus is therefore acceptable only if all the predicates are about set membership. This is not an uncommon assumption, indeed it is pervasive in the usually given semantics for predicate logic, for instance. But if, by contrast, we want to allow for the fact that not all predicates are collectivisantes then we should take just the first epsilon axiom with merely a modalised version of the second epsilon axiom. The interpretation of epsilon terms is then always in terms of Skolem functions, although if we are dealing with the membership of sets, those Skolem functions naturally are choice functions.

6. Some Metatheory

To finish we shall briefly look, as promised, at some meta-theory.

The epsilon calculi that were first described were not very convenient to use, and Hilbert and Bernays’ proofs of the First and Second Epsilon Theorems were very complex. This was because the presentation was axiomatic, however, and with the development of other means of presenting the same logics we get more readily available meta-logical results. I will indicate some of the early difficulties before showing how these theorems can be proved, nowadays, much more simply.

The problem with proving the Second Epsilon Theorem, on an axiomatic basis, is that complex, and non-constant epsilon terms may enter a proof in the epsilon calculus by means of substitutions into the axioms. What has to be proved is that an epsilon calculus proof of an epsilon-free theorem (i.e. one which can be expressed just in predicate calculus language) can be replaced by a predicate calculus proof. So some analysis of complex epsilon terms is required, to show that they can be eliminated in the relevant cases, leaving only constant epsilon terms, which are sufficiently similar to the individual symbols in standard predicate logic. Hilbert and Bernays (Hilbert and Bernays 1970, Book 2, p23f) say that one epsilon term ‘εxFx’ is subordinate to another ‘εyGy’ if and only if ‘G’ contains ‘εxFx’, and a free occurrence of the variable ‘y’ lies within ‘εxFx’. For instance ‘εxRxy’ is a complex, and non-constant epsilon term, which is subordinate to ‘εySyεxRxy’. Hilbert and Bernays then define the rank of an epsilon term to be 1 if there are no epsilon terms subordinate to it, and otherwise to be one greater than the maximal rank of the epsilon terms which are subordinate to it. Using the same general ideas, Leisenring proves two theorems (Leisenring 1969, p72f). First he proves a rank reduction theorem, which shows that epsilon proofs of epsilon-free formulas in which the second epsilon axiom is not used, but in which every term is of rank less than or equal to r, may be replaced by epsilon proofs in which every term is of rank less than or equal to r – 1. Then he proves the eliminability of the second epsilon axiom in proofs of epsilon-free formulas. Together, these two theorems show that if there is an epsilon proof of an epsilon-free formula, then there is such a proof not using the second epsilon axiom, and in which all epsilon terms have rank just 1.

Even though such epsilon terms might still contain free variables, if one replaces those that do with a fixed symbol ‘a’ (starting with those of maximal length) that reduces the proof to one in what is called the ‘epsilon star’ system, in which there are only constant epsilon terms (Leisenring 1969, p66f). Leisenring shows that proofs in the epsilon star system can be turned into proofs in the predicate calculus, by replacing the epsilon terms by individual symbols.

But, as was said before, there is now available a much shorter proof of the Second Epsilon Theorem. In fact there are several, but I shall just indicate one, which arises simply by modifying the predicate calculus truth trees, as found in, for instance, Jeffrey (see Jeffrey 1967). Jeffrey uses the standard propositional truth tree rules, together with the rules of quantifier interchange, which remain unaffected, and which are not material to the present purpose. He also has, however, a rule of existential quantifier elimination,

(∃x)Fx ├ Fa,

in which ‘a’ must be new, and a rule of universal quantifier elimination

(x)Fx ├ Fb,

in which ‘b’ must be old — unless no other individual terms are available. By reducing closed formulas of the form ‘P & ¬C’ to absurdity Jeffrey can then prove ‘P → C’, and validate ‘P ├ C’ in his calculus. But clearly, upon adding epsilon terms to the language, the first of these rules must be changed to

(∃x)Fx ├ FεxFx,

while also the second rule can be replaced by the pair

(x)Fx ├ Fεx¬Fx,
Fεx¬Fx ├ Fa,

(where ‘a’ is old) to produce an appropriate proof procedure. Steen reads ‘εx¬Fx’ as ‘the most un-F-like thing’ (Steen 1972, p162), which explains why Fεx¬Fx entails Fa, since if the most un-F-like thing is in fact F, then the most plausible counter-example to the generalisation is in fact not so, making the generalisation exceptionless. But there is a more important reason why the rule of universal quantifier elimination is best broken up into two parts.

For Jeffrey’s rules only allow him ‘limited upward correctness’ (Jeffrey 1967, p167), since Jeffrey has to say, with respect to his universal quantifier elimination rule, that the range of the quantification there be limited merely to the universe of discourse of the path below. This is because, if an initial sentence is false in a valuation so also must be one of its conclusions. But the first epsilon rule which replaces Jeffrey’s rule ensures, instead, that there is ‘total upwards correctness’. For if it is false that everything is F then, without any special interpretation of the quantifier, one of the given consequences of the universal statement is false, namely the immediate one — since Fεx¬Fx is in fact equivalent to (x)Fx. A similar improvement also arises with the existential quantifier elimination rule. For Jeffrey can only get ‘limited downwards correctness’, with his existential quantifier elimination rule (Jeffrey 1967, p165), since it is not an entailment. In fact, in order to show that if an initial sentence is true in a valuation so is one of its conclusions, in this case, Jeffrey has to stretch his notion of ‘truth’ to being true either in the given valuation, or some nominal variant of it.

The epsilon rule which replaces Jeffrey’s overcomes this difficulty by not employing names, only demonstrative descriptions, and by being, as a result, totally downward correct. For if there is an F then that F is F, whatever name is used to refer to it. The epsilon calculus terminology thus precedes any naming: it gets hold of the more primitive, demonstrative way we have of referring to objects, using phrases like ‘that F’. Thus in explication of the predicate calculus rule we might well have said

suppose there is an F, well, call that F ‘a’, then Fa,

but that requires we understand ‘that F’ before we come to use ‘a’.

So how does the Second Epsilon Theorem follow? This theorem, as before, states that an epsilon calculus proof of an epsilon-free theorem may be replaced by a predicate calculus proof of the same formula. But the transformation required in the present setting is now evident: simply change to new names all epsilon terms introduced in the epsilon calculus quantifier elimination rules. This covers not only the new names in Jeffrey’s first rule but also the odd case where there are no old names in Jeffrey’s second rule. The epsilon calculus proofs invariably use constant epsilon terms, and are thus effectively in Leisenring’s epsilon star system.

Epsilon terms which are non-constant, however, crucially enter the proof of the First Epsilon Theorem. The First Epsilon Theorem states that if C is a provable predicate calculus formula, in prenex normal form, i.e. with all quantifiers at the front, then a finite disjunction of instances of C’s matrix is provable in the epsilon calculus. The crucial fact is that the epsilon calculus gives us access to Herbrand functions, which arise when universal quantifiers are eliminated from formulas using their epsilon definition. Thus

(∃y)(x)¬Fyx,

for instance, is equivalent to

(∃y)¬Fyεx¬¬Fyx,

and so

(∃y)¬FyεxFyx,

and the resulting epsilon term ‘εxFyx’ is a Herbrand function.

Using such reductions, all universal quantifiers can evidently be removed from formulas in prenex normal form, and the additional fact that, in a certain specific way, the remaining existential quantifiers are disjunctions makes all predicate calculus formulas equivalent to disjunctions. Remember that a formula is provable if its negation is reducible to absurdity, which means that its truth tree must close. But, by König’s Lemma, if there is no open path through a truth tree then there is some finite stage at which there is no open path, so, in the case above, for instance, if no valuation makes the last formula’s negation true, then the tree of the instances of that negative statement must close in a finite length. But the negative statement is the universal formula

(y)FyεxFyx,

by the rules of quantifier interchange, so a finite conjunction of instances of the matrix of this universal formula, namely Fyx, must reduce to absurdity. For the rules of universal quantifier elimination only produce consequences with the form of this matrix. By de Morgan’s Laws, that makes necessary a finite disjunction of instances of ¬Fyx. By generalisation we thus get the First Epsilon Theorem.
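For the running example, the conclusion of this argument can be written out explicitly. The terms t_1, ..., t_n and the number n are notation assumed here for illustration; the argument guarantees only that some finite list of such terms exists, not which terms they are.

```latex
\[
\text{If } \vdash (\exists y)(x)\neg Fyx, \text{ then for some terms } t_1,\dots,t_n:\quad
\vdash \neg F\,t_1\,\varepsilon x F t_1 x \;\lor\; \cdots \;\lor\;
       \neg F\,t_n\,\varepsilon x F t_n x .
\]
```

Each disjunct is an instance of the matrix ¬Fyx in which y is replaced by a term t_i and x by the corresponding value of the Herbrand function εxFyx.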

The epsilon calculus, however, can take us further than the First Epsilon Theorem. Indeed, one has to take care with the impression this theorem may give that existential statements are just equivalent to disjunctions. If that were the case, then existential statements would be unlike individual statements, saying not that one specified thing has a certain property, but merely that one of a certain group of things has a certain property. The group in question is normally called the ‘domain’ of the quantification, and this, it seems, has to be specified when setting out the semantics of quantifiers. But study of the epsilon calculus shows that there is no need for such ‘domains’, or indeed for such semantics. This is because the example above, for instance, is also equivalent to

¬FaεzFaz,

where a = εy¬FyεxFyx. So the previous disjunction of instances of ¬Fyx is in fact only true because this specific disjunct is true. The First Epsilon Theorem, it must be remembered, does not prove that an existential statement is equivalent to a certain disjunction; it shows merely that an existential statement is provable if and only if a certain disjunction is provable. And what is also provable, in such a case, is a statement merely about one object. Indeed the existential statement is provably equivalent to it. It is this fact which supports the epsilon definition of the quantifiers; and it is what permits anaphoric reference to the same object by means of the same epsilon term. An existential statement is thus just another statement about an individual — merely a nameless one.

The reverse point goes for the universal quantifier: a universal statement is not the conjunction of its instances, even though it implies them. A generalisation is simply equivalent to one of its instances — to the one involving the prime putative exception to it, as we have seen. Not being able to specify that prime putative exception leaves Jeffrey saying that if a generalisation is false then one of its instances is false without any way of ensuring that that instance has been drawn as a conclusion below it in the truth tree except by limiting the interpretation of the generalisation just to the universe of discourse of the path. It thus seems necessary, within the predicate calculus, that there be a ‘model’ for the quantifiers which restricts them to a certain ‘domain’, which means that they do not necessarily range over everything. But in the epsilon calculus the quantifiers do, invariably, range over everything, and so there is no need to specify their range.
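The claim that a quantified statement is equivalent to a single instance can be checked mechanically in a small case. The following Python sketch is an illustration only: it assumes a hypothetical three-element universe `DOMAIN` and models the epsilon operator by a helper `eps` that returns some satisfier of a predicate if there is one and an arbitrary object otherwise. The names `DOMAIN`, `eps` and `check_all_predicates` are invented for this example, and fixing a finite universe is a convenience of the test, not something the epsilon calculus itself requires.

```python
from itertools import product

DOMAIN = [0, 1, 2]  # hypothetical finite universe, for illustration only


def eps(pred):
    """Model of the epsilon term: some element satisfying pred if there is one,
    otherwise an arbitrary element (here the first element of DOMAIN)."""
    for d in DOMAIN:
        if pred(d):
            return d
    return DOMAIN[0]


def check_all_predicates():
    """Check both epsilon equivalences for every predicate F on DOMAIN."""
    for bits in product([False, True], repeat=len(DOMAIN)):
        table = dict(zip(DOMAIN, bits))
        F = table.__getitem__
        not_F = lambda d, t=table: not t[d]

        exists_F = any(F(d) for d in DOMAIN)
        forall_F = all(F(d) for d in DOMAIN)

        # (∃x)Fx holds exactly when the single instance F(εxFx) holds
        if exists_F != F(eps(F)):
            return False
        # (x)Fx holds exactly when F holds of the prime putative exception εx¬Fx
        if forall_F != F(eps(not_F)):
            return False
    return True


print(check_all_predicates())
```

In each case only one instance is inspected: the epsilon term picks out the object on which the truth of the whole quantified statement turns.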

7. References and Further Reading

  • Ackermann, W. 1937-8, ‘Mengentheoretische Begründung der Logik’, Mathematische Annalen, 115, 1-22.
  • Asser, G. 1957, ‘Theorie der Logischen Auswahlfunktionen’, Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 3, 30-68.
  • Bernays, P. 1958, Axiomatic Set Theory, North Holland, Dordrecht.
  • Bertolet, R. 1980, ‘The Semantic Significance of Donnellan’s Distinction’, Philosophical Studies, 37, 281-288.
  • Bourbaki, N. 1954, Éléments de Mathématique, Hermann, Paris.
  • Bunt, H.C. 1985, Mass Terms and Model-Theoretic Semantics, C.U.P., Cambridge.
  • Church, A. 1940, ‘A Formulation of the Simple Theory of Types’, Journal of Symbolic Logic, 5, 56-68.
  • Copi, I. 1973, Symbolic Logic, 4th ed. Macmillan, New York.
  • Devitt, M. 1974, ‘Singular Terms’, The Journal of Philosophy, 71, 183-205.
  • Donnellan, K. 1966, ‘Reference and Definite Descriptions’, Philosophical Review, 75, 281-304.
  • Egli, U. and von Heusinger, K. 1995, ‘The Epsilon Operator and E-Type Pronouns’ in U. Egli et al. (eds.), Lexical Knowledge in the Organisation of Language, Benjamins, Amsterdam.
  • Evans, G. 1977, ‘Pronouns, Quantifiers and Relative Clauses’, Canadian Journal of Philosophy, 7, 467-536.
  • Geach, P.T. 1962, Reference and Generality, Cornell University Press, Ithaca.
  • Geach, P.T. 1967, ‘Intentional Identity’, The Journal of Philosophy, 64, 627-632.
  • Goddard, L. and Routley, R. 1973, The Logic of Significance and Context, Scottish Academic Press, Aberdeen.
  • Gödel, K. 1969, ‘An Interpretation of the Intuitionistic Sentential Calculus’, in J. Hintikka (ed.), The Philosophy of Mathematics, O.U.P. Oxford.
  • Hazen, A. 1987, ‘Natural Deduction and Hilbert’s ε-operator’, Journal of Philosophical Logic, 16, 411-421.
  • Hermes, H. 1965, Eine Termlogik mit Auswahloperator, Springer Verlag, Berlin.
  • Hilbert, D. 1923, ‘Die Logischen Grundlagen der Mathematik’, Mathematische Annalen, 88, 151-165.
  • Hilbert, D. 1925, ‘On the Infinite’ in J. van Heijenoort (ed.), From Frege to Gödel, Harvard University Press, Cambridge MA.
  • Hilbert, D. and Bernays, P. 1970, Grundlagen der Mathematik, 2nd ed., Springer, Berlin.
  • Hughes, G.E. and Cresswell, M.J. 1968, An Introduction to Modal Logic, Methuen, London.
  • Jeffrey, R. 1967, Formal Logic: Its Scope and Limits, 1st Ed. McGraw-Hill, New York.
  • Kalish, D. and Montague, R. 1964, Logic: Techniques of Formal Reasoning, Harcourt, Brace and World, Inc, New York.
  • Kneebone, G.T. 1963, Mathematical Logic and the Foundations of Mathematics, Van Nostrand, Dordrecht.
  • Leisenring, A.C. 1969, Mathematical Logic and Hilbert’s ε-symbol, Macdonald, London.
  • Marciszewski, W. 1981, Dictionary of Logic, Martinus Nijhoff, The Hague.
  • Meyer Viol, W.P.M. 1995, Instantial Logic, ILLC Dissertation Series 1995-11, Amsterdam.
  • Montague, R. 1963, ‘Syntactical Treatments of Modality, with Corollaries on Reflection Principles and Finite Axiomatisability’, Acta Philosophica Fennica, 16, 155-167.
  • Neale, S. 1990, Descriptions, MIT Press, Cambridge MA.
  • Priest, G.G. 1984, ‘Semantic Closure’, Studia Logica, XLIII 1/2, 117-129.
  • Prior, A.N., 1971, Objects of Thought, O.U.P. Oxford.
  • Purdy, W.C. 1994, ‘A Variable-Free Logic for Anaphora’ in P. Humphreys (ed.) Patrick Suppes: Scientific Philosopher, Vol 3, Kluwer, Dordrecht, 41-70.
  • Quine, W.V.O. 1960, Word and Object, Wiley, New York.
  • Rasiowa, H. 1956, ‘On the ε-theorems’, Fundamenta Mathematicae, 43, 156-165.
  • Rosser, J. B. 1953, Logic for Mathematicians, McGraw-Hill, New York.
  • Routley, R. 1969, ‘A Simple Natural Deduction System’, Logique et Analyse, 12, 129-152.
  • Routley, R. 1977, ‘Choice and Descriptions in Enriched Intensional Languages II, and III’, in E. Morscher, J. Czermak, and P. Weingartner (eds), Problems in Logic and Ontology, Akademische Druck und Verlagsanstalt, Graz.
  • Routley, R. 1980, Exploring Meinong’s Jungle, Departmental Monograph #3, Philosophy Department, R.S.S.S., A.N.U. Canberra.
  • Routley, R., Meyer, R. and Goddard, L. 1974, ‘Choice and Descriptions in Enriched Intensional Languages I’, Journal of Philosophical Logic, 3, 291-316.
  • Russell, B. 1905, ‘On Denoting’ Mind, 14, 479-493.
  • Sayward, C. 1987, ‘Prior’s Theory of Truth’ Analysis, 47, 83-87.
  • Slater, B.H. 1986(a), ‘E-type Pronouns and ε-terms’, Canadian Journal of Philosophy, 16, 27-38.
  • Slater, B.H. 1986(b), ‘Prior’s Analytic’, Analysis, 46, 76-81.
  • Slater, B.H. 1988(a), ‘Intensional Identities’, Logique et Analyse, 121-2, 93-107.
  • Slater, B.H. 1988(b), ‘Hilbertian Reference’, Noûs, 22, 283-97.
  • Slater, B.H. 1989(a), ‘Modal Semantics’, Logique et Analyse, 127-8, 195-209.
  • Slater, B.H. 1990, ‘Using Hilbert’s Calculus’, Logique et Analyse, 129-130, 45-67.
  • Slater, B.H. 1992(a), ‘Routley’s Formulation of Transparency’, History and Philosophy of Logic, 13, 215-24.
  • Slater, B.H. 1994(a), ‘The Epsilon Calculus’ Problematic’, Philosophical Papers, XXIII, 217-42.
  • Steen, S.W.P. 1972, Mathematical Logic, C.U.P. Cambridge.
  • Thomason, R. 1977, ‘Indirect Discourse is not Quotational’, Monist, 60, 340-354.
  • Thomason, R. 1980, ‘A Note on Syntactical Treatments of Modality’, Synthese, 44, 391-395.
  • Thomason, R.H. and Stalnaker, R.C. 1968, ‘Modality and Reference’, Noûs, 2, 359-372.

Author Information

Barry Hartley Slater
Email: slaterbh@cyllene.uwa.edu.au
University of Western Australia
Australia

Thomas Hobbes: Moral and Political Philosophy

The English philosopher Thomas Hobbes (1588-1679) is best known for his political thought, and deservedly so. His vision of the world is strikingly original and still relevant to contemporary politics. His main concern is the problem of social and political order: how human beings can live together in peace and avoid the danger and fear of civil conflict. He poses stark alternatives: we should give our obedience to an unaccountable sovereign (a person or group empowered to decide every social and political issue). Otherwise what awaits us is a state of nature that closely resembles civil war – a situation of universal insecurity, where all have reason to fear violent death and where rewarding human cooperation is all but impossible.

One controversy has dominated interpretations of Hobbes. Does he see human beings as purely self-interested or egoistic? Several passages support such a reading, leading some to think that his political conclusions can be avoided if we adopt a more realistic picture of human nature. However, most scholars now accept that Hobbes himself had a much more complex view of human motivation. A major theme below will be why the problems he poses cannot be avoided simply by taking a less selfish view of human nature.

Table of Contents

  1. Introduction
  2. Life and Times
  3. Two Intellectual Influences
  4. Ethics and Human Nature
    1. Materialism Versus Self-Knowledge
    2. The Poverty of Human Judgment and our Need for Science
    3. Motivation
    4. Political Philosophy
  5. The Natural Condition of Mankind
    1. The Laws of Nature and the Social Contract
    2. Why Should we Obey the Sovereign?
    3. Life Under the Sovereign
  6. Conclusion
  7. References and Further Reading

1. Introduction

Hobbes is the founding father of modern political philosophy. Directly or indirectly, he has set the terms of debate about the fundamentals of political life right into our own times. Few have liked his thesis, that the problems of political life mean that a society should accept an unaccountable sovereign as its sole political authority. Nonetheless, we still live in the world that Hobbes addressed head on: a world where human authority is something that requires justification, and is automatically accepted by few; a world where social and political inequality also appears questionable; and a world where religious authority faces significant dispute. We can put the matter in terms of the concern with equality and rights that Hobbes’s thought heralded: we live in a world where all human beings are supposed to have rights, that is, moral claims that protect their basic interests. But what or who determines what those rights are? And who will enforce them? In other words, who will exercise the most important political powers, when the basic assumption is that we all share the same entitlements?

We can see Hobbes’s importance if we briefly compare him with the most famous political thinkers before and after him. A century before, Niccolò Machiavelli had emphasized the harsh realities of power, as well as recalling ancient Roman experiences of political freedom. Machiavelli appears as the first modern political thinker, because like Hobbes he was no longer prepared to talk about politics in terms set by religious faith (indeed, he was still more offensive than Hobbes to many orthodox believers); instead, he looked upon politics as a secular discipline divorced from theology. But unlike Hobbes, Machiavelli offers us no comprehensive philosophy: we have to reconstruct his views on the importance and nature of freedom; it remains uncertain which, if any, principles Machiavelli draws on in his apparent praise of amoral power politics.

Writing a few years after Hobbes, John Locke had definitely accepted the terms of debate Hobbes had laid down: how can human beings live together, when religious or traditional justifications of authority are no longer effective or persuasive? How is political authority justified and how far does it extend? In particular, are our political rulers properly as unlimited in their powers as Hobbes had suggested? And if they are not, what system of politics will ensure that they do not overstep the mark, do not trespass on the rights of their subjects?

So, in assessing Hobbes’s political philosophy, our guiding questions can be: What did Hobbes write that was so important? How was he able to set out a way of thinking about politics and power that remains decisive nearly four centuries afterwards? We can get some clues to this second question if we look at Hobbes’s life and times.

2. Life and Times

Hobbes’s biography is dominated by the political events in England and Scotland during his long life. Born in 1588, the year the Spanish Armada made its ill-fated attempt to invade England, he lived to the exceptional age of 91, dying in 1679. He was not born to power or wealth or influence: the son of a disgraced village vicar, he was lucky that his uncle was wealthy enough to provide for his education and that his intellectual talents were soon recognized and developed (through thorough training in the classics of Latin and Greek). Those intellectual abilities, and his uncle’s support, brought him to university at Oxford. And these in turn—together with a good deal of common sense and personal maturity—won him a place tutoring the son of an important noble family, the Cavendishes. This meant that Hobbes entered circles where the activities of the King, of Members of Parliament, and of other wealthy landowners were known and discussed, and indeed influenced. Thus intellectual and practical ability brought Hobbes to a place close to power—later he would even be math tutor to the future King Charles II. Although this never made Hobbes powerful, it meant he was acquainted with and indeed vulnerable to those who were. As the scene was being set for the Civil Wars of 1642-46 and 1648-51—wars that would lead to the King being executed and a republic being declared—Hobbes felt forced to leave the country for his personal safety, and lived in France from 1640 to 1651. Even after the monarchy had been restored in 1660, Hobbes’s security was not always certain: powerful religious figures, critical of his writings, made moves in Parliament that apparently led Hobbes to burn some of his papers for fear of prosecution.

Thus Hobbes lived in a time of upheaval, sharper than any England has since known. This turmoil had many aspects and causes, political and religious, military and economic. England stood divided against itself in several ways. The rich and powerful were divided in their support for the King, especially concerning the monarch’s powers of taxation. Parliament was similarly divided concerning its own powers vis-à-vis the King. Society was divided religiously, economically, and by region. Inequalities in wealth were huge, and the upheavals of the Civil Wars saw the emergence of astonishingly radical religious and political sects. (For instance, the Levellers called for much greater equality in terms of wealth and political rights; the Diggers, more radical still, fought for the abolition of wage labor.) Civil war meant that the country became militarily divided. And all these divisions cut across one another: for example, the army of the republican challenger, Cromwell, was the main home of the Levellers, yet Cromwell in turn would act to destroy their power within the army’s ranks. In addition, England’s recent union with Scotland was fragile at best, and was almost destroyed by King Charles I’s attempts to impose consistency in religious practices. We shall see that Hobbes’s greatest fear was social and political chaos—and he had ample opportunity both to observe it and to suffer its effects.

Although social and political turmoil affected Hobbes’s life and shaped his thought, it never hampered his intellectual development. His early position as a tutor gave him the scope to read, write and publish (a brilliant translation of the Greek writer Thucydides appeared in 1629), and brought him into contact with notable English intellectuals such as Francis Bacon. His self-imposed exile in France, along with his emerging reputation as a scientist and thinker, brought him into contact with major European intellectual figures of his time, leading to exchange and controversy with figures such as Descartes, Mersenne and Gassendi. Intensely disputatious, Hobbes repeatedly embroiled himself in prolonged arguments with clerics, mathematicians, scientists and philosophers—sometimes to the cost of his intellectual reputation. (For instance, he argued repeatedly that it is possible to square the circle; it is no accident that the phrase is now proverbial for a problem that cannot be solved!) His writing was as undaunted by age and ill health as it was by the events of his times. Though his health slowly failed—from about sixty, he began to suffer shaking palsy, probably Parkinson’s disease, which steadily worsened—even in his eighties he continued to dictate his thoughts to a secretary, and to defend his quarter in various controversies.

Hobbes gained a reputation in many fields. He was known as a scientist (especially in optics), as a mathematician (especially in geometry), as a translator of the classics, as a writer on law, as a disputant in metaphysics and epistemology; not least, he became notorious for his writings and disputes on religious questions. But it is for his writings on morality and politics that he has, rightly, been most remembered. Without these, scholars might remember Hobbes as an interesting intellectual of the seventeenth century; but few philosophers would even recognize his name.

What are the writings that earned Hobbes his philosophical fame? The first was entitled The Elements of Law (1640); this was Hobbes’s attempt to provide arguments supporting the King against his challengers. De Cive [On the Citizen] (1642) has much in common with Elements, and offers a clear, concise statement of Hobbes’s moral and political philosophy. His most famous work is Leviathan, a classic of English prose (1651; a slightly altered Latin edition appeared in 1668). Leviathan expands on the argument of De Cive, mostly in terms of its huge second half that deals with questions of religion. Other important works include: De Corpore [On the Body] (1655), which deals with questions of metaphysics; De Homine [On Man] (1657); and Behemoth (published 1682, though written rather earlier), in which Hobbes gives his account of England’s Civil Wars. But to understand the essentials of Hobbes’s ideas and system, one can rely on De Cive and Leviathan. It is also worth noting that, although Leviathan is more famous and more often read, De Cive actually gives a much more straightforward account of Hobbes’s ideas. Readers whose main interest is in those ideas may wish to skip the next section and go straight to ethics and human nature.

3. Two Intellectual Influences

As well as the political background just stressed, two influences are extremely marked in Hobbes’s work. The first is a reaction against religious authority as it had been known, and especially against the scholastic philosophy that accepted and defended such authority. The second is a deep admiration for (and involvement in) the emerging scientific method, alongside an admiration for a much older discipline, geometry. Both influences affected how Hobbes expressed his moral and political ideas. In some areas it is also clear that they significantly affected the ideas themselves.

Hobbes’s contempt for scholastic philosophy is boundless. Leviathan and other works are littered with references to the “frequency of insignificant speech” in the speculations of the scholastics, with their combinations of Christian theology and Aristotelian metaphysics. Hobbes’s reaction, apart from much savage and sparkling sarcasm, is twofold. In the first place, he makes very strong claims about the proper relation between religion and politics. He was not (as many have charged) an atheist, but he was deadly serious in insisting that theological disputes should be kept out of politics. (He also adopts a strongly materialist metaphysics, that—as his critics were quick to charge—makes it difficult to account for God’s existence as a spiritual entity.) For Hobbes, the sovereign should determine the proper forms of religious worship, and citizens never have duties to God that override their duty to obey political authority. Second, this reaction against scholasticism shapes the presentation of Hobbes’s own ideas. He insists that terms be clearly defined and relate to actual concrete experiences—part of his empiricism. (Many early sections of Leviathan read rather like a dictionary.) Commentators debate how seriously to take Hobbes’s stress on the importance of definition, and whether it embodies a definite philosophical doctrine. What is certain, and more important from the point of view of his moral and political thought, is that he tries extremely hard to avoid any metaphysical categories that do not relate to physical realities (especially the mechanical realities of matter and motion). Commentators further disagree whether Hobbes’s often mechanical sounding definitions of human nature and human behavior are actually important in shaping his moral and political ideas—see Materialism versus self-knowledge below.

Hobbes’s determination to avoid the “insignificant” (that is, meaningless) speech of the scholastics also overlaps with his admiration for the emerging physical sciences and for geometry. His admiration is not so much for the emerging method of experimental science, but rather for deductive science—science that deduces the workings of things from basic first principles and from true definitions of the basic elements. Hobbes therefore approves a mechanistic view of science and knowledge, one that models itself very much on the clarity and deductive power exhibited in proofs in geometry. It is fair to say that this a priori account of science has found little favor after Hobbes’s time. It looks rather like a dead-end on the way to the modern idea of science based on patient observation, theory-building and experiment. Nonetheless, it certainly provided Hobbes with a method that he follows in setting out his ideas about human nature and politics. As presented in Leviathan, especially, Hobbes seems to build from first elements of human perception and reasoning, up to a picture of human motivation and action, to a deduction of the possible forms of political relations and their relative desirability. Once more, it can be disputed whether this method is significant in shaping those ideas, or merely provides Hobbes with a distinctive way of presenting them.

4. Ethics and Human Nature

Hobbes’s moral thought is difficult to disentangle from his politics. On his view, what we ought to do depends greatly on the situation in which we find ourselves. Where political authority is lacking (as in his famous natural condition of mankind), our fundamental right seems to be to save our skins, by whatever means we think fit. Where political authority exists, our duty seems to be quite straightforward: to obey those in power.

But we can usefully separate the ethics from the politics if we follow Hobbes’s own division. For him ethics is concerned with human nature, while political philosophy deals with what happens when human beings interact. What, then, is Hobbes’s view of human nature?

a. Materialism Versus Self-Knowledge

Reading the opening chapters of Leviathan is a confusing business, and the reason for this is already apparent in Hobbes’s very short Introduction. He begins by telling us that the human body is like a machine, and that political organization (the commonwealth) is like an artificial human being. He ends by saying that the truth of his ideas can be gauged only by self-examination, by looking into our selves to adjudge our characteristic thoughts and passions, which form the basis of all human action. But what is the relationship between these two very different claims? For obviously when we look into our selves we do not see mechanical pushes and pulls. This mystery is hardly answered by Hobbes’s method in the opening chapters, where he persists in talking about all manner of psychological phenomena—from emotions to thoughts to whole trains of reasoning – as products of mechanical interactions. (As to what he will say about successful political organization, the resemblance between the commonwealth and a functioning human being is slim indeed. Hobbes’s only real point seems to be that there should be a head that decides most of the important things that the body does.)

Most commentators now agree with an argument made in the 1960’s by the political philosopher Leo Strauss. Hobbes draws on his notion of a mechanistic science, that works deductively from first principles, in setting out his ideas about human nature. Science provides him with a distinctive method and some memorable metaphors and similes. What it does not provide—nor could it, given the rudimentary state of physiology and psychology in Hobbes’s day—are any decisive or substantive ideas about what human nature really is. Those ideas may have come, as Hobbes also claims, from self-examination. In all likelihood, they actually derived from his reflection on contemporary events and his reading of classics of political history such as Thucydides.

This is not to say that we should ignore Hobbes’s ideas on human nature—far from it. But it does mean we should not be misled by scientific imagery that stems from an in fact non-existent science (and also, to some extent, from an unproven and uncertain metaphysics). The point is important mainly when it comes to a central interpretative point in Hobbes’s work: whether or not he thinks of human beings as mechanical objects, programmed as it were to pursue their self-interest. Some have suggested that Hobbes’s mechanical world-view leaves no room for the influence of moral ideas, that he thinks the only effective influence on our behavior will be incentives of pleasure and pain. But while it is true that Hobbes sometimes says things like this, we should be clear that the ideas fit together only in a metaphorical way. For example, there is no reason why moral ideas should not “get into” the mechanisms that drive us round (like so many clock-work dolls perhaps?). Likewise, there is no reason why pursuing pleasure and pain should work in our self-interest. (What self-interest is depends on the time-scale we adopt, and how effectively we might achieve this goal also depends on our insight into what harms and benefits us). If we want to know what drives human beings, on Hobbes’s view, we must read carefully all he says about this, as well as what he needs to assume if the rest of his thought is to make sense. The mechanistic metaphor is something of a red herring and, in the end, probably less useful than his other starting point in Leviathan, the Delphic epithet: nosce teipsum (know thyself).

b. The Poverty of Human Judgment and our Need for Science

There are two major aspects to Hobbes’s picture of human nature. As we have seen, and will explore below, what motivates human beings to act is extremely important to Hobbes. The other aspect concerns human powers of judgment and reasoning, about which Hobbes tends to be extremely skeptical. Like many philosophers before him, Hobbes wants to present a more solid and certain account of human morality than is contained in everyday beliefs. Plato had contrasted knowledge with opinion. Hobbes contrasts science with a whole raft of less reliable forms of belief—from probable inference based on experience, right down to “absurdity, to which no living creature is subject but man” (Leviathan, v.7).

Hobbes has several reasons for thinking that human judgment is unreliable, and needs to be guided by science. Our judgments tend to be distorted by self-interest or by the pleasures and pains of the moment. We may share the same basic passions, but the various things of the world affect us all very differently; and we are inclined to use our feelings as measures for others. Judgment becomes dogmatic through vanity and morality, as with “men vehemently in love with their own new opinions…and obstinately bent to maintain them, [who give] their opinions also that reverenced name of conscience” (Leviathan, vii.4). When we use words which lack any real objects of reference, or are unclear about the meaning of the words we use, the danger is not only that our thoughts will be meaningless, but also that we will fall into violent dispute. (Hobbes has scholastic philosophy in mind, but he also makes related points about the dangerous effects of faulty political ideas and ideologies.) We form beliefs about supernatural entities, fairies and spirits and so on, and fear follows where belief has gone, further distorting our judgment. Judgment can be swayed this way and that by rhetoric, that is, by the persuasive and “colored speech” of others, who can deliberately deceive us and may well have purposes that go against the common good or indeed our own good. Not least, much judgment is concerned with what we should do now, that is, with future events, “the future being but a fiction of the mind” (Leviathan, iii.7) and therefore not reliably known to us.

For Hobbes, it is only science, “the knowledge of consequences” (Leviathan, v.17), that offers reliable knowledge of the future and overcomes the frailties of human judgment. Unfortunately, his picture of science, based on crudely mechanistic premises and developed through deductive demonstrations, is not even plausible in the physical sciences. When it comes to the complexities of human behavior, Hobbes’s model of science is even less satisfactory. He is certainly an acute and wise commentator on political affairs; we can praise him for his hard-headedness about the realities of human conduct, and for his determination to create solid chains of logical reasoning. Nonetheless, this does not mean that Hobbes was able to reach a level of scientific certainty in his judgments that had been lacking in all previous reflection on morals and politics.

c. Motivation

The most consequential aspect of Hobbes’s account of human nature centers on his ideas about human motivation, and this topic is therefore at the heart of many debates about how to understand Hobbes’s philosophy. Many interpreters have presented the Hobbesian agent as a self-interested, rationally calculating actor (those ideas have been important in modern political philosophy and economic thought, especially in terms of rational choice theories). It is true that some of the problems that face people like this—rational egoists, as philosophers call them—are similar to the problems Hobbes wants to solve in his political philosophy. And it is also very common for first-time readers of Hobbes to get the impression that he believes we are all basically selfish.

There are good reasons why earlier interpreters and new readers tend to think the Hobbesian agent is ultimately self-interested. Hobbes likes to make bold and even shocking claims to get his point across. “I obtained two absolutely certain postulates of human nature,” he says, “one, the postulate of human greed by which each man insists upon his own private use of common property; the other, the postulate of natural reason, by which each man strives to avoid violent death” (De Cive, Epistle Dedicatory). What could be clearer? We want all we can get, and we certainly want to avoid death. There are two problems with thinking that this is Hobbes’s considered view, however. First, quite simply, it represents a false view of human nature. People do all sorts of altruistic things that go against their interests. They also do all sorts of needlessly cruel things that go against self-interest (think of the self-defeating lengths that revenge can run to). So it would be uncharitable to interpret Hobbes this way, if we can find a more plausible account in his work. Second, in any case Hobbes often relies on a more sophisticated view of human nature. He describes or even relies on motives that go beyond or against self-interest, such as pity, a sense of honor or courage, and so on. And he frequently emphasizes that we find it difficult to judge or appreciate just what our interests are anyhow. (Some also suggest that Hobbes’s views on the matter shifted away from egoism after De Cive, but the point is not crucial here.)

The upshot is that Hobbes does not think that we are basically or reliably selfish; and he does not think we are fundamentally or reliably rational in our ideas about what is in our interests. He is rarely surprised to find human beings doing things that go against self-interest: we will cut off our noses to spite our faces, we will torture others for their eternal salvation, we will charge to our deaths for love of country. In fact, a lot of the problems that befall human beings, according to Hobbes, result from their being too little concerned with self-interest. Too often, he thinks, we are too much concerned with what others think of us, or inflamed by religious doctrine, or carried away by others’ inflammatory words. This weakness as regards our self-interest has even led some to think that Hobbes is advocating a theory known as ethical egoism. This is to claim that Hobbes bases morality upon self-interest, claiming that we ought to do what it is most in our interest to do. But we shall see that this would over-simplify the conclusions that Hobbes draws from his account of human nature.

d. Political Philosophy

This is Hobbes’s picture of human nature. We are needy and vulnerable. We are easily led astray in our attempts to know the world around us. Our capacity to reason is as fragile as our capacity to know; it relies upon language and is prone to error and undue influence. When we act, we may do so selfishly or impulsively or in ignorance, on the basis of faulty reasoning or bad theology or others’ emotive speech.

What is the political fate of this rather pathetic-sounding creature, that is, of us? Unsurprisingly, Hobbes thinks little happiness can be expected of our lives together. The best we can hope for is peaceful life under an authoritarian-sounding sovereign. The worst, on Hobbes’s account, is what he calls the natural condition of mankind, a state of violence, insecurity and constant threat. In outline, Hobbes’s argument is that the alternative to government is a situation no one could reasonably wish for, and that any attempt to make government accountable to the people must undermine it, so threatening the situation of non-government that we must all wish to avoid. Our only reasonable option, therefore, is a “sovereign” authority that is totally unaccountable to its subjects. Let us deal with the “natural condition” of non-government, also called the “state of nature,” first of all.

5. The Natural Condition of Mankind

The state of nature is “natural” in one specific sense only. For Hobbes political authority is artificial: in the “natural” condition human beings lack government, which is an authority created by men. What is Hobbes’s reasoning here? He claims that the only authority that naturally exists among human beings is that of a mother over her child, because the child is so very much weaker than the mother (and indebted to her for its survival). Among adult human beings this is invariably not the case. Hobbes concedes an obvious objection, admitting that some of us are much stronger than others. And although he is very sarcastic about the idea that some are wiser than others, he does not have much difficulty with the idea that some are fools and others are dangerously cunning. Nonetheless, it is almost invariably true that every human being is capable of killing any other. “Even the strongest must sleep; even the weakest might persuade others to help him kill another” (Leviathan, xiii.1-2). Because adults are equal in this capacity to threaten one another’s lives, Hobbes claims there is no natural source of authority to order their lives together. (He is strongly opposing arguments that established monarchs have a natural or God-given right to rule over us.)

Thus, as long as human beings have not successfully arranged some form of government, they live in Hobbes’s state of nature. Such a condition might occur at the “beginning of time” (see Hobbes’s comments on Cain and Abel, Leviathan, xiii.11, Latin version only), or in “primitive” societies (Hobbes thought the American Indians lived in such a condition). But the real point for Hobbes is that a state of nature could just as well occur in seventeenth-century England, should the King’s authority be successfully undermined. It could occur tomorrow in every modern society, for example, if the police and army suddenly refused to do their jobs on behalf of government. Unless some effective authority stepped into the King’s place (or the place of army and police and government), Hobbes argues the result is doomed to be deeply awful, nothing less than a state of war.

Why should peaceful cooperation be impossible without an overarching authority? Hobbes provides a series of powerful arguments that suggest it is extremely unlikely that human beings will live in security and peaceful cooperation without government. (Anarchism, the thesis that we should live without government, of course disputes these arguments.) His most basic argument is threefold (Leviathan, xiii.3-9). (i) He thinks we will compete, violently compete, to secure the basic necessities of life and perhaps to make other material gains. (ii) He argues that we will challenge others and fight out of fear (“diffidence”), so as to ensure our personal safety. (iii) And he believes that we will seek reputation (“glory”), both for its own sake and for its protective effects (for example, so that others will be afraid to challenge us).

This is a more difficult argument than it might seem. Hobbes does not suppose that we are all selfish, that we are all cowards, or that we are all desperately concerned with how others see us. Two points, though. First, he does think that some of us are selfish, some of us cowardly, and some of us “vainglorious” (perhaps some people are all of these!). Moreover, many of these people will be prepared to use violence to attain their ends, especially if there is no government or police to stop them. In this Hobbes is surely correct. Second, in some situations it makes good sense, at least in the short term, to use violence and to behave selfishly, fearfully or vaingloriously. If our lives seem to be at stake, after all, we are unlikely to have many scruples about stealing a loaf of bread; if we perceive someone as a deadly threat, we may well want to attack first, while his guard is down; if we think that there are lots of potential attackers out there, it is going to make perfect sense to get a reputation as someone who should not be messed with. In Hobbes’s words, “the wickedness of bad men also compels good men to have recourse, for their own protection, to the virtues of war, which are violence and fraud” (De Cive, Epistle Dedicatory). As well as being more complex than first appears, Hobbes’s argument becomes very difficult to refute.

Underlying this most basic argument is an important consideration about insecurity. As we shall see, Hobbes places great weight on contracts (thus some interpreters see Hobbes as heralding a market society dominated by contractual exchanges). In particular, he often speaks of “covenants,” by which he means a contract where one party performs his part of the bargain later than the other. In the state of nature such agreements are not going to work. Only the weakest will have good reason to perform the second part of a covenant, and then only if the stronger party is standing over them. Yet a huge amount of human cooperation relies on trust, that others will return their part of the bargain over time. A similar point can be made about property, most of which we cannot carry about with us and watch over. This means we must rely on others respecting our possessions over extended periods of time. If we cannot do this, then many of the achievements of human society that involve putting hard work into land (farming, building) or material objects (the crafts, or modern industrial production, still unknown in Hobbes’s time) will be near impossible.

One can reasonably object to such points: Surely there are basic duties to reciprocate fairly and to behave in a trustworthy manner? Even if there is no government providing a framework of law, judgment and punishment, do not most people have a reasonable sense of what is right and wrong, which will prevent the sort of contract-breaking and generalized insecurity that Hobbes is concerned with? Indeed, should not our basic sense of morality prevent much of the greed, pre-emptive attack and reputation-seeking that Hobbes stressed in the first place? This is the crunch point of Hobbes’s argument, and it is here (if anywhere) that one can accuse Hobbes of pessimism. He makes two claims. The first concerns our duties in the state of nature (that is, the so-called “right of nature”). The second follows from this, and is less often noticed: it concerns the danger posed by our different and variable judgments of what is right and wrong.

On Hobbes’s view the right of nature is quite simple to define. Naturally speaking (that is, outside of civil society), we have a right to do whatever we think will ensure our self-preservation. The worst that can happen to us is violent death at the hands of others. If we have any rights at all, if (as we might put it) nature has given us any rights whatsoever, then the first is surely this: the right to prevent violent death befalling us. But Hobbes says more than this, and it is this point that makes his argument so powerful. We do not just have a right to ensure our self-preservation: we each have a right to judge what will ensure our self-preservation. And this is where Hobbes’s picture of humankind becomes important. Hobbes has given us good reasons to think that human beings rarely judge wisely. Yet in the state of nature no one is in a position to successfully define what is good judgment. If I judge that killing you is a sensible or even necessary move to safeguard my life, then, in Hobbes’s state of nature, I have a right to kill you. Others might judge the matter differently, of course. Almost certainly you will have quite a different view of things (perhaps you were just stretching your arms, not raising a musket to shoot me). Because we are all insecure, because trust is more or less absent, there is little chance of our sorting out misunderstandings peacefully, nor can we rely on some (trusted) third party to decide whose judgment is right. We all have to be judges in our own causes, and the stakes are very high indeed: life or death.

For this reason Hobbes makes very bold claims that sound totally amoral. “To this war of every man against every man,” he says, “this also is consequent [i.e., it follows]: that nothing can be unjust. The notions of right and wrong, justice and injustice have no place [in the state of nature]” (Leviathan, xiii.13). He further argues that in the state of nature we each have a right to all things, “even to one another’s body” (Leviathan, xiv.4). Hobbes is dramatizing his point, but the core is defensible. If I judge that I need such and such (an object, another person’s labor, another person’s death) to ensure my continued existence, then in the state of nature, there is no agreed authority to decide whether I’m right or wrong. New readers of Hobbes often suppose that the state of nature would be a much nicer place, if only he were to picture human beings with some basic moral ideas. But this is naïve: unless people share the same moral ideas, not just at the level of general principles but also at the level of individual judgment, then the challenge he poses remains unsolved: human beings who lack some shared authority are almost certain to fall into dangerous and deadly conflict.

There are different ways of interpreting Hobbes’s view of the absence of moral constraints in the state of nature. Some think that Hobbes is imagining human beings who have no idea of social interaction and therefore no ideas about right and wrong. In this case, the natural condition would be a purely theoretical construction, and would demonstrate what both government and society do for human beings. (A famous statement about the state of nature in De Cive (viii.1) might support this interpretation: “looking at men as if they had just emerged from the earth like mushrooms and grown up without any obligation to each other…”) Another, complementary view reads Hobbes as a psychological egoist, so that, in the state of nature as elsewhere, he is merely describing the interaction of ultimately selfish and amoral human beings.

Others suppose that Hobbes has a much more complex picture of human motivation, so that there is no reason to think moral ideas are absent in the state of nature. In particular, it is historically reasonable to think that Hobbes invariably has civil war in mind, when he describes our “natural condition.” If we think of civil war, we need to imagine people who have lived together and indeed still do live together: huddled together in fear in their houses, banded together as armies or guerrillas or groups of looters. The problem here is not a lack of moral ideas (far from it), but rather that moral ideas and judgments differ enormously. This means (for example) that two people who are fighting tooth and nail over a cow or a gun can both think they are perfectly entitled to the object and both think they are perfectly right to kill the other, a point Hobbes makes explicitly and often. It also enables us to see that many Hobbesian conflicts are about religious ideas or political ideals (as well as self-preservation and so on), as in the British Civil War raging while Hobbes wrote Leviathan, and in the many violent sectarian conflicts throughout the world today.

In the end, though, whatever account of the state of nature and its (a)morality we attribute to Hobbes, we must remember that it is meant to function as a powerful and decisive threat: if we do not heed Hobbes’s teachings and fail to respect existing political authority, then the natural condition and its horrors of war await us.

a. The Laws of Nature and the Social Contract

Hobbes thinks the state of nature is something we ought to avoid, at any cost except our own self-preservation (this being our “right of nature,” as we saw above). But what sort of ought is this? There are two basic ways of interpreting Hobbes here. It might be a counsel of prudence: avoid the state of nature, if you’re concerned to avoid violent death. In this case Hobbes’s advice only applies to us (i) if we agree that violent death is what we should fear most and should therefore avoid; and (ii) if we agree with Hobbes that only an unaccountable sovereign stands between human beings and the state of nature. This line of thought fits well with an egoistic reading of Hobbes, but it faces serious problems, as will be seen.

The other way of interpreting Hobbes is not without problems either. This takes Hobbes to be saying that we ought, morally speaking, to avoid the state of nature. We have a duty to do what we can to avoid this situation arising, and a duty to end it, if at all possible. Hobbes often makes his view clear, that we have such moral obligations. But then two difficult questions arise: Why these obligations? And why are they obligatory?

Hobbes frames the issues in terms of an older vocabulary, using the idea of natural law that many ancient and medieval philosophers had relied on. Like them, he thinks that human reason can discern some eternal principles to govern our conduct. These principles are independent of (though also complementary to) whatever moral instruction we might get from God or religion. In other words, they are laws given by nature rather than revealed by God. But Hobbes makes radical changes to the content of these so-called laws of nature. In particular, he does not think that natural law provides any scope whatsoever to criticize or disobey the actual laws made by a government. He thus disagrees with those Protestants who thought that religious conscience might sanction disobedience of immoral laws, and with Catholics who thought that the commandments of the Pope have primacy over those of national political authorities.

Although he sets out nineteen laws of nature, it is the first two that are politically crucial. A third, which stresses the importance of keeping to contracts we have entered into, is important in Hobbes’s moral justifications of obedience to the sovereign. (The remaining sixteen can be quite simply encapsulated in the formula, do as you would be done by. While the details are important for scholars of Hobbes, they do not affect the overall theory and will be ignored here.)

The first law reads as follows:

Every man ought to endeavor peace, as far as he has hope of obtaining it, and when he cannot obtain it, that he may seek and use all helps and advantages of war. (Leviathan, xiv.4)

This repeats the points we have already seen about our right of nature, so long as peace does not appear to be a realistic prospect. The second law of nature is more complicated:

That a man be willing, when others are so too, as far-forth as for peace and defense of himself he shall think it necessary, to lay down this right to all things, and be contented with so much liberty against other men, as he would allow other men against himself. (Leviathan, xiv.5)

What Hobbes tries to tackle here is the transition from the state of nature to civil society. But how he does this is misleading and has generated much confusion and disagreement. The way that Hobbes describes this second law of nature makes it look as if we should all put down our weapons, give up (much of) our “right of nature,” and jointly authorize a sovereign who will tell us what is permitted and punish us if we do not obey. But the problem is obvious. If the state of nature is anything like as bad as Hobbes has argued, then there is just no way people could ever make an agreement like this or put it into practice.

At the end of Leviathan, Hobbes seems to concede this point, saying “there is scarce a commonwealth in the world whose beginnings can in conscience be justified” (Review and Conclusion, 8). That is: governments have invariably been foisted upon people by force and fraud, not by collective agreement. But Hobbes means to defend every existing government that is powerful enough to secure peace among its subjects—not just a mythical government that’s been created by a peaceful contract out of a state of nature. His basic claim is that we should behave as if we had voluntarily entered into such a contract with everyone else in our society—everyone else, that is, except the sovereign authority.

In Hobbes’s myth of the social contract, everyone except the person or group who will wield sovereign power lays down their “right to all things.” They agree to limit drastically their right of nature, retaining only a right to defend their lives in case of immediate threat. (How limited this right of nature becomes in civil society has caused much dispute, because deciding what is an immediate threat is a question of judgment. It certainly permits us to fight back if the sovereign tries to kill us. But what if the sovereign conscripts us as soldiers? What if the sovereign looks weak and we doubt whether he can continue to secure peace…?) The sovereign, however, retains his (or her, or their) right of nature, which we have seen is effectively a right to all things: to decide what everyone else should do, to decide the rules of property, to judge disputes and so on. Hobbes concedes that there are moral limits on what sovereigns should do (God might call a sovereign to account). However, since in any case of dispute the sovereign is the only rightful judge (on this earth, that is), those moral limits make no practical difference. In every moral and political matter, the decisive question for Hobbes is always: who is to judge? As we have seen, in the state of nature, each of us is judge in our own cause, part of the reason why Hobbes thinks it is inevitably a state of war. Once civil society exists, the only rightful judge is the sovereign.

b. Why Should We Obey the Sovereign?

If we had all made a voluntary contract, a mutual promise, then it might seem half-way plausible to think we have an obligation to obey the sovereign (although even this requires the claim that promising is a moral value that overrides all others). If we have been conquered or, more fortunately, have simply been born into a society with an established political authority, this seems quite improbable. Hobbes has to make three steps here, all of which have seemed weak to many of his readers. First of all, he insists that promises made under threat of violence are nonetheless freely made, and just as binding as any others. Second, he has to put great weight on the moral value of promise keeping, which hardly fits with the absence of duties in the state of nature. Third, he has to give a story of how those of us born and raised in a political society have made some sort of implied promise to each other to obey, or at least, he has to show that we are bound (either morally or out of self-interest) to behave as if we had made such a promise.

In the first place, Hobbes draws on his mechanistic picture of the world, to suggest that threats of force do not deprive us of liberty. Liberty, he says, is freedom of motion, and I am free to move whichever way I wish, unless I am literally enchained. If I yield to threats of violence, that is my choice, for physically I could have done otherwise. If I obey the sovereign for fear of punishment or in fear of the state of nature, then that is equally my choice. Such obedience then comes, for Hobbes, to constitute a promise that I will continue to obey.

Second, promises carry a huge moral weight for Hobbes, as they do in all social contract theories. The question, however, is why we should think they are so important. Why should my (coerced) promise oblige me, given the wrong you committed in threatening me and demanding my valuables? Hobbes has no good answer to this question (but see below, on egoistic interpretations of Hobbes’s thinking here). His theory suggests that (in the state of nature) you could do me no wrong, as the right of nature dictates that we all have a right to all things. Likewise, promises do not oblige in the state of nature, inasmuch as they go against our right of nature. In civil society, the sovereign’s laws dictate what is right and wrong; if your threat was wrongful, then my promise will not bind me. But as the sovereign is outside of the original contract, he sets the terms for everyone else: so his threats create obligations.

As this suggests, Hobbesian promises are strangely fragile. Implausibly binding so long as a sovereign exists to adjudicate and enforce them, they lose all power should things revert to a state of nature. Relatedly, they seem to contain not one jot of loyalty. To be logically consistent, Hobbes needs to be politically implausible. Now there are passages where Hobbes sacrifices consistency for plausibility, arguing we have a duty to fight for our (former) sovereign even in the midst of civil war. Nonetheless the logic of his theory suggests that, as soon as government starts to weaken and disorder sets in, our duty of obedience lapses. That is, when the sovereign power needs our support, because it is no longer able to coerce us, there is no effective judge or enforcer of covenants, so that such promises no longer override our right of nature. This turns common sense on its head. Surely a powerful government can afford to be challenged, for instance by civil disobedience or conscientious objection? But when civil conflict and the state of nature threaten, in other words when government is failing, then we might reasonably think that political unity is as morally important as Hobbes always suggests. A similar question of loyalty also comes up when the sovereign power has been usurped—when Cromwell has supplanted the King, when a foreign invader has ousted our government. Right from the start, Hobbes’s critics saw that his theory makes turncoats into moral heroes: our allegiance belongs to whoever happens to be holding the gun(s). Perversely, the only crime the makers of a coup can commit is to fail.

Why does this problem come about? To overcome the fact that his contract is a fiction, Hobbes is driven to construct a “sort of” promise out of the fact of our subjugation to whatever political authority exists. He stays wedded to the idea that obedience can only find a moral basis in a “voluntary” promise, because only this seems to justify the almost unlimited obedience and renunciation of individual judgment he is determined to prove. It is no surprise that Hobbes’s arguments creak at every point: nothing could bear the weight of justifying such an overriding duty.

All the difficulties in finding a reliable moral obligation to obey might tempt us back to the idea that Hobbes is some sort of egoist. However, the difficulties with this tack are even greater. There are two sorts of egoism commentators have attributed to Hobbes: psychological and ethical. The first theory says that human beings always act egoistically, the second that they ought to act egoistically. Either view might support this simple idea: we should obey the sovereign, because his political authority is what keeps us from the evils of the natural condition. But the basic problem with such egoistic interpretations, from the point of view of Hobbes’s system of politics, is shown when we think about cases where selfishness seems to conflict with the commands of the sovereign, for example, where illegal conduct will benefit us or keep us from danger. For a psychologically egoist agent, such behavior will be irresistible; for an ethically egoist agent, it will be morally obligatory. Now, providing the sovereign is sufficiently powerful and well-informed, he can prevent many such cases arising by threatening and enforcing punishments of those who disobey. Effective threats of punishment mean that obedience is in our self-interest. But such threats will not be effective when we think our disobedience can go undetected. After Orwell’s 1984 we can imagine a state that is so powerful that no reasonable person would ever think disobedience could pay. But for Hobbes, such a powerful sovereign was not even conceivable: he would have had to assume that there would be many situations where people could reasonably hope to “get away with it.” (Likewise, under non-totalitarian, liberal politics, there are many situations where illegal behavior is very unlikely to be detected or punished.) So, still thinking of egoistic agents, the more people do get away with it, the more reason others have to think they can do the same. Thus the problem of disobedience threatens to “snowball,” undermining the sovereign and plunging selfish agents back into the chaos of the state of nature.

In other words, sovereignty as Hobbes imagined it, and liberal political authority as we know it, can only function where people feel some additional motivation apart from pure self-interest. Moreover, there is strong evidence that Hobbes was well aware of this. Part of Hobbes’s interest in religion (a topic that occupies half of Leviathan) lies in its power to shape human conduct. Sometimes this does seem to work through self-interest, as in crude threats of damnation and hell-fire. But Hobbes’s main interest lies in the educative power of religion, and indeed of political authority. Religious practices, the doctrines taught in the universities (!), the beliefs and habits inculcated by the institutions of government and society: how these can encourage and secure respect for law and authority seems to be even more important to Hobbes’s political solutions than his theoretical social contract or shaky appeals to simple self-interest.

What are we to conclude, then, given the difficulties in finding a reliable moral or selfish justification for obedience? In the end, for Hobbes, everything rides on the value of peace. Hobbes wants to say both that civil order is in our “enlightened” self-interest, and that it is of overwhelming moral value. Life is never going to be perfect for us, and life under the sovereign is the best we can do. Recognizing this aspect of everyone’s self-interest should lead us to recognize the moral value of supporting whatever authority we happen to live under. For Hobbes, this moral value is so great, and the alternatives so stark, that it should override every threat to our self-interest except the imminent danger of death. The million-dollar question is then: is a life of obedience to the sovereign really the best human beings can hope for?

c. Life Under the Sovereign

Hobbes has definite ideas about the proper nature, scope and exercise of sovereignty. Much that he says is cogent, and much of it can reduce the worries we might have about living under this drastically authoritarian-sounding regime. Many commentators have stressed, for example, the importance Hobbes places upon the rule of law. His claim that much of our freedom, in civil society, “depends on the silence of the laws” is often quoted (Leviathan, xxi.18). In addition, Hobbes makes many points that are obviously aimed at contemporary debates about the rights of King and Parliament, especially about the sovereign’s rights as regards taxation and the seizure of property, and about the proper relation between religion and politics. Some of these points continue to be relevant; others are obviously anachronistic: evidently Hobbes could not have imagined the modern state, with its vast bureaucracies, massive welfare provision and complicated interfaces with society. Nor could he have foreseen how incredibly powerful the state might become, meaning that sovereigns such as Hitler or Stalin might starve, brutalize and kill their subjects, to such an extent that the state of nature looks clearly preferable.

However, the problem with all of Hobbes’s notions about sovereignty is that, on his account, it is not Hobbes the philosopher, nor we the citizens, who decide what counts as the proper nature, scope or exercise of sovereignty. He faces a systematic problem: justifying any limits or constraints on the sovereign involves making judgments about moral or practical requirements. But one of his greatest insights, still little recognized by many moral philosophers, is that any right or entitlement is only practically meaningful when combined with a concrete judgment as to what it dictates in some given case. Hobbes’s own failure, however understandable, to foresee the growth of government and its powers only supports this thought: that the proper nature, scope or exercise of sovereignty is a matter of complex judgment. Alone among the people who comprise Hobbes’s commonwealth, it is the sovereign who judges what form he should appear in, how far he should reach into the lives of his subjects, and how he should exercise his powers.

It should be added that the one part of his system that Hobbes concedes not to be proven with certainty is just this question: who or what should constitute the sovereign power. It was natural for Hobbes to think of a King, or indeed a Queen (he was born under Elizabeth I). But he was certainly very familiar with ancient forms of government, including aristocracy (government by an elite) and democracy (government by the citizens, who formed a relatively small group within the total population). Hobbes was also aware that an assembly such as Parliament could constitute a sovereign body. All have advantages and disadvantages, he argues. But the unity that comes about from having a single person at the apex, together with fixed rules of succession that pre-empt dispute about who this person should be, makes monarchy Hobbes’s preferred option.

In fact, if we want to crack open Hobbes’s sovereign, to be able to lay down concrete ideas about its nature and limits, we must begin with the question of judgment. For Hobbes, dividing capacities to judge between different bodies is tantamount to letting the state of nature straight back in. “For what is it to divide the power of a commonwealth, but to dissolve it; for powers divided mutually destroy each other” (Leviathan, xxix.12; cf. De Cive, xii.5). Beyond the example of England in the 1640s, Hobbes hardly bothers to argue the point, although it is crucial to his entire theory. Always in his mind is the Civil War that arose when Parliament claimed the right to judge rules of taxation, and thereby prevented the King from ruling and making war as he saw fit, and when churches and religious sects claimed prerogatives that went against the King’s decisions.

Especially given modern experiences of the division of powers, however, it is easy to see that these examples are extreme and atypical. We might recall the American constitution, where powers of legislation, execution and case-by-case judgment are separated (to Congress, President and the judiciary respectively) and counter-balance one another. Each of these bodies is responsible for judging different questions. There are often, of course, boundary disputes, as to whether legislative, executive or judicial powers should apply to a given issue, and no one body is empowered to settle this crucial question of judgment. Equally obviously, however, such disputes have not led to a state of nature (well, at least if we think of the US after the Civil War). For Hobbes it is simply axiomatic that disputation as to who should judge important social and political issues spells the end of the commonwealth. For us, it is equally obvious that only a few extreme forms of dispute have this very dangerous power. Dividing the powers that are important to government need not leave a society more open to those dangerous conflicts. Indeed, many would now argue that political compromises which provide different groups and bodies with independent space to judge certain social or political issues can be crucial for preventing disputes from escalating into violent conflict or civil war.

6. Conclusion

What happens, then, if we do not follow Hobbes in his arguments that judgment must, by necessity or by social contract or both, be the sole province of the sovereign? If we are optimists about the power of human judgment, and about the extent of moral consensus among human beings, we have a straightforward route to the concerns of modern liberalism. Our attention will not be on the question of social and political order, but rather on how to maximize liberty, how to define social justice, how to draw the limits of government power, and how to realize democratic ideals. We will probably interpret Hobbes as a psychological egoist, and think that the problems of political order that obsessed him were the product of an unrealistic view of human nature, or unfortunate historical circumstances, or both. In this case, I suggest, we might as well not have read Hobbes at all.

If we are less optimistic about human judgment in morals and politics, however, we should not doubt that Hobbes’s problems remain our problems. But hindsight shows grave limitations to his solutions. Theoretically, Hobbes fails to prove that we have an almost unlimited obligation to obey the sovereign. His arguments that sovereignty (the power to judge moral and political matters, and enforce those judgments) cannot be divided are not only weak; they are simply refuted by the (relatively) successful distribution of powers in modern liberal societies. Not least, the horrific crimes of twentieth-century dictatorships show beyond doubt that judgment about right and wrong cannot be a question only for our political leaders.

If Hobbes’s problems are real and his solutions only partly convincing, where will we go? It might reasonably be thought that this is the central question of modern political thought. We will have no doubt that peaceful coexistence is one of the greatest goods of human life, something worth many inconveniences, sacrifices and compromises. We will see that there is moral force behind the laws and requirements of the state, simply because human beings do indeed need authority and systems of enforcement if they are to cooperate peacefully. But we can hardly accept that, because human judgment is weak and faulty, there can be only one judge of these matters, precisely because that judge might turn out to be very faulty indeed. Our concern will be how we can effectively divide power between government and people, while still ensuring that important questions of moral and political judgment are peacefully adjudicated. We will be concerned with the standards and institutions that provide for compromise between many different and conflicting judgments. And all the time, we will remember Hobbes’s reminder that human life is never without inconvenience and troubles, that we must live with a certain amount of bad, to prevent the worst: fear of violence, and violent death.

7. References and Further Reading

  • Edwards, Alistair (2002) “Hobbes” in Interpreting Modern Political Philosophy: From Machiavelli to Marx, eds. A Edwards and J Townshend (Palgrave Macmillan, Houndmills)
    • A very helpful overview of key interpretative debates about Hobbes in the twentieth century.
  • Hill, Christopher (1961/1980) The Century of Revolution, 1603-1714, second ed (Routledge, London)
    • The classic work on the history and repercussions of England’s civil war.
  • Hobbes, Thomas (1998 [1642]) On the Citizen, ed & trans Richard Tuck and Michael Silverthorne (Cambridge University Press, Cambridge)
    • The best translation of Hobbes’s most straightforward book, De Cive.
  • Hobbes, Thomas (1994 [1651/1668]) Leviathan, ed Edwin Curley (Hackett, Indianapolis)
    • The best edition of Hobbes’s magnum opus, including extensive additional material and many important variations (ignored by all other editions) between the English text and later Latin edition.
  • Sorell, Tom (1986) Hobbes (Routledge & Kegan Paul, London)
    • A concise and well-judged account of Hobbes’s life and works.
  • Sorell, Tom, ed (1996) The Cambridge Companion to Hobbes (Cambridge University Press, Cambridge)
    • An excellent set of essays on all aspects of Hobbes’s intellectual endeavors.

Author Information

Garrath Williams
Email: g.d.williams@lancaster.ac.uk
Lancaster University
United Kingdom

Nasir Khusraw (1004-1060)

Abu Mo’in Hamid al-Din Nasir ibn Khusraw is an important figure in the development of Ismaili philosophy. Much of his biography and philosophical ideology has been obtained through fragmented texts, both in poetry and prose. Born into a politically connected family, Khusraw was well-educated in the sciences and humanities. Having spent most of his life occupying prestigious positions within the Saljuq court, Khusraw converted to the Ismaili faith at the age of forty after careful study. He spent the rest of his life writing and advocating for the Ismaili faith, and eventually was forced into exile by Sunni authorities.

Like that of other Ismaili philosophers, Khusraw’s cosmology is heavily inspired by Neoplatonism. His metaphysics describes a God from which everything emanates and toward which everything consistently strives to return. Through God, existence is cast into being through Universal Soul and Universal Intellect. Each of these concepts provides the foundation for material objects, ascending from minerals to human beings. Within each human being exists a soul and intellect, imperfect in form but existing within the Universals. Khusraw interweaves his metaphysics within the Shi’i doctrine, requiring a divinely inspired guide to assist us in our journey to reconnect with Universal Intellect and Soul. In holding to this cosmogonic description, Khusraw distinguishes his philosophy from previous Ismaili thought introduced by al-Farabi and picked up by Ibn Sina and al-Kirmani.

Table of Contents

  1. Life
  2. Philosophy
  3. References and Further Reading

1. Life

In striking contrast to other Ismaili writers of the time (s.v., Hamid al-Din al-Kirmani; Abu Ya‘qub al-Sijistani), many sources of information exist pertaining to Khusraw’s life. Documentation was recorded, with varying degrees of accuracy, by Khusraw himself, by a (hostile) contemporary, and by later historians. Since his death, Khusraw has been included in every major literary or historical survey of Ismailism. Khusraw’s life can be divided into four periods: his early years up to the age of forty (discernible from fragments of various texts); his conversion to Ismailism (of which he has left two different versions in the form of prose and poetry); his seven-year journey (documented in Safarnama); and his years of preaching followed by persecution and exile (drawn primarily from his poetry, but also a few statements in his philosophical works).

In 1004, Abu Mo’in Hamid al-Din Nasir ibn Khusraw was born in Qobadiyan, in the district of Marv, in the eastern Iranian province of Khurasan. Along with two of his brothers, Khusraw occupied a high position in the administrative ranks of the Saljuq court, reportedly in the revenue department. Evidence also suggests that he was familiar with the court of the previous dynasty, the Ghaznavids. Judging by the quality of his writings, he received an excellent education in the sciences, literatures and philosophies of his time, including the study of Greek and Neoplatonic philosophy. In his writing, Khusraw reports examining the doctrines of the different Islamic schools and not being satisfied until he found and understood the Ismaili faith. As a result of his conversion to Ismailism he embarked on a seven-year journey, during which time he spent three years in the Ismaili court in Cairo under the Fatimid caliph, al-Mustansir (1029-1094). The Fatimid dynasty (909-1171) aimed at creating an Islamic state based on Ismaili tenets, and thus presented a direct theological and military challenge to the Sunni ‘Abbasid caliphate based in Baghdad. Khusraw left Cairo as the head (hujjat) of Ismaili missionary activities in his home province of Khurasan. After leaving Cairo, Khusraw was forced into exile by the Sunni authorities. He spent the rest of his life exiled in the Pamir Mountains in Badakhshan, located in modern-day Tajikistan and Afghanistan.

2. Philosophy

Khusraw’s philosophical works reveal a strong Neoplatonic structure and vocabulary.  For example, his cosmogony closely follows Plotinus, moving from God and God’s word (logos) to Intellect, Soul, and the world of Nature.  Underlying each of the Ismaili cosmogonic systems is a fundamental division of the world into two realms, the esoteric (batin) and the exoteric (zahir).  From this division, everything in the physical world points to its counterpart in the spiritual, which is seen as its source, or true form.  The cosmogonic structure itself reveals a purposeful, providential unfolding from the spiritual realm into the physical world.  Conversely, as a reflection, the physical world seeks to grasp the spiritual realm and comprehend it.    In holding to this cosmogonic description, Khusraw follows his fellow Ismailis (Nasafi and al-Sijistani) while differentiating his theory from the structure introduced by al-Farabi and later adopted by Ibn Sina and the Ismaili philosopher al-Kirmani.

Khusraw begins with a discussion of tawhid (oneness, God’s unity), the clear understanding of which is the only way to achieve spiritual perfection. For Khusraw, God Himself is indescribable, beyond all categories of being and non-being (nothing which has an opposite can be ascribed to Him, since that would be limiting Him to human concepts). However, from God emerges his Word (kalima), ‘Be!’, which brings into existence Universal Intellect, perfect in potentiality and actuality. Universal Intellect transcends time and space, containing all being within itself. Universal Intellect enjoys a worshipful intimacy with God and derives perfection from this intimacy. From this worship emerges Universal Soul, perfect in potentiality but not in actuality because it is separated from God by Intellect. Universal Soul recognizes its separation from God, and moves closer to God in a desire for the perfection enjoyed by Intellect. Through its search for perfection, Universal Soul introduces the first movement into the entire structure, manifest in time and space.

The entire cosmos is set into motion through the movement of Universal Soul. As a corollary, being is differentiated into two sets of opposites: hot and cold, wet and dry. Derived from these sets of opposites are the four elements: earth, air, fire, and water. From these four elements arises the successive development of minerals, plants, and animals. Finally, as the summit of physical creation, human beings arise. Within each human being exists an individual intellect and individual soul manifesting the same characteristics (but on a smaller level) as the universals. In fact, the entire cosmos is formed on a matrix of Intellect and Soul; everything within the cosmos displays original intelligence and the search for perfection exhibited by the soul.

Khusraw’s ethics grow from and reflect this cosmogony. Each individual’s task is to recognize his or her own imperfections and then move to correct them, seeking the closest relationship possible with God. For Khusraw, this is achieved by stringent and repeated application of the intellect to both physical and spiritual matters. In order to correct these imperfections a believer must find a guide and study diligently, perform all required religious acts with a full understanding, and supplement new understanding with higher levels of worldly activity. As an Ismaili, Khusraw held the Shi‘i doctrine that God would not send a revelation without a guide to interpret it. For the Ismailis, this guide must be a living person, the Imam of the Time. As a living bridge between the two realms, this person must be divinely inspired, infallible, and perfectly capable of providing guidance in spiritual and worldly affairs.

3. References and Further Reading

The following sources elucidate Khusraw’s philosophy:

  • H. Corbin, “Nasir-i Khusrau and Iranian Ismailism,” in The Cambridge History of Iran: Volume 4, ed., R. N. Frye (Cambridge 1975), pp. 520-42 and 689-90;
  • A. Hunsberger, “Nasir Khusraw: Fatimid Intellectual,” in F. Daftary, ed., Intellectual Traditions in Islam (London 2000), pp. 112-29;
  • A. Hunsberger, Nasir Khusraw’s Doctrine of the Soul: From the Universal Intellect to the Physical World in Ismaili Philosophy, PhD thesis, Columbia University, New York, 1992;
  • S. Meskoob, “The Origin and Meaning of ‘Aql (Reason) in the View of Nasir Khusraw,” Iran Nameh, 6 (1989), pp. 239-57, and 7 (1989), pp. 405-29.

For a full bibliography of Nasir Khusraw’s works and ideas, see:

  • A. C. Hunsberger, Nasir Khusraw, the Ruby of Badakhshan: A Portrait of the Persian Poet, Traveller and Philosopher (London 2000).

For works still in manuscript, see:

  • I. K. Poonawala, Biobibliography of Ismaili Literature, Malibu, Calif., 1977, p. 123.

Author Information

Alice C. Hunsberger
Email: info@iis.ac.uk
Institute of Ismaili Studies
United Kingdom