Higher-Order Theories of Consciousness

The most fundamental and commonly used notion of the term ‘conscious’ in philosophical circles is captured by Thomas Nagel’s famous “what it is like” sense (Nagel 1974). When I am in a conscious mental state, there is “something it is like” for me to be in that state from the subjective or first-person point of view. When I smell a rose or have a conscious visual experience, there is something it “seems” or “feels like” from my perspective. This is primarily the sense of “conscious state” that will be used throughout this entry. There is also something it is like to be a conscious creature whereas there is nothing it is like to be a table or tree.

Representational theories of consciousness attempt to reduce consciousness to “mental representations” rather than directly to neural or other physical states. This approach has been fairly popular over the past few decades. Examples include first-order representationalism (FOR), which attempts to explain conscious experience primarily in terms of world-directed (or first-order) intentional states (Tye 2005), as well as several versions of higher-order representationalism (HOR), which holds that what makes a mental state M conscious is that it is the object of some kind of higher-order mental state directed at M (Rosenthal 2005, Gennaro 2012). The primary focus of this entry is on HOR and especially higher-order thought (HOT) theory. The key question that should be answered by any theory of consciousness is: What makes a mental state a conscious mental state?

Section 1 introduces the overall representationalist approach to consciousness and briefly discusses Tye’s FOR. Section 2 presents three major versions of HOR: higher-order thought theory, dispositional higher-order thought theory, and higher-order perception theory. In section 3, a number of common and important objections and replies are presented. Section 4 briefly outlines a close connection between HOT theory and conceptualism, that is, the claim that the representational content of a perceptual experience is entirely determined by the conceptual capacities the perceiver brings to bear in her experience. Section 5 examines several hybrid higher-order and “self-representational” theories of consciousness which all hold that conscious states are self-directed in some way. Section 6 addresses the potentially damaging claim that HOT theory requires neural activity in the prefrontal cortex (PFC) in order for one to have conscious states.

Table of Contents

  1. Representationalism
  2. Higher-Order Representationalism
    a. Higher-Order Thought (HOT) Theory
    b. Dispositional HOT Theory
    c. Higher-Order Perception (HOP) Theory
  3. Objections and Replies
  4. HOT Theory and Conceptualism
  5. Hybrid Higher-Order and Self-Representational Theories
  6. HOT Theory and the Prefrontal Cortex
  7. References and Further Reading

1. Representationalism

Some current theories attempt to reduce consciousness in mentalistic terms, such as ‘thoughts’ and ‘awareness,’ rather than directly in neurophysiological terms. One popular approach is to reduce consciousness to mental representations of some kind. The notion of a “representation” is of course very general and can even be applied to pictures and signs. Much of what goes on in the brain might also be understood in a representational way. For example, mental events represent outer objects partly because they are caused by such objects in, say, cases of veridical visual perception. Philosophers often call such mental states “intentional states,” which have representational content, that is, mental states that are “about” or “directed at” something, as when one has a thought about a house or a perception of a tree. Although intentional states, such as beliefs and thoughts, are sometimes contrasted with “phenomenal states,” such as pains and color experiences, it is clear that many conscious states have both phenomenal and intentional properties, such as in visual perceptions.

The general view that we can explain conscious mental states in terms of representational or intentional states is called “representationalism.” Although not automatically reductionistic, most versions of it do attempt such a reduction. Most representationalists believe that there is room for a second-step reduction to be filled in later by neuroscience. A related motivation for representational theories of consciousness is the belief that an account of intentionality can more easily be given in naturalistic terms, such as a causal theory whereby mental states are understood as representing outer objects via some reliable causal connection. The idea, then, is that if consciousness can be explained in representational terms and representation can be understood in purely physical terms, then there is the promise of a naturalistic theory of consciousness. Most generally, however, representationalism can be defined as the view that the phenomenal properties of conscious experience (that is, the “qualia”) can be explained in terms of the experiences’ representational properties.

It is worth noting here that the relationship between intentionality and consciousness is itself a major ongoing area of research, with some arguing that genuine intentionality actually presupposes consciousness in some way (Searle 1992, Horgan and Tienson 2002). If this is correct, then it would be impossible to reduce consciousness to intentionality, but representationalists argue that consciousness requires intentionality, not vice versa. Of course, few if any today hold the very strong Cartesian view that all intentional states are conscious. Descartes thought that mental states are essentially conscious and that there are no unconscious mental states at all. For much more on the relationship between intentionality and consciousness, see Gennaro (2012, chapter two), Chudnoff (2015), and the essays in Bayne and Montague (2011) and Kriegel (2013).

A first-order representational (FOR) theory of consciousness is one that attempts to explain and reduce conscious experience primarily in terms of world-directed (or first-order) intentional states. The two most cited FOR theories are those of Fred Dretske (1995) and Michael Tye (1995, 2000), but the emphasis here will be on Tye’s more developed theory.

Of course not all mental representations are conscious, so the key question remains: What exactly distinguishes conscious from unconscious mental states (or representations)? What makes an unconscious mental state a conscious mental state? Tye defends what he calls “PANIC theory.” The acronym “PANIC” stands for poised, abstract, non-conceptual, intentional content. Tye holds that at least some of the representational content in question is non-conceptual (N), which is to say that the subject can lack the concept for the properties represented by the experience in question, such as an experience of a certain shade of red that one has never seen before. But conscious states clearly must also have “intentional content” (IC) for any representationalist. Tye also asserts that such content is “abstract” (A) and so not necessarily about particular concrete objects. This is needed to handle hallucination cases where there are no concrete objects at all or cases where different objects look phenomenally alike. Perhaps most important for mental states to be conscious, however, is that such content must be “poised” (P), which is an importantly functional notion about what conscious states do. The “key idea is that experiences and feelings…stand ready and available to make a direct impact on beliefs and/or desires. For example…feeling hungry… has an immediate cognitive effect, namely, the desire to eat….States with nonconceptual content that are not so poised lack phenomenal character [because]…they arise too early, as it were, in the information processing” (Tye 2000, 62).

One common objection to FOR is that it does not apply to all conscious states. Some conscious states do not seem to be “about” or “directed at” anything, such as pains or anxiety, and so they would be non-representational conscious states. If so, then conscious states cannot generally be explained in terms of representational properties (Block 1996). Tye responds that pains and itches do represent in the sense that they represent parts of the body. Even hallucinations either misrepresent (which is still a kind of representation) or the conscious subject still takes them to have representational properties from the first-person point of view. Tye (2000) goes to great lengths in response to a host of alleged counter-examples to FOR. For example, with regard to conscious emotions, he says that they “are frequently localized in particular parts of the body. . . . For example, if one feels sudden jealousy, one is likely to feel one’s stomach sink . . . [or] one’s blood pressure increase” (Tye 2000, 51). He believes that something similar is true for fear or anger. Moods, however, are quite different and do not seem so easily localizable in the same way. Perhaps the most serious objection to Tye’s theory, however, is that what seems to be doing most of the work on Tye’s account is the extremely functional-sounding “poised” notion, and so he is arguably not really explaining phenomenal consciousness in entirely representational terms (Kriegel 2002). For other versions of FOR, see Harman (1990), Byrne (2001), and Droege (2003). Chalmers (2004) does an excellent job of presenting and categorizing the plethora of representationalist positions.

2. Higher-Order Representationalism

a. Higher-Order Thought (HOT) Theory

Once again, the key question is: What makes a mental state a conscious mental state? There is also a long tradition that has attempted to understand consciousness in terms of some kind of higher-order awareness (Locke 1689/1975). This view has been revived by several contemporary philosophers (Armstrong 1981, Rosenthal 1986, 1997, 2005, Lycan 1996, 2001, Gennaro 1996, 2012). The basic idea is that what makes a mental state conscious is that it is the object of some kind of higher-order representation (HOR). A mental state M becomes conscious when there is a HOR of M. A HOR is a “meta-psychological” or “meta-cognitive” state, that is, a mental state directed at another mental state (“I am in mental state M”). So, for example, my desire to write a good entry becomes conscious when I am (non-inferentially) “aware” of the desire. Intuitively, conscious states, as opposed to unconscious ones, are mental states that I am “aware of” being in some sense. Conscious mental states arise when two unconscious mental states are related in a certain way, namely, that one of them (the HOR) is directed at the other (M).

This overall idea is sometimes referred to as the Transitivity Principle (TP):

(TP) A conscious state is a state whose subject is, in some way, aware of being in it.

The corresponding idea that I could be having a conscious state while totally unaware of being in that state seems like a contradiction. A mental state of which the subject is completely unaware is clearly an unconscious state. For example, I would not be aware of having a subliminal perception and thus it is an unconscious perception. There are various kinds of HOR theory with the most common division between higher-order thought (HOT) theories and higher-order perception (HOP) theories. HOT theorists, such as David Rosenthal (2005), think it is better to understand the HOR (or higher-order “awareness”) as a thought containing concepts. HOTs are treated as cognitive states involving some kind of conceptual component. HOP theorists (Lycan 1996) urge that the HOR is a perceptual state of some kind which does not require the kind of conceptual content invoked by HOT theorists. Although HOT and HOP theorists agree on the need for a HOR theory of consciousness, they do sometimes argue for the superiority of their respective positions (Rosenthal 2004, Lycan 2004, Gennaro 2012, chapter three).

One can also find something like TP in premise 1 of Lycan’s (2001) more general argument for HOR. The entire argument runs as follows:

(1) A conscious state is a mental state whose subject is aware of being in it.

(2) The “of” in (1) is the “of” of intentionality; what one is aware of is an intentional object of the awareness.

(3) Intentionality is representational; a state has a thing as its intentional object only if it represents that thing.

Therefore,

(4) Awareness of a mental state is a representation of that state. (From 2, 3)

Therefore,

(5) A conscious state is a state that is itself represented by another of the subject’s mental states. (1, 4)

The intuitive appeal of premise 1 leads naturally to the final conclusion, (5), which is just another way of stating HOR.
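
The logical shape of the argument above can be checked with a small formal sketch. What follows is only one possible rendering, not Lycan's own formulation: the names State, Conscious, AwareOf, IntentionalObject, and Represents are assumptions introduced here for illustration, the subject is left implicit (all states are taken to belong to a single subject), and only the left-to-right direction of premise (1) is used.

```lean
-- Hypothetical formalization: the type and predicate names below are
-- assumptions introduced for illustration; they are not Lycan's notation.
theorem hor_from_premises
    {State : Type}
    (Conscious : State → Prop)
    (AwareOf IntentionalObject Represents : State → State → Prop)
    -- (1) a conscious state is one its subject is aware of being in
    (p1 : ∀ m, Conscious m → ∃ a, AwareOf a m)
    -- (2) what one is aware of is an intentional object of the awareness
    (p2 : ∀ a m, AwareOf a m → IntentionalObject a m)
    -- (3) a state has an intentional object only if it represents it
    (p3 : ∀ a m, IntentionalObject a m → Represents a m) :
    -- (5) every conscious state is represented by some state of the subject
    ∀ m, Conscious m → ∃ a, Represents a m :=
  fun m hc =>
    -- (4) follows from (2) and (3): the awareness of m represents m
    Exists.elim (p1 m hc) (fun a ha => ⟨a, p3 a m (p2 a m ha)⟩)
```

On this rendering the inference is valid, so any resistance to (5) has to target one of the premises. Note that the sketch establishes only that some state represents the conscious state; it does not by itself show that the representing state is distinct from the state represented, which is what the word “another” in (5) adds.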

A related rationale for HOR, and HOT theory in particular, can be put as follows (based on Rosenthal 2004): A non-HOT theorist might still agree with HOT theory as an account of introspection or reflection, namely, that it involves a conscious thought about a mental state. This seems to be a fairly commonsense definition of introspection that includes the notion that introspection involves conceptual activity. It also seems reasonable for anyone to hold that when a mental state is unconscious, there is no HOT at all. But then it stands to reason that there should be something “in between” those two cases, that is, when one has a first-order conscious state. So what is in between no HOT at all and a conscious HOT? The answer is an unconscious HOT, which is precisely what HOT theory says, that is, a first-order conscious state is accompanied by an unconscious HOT. Moreover, this explains what happens when there is a transition from a first-order conscious state to an introspective state: an unconscious HOT becomes conscious.

Still, it might seem that HOT theory results in circularity by defining consciousness in terms of HOTs. It also might seem that an infinite regress results because a conscious mental state must be accompanied by a HOT, which, in turn, must be accompanied by another HOT ad infinitum. However, as we have just seen, the standard and widely accepted reply is that when a conscious mental state is a first-order world-directed state, the higher-order thought (HOT) is not itself conscious. But when the HOT is itself conscious, there is a yet higher-order (or third-order) thought directed at the second-order state. In this case, we have introspection, which involves a conscious HOT directed at an inner mental state. When one introspects, one’s attention is directed back into one’s mind. For example, what makes my desire to write a good entry a conscious first-order desire is that there is a (non-conscious) HOT directed at the desire. In this case, my conscious focus is directed outwardly at the paper or computer screen, so I am not consciously aware of having the HOT from the first-person point of view. When I introspect that desire, however, I then have a conscious HOT (accompanied by a yet higher, third-order, HOT) directed at the desire itself (Rosenthal 1986, 1997). Indeed, it is crucial to distinguish first-order conscious states (with unconscious HOTs) from introspective states (with conscious HOTs).

[Figure 1]

HOT theorists do insist that the HOT must become aware of the lower-order (LO) state noninferentially in order to make it conscious. The point of this condition is mainly to rule out alleged counterexamples to HO theory, such as cases where I become aware of my unconscious desire to kill my boss because I have consciously inferred it from a session with a psychiatrist, or where my anger becomes conscious after making inferences based on my own behavior. The characteristic feel of such a conscious desire or anger may be absent in these cases, but since awareness of them arose via conscious inference, the HO theorist accounts for them by adding this noninferential condition.

b. Dispositional HOT Theory

Peter Carruthers (2000, 2005) has proposed a different form of HOT theory such that the HOTs are dispositional states instead of actual HOTs, though he also understands his “dispositional HOT theory” to be a form of HOP theory (Carruthers 2004). The basic idea is that the conscious status of an experience is due to its availability to higher-order thought. So “conscious experience occurs when perceptual contents are fed into a special short-term buffer memory store, whose function is to make those contents available to cause HOTs about themselves” (Carruthers 2000, 228). Some first-order perceptual contents are available to a higher-order “theory of mind mechanism,” which transforms those representational contents into conscious contents. Thus, no actual HOT occurs. Instead, according to Carruthers, some perceptual states acquire a dual intentional content, for example, a conscious experience of red not only has a first-order content of “red,” but also has the higher-order content “seems red” or “experience of red.” Thus, he also calls his theory “dual-content theory.” Carruthers makes interesting use of so-called “consumer semantics” in order to fill out his theory of phenomenal consciousness. That is, the content of a mental state depends, in part, on the powers of the organisms which “consume” that state, for example, the kinds of inferences which the organism can make when it is in that state.

Dispositional HOT theory is often criticized by those who do not see how the mere disposition toward a mental state can render it conscious (Rosenthal 2004). Recall that a key motivation for HOT theory is the Transitivity Principle (TP), but the TP clearly lends itself to an actualist HOT theory interpretation, namely, that we are aware of our conscious states and not aware of our unconscious states. And, as Rosenthal puts it, “Being disposed to have a thought about something doesn’t make one conscious of that thing, but only potentially conscious of it” (2004, 28). Thus it is natural to wonder just how dual-content theory explains phenomenal consciousness. It is difficult to understand how a dispositional HOT can render, say, a perceptual state actually conscious.

Carruthers is well aware of this objection and attempts to address it (Carruthers 2005, 55-60). He again relies heavily on consumer semantics in an attempt to show that changes in consumer systems can transform perceptual contents. That is, what a state represents will depend, in part, on the kinds of inferences that the cognitive system is prepared to make in the presence of that state, or on the kinds of behavioral control that it can exert. In that case, the presence of first-order perceptual representations to a consumer system that can deploy a “theory of mind” and concepts of experience may be sufficient to render those representations, at the same time, higher-order ones. This would confer phenomenal consciousness on such states. But the central and most serious problem remains: dual-content theory is vulnerable to the same objection raised against FOR. This point is made most forcefully by Jehle and Kriegel (2006). They point out that dual-content theory “falls prey to the same problem that bedevils FOR: It attempts to account for the difference between conscious and [un]conscious . . . mental states purely in terms of the functional roles of those states” (Jehle and Kriegel 2006, 468). Carruthers, however, is more concerned to avoid what he takes to be a problem for “actualist” HOT theory, namely, that an unbelievably large amount of cognitive (and neural) space would have to be taken up if every conscious experience is accompanied by an actual HOT.

c. Higher-Order Perception (HOP) Theory

David Armstrong (1981) and William Lycan (1996, 2004) have been the leading proponents of HOP theory in recent decades. Unlike HOTs, HOPs are not thoughts and do not have conceptual content. Rather, they are to be understood as analogous to outer perception. One major objection to HOP theory is that, unlike outer perception, there is no obvious distinct sense organ or scanning mechanism responsible for HOPs. Similarly, no distinctive sensory quality or phenomenology is involved in having HOPs whereas outer perception always involves some sensory quality. Lycan concedes the disanalogy but argues that it does not outweigh other considerations favoring HOP theory. His reply is understandable, but the objection remains a serious one and the disanalogy cannot be overstated.

Gennaro argues against Lycan’s claim that HOP theory is superior to HOT theory because, by analogy to outer perception, there is an importantly passive aspect to perception not found in thought (Gennaro 2012, chapter three). The perceptions in HOPs are too passive to account for the interrelation between HORs and first-order states. Thus, HOTs are preferable. Gennaro sometimes frames it in Kantian terms: we can distinguish between the faculties of sensibility and understanding, which must work together to make experience possible. What is most relevant here is that the passive nature of the “sensibility” (through which outer objects are given to us) is contrasted with the active and cognitive nature of the “understanding,” which thinks about and applies concepts to that which enters via the sensibility. HOTs fit this latter description better than HOPs. In any case, what ultimately justifies treating HORs as thoughts is the exercise and application of concepts to first-order states (Rosenthal 2005, Gennaro 2012, chapter four).

More recently, however, Lycan has changed his mind and no longer holds HOP theory mainly because he now thinks that attention to first-order states is sufficient for an account of conscious states and there is little reason to view the relevant attentional mechanism as intentional or as representing first-order states (Sauret and Lycan 2014). Armstrong and Lycan had indeed previously spoken of HOP “monitors” or “scanners” as a kind of attentional mechanism but now it seems that “…leading contemporary cognitive and neurological theories of attention are unanimous in suggesting that attention is not intentional” (Sauret and Lycan 2014, 365). They cite Prinz (2012), for example, who holds that attention is a psychological process that connects first-order states with working memory. Sauret and Lycan explain that “attention is the mechanism that enables subjects to become aware of their mental states” (2014, 367) and yet this “awareness of” is supposed to be a non-intentional selection of mental states. Thus, Sauret and Lycan (2014) find that Lycan’s (2001) earlier argument, discussed above, goes wrong at premise 2 and that the “of” in question need not be the “of” of intentionality. Instead, the ‘of’ is perhaps more of an “acquaintance relation” although Sauret and Lycan do not really present a theory of acquaintance, let alone one with the level of detail offered by HOT theory.

Gennaro (2015a) offers reasons to doubt that the acquaintance strategy is a better alternative. Such acquaintance relations would presumably be somehow “closer” than the representational relation. But this strategy is arguably at best trading one difficult problem for an even deeper puzzle, namely, just how to understand the allegedly intimate and nonrepresentational “awareness of” relation between HORs and first-order states. It is also more difficult to understand such “acquaintance relations” within the context of any HOR reductionist approach. Indeed, acquaintance is often taken to be unanalyzable and simple in which case it is difficult to see how it could usefully explain anything, let alone the nature of conscious states. Zahavi (2007), who is not a HOT or HOP theorist, also recognizes how unsatisfying invoking ‘acquaintance’ can be. It remains unclear as to what this acquaintance relation is supposed to be. For other variations on HOT theory, see Rolls (2004), Picciuto (2011), and Coleman (2015).

3. Objections and Replies

Several prominent objections to HOR (and counter-replies) can be found in the literature. Although some also apply to HOP theory, others are aimed more specifically at HOT theory.

First, some argue that various animals (and even infants) are not likely to have the conceptual sophistication required for HOTs, and so that would render animal (and infant) consciousness very unlikely (Dretske 1995, Seager 2004). Are cats and dogs capable of having complex higher-order thoughts such as “I am in mental state M”? Although most who bring forth this objection are not HO theorists, Carruthers (1989, 2000) is one HO theorist who actually embraces the conclusion that (most) animals do not have phenomenal consciousness.

However, perhaps HOTs need not be as sophisticated as it might initially appear, not to mention some comparative neurophysiological and experimental evidence supporting the conclusion that animals have conscious mental states (Gennaro 1993, 1996). Most HO theorists do not wish to accept the absence of animal or infant consciousness as a consequence of holding the theory. The debate has continued over the past two decades (see, for example, Carruthers 2000, 2005, 2008, 2009, and Gennaro 2004b, 2009, 2012, chapter eight). To give an example which seems to favor animal HOTs, Clayton and Dickinson and their colleagues (in Clayton, Bussey, and Dickinson 2003) have reported convincing demonstrations of memory for time in scrub jays. Scrub jays are food-caching birds, and when they have food they cannot eat, they hide it and recover it later. Because some of the food is preferred but perishable (such as crickets), it must be eaten within a few days, while other food (such as nuts) is less preferred but does not perish as quickly. In cleverly designed experiments using these facts, scrub jays are shown, even days after caching, to know not only what kind of food was where but also when they had cached it (see also Clayton, Emery, and Dickinson 2006). Such experimental results seem to show that they have episodic memory which involves a sense of self over time. This strongly suggests that the birds have some degree of meta-cognition with a self-concept (or “I-concept”) which can figure into HOTs. Further, many crows and scrub jays return alone to caches they had hidden in the presence of others and recache them in new places (Emery and Clayton 2001). This suggests that they know that others know where the food is cached, and thus, to avoid having their food stolen, they recache the food. This strongly suggests that these birds can have some mental concepts, not only about their own minds but even of other minds, which is sometimes referred to as “mindreading” ability. Of course, there are many different experiments aimed at determining the conceptual and meta-cognitive abilities of various animals, so it is difficult to generalize across species.

There does seem to be growing evidence that at least some animals can mind-read under familiar conditions. For example, Laurie Santos and colleagues show that rhesus monkeys attribute visual and auditory perceptions to others in more competitive paradigms (Flombaum and Santos 2005, Santos, Nissen, and Ferrugia 2006). Rhesus monkeys preferentially attempted to obtain food silently only in conditions in which silence was relevant to obtaining the food undetected. While a human competitor was looking away, monkeys would take grapes from a silent container, thus apparently understanding that hearing leads to knowing on the part of human competitors. Subjects reliably picked the container that did not alert the experimenter that a grape was being removed. This suggests that monkeys take into account how auditory information can change the knowledge state of the experimenter (see also, for example, the essays in Terrace and Metcalfe 2005). Some of these same issues arise with respect to infant concept possession and consciousness (see Gennaro 2012, chapter seven, Goldman 2006, Nichols and Stich 2003, but also Carruthers 2009).

A second objection has been referred to as the “problem of the rock” and is originally due to Alvin Goldman (Goldman 1993). When I have a thought about a rock, it is certainly not true that the rock becomes conscious. So why should I suppose that a mental state becomes conscious when I think about it? This is puzzling to many, and the objection forces HOT theorists to explain just how adding the HOT state changes an unconscious state into a conscious one. There have been, however, a number of responses to this kind of objection (Rosenthal 1997, Van Gulick 2000, 2004, Gennaro 2005, 2012, chapter four). Perhaps the most common theme is that there is a principled difference in the objects of the thoughts in question. For one thing, rocks and similar objects are not mental states in the first place, and HOT theorists are first and foremost trying to explain how a mental state becomes conscious. The objects of the HOTs must be “in the head.”

Third, one might object to any reductionist theory of consciousness with something like Chalmers’s hard problem, that is, how or why brain activity produces conscious experience (Chalmers 1995). However, it is first important to keep in mind that HOT theory is unlike reductionist accounts in non-mentalistic terms and so is arguably immune to Chalmers’s criticism about the plausibility of theories which attempt a direct reduction to neurophysiology (Gennaro 2005). On HOT theory, there is no problem about how a specific brain activity “produces” conscious experience, nor is there an issue about any a priori or a posteriori relation between brains and consciousness. The issue instead is how HOT theory might be realized in our brains, for which there seems to be some evidence thus far (Gennaro 2012, chapters four and nine).

Still, it might be asked just how exactly any HOR theory really explains the subjective or phenomenal aspect of conscious experience. How or why does a mental state come to have a first-person qualitative “what it is like” aspect by virtue of the presence of a HOR directed at it? HOR theorists have been slow to address this problem though a number of overlapping responses have emerged. Some argue that this objection misconstrues the main and more modest purpose of their HOT theories. The claim is that HOT theories are theories of consciousness only in the sense that they are attempting to explain what differentiates conscious from unconscious states, that is, in terms of a higher-order awareness of some kind. A full account of “qualitative properties” or “sensory qualities” (which can themselves be unconscious) can be found elsewhere in their work, but is independent of their theory of consciousness (Rosenthal 1991, 2005, Lycan 1996). Thus, a full explanation of phenomenal consciousness does require more than a HOR theory but that is no objection to HOR theories as such. There is also a concern that proponents of the hard problem unjustly raise the bar as to what would count as a viable reductionist explanation of consciousness so that any such reductionist attempt would inevitably fall short (Carruthers 2000). Part of the problem may even be a lack of clarity about what would count as an explanation of consciousness (Van Gulick 1995).

Gennaro responds that HOTs explain how conscious states occur because the concepts that figure into the HOTs are necessarily presupposed in conscious experience (Gennaro 2012, chapter four, 2005). The idea is that first we receive information via our senses (or the “faculty of sensibility”). Some of this information will then rise to the level of unconscious mental states but they do not become conscious until the more cognitive “faculty of understanding” operates on them via the application of concepts. We can arguably understand such concept application in terms of HOTs directed at first-order states. Thus, I consciously experience (and recognize) the blue house as a blue house partly because I apply the concepts “blue” and “house” (in my HOTs) to my basic perceptual states. Gennaro urges that if there is a real hard problem, it has more to do with explaining concept acquisition (Gennaro 2012, chapters six and seven).

A fourth, and very important, objection to higher-order approaches is the question of how such theories can explain cases where the HO state might misrepresent the lower-order (LO) mental state (Byrne 1997, Neander 1998, Levine 2001, Block 2011). After all, if we have a representational relation between two states, it seems possible for misrepresentation or malfunction to occur. If it does, then what explanation can be offered by the HO theorist? If my LO state registers a red percept and my HO state registers a thought about something green, then what happens? It seems that problems loom for any answer given by a HOT theorist and the cause of the problem has to do with the very nature of the HO theorist’s belief that there is a representational relation between the LO and HO states. For example, if a HOT theorist takes the option that the resulting conscious experience is reddish, then it seems that the HOT plays no role in determining the qualitative character of the experience. On the other hand, if the resulting experience is greenish, then the LO state seems irrelevant. Nonetheless, Rosenthal and Weisberg hold that the HOT determines the qualitative properties, even in so-called “targetless” or “empty” HOT cases where there is no LO state at all (Rosenthal 2005, 2011, Weisberg 2008, 2011).

Gennaro argues instead that no conscious color experience would result in such cases, that is, neither reddish nor greenish experience especially since, for example, it is difficult to see how a sole (unconscious) HOT can result in a conscious state at all (Gennaro 2012, chapter four, 2013). He argues that there must be a conceptual match, complete or partial, between the LO and HO state in order for the conscious experience to exist in the first place. Weisberg and Rosenthal argue that what really matters is how things seem to the subject and, if we can explain that, we have explained all that we need to. But the problem here is that somehow the HOT alone is what matters. Doesn’t this defeat the purpose of HOT theory which is supposed to explain state consciousness in terms of the relation between two states? Moreover, according to the theory, the lower-order state is supposed to be conscious when one has an unconscious HOT.

In the end, Gennaro argues for the more nuanced claim that:

Whenever a subject S has a HOT directed at experience e, the content c of S’s HOT determines the way that S experiences e (provided that there is a full or partial conceptual match with the lower-order state, or when the HO state contains more specific or fine-grained concepts than the LO state has, or when the LO state contains more specific or fine-grained concepts than the HO state has, or when the HO concepts can combine to match the LO concept) (Gennaro 2012, 180).

The reasons for the above qualifications are discussed in Gennaro (2012, chapter six) but they basically try to explain what happens in some abnormal cases (such as visual agnosia) and in some other atypical contexts (such as perceiving ambiguous figures like the vase/two-faces figure) where mismatches might occur between the HOT and LO state. For example, visual agnosia, or more specifically associative agnosia, seems to be a case where a subject has a conscious experience of an object without any conceptualization of the incoming visual information (Farah 2004). There appears to be a first-order perception of an object without the accompanying concept of that object (either first- or second-order, for that matter). Thus its “meaning” is gone and the object is not recognized. It seems that there can be conscious perceptions of objects without the application of concepts, that is, without recognition or identification of those objects. But one might instead hold that associative agnosia is simply an unusual case where the typical HOT does not fully match up with the first-order visual input. That is, we might view associative agnosia as a case where the “normal,” or most general, object concept in the HOT does not accompany the input received through the visual modality. There is a partial match instead. A HOT might partially recognize the LO state. So associative agnosia would be a case where the LO state could still register a percept of an object O (because the subject still does have the concept), but the HO state is limited to some features of O. Bare visual perception remains intact in the LO state but is confused and ambiguous, and thus the agnosic’s conscious experience of O “loses meaning,” resulting in a different phenomenological experience. When, for example, the agnosic does not (visually) recognize a whistle as a whistle, perhaps only the concepts ‘silver,’ ‘roundish,’ and ‘object’ are applied. But as long as that is how the agnosic experiences the object, then HOT theory is left unthreatened.

In any case, on Gennaro’s view, misrepresentations cannot occur between M and HOT and still result in a conscious state (Gennaro 2012, 2013). That is, there cannot be a conscious experience that reflects mismatched and incompatible concepts in M and the HOT.

A final kind of objection worth mentioning has to do with various pathologies of self-awareness, such as somatoparaphrenia, which is a pathology of self characterized by the sense of alienation from parts of one’s body. It is a bizarre type of body delusion where one denies ownership of a limb or an entire side of one’s body. It is sometimes called a “depersonalization disorder.” Relatedly, anosognosia is a condition in which a person who suffers from a disability seems unaware of the existence of the disability. A person whose limbs are paralyzed will insist that his limbs are moving and will become furious when family and caregivers say that they are not. Somatoparaphrenia is usually caused by extensive right-hemisphere lesions, most commonly in the temporoparietal junction (Vallar and Ronchi 2009). Patients with somatoparaphrenia say some very strange things, such as “parts of my body feel as if they didn’t belong to me” (Sierra and Berrios 2000, 160) and “when a part of my body hurts, I feel so detached from the pain that it feels as if it were somebody else’s pain” (Sierra and Berrios 2000, 163). It is difficult to grasp what having these conscious thoughts and experiences is like.

There is some question as to whether or not the higher-order thought (HOT) theory of consciousness can plausibly account for the depersonalization psychopathology of somatoparaphrenia (Liang and Lane 2009, Rosenthal 2010, Lane and Liang 2010). Liang and Lane (2009) argue that it cannot. HOT theory has been critically examined in light of some psychopathologies because, according to HOT theory, what makes a mental state conscious is a HOT of the form “I am in mental state M.” The requirement of an I-reference leads some to think that HOT theory cannot explain such pathologies, since there would seem to be cases where I can have a conscious state and not attribute it to myself (and instead attribute it to someone else). Liang and Lane (2009) initially argued that somatoparaphrenia threatens HOT theory because it contradicts the notion that the accompanying HOT has the content “I am in mental state M.” The “I” is not only importantly self-referential but essential in tying the conscious state to oneself and, thus, to one’s ownership of M.

Rosenthal (2010) basically responds that one can be aware of bodily sensations in two ways that, normally at least, go together: (1) aware of a bodily sensation as one’s own, and (2) aware of a bodily sensation as having some bodily location, like a hand or foot. Patients with somatoparaphrenia still experience the sensation as their own but also as having a mistaken bodily location (perhaps somewhat analogous to phantom limb pain where patients experience pain in missing limbs). Such patients still do have the awareness in (1), which is the main issue at hand, but they have the strange awareness in sense (2). So somatoparaphrenia leads some people to misidentify the bodily location of a sensation as someone else’s, but the awareness of the sensation itself remains one’s own. Lane and Liang (2010) are not satisfied and, among other things, counter that Rosenthal’s analogy to phantom limbs is faulty, and that he has still not explained why the identification of the bearer of the pain cannot also go astray.

Gennaro (2015b) replies, among other things, that we must first remember that many of these patients often deny feeling anything in the limb in question (Bottini et al. 2002). As Liang and Lane point out, patient FB (Bottini et al. 2002), while blindfolded, feels “no tactile sensation” (2009, 664) when the examiner in fact touches the dorsal surface of FB’s hand. In these cases, it is particularly difficult to see what the problem is for HOT theory at all. But when there really is a bodily sensation of some kind, a HOT theorist might also argue that there are really two conscious states that seem to be at odds. There is a conscious feeling in a limb but also the (conscious) attribution of the limb to someone else. It is crucial to emphasize that somatoparaphrenia is often characterized as a delusion of belief, often under the broader category of anosognosia. A delusion is often defined as a false belief that is held based on an incorrect (and probably unconscious) inference about external reality or oneself that is firmly sustained despite what almost everyone else believes and despite what constitutes incontrovertible and obvious proof or evidence to the contrary (Bortolotti 2009, Radden 2010). In some cases, delusions seriously inhibit normal day-to-day functioning. Beliefs are often taken to be intentional states integrated with other beliefs. They are typically understood as caused by perceptions or experiences that then lead to action or behavior. Thus, somatoparaphrenia is, in some ways, closer to self-deception and involves frequent confabulation. For more on this disagreement as well as the phenomenon of thought insertion in schizophrenia, see Lane (2015).

4. HOT Theory and Conceptualism

Consider again the related claim that HOT theory can explain how one’s conceptual repertoire can transform our phenomenological experience. Concepts, at minimum, involve recognizing and understanding objects and properties. Having a concept C should also give the concept possessor the ability to discriminate instances of C and non-C’s. For example, if I have the concept ‘tiger’ I should be able to identify tigers and distinguish them from other even fairly similar land animals. Rosenthal invokes the idea that acquiring concepts can change one’s conscious experience with the help of several well-known examples (2005, 187-188). Acquiring various concepts from a wine-tasting course will lead to different experiences from those taste experiences enjoyed prior to the course. I acquire more fine-grained wine-related concepts, such as “dry” and “heavy,” which in turn can figure into my HOTs and thus alter my conscious experiences. I literally have different qualia due to the change in my conceptual repertoire. As we learn more concepts, we have more fine-grained experiences and thus experience more qualitative complexities. A botanist will likely have somewhat different perceptual experiences than I do while walking through a forest. Conversely, those with a more limited conceptual repertoire, such as infants and animals, will often have a more coarse-grained set of experiences. Much the same goes for other sensory modalities, such as the way that I experience a painting after learning more about artwork and color. The notion of “seeing-as” (“hearing-as” and so on) is often used in this context, that is, when I possess different concepts I literally experience the world differently.

Thus, Gennaro argues that there is a very close and natural connection between HOT theory and what is known as “conceptualism” (Gennaro 2012, chapter six, 2013). Chuard (2007) defines conceptualism as the claim that “the representational content of a perceptual experience is fully conceptual in the sense that what the experience represents (and how it represents it) is entirely determined by the conceptual capacities the perceiver brings to bear in her experience” (Chuard 2007, 25). In any case, the basic idea is that, just like beliefs and thoughts, perceptual experiences also have conceptual content. In a somewhat Kantian spirit, one might say that all conscious experience presupposes the application of concepts, or, even stronger, the way that one experiences the world is entirely determined by the concepts one possesses. Indeed, Gunther (2003, 1) initially uses Kant’s famous slogan that “thoughts without content are empty, intuitions [= sensory experiences] without concepts are blind” to sum up conceptualism (Kant 1781/1965, A51/B75).

5. Hybrid Higher-Order and Self-Representational Theories

Some related representationalist views hold that the HOR in question should be understood as intrinsic to (or part of) an overall complex conscious state. This stands in contrast, for example, to the standard view that the HOT is extrinsic to (that is, entirely distinct from) its target mental state. One motivation for this shift is renewed interest in a view somewhat closer to the one held by Franz Brentano (1874/1973) and others, normally associated with the phenomenological tradition (Sartre 1956, Smith 2004). To varying degrees, these theories have in common the idea that conscious mental states, in some sense, represent themselves, which still involves having a thought about a mental state but just not a distinct or separate state. Thus, when one has a conscious desire for a beer, one is also aware that one is in that very state. The conscious desire represents both the beer and itself. It is this “self-representing” which makes the state conscious.

Gennaro has argued that, when one has a first-order conscious state, the (unconscious) HOT is better viewed as intrinsic to the target state, so that we have a complex conscious state with parts (Gennaro 1996, 2006, 2012). This is what he calls the “wide intrinsicality view” (WIV), which he takes to be a version of HOT theory. He argues elsewhere that Sartre’s theory of consciousness could be understood in this way (Gennaro 2002, 2015a). On the WIV, first-order conscious states are complex states with a world-directed part and a meta-psychological component. Robert Van Gulick (2000, 2004, 2006) has also explored the alternative that the HO state is part of an overall global conscious state. He calls such states “HOGS” (Higher-Order Global States) whereby a lower-order unconscious state is “recruited” into a larger state, which becomes conscious partly due to the implicit self-awareness that one is in the lower-order state.

This general approach is also forcefully advocated by Uriah Kriegel in a series of papers, beginning with Kriegel (2003) and culminating in Kriegel (2009). He refers to it as the “self-representational theory of consciousness” (see also Kriegel and Williford 2006). To be sure, the notion of a mental state representing itself or a mental state with one part representing another part is in need of further development. Nonetheless, there is agreement among all of these authors that conscious mental states are, in some important sense, reflexive or self-directed.

More specifically, Kriegel (2003, 2006, 2009) has tried to cash out TP in terms of a ubiquitous (conscious) “peripheral” self-awareness which accompanies all of our first-order focal conscious states. Not all conscious “directedness” is attentive and so perhaps we should not restrict conscious directedness to that which we are consciously focused on. If this is right, then a first-order conscious state can be both attentively outer-directed and inattentively inner-directed. Gennaro has argued against this view at length (Gennaro 2008, Gennaro 2012, chapter five). For example, although it is surely true that there are degrees of conscious attention, the clearest example of genuine “inattentive” consciousness is outer-directed awareness in one’s peripheral visual field. But this obviously does not show that any inattentional consciousness is self-directed during outer-directed consciousness, let alone at the very same time. Also, what is the evidence for such self-directed inattentional consciousness? It is presumably based on phenomenological considerations but Gennaro claims not to find such ubiquitous inattentive self-directed “consciousness” in his outer-directed conscious experience. Except when he is introspecting, Gennaro thinks that conscious experience is so completely outer-directed that there really is no such peripheral self-directed consciousness when in first-order conscious states. He says that it does not seem to him that he is consciously aware of his own experience when, say, consciously attending to a band in concert or to the task of building a bookcase. Even some who are otherwise very sympathetic to Kriegel’s phenomenological approach find it difficult to believe that “pre-reflective” (inattentional) self-awareness accompanies conscious states (Siewert 1998, Zahavi 2004) or at least that all conscious states involve such self-awareness (Smith 2004). Self-representationalism is also a target of the objection discussed in section 3 regarding somatoparaphrenia and related deficits of self-awareness (for more on this dispute, see Lane 2015 and Billon and Kriegel 2015).

In the end, Kriegel actually holds that there is an indirect self-representation applicable to conscious states with the self-representational peripheral component directed at the world-directed part of the state (2009, 215-226). This seems closer to Gennaro’s WIV but Kriegel thinks that “pre-reflective self-awareness” or the “self-representation” is itself (peripherally) conscious. For others who hold some form of the self-representational view, see Williford (2006) and Janzen (2008). Carruthers’ (2000, 2005) theory can also be viewed in this light since, as we have seen, he contends that conscious states have two representational contents.

6. HOT Theory and the Prefrontal Cortex

An interesting question in recent years concerns just how HOT theory and self-representationalism might be realized in the brain. We have seen that most representationalists tend to think that the structure of conscious states is realized in the brain (though it may take some time to identify all the main neural structures). The issue is sometimes framed in terms of the question: “how global is HOT theory?” That is, do conscious mental states require widespread brain activation or can at least some be fairly localized in narrower areas of the brain? Perhaps most interesting is whether or not the prefrontal cortex (PFC) is required for having conscious states (Gennaro 2012, chapter nine). Gennaro disagrees with Kriegel (2007, 2009, chapter seven) and Block (2007) that, according to the higher-order and self-representational view, the PFC is required for most conscious states (see also Del Cul et al. 2007, Lau and Rosenthal 2011). It may very well be that the PFC is required for the more sophisticated introspective states but this is not a problem for HOT theory as such because it does not require introspection for having first-order conscious states.

Are there conscious states without PFC activity? It seems so. For example, Rafael Malach and colleagues show that when subjects are engaged in a perceptual task or absorbed in watching a movie, there is widespread neural activation but little PFC activity (Grill-Spector and Malach 2004, Goldberg, Harel, and Malach 2006). Although some other studies do show PFC activation, this is mainly because of the need for subjects to report their experiences. Also, basic conscious experience is certainly not entirely eliminated even when there is extensive bilateral PFC damage or lobotomies (Pollen 2008). Zeki (2007) also cites evidence that the frontal cortex is engaged only when reportability is part of the conscious experience and that all human color imaging experiments have been unanimous in not showing any particular activation of the frontal lobes. Similar results are found for other sensory modalities, for example, in auditory perception (Baars and Gage 2010, chapter seven). Although areas outside the auditory cortex are sometimes cited, there is virtually no mention of the PFC.

Gennaro thinks that the above line of argument actually works to the advantage of HOT theory with regard to the problem of animal and infant consciousness. If HOT theory does not require PFC activity for all conscious states, then HOT theory is in an even better position to account for animal and infant consciousness since it is doubtful that they have the requisite PFC activity.

But why think that unconscious HOTs can occur outside the PFC? If we grant that unconscious HOTs can be regarded as a kind of “pre-reflective” self-consciousness, then one might for example look to Newen and Vogeley (2003) for answers. They distinguish five levels of self-consciousness ranging from “phenomenal self-acquaintance” and “conceptual self-consciousness” up to “iterative meta-representational self-consciousness.” They are explicitly concerned with the neural correlates of what they call the “first-person perspective” (1PP) and the “egocentric reference frame.” Citing numerous experiments, they point to various neural signatures of self-consciousness. The PFC is rarely mentioned and then usually only with regard to more sophisticated forms of self-consciousness. Other brain areas are much more prominently identified, such as the medial and inferior parietal cortices, the temporoparietal cortex, the posterior cingulate cortex, and the anterior cingulate cortex (ACC). Kriegel (2007) also mentions the ACC as a possible location for HOTs but it should be noted that the ACC is, at least sometimes, considered to be part of the PFC.

Damasio (1999) explicitly mentions the ACC as a site for some higher-order mental activity or “maps.” There are various cortical association areas that might be good candidates for HOTs depending on the modality. For example, key regions for spatial navigation comprise the medial parietal and right inferior parietal cortex, posterior cingulate cortex, and the hippocampus. Even when considering the neural signatures of theory of mind and mind-reading, Newen and Vogeley have replicated experiments indicating that such meta-representation is best located in the ACC. In addition, “the capacity for taking 1PP in such [theory of mind] contexts showed differential activation in the right temporo-parietal junction and the medial aspects of the superior parietal lobe” (Newen and Vogeley 2003, 538). Once again, even if the PFC is essential for having certain HOTs and conscious states, this poses no threat to HOT theory provided that the HOTs in question are of the more sophisticated introspective variety.

This matter is certainly not yet settled but Gennaro urges that it is a mistake, both philosophically and neurophysiologically, to claim that HOT theory should treat first-order conscious states as essentially including PFC activity. Further, and to tie this together with the animals issue, Gennaro concedes the following: “If all HOTs occur in the PFC, and if PFC activity is necessary for all conscious experience, and if there is little or no PFC activity in infants and most animals, then either (a) infants and most animals do not have conscious experience or (b) HOT theory is false” (Gennaro 2012, 281). Carruthers (2000, 2005) and perhaps Rosenthal opt for (a). Still, Gennaro argues that a good case can be made for the falsity of one or more of the conjuncts in the antecedent of the above conditional.

Kozuch (2014) presents a very nice discussion of the PFC in relation to higher-order theories, arguing that the lack of dramatic deficits in visual consciousness even with PFC lesions presents a compelling case against higher-order theories. For example, in addition to the studies cited above, Kozuch references Alvarez and Emory (2006) as evidence for the view that

Lesions to the orbital, lateral, or medial PFC produce so-called executive dysfunction. Depending on the precise lesion location, subjects with damage to one of these areas have problems inhibiting inappropriate actions, switching efficiently from task to task, or retaining items in short-term memory. However, lesions to these areas appear not to produce notable deficits in visual consciousness: Tests of the perceptual abilities of subjects with lesions to the PFC proper reveal no such deficits; as well, PFC patients never report their visual experience to have changed in some remarkable way (Kozuch 2014, 729).

Kozuch notes that Gennaro’s WIV may be left undamaged, at least to some extent, since he does not require that the PFC is where HOTs are realized. It is also important to keep in mind the distinction between unconscious HOTs and conscious HOTs (= introspection). Perhaps the latter require PFC activity given the more sophisticated executive functions associated with introspection but having first-order conscious states does not require introspection. Yet another interesting argument along these lines is put forth by Sebastian (2014) with respect to some dream states. If some dreams are conscious states and there is little, if any, PFC activity during the dream period, then HOT theory would again be in trouble if we suppose that HOTs are realized in the PFC.

In conclusion, higher-order theory has remained a viable theory of consciousness, especially for those attracted to a reductionist account but not presently to a reduction in purely neurophysiological terms. Although there are significant objections to different versions of HOR, at least some plausible replies have emerged through the years. HOR also maintains a degree of intuitive plausibility due to the Transitivity Principle (TP). In addition, HOT theory might help to shed light on conceptualism and can contribute to the question of the PFC’s role in producing conscious states.

7. References and Further Reading

  • Alvarez, J. and Emory, E. 2006. Executive Function and the Frontal Lobes: A Meta-Analytic Review. Neuropsychology Review 16: 17-42.
  • Armstrong, D. 1981. What is Consciousness? In The Nature of Mind. Ithaca, NY: Cornell University Press.
  • Baars, B. and Gage, N. 2010. Cognition, Brain, and Consciousness: Introduction to Cognitive Neuroscience. Second Edition. Oxford: Elsevier.
  • Bayne, T. and Montague, M. eds. 2011. Cognitive Phenomenology. New York: Oxford University Press.
  • Billon, A. and Kriegel, U. 2015. Jaspers’ Dilemma: The Psychopathological Challenge to Subjectivity Theories of Consciousness. In R. Gennaro ed. Disturbed Consciousness. Cambridge, MA: MIT Press.
  • Block, N. 1996. Mental Paint and Mental Latex. In E. Villanueva ed. Perception. Atascadero, CA: Ridgeview.
  • Block, N. 2007. Consciousness, Accessibility, and the Mesh between Psychology and Neuroscience. Behavioral and Brain Sciences 30: 481-499.
  • Block, N. 2011. The Higher-Order Approach to Consciousness is Defunct. Analysis 71: 419-431.
  • Bottini, G., Bisiach, E., Sterzi, R., and Vallar, G. 2002. Feeling Touches in Someone Else’s Hand. NeuroReport 13: 249-252.
  • Bortolotti, L. 2009. Delusions and Other Irrational Beliefs. New York: Oxford University Press.
  • Brentano, F. 1874/1973. Psychology From an Empirical Standpoint. New York: Humanities.
  • Byrne, A. 1997. Some like it HOT: Consciousness and Higher-Order Thoughts. Philosophical Studies 86: 103-129.
  • Byrne, A. 2001. Intentionalism Defended. Philosophical Review 110: 199-240.
  • Carruthers, P. 1989. Brute Experience. Journal of Philosophy 86: 258-269.
  • Carruthers, P. 2000. Phenomenal Consciousness. Cambridge: Cambridge University Press.
  • Carruthers, P. 2004. HOP over FOR, HOT Theory. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Amsterdam: John Benjamins.
  • Carruthers, P. 2005. Consciousness: Essays from a Higher-Order Perspective. New York: Oxford University Press.
  • Carruthers, P. 2008. Meta-Cognition in Animals: A Skeptical Look. Mind and Language 23: 58-89.
  • Carruthers, P. 2009. How we Know our Own Minds: The Relationship Between Mindreading and Metacognition. Behavioral and Brain Sciences 32: 121-138.
  • Chalmers, D. 1995. Facing Up to the Problem of Consciousness. Journal of Consciousness Studies 2: 200-219.
  • Chalmers, D. 1996. The Conscious Mind. New York: Oxford University Press.
  • Chalmers, D. 2004. The Representational Character of Experience. In B. Leiter ed. The Future for Philosophy. Oxford: Oxford University Press.
  • Chuard, P. 2007. The Riches of Experience. In R. Gennaro ed. The Interplay between Consciousness and Concepts. Exeter: Imprint Academic.
  • Chudnoff, E. 2015. Cognitive Phenomenology. New York: Routledge.
  • Clayton, N., Bussey, T., and Dickinson, A. 2003. Can Animals Recall the Past and Plan for the Future? Nature Reviews Neuroscience 4: 685-691.
  • Clayton, N., Emery, N., and Dickinson, A. 2006. The Rationality of Animal Memory: Complex Caching Strategies of Western Scrub Jays. In Hurley and Nudds 2006.
  • Coleman, S. 2015. Quotational Higher-Order Thought Theory. Philosophical Studies 172: 2705-2733.
  • Damasio, A. 1999. The Feeling of What Happens. New York: Harcourt Brace and Co.
  • Del Cul, A., Baillet, S., and Dehaene, S. 2007. Brain Dynamics Underlying the Nonlinear Threshold for Access to Consciousness. PLoS Biology 5: 2408-2423.
  • Dretske, F. 1995. Naturalizing the Mind. Cambridge, MA: MIT Press.
  • Droege, P. 2003. Caging the Beast. Philadelphia and Amsterdam: John Benjamins Publishers.
  • Emery, N. and Clayton, N. 2001. Effects of Experience and Social Context on Prospective Caching Strategies in Scrub Jays. Nature 414: 443-446.
  • Farah, M. 2004. Visual Agnosia, 2nd ed. Cambridge, MA: MIT Press.
  • Flombaum, J. and Santos, L. 2005. Rhesus Monkeys Attribute Perceptions to Others. Current Biology 15: 447-452.
  • Gennaro, R. 1993. Brute Experience and the Higher-Order Thought Theory of Consciousness. Philosophical Papers 22: 51-69.
  • Gennaro, R. 1996. Consciousness and Self-consciousness: A Defense of the Higher-Order Thought Theory of Consciousness. Amsterdam and Philadelphia: John Benjamins.
  • Gennaro, R. 2002. Jean-Paul Sartre and the HOT Theory of Consciousness. Canadian Journal of Philosophy 32: 293-330.
  • Gennaro, R. ed. 2004a. Higher-Order Theories of Consciousness: An Anthology. Amsterdam and Philadelphia: John Benjamins.
  • Gennaro, R. 2004b. Higher-Order Thoughts, Animal Consciousness, and Misrepresentation: A Reply to Carruthers and Levine. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Amsterdam: John Benjamins.
  • Gennaro, R. 2005. The HOT Theory of Consciousness: Between a Rock and a Hard Place? Journal of Consciousness Studies 12 (2): 3-21.
  • Gennaro, R. 2006. Between Pure Self-Referentialism and the (extrinsic) HOT Theory of Consciousness. In U. Kriegel and K. Williford eds. Self-Representational Approaches to Consciousness. Cambridge, MA: MIT Press.
  • Gennaro, R. 2008. Representationalism, Peripheral Awareness, and the Transparency of Experience. Philosophical Studies 139: 39-56.
  • Gennaro, R. 2009. Animals, consciousness, and I-thoughts. In R. Lurz ed. Philosophy of Animal Minds. New York: Cambridge University Press.
  • Gennaro, R. 2012. The Consciousness Paradox: Consciousness, Concepts, and Higher-Order Thoughts. Cambridge, MA: The MIT Press.
  • Gennaro, R. 2013. Defending HOT Theory and the Wide Intrinsicality View: A reply to Weisberg, Van Gulick, and Seager. Journal of Consciousness Studies 20 (11-12): 82-100.
  • Gennaro, R. 2015a. The ‘of’ of Intentionality and the ‘of’ of Acquaintance. In S. Miguens, G. Preyer, and C. Morando eds. Pre-Reflective Consciousness: Sartre and Contemporary Philosophy of Mind. New York: Routledge Publishers.
  • Gennaro, R. 2015b. Somatoparaphrenia, Anosognosia, and Higher-Order Thoughts. In R. Gennaro ed. Disturbed Consciousness. Cambridge, MA: MIT Press.
  • Gennaro, R. ed. 2015c. Disturbed Consciousness: New Essays on Psychopathology and Theories of Consciousness. Cambridge, MA: The MIT Press.
  • Goldberg, I., Harel, M., and Malach, R. 2006. When the Brain Loses its Self: Prefrontal Inactivation during Sensorimotor Processing. Neuron 50: 329-339.
  • Goldman, A. 1993. Consciousness, Folk Psychology and Cognitive Science. Consciousness and Cognition 2: 264-82.
  • Goldman, A. 2006. Simulating Minds. New York: Oxford University Press.
  • Grill-Spector, K. and Malach, R. 2004. The Human Visual Cortex. Annual Review of Neuroscience 27: 649-677.
  • Gunther, Y. ed. 2003. Essays on Nonconceptual Content. Cambridge, MA: MIT Press.
  • Harman, G. 1990. The Intrinsic Quality of Experience. In J. Tomberlin ed. Philosophical Perspectives, 4. Atascadero, CA: Ridgeview Publishing.
  • Horgan, T. and Tienson, J. 2002. The Intentionality of Phenomenology and the Phenomenology of Intentionality. In D. Chalmers ed. Philosophy of Mind: Classical and Contemporary Readings. New York: Oxford University Press.
  • Hurley, S. and Nudds, M. eds. 2006. Rational Animals? New York: Oxford University Press.
  • Janzen, G. 2008. The Reflexive Nature of Consciousness. Amsterdam and Philadelphia: John Benjamins.
  • Jehle, D. and Kriegel, U. 2006. An Argument against Dispositional HOT Theory. Philosophical Psychology 19: 462-476.
  • Kant, I. 1781/1965. Critique of Pure Reason. Translated by N. Kemp Smith. New York: MacMillan.
  • Kozuch, B. 2014. Prefrontal Lesion Evidence against Higher-Order Theories of Consciousness. Philosophical Studies 167: 721-746.
  • Kriegel, U. 2002. PANIC Theory and the Prospects for a Representational Theory of Phenomenal Consciousness. Philosophical Psychology 15: 55-64.
  • Kriegel, U. 2003. Consciousness as Intransitive Self-Consciousness: Two Views and an Argument. Canadian Journal of Philosophy 33: 103-132.
  • Kriegel, U. 2005. Naturalizing Subjective Character. Philosophy and Phenomenological Research 71: 23-56.
  • Kriegel, U. 2006. The Same Order Monitoring Theory of Consciousness. In U. Kriegel and K. Williford eds. Self-Representational Approaches to Consciousness. Cambridge, MA: MIT Press.
  • Kriegel, U. 2007. A Cross-Order Integration Hypothesis for the Neural Correlate of Consciousness. Consciousness and Cognition 16: 897-912.
  • Kriegel, U. 2009. Subjective Consciousness. New York: Oxford University Press.
  • Kriegel, U. ed. 2013. Phenomenal Intentionality. New York: Oxford University Press.
  • Kriegel, U. and Williford, K. eds. 2006. Self-Representational Approaches to Consciousness. Cambridge, MA: MIT Press.
  • Lane, T. 2015. Self, Belonging, and Conscious Experience: A Critique of Subjectivity Theories of Consciousness. In R. Gennaro ed. Disturbed Consciousness. Cambridge, MA: MIT Press.
  • Lane, T. and Liang, C. 2010. Mental Ownership and Higher-Order Thought. Analysis 70: 496-501.
  • Lau, H. and Rosenthal, D. 2011. Empirical Support for Higher-Order Theories of Conscious Awareness. Trends in Cognitive Sciences 15: 365-373.
  • Levine, J. 2001. Purple Haze: The Puzzle of Conscious Experience. Cambridge, MA: MIT Press.
  • Liang, C. and Lane, T. 2009. Higher-Order Thought and Pathological Self: The Case of Somatoparaphrenia. Analysis 69: 661-668.
  • Lurz, R. ed. 2009. The Philosophy of Animal Minds. New York: Cambridge University Press.
  • Lurz, R. 2011. Mindreading Animals. Cambridge, MA: MIT Press.
  • Lycan, W. 1996. Consciousness and Experience. Cambridge, MA: MIT Press.
  • Lycan, W. 2001. A Simple Argument for a Higher-Order Representation Theory of Consciousness. Analysis 61: 3-4.
  • Lycan, W. 2004. The Superiority of HOP to HOT. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Amsterdam: John Benjamins.
  • Nagel, T. 1974. What is it Like to be a Bat? Philosophical Review 83: 435-456.
  • Neander, K. 1998. The Division of Phenomenal Labor: A Problem for Representational Theories of Consciousness. Philosophical Perspectives 12: 411-434.
  • Newen, A. and Vogeley, K. 2003. Self-Representation: Searching for a Neural Signature of Self-Consciousness. Consciousness and Cognition 12: 529-543.
  • Nichols, S. and Stich, S. 2003. Mindreading. New York: Oxford University Press.
  • Picciuto, V. 2011. Addressing Higher-Order Misrepresentation with Quotational Thought. Journal of Consciousness Studies 18 (3-4): 109-136.
  • Pollen, D. 2008. Fundamental Requirements for Primary Visual Perception. Cerebral Cortex 18: 1991-1998.
  • Prinz, J. 2012. The Conscious Brain. New York: Oxford University Press.
  • Radden, J. 2010. On Delusion. Abingdon and New York: Routledge.
  • Rolls, E. 2004. A Higher Order Syntactic Thought (HOST) Theory of Consciousness. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Amsterdam: John Benjamins.
  • Rosenthal, D.M. 1986. Two Concepts of Consciousness. Philosophical Studies 49: 329-359.
  • Rosenthal, D.M. 1991. The Independence of Consciousness and Sensory Quality. Philosophical Issues 1: 15-36.
  • Rosenthal, D.M. 1997. A Theory of Consciousness. In N. Block, O. Flanagan, and G. Güzeldere eds. The Nature of Consciousness. Cambridge, MA: MIT Press.
  • Rosenthal, D.M. 2002. Explaining Consciousness. In D. Chalmers ed. Philosophy of Mind: Classical and Contemporary Readings. New York: Oxford University Press.
  • Rosenthal, D.M. 2004. Varieties of Higher-Order Theory. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Philadelphia and Amsterdam: John Benjamins.
  • Rosenthal, D.M. 2005. Consciousness and Mind. New York: Oxford University Press.
  • Rosenthal, D.M. 2010. Consciousness, the Self and Bodily Location. Analysis 70: 270-276.
  • Rosenthal, D.M. 2011. Exaggerated Reports: Reply to Block. Analysis 71: 431-437.
  • Santos, L., Nissen, A., and Ferrugia, J. 2006. Rhesus monkeys, Macaca mulatta, Know What Others Can and Cannot Hear. Animal Behaviour 71: 1175-1181.
  • Sartre, J. 1956. Being and Nothingness. New York: Philosophical Library.
  • Sauret, W. and Lycan, W. 2014. Attention and Internal Monitoring: A Farewell to HOP. Analysis 74: 363-370.
  • Seager, W. 2004. A Cold Look at HOT Theory. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Amsterdam: John Benjamins.
  • Searle, J. 1992. The Rediscovery of the Mind. Cambridge, MA: MIT Press.
  • Sebastián, M. 2013. Not a HOT Dream. In R. Brown ed. Consciousness Inside and Out: Phenomenology, Neuroscience, and the Nature of Experience. Dordrecht: Springer.
  • Sierra, M. and Berrios, G. 2000. The Cambridge Depersonalisation Scale: a New Instrument for the Measurement of Depersonalisation. Psychiatry Research 93: 153-164.
  • Siewert, C. 1998. The Significance of Consciousness. Princeton: Princeton University Press.
  • Smith, D.W. 2004. Mind World: Essays in Phenomenology and Ontology. New York: Cambridge University Press.
  • Terrace, H. and Metcalfe, J. eds. 2005. The Missing Link in Cognition: Origins of Self-Reflective Consciousness. New York: Oxford University Press.
  • Tye, M. 1995. Ten Problems of Consciousness. Cambridge, MA: MIT Press.
  • Tye, M. 2000. Consciousness, Color, and Content. Cambridge, MA: MIT Press.
  • Vallar, G. and Ronchi, R. 2009. Somatoparaphrenia: A Body Delusion. A Review of the Neuropsychological Literature. Experimental Brain Research 192: 533-551.
  • Van Gulick, R. 1995. What Would Count as Explaining Consciousness? In T. Metzinger ed. Conscious Experience. Paderborn: Ferdinand Schöningh.
  • Van Gulick, R. 2000. Inward and Upward: Reflection, Introspection and Self-awareness. Philosophical Topics 28: 275-305.
  • Van Gulick, R. 2004. Higher-Order Global States (HOGS): An Alternative Higher-Order Model of Consciousness. In R. Gennaro ed. Higher-Order Theories of Consciousness: An Anthology. Amsterdam: John Benjamins.
  • Van Gulick, R. 2006. Mirror Mirror: Is That All? In U. Kriegel and K. Williford eds. Self-Representational Approaches to Consciousness. Cambridge, MA: MIT Press.
  • Weisberg, J. 2008. Same Old, Same Old: The Same-Order Representation Theory of Consciousness and the Division of Phenomenal Labor. Synthese 160: 161-181.
  • Weisberg, J. 2011. Misrepresenting Consciousness. Philosophical Studies 154: 409-433.
  • Williford, K. 2006. The Self-Representational Structure of Consciousness. In Kriegel and Williford 2006.
  • Zahavi, D. 2004. Back to Brentano? Journal of Consciousness Studies 11 (10-11): 66-87.
  • Zahavi, D. 2007. The Heidelberg School and the Limits of Reflection. In S. Heinämaa, V. Lähteenmäki, and P. Remes eds. Consciousness: From perception to reflection in the history of philosophy. Dordrecht: Springer.
  • Zeki, S. 2007. A Theory of Micro-Consciousness. In M. Velmans and S. Schneider eds. The Blackwell Companion to Consciousness. Malden, MA: Blackwell.

 

Author Information

Rocco J. Gennaro
Email: rjgennaro@usi.edu
University of Southern Indiana
U. S. A.

Capital Punishment

Capital punishment, or “the death penalty,” is an institutionalized practice designed to result in deliberately executing persons in response to actual or supposed misconduct and following an authorized, rule-governed process to conclude that the person is responsible for violating norms that warrant execution.  Punitive executions have historically been imposed by diverse kinds of authorities, for an expansive range of conduct, for political or religious beliefs and practices, for a status beyond one’s control, or without employing any significant due process procedures.  Punitive executions also have been and continue to be carried out more informally, such as by terrorist groups, urban gangs, or mobs.  But for centuries in Europe and America, discussions have focused on capital punishment as an institutionalized, rule-governed practice of modern states and legal systems governing serious criminal conduct and procedures.

Capital punishment has existed for millennia, as evident from ancient law codes and Plato’s famous rendition of Socrates’s trial and execution by democratic Athens in 399 B.C.E.  Among major European philosophers, specific or systematic attention to the death penalty is the exception until about 400 years ago.  Most modern philosophic attention to capital punishment emerged from penal reform proponents, as principled, moral evaluation of law and social practice, or amidst theories of the modern state and sovereignty.  The mid-twentieth century emergence of an international human rights regime and American constitutional controversies sparked anew much philosophic focus on theories of punishment and the death penalty, including arbitrariness, mistakes, or discrimination in the American institution of capital punishment.

The central philosophic question about capital punishment is one of moral justification: on what grounds, if any, is the state’s deliberate killing of identified offenders a morally justifiable response to voluntary criminal conduct, even the most serious of crimes, such as murder? As with questions about the morality of punishment, two broadly different approaches are commonly distinguished: retributivism, with a focus on past conduct that merits death as a penal response, and utilitarianism or consequentialism, with attention to the effects of the death penalty, especially any effects in preventing more crime through deterrence or incapacitation. Section One provides some historical context and basic concepts for locating the central philosophic question about capital punishment: Is death the amount or kind of penalty that is morally justified for the most serious of crimes, such as murder? Section Two attends to classic considerations of lex talionis (“the law of retaliation”) and recent retributivist approaches to capital punishment that involve the right to life or a conception of fairness. Section Three considers classic utilitarian approaches to justifying the death penalty: primarily as preventer of crime through deterrence or incapacitation, but also with respect to some other consequences of capital punishment. Section Four attends to relatively recent approaches to punishment as expression or communication of fundamental values or norms, including for purposes of educating or reforming offenders. Section Five explores issues of justification related to the institution of capital punishment, as in America: Is the death penalty morally justifiable if imperfect procedures produce mistakes, caprice, or (racial) discrimination in determining who is to be executed? Or if the actual execution of capital punishment requires unethical conduct by medical practitioners or other necessary participants? Section Six considers the moral grounds, if any exist, for the state’s authority to punish by death.

Table of Contents

  1. Context and Basic Concepts
    a. Historical Practices
    b. Philosophic Frameworks and Approaches
  2. Retributivist Approaches
    a. Classic Retributivism: Kant and lex talionis
    b. Lex talionis as a Principle of Proportionality
    c. Retributivism and the Right to Life
    d. Retributivism and Fairness
    e. Challenges to Retributivism
  3. Utilitarian Approaches
    a. Classic Utilitarian Approaches: Bentham, Beccaria, Mill
    b. Empirical Considerations: Incapacitation, Deterrence
    c. Utilitarian Defenses: “Common Sense” and “Best Bet”
    d. Challenges to Utilitarianism
    e. Other Consequential Considerations
  4. Capital Punishment as Communication
  5. The Institution of Capital Punishment
    a. Procedural Issues: Imperfect Justice
    b. Discrimination: Race, Class
    c. Medicine and the Death Penalty
    d. Costs: Economic Issues
  6. State Authority and Capital Punishment
  7. References and Further Reading
    a. Primary Sources
    b. Secondary Sources

1. Context and Basic Concepts

a. Historical Practices

Much philosophic focus on the death penalty is modern and relatively recent.  The phrase ‘capital punishment’ is older, used for nearly a millennium to signify the death penalty.  The classical Latin and medieval French roots of the term ‘capital’ indicate a punishment involving the loss of head or life, perhaps reflecting the use of beheading as a form of execution.  The actual practice of capital punishment is ancient, emerging much earlier than the familiar terms long used to refer to it.  In the ancient world, the Babylonian Code of Hammurabi (circa 1750 B.C.E.) included about 25 capital crimes; the Mosaic Code of the ancient Hebrews identifies numerous crimes punishable by death, invoking, like other ancient law codes, lex talionis, “the law of retaliation”; Draco’s Code of 621 B.C.E. Athens punished most crimes by death, and later Athenian law famously licensed the trial and death of Socrates; the fifth century B.C.E. Twelve Tables of Roman law include capital punishment for such crimes as publishing insulting songs or disturbing the nocturnal peace of urban areas, and later Roman law famously permitted the crucifixion of Jesus of Nazareth.  Even in such early practices, capital punishment was seen as within the authority of political rulers, embodied as a legal institution, and employed for a wide range of misconduct proscribed by law.

Medieval and early modern Europe retained expansive lists of capital crimes and notably expanded the forms of execution beyond the common ancient practices of stoning, crucifixion, drowning, beating to death, or poisoning. In the Middle Ages both secular and ecclesiastical authorities participated in executions deliberately designed to be torturous and brutal, such as beheading, burning alive, drawing and quartering, hanging, disemboweling, using the rack, using thumb-screws, pressing with weights, boiling in oil, publicly dissecting, and castrating. Such brutality was conducted publicly as spectacle and ritual: an important or even essential element of capital punishment was not only the death of the accused, but the public process of killing and dying on display. Capital punishment varied in severity across the spectrum of torturous ways by which political and other penal authorities eventually effected the offender’s death.

In “the new world” the American colonies’ use of the death penalty was influenced more by Britain than by any other nation. The “Bloody Code” of 18th-century England included over 200 capital crimes, and the American colonies followed England in using public, ritualized hangings as the common form of execution. Until the mid-18th century, the colonies employed elaborate variations of the ritual of execution by hanging, even to the point of holding fake hangings. Stuart Banner summarizes the early American practices:

Capital punishment was more than just one penal technique among others. It was the base point from which all other kinds of punishment deviated.  When the state punished serious crime, most of the methods …were variations on execution.  Officials imposed death sentences that were never carried out, they conducted mock hangings…, and they dramatically halted real execution ceremonies at the last minute.  These were methods of inflicting a symbolic death …. Officials also wielded a set of tools capable of intensifying a death sentence – burning at the stake, public display of the corpse, dismemberment and dissection – ways of producing a punishment worse than death. (54)

In early America “capital punishment was not just a single penalty,” but “a spectrum of penalties with gradations of severity above and below an ordinary execution” (Banner, 86).

The late 18th century brought a “dramatic transformation of penal thought and practice” that was international in scope (Banner, 89). The dramatic change came with the birth of publicly supported prisons or penitentiaries that allowed extended incarceration for large numbers of people (Banner, 99). Before prisons and the practical possibility of lengthy incarceration as an alternative, “the only available units of measurement for serious crime were degrees of deviation from an ordinary execution” (Banner, 70). After the invention of prisons, for serious crimes there was now an alternative to capital punishment and to the practiced spectrum of torturous executions: prisons allowed varying conditions of confinement (for example, hard labor, solitary confinement, loss of privacy) and a temporal measure, at least, for distinguishing degrees of punishment to address kinds of serious misconduct. Dramatic changes for capital punishment also came with the 1764 publication in Italy of Cesare Beccaria’s essay, “On Crimes and Punishments.” Very influential in Europe and the United States, Beccaria’s sustained, philosophic investigation of the death penalty challenged both the authority of the state to punish by death and the utility of capital punishment as a superior deterrent to lengthy imprisonment. Philosophic defenses of the death penalty, like that of Immanuel Kant, opposed reformers and others, who, like Beccaria, argued for abolition of capital punishment. During the 19th century the methods of execution were made less brutal and the number of capital crimes was much reduced compared to earlier centuries of practice. Discussions of the death penalty’s merits invoked divergent understandings of the aims of punishment in general and thus of capital punishment in particular.

By the mid-20th century, two developments prompted another period of focused philosophic attention to the death penalty. In the United States a series of Supreme Court cases challenged whether the death penalty falls under the constitutional prohibition of “cruel and unusual punishments,” including questions about the legal and moral import of a criminal justice process that results in mistakes, caprice, or racial discrimination in capital cases. Capital punishment also became a global concern with the post-World War II Nuremberg trials of Nazi leaders and after the 1948 Universal Declaration of Human Rights and subsequent human rights treaties explicitly accorded all persons a right to life and encouraged abolishing the death penalty worldwide. Most nations have now abolished capital punishment, with notable exceptions including China, North Korea, Japan, India, Indonesia, Egypt, Somalia, and the United States, the only western “industrialized” nation still retaining the death penalty.

b. Philosophic Frameworks and Approaches

Capital punishment is often explored philosophically in the context of more general theories of “the standard or central case” of punishment as an institution or practice within a structure of legal rules (Hart, “Prolegomenon,” 3-5).  The philosopher’s interest in the death penalty, then, is embedded in broader issues about the moral permissibility of punishment.  Any punishment – and certainly an execution – intentionally inflicts on a person significant pain, suffering, unpleasantness, or deprivation that it is ordinarily wrong for an authority like the state to impose.  What conditions or considerations, if any, would morally justify such penal practices?  Following a framework famously offered by H.L.A. Hart,

[w]hat we should look for are answers to a number of different questions such as:  What justifies the general practice of punishment? To whom may punishment be applied? How severely may we punish? (“Prolegomenon,” 3)

These different questions are, respectively, about the general justifying aim of punishment, about the conditions of responsibility for criminal conduct and liability to punishment, and about the amount, kind, or form of punishment justifiable to address actual or supposed misconduct.  It is the last of these questions of justification – about the justified amount, kind, or form of punishment – that is foremost in philosophic approaches to the death penalty.  Almost all modern and recent discussions of capital punishment assume liability for the death penalty is only for the gravest of crimes, such as murder; almost all assume comparatively humane modes of execution and largely ignore considering obviously torturous or brutal killings of offenders; and it is assumed that some amount of punishment is merited for murderers.  The central question, then, is not often whether punishing murderers is morally justifiable (rather than rehabilitation or release, for example), but whether it is morally justifiable to punish by death (rather than by imprisonment, for example) those found to have committed a grave offense, such as murder.  Responses to this question about the death penalty often build on more general principles or theories about the purposes of punishment in general, and about general criteria for determining the proper measure or amount of punishment for various crimes.

Philosophers typically identify two broadly different ways of thinking about the moral merits of punishment in general, and about whether capital punishment is a proper amount of punishment to address serious criminal misconduct (see “Punishment”). Justifications are proposed either with reference to forward-looking considerations, such as various future effects or consequences of capital punishment, or with reference to backward-looking considerations, such as facets of the wrongdoing to be punished. The latter approach, when it dominates the justification, has been called ‘retributivism’ since the 1930s; retributivist justifications “look back” to the offense committed in order to link directly the amount, kind, or form of punishment to what the offense merits as penal response. This linkage is often characterized as whether a punishment “fits” the crime committed. For retributivists, any beneficial effects or consequences of capital punishment are wholly irrelevant or distinctly secondary. Forward-looking justifications of punishment have been labeled ‘utilitarian’ since the 19th century and, since the mid-20th century, other versions are sometimes called ‘consequentialist’. Consequentialist or utilitarian approaches to the death penalty are distinguished from retributivist approaches because the former rely only on assessing the future effects or consequences of capital punishment, such as crime prevention through deterrence and incapacitation.

2. Retributivist Approaches

Retributivists approach justifying the amount of punishment for misconduct by “looking back” to aspects of the wrongdoing committed.  There are many different versions of retributivism; all maintain a tight, essential link between the offense voluntarily committed and the amount, form, or kind of punishment justifiably threatened or imposed.  Future effects or consequences, if any, are then irrelevant or distinctly secondary considerations to justifying punishments for misconduct, including the death penalty.  Retributivism about capital punishment often prominently appeals to the principle of lex talionis, or “the law of retaliation,” an idea popularly familiarized in the ancient and biblical phrase, “an eye for an eye and a tooth for a tooth.”  Forms of retributivism vary according to their interpretation of lex talionis or in their appealing to alternative moral notions, such as basic moral rights or a principle of fairness.

a. Classic Retributivism: Kant and lex talionis

 A classic expression of retributivism about capital punishment can be found in a late 18th century treatise by Immanuel Kant, The Metaphysical Elements of Justice (99-107; Ak. 331-337).  After dismissing Cesare Beccaria’s abolitionist stance and reliance on “sympathetic sentimentality and an affectation of humanitarianism,” Kant appeals to an interpretation of lex talionis, what he calls “jus talionis” or “the Law of Retribution,” as justifying capital punishment:

Judicial punishment… must in all cases be imposed on him only on the ground that he committed a crime.… He must first be found deserving of punishment… The law concerning punishment is a categorical imperative. (100; Ak. 331) What kind and degree of punishment does public legal justice adopt as its principle and standard?  None other than the principle of equality….  Only the Law of Retribution (jus talionis) can determine exactly the kind and degree of punishment (101; Ak. 332).

Kant then explicitly applies these principles to determine the punishment for the most serious of crimes:

 If… he has committed a murder, he must die.  In this case, there is no substitute that will satisfy the requirements of legal justice. There is no sameness of kind between death and remaining alive even under the most miserable conditions, and consequently there is also no equality between the crime and retribution unless the criminal is judicially condemned and put to death (102; Ak. 333).

Kant then employs a hypothetical case to insist that any social effects of the death penalty, good or bad, are wholly irrelevant to its justification:

Even if a civil society were to dissolve… the last murderer in prison would first have to be executed so that each should receive his just deserts and that the people should not bear the guilt of a capital crime… [and] be regarded as accomplices in the public violation of justice (102; Ak. 333).

So, even if social effects are not possible, since the society no longer exists, the death penalty is justified for murder.  Kant exemplifies a pure retributivism about capital punishment: murderers must die for their offense, social consequences are wholly irrelevant, and the basis for linking the death penalty to the crime is “the Law of Retribution,” the ancient maxim, lex talionis, rooted in “the principle of equality.”

The key to Kant’s defense of capital punishment is “the principle of equality,” by which the proper, merited amount and kind of punishment is determined for crimes.  Whether the best interpretation of Kant or not, the idea behind this common approach seems to be that offenders must suffer a punishment equal to the victim’s suffering: “an eye for an eye, a tooth for a tooth,” a life for a life.  But as often noted, any literalism about lex talionis cannot work as a general principle linking crimes and punishments. It seems to imply that the merited punishment for rape is to be raped, for robbery to be stolen from, for fraud to be defrauded, for assault to be assaulted, for arson to be “burned out,” etc.  For other crimes—forgery, drug peddling, serial killings or massacres, terrorism, genocide, smuggling—it is not at all clear what kind or form of punishment lex talionis would then license or require (for example, Nathanson 72-75).  As C. L. Ten succinctly says, “it would appear that the single murder is one of the few cases in which the lex talionis can be applied literally” (151).  Both practical considerations and moral principles about permissible forms of punishment, then, ground objections to invoking a literal interpretation of lex talionis to justify capital punishment for murder.

Some retributivists apply a principle of equality less literally to justify death as the punishment for murder. The relevant equivalence is one of harms caused and suffered: the murder victim suffers the harm of a life ended, and the only equivalent harm to be imposed as punishment, then, must be the death of the killer. As a general way of linking kinds of misconduct and proper amounts, kinds, or forms of punishment, this rendition of lex talionis also faces challenges (Ten, 151-154). Furthermore, it is often noted that, even in the case of murder, there is no equivalence between the penal experience of capital offenders and their victims’ suffering in being murdered. Albert Camus, in his “Reflections on the Guillotine,” makes the point in a rather dramatic way:

But what is capital punishment if not the most premeditated of murders, to which no criminal act, no matter how calculated, can be compared?  If there were to be a real equivalence, the death penalty would have to be pronounced upon a criminal who had forewarned his victim of the very moment he would put him to a horrible death, and who, from that time on, had kept him confined at his own discretion for a period of months.  It is not in private life that one meets such monsters.  (199)

This inequality of experience claim is even more to the point since even Kant maintains that “the death of the criminal must be kept entirely free of any maltreatment that would make an abomination of the humanity residing in the person suffering it” (102; Ak. 333).

b. Lex talionis as a Principle of Proportionality

Most contemporary retributivists interpret lex talionis not as expressing equality of crimes and punishments, but as expressing a principle of proportionality for establishing the merited penal response to a crime such as murder.  The idea is that the amount of punishment merited is to be proportional to the seriousness of the offense, more serious offenses being punished more severely than less serious crimes.  So, one constructs an ordinal ranking of crimes according to their seriousness and then constructs a corresponding ranking of punishments according to their severity.  The least serious crime is then properly punished by the least severe penalty, the second least serious crime by the second least severe punishment, and so on.  The gravest misconduct, then, is properly addressed by the most severe of punishments, death.

To carry out such a general project of constructing scales of crimes and matching punishments is a daunting challenge, as even many retributivists admit.  Aside from these concerns, as a defense of capital punishment this approach to lex talionis simply begs the question about the morality of the death penalty, even for the most serious of crimes.  There is no reason to think that current capital punishment practices are the most severe punishment possible.  Consider medieval practices of death with torture, or death “with extreme prejudice”; and are there not conditions of confinement that are arguably more severe than execution, such as years of brutal, solitary confinement or excessively hard labor?  Such punishments would not likely now be on a list of morally permissible penal responses to even the most serious crimes.  But then what is needed is some justification for setting an upper bound of morally permissible severity for punishments, “a theory of permissibility” (Finkelstein, “A Contractarian Approach…,” 212-213).  But whether today’s death penalty is morally permissible is precisely the question at issue.  The retributivist proportionality interpretation of lex talionis simply assumes capital punishment is morally permissible, rather than offering a defense of it.

One general concern about appeals to lex talionis, under any interpretation, is that relying on “the law of retaliation” can appear to make capital punishment tantamount to justified vengeance.  But Kant and other retributivist defenders of the death penalty rightly distinguish principled retribution from vengeance.  Vengeance arises out of someone’s hatred, anger, or desires typically aimed at another:  there is no internal limit to the severity of the response, except perhaps that which flows from the personal perspective of the avenger.  The avenger’s response may be markedly disproportionate to the offense committed, whereas retributivists insist that the severity of punishments must be matched to the misconduct’s gravity.  Vengeance is typically personal, a response to an injury suffered by someone about whom the avenger cares.  Retribution requires responses even to injuries of people no one cares about:  its impersonality makes harms to the friendless as weighty as harms to the popular and justifies punishment without regard to whether anyone desires the offender suffer.  The avenger typically takes pleasure in the suffering of the offender, whereas “we may all deeply regret having to carry out the punishment” (Pojman, 23) or only take “pleasure at justice being done” (Nozick, 367) as a retributivist moral principle requires.  Even if desires for vengeance are satisfied by executing murderers, for retributivists such effects are not at the heart of the defense of capital punishment.  And to the extent that such satisfactions are taken to be sufficient justification, the defense is no longer retributivist, but utilitarian or consequentialist (see sections 3 and 4).  For retributivists the morality of the death penalty for murder is a matter of general moral principle, not assuaging any desires for revenge or vengeance on the part of victims or others.

c. Retributivism and the Right to Life

Some forms of retributivism about capital punishment eschew reliance on lex talionis in favor of other kinds of moral principles, and they typically depart from Kant’s conclusion that murderers must be punished by death, regardless of any consequences.  One approach employs the idea of basic moral rights, such as the right to life, an expression of the value of life that seems to work against justifying capital punishment.  Yet John Locke, for example, in his Second Treatise of Government, both posits a natural right to life and defends the death penalty for murderers.  Echoing a line of reasoning exhibited in Thomas Aquinas’s defense of capital punishment (Summa Theologiae II-II, Q. 64, a.2), Locke claims that a murderer violates another’s right to life, and thereby “declares himself… to be a noxious creature… and therefore may be destroyed as a lion or a tiger, one of those wild savage beasts… both to deter others from doing the like injury… and also to secure men from the attempts of a criminal” (Treatise, sections 10-11).  For Locke, murderers have, by their voluntary wrongdoing, forfeited their own right to life and can therefore be treated as beings not possessing any right to life at all and as subject to execution to effect some good for society.

This retributivist position notably departs from Kant’s extreme view in concluding only that a murderer may be put to death, not must be, and by invoking utilitarian thinking as a secondary consideration in deciding whether capital punishment is morally justified for murderers who have forfeited their right to life.  This form of retributivism—rights forfeiture and considering consequences of the death penalty—is also explicitly expressed by W. D. Ross in his 1930 book, The Right and the Good:

But to hold that the state has no duty of retributive punishment is not necessarily to adopt a utilitarian view of punishment.… [T]he main element in any one’s right to life or property is extinguished by his failure to respect the corresponding right in others.… [T]he offender, by violating the life or liberty or property of another, has lost his own right to have his life, liberty, or property respected, so that the state has no prima facie duty to spare him as it has a prima facie duty to spare the innocent.  It is morally at liberty to injure him as he has injured others, or to inflict any lesser injury on him, or to spare him, exactly as consideration both of the good of the community and of his own good requires. (60-61)

The retributivist argument, then, is that murderers forfeit their own right to life by virtue of voluntarily taking another’s life.  Since a right to life, like other rights, logically entails a correlative duty of others (see Consequentialism and Ethics, section 2b), by forfeiting their right to life murderers eliminate the state’s correlative duty not to kill them; the murderer’s forfeiture makes morally permissible the state’s putting them to death, at least as a means to some good.  Thus, capital punishment is not a violation of an offender’s right to life, as the offender has forfeited that right, and the death penalty is then justifiable as a morally permissible way to treat murderers in order to effect some good for society.

This kind of retributivist approach to capital punishment raises philosophic issues, aside from its reliance on empirical claims about the effects of the death penalty as a way to deter or incapacitate offenders (see section 3b). First, though the idea of forfeiting a right may be familiar, it leaves “troubling and unanswered questions: To whom is it forfeited? Can this right, once forfeited, ever be restored? If so, by whom, and under what conditions” (Bedau, “Capital Punishment,” 162-3)?  Second, given that the right to life is so fundamental to all rights and, as many maintain, held equally by each and all because they are humans, perhaps the right to life is exceptional or even unique in not being forfeitable at all: the right to life is actually a fundamental natural or human right.  One’s actions cannot and do not alter one’s status as a human being, Locke and Aquinas notwithstanding; thus, the right to life is inalienable and not forfeitable.  Even killers retain their right to life, the state remains bound by the correlative duty not to kill a murderer, and capital punishment, then, is a violation of the human right to life.

Developed in this way, as a matter of fundamental human rights, the merit of capital punishment becomes more about the moral standing of human beings and less about the logic and mobility of rights through forfeiture or alienation.  The point of a human right to life is that it “draws attention to the nature and value of persons, even those convicted of terrible crimes.… Whatever the criminal offense, the accused or convicted offender does not forfeit his rights and dignity as a person” (Bedau, “Reflections,” 152, 153).  This view reflects at least the spirit of the 1948 United Nations Universal Declaration of Human Rights: the right to life is universal, is rooted in each person’s dignity, and is inalienable (Preamble; Article 3).  But this view of offenders’ moral standing can be challenged if one considers the implication that, of equal standing with any of us, then, are masters of massacres or genocide (for example, Hitler, Stalin, Pol Pot), serial killers, terrorists, rampant rapists, and pedophiliac predators.  As one retributivist defender of capital punishment puts it, “though a popular dogma, the secular doctrine that all human beings have… worth is groundless.  The notion… [is] perhaps the most misused term in our moral vocabulary.… If humans do not possess some kind of intrinsic value… then why not rid ourselves of those who egregiously violate… our moral and legal codes” (Pojman, 35, 36).

d. Retributivism and Fairness

A recently revived retributivism about the death penalty builds not on individual rights, but on a notion of fairness in society.  Given a society with reasonably just rules of cooperation that bestow benefits and burdens on its members, misconduct takes unfair advantage of others, and punishment is thereby merited to address the advantage gained:

A person who violates the rules has something that others have—the benefits of the system—but by renouncing what others have assumed, the burdens of self-restraint, he has acquired an unfair advantage.  Matters are not even until this advantage is in some way erased….[P]unishing such individuals restores the equilibrium of benefits and burdens. (Morris 478)

The morally justified amount, kind, or form of punishment for a crime is then determined by an “unfair advantage principle”:

His crime consists only in the unfair advantage… [taken] by breaking the law in question. The greater the advantage, the greater the punishment should be.  The focus of the unfair advantage principle is on what the criminal gained.  (Davis 241)

In justifying an amount of punishment, then, an unfairness principle focuses on the advantage gained, whereas the lex talionis principle attends to the harm done to another (Davis 241).

The fairness approach to punishment reflects recent uses of “the principle of fairness” as a theory of political obligation:  those engaged in a mutually beneficial system of cooperation have a duty to obey the rules from which they benefit (Rawls, 108-114).  As applied to punishment, though, its roots run also to ancient notions of justice as re-establishing an equilibrium, to the treatment of justice in Aristotle’s Nicomachean Ethics as requiring state corrective action to rectify the imbalances created by criminal misconduct (Book V, Chapter 4), and to G.W.F. Hegel’s claim in The Philosophy of Right that to punish “is to annul the crime… and to restore the right” (69, 331n).  Today’s popular parlance that punishment is how offenders pay for their crimes can also be seen as their paying for unfair advantages gained.

As a general approach to justifying the amount of punishment merited for misconduct, the fairness approach initially appears to work best for petty theft or possibly “free-loading” in cooperative schemes, such as penalizing tax evasion.  In such cases one can perhaps see unfair advantage gained and see the amount of punishment as tied to what is unfairly gained.  But for violent crimes such as murder, the fairness approach seems less plausible.  How does lengthy incarceration or even execution erase the unfair advantage gained, annul the crime, or re-establish any prior balance between perpetrator and victim?  To the extent that punishment effects such things, it risks conflating retribution with restitution or restoration.  The unfair advantage principle also characterizes the wrong committed not in terms of its effects on a victim, but in terms of its effects on third parties, namely the society members who exercise self-restraint by obeying those norms the offender violates.  This puts the victim of criminal misconduct in an odd position, especially for violent crimes: the person assaulted or killed is not the focus in justifying the amount of punishment, but third parties’ burdens of self-restraint are.  Additionally, taken by itself, the unfair advantage approach to establishing the proper amount of punishment can also have some odd consequences, as Jeffrey Reiman rather colorfully suggests:

For example, it would seem that the value of the unfair advantage taken of law-obeyers by one who robs a great deal of money is greater than the value of the unfair advantage taken by a murderer, since the latter gets only the advantage of ridding his world of a nuisance while the former will be able to make a new life… and have money left over for other things.  This leads to the counterintuitive conclusion that such robbers should be punished more severely… than murderers.  (“Justice, Civilization,…,” note 10)

The death penalty for murder, then, would not obviously be morally justified if the general criterion for the amount of punishment is an unfair advantage principle.

A defense of the death penalty for murder has been proposed by employing another version of this general approach to punishment.  The key is seeing the kind of unfair advantage gained by a murderer.  As Reiman suggests in the spirit of Hegelian retributivism, the act of killing another disrupts “the relations appropriate to equally sovereign individuals;” it is “an assault on the sovereignty of an individual that temporarily places one person (the criminal) in a position of illegitimate sovereignty over another (the victim)”; then there is “the right to rectify this loss of standing relative to the criminal by meting out a punishment that reduces the criminal’s sovereignty to the degree to which she vaunted it above her victim’s” (“Why…,” 89-90).  So, if a murder is committed and a life taken, the idea is that the amount of permissible punishment is for the state, as the victim’s agent, to assert a supremacy over the criminal similar to that already asserted by the killer; and to do that it is permissible for the state to impose the death penalty for murder.  So, on this interpretation of the fairness principle, the death penalty for murder is morally justified, though, for other crimes, it may not be “easy or even always possible to figure out what penalties are equivalent to the harms imposed by offenders” (Reiman, “Why…,” 69-90, 93).  As with other forms of retributivism, the fairness approach, on either interpretation, faces the challenge of finding a plausible principle that both adequately addresses the merits of capital punishment for murder and generates a system of penalties that “fit” or are equivalent to various crimes.

e. Challenges to Retributivism

Retributivist approaches to capital punishment are many and varied.  But from even the small sample above, notable similarities are often cited as challenges for this way of thinking about the moral justification of punishment by death.   First, retributivism with respect to capital punishment either invokes principles that are plausible, if at all, only for death as penalty for murder; or it relies on principles met only with reasoned skepticism about their general adequacy for constructing a plausible scale matching various crimes with proper penal responses.

Second, retributivists presuppose that persons are responsible for any criminal misconduct for which they are to be punished, but actually instituting capital punishment confronts the reality of social conditions that challenge the presupposition of voluntariness and, in the case of the fairness approach, the presupposition of a reasonably just system of social cooperation (see section 5b).  Third, it is often argued that, in addressing the moral merits of capital punishment, retributivists ignore or make markedly secondary the causal consequences of the practice.  What if no benefits accrue to anyone from the practice of capital punishment?  What if capital punishment significantly increases the rate of murders or violent crimes?  What if the institution of capital punishment sometimes, often, or inevitably is arbitrary, capricious, discriminatory, or even mistaken in its selecting those to be punished by death (see section 5)?  These and other possible consequences of capital punishment seem relevant, even probative.  The challenge is that retributivists ignore or diminish their importance, perhaps defending or opposing the death penalty despite such effects and not because of them.

3. Utilitarian Approaches

A utilitarian approach to justifying capital punishment appeals only to the consequences or effects of death being the penalty for serious crimes, such as murder.  A utilitarian approach, then, is a kind of consequentialism and is often said to be “forward looking,” in contrast to retributivists’ “backward looking” approach.   More specifically, a utilitarian approach sees punishment by death as justified only if that amount of punishment for murder best promotes the total happiness, pleasure, or well-being of the society.  The idea is that the inherent pain and any negative effects of capital punishment must be exceeded by its beneficial effects, such as crime prevention through incapacitation and deterrence; and furthermore, the total effects of the death penalty—good and bad, for offender and everyone else—must be greater than the total effects of alternative penal responses to serious misconduct, such as long-term incarceration.   A utilitarian approach to capital punishment is inherently comparative in this way: it is essentially tied to the consequences of the practice being best for the total happiness of the society.  It follows, then, that a utilitarian approach relies on what are, in principle, empirical, causal claims about the total marginal effects of capital punishment on offenders and others.

a. Classic Utilitarian Approaches: Bentham, Beccaria, Mill

A classic utilitarian approach to punishment is that of Jeremy Bentham.  In chapters XIII and XIV of his lengthy work, An Introduction to the Principles of Morals and Legislation, first published in 1789, Bentham addresses the appropriate amount of punishment for offenses, or, as he puts it, “the proportion between punishments and offences.”  He begins with some fundamental features of a utilitarian approach to such issues:

The general object which all laws have, or ought to have in common, is to augment the total happiness of the community.… But all punishment is mischief: all punishment in itself is evil.  Upon the principle of utility, if it ought at all to be admitted, it ought only to be admitted in as far as it promises to exclude some greater evil.  (XIII. I, ii.)

Bentham continues by noting the importance of attending to “the ends of punishment”:

The immediate principal end of punishment is to control action.… [T]hat of the offender it controls by its influence… on his will, in which case it is said to operate in the way of reformation;  or on his physical power, in which case it is said to operate by disablement: that of others it can influence no otherwise than by its influence over their wills; in which case it is said to operate in the way of example. (XIII. ii. fn. 1)

So, there are three major ends of punishment related to controlling people’s action in ways promoting the total happiness of the community through crime reduction or prevention: reformation of the offender, disablement (that is, incapacitation) of the offender, and deterrence (that is, setting an example for others).   Of these three ends of punishment, Bentham says “example” – or deterrence – “is the most important end of all.” (XIII. ii. fn 1).  Since “all punishment is mischief [and] an evil,” any amount of punishment, then, is justified only if that mischief is exceeded by the penalty’s good effects, and, most importantly for Bentham, only if the punishment reduces crime by deterring others from misconduct and does so better than less painful punishments.  In other writings, Bentham explicitly applies his utilitarian approach to capital punishment, first allowing its possible justification for aggravated murder, particularly when the “effect may be the destruction of numbers” of people, and then, years later and late in life, calling for its complete abolition (Bedau, “Bentham’s Utilitarian Critique…”).

In his own writing about law, Bentham notably praises and acknowledges Cesare Beccaria’s On Crimes and Punishments, its utilitarian approach to penal reform, and its call for abolishing capital punishment. Beccaria called for abolition of the death penalty largely by appealing to its comparative inefficacy in reducing the crime rate.  In Chapter XII of his essay, Beccaria says the general aim of punishment is deterrence and that should govern the amount of punishment to be assigned crimes:

The purpose of punishment… is nothing other than to dissuade the criminal from doing fresh harm to his compatriots and to keep other people from doing the same.  Therefore, punishments and the method of inflicting them should be chosen that… will make the most effective and lasting impression on men’s minds and inflict the least torment on the body of the criminal. (23; Ch. XII)

He then argues that “capital punishment is neither useful nor necessary” in comparison to the general deterrent effects of lengthy prison sentences:

[T]here is no one who, on reflection, would choose the total and permanent loss of his own liberty, no matter how advantageous a crime might be.  Therefore, the intensity of a sentence of servitude for life, substituted for the death penalty, has everything needed to deter the most determined spirit.… With capital punishment, one crime is required for each example offered to the nation; with the penalty of a lifetime at hard labor, a single crime affords a host of lasting examples (49-50, 51; Ch. XXVIII).

The idea here is that an execution is a single, severe event, perhaps not long remembered by others, whereas life imprisonment provides a continuing reminder of the punishment for misconduct.  In general, Beccaria says, “[i]t is not the severity of punishment that has the greatest impact on the human mind, but rather its duration, for our sensibility is more easily and surely stimulated by tiny repeated impressions than by a strong but temporary movement” (49; Ch. XXVIII).

Beccaria adds to this thinking at least two claims about some bad social effects of capital punishment: first, for many the death penalty becomes a spectacle, and for some it evokes pity for the offender rather than the fear of execution needed for effective deterrence of criminal misconduct (49; Ch. XXVIII).  Second, “capital punishment is not useful because of the example of cruelty which it gives to men.… [T]he laws that moderate men’s conduct ought not to augment the cruel example, which is all the more pernicious because judicial execution is carried out methodically and formally” (51; Ch. XXVIII).  Thus, Beccaria opposes capital punishment by employing utilitarian thinking: the primary benefit of deterrence is better achieved through an alternative penal response of “a lifetime at hard labor,” and, furthermore, the cruelty of the death penalty affects society in ways much later called “the brutalization effect.”

Another major utilitarian, John Stuart Mill, also exemplifies distinctive facets of a utilitarian approach, but in defense of capital punishment.  In an 1868 speech as a Member of Parliament, Mill argues that capital punishment is justified as penalty for “atrocious cases” of aggravated murder (“Speech…,” 268).  Mill maintains that the “short pang of a rapid death” is, in actuality, far less cruel than “a long life in the hardest and most monotonous toil… debarred from all pleasant sights and sounds, and cut off from all earthly hope” (“Speech…,” 268).  As Sorell succinctly summarizes Mill’s position, “hard labor for life is really a more severe punishment than it seems, while the death penalty seems more severe than it is” (“Aggravated Murder…,” 204).  Since the deterrent effect of a punishment depends far more on what it seems than what it is, capital punishment is the better deterrent of others while also involving less pain and suffering for the offender.  Such a combination “is among the strongest recommendations a punishment can have” (Mill, “Speech…,” 269). And so, Mill says, “I defend [the death penalty] when confined to atrocious cases… as beyond comparison the least cruel mode in which it is possible adequately to deter from the crime” (“Speech…,” 268).

b. Empirical Considerations: Incapacitation, Deterrence

A utilitarian approach to capital punishment depends essentially on the actual causal effects of the practice, that is, on whether the death penalty is effective in incapacitating or deterring potential offenders.  If it does not effect these ends better than penal alternatives such as lengthy incarceration, then capital punishment is not justified on utilitarian grounds.  In principle, at least, the comparative efficacy of capital punishment is therefore an empirical issue.

A number of social scientific studies have been conducted in search of conclusions about the effects of capital punishment, at least in America.  With respect to the end of incapacitation, any crime prevention benefit of executing murderers depends on recidivism rates, that is, the likelihood that murderers again kill.  Recent studies of convicted murderers—death row inmates not executed, prison homicides, parolees, and released murderers—indicate that the recidivism rate is quite low, but not zero: a small percentage of murderers kill again, either in prison or upon release (Bedau, The Death Penalty, 162-182).  These crimes, of course, would not have occurred were capital punishment imposed, and, so, the death penalty does prevent commission of some serious crimes.  On the other hand, for a utilitarian, these benefits of incapacitation through execution must exceed those for possible punitive alternatives.  The data reflects recidivism rates under current practices, not other possible alternatives.  If, for example, pardons and commutations were eliminated for capital crimes, if atrocious crimes were punished by a life sentence without any possibility of parole, or if conditions of confinement were such that prison murders were not possible (for example, shackled, solitary confinement for life), then the recidivism rate might approach or be zero.  One issue, then, is how high or low a recidivism rate decides the justificatory issue for capital punishment.  Another issue is the moral permissibility of establishing conditions of confinement so restrictive that even murders in prison are reduced to nearly zero.

Since the mid-twentieth century, a number of empirical studies have been conducted in America in order to assess the deterrent effects of capital punishment in comparison to those of life imprisonment.  Scholars analyzed decades of data to compare jurisdictions with and without the death penalty, as well as the effects before and after a jurisdiction abolished or instituted capital punishment.  Such analyses “do not support the deterrence argument regarding capital punishment and homicide” (Bailey, 140).  A sophisticated statistical study published in the mid-1970s claimed to show that each execution deterred seven to eight murders.  This exceptional study and its methodology have been much criticized (Bailey, 141-143).  Additional, more recent studies and analyses have “failed to produce evidence of a marginal deterrent effect for capital punishment” (Bailey, 155).  As indicated by Jeffrey Reiman’s succinct summary and the numerous literature surveys he cites (“Why…” 100-102), nearly all relevant experts claim there is no conclusive evidence that capital punishment deters murder better than substantial prison sentences.

Determining the deterrent effects of capital punishment does present significant epistemic challenges.  First, in comparative studies of jurisdictions with and without the death penalty, “there simply are too many variables to be controlled for, including socio-economic conditions, genetic make-up,” demographic factors (for example, age, population densities), varying facets of law enforcement, etc.  (Pojman, 139). Numerous variables may or may not explain the data attempting to link crime rates and the death penalty in different places or times (Pojman, 139). Second, as Beccaria notes, for example, deterrent effects plausibly depend importantly on the certainty, speed, and public nature of penal responses to criminal conduct.  These factors have not been much evident in recent capital punishment practices in America, which may explain the lack of evidence revealed by recent statistical studies.  Third, deterrence is a causal concept:  the idea is that potential murderers do not kill because of the death penalty.  So, the challenges are to measure what does not occur (murders) and to establish what causes the omission (the death penalty).  The latter element is even more challenging to measure because most who do not murder refrain out of habit, character, religious beliefs, lack of opportunity, etc., that is, for reasons other than any perceived threat or fear of execution by the state.  Deterrence studies, then, attempt to establish empirically a causal relationship for a small minority of people and omitted homicides within a death penalty jurisdiction.  Finally, there are disagreements about the importance of the studies’ conclusions.  For example, abolitionists typically hold that the failure, despite numerous attempts, to provide conclusive evidence strongly suggests there is no such effect: the death penalty, in fact, does not deter.  Defenders of capital punishment are inclined to interpret the empirical studies as being inconclusive: it remains an open question whether the death penalty deters sufficiently to justify it.  And all this is further complicated by the fact that some studies focus on the effects of capital statutes and others look for links between actual executions and crime rates.

c. Utilitarian Defenses: “Common Sense” and “Best Bet”

Regardless of the outcomes or probative value of statistical studies, justifying capital punishment on grounds of deterrence may still have merit.  It would seem, some maintain, that “common sense” supports the notion that the death penalty deters.  The deterrence justification of capital punishment presupposes a model of calculating, deliberative rationality for potential murderers.  What people cherish most is life; what they most fear is being killed.  So, given a choice between life in prison and execution by the state, most people much prefer life and therefore will refrain from misconduct for which death is the punishment.  In short, “common sense” suggests that capital punishment does deter.  But this kind of appeal to “common sense” ignores the essentially comparative aspect of appeals to deterrence as justification: though capital punishment may deter, it may not deter any more (or significantly more) than a long life in prison. We cannot equate “what is most feared” with “what most effectively deters” (Conway, 435-436; Reiman, “Why…,” 102-106).

Another way of looking at capital punishment in terms of deterrence relies on making the best decision under conditions of uncertainty.  Given that the empirical evidence does not definitively preclude that capital punishment is a superior deterrent, “the best bet” is to employ the death penalty for serious crimes such as murder.  If capital punishment is not, in fact, a superior deterrent, then some murderers have been unnecessarily executed by the state; if, on the other hand, death is not a possible punishment for murder and capital punishment is, in fact, a superior deterrent, then some preventable killings of innocent persons would occur.  Given the greater value of innocent lives, the less risky, better option justifies capital punishment on grounds of deterrence. But the argument crucially depends on comparative risk assessments: if there is capital punishment, then certainly some murderers will be killed, whereas without the death penalty there is only a remote chance that more innocent lives would be victims of murder (Conway, 436-443).  Furthermore, the argument openly assumes that not all lives are equal—those of the innocent are not to be risked as much as those who have murdered—and that, for some, is a fundamental moral issue at stake in justifying capital punishment (see section 2c; Pojman, 35-36).

d. Challenges to Utilitarianism

Utilitarian approaches to justifying punishment are controversial and problematic, perhaps most often with respect to possibly justifying punishment of the innocent as a means to preventing crime and promoting the total happiness of a society. Even ignoring this issue and focusing only on justifying the proper amount of punishment for the guilty and the death penalty, in particular, there are concerns to be considered about a utilitarian approach. The objection is that a utilitarian approach to the death penalty relies on a suspect general criterion, deterrence, for establishing the proper amount of punishment for crimes. It is often argued that, for purposes of crime prevention through deterrence, a utilitarian is committed, at least in principle, to excessively severe punishments, such as torturous and gruesome executions in public even for crimes much less serious than murder (for example, Ten, 34-35, 143-145). The idea is that the pain of excessively severe and public punishments for minor crimes is more than counterbalanced by a significant reduction in the crime rate. It is also argued that significant crime rate reductions could perhaps be achieved, in some circumstances, by disproportionately minor punishments: if fines, light prison sentences, or even fake executions could deter as well as actual ones, then a utilitarian is committed to disproportionately mild penalties for grave crimes. Utilitarians respond to such possibilities by indicating additional considerations relevant to calculating the total costs of such disproportionate punishments, while critics continue creating even more elaborate, fantastic counterexamples designed to show the utilitarian approach cannot always avoid questions about the upper or lower limits of morally permissible penal responses to misconduct. As C. L. Ten summarizes succinctly, a utilitarian approach to establishing a proper amount of punishment is “inadequate to account for both the strength of the commitment to the maintenance of a proportion between crime and punishment, and [to] the great reluctance to depart… from that proportion when required to so do by purely aggregative consequential considerations” (146).

Another common criticism of the utilitarian approach points to the very structure of justifications rooted in deterrence. As evident in Bentham’s classic statements, for example, the purpose of punishment “is to control action,” primarily through deterrence (see section 3a). Punishments deter and “control action” by example, by the demonstration to others that they, too, will suffer similarly should they similarly misbehave. Capital punishment, then, aims to deter actions of potential killers by inflicting death on actual ones: the technique works by threat, by instilling fear in others. A fundamental objection to this way of thinking is that, in effect, persons are being used as a means to controlling others’ actions; capital offenders are being used simply as a means to deter others and reduce the crime rate. Such a use of persons is morally impermissible, it is argued, echoing Immanuel Kant’s famous categorical imperative against treating any person merely as a means to an end. No gain in deterrence, incapacitation, or other beneficial effects can justify deliberately killing a captive human being as a means to even such desirable ends as deterring others from committing grave crimes. The argument, then, is that justifying capital punishment on grounds of deterrence is a morally impermissible way to treat persons, even those found to have committed atrocious crimes.

e. Other Consequential Considerations

In discussions of capital punishment, it is deterrence that receives much of the attention from those exploring a utilitarian approach to the moral justification of the practice. There are, however, other significant consequences of the death penalty that are relevant, as noted even by classic utilitarians. Beccaria, for example, asserts a brutalization effect on society: executions are cruel and are examples to others of the state’s cruelty. The suggestion seems to be that capital punishment increases people’s tolerance for another’s suffering, their callousness about human suffering, their willingness to impose suffering on another, and even the rate of violent crimes (for example, assaults or homicides). In contrast, one recent writer who defends the death penalty in principle, Jeffrey Reiman, argues that, for some developed societies, abolition of capital punishment for serious crimes shows restraint and thereby actually advances civilization by reducing our tolerance for others’ suffering. Such claims are, in principle, empirical ones about the causal effects of the practice of capital punishment. As with recent deterrence studies, there is no clear empirical evidence of any brutalizing or civilizing effects of capital punishment.

For classic utilitarian thinking, another important consequence of punishment is its effect on the offender.   According to Jeremy Bentham, one of the three ends of punishment is reform of the offender through “its influence on his will” (XIII.ii. fn. 1).  This penal aim of reform (or rehabilitation) may suggest capital punishment is not justifiable for any crime.  But that need not be the case.  The ancient Roman Stoic Seneca, for example, argues that proper punishment for criminal misconduct depends on its “power to improve the life of the defendant” (Nussbaum, 103).   But he also defends capital punishment as a kind of merciful euthanasia: execution is “in the interest of the punished, given that a shorter bad life is better than a longer one” (Nussbaum, 103, note 43).  Plato also defends capital punishment by looking to its impact on the offender.  In his later works and as part of a general theory of penology, Plato maintains that the primary penal purpose is reform—to “cure” offenders, as he says.  For crimes that show offenders are “incurable,” Plato argues execution is justifiable.  In his late work, The Laws, Plato explicitly prescribes capital punishment for a wide range of offenses, such as deliberate murder, wounding a family member with the intent to kill, theft from temples or public property, taking bribes, and waging private war, among others (MacKenzie; Stalley).  In a utilitarian approach to capital punishment, then, attending to the end of reforming offenders need not be irrelevant to possible moral justifications of the death penalty.

4. Capital Punishment as Communication

A cluster of distinctive approaches to issues of justifying punishment and, at least by implication, the death penalty, are united by taking seriously the idea of punishment as expression or communication.  Often called “the expressive theory of punishment,” such approaches to punishment are sometimes classified as utilitarian or consequentialist, sometimes as retributivist, and sometimes as neither.  The root idea is that punishment is more than “the infliction of hard treatment” by an authority for prior misconduct; it is also “a conventional device for the expression of attitudes of resentment and indignation, and of judgments of disapproval and reprobation….  Punishment, in short, has a symbolic significance” (Feinberg, “The Expressive Function…,” 98).  Hard treatment, deprivations, incarceration, or even death can be, and perhaps are, vehicles by which messages are communicated by the community.  To see capital punishment as a deterrent is to see it as communicative:  the death penalty communicates to the community—at least potential killers—that murder is a serious wrong and that execution awaits those who kill others.  Various developments of punishment as communication, though, attend to other messages expressed, some emphasizing the sender and others the recipient of the message.

One version of this kind of approach emphasizes that, with capital punishment, a community is expressing strong disapproval or condemnation of the misconduct. Sometimes called “the denunciation theory,” the basic contention is evident in James Fitzjames Stephen’s late 19th-century work, Liberty, Equality, Fraternity (a reply to J.S. Mill’s On Liberty), as well as in the oft-quoted remarks of Lord Denning recorded in the 1953 Report of the Royal Commission on Capital Punishment:

The punishment for grave crimes should adequately reflect the revulsion felt by the great majority of citizens for them. It is a mistake to consider the object of punishment as being deterrent or reformative or preventive and nothing else.… The ultimate justification of any punishment is not that it is a deterrent but that it is the emphatic denunciation by the community of a crime; and from this point of view, there are some murders which, in the… public opinion, demand the most emphatic denunciation of all, namely the death penalty. (As quoted in Hart, “Punishment…,” 170)

In the United States, Supreme Court decisions in death penalty cases have more than once employed such reasoning:  a stable, ordered society is better promoted by capital punishment practices than risking “the anarchy of self-help, vigilante justice, and lynch law” as ways of expressing communal outrage (Justice Stewart, in Furman v. Georgia (1972), as quoted in Gregg v. Georgia (1976)).

As a defense of capital punishment, at least, this “denunciation theory” leaves multiple questions not adequately addressed. First, the approach presupposes some moral merit to popular sentiments of indignation, outrage, anger, condemnation, even vengeance or vindictiveness in response to serious misconduct. There are significant differences between expressing such emotions and punishing justly or morally (see section 2b). Second, the structure of the thinking seems entirely consequentialist or utilitarian: capital punishment is justified as an effective means to communicate condemnation, or to satisfy others’ desires to see someone suffer for the crime, or as an outlet for strong, aggressive feelings that otherwise are expressed in socially disruptive ways. Such utilitarian reasoning would seem to justify executing pedophiles or even innocent persons in order to communicate condemnation or avoid an “anarchy of self-help, vigilante justice, and lynch law.” On the other hand, even Jeremy Bentham argues that “no punishment ought to be allotted merely to this purpose” because such widespread satisfactions or pleasures cannot ever “be equivalent to the pain… produced by punishment” (Bentham XIII. ii. fn. 1). Third, it leaves unanswered why the expression of communal outrage, even if morally warranted, is best or only accomplished through capital punishment. Why would not harsh confinement for life serve as well any desirable expressive, cathartic function? Or on what grounds are executions not to be conducted in ways torturous and prolonged, even publicly, as means of better communicating denunciation and expressing society’s outrage about the offenders’ misconduct? And does not the death penalty also express or communicate other, conflicting messages about, for example, the value of life? As a justification of capital punishment, even for the most heinous of crimes, a “denunciation theory” faces significant challenges.

Other uses of the idea of punishment as communication focus not on the sender of the message, but on the good of the intended recipient, the offender. Punishment is paternalistic in purpose: it aims to effect some beneficial change in the offender through effective communication. In Philosophical Explanations Robert Nozick, for example, holds that punishment is essentially “an act of communicative behavior” and the “message is: this is how wrong what you did was” (370). Wrongdoers have “become disconnected from correct values, and the purpose of punishment is to (re)connect him” (374). The justified amount of punishment, then, is tied to the magnitude of the wrong committed (363): “for the most serious flouting of the most important values… capital punishment is a response of equal magnitude” (377). But, Nozick maintains, the aim of punishment is not to have an effect on the offender, but “for an effect in the wrongdoer: recognition of the correct value, internalizing it for future action—a transformation in him” (374-5). This paternalistic end seems to preclude the death penalty being imposed for any kind of wrongdoing; however, in “truly monstrous cases” (for example, Adolf Hitler, genocides) there seems to be perhaps the highest magnitude of wrong, a disconnection from the most basic values, and acts worthy of the most emphatic penal expression possible. As Nozick himself admits and others have noted, this approach to punishment as communication provides “no clear stable conclusion… on the issue of an institution of capital punishment” (378).

Some employing a similar reliance on punishment as communication are less ambivalent about its implications for the death penalty. The “moral education theory of punishment,” its proponent maintains, precludes “cruel and disfiguring punishments such as torture or maiming,” as well as “rules out execution as punishment” (Hampton, 223). This argument for death penalty abolition takes seriously the expressive, communicative function of punishments: as aiming to effect significant benefits in and for the offender and, through general deterrence and in other ways, as “teaching the public at large the moral reasons for choosing not to perform an offense” (Hampton, 213). Punishment as education is not a conditioning program; it addresses autonomous beings, and the moral good aimed at is persons freely choosing attachment to that which is good. Executing criminals, then, seems to require judging them as having “lost all their essential humanity, making them wild beasts or prey on a community that must, to survive, destroy them” (Hampton, 223). Furthermore, it is argued, capital punishment conveys multiple messages, for example, about the value of a human life; and, it is argued, since one can never be certain in identifying the truly incorrigible, the death penalty is morally unjustified in all cases. As R.A. Duff puts the abolitionist point in Punishment, Communication, and Community (2001), “punishment should be understood as a species of secular penance that aims not just to communicate censure but thereby to persuade offenders to repentance, self-reform, and reconciliation” (xvii-xix).

Approaches to capital punishment as paternalistic communication are challenged on several grounds. First, as a general theory of punishment, such expressive theories posit an extraordinarily optimistic view of offenders as open to the message that penal experiences aim to convey. Are there not some offenders who will not be open to moral education, to hearing the message expressed through their penal experiences? Are there not some offenders who are incorrigible? On these approaches to capital punishment, the reasons against executing serious offenders are essentially empirical ones about the communicative effects on the public of executions or the limits of diagnostic capabilities in identifying the truly incorrigible. Second, with respect to capital punishment, perhaps for some offenders, the experience of trial, sentencing, and awaiting execution does successfully communicate and effect reform in the offender, with the death penalty then imposed to affirm that which effected the beneficial reform in the offender. Third, as with other approaches to punishment, the moral education theory renders it extremely difficult, if not impossible, to “fashion a tidy punishment table” pairing kinds of misconduct and merited penalties (Hampton, 228). Focusing on reforming or educating a recipient of a message suggests very individualistic and situational sentencing guidelines. Not only may this be impractical, but such discretion in sentencing risks caprice or arbitrariness in punishing offenders by death or in other ways (see section 5); and it challenges the fundamental, formal principle of justice, that is, that like cases be treated alike. Finally, the implications of these approaches to punishment are quite at odds with the system of incarceration employed so universally for so many offenders. The implications of punishment as communication aimed at the offender would require radical revisions of current penal practices, as some proponents readily admit.

5. The Institution of Capital Punishment

Much philosophic focus on punishment and the death penalty has been rooted in theoretical questions and principles. A result is that philosophers have mostly ignored more practical matters and moral facets of the institution of capital punishment. That historical tendency began to change in the mid-twentieth century with a decidedly American concern: whether the practice of capital punishment is legally permissible, given the United States Constitution’s Eighth Amendment prohibition of “cruel and unusual punishments.” Scholars and lawyers investigated the history and continuing death penalty practices in America, producing evidence of racial discrimination in the institution of capital punishment, especially in southern states. By the early 1970s, a series of United States Supreme Court decisions established especially elaborate criminal procedures to be followed in capital cases: bifurcated trials (one for conviction and one for establishing the sentence), a finding of at least one aggravator for a murder to be a capital crime, automatic appellate review of all sentences to death, guidelines for jury selections, etc. The aim of such “super due process” is to improve criminal procedures employed in capital cases so as to avoid arbitrariness in administering the death penalty in America (Radin).

After implementation of these Court-mandated procedures for death penalty cases, a number of empirical studies indicated continuing concerns and problems with the practice of capital punishment in America.  For example, studies of capital cases conducted in some southern states showed that disproportionately large numbers of convicted murderers received death sentences if they were black, a disproportion even greater when the convicted murderer was black and the victim was white (Bedau, The Death Penalty, 268-274).   Also, especially with the advent of new, scientific sources of evidence (for example, DNA matching), studies suggest that numbers of persons innocent of any crime have been wrongly convicted, sentenced, and even executed for committing a capital crime (Bedau, The Death Penalty, 344-360).   Morally justifying punishment in theory is distinguishable from whether it is justified in practice, given extant conditions.  For some, even though questions of theory and practice are distinguishable, they may not be unrelated. As Stephen Nathanson asks, “does it matter if the death penalty is arbitrarily administered?”

a. Procedural Issues: Imperfect Justice

Moral arguments about the death penalty based on procedural issues attend to the outcomes and steps of a long and involved process “as a person goes the road from freedom to electric chair” (Black, 22). Such a process involves an “entire series of decisions made by the legal system”: whether to arrest; what criminal charges to file; decisions about plea bargaining offers, if any; a criminal trial, with jury selection, countless tactical decisions, possible employment of a defense like insanity; sentencing that requires juries find and weigh statutory factors of aggravation and mitigation; post-conviction appeals and possible remedies decided; clemency decisions, to commute a sentence or even pardon the convicted (Black, 22-26). It is apparent, then, “that the choice of death as the penalty is the result of not just one choice… but of a number of choices, starting with the prosecutor’s choice of a charge, and ending with the choice of the authority… charged with the administration of clemency” (Black, 27). At each one of these points of decisions, it is argued, there is room for arbitrariness, mistakes, even discrimination. Furthermore, it is both impossible and undesirable to remove all latitude, all discretion, since some discretion is needed for each of these decisions to be properly made in light of the particularities of the case, person, and situation. And so, the institution of capital punishment, even as practiced in America, brings along with it “the inevitability of caprice and mistake” (Black).

A criminal trial and, more broadly, criminal procedures in toto are exemplars of what John Rawls, in A Theory of Justice, characterizes as imperfect procedural justice.   There is an independently defined standard external to the procedure by which we judge outcomes of the process; and there is no procedure “that is sure to give the desired outcome” (Rawls 74-75).  For criminal procedures, the aim is “to impose deprivations on all and only guilty convicted offenders because of their wrongdoing”; and for capital punishment, the aim is to impose the death penalty on all and only those guilty of committing crimes for which the merited amount of punishment is execution (Bedau, Reflections 173).  In capital procedures, too, it is “impossible to design the legal rules so that they always lead to the correct result” (Rawls, 75).  Whether due to inherent vagaries of legal language, the necessity of discretion to judge properly complex, particular cases, the fallibility of human beings, or political pressures and other factors affecting decisions made within the system, such as clemency, the risk of error is not eliminable for the institution of capital punishment.  Given unavoidably imperfect criminal justice procedures, at issue, then, is the moral import of any arbitrariness, caprice, mistake, or discrimination in the institution of capital punishment.

The appeal to procedural imperfections is often employed by those who oppose capital punishment and seek its complete abolition on the grounds that its institution is intolerably arbitrary, capricious, or discriminatory in selecting who lives and who dies. This abolitionist reasoning is challenged in various ways. First, given the fact that there are imperfections in the system or practice of capital punishment, what follows is not abolition of the death penalty, but justification only for procedural improvements in order to reduce problematic outcomes. A second issue, aside from disputes about the actual frequency of problematic outcomes, is a question of thresholds: how many imperfect outcomes are tolerable in the institution of capital punishment? Abolitionists tend to have near-zero tolerance, whereas some defenders of capital punishment argue that some arbitrariness is acceptable. For a utilitarian approach to capital punishment, assessing the total consequences (benefits and “costs”) of the death penalty must include the inevitable arbitrariness of its institution. And inasmuch as any deterrent effects are linked to certainty of punishment, any degree of arbitrariness in administering capital punishment does affect a central utilitarian consideration in determining whether the institution is morally justified. For retributivist approaches, the question is whether some arbitrariness in the institution violates requisite pre-conditions for morally justifying the institution of capital punishment (see section 2c). Jeffrey Reiman, for example, argues, on retributivist grounds, that capital punishment is justified in principle; however, “the death penalty in… America is unjust in practice,” and he therefore favors abolition (see section 5b).

A third issue for appeals to procedural imperfections involves limiting the scope of the argument for abolition.   Since all criminal cases are administered through unavoidably imperfect procedures, if arbitrariness justifies abolishing the death penalty for murder, then it would seem also to justify abolishing lesser punishments for less serious criminal misconduct.  In short, the imperfect administration of capital punishment matters morally only if the death penalty is distinctive among punishments.  Punishment by death is often said to be distinctive because, unlike incarceration, death is irrevocable.  But years spent imprisoned, for example, can also not be revoked, once they have been endured.  The idea must be that incarceration, if found to be mistaken, can be ceased: by executive or judicial action the imprisoned can be released and receive remedies, even if only gestures.   On the other hand, a death sentence, once executed, has none of those qualities: death is permanent; punishment by death has finality.  “Because of the finality and the extreme severity of the death penalty, we need to be more scrupulous in applying it as punishment than is necessary with any other punishment” (Nathanson, Eye, 67).

Another major issue involves distinguishing the kinds of imperfect outcomes resulting from the criminal procedures employed in capital cases. For example, the arbitrariness evident in the procedures may be one of selectivity: among all the convicted killers who merit a death sentence, some of those are actually sentenced or executed and others are not. As Ernest van den Haag argues, that some who merit the death penalty escape that punishment does not make morally unjustified selectively executing some who do merit that punishment (Nathanson, 49). Analogies with selective ticketing for excessive speed support this kind of reasoning: justice is a matter of each individual being treated as they merit, without regard to how other, similar cases are treated. But this argument makes what is just or justified entirely non-comparative, when substantive comparative considerations often are also necessary when arbitrariness or discrimination is at issue (Feinberg, “Noncomparative Justice,” 265-269). Justice requires treating similar cases in similar ways, and this kind of arbitrary imposition of the death penalty violates that requirement. Furthermore, it may matter morally what are the grounds of selecting only some convicted killers to receive death sentences or to be executed. If the selectivity is based on race, for example, then the moral import of the arbitrariness might be far greater, whether for traffic tickets or the death penalty for murder. Aside from the moral import of arbitrariness as selectivity, there is also an arbitrariness that issues in mistakes, where persons who did not commit a capital crime (or perhaps did not commit any crime at all) are wrongly convicted, sentenced and executed. This sort of imperfect outcome would seem far more problematic morally than the selective execution of only some of those who merit the death penalty. As Stephen Nathanson states it with respect to executing the innocent, “this is the moral force of the argument from arbitrary judgment” (Eye, 53).

b. Discrimination: Race, Class

Criminal justice systems that administer the death penalty operate in the context of a society that may or may not itself be entirely just. The procedures employed in capital cases, then, can be imperfect due to external social factors affecting their outcomes, and not only due to features internal to the structure of a legal system itself. Various sources of data suggest to many that American criminal justice procedures produce disproportionately large numbers of capital convictions and death sentences for the poor and for African-Americans. In short, it is claimed, the institution of capital punishment is imperfect, capricious, or arbitrary in a particular way: it discriminates on the basis of economic class and race. Poverty and race, it is argued, have “warping effects” on the long, involved process whereby “a person goes the road from freedom to electric chair” (Black, 22). At numerous decision points, a lack of funds affects how the process proceeds for a poor person charged with a capital crime: the quality of legal counsel for plea bargaining, investigation, and conduct of a trial; financial resources needed to build a strong evidentiary case through crime scene investigation, forensic testing, and expert testimony at trial; money for background investigations, professional examinations, and expert testimony in the crucial sentencing phase of a capital trial; securing attorneys for legally required and elective appeals; accessing those political offices and officers with the legally unlimited authority to commute a sentence or even pardon a convicted offender. Given the high correlation in America between poverty and race, any disproportionate outcomes with respect to economic class parallel those with respect to race. Also, as described above, the “entire series of decisions made by the legal system” in capital cases provides numerous opportunities for unconscious racial bias or blatant discrimination in the exercise of discretion by those administering the process. Opponents of the death penalty, then, see factors of race and poverty as increasing the likelihood of error in capital cases, and see such discriminatory outcomes as especially problematic from a moral point of view.

This line of reasoning invokes the specter of discrimination in the institution of capital punishment. The basic empirical claim is that, by race and economic class, America’s imperfect procedures produce disproportionate outcomes. The issue is not necessarily one of intentional racial discrimination, though that may occur, as well. Considerations of perhaps unintended discriminatory outcomes, however, need not support abolition of the death penalty. Aside from disputes about the data supporting the basic empirical claim of disproportionate outcomes, responses parallel those reviewed above with respect to the internal structures of criminal justice procedures in capital cases (see section 5a). In particular, it is argued that disproportionate outcomes support reforms to mitigate such discrimination, such as quality legal representation being provided for the poor, increased budgetary allocations for defense of the indigent in capital cases, etc. And given that what explains the disproportionate outcomes are social conditions external to the process itself, it would seem that discriminatory outcomes are not inevitable in the way that the effects of ineliminable discretion might be. The issue, then, becomes the moral import of problematic social conditions that “warp” the institution of capital punishment. How does such “warping” affect any justification of the death penalty? Does it matter morally that the institution of capital punishment exists amidst a society insufficiently just regarding matters of economic class or race?

For a utilitarian approach to capital punishment, the issue is addressed in terms of total consequences for the society.  As with other kinds of arbitrariness previously reviewed, any discriminatory outcomes of the institution of capital punishment are part of the total cost of the practice and are to be considered along with all other costs and benefits.  Depending on the causal consequences of the practice in a society at a given time, then, capital punishment is or is not morally justified.  For some retributivists, however, the relevance of current social conditions can be quite different for whether capital punishment is morally justified.  For example, the fairness approach to punishment and the death penalty presupposes a society with reasonably just rules of cooperation that bestow benefits and burdens on its members. Whether America today, for example, satisfies such a pre-condition is, for some, doubtful; and thus, it is argued, even if justified in theory, capital punishment is not justified under current social conditions (for example, Reiman).  Also, retributivists typically presuppose punishment is to address misconduct that is voluntary, a matter of free choice.  But Marx, for example, maintains that such a presupposition of free will is simply false, a delusion:

Is it not a delusion to substitute for the individual with his real motives, with multifarious circumstances pressing upon him, the abstraction of “free will”…?  Is there not a necessity for deeply reflecting upon an alteration of the system that breeds these crimes, instead of glorifying the hangman who executes a lot of criminals to make room for the supply of new ones?

Though Marx is himself sympathetic to a retributivist justification of punishment, theory and practice cannot be divorced.  Marx and many Marxists oppose capital punishment because it is inapplicable to the actual conditions of society where criminality is rooted in structural inequalities of wealth (Murphy).  Thus, for some retributivist and utilitarian approaches to capital punishment, the death penalty may be morally unjustified because of inherently imperfect legal procedures, morally problematic outcomes, or the social conditions surrounding the institution.

c. Medicine and the Death Penalty

In recent years, issues of medical ethics have been a facet of philosophic focus on the institution of capital punishment, especially in America.  Health care professionals—including physicians—can be active participants in the actual execution of a death-row prisoner.  Medical expertise needed for an execution itself can include administering medicines or psychiatric treatments to calm the condemned, judging whether intramuscular or intravenous techniques are best, or actually injecting a lethal dose of drugs to bring about a death (Gaie, 1).  Even if not directly participating in executions and regardless of the method of execution employed, health care professionals can be involved by providing capital trial testimony related to findings of guilt or punishment, such as competency to stand trial, possibly exculpating mental illness, or forensic analyses of murder scene evidence.  Physicians are needed to certify death following a successful execution, and they may have a role in possible organ donations arranged by the deceased (Gaie, 2).  All such participation requires relevant expertise and is important to contemporary death penalty practices.  An important question, however, is whether it is morally permissible for health care professionals to be involved or participate in the institution of capital punishment.

A common assumption is that health care professionals—physicians, at least—have significant moral duties to those they treat or administer to.  Many, like Gaie, address such issues of professional ethics as independent of the morality of capital punishment itself.  Thus, for example, since physicians have a duty to minimize suffering, it would seem to follow that medical professionals’ participation is morally justified for that purpose, perhaps especially in executions by lethal injection.  Others maintain that, analogous to relieving the suffering of a torture victim so that they can be further tortured, physicians ought not participate in executions in order to reduce the suffering of the condemned (Dworkin).  Physician participation in an unjust practice, such as capital punishment, makes them complicit and, so, they ought not be involved. Thus, it is argued, one cannot separate the ethics of physicians’ participation in capital punishment from the moral merits of the institution itself (Litton).

Since the early 1980s, lethal injection has almost completely replaced electrocution as the preferred method of execution for those convicted of a capital crime and sentenced to death in the United States.  This recent, novel method of execution has itself generated considerable controversy.  First, unlike other constitutionally permissible modes of execution in America (that is, electrocution, hanging, firing squad, gas inhalation), a lethal injection requires medical expertise in order to be administered properly.  Thus, health care professionals must be direct participants in executions: for example, by preparing the lethal drug dosages, by establishing suitable sites for an injection, and by actually administering the drugs that cause the death of the convicted.  In comparison to other methods of execution, such participation is more essential, more direct, and ethically more problematic.  Execution by lethal injection makes more acute and controversial the ethical issues surrounding the involvement of health care professionals in the institution of capital punishment.  Second, whether employing the typical three-drug “cocktail,” or some variant of that process, acquiring the designated pharmaceuticals has often become difficult or impossible.  Some foreign-based companies face legal restrictions on exporting drugs for such uses, and some foreign and domestic drug companies, for reasons of public image or ethical considerations, for example, choose not to manufacture or supply their pharmaceutical products for use in executions.  This sometimes delays execution or leads governments to employ alternative drugs for which there may not be sufficient evidence of their effectiveness in causing a human death.  Third, whether any formulas for lethal injections are a humane way (or a more humane way) of causing death is itself controversial, with disputes about the science (or lack thereof) behind the drug formulas and protocols used, disagreements about the evidentiary significance of physiological data from autopsies used to assess the humaneness of death by lethal injection, etc.  Finally, so-called “botched executions” are still not entirely avoided by using lethal injection rather than electrocution or hanging, for example.  Cases do occur where the condemned endure an extended process of dying that sometimes suggests lingering sentience, discomfort, or suffering.  As with other facets of the institution of the death penalty, there is disagreement about the import of such practical challenges for the moral justification of capital punishment.

d. Costs: Economic Issues

At least in popular discourse, if rarely in philosophic discussions, considerations of monetary cost are adduced with respect to morally justifying capital punishment.  As Stephen Nathanson rightly recognizes, in its bald form it is a simple economic argument:  the state ought to execute murderers because it is less costly than imprisoning them for life (Eye, 33).  Even among proponents, though, cost considerations are perhaps plausibly relevant only as secondary, subsidiary supplements to some anterior justification for executing murderers: if murderers merit death as punishment for criminal misconduct, then economic cost is perhaps relevant to justifying their execution over a sentence of life spent in prison.

The argument depends crucially on the empirical claim that, in fact, it is less costly to execute murderers than it is to imprison them for life.  But the facts do not support this supposition.  The costs are not only those of a single execution, but those of a system of due process and an infrastructure of facilities and personnel needed for the institution of capital punishment (Nathanson, Eye 36).  A possible reply is that such costs could be reduced, especially if we were to replace America’s elaborate “due process” for capital cases with something much more minimal: fewer appeals and appellate reviews, for example (Nathanson, Eye 38).  Such an approach may reduce economic costs, but only at the price of perhaps increasing the frequency of mistakes or arbitrariness.  Furthermore, reliance on comparative costs in determining who is executed potentially introduces a novel, morally suspect kind of arbitrariness.  Given that the cost of life imprisonment would be a function of a convicted murderer’s health and age, younger, healthier persons would be selected for the death penalty, while older, or more feeble, unhealthy killers would be sentenced to life in prison as the cheaper alternative.  The costs argument risks introducing a kind of age and medical status discrimination into the imperfect procedures employed to determine who merits the death penalty for murder.

6. State Authority and Capital Punishment

Exploring fully whether capital punishment is morally justified leads to considering a normative account of the modern state, its foundations, proper functions, and penal powers.  The modern practice of capital punishment presupposes a state which has the authority to make, administer, and enforce criminal law and procedures and then, if merited, impose the death penalty to address serious misconduct.  On what basis does the state possess the authority to punish by death?  This question of justification seems to raise issues about capital punishment that are “more squarely within the province of political philosophy” (Simmons, 311).

Contractarian accounts of the state share the feature that authority is derived from or constructed out of the authority granted to it by individuals who have or would “contract” to create it (see Social Contract Theory).  Any authority of the state to punish by death is, then, consent-based.  Thus, for example, as with others in the natural rights tradition, John Locke’s contractarian approach grounds state authority in individuals transferring their pre-political right to punish (including by death) those who have violated another’s basic rights by killing.  As Locke maintains in his Second Treatise of Government, the purpose of the state is to protect individuals’ basic rights, and individuals each grant the state the authority to protect rights through laws and punishments that are effective and comply with natural law principles about the amount of punishment (that is, lex talionis).  Though invoking such a pre-political right of individuals to punish is common in the natural rights tradition, and though there are some recent defenders of such an approach among libertarians (for example, Nozick), Locke himself admits that the notion of a natural executive right to punish “will seem a very strange doctrine to some men” (Treatise, sec. 9).

The classic contractarian theories of Jean-Jacques Rousseau and Thomas Hobbes also justify state authority to punish by death on grounds of individuals’ consent.  In the Leviathan, the pre-political state of nature is famously characterized by Hobbes as a life “solitary, poor, nasty, brutish, and short” (89; Ch. 13).  This life in the state of nature is so insecure that each person, as a means to self-preservation, authorizes the created sovereign power—the state—to punish by death criminal misconduct “to the end that the will of men may thereby better be disposed to obedience” (214; Ch. 28).  Rousseau, in On the Social Contract, holds that “the social treaty has as its purpose the conservation of the contracting parties,” each of whom wills the means to the end of preserving his life.  “And whoever wishes to preserve his own life at the expense of others should also give it up for them when necessary….  It is in order to avoid being the victim of an assassin that a person consents to die, were he to become one” (35; Book II, Ch. v).  And so, Rousseau maintains, the political society has the right to put to death, even as an example, those who cannot be preserved without danger to others or the society itself.  In the case of all the classic social contract theories of the state, individuals’ consent to the practice of capital punishment is included in the created authority of the state to rule and to punish.

Some more recent contractarian accounts of state authority to punish are explored in the spirit of John Rawls’s A Theory of Justice, with its Kantian conceptions of rationality and basic human goods (for example, liberties, autonomy, dignity).  The general idea is that a system of social cooperation is just if it would be consented to by rational, mutually disinterested individuals making their choice while ignorant of particularities about themselves and their own place in the system.  Such contractarian approaches typically support a penal system which merges both retributivist and utilitarian approaches in establishing a just system of punishment.  Whether such contractarian approaches justify capital punishment depends, as do classic social contract theories, on the details of the conditions under which a rational choice would be made.  A recent proponent of a contractarian theory of punishment, for example, argues that individuals would consent to an institution only if it would leave individuals better off than they would be in its absence.  This “benefit principle,” it is argued, justifies a system of punishment, as each would be better off with punitive sanctions than without.  As to capital punishment, though, “[c]an a person who receives the death penalty… regard himself as better off… than he would have been had he never agreed to the contract in the first place” (Finkelstein, “A Contractarian Approach…,” 216)?  There is a paradoxical air to individuals consenting to a system whereby they may be executed.  Finkelstein argues that, even if the death penalty deters, the benefit principle is not satisfied by a system of punishment that includes the death penalty.  On this contemporary contractarian theory, then, capital punishment is not justified because it would not be agreed to by rational individuals choosing the social institutions under which they would live.

A quite different approach to justifying state authority to punish by death appeals to the idea of societal self-defense or self-protection.  In a short piece, “On Punishment,” John Stuart Mill says, “the only right by which society is warranted in inflicting any pain upon any human creature, is the right of self-defense…. Our right to punish, is a branch of the universal right of self-defence” (79).  One recent development of this approach argues that a societal right of self-protection entails the right to threaten punishment for misconduct, and that a right to impose punishments follows from the society’s right to threaten sanctions (Quinn).  Whether a society has a right to threaten or impose a death penalty for murder, then, is based on its efficacy for deterrence and incapacitation, that is, as a protector of society.  A second, slightly different argument appeals more directly to the model of individual self-defense as a right.  Just as an individual has a right to use deadly force to address imminent, unavoidable aggression against self or other innocent parties, so society, as a collective, has a right to employ deadly force to address violent aggression against innocent third parties within that society.  The amount of punishment that society has the right to employ is constrained as it is for an individual’s moral right of self-defense: the response must be proportionate to the threatened loss.  So, given a moral right of individuals to employ deadly force in defense of their own or other innocents’ lives, by analogy society has such a right to use death as a punishment for murders of innocent third parties in the society.  Whether as an exercise of a right of self-protection or self-defense, the state then has the right to institute capital punishment for serious crimes such as murder.

7. References and Further Reading

a. Primary Sources

  • Aquinas, Thomas. Summa Theologiae. (1271-1272)
    • References to this extensive work are by number of question and article in the second part of part two (i.e., II-II), available at http://www.gutenberg.org/cache/epub/18755/pg18755.html.
  • Beccaria, Cesare. On Crimes and Punishments. (1764) Trans. David Young. Indianapolis: Hackett Publishing Company, 1986.
    • Quotations and references are by page number and chapter number to this translation and edition.
  • Bentham, Jeremy. An Introduction to Principles of Morals and Legislation (1789, 1823).
    • References to this classic text are by chapter and section number.
  • Camus, Albert. “Reflections on the Guillotine.” Resistance, Rebellion, and Death. Trans. Justin O’Brien. New York: Knopf, 1966. 175-234.
  • Hegel, G.W.F. The Philosophy of Right. (1821) Trans. T. M. Knox. Oxford: Clarendon Press, 1962.
  • Hobbes, Thomas. Leviathan. (1651) Edited by Richard Tuck. Cambridge: Cambridge University Press, 1991.
    • References to this text are by pagination in this edition, followed by chapter number, to allow reliance on various translations and editions available in print or on-line.
  • Kant, Immanuel. The Metaphysical Elements of Justice, Part I of The Metaphysics of Morals. (1797) Translated by John Ladd. Indianapolis: Bobbs-Merrill, 1965.
    • Quotations and parenthetical references are from this translation and edition, followed by the standard AK pagination, to allow reliance on various translations and editions available in print or on-line.
  • Locke, John. Two Treatises of Government. (1690) Ed. Peter Laslett. Cambridge: Cambridge University Press, 1988.
    • Quotations are from this recent scholarly edition; all references are to section number of The Second Treatise, to allow reliance on various other editions available on-line or in print.
  • Marx, Karl. “Capital Punishment.” New York Tribune. 1853. https://www.marxists.org/archive/marx/works/1853/02/18.htm.
  • Mill, John Stuart. “Speech in Favor of Capital Punishment 1868.” The Collected Works of John Stuart Mill, Vol. XXVIII: Public and Parliamentary Speeches. Eds. John M. Robson and Bruce Kinzer. Toronto: University of Toronto Press, 1988. pp. 266-273. http://oll.libertyfund.org/titles/mill-the-collected-works-of-john-stuart-mill-volume-xxviii-public-and-parliamentary-speeches-part-i.
  • Mill, John Stuart. “On Punishment.” The Collected Works of John Stuart Mill, Vol. XXI: Equality, Law, and Education. Ed. John M. Robson. Toronto: University of Toronto Press, 1984, pp. 77-79. http://oll.libertyfund.org/titles/mill-the-collected-works-of-john-stuart-mill-volume-xxi-essays-on-equality-law-and-education.
  • Plato. The Collected Dialogues. Ed. Edith Hamilton and Huntington Cairns. Princeton: Princeton University Press, 1961.
  • Ross, W.D. The Right and the Good. Oxford: Oxford University Press, 1930.
  • Rousseau, Jean Jacques. On the Social Contract. (1762) Trans. Donald A. Cress. Indianapolis: Hackett, 1987.
    • Quotations and references are to this translation and edition, using page number followed by book and chapter number, to allow reliance on various translations and editions available in print or on-line.

b. Secondary Sources

  • Bailey, William C. and Ruth D. Peterson. “Murder, Capital Punishment, and Deterrence: A Review of the Literature.” The Death Penalty in America: Current Controversies. Ed. Hugo Adam Bedau. Oxford: Oxford University Press, 1997. 135-161.
  • Banner, Stuart. The Death Penalty: An American History. Cambridge: Harvard University Press, 2003.
    • An excellent, thoughtful, and readable rendition of the long history of death penalty law and practice in America, from colonial beginnings through the end of the 20th century.
  • Bedau, Hugo Adam. “Bentham’s Utilitarian Critique of the Death Penalty.” Journal of Criminal Law and Criminology 74 (1983): 1033-1065.
  • Bedau, Hugo Adam. “Capital Punishment.” Matters of Life and Death: New Introductory Essays in Moral Philosophy. Third edition. Ed. Tom Regan. New York: Random House, 1980. 160-194.
  • Bedau, Hugo Adam, ed. The Death Penalty in America: Current Controversies. Oxford: Oxford University Press, 1997.
    • Despite its publication date, this anthology is still quite useful. It is the best, basic reference for primary and secondary source materials related to American death penalty law, constitutional issues, Supreme Court decisions, public attitudes, social scientific studies of deterrence, and explorations of procedural problems with capital punishment, including matters of race.
  • Bedau, Hugo Adam. Killing as Punishment: Reflections on the Death Penalty in America. Boston: Northeastern University Press, 2004.
    • Bedau has long been a prominent philosophic scholar specializing in research and writing about capital punishment in the United States. The first half of this volume is primarily descriptive of the American system, including problematic procedural outcomes and some recent history of the death penalty. The second half of the book “undertakes a critical evaluation…from a constitutional and ethical point of view.” As a matter of applied ethics, Bedau argues for abolition of the death penalty in reasonably just, constitutional democracies, such as the United States.
  • Black, Charles L., Jr. Capital Punishment: The Inevitability of Caprice and Mistake. Second edition. New York: Norton, 1981.
    • Written by a legal scholar, an accessible appeal to problematic outcomes of American criminal procedure as justification for abolishing the death penalty.
  • Caplan, Arthur A. “Should Physicians Participate in Capital Punishment?” Mayo Clinic Proceedings 82 (2007): 1047-48. http://www.mayoclinicproceedings.org/article/S0025-6196(11)61363-3/fulltext
  • Conway, David A. “Capital Punishment and Deterrence: Some Considerations in Dialogue Form.” Philosophy & Public Affairs 3 (1974): 431-443.
  • Davis, Michael. “Harm and Retribution.” Philosophy & Public Affairs 15 (1986): 236-266.
  • Duff, R. A. Punishment, Communication, and Community. Oxford: Oxford University Press, 2001.
  • Dworkin, Gerald. “Patients and Prisoners: The Ethics of Lethal Injection.” Analysis 62 (2002): 181-189.
  • Feinberg, Joel. “The Expressive Function of Punishment.” Doing and Deserving. Princeton: Princeton University Press, 1970. 95-118.
  • Feinberg, Joel. “Noncomparative Justice.” Rights, Justice, and the Bounds of Liberty: Essays in Social Philosophy. Princeton: Princeton University Press, 1980. 265-306.
  • Finkelstein, Claire. “A Contractarian Approach to Punishment.” The Blackwell Guide to the Philosophy of Law and Legal Theory. Ed. Martin Golding and William Edmundson. Oxford: Blackwell Publishing, 2005. 207-220.
  • Finkelstein, Claire. “A Contractarian Argument Against the Death Penalty.” New York University Law Review 81 (2006): 1283-1330.
  • Gaie, Joseph B.R. The Ethics of Medical Involvement in Capital Punishment: A Philosophical Discussion. Dordrecht: Kluwer Academic Publishers, 2004.
  • Hampton, Jean. “The Moral Education Theory of Punishment.” Philosophy & Public Affairs 13 (1984): 208-238.
  • Hart, H.L.A. “Bentham and Beccaria.” Essays on Bentham. Oxford: Clarendon Press, 1982. 40-52.
  • Hart, H. L. A. “Prolegomenon to the Principles of Punishment.” Punishment and Responsibility: Essays in the Philosophy of Law. Oxford: Clarendon Press, 1968. pp. 1-27.
    • This essay remains hugely influential in providing the dominant framework for philosophic theories of punishment, including the death penalty.
  • Hart, H.L.A. “Punishment and the Elimination of Responsibility.” Punishment and Responsibility: Essays in the Philosophy of Law. Oxford: Clarendon Press, 1968. pp. 158-185.
  • Heyd, David. “Hobbes on Capital Punishment.” History of Philosophy Quarterly 8 (1991): 119-134.
  • Litton, Paul. “Physician Participation in Executions, the Morality of Capital Punishment, and the Practical Implications of Their Relationship.” Journal of Law, Medicine, & Ethics 41 (2013): 333. University of Missouri School of Law Legal Studies Research Paper No. 2013-13. https://ssrn.com/abstract=2286788.
  • Mackenzie, Mary Margaret. Plato on Punishment. Berkeley: University of California Press, 1981.
  • McGowen, Randall. “The Death Penalty.” The Oxford Handbook of the History of Crime and Criminal Justice. Edited by Paul Knepper and Anja Johansen. Oxford: Oxford University Press, 2016. 615-634.
  • Montague, Phillip. Punishment as Societal Defense. Lanham: Rowman & Littlefield, 1995.
  • Morris, Herbert. “Persons and Punishment.” The Monist 52 (1968): 475-501.
  • Murphy, Jeffrie. “Marxism and Retribution.” Philosophy & Public Affairs 2 (1973): 217-243.
  • Nathanson, Stephen. An Eye For An Eye? The Morality of Punishing by Death. Second edition. Totowa, NJ: Rowman & Littlefield, 2001.
    • An accessible, readable argument to the conclusion “that the death penalty is not morally acceptable.” Nathanson considers a variety of arguments offered in defense of capital punishment in America: deterrence, costs, problematic procedural outcomes, moral desert and the death penalty, American constitutional considerations. An especially helpful treatment of the arguments based on criminal procedure in America.
  • Nathanson, Stephen. “Does It Matter if the Death Penalty Is Arbitrarily Administered?” Philosophy & Public Affairs 14 (1985): 149-164.
  • Nozick, Robert. Anarchy, State, & Utopia. New York: Basic Books, 1974.
    • Chapter 4 deals with theories of punishment (retributive and deterrence) with respect to a contractarian theory of a libertarian state developed in the spirit of John Locke’s emphasis on individual rights.
  • Nozick, Robert. Philosophical Explanations. Cambridge: Harvard University Press, 1981.
    • Section III of Chapter 4 (pp. 363-398) deals with punishment as communication, including some ambivalence about its implications for the death penalty for murderous offenders.
  • Nussbaum, Martha. “Equity and Mercy.” Philosophy & Public Affairs 22 (1993): 83-125.
  • Pojman, Louis. “For the Death Penalty.” The Death Penalty: For and Against. Lanham, MD: Rowman & Littlefield, 1998. 1-66.
  • Pojman, Louis, and Jeffrey Reiman. The Death Penalty: For and Against. Lanham, MD: Rowman & Littlefield, 1998.
    • Distinctly different, opposing, nuanced approaches to the death penalty in the context of more general theories about punishment and illustrating ways in which justifications are often hybrid theories that synthesize elements of retributivism and consequentialism. Both authors also address the import of imperfect criminal procedures in the administration of the death penalty in America (or perhaps anywhere). The text includes a response by each to the other’s arguments.
  • Quinn, Warren. “The Right to Threaten and the Right to Punish.” Philosophy & Public Affairs 14 (1985): 327-373.
  • Radin, Margaret Jane. “Cruel Punishment and Respect for Persons: Super Due Process for Death.” Southern California Law Review 53 (1980): 1143-1185.
  • Rawls, John. A Theory of Justice. Revised edition. Cambridge: Harvard University Press, 1971, 1999.
  • Reiman, Jeffrey. “Justice, Civilization, and the Death Penalty: Answering van den Haag.” Philosophy & Public Affairs 14 (1985): 115-148.
  • Reiman, Jeffrey. “Why the Death Penalty Should be Abolished in America.” The Death Penalty: For and Against. Lanham, MD: Rowman & Littlefield, 1998. 67-132.
  • Schabas, William. The Abolition of the Death Penalty in International Law. Third edition. Cambridge: Cambridge University Press, 2002.
    • An excellent survey of the title topic, an aspect of capital punishment not often engaged in the work of others in this list.
  • Royal Commission on Capital Punishment 1949-1953: Report. Cmd. 8932. London: Her Majesty’s Stationery Office, 1953.
  • Simmons, A. John. “Locke and the Right to Punish.” Philosophy & Public Affairs 20 (1991): 311-349.
  • Sorell, Tom. “Aggravated Murder and Capital Punishment.” Journal of Applied Philosophy 10 (1993): 201-213.
    • An excellent analysis of the arguments of John Stuart Mill and Immanuel Kant in defense of capital punishment for at least some murders.
  • Sorell, Tom. Moral Theory and Capital Punishment. Oxford: Basil Blackwell in association with the Open University, 1987.
    • Though the primary aim of this book is to show how philosophic arguments and theories “can be useful in producing an improved moral rhetoric,” Sorell does offer a non-consequentialist and retributivist defense of capital punishment on the ground that murderers deserve to die. He opposes alternative forms of retributivism (e.g., appeals to fairness) and argues that utilitarian or consequentialist arguments are inconclusive, including J.S. Mill’s little-known defense of capital punishment.
  • Stalley, R.F. An Introduction to Plato’s Laws. Indianapolis: Hackett, 1983.
  • Ten, C.L. Crime, Guilt, and Punishment. Oxford: Clarendon Press, 1987.
    • A clear, organized introduction to an array of recent theories of punishment, though not specifically addressed to issues of capital punishment. Chapter 7, “The Amount of Punishment,” engages retributivist and utilitarian approaches to justifying the form or kind of punishment for offenders.
  • United Nations. “The Universal Declaration of Human Rights.” (1948). http://www.un.org/en/universal-declaration-human-rights/.
  • United Nations. “International Covenant on Civil and Political Rights.” (1976). http://www.ohchr.org/en/professionalinterest/pages/ccpr.aspx.
  • United States. House of Representatives. The Constitution of the United States of America. Washington: Government Printing Office, 2000. https://www.gpo.gov/fdsys/pkg/CDOC-110hdoc50/pdf/CDOC-110hdoc50.pdf.
  • Waisel, David. “Physician Participation in Capital Punishment.” Mayo Clinic Proceedings 82 (2007): 1073-1080. http://www.mayoclinicproceedings.org/article/S0025-6196(11)61369-4/fulltext.

 

Author Information

Robert Hoag
Email: bob_hoag@berea.edu
Berea College
U. S. A.

Lao Sze-kwang (Lao Siguang) (1927—2012)

The works of Lao Sze-kwang (Lao Siguang) cover a wide range of philosophies, including Confucianism, Buddhism, Daoism, Kantianism, Hegelianism, and, most importantly, the philosophy of culture. Like many other Chinese philosophers of the 20th century, Lao was personally affected by the Chinese Revolution of 1949 and made his career outside of mainland China, having first fled to Taiwan and then Hong Kong after the victory of Mao Zedong’s Communist forces in China’s civil war. Along with other modern Chinese philosophers, Lao was deeply interested in the problem of China’s modernization and actively participated in politics. These two aspects of his intellectual biography, in turn, help to define his work as a philosopher, which focused on contextualizing Chinese philosophy in relation to Western thought as well as emphasizing the practical, as opposed to purely theoretical, dimension of Chinese philosophy.

In his multi-volume New Edition of the History of Chinese Philosophy (1984-1986), he tried to reconstruct traditional Chinese philosophy with the help of modern Western philosophy but in ways that both resembled and differed from the work of so-called “New Confucians” such as Mou Zongsan and Tang Junyi. This work strongly influenced the study of Chinese philosophy in Hong Kong and Taiwan and helped shape several generations of Chinese scholars’ understanding of traditional Chinese thought. Unlike Feng Youlan or Hu Shi, each of whom authored competing histories of Chinese philosophy, Lao attempted to define Chinese philosophy not in terms of what it is (an essentialist approach) but in terms of what it does (a functionalist approach). For Lao, Chinese philosophy functions primarily in an “orientative” manner, shaping and guiding Chinese people’s values at a deep level rather than merely advocating particular propositions or theories. In this way, Lao thought, Chinese philosophy was distinct from other social philosophies, particularly those of the West.

Having summarized the legacies of both traditional Chinese and modern Western thought, Lao later developed a philosophy of culture that preoccupied him from the 1990s until his death in 2012. In his Lectures on Philosophy of Culture (2003), Lao declared that his philosophy was driven by a cultural crisis of consciousness, as he realized that he grew up in a frustrating age in which traditional Chinese culture was declining in influence while modern culture was yet to be established as a new cultural order. Lao criticized both traditionalist and anti-traditionalist approaches to the modernization of Chinese culture and tried to develop his own approach.

Table of Contents

  1. Biography
  2. On Chinese Philosophy
    1. New Edition of the History of Chinese Philosophy
    2. The Fundamental Question Method
    3. Chinese Philosophy as an Orientative Philosophy
  3. On Philosophy of Culture
    1. Cultural Spirit as Self-Consciousness of Value
      1. Categories of the Self
      2. Moral Subjectivity in Confucianism and the Rejection of Metaphysical Interpretation
      3. “Eastern Spirit” vs. “Western Spirit”
    2. Modernization of Chinese Culture
      1. The Problem of Objectivity
      2. The Problem of Traditionalism
      3. The Problem of Anti-Traditionalism
  4. Criticisms and Influence
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Lao Sze-kwang was born in Xi’an, capital of Shaanxi province, in 1927 to a highly educated military officer’s family. His grandfather scored high on the Qing dynasty’s (1644-1912) imperial civil service examination and was appointed Governor General of Liangguang (modern Guangxi and Guangdong provinces). In 1860, he negotiated Britain’s lease of the Kowloon peninsula and the New Territories following China’s defeat in the Second Opium War (1856-1860). Throughout his childhood, Lao received a traditional Chinese education at home and even composed his first classical poem at the age of seven.

In 1946, Lao entered the Department of Philosophy at Peking University but fled to Taiwan in 1949 when the Communist Party took over mainland China. He graduated from the Department of Philosophy at Taiwan University in 1951. As a liberal, however, Lao had difficulty tolerating the Kuomintang (KMT) military dictatorship in Taiwan headed by Chiang Kai-shek. To protect him from political persecution in Taiwan, Lao’s father, a general who served under Chiang, asked him to flee to Hong Kong in 1955. In 1964, Lao joined Chung Chi College at the Chinese University of Hong Kong as a lecturer of philosophy. Lao’s colleagues in Hong Kong included Mou Zongsan and Tang Junyi. As opposed to his pro-KMT colleagues, who condemned only the Communist regime in mainland China, Lao condemned both the KMT’s and the Chinese Communist Party’s dictatorships and refused to join Mou’s and Tang’s “New Confucian” campaign. Lao was later promoted to Senior Lecturer, Reader, and finally Head of the Philosophy Department at the Chinese University of Hong Kong. Although Lao officially retired in 1985, he continued to serve as Honorary Senior Research Fellow at the Institute of Chinese Studies and as Senior College Tutor at Shaw College.

In 1989, Lao returned to Taiwan, where the process of democratization had begun. In 1994, Lao became Chair Professor at the Department of Philosophy of Huafan University. The government of the Republic of China (Taiwan) awarded Lao the National Cultural Award in 2001 and the Republic of China Academia Sinica Fellowship in 2002. In 2012, Lao died in Taiwan.

2. On Chinese Philosophy

a. New Edition of the History of Chinese Philosophy

From 1984 to 1986, Lao published his most influential work, New Edition of the History of Chinese Philosophy. It consists of four volumes of a systematic reconstruction of Chinese philosophy, from the pre-Qin period (the period before the establishment of the Qin dynasty in 221 B.C.E.) to the Qing dynasty (1644-1912). Lao regarded it as the first complete work on the history of Chinese philosophy because, according to him, all previous writings about the history of Chinese philosophy focused on its history rather than on the reconstruction of its philosophical problems. While Lao acknowledged Huang Zongxi’s (1610-1695) Record of the Ming Scholars as a work that satisfied his criteria for an authentic history of Chinese philosophy, it merely addressed the development of Confucian thought during the Ming dynasty (1368-1644). Similarly, Lao did not consider Hu Shi’s (1891-1962) History of Chinese Philosophy an authentic study of the history of Chinese philosophy because, in his view, it focused too much on archaeological investigation rather than philosophical discussion. For Lao, a valid work on the history of Chinese philosophy should reconstruct the theories of past philosophers by investigating their philosophical questions and answers. Lao argued that although Feng Youlan’s (1895-1990) History of Chinese Philosophy explored the philosophical problems of past Chinese philosophers, it failed to grasp the nature of Chinese philosophy as the pursuit of virtue (that is, the complete actualization of humanity’s innate moral goodness) due to the overwhelming influence of Western metaphysical concepts and Marxist ideology on Feng’s thought. Thus, Lao refused to acknowledge Feng’s book as a work on Chinese philosophy. Instead, Lao argued that his own book was the first true account of the history of Chinese philosophy because of its accurate reconstruction of the philosophical questions and answers of past Chinese philosophers.

b. The Fundamental Question Method

In New Edition of the History of Chinese Philosophy, Lao suggested a new methodology for the study of Chinese philosophy called the Fundamental Question Method. Lao argued that every school of philosophy chooses a single ultimate question to answer. Such a question is known as “the fundamental question,” and the procedure for identifying it and reconstructing a philosophy around it is as follows:

  1. Investigation of a philosophical text and its historical background
  2. Reconstruction of the text’s arguments
  3. Deduction of the original intention behind the arguments
  4. Identification of the fundamental question
  5. Reconstruction of the logical relations between the questions and answers on the basis of the fundamental question

In order to answer such questions, philosophers provide several answers or solutions, which may lead to new sub-questions. The question-answering process is the development of a philosophical school. A fundamental question leads to several levels of sub-questions with their corresponding answers. All of these questions and answers construct a complete theory. The problem is that some philosophers may not explicitly declare their own fundamental question. Therefore, a “theoretical reduction” is needed. One may deduce the philosopher’s original intention from the arguments found in his writings.

It is important to note that the Fundamental Question Method does not resist the introduction of Western philosophical concepts to the study of Chinese philosophy. Instead, Lao emphasized that his Fundamental Question Method employed a “Western logical analysis” due to the relative absence of logic and epistemology in the Chinese tradition. To articulate the fundamental questions of Chinese philosophies, one must inevitably employ Western disciplines such as hermeneutics and logic. Lao used a microscope analogy to justify his support of the use of logical analysis in the study of the history of Chinese philosophy. Although the microscope was invented by the Europeans in the modern era, it can still be used as an instrument to study ancient bacteria in Africa, which existed long before their discovery under the microscope. Likewise, logical analysis is a “microscope for thoughts.” Principles discovered through logical analysis existed before its introduction. Spatiotemporal differences do not weaken the universal validity of principles discovered through logical analysis, for logic is a universal science. The following paragraphs illustrate Lao’s application of the Fundamental Question Method to Confucius’ philosophy as articulated in New Edition of the History of Chinese Philosophy.

Lao identified the Analects as the only reliable source for studying Confucius’ philosophy, as it was allegedly composed by Confucius’ students. According to the Records of the Grand Historian, written by Sima Qian (145-86 B.C.E.), Confucius was employed as an official ritual expert by the ancient state of Lu (modern Shandong province). For Confucius, the ceremonies and literature of the Western Zhou dynasty (1046-771 B.C.E.) provided the basis for all proper and efficacious ritual (li), but by his time, these had been forgotten, diluted, or abused as a result of China’s increasingly fractious political and military environment. As Lao pointed out, the Confucian view of li goes beyond mere ceremony to encompass internal attitudes within ritual actors and participants, especially a sense of reverence for tradition and an aesthetic sensitivity to beauty as a mode of order. As a way of restoring harmonious order to a disorderly and divided society, li was vital to Confucius’ understanding of the purpose of both collective culture and personal self-cultivation. Therefore, Lao considered the concept of li as the starting point of Confucius’ philosophy.

However, Lao did not think that li itself was the central concept of Confucius’ philosophy. According to him, Confucius’ concept of li points at a deeper principle: social order. The distinction between li as the guarantor of social order and mere ritual was already being discussed before Confucius developed his philosophy. In the Zuozhuan (Zuo Commentary on the Spring and Autumn Annals), an ancient Chinese chronicle, when the Duke of the state of Jin praised the Duke of Lu for being good at keeping li in archery performance, his minister, Ru Shuqi, disagreed and distinguished li from deportment, or the following of rituals: “[li] is that by which [a ruler] maintains his State, carries out his governmental orders, and does not lose his people.” Lao coined the term quanfen (division of power and responsibility) to describe what Neo-Confucian philosophers called lifen (division by ritual, that is, reason and duty): a sense of social order achieved by the division of society into class-specific titles and duties. It is this sense of social order on which Confucius’ philosophy was based, according to Lao, and toward which li as a kind of social performance and inner experience was to be aimed.

What is the legitimacy of li? According to Lao, Confucius reduced li to yi (righteousness) and ren (humaneness or benevolence). Confucius said that “[t]he gentleman takes rightness as his substance, puts it into practice by means of ritual [li], gives it expression through modesty, and perfects it by being trustworthy. Now that is a gentleman” (Analects 15:18). Righteousness stands for the normative principles used to distinguish between right and wrong. Ren is the basis of righteousness, as Confucius said that a person practising ren is free from wickedness (Analects 4:4). Ren is the highest moral goodness. Therefore, li is ultimately reduced to ren. Confucius clearly argued that ren constitutes the restraint of the self for the return to li (Analects 12:1). Li is the way of practising ren. But what is ren? In Lao’s Essential of Chinese Culture (1998), he argued that ren is the foundation of righteousness, devotion to public well-being (gongxin), and purification of intention. Ren implies the denial of self-interest, for righteousness contrasts with narrow self-interest (Analects 4:16). Once a person has denied his self-interest, he will pursue public well-being and righteousness.

To sum up, Lao argued that Confucius’ fundamental question was, “How can we preserve the social order?” As such, it leads to the question, “What is the legitimacy of li?” By reducing the concept of li to yi (righteousness) and ren (humaneness or benevolence), Lao approached Confucius’ answer systematically: the legitimacy of li comes from ren. Once a person practices li, he actualizes ren as devotion to public well-being.

c. Chinese Philosophy as an Orientative Philosophy

In his 1989 article “On Understanding Chinese Philosophy: An Inquiry and a Proposal,” Lao defined Chinese philosophy as an Orientative Philosophy and tried to convince Western scholars to regard Chinese thought as a philosophical tradition. He introduced a simplified version of the Fundamental Question Method for the reconstruction of Chinese philosophies, which he called the “purpose theory.”

Firstly, Lao argued that philosophy is reflective thinking about certain functions of philosophy. Reflective thinking occurs when a person reflects on his own activities. Epistemology and hermeneutics reflect on the nature of human knowledge and understanding, while ethics reflects on the nature of human moral action. Metaphysics reflects on the underlying unity of the empirical world. The subject matter of reflective thinking in different periods and places can be radically different. When reflective thinking addresses a certain type of subject matter, it provides particular philosophical solutions to a particular philosophical problem. Therefore, to understand a particular philosophy, one must understand the problems with which it deals. If the problems of a particular philosophy have no relevance whatsoever to real life, said Lao, it should be rejected.

Secondly, Chinese philosophy is an orientative reflective thinking that intends to transform the self or the world. According to Lao’s definition, an orientative philosopher should suggest that there is a final purpose in life and that people should try to actualize such purpose in their daily living. Different schools of Chinese philosophy provide different normative guidelines or regulations for daily life. Lao named such guidelines “purpose-theories,” which contain three steps: (1) selecting a purpose, (2) justifying the purpose, and (3) offering practical maxims for people actualizing such a purpose.

Lao used the work of the Daoist philosopher Zhuangzi and the Confucian philosopher Mencius (Mengzi) as examples. Zhuangzi’s purpose was to achieve xiaoyao, which means “absolutely unburdened and unbound freedom” of the mind or the self. Lao identified xiaoyao as “transcendent freedom” because it exerts no influence on the objective world. To see how Zhuangzi justified this purpose, one must understand his view of the world. According to Zhuangzi, the world is changing. He used the term hua to indicate the concept of change. The physical self is illusory, as “every empirical existence is in a relation of transformation with other existences. The elements that constitute my body did constitute, and will constitute, other things in the same time/space structure, or empirical world.” The “body is no more than a congregation of physical elements… which will disintegrate when the elements move to form other physical things” (Lao 1989, 280). The principle governing all changes in the world is known as zaohua (meaning “making changes”). In other words, the real self cannot be anything physical that changes restlessly. Instead of defining the non-physical self as the “real self,” Zhuangzi simply did not consider the real self as an object. The real self is beyond all beings and gets rid of self-limiting inclinations. The real self should not fall into the realm of beings, or else it would be limited by changes. Zhuangzi even denied the existence of anything valuable in the physical world. He rejected cultural values as limiting one’s freedom, for knowledge and values exist only within a system in which the criterion of truth is relative. There is no universal criterion of truth that transcends all theories and systems, according to Zhuangzi. The endless debates among philosophies, like the debate between Confucianism and Mohism, can never manifest the truth. Only the transcendent freedom of the self is valuable, according to Zhuangzi. One must become enlightened to enjoy transcendent freedom. Lao, however, argued that Zhuangzi did not provide a practical maxim that could teach people how to become enlightened.

In contrast to Zhuangzi’s purpose of the transformation of self, Lao saw Mencius’ purpose as the transformation of the world by creating a cultural order actualized by li (ritual propriety), yi (righteousness), and ren (humaneness or benevolence). In order to justify his purpose, Mencius established a doctrine of mind and essence (xin xing lun). He argued that the real root of all moral and cultural values is within human nature. As long as one can maintain the mastery of the mind over the body, one can act morally. According to him, there are “four beginnings (sishanduan),” which are also known as the four basic qualities of the mind: “The sprout of humaneness or benevolence [ren],” “the sprout of righteousness [yi],” “the sprout of ritual propriety [li],” and “the sprout of wisdom [zhi]” (Mencius 2A6). Xing, which can be translated as human nature or essence, is universal to every human being. The four beginnings are four innate moral capacities. Thus, the legitimacy of moral and cultural values and orders is determined by such universal virtues. A righteous government is one in which the rulers love the people through the virtue of ren. The legitimacy of authority lies in the will of the people. For practical maxims, Mencius emphasised not only self-transformation but also social transformation. Social transformation can only be achieved by a virtuous leader who fully actualizes his innate moral capacities. Self-transformation, however, can be achieved simply by enlightenment. As long as a person is conscious of his innate moral capacities and actualizes them, he achieves self-transformation. Using Zhuangzi and Mencius as examples, Lao argued that Chinese philosophies, as orientative philosophies, aim to answer the question of “where to go” instead of “what it is” (Lao 1989, 290). Both Zhuangzi and Mencius tried to provide some directions for how one ought to live.

Overall, Lao’s purpose theory is merely a simplified version of the Fundamental Question Method. With the Fundamental Question Method, one must identify the fundamental questions, sub-questions, and answers so as to reconstruct the system of a particular Chinese school of philosophy. With the purpose theory, however, one only needs to identify a Chinese philosopher’s purpose with justification. One does not need to investigate how a fundamental question leads to several secondary questions and answers.

3. On Philosophy of Culture

While Lao is famous for his contribution to the study of the history of Chinese philosophy, his major research interest lies in Chinese cultural issues—namely, the philosophy of culture. Despite Lao’s deep interest in Chinese thought, his philosophy of culture draws heavily from German idealism, especially Hegel’s doctrine of “externalization,” which postulates that just as human beings come to perceive the world as alien and even hostile because it is “other,” not human—because the observed world is different from the human self that observes—so too can human beings reconcile themselves to the world by realizing that the world really is not “other” in relation to themselves.

In Lectures on Philosophy of Culture, Lao distinguished philosophy of culture from cultural critique or cultural studies. Having adopted Habermas’ trichotomy (distinction between philosophy, critique, and science) from the essay “Between Philosophy and Science—Marxism as Critique,” Lao argued that philosophy is merely theoretical, science is merely epistemically judgmental (according to experience), and critique is both theoretical and judgmental (Lao 2002, 42).

In order to understand Lao’s take on philosophy of culture, one must understand his definition of the term “philosophy of culture.” According to him, philosophy of culture is both descriptive and normative. As a descriptive philosophy, it aims to describe the nature or essence of a particular culture, while as a normative philosophy, it aims to evaluate the orientations or trends of cultural development. The term “culture” has two meanings: phenomenon and spirit. Human activities that express meanings and values are cultural phenomena, while the systems of those inner meanings and values are cultural spirits. Anthropology and social sciences only study cultural phenomena. Only philosophy investigates the underlying values and meanings behind those phenomena. Cultural spirit, which is “value consciousness,” determines the modes of cultural phenomena. When human beings are conscious of the existence of values and meanings and try to manifest them in human activities (including ideas, attitudes, systems, and customs), they are actualizing their cultural spirits. A cultural spirit is determined by free will (Lao 1998, 6-7). Therefore, culture is defined as a spirit or as “the actualization of value consciousness,” namely, the process of manifesting values in human activities. Because the subject matter of philosophy of culture is the cultural spirit, it is appropriate to look at Lao’s narratives of Chinese cultural spirit to understand his particular philosophy of culture.

a. Cultural Spirit as Self-Consciousness of Value

i. Categories of the Self

Subjectivity is an essential concept in Lao’s philosophy. Lao categorized the self into the moral self, the cognitive self, and the aesthetic self in terms of the “field of subjective activities,” namely, the mind activity of the self, manifesting itself in the external world against the limitations of the body. He calls this the “trichotomy of the self.” In terms of the numbers of the self, Lao provided another distinction: the “single subject” and the “multiple subjects.” Such distinction is known as the “dichotomy of the subject” (Lao 2000, 219).

In order to understand the concepts of cognitive self, moral self, and aesthetic self, one must understand the nature of the “subjective mind activity.” According to Lao, mind activity tries to manifest itself in the external world and to achieve self-actualization. To achieve self-actualization, one must overcome the physical limitation imposed by the body and the external world. In other words, self-actualization is a struggle for freedom against limitation. The body or the physical self of a person is not his real self, for the body is determined by external factors instead of one’s free will. The body is mechanical and determined by conditions; when the hand feels the heat of the fire, it immediately moves away from the fire.

The cognitive self is manifested as cognitive judgment. Cognitive judgment aims to analyze or reflect on experience so as to produce certain knowledge. Cognitive judgment is objective and universally valid. The facts that 1 + 1 = 2 and that H2O is water are both universally and objectively true. The cognitive self is the subject undertaking such cognitive judgment. The freedom of the cognitive self is not complete freedom, for cognitive judgment is nonetheless limited by external conditions, namely experiences. The self has no dominion over the experience it receives. Free will does not affect the validity of the judgment.

The moral self is manifested as ethical judgment. Ethical judgments are value judgments. Value judgments are unconditional. When one says that killing is morally wrong, one means that killing is always wrong. Therefore, ethical judgments are also universal. However, the objects of value judgments must be free human beings. When person A considers his own action C1 as being morally right or wrong, he assumes that C1 is an intentional action done by himself according to his free will. In other words, person A assumes that he has dominion over his own action C1 and is responsible for his own actions. The self, manifested in value judgment, therefore, is always dominant over some actions (Lao 1998, 143).

The aesthetic self manifests itself in aesthetic or emotional judgment. Unlike the moral self or the cognitive self, the aesthetic self brings no universal judgment. Aesthetic judgment, like sexual desires or appetite, is merely about preferences. When person B makes an aesthetic judgment that he wants an object or a state of affairs x, he assumes that his desire for x is intentional, but his desire for x is never universal. Someone from Hong Kong may want to drink Hong Kong-style milk tea for breakfast this morning but British-style Earl Grey tea tomorrow morning. A person may fall in love with different people at different times and places. More importantly, “B wants x” is only valid for person B alone. In other words, the validity of an aesthetic judgment is subjective. Person B “clings to” object x where such satisfaction depends on external conditions, for the satisfaction of desire is not determined by one’s own free will. Therefore, the dominion and the freedom of an aesthetic self are very limited.

ii. Moral Subjectivity in Confucianism and the Rejection of Metaphysical Interpretation

Confucianism, according to Lao, emphasizes only the moral self. Like his contemporaries, Lao acknowledged that subjectivity, that is, being a moral subject itself, is essential to Confucianism. Mencius’ doctrine of mind and essence affirmed that everyone shares the innate moral goodness to act morally in the form of “four beginnings.” One does not need to learn how to be a moral person. One need only acknowledge the fact that human nature is good and actualize one’s own moral capacity by following certain teachings. According to Lao, Mencius was the first person to argue that the origin of moral values and virtues, namely, the self-consciousness of one’s innate moral capacity, is internal instead of external. However, the Han dynasty Confucians were ignorant of Mencius’ xin xing lun and instead defined heaven as the external source of moral values with the help of yin yang cosmology. Lao criticized the Han Confucians for being too “metaphysical” (Lao 1984, vol. 2, 10). Nonetheless, the later Confucian thinkers known as “Neo-Confucians” gradually moved from cosmology to the xin xing lun. Among them, the Lu-Wang school, which argued that the mind itself is the innate moral principle (xin ji li, “the mind is the reason”), is closer to Mencius’ philosophy than the Cheng-Zhu school, which argued that the heavenly command is the supreme moral principle (xing ji li, “the essence is the reason”) (Lao 1984, vol. 3a, 489).

Because Confucians argued that the origin of moral value is internal to every human being, they emphasized the priority of the moral self or of subjectivity. The moral self transcends bodily limitation and postulates moral principles according to its own innate moral capacity. The moral self manifests freedom through moral actualization or virtue completion. The social and cultural orders are not sources of moral values. Rather, they are the instruments that help every individual to actualize his own moral capacity. Confucian emphasis on the moral self leads to a Confucian doctrine of culture: that all cultural phenomena should manifest the internal moral values in every individual. Such a statement is essential to Lao’s analysis of the nature of traditional Chinese culture, which is strongly shaped by Confucian ethics.

iii. “Eastern Spirit” vs. “Western Spirit”

Lao contrasted the “Eastern spirit” with the “Western spirit” by claiming that the Eastern spirit is virtue-oriented while the Western spirit is wisdom-oriented. In Collection of Essays on Cultural Problems (2000), Lao argued that the wisdom-oriented spirit originating from ancient Greek philosophy is the orthodox cultural spirit in the West, unlike the faith-oriented Hebrew spirit (Lao 2000, 30). A wisdom-oriented spirit is a spirit that pursues objective knowledge. The Eastern spirit, however, is a virtue-oriented spirit, which pursues moral order. Easterners aim to establish a proper way of living. Both the Chinese cultural spirit and the Indian cultural spirit are virtue-oriented Eastern spirits. The Chinese cultural spirit, however, emphasises moral actualization and virtue completion, while the Indian cultural spirit emphasises renunciation. The Chinese cultural spirit is dominated by the Confucian spirit (Lao 2001, 218), which aims to construct a “reasonable” or “proper” social order according to human nature (Lao 2000, 48).

According to Lao, the ancient Chinese—unlike Westerners—emphasized social order more than knowledge about the external world. The “reasonableness” of social order is not about objective evidence. Instead, it is about its coherence with innate human nature. As discussed above, Confucius reduced li as social order into yi (righteousness) and ren (humaneness or benevolence). The legitimacy or the moral value of social order does not depend on an external God, monarch, or objective form. Instead, it depends on ren alone, which is internal to the moral self. Li is the actualization of the virtue of ren. The rituals and order between parents and children manifest the filial and family love, while those between monarchs and ministers manifest teamwork, solicitude, and loyalty. As a result, the moral self becomes the only source of all moral and cultural values. The legitimacy of moral and cultural values is not demonstrated by experiences or reasoning.

Under the Chinese cultural spirit, the aesthetic self and the cognitive self are subordinate to the moral self. Because, for Lao, the ultimate concern in Chinese culture is virtue completion or actualization, knowledge and arts are merely instruments for moral practices. For example, poems and verses should manifest proper moral values. Architecture is merely used for the sake of improving people’s well-being, as in the construction of China’s Grand Canal and Great Wall. Arts and sciences are not used for their own sakes. They are all used for the sake of moral actualization.

Having distinguished the Chinese cultural spirit from other cultural spirits, Lao argued that the Chinese emphasis on social order led to a different perspective on interpersonal relationships. Moral actualization requires the actualization of the innate human nature of the moral self instead of other selves. Only a single self participates in moral actualization. Moreover, the Chinese cultural spirit only emphasises the role of the moral self. Lao did not consider the cognitive self and the aesthetic self as independent from the moral self, as all cognitive and aesthetic activities are merely instruments of moral actualization. Therefore, there is only one self in the Chinese cultural spirit, namely, the moral self.

b. Modernization of Chinese Culture

i. The Problem of Objectivity

Based on his philosophy of culture, Lao established his political philosophy and provided criticism of traditional Chinese culture. As discussed above, for Lao, the Chinese cultural spirit is dominated by the Confucian spirit, which only emphasises the actualization of the moral self as a single self, instead of multiple and equal selves. In Lao’s Liberty, Democracy and Cultural Creation (2001), he explained why democracy and modern natural science are absent from traditional Chinese culture.

There are two kinds of interpersonal relations according to Lao, namely, coordinative relations (bing li guan xi) and hierarchical relations (ceng ji guan xi). A coordinative relation is an equal relation among individuals, while a hierarchical relation is unequal (Lao 2001, 224). In order to deal with certain “public affairs,” individual members of the society gather under a particular relation. If the members gather in a coordinative relation, acknowledging each member as an equal individual, they may develop a democracy. However, if they gather in a hierarchical relation, they may have a monarchy or an aristocracy.

The dominance of the moral self leads to a hierarchical relation, for the moral self suppresses the cognitive and aesthetic selves. The authority of the moral self cannot be challenged by the cognitive or aesthetic selves. The cognitive self cannot question the legitimacy of virtue and rituals, as their legitimacy does not rely on reason or empirical evidence. The moral self is a single individual who does not seek anything outward, for it has already attained all innate virtues. The Confucian emphasis on the moral self leads to a hierarchical relation, which is the social and cultural order according to the principle of li.

The absence of coordinative relations in the Chinese cultural spirit implies the absence of democracy and modern natural science. Lao argued that the virtue-oriented Chinese cultural spirit does not acknowledge inter-subjectivity, as the moral self is a single self suppressing all other selves. Without a coordinative relation, equality is impossible. Scientific knowledge is objective knowledge, which is verifiable or falsifiable by empirical evidence. In other words, the authority of scientific knowledge does not depend on any individual but on the objective evidence to which everyone has equal access. Scientific knowledge assumes the concept of equality and coordinative relations. If a scientist S1 verifies that theory T is true with experiment E, other scientists like S2, S3, S4 … and Sn should be able to verify theory T with the same experiment E. There are multiple subjects who are cognitive selves having equal access to the same method and the same knowledge, regardless of their social status.

Social status and roles, however, are important in the Confucian concept of virtue completion or moral actualization. Confucius argued that a monarch should behave like a monarch, a minister should behave like a minister, a father should behave like a father, and a son should behave like a son. The hierarchical order determines the particular ways of virtue completion for different people. A son has no access to his father’s moral practices and vice versa. A minister acting as a virtuous monarch is vicious for having seized the power from the monarch. One cannot actualize his innate virtue if he refuses to act according to his social role.

Hierarchical relations emphasise authority. Education and entertainment are distributed to different people according to their social statuses. A peasant should not play royal music in his home as it is not proper. A student should not condemn his teacher’s teaching openly as it is impolite. The overemphasis on authority prevents the development of modern natural science, which constantly falsifies previous theories. Hierarchical relations also prevent the development of democracy due to the denial of equality.

In short, Lao criticized the Chinese cultural spirit for its overemphasis on the moral self, which suppresses the cognitive and aesthetic selves and denies the existence of a coordinative interpersonal relation. Lao assumed that the Chinese cultural spirit is dominated by the Confucian spirit, which aims to actualize the internal and innate virtues of ren. Although he did not provide a concrete solution to cure the “sickness of Chinese culture,” he argued that if Chinese culture is to be modernized, it should acknowledge the existence of multiple subjects and a coordinative relation and develop an independent cognitive self. Lao claimed that most contemporary Chinese scholars failed to diagnose the problem of Chinese culture. This led to two extreme perspectives on the modernization of Chinese culture: traditionalism and anti-traditionalism. Lao condemned both perspectives as one-sided misinterpretations of the Chinese cultural spirit, as explained below.

ii. The Problem of Traditionalism

Traditionalists are people who argue for the full restoration and preservation of traditional Chinese culture against Western cultural invasion. Traditionalists do not argue against Chinese modernization. Rather, traditionalists argue that there are precious values within traditional Chinese culture that must be preserved. Traditionalists do not reject Western thoughts. Instead, they argue that the introduction of Western thought must be based on the preservation of essential traditional Chinese values. Such a perspective is known as zhongti xiyong (“Chinese [thought for] fundamentals and Western [thought for] practical application”) (Lao 2003, 104). For example, the early 20th century Chinese reformer Liang Qichao (1873-1929) argued that Western constitutional monarchies are consistent with traditional Confucian ethics, while Mou Zongsan tried to reinterpret Confucianism with the help of Kantian ethics.

According to Lao, traditionalists generally follow a Hegelian model of culture, according to which “[internal] values determine the activities which are manifested as [external] systems” (Lao 2003, 104-105). In the Manifesto on the Reappraisal of Chinese Culture (1958), “new Confucian” philosophers, including the aforementioned Mou Zongsan and Tang Junyi, reconstructed traditional Chinese culture by articulating how Confucian values determine the Chinese cultural phenomenon. Tang even argued that conservative attitudes can be progressive. On his view, progress must be based on value consciousness, that is, on how to manifest values in contemporary situations (Tang 1974, 24). The overseas Chinese refugee must be confident in traditional Chinese culture and “re-root oneself spiritually” in order to preserve traditional Chinese values (Tang 1974, 49). However, Lao questioned Tang and his fellow “new Confucians” by asking why Confucian and traditional Chinese values are worthy of preservation and manifestation in the contemporary world. Lao argued that if traditional Chinese culture needs to be restored or preserved, as new Confucians argued, then traditional Chinese culture had already declined. So why did traditional Chinese culture decline? And why is traditional Chinese culture worthy of being preserved? Someone taking Tang’s position might reply by arguing in favour of the unique features of Chinese Confucian values, for example the xin xing lun, and blaming the Western invasion for the decline of traditional Chinese culture. As discussed before, however, Lao argued that the Confucian overemphasis on the moral self and the suppression of the cognitive and aesthetic selves led to that cultural decline. Natural science and democracy failed to develop within traditional Chinese culture until the Western invasion arrived. For Lao, this amounted to a devastating critique of the traditionalist arguments advanced by Tang and his colleagues. Still, this did not mean that Lao embraced the anti-traditionalist view, either.

iii. The Problem of Anti-Traditionalism

As opposed to traditionalists, anti-traditionalists are people who reject the value of traditional Chinese culture entirely in order to achieve Chinese modernization. Hostility towards traditional Chinese culture was the mainstream ideology in early 20th century China. After the First Opium War (1839-1842), the Qing dynasty suffered from invasions and interventions by the West. Chinese people were aware of the weakness of traditional Chinese culture and tried to modernize China through Westernization. After the overthrow of the Qing dynasty and founding of the Republic of China in 1912, the imperial system collapsed, which challenged the hierarchical order of traditional Chinese culture. The May Fourth New Culture Movement in 1919 blamed traditional Chinese culture for being a barrier to Chinese modernization. This movement assumed that modernization is incompatible with traditional Chinese culture.

To evaluate anti-traditionalists’ criticism of traditional Chinese culture, Lao reconstructed the anti-traditionalist argument as follows:

  1. Traditional Chinese culture is a barrier to Chinese modernization.
  2. Confucianism is an influential origin of traditional Chinese culture.
  3. To achieve modernization, one must oppose traditional Chinese culture and therefore reject Confucianism (Lao 2003, 59).

Furthermore, to demonstrate that traditional Chinese culture is a barrier to Chinese modernization, one must demonstrate the truth of two statements: that traditional Chinese culture is influential enough to prevent Chinese modernization and that all factors preventing Chinese modernization stem from traditional Chinese culture. Lao argued that both statements can hardly be substantiated. Traditional Chinese culture was very weak in modern Chinese society after the Opium Wars. The republic replaced the imperial monarchy in the 1912 revolution, and modern written Chinese replaced classical written Chinese in the 1919 May Fourth New Culture Movement. Both the Chinese Communist Party and the KMT, while bitterly opposed to one another, engaged in critiques of traditional Chinese culture (Lao 2003, 59-60).

Lao argued that, while there are counterexamples that reject the first statement, it is difficult to justify the second statement. There are numerous factors preventing Chinese modernization that may come not from traditional Chinese culture, but from humans’ simple animal nature, namely, instincts like selfishness, fear, and so on. People’s fear of change may prevent social reform, and rulers’ ambitions of power may prevent democratic reform. Traditional Chinese culture may have nothing to do with these factors. Therefore, it is unfair to blame traditional Chinese culture for preventing Chinese modernization. Besides, while the Confucian spirit dominates the traditional Chinese cultural spirit, Confucianism is not the only influential origin of Chinese values. Daoism and Buddhism are more influential at the popular level of society than Confucianism. Even if traditional Chinese culture were the major barrier to Chinese modernization, it is unfair to argue that Confucianism is also a barrier, as the traditional Chinese culture is not identical with Confucianism (Lao 2003, 62).

Lao criticized anti-traditionalists for their failure to acknowledge modernization as imitation or learning, rather than destroying one’s own traditional cultural heritage. A native speaker of Cantonese cannot learn English as his second language until he can speak his native language. When he learns English as a second language, he does not give up Cantonese as his native language. Learning is the process of obtaining new abilities based on existing capacity. Without existing capacity, it is difficult to learn any new skill (Lao 2003, 73).

4. Criticisms and Influence

Lao’s writings make it clear that he tended to overemphasize the importance of Confucianism in characterizing Chinese culture at the expense of other traditions, especially Buddhism and Daoism. In Essentials of Chinese Culture, Lao expressed his bias against the Daoist religion. He argued that the Daoist religion has little impact on political systems and moral teachings, as the Daoist religion “is not believed by the scholars” (Lao 1998, 180).

Moreover, Lao assumed that Chinese scholars or philosophers—most of whom were social elites—determined the nature of the Chinese cultural spirit. Given that since the 11th century or so, most Chinese elites have embraced Confucianism, for Lao it seemed obvious that Confucianism defines the Chinese cultural spirit. However, as continental philosophers such as N. F. S. Grundtvig have pointed out, it is questionable whether a cultural spirit is defined by scholars alone. Lao seemed to assume that the working class and folk religion play only small roles in cultural development. However, if a cultural spirit bears the true meaning and values behind all cultural phenomena, one should observe how those cultural phenomena are manifested in reality and how community members interpret a particular cultural phenomenon.

Additionally, Lao’s definition of culture as a “cultural spirit” is problematic. If culture is defined as a cultural spirit, which is value consciousness, it means that all members share and manifest the same values in their cultural behavior, for as long as they are conformists. However, the fact is that it is possible for members of the same culture to interpret the same cultural phenomenon with different values. An uneducated peasant may have no knowledge about the deep meanings behind the rituals of ancestor worship or a traditional Chinese marriage ceremony. He may merely follow the customs and habits without reflection. Alternatively, he may interpret ancestor worship in a very different way from the standard Confucian interpretation endorsed by Confucian scholars. While Confucian scholars interpret the ritual of ancestor worship as a way to show respect and commemorate ancestors, a peasant may practice ancestor worship as a way to ask ancestors to bless his family. Different subgroups, economic classes, or families within the same community may have vastly different interpretations of the same cultural phenomenon. Differences in value consciousness imply different cultural spirits. Thus, the definition of culture as cultural spirit or value consciousness is problematic, as it is very difficult to verify what values are essential to a particular culture.

Even granting that the Confucian spirit dominates the Chinese cultural spirit and that the latter emphasises virtue completion or moral actualization, it is nonetheless unclear why the dominion of the moral self in the Chinese cultural spirit implies a hierarchical relation among individuals. The moral self’s suppression of the aesthetic and cognitive selves does not signify the single self’s suppression of the other selves. A real individual self is a union of the aesthetic self, the cognitive self, and the moral self. The relation between the aesthetic, cognitive, and moral selves should be distinguished from the relation between individual selves. The former can be a suppression within a single individual self, but the latter is a suppression among different individual selves, namely, an individual who oppresses other individuals. Undoubtedly, Lao confused the real individual self with the moral self. More importantly, as Confucianism acknowledges that everyone has the innate moral capacity to achieve virtue completion, it should be able to acknowledge the equality of human beings. Why, then, did classical Confucianism not acknowledge such equality but instead develop a hierarchical relation among people?

Furthermore, when it comes to the role of Confucian values in traditional Chinese culture, Lao seems to contradict himself. On one hand, Lao himself realized that Confucian values have a limited influence on traditional Chinese culture when he distinguished the opposition against traditional Chinese culture from the opposition against Confucianism in his discussion of anti-traditionalism. On the other hand, Lao maintained that the Confucian cultural spirit dominates traditional Chinese culture, assuming that Chinese scholars determined the structure of traditional Chinese culture while Daoism and Buddhism were less influential.

Finally, despite its having been influenced by Western thought, Lao’s philosophy of culture contained a certain bias against Western culture, which weakened his discussions on transcultural dialogue between Chinese culture and Western cultures. Lao displayed no interest whatsoever in Christian contributions to such transcultural dialogue, nor did he acknowledge Christianity’s influence on the modernization of Chinese culture. Considering the fact that Lao spent many years teaching at Chung Chi College, a Protestant Christian institution in which discussions of dialogue between Christianity and Confucianism were frequent and enthusiastic, it is surprising that he had so little to say about Christianity. He devoted only a few pages of The Essentials of Chinese Culture to summarizing the history of Christianity in China. Without offering any evidence, Lao argued that Christianity “has yet to infiltrate the cultural life of the Chinese nation” and that Chinese people have little passion for the Christian faith (Lao 1998, 191).

5. References and Further Reading

a. Primary Sources

  • Kang de zhi shi lun yao yi [Essential of Kant’s Theory of Knowledge]. Hong Kong: Union Press, 1974.
  • Li shi zhi cheng fa [The Punishment of History]. Hong Kong: University Life Ltd., 1971.
  • Zhongguo zhi lu xiang [China’s Way Out]. Hong Kong: Wisdom Publishing, 1981.
  • Xin bian zhong guo zhe xue shi [New Edition of the History of Chinese Philosophy]. Taipei: San Min Book Co. Ltd, 1984-1986.
  • “On Understanding Chinese Philosophy: An Inquiry and a Proposal,” in Understanding the Chinese Mind: The Philosophical Roots, ed. Robert E. Allinson (Hong Kong: Oxford University Press, 1989), 265-293.
  • Zhongguo wenhua yao yi xin bian [Essentials of Chinese Culture]. Hong Kong: The Chinese University Press, 1998.
  • Wen hua wen ti lun ji xin bian [Collection of Essays on Cultural Problems]. Hong Kong: The Chinese University Press, 2000.
  • Wen hua zhe xue yan jiang lu [Lectures on Philosophy of Culture]. Hong Kong: The Chinese University Press, 2003.
  • Xu jing yu xi wang: lun dang dai zhe xue yu wen hua [Illusion and Hope: On Contemporary Philosophy and Culture]. Hong Kong: The Chinese University Press, 2003.
  • Dang dai xi fang si xiang de kun ju [The Dilemma of the Contemporary Western Thoughts]. Taipei: Commercial Press Taiwan, 2014.

b. Secondary Sources

  • Mou Zongsan, Xu Fuguan, Zhang Junmai, Tang Junyi, and Xie Youwei. “Manifesto on Behalf of Chinese Culture Respectfully Announced to the People of the World: Our Joint Understanding of Sinological Study and Chinese Culture with Respect to the Future Prospects of World Culture,” trans. Eirik Lang Harris. Hackett Publishing, 2018.
  • Shen, Vincent. “Obituary of Lao Sze Kwang.” Journal of Chinese Philosophy 40/1 (2013): 215-217.
  • Tam, Andrew Ka Pok. A Discourse on Hong Kong Culture. Hong Kong: Passion Times, 2016.
  • Tang Junyi. Shuo zhong hua min zu zhi hua guo piao ling [On the Falling Flower and the Withering Fruit of the Chinese Nation]. Taipei: San Min Book Co. Ltd, 1974.

 

Author Information

Andrew Ka Pok Tam
Email: k.tam.1@research.gla.ac.uk
University of Glasgow
United Kingdom

Edward Jonathan Lowe (1950-2014)

Edward Jonathan Lowe (usually cited as E. J. Lowe) was one of the most significant philosophers of the twentieth and early twenty-first century. He made sustained and significant contributions to debates in metaphysics, ontology, philosophy of mind, philosophy of language, philosophical logic, and philosophy of religion, as well as contributing important scholarly work in early modern philosophy (most notably on Locke).

Over the length of his career, Lowe published eleven single-authored books, four co-edited collections, and well over 300 papers and book reviews in journals and edited volumes. The range of topics covered in his published work is highly eclectic. Given this, and his prolific rate of publication, this article cannot aim to cover all of the questions that Lowe contributed work on. Instead, it will focus on some of his most significant contributions in metaphysics and ontology, and related topics in other areas of philosophy.

This choice of focus stems, in part, from Lowe’s strong belief in the inescapability of metaphysical questions. Lowe argued for the need to approach metaphysics, and philosophy more broadly, in a serious, systematic fashion, likening metaphysics to putting together the pieces of a gigantic jigsaw puzzle, working with, rather than trying to overrule or being secondary to, natural science.

Although the sections in this article focus on different topics, the highly systematic nature of Lowe’s work means that there are many potential points of intersection that could be drawn between them. In the interests of providing a navigable summary of Lowe’s work, this article highlights only some of these connections.

Table of Contents

  1. Biography
  2. What Is Metaphysics?
    1. The Science of the Possible
    2. The Science of Essence
    3. Metaphysics, and Logic and Language
    4. Metaphysics and Common Sense
  3. Ontology
    1. The Four-Category Ontology
    2. Objects
    3. Properties
    4. Universals
    5. Kinds
    6. Further Formal Ontological Relations
      1. Exemplification
      2. Identity
      3. Composition
      4. Constitution
    7. Persistence and Change
      1. Endurantism vs Perdurantism
      2. Persistence and Intrinsic Change
  4. Essence
    1. What Are Essences?
    2. Modality and Essence
    3. Categoricalism
  5. Mind, Persons, and Agency
    1. The Non-Identity of Mental and Physical States
    2. Non-Cartesian Substance Dualism (NCSD)
    3. The Unity Argument for NCSD
    4. Mental Causation
    5. Agent Causation
  6. Other Work
  7. References and Further Reading
    1. E. J. Lowe
    2. Other References

1. Biography

Lowe was born in Dover, England, on 24 March 1950. He went to Cambridge to study Natural Sciences in 1968, changed to History after one year, and was awarded a BA (first class) in 1971. Lowe then switched to studying philosophy and moved to Oxford. He was awarded his BPhil and DPhil degrees in 1974 and 1975 (supervised by Rom Harré and Simon Blackburn respectively). After briefly teaching at the University of Reading, Lowe moved to the University of Durham in 1980, where he would stay for the rest of his career until his death in 2014.

2. What Is Metaphysics?

In the preface to The Possibility of Metaphysics, Lowe states that his ‘overall objective in this book is to help to restore metaphysics to a central position in philosophy as the most fundamental form of rational inquiry, with its own distinctive methods and criteria of validation’ (1998: iii). This section outlines Lowe’s view on what metaphysics is, how it relates to other areas of research and inquiry, and why metaphysics is, for Lowe, ‘unavoidable’. Understanding the inevitability of metaphysical inquiry, and the relationship of metaphysical research to other areas (including physics and the other natural sciences, but also ‘common sense’ and ordinary perception), is crucial to understanding Lowe’s motivation to defend various first-order metaphysical positions. As such, whilst important in its own right, the significance of Lowe’s views about these metametaphysical issues may only become clear later in this entry, once we begin to grapple with the first-order issues.

a. The Science of the Possible

For Lowe, metaphysics has dual characterisations: as the science of the possible and the science of essence.

As the science of the possible, Lowe does ‘not claim that metaphysics on its own can, in general, tell us what there is. Rather—to a first approximation—I hold that metaphysics by itself only tells us what there could be’ (1998: 9; see also 2006a: 4–5, 2011a: 106; 2007b; 2008a; 2008b; Ms.). Metaphysics is, in part, the process of charting the domain of objective or real possibility, which, Lowe holds, is ‘an indispensable prerequisite for the acquisition of any empirical knowledge of actuality’ (2011a: 100). That is, in metaphysics and ontology we explore how things might be: what is possible and compossible (what things could co-exist). This enquiry into the possible ways reality might be, in conjunction with empirical work, can allow us to get at what is actually the case, for we must, for Lowe, understand what is possible before we can understand what is actual. In this way, metaphysics becomes indispensable as a way to illuminate the features of reality that empirical scientific enquiry presupposes, but it must be combined with that empirical enquiry to arrive at a full account of how reality is.

This claim about the science of the possible also leads Lowe to a position about the methods of metaphysics, holding that metaphysics’ method ‘is first to argue, in an a priori fashion, for the possibility—and compossibility—of certain sorts of things and then to argue, on partly empirical grounds, for the actuality of some of those things that are compossible’ (2011a: 105). Metaphysics is a holistic enterprise, not to be done in a piecemeal way, as the attempt to understand what things exist and, just as crucially, co-exist.

Lowe’s conception of metaphysics is not divorced from experience and empirical data. There is no clear boundary for Lowe between the work of the metaphysicians and that of the theoretical sciences. But this is not to say that there is not a distinctive role for the philosopher. For Lowe, ‘science presupposes metaphysics… Empirical science at most tells us what is the case, not what must or may be (but happens not to be) the case. Metaphysics deals in possibilities’ (1998: 5).

Lowe’s view holds that metaphysics, or more precisely ontology, comes in two parts: ‘one which is wholly a priori and another which admits empirical elements’ (2006a: 4). The a priori part of ontological theorising is best taken to be that part of metaphysics that is the ‘science of the possible’ described above. That is, the a priori part of ontology explores the realm of genuine metaphysical possibility, and what things could co-exist in a single possible world.

Note that the use of ‘possible world’ here is not intended to invoke a commitment to the concrete reality of possible worlds. Lowe rejects Lewis’ modal realism, denying that possible worlds, whether they exist or not, are objects (1988: 256). Rather, ‘possible world’ here is only used as a phrase to highlight that we can produce a number of theories that seek to describe how reality is, and call each of them a possible way that reality could be. The a priori part of ontology is thus devoted to exploring those possible ways that reality might be.

The ‘empirically conditioned’ part of ontology seeks ‘to establish, on the basis of empirical evidence and informed by our most successful scientific theories, what kinds of things do exist in this, the actual world’ (2006a: 4–5). Given that metaphysics is, in part, the science of the possible, we can see that for Lowe metaphysics differs in both its subject matter and methodology from the empirical sciences, but crucially the two exist ‘in a symbiotic relationship, in which each complements the other’ (2011a: 102; see Morganti and Tahko 2017).

By holding that one aspect of ontology is (predominantly) a priori, ontology is methodologically distinct from the empirical sciences. By holding that its subject matter is genuinely possible ways reality could be, its subject matter is distinct as empirical science does not concern itself with how reality could be, only with how it is. But crucially, as ontology has two aspects, and two tasks, it overlaps in one of those tasks with the empirical sciences. This is what gives rise to a truly symbiotic relationship, avoiding many of the issues that arise in other accounts that seek to give priority (epistemic, or otherwise) to either the empirical sciences or to metaphysics.

However, it also brings into focus why, for Lowe, no science can provide the map of reality. Natural sciences are focused on restricted domains, and on what is actual, but grasping what is actual requires us first to know what is possible (2006a: 4). Metaphysics is unavoidable, essential, and cannot be rejected (despite the various arguments that have attempted to do so). For Lowe, metaphysics provides the foundation for natural science, and without that grasp on what is possible, we cannot have knowledge of what is actual, nor come to recognise the implicit (or explicit) assumptions within natural science (see Mumford and Tugby 2013).

b. The Science of Essence

In parallel with the above, as the science of essence, Lowe takes metaphysics to be the task of saying what some entity is such that it is that entity, that is, to provide the real definition for that entity (as opposed to the verbal definition; see Fine 1994). To enquire into the real definition of an entity is to attempt ‘to characterise, as perspicuously as possible, the nature or essence of some actual or possible being’ (2007a). Lowe takes this interest in real definition and ‘essence’ from Aristotle. For example, a characterisation of the essence of a circle, ‘a perspicuous way of saying what it is, or would be, for something to be a circle’ (2007a), is to be the locus of a point moving continuously at a fixed distance around another point. This is what it is, or would be, for something to be a circle.

This focus on essence has meant that Lowe is commonly listed as a key figure in the recent resurgence in ‘neo-Aristotelean’ approaches to metaphysics, taking metaphysics not to be primarily concerned with what exists (as in the neo-Quinean tradition), but rather with the essence of those types of entities that do exist, and the metaphysical relations that hold between them.

This conception of neo-Aristoteleanism should be distinguished from another conception, as discussed by Schaffer (2009). Under this alternative conception, neo-Aristoteleans need not accept essences into their ontology, but they do share the focus on how entities are related to each other, rejecting the neo-Quinean focus on what exists (for more on the neo-Aristoteleanism that Lowe endorsed, see Lowe 2013c, Novotný and Novák 2014, Tahko 2012).

We comment more specifically on Lowe’s notion of essence in a later section. However, it is important to see the links that exist for Lowe between metaphysics as the science of the possible, and metaphysics as the science of essence. To elucidate the essential nature of an entity is to provide the existence and identity conditions for that entity, or that kind of entity. The essence is what dictates what that entity is, or would be.

Note the ‘or would be’ in this account. Lowe is clear throughout his work that he is investigating what entities or kinds of entities would be like independently of whether any of them do actually exist. This is not to say that Lowe thinks that he is engaging in some conceptual analysis around the notion of, say, a circle. Rather it is to say that given that metaphysics is about what is possible, we must understand what it would be for something to be a circle so that we can then consider whether reality does in fact contain anything that fits that real definition. This again shows the connection between metaphysics as the science of the possible and metaphysics as the science of essence.

c. Metaphysics, and Logic and Language

Another domain that is important to highlight in understanding the role and importance of metaphysics is that of language and formal logic. This is particularly the case given the central role in much of the analytic philosophy tradition given to first-order predicate logic with identity. Lowe is clear in his rejection, not of such logic per se, but its assumed dominance, and the types of ontological claims and distinctions that arise from this logical system. For example, Lowe rejects what Smith called ‘Fantology’, the view that the ‘key to the ontological structure of reality is captured syntactically in the “Fa” […] of first-order logic, where “F” stands for what is general in reality and “a” for what is individual’ (Smith 2005: 1; see also Smith 1997; Lowe 2013a: chapter 4).

The central problem with Fantology, for Lowe, is that it equips us with ‘a certain conception of reference and predication which is, from the point of view of serious ontology, extremely thin and superficial’ (2013a: 50). First-order predicate logic with identity only provides a restricted formal machinery that only allows for ontological distinctions between objects and properties, and between existence and identity. These distinctions are most certainly present in Lowe’s ontology; however, there are many more in addition to these two.

A further problem with Fantology comes from its adherents holding to Quine’s maxim that ‘to be is to be the value of a variable’. Lowe holds that ‘∃’ should be analysed as the ‘particular quantifier’ rather than as an existential quantifier. By so doing, the particular quantifier can quantify over non-existent objects, without having to accept Meinong-like distinctions. For expressing existence, Lowe prefers the use of a monadic existence predicate, ‘E!’. This logical machinery, he argues, better suits the ontological framework that he defends, and thus is to be preferred (see 2013: chapter 4).
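
The contrast can be sketched symbolically. The formulas below are an illustrative reconstruction under stated assumptions, not notation quoted from Lowe: the predicate letter F is a placeholder, ‘∃’ is read as the particular quantifier (‘some’), and ‘E!’ is the monadic existence predicate described above.

```latex
% Illustrative sketch only; F is a placeholder predicate, not drawn from Lowe's text.
\begin{align*}
  % Particular quantification, carrying no existential commitment by itself:
  &\exists x\, Fx
    && \text{``something is } F\text{''} \\
  % An explicit existence claim, made with the existence predicate:
  &\exists x\, (Fx \land E!x)
    && \text{``some existing thing is } F\text{''} \\
  % Quantification over what does not exist:
  &\exists x\, (Fx \land \lnot E!x)
    && \text{``something is } F \text{ and does not exist''}
\end{align*}
```

On this reading, existential commitment is carried by ‘E!’ rather than by the quantifier, so the third formula is not self-contradictory, whereas it would be if ‘∃’ itself were taken to express existence.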

This brings us to the main point for Lowe with respect to logic, and language. Understanding language and logic is important, and he does on occasion use arguments from natural language in particular to highlight ontological distinctions (for example between categorical and dispositional predication; see 2013a: chapter 5). But language, and logic, mislead. It is central to Lowe’s philosophical theorising that the hard work of ‘serious ontology’ must come first, and that ontological conclusions cannot be read off of our language or our logic. This is what motivated Lowe’s adoption of a version of sortal logic instead of first-order predicate logic. It is not that sortal logic is intrinsically ‘better’. It is that a version of sortal logic allows Lowe to express the ontological distinctions that he believes exist, which cannot be expressed perspicuously with the tools of first-order predicate logic (2006: chapter 4).

Therefore, in what follows when commenting on Lowe’s first-order views, it should be stressed that arguments that might initially seem to move from grammatical or semantic points to ontological conclusions are not of the form: language expresses facts in this way; therefore, we should adopt the corresponding ontology. Instead, the move always has to be from ontology to a correct language or logical system.

This does not rule out that some distinctions appear in our natural language, in part, due to those distinctions being indicative of corresponding distinctions in reality (Lowe, personal communication). For example, the grammatical distinction between subject and object might exist in our language because there is a relevantly similar distinction in reality between objects and properties. This is not though to read the distinction off of our language; the case for an ontological distinction between objects and properties must stem from ontological rather than linguistic arguments.

d. Metaphysics and Common Sense

As a last point on the more meta- or methodological parts of Lowe’s work, it is important to note the commitment to common sense in Lowe’s metaphysics. Common sense, for Lowe, is the starting point for many metaphysical and philosophical problems. All else being equal, Lowe often appeals to solutions that are the ‘least revisionary’ either with respect to how we perceive the world, or how we typically talk about the world. Coherence with common sense should be retained if possible, and only rejected if ‘moving away’ from common sense yields significant theoretical advantage. Metaphysics will not always follow common sense, but it can be our starting point, when combined with a respect for science that resists scientism. This sensitivity to common sense becomes further apparent in various places throughout the rest of this article.

For example: on the tensed view of time, ‘for what it is worth, I consider it to be a distinct merit of the tensed view of time that it delivers this verdict, for it surely coincides with the verdict of common sense’ (1998: 104); on intrinsic change, ‘it seems to me that if we have to accept one or other of these three solutions to the semantic problem of intrinsic change, then we had better opt for solution (ii), as this is clearly the least revisionary with respect to our common-sense talk of persistence through change’ (1998: 130); on predicates and properties, ‘the idea behind the proposal is the seemingly common-sense one that the property of being F is what all and only the Fs have in common’ (2006: 122); on four dimensionalism about objects, ‘I have grave doubts about the ultimate coherence of this view of things, suspecting that what superficial plausibility it possesses is parasitic upon our prior grasp of the very neo-Aristotelian or “common-sense” conception which it seeks to challenge’ (2009: 18); and on Quine’s ontological relativism, ‘it is not one that should be contemplated as long as the prevailing common-sense ontological scheme can be defended as viable, as I believe it can’ (2009: 90).

This acceptance of a role in our metaphysics for common sense is not to deny Lowe’s view that metaphysics should be approached as the study of the fundamental nature of reality in a serious and steadfastly realist way. Rather, it is to say that for Lowe it is not the case that metaphysicians have some infallible insight on eternal truths, insulated from the human perspective that otherwise might distort our claims. Metaphysics seeks to understand the nature of reality, whilst accepting that any claims about reality will be made from a particular perspective. We have a relation to reality, but ‘that we cannot stand outside ourselves to study that relation need not imply that it cannot be studied by us at all’ (1998: 4).

This last point serves as the basis on which Lowe rejects what he calls the neo-Kantian objection to metaphysics (2001: 4). Lowe argues that we are ourselves part of reality, and so are our thoughts. This means that claims that knowledge of how things really are is impossible founder on a contradiction. Metaphysics, and metaphysicians, must be ‘critical’ (2001: 5). Metaphysics may involve refining concepts, but this is to make those concepts more reflective of reality. That we have a particular viewpoint on the world does not stop this from being possible; rather, it just means that we must be careful to ensure that we are suitably critical.

3. Ontology

a. The Four-Category Ontology

At the heart of Lowe’s metaphysical (and much of his broader philosophical) work is his defence of a four-category ontology. This was developed over a long period of time, with its most extensive exposition in the 2006 book named for it. This ontology, Lowe argues, best allows for a balance between explanatory power and ontological parsimony, and, along with the equally central notion of ‘essence’, provides the basis for a unified account of a wide range of phenomena (as becomes clear in the remainder of this article).

(Note that in earlier work (1989) Lowe defended a three-category ontology but came to believe that an additional category was needed, and theoretically justifiable despite the additional ontological cost. Later, Lowe argues that ‘persons’ may be a further (fifth) fundamental category of entities (2008a). The status of persons is discussed in section 5.)

The four-category ontology explicitly takes its inspiration from the early work of Aristotle, most centrally in the Categories. As Lowe interprets Aristotle:

[Aristotle] articulates a fourfold ontological scheme in terms of the two technical notions of ‘being said of a subject’ and ‘being in a subject’. Primary substances […] are described as being neither said of a subject nor in a subject. Secondary substances—the species and genera to which primary substances belong—are described as being said of a subject but not in a subject. That leaves two other classes of items: those that are both said of a subject and in a subject, and those that are not said of a subject but are in a subject. Since these two classes receive no official names and have been variously denominated over the centuries, I propose to call them, respectively, attributes and modes. (2012a: 97)

Put into Lowe’s preferred terminology, the four-category ontology thus emerges from the intersection of two exhaustive and exclusive ontological distinctions: the first between entities that are substantial and non-substantial (that is, properties or relations), and the second between entities that are universal and particular. This leads to the view that all entities (actual and possible) are assigned to one of the following ontological categories: object (substantial particular), kind (substantial universal), attribute (non-substantial universal), mode (non-substantial particular).
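The way the four categories fall out of two crossed distinctions can be sketched as a small data structure. This is only an illustrative model: the category names are those given above, but the class and function names (`Substantiality`, `Generality`, `category_of`, `all_categories`) are hypothetical labels introduced here, not anything found in Lowe’s work.

```python
from enum import Enum
from itertools import product


class Substantiality(Enum):
    SUBSTANTIAL = "substantial"
    NON_SUBSTANTIAL = "non-substantial"


class Generality(Enum):
    UNIVERSAL = "universal"
    PARTICULAR = "particular"


# The four combinations and the names the text gives them.
CATEGORY_NAMES = {
    (Substantiality.SUBSTANTIAL, Generality.PARTICULAR): "object",
    (Substantiality.SUBSTANTIAL, Generality.UNIVERSAL): "kind",
    (Substantiality.NON_SUBSTANTIAL, Generality.UNIVERSAL): "attribute",
    (Substantiality.NON_SUBSTANTIAL, Generality.PARTICULAR): "mode",
}


def category_of(substantiality, generality):
    """Return the category fixed by one value from each distinction."""
    return CATEGORY_NAMES[(substantiality, generality)]


def all_categories():
    """Cross the two distinctions; every combination names one category."""
    return sorted(
        category_of(s, g) for s, g in product(Substantiality, Generality)
    )
```

Because each distinction is exhaustive and exclusive, every pair of values picks out exactly one category, and no category is picked out twice.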

A note on this terminology: Lowe preferred the terminology of ‘mode’ as he took inspiration (and the term) from Locke for his category of particular non-substantial entities. These are property- or relation-instances, and are elsewhere, including by Lowe, called ‘tropes’, ‘abstract particulars’, or ‘individual accidents’.

All four of these categories are equally basic or fundamental. Terms such as ‘universal’, ‘particular’, and ‘entity’—the all-encompassing category that all entities, both universal and particular, belong to—are taken to be transcategorial as they apply to entities from multiple categories.

The four fundamental categories are related to each other through patterns of instantiation and characterisation relations. A particular object is an instance of a kind—a particular tiger is an instance of the kind tiger; and a particular mode is an instance of the non-substantial universal or attribute—the particular redness, say of a particular ball, is an instance of the non-substantial universal redness. The instantiation relation thus tracks the distinction between universal and particular.

The characterisation relation holds along the other dimension of the four-category ontology, between the substantial and non-substantial. A particular redness characterises the particular substance whose redness it is, and the non-substantial universal (attribute) redness characterises the substantial universal (kind) tomato.

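Taken together, these categories and relations can be summarised in what Lowe called the ‘Ontological Square’ (2006a: 18), a diagram with kinds and attributes at its top corners and objects and modes at its bottom corners. Instantiation runs vertically (objects instantiate kinds, modes instantiate attributes), characterisation runs horizontally (kinds are characterised by attributes, objects by modes), and exemplification runs along the diagonal between objects and attributes.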

It should be stressed that these relations are not further elements of being. That is, the ‘relations’ of instantiation and characterisation are strictly ‘formal’. This is connected to, but strictly not the same as, the form/content distinction in logic: ‘The ontological form of an entity is provided by its place in the system of categories, for it is in virtue of a being’s category that it is suited or unsuited to combine in various ways with other beings of the same or different categories’ (2006a: 48). Instantiation and characterisation (and other relations discussed below) are thus not relational properties; they are formal relations that illustrate how those entities they relate are. This means that formal relations metaphysically explain the nature of those entities they relate, without those relations themselves being further things (compare the notion of internal relation drawn from Moore 1919).

That these formal relations are not further elements of being also has an additional benefit for Lowe’s system in that it avoids the possible threat of Bradley’s Regress. Bradley’s Regress arises when we consider what explains the claim that objects and properties (or bundles of properties) are related (see Bradley 1893). If we conceive of such relations as distinct from their relata, then we would need to posit further relations to relate them to the original relata, and so on ad infinitum. The formal nature of instantiation and characterisation for Lowe ensures that this problem does not arise. The formal ontological relations are not distinct from the relata they relate, and hold purely in virtue of the existence and intrinsic nature of the relata.

Similarly, the categories themselves are formal and are not further elements of being. Rather the categories indicate the ontological form of the entities that fall under that category, and how those entities that fall under distinct categories are related to each other. The ontological categories therefore do not themselves exist.

The main reason for the non-existence of the categories themselves is that all entities that do exist must fall under one of these categories, but the categories themselves cannot be so analysed. In brief, the categories cannot be universals, as universals have particulars as their instances; if the categories were universals, then they would have to have universals (such as the kind dog) as instances. One way out of this is to posit the category of kinds as being a particular, say a set (an abstract particular object). However, this immediately raises the problem of requiring that this set (the set of the categories) is a member of itself. This is a sufficient problem for Lowe for him to reject this possibility.

One further possibility is to take the categories to be ‘higher-order’ universals, which would therefore have the first-order categories as their instances. However, and leaving aside Lowe’s general reluctance to accept higher-order universals, the higher-order universals that the different categories would belong to would have to be different under Lowe’s system: the category of kinds would be a second-order universal as its instances are other kinds that are themselves universals; whilst the category of objects would be a first-order universal as its instances are particular substances. Given that the categories would not in fact be of the same order, they would not actually be the same kind, and therefore categories cannot be higher-order universals (2006a: section 3.3; see also Griffith 2015, Miller 2016).

To be clear, this is still a realist account of ontological categories, despite the categories not themselves being elements of being:

An object is different from a property or a mode in virtue of the intrinsic natures of these entities, quite independently of us and our ways of describing or thinking of things. We place things in different ontological categories correctly if we distinguish them rightly in respect of these intrinsic and objective differences between them. (2006a: 43–44)

As discussed in more detail below, we categorise correctly if we correctly account for the existence and identity conditions of an entity—or the essence of that entity—which will be in line with which of the categories the entity falls under.

For example, it is part of the essence of a mode that it depends for its existence and identity upon the object that it characterises and that it is an instance of an attribute. All modes are intrinsically different from all entities that fall under other categories due to these mind-independent existence and identity conditions.

The categories and the relations that hold between them create various forms of asymmetrical dependence relations. Particular modes are ontologically dependent on the particular substance that they are a mode of. Indeed, for Lowe, a mode can only be the mode that it is if it is a way that that particular substance is. For example, the particular mode of redness of a particular apple cannot characterise any other particular object. However, that particular object (the apple) could have been characterised by a different mode. Therefore, the particular object is only weakly dependent on the modes that characterise it, whilst the modes are strongly dependent on the particular objects that they characterise.

Similar asymmetrical dependency relations hold between the other categories. Some non-substantial universal (say, the attribute redness) is weakly dependent on the particular modes that are instances of it. The attribute redness would still exist if all of the actually existing redness modes did not exist, just so long as at least one redness mode did exist. The same is not the case for a particular mode, and so the mode is strongly dependent on the existence of the attribute that the mode is an instance of.

The role of dependency relations within this ontological system is important. However, Lowe holds that dependence is not so much a single relation as a family of relations, including, at least, rigid existential dependence, non-rigid existential dependence and also identity dependence. Dependence, though genuine, is not fundamental, but rather is ‘founded’ upon other formal ontological relations that are more ontologically basic (see Lowe 1994; Tahko and Lowe 2015).

It should be stressed that the non-fundamental nature of dependency should not lead us to think it is unimportant. The variety of dependency relations, and the ‘founded’ nature of dependence, allows for a wide range of intricately distinct dependence relations of differing modal and metaphysical strengths. This feature becomes clear throughout the whole of this article by discussing a range of formal ontological relations, and the key role that they play in differentiating entities and categories of entities.

Beyond the fundamental categories, Lowe argues that a complete metaphysical picture of the world will contain further categories, which are interrelated in a hierarchical structure. This allows Lowe to say that there are both more general but non-fundamental categories (‘substantial’, ‘non-substantial’, ‘entity’, ‘universal’, and ‘particular’), and less general non-fundamental categories (such as ‘concrete objects’ and ‘events’). This acceptance of a hierarchy of entities means that Lowe is not committed to the claim that there only exist fundamental entities. Instead, clearly non-fundamental entities (such as money, or a dog) can exist, and the task of the metaphysician, in line with neo-Aristotelian claims, is to map how such entities are related to each other. It is the case that all of the fundamental categories are occupied, but ‘there is plenty of scope to debate whether or not various subcategories of those basic ones are filled in actuality’ (2006a: 44). Lowe provides an illustrative example of the hierarchical system he ‘favours himself’, but notes that it is only a ‘partial sketch’ (2006: 8).

As noted in passing in the initial description, it is important to stress that Lowe sees this categorial scheme as applying to both actual and possible entities, and suggests that it is the role of the empirical sciences, not philosophers, to decide what entities are actual:

Metaphysics should not be in the business of dictating to empirical scientists precisely how they should categorise the theoretical entities whose existence they postulate. Metaphysics supplies the categories, but how best to apply them in the construction of specific scientific theories is a matter best left to the theorists themselves, provided that they respect the constraints which the categorial framework imposes. (2006a: 19)

b. Objects

Lowe defends the idea that particular objects cannot be mere bundles of properties (either of tropes or of universals), nor should they be thought of as some mixture of properties and a ‘mysterious substratum’ (2006a: 28, see also 2000b). The view here is that objects are an irreducible and basic category of entity, which, as part of their essential nature, perform a ‘supporting-role’ for particular property-instances. Thus:

According to my conception of objects, an object is not a complex which is somehow constituted by a collection of particular properties together with some further entity which is itself neither a particular property nor a propertied object. The mistake is to suppose that an object is even partially constituted by its particular properties, as this inverts the true direction of ontological dependency between object and property. Particular properties are no more (and no less) than features or aspects of particular objects, which may indeed be selectively attended to through a mental process of abstraction when we perceive or think of particular objects, but which have no being independently of those objects and which consequently cannot in any sense be regarded as ‘constituents’ of objects. In this respect, the particular properties of an object differ radically from its parts, if it has any, for these are just further objects with particular properties of their own. (2006a: 97)

Justification for this view—for the additional posit in our fundamental ontology of particular substances or objects—comes largely from what Lowe sees as flaws or confusions in the competing views.

On the bundle theory, Lowe thinks that the problem is that if we take that view, we cannot provide adequate identity-conditions for property-instances: ‘Property-instances are ontologically dependent entities, depending for their existence and identity upon the individual substances which they characterize, or to which they “belong”’ (2006a: 27). They cannot ‘float free’ from an individual substance, as properties are ‘ways that objects are’.

On ‘substratum’ views, we are in danger of being committed to ‘bare particularism’, ‘or to the notion of a property-less “substratum” that somehow “supports” and “unites” the properties of a single object’ (2006a: 27). The mistake here is to think of individual substances as complex entities that are composed of a non-propertied substratum and some properties. Rather, Lowe thinks that particular substances are simple fundamental entities that are weakly dependent on property-instances, in that all particular substances are some way, and so must be characterised by at least one property-instance. But this does not make them complex, nor make the individual substance’s properties items that compose that individual substance. This also does not mean that objects cannot be composite. Some objects, living organisms for instance, may be made up of lesser substantial parts.

Together, these motivations lead Lowe to hold that we have reached ‘explanatory bedrock’ in the concept of ‘substance’ or ‘object’, and thus that we should accept the category of ‘individual substance’ into our ontology.

c. Properties

Some further comment is required on Lowe’s views about properties, particularly given his commitment to the existence of both universal and particular properties.

Properties are ways of being, or ways that objects are (2006: 90–91). The particular property of ‘redness’ thus is a way that some object is, and the universal property of redness is a way that more than one object is, such that those objects can be said to be the same colour. This means that properties are not objects as, in line with the above, they cannot exist independently. Properties are strongly dependent on objects, but objects only weakly so on properties.

Relational properties are non-formal and are also taken to be ways that objects are, but ways that two or more objects are such that they are related. As such, relational properties are further elements of reality, and do not hold purely in virtue of the nature of those entities they relate. For example, if ‘loving’ is a genuine existing relation such that ‘John loves Mary’ is true, then it is a non-formal relational property. The ‘loves’ relation tells us something about the way that John and Mary are.

Lowe is an ‘immanent’ realist about universals. This is because Lowe thinks that entities that do not exist ‘in’ space and time, such as transcendent universals, are causally inert and therefore cannot play the role in perception and causation that properties of objects are required to play (2006a: 98). However, Lowe rejects the view that an immanent universal is ‘wholly present’ in all of that universal’s instances, due to the view being committed to an ‘inexplicable mystery which borders on incoherence’, in having to hold that the same universal could be wholly present in two places at the same time.

Instead, Lowe supports a ‘weak’ doctrine of immanence which ‘just amounts to an insistence upon the instantiation principle—the principle that every existing universal is instantiated. Applied to a universal such as the property of being red, it implies that this universal must have particular instances which exist “in” space and time, but it doesn’t imply that the universal itself must literally exist “in” space and time’ (2006a: 99). This solution, though, requires a commitment to both the existence of (instantiated) universals and modes. We have already seen that this is something that Lowe is willing to endorse, but again it is of note that the holistic and systematic nature of Lowe’s ontological theorising is part of the reasoning that gives rise to these commitments.

It should additionally be noted that, in line with the comments above about Lowe’s views on the relationship between language and metaphysics, Lowe does favour a somewhat sparse conception of properties, at least in the sense that he does not think that every meaningful predicate refers to a real property (2006a: 122). In fact, Lowe generally is of the view that far fewer than all meaningful predicates express real properties; however, the job of the philosopher is not primarily to decide which predicates are the ones that express real properties. That, rather, is left to the more empirically informed aspect of our research, just so long as the overall ontological framework is taken into account when considering each case.

d. Universals

A further reason for the positing of non-substantial universals comes from the commitment to kinds, or substantial universals, for if there are kinds, then it cannot be that kinds are characterised by particular property-instances, but instead must be characterised by universal properties. That is, given that Lowe defends the existence of kinds (more on this in a moment), such kinds must be characterised by universal properties, not by particular properties. A particular property, or mode, characterises a particular substance, not a kind of object. Universal properties can only characterise universal substances, and particular properties can only characterise particular substances.

Perhaps the main reason that Lowe endorses the existence of universals comes from concerns about laws of nature, arguing that to account fully for such laws we must posit both substantial and non-substantial universals.

Lowe criticises one common universal-invoking account of laws: that natural laws are relations between universal properties as a second-order relation of necessitation (see Armstrong 1983). Under such views, the form of a law is ‘F-ness necessitates G-ness’ and this entails the constant conjunction amongst particulars that ‘For any x, if x is F, then x is G’. However, Lowe argues that laws do not in fact entail constant conjunctions amongst particulars, because ‘laws—apart, perhaps, certain fundamental physical laws—admit of exceptions, which arise from the possibility of interfering factors in the course of nature’ (2006a: 29).

Lowe argues that we should think of laws of nature as determining ‘tendencies’ in the particular objects that they apply to, which result from the complex interaction of multiple laws. This means, and leading from the ontological square above, laws consist, in the simplest cases, of kinds being characterised by some non-substantial universal or property, or, in two or more kinds being characterised by a relational universal. There is no need to invoke second-order necessitation relations, and we can more directly read the correct form of a law from our everyday talk: ‘The basic form of a law is not ‘F-ness necessitates G-ness’, but ‘Ks are F’, or ‘Ks are R-related to Js’, where ‘K’ and ‘J’ denote substantial universals, ‘F’ denotes a property and ‘R’ denotes a relation—that is, where ‘F’ and ‘R’ denote non-substantial universals’ (2006a: 30).

For example, if it is a law that ‘rubber stretches’, this is to say that the kind ‘rubber’ is characterised by the non-substantial universal of ‘stretchiness’, or if it is a law that ‘Protons and electrons attract each other’, this is to say that the kind ‘proton’ and the kind ‘electron’ are characterised by the ‘attraction’ relation.

This account, additionally, has the benefit of distinguishing logically between statements of laws, and the corresponding generalisations—between ‘Violets are blue’ and ‘All (particular) violets are blue’ (2006a: 94). One is a statement of law; the other is a statement about all instances of a kind. That is, one tells us something about the nature or essence of the kind ‘violets’, whilst the other tells us something about all the particulars of that kind, which might be something that is not of the essence of the kind. For example, ‘all swans are white’ might be true in that all particular swans might be white. ‘Swans are white’, in contrast, is a statement about the kind swan and is false as the kind swan is not characterised by the property of whiteness—it is not part of the essence of that kind.
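The two contrasts drawn above (between the necessitation account and Lowe’s account, and between a law and its corresponding generalisation) can be set out schematically. The notation is an illustrative assumption and is not Lowe’s own sortal-logic notation: N stands for the second-order necessitation relation, Char for characterisation, and Inst for instantiation.

```latex
% Necessitation account: the law entails a constant conjunction among particulars
N(F, G) \;\models\; \forall x\,(Fx \rightarrow Gx)

% Lowe's account: a law is a kind characterised by an attribute (``Ks are F'')
\text{Law:}\quad \mathrm{Char}(K, F)

% The corresponding generalisation concerns every particular instance of the kind
\text{Generalisation:}\quad \forall x\,\bigl(\mathrm{Inst}(x, K) \rightarrow Fx\bigr)

% Because laws admit of exceptions, the law does not entail the generalisation
\mathrm{Char}(K, F) \;\not\models\; \forall x\,\bigl(\mathrm{Inst}(x, K) \rightarrow Fx\bigr)
```

Nor does the generalisation entail the law: on this rendering, ‘All swans are white’ could be true of every particular swan while ‘Swans are white’ remains false of the kind.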

Thus, under this account, we have no need to invoke some new relation (in the formal or the ontological sense) to explain laws of nature—all that is required is the already posited relation of characterisation, but on this occasion holding between substantial and non-substantial universals instead of particulars. No further second-order necessitation relation is required.

Lowe does not think that laws of nature are (always) necessary states of affairs. This is because ‘natural’ or ‘physical’ necessity, that which laws of nature are about, is a species of ‘relative’ necessity: ‘a matter of what is necessarily the case given that some contingent truth obtains’ (2006a: 132). Natural necessity is therefore not the same as genuine metaphysical necessity. As it is not the case, for Lowe, that all natural laws concerning a kind involve all and only those properties that belong to the essence of that kind, the laws of nature may not be necessary in the metaphysical sense. For example, Lowe denies that it is part of the essence of water that it dissolves salt, as he thinks it possible that water (the same substance) could exist in a possible world in which it does not dissolve salt. Instead,

[a]t most we can say that if there is a law, in a given possible world, that water dissolves common salt, then it follows of necessity in that world that any particular quantity of water has a tendency or disposition to dissolve any piece of common salt with which it may come into contact. (2006a: 132)

This, of course, leaves open the question of what is the essence of water; this is discussed in section 4. However, we can see that laws of nature about water are only physically necessary, as it could have been that water was different, which rules out such laws being metaphysically necessary.

e. Kinds

Kinds, or substantial universals, are, for Lowe, abstract objects. This is because kinds satisfy two plausible ways in which an entity might be thought to be abstract.

First, abstract could be contrasted with concrete, where a concrete entity exists in space and time, whilst an abstract entity does not. We should not take this to mean that abstract entities and concrete entities have different types of existence. Rather, to be abstract or concrete is to have certain sorts of properties, or better to essentially have certain sorts of properties. Thus, we can hold that an entity is concrete if it essentially has spatiotemporal properties or relations, and an abstract entity does not essentially have any spatiotemporal properties or relations. A table is concrete as a table essentially possesses some spatiotemporal properties (a particular table must be somewhere and somewhen), whilst numbers are abstract as they do not essentially have spatiotemporal properties.

Second, an abstract entity is one that is logically incapable of existing independently. Here, we mean metaphysically independent rather than being independent in thought. So an abstract entity is one that cannot exist independently of some further entity. For example, as we have seen, the particular shade of red of an apple might be thought to be incapable of existing without the further existence of some other properties of that apple, or the apple itself.

Thus, to give an example, the kind horse does not essentially possess any spatiotemporal properties even if particular horses do. Although an immanent realist, Lowe does not think the kind is ‘wholly present’ where the instances are. The kind ‘horse’ also cannot exist independently of there being instances of that kind. This also is in line with the weak immanence thesis such that every existing universal is instantiated (2006: 99–100).

Thus, we can conclude that kinds must be objects, as to be an object is to have determinate identity conditions, where if x and y are objects, then there will be some ‘fact of the matter’ as to whether x is identical to y or not (1995: 511–513); and are abstract for the above two reasons. Note that this does not mean that Lowe denies that there might be particular objects that are abstract. If numbers should be thought of as objects, then they would appear to satisfy the conditions of being abstract particular objects. Again, Lowe’s conception of ontology is such that it need not take a firm position on this in order to delineate the categories and their formal characteristics. Instead, Lowe was keen to build a system first, and to later consider what entities, if any, fall under which categories.

f. Further Formal Ontological Relations

Despite the focus on characterisation and instantiation in the preceding discussion, they are not the only formal ontological relations that Lowe is committed to. The following will briefly summarise some other key formal ontological relations in Lowe’s system. Dependence will not be discussed directly. This is not because Lowe had little to say about dependence; in fact, the opposite is true (see 1994, Tahko and Lowe 2015), though his work on the topic is hard to summarise in an accessible way. Rather, it is because, as noted above, dependence is taken by Lowe to be a family of relations, founded upon other formal relations, including those mentioned above, and those discussed in this section.

i. Exemplification

The relation of exemplification holds, diagrammatically, diagonally between particular objects, and non-substantial universals. Therefore, Lowe holds that an object can exemplify an attribute in two ways: the object may instantiate a kind, which is characterised by the attribute; or the object may be characterised by a mode which instantiates that attribute. Exemplification is thus not fundamental, as it can be analysed as two distinct patterns of instantiation and characterisation relations.

Though not fundamental, exemplification is important in Lowe’s system, as is the distinction between the two ways in which an object might exemplify an attribute. This is because these two ways to exemplify an attribute express the distinction between dispositional and occurrent (or categorical) predication: the difference between saying ‘This stuff dissolves in water’ and ‘This stuff is dissolving in water’. For Lowe, both of those predications express the exemplification of the same attribute (non-substantial universal), but do so in distinct ways.

On Lowe’s view, then, it is not strictly correct to distinguish, as many do, between occurrent (or categorical) and dispositional properties. But this distinction does have an ontological ground; it is not merely a difference in language:

A sentence of the form ‘a is occurrently F’ means ‘a possesses a mode of Fness’, whereas a sentence of form ‘a is dispositionally F’ means ‘a instantiates a kind K which possesses Fness’. Thus, according to this view, properties (in the sense of universals) primarily characterize kinds and only derivatively or indirectly characterize individual substances or objects. (2006a: 125)

This is an ontological difference in how this indirect characterisation occurs, although not one where the difference lies in there being distinct types of properties (compare Heil 2010 and the view that properties are ‘powerful qualities’).

ii. Identity

Identity, for Lowe, is purely formal, rather than being a relational property. This is because, whilst Lowe does not think that questions about self-identity are trivial (they involve complex issues about identity conditions) or unimportant (it is in virtue of self-identity that objects are countable and can constitute a plurality), identity (and self-identity) is a necessary condition upon the existence of objects. This makes identity too fundamental to be something in the world; rather, it describes how items are in the world.

As detailed in section 4, understanding the identity conditions of an object is a crucial aspect to understanding the essence of the object, as these identity conditions are supplied by the kind that the object instantiates, which is itself part of the essence of that object—an object cannot become a different kind of object without the original object ceasing to be.

iii. Composition

The distinction between constitution and composition is important for many reasons, but in Lowe’s work it is perhaps best known as being at the centre of the claim that a statue and the lump of bronze that it is composed of are distinct. That is, Lowe defends the view that where there is a statue, there are two non-identical overlapping objects: the statue and the lump of bronze (see 2009a: chapter 6; 2006a: 49–51).

Composition is a many-one relation that holds between a (non-simple) whole and (some) of its proper parts. A bronze statue is composed of the bronze atoms that are the proper parts of the statue. The conditions under which an object can be composed are given by the kind (at the relative level of composition) that the object is an instance of. Thus, the composition conditions of a bronze statue are different from the composition conditions of the bronze atoms (which are composed of sub-atomic particles) that compose the statue. Therefore, as an example, some bronze atoms compose a lump of bronze at a time t just in case (1) those bronze atoms are fused together over a period of time to which t belongs and (2) during that period there are no other bronze atoms with which any of them are fused. (2) ensures that the lump of bronze is ‘maximal’, meaning that during the period in which we are discussing the composition conditions of that lump of bronze, it cannot be fused to further bronze atoms; that is, the lump of bronze is not a proper part of some further larger lump of bronze (2006a: 50). (2) also ensures that there cannot be two spatially coinciding objects of the same kind, a possibility that Lowe rejects due to the problems that it gives rise to with respect to individuating those distinct objects (1998: 202; 2002a: 71).

From this Lowe concludes that a lump of bronze and the statue it composes have different composition conditions, as it is the case that the condition on a statue must include its shape, whilst this is not the case for a lump of bronze. Furthermore, a statue can be composed of different lumps of bronze over its lifetime (1995b). This is not possible for a lump of bronze, for if a lump of bronze were to lose one of its bronze atoms then it would cease to be that (original) lump of bronze. In this way, composition conditions are closely related to persistence conditions.

iv. Constitution

Constitution, for Lowe, is not identity, but rather is ‘the closest way in which two entities can be related while still remaining numerically distinct’ (2006a: 51). Perhaps the closest Lowe comes to providing a precise definition of constitution, though one explicitly restricted to cases in which both x and y are composite objects, is that ‘x constitutes y at time t just in case x and y coincide spatially at t and every component part of x at t is also a component part of y at t, but not every component part of y at t is also a component part of x at t’ (2009a: 89). Constitution can be said to result in the view that there can be two distinct spatially coinciding objects.

An example is perhaps the best way to get at Lowe’s conception here. Take again our statue and lump of bronze. The statue has the shape and weight it has in virtue of the shape and weight of the lump of bronze. The phrase ‘in virtue of’ is for Lowe ‘typically apt’ when constitution holds. But although the lump of bronze constitutes the statue, they are distinct entities. We know they are distinct because they have distinct identity and existence conditions. This is not, though, discovered empirically; rather, we know it because in grasping part of the essence of statues and lumps of bronze, we know that they have distinct essences and thus are distinct entities (2008b: 46).

That constitution is not identity is also important for Lowe’s solution to the problem of Tibbles, raised by Geach (1980) in his argument for relative identity. Without going into full detail of the case, we want to say that both ‘Tibbles is a cat’ and ‘The lump of feline tissue, c, is a cat’ are true. This raises a puzzle if we imagine some proper part cn of c that contains all of c except for one hair. If ‘c is a cat’ is true, then presumably so is ‘cn is a cat’. Extending this, we now seem to have to accept, as the full example goes, 1001 different cats all sitting on the mat.

Lowe’s distinction between constitution and identity allows for a solution to this. Lowe argues that we must recognise that the sortal terms ‘lump of feline tissue’ and ‘cat’ have different criteria of identity associated with them. The removal of one part of a lump renders the remaining lump a different lump—we cannot take a hair away from c without destroying c. However, the same is not the case with Tibbles, as Tibbles might, as in other extensions to the example, lose its tail, but this would not destroy Tibbles.

This difference in the criteria of identity of c and Tibbles indicates to us that there are two different senses of the predicate ‘-is a cat’. c is a cat only in the sense of c constituting a cat, whilst Tibbles is a cat in the sense of Tibbles being an instance of the sortal kind ‘cat’. The apparent ambiguity of ‘-is a cat’ is resolved once we recognise the distinction between the ‘is’ of constitution and the ‘is’ of instantiation (2009a: chapter 6).

g. Persistence and Change

i. Endurantism vs Perdurantism

Lowe’s views on persistence, temporary intrinsics, and change are perhaps best shown in a dialogue that he had with David Lewis in Analysis in the late 1980s. In these papers, Lowe outlines his rejection of temporal parts, and of Lewis’ still standard distinction between endurantism and perdurantism. In Lewis’ terminology, ‘something perdures iff it persists by having different temporal parts, or stages, at different times, though no one part of it is wholly present at more than one time; whereas it endures iff it persists by being wholly present at more than one time’ (Lewis, 1986: 202).

Lowe rejects this way of framing the question, and thereby rejects both endurantism (so conceived) and perdurantism. On endurantism, Lowe argues, in parallel to the above arguments about universal properties, that ‘there is no useful notion of such a thing being “wholly present” at a time’ (1987a: 152). The issue is that ‘wholly present’ must be contrasted with the notion of ‘partially present’. However, if we were endurantists on Lewis’ conception, the idea of a ‘partially present’ object would simply make no sense.

Lowe rejects perdurantism as he rejects the existence of temporal parts for ordinary, concrete objects, noting that he finds the notion ‘scarcely intelligible’ (1987a: 152; he accepts as possible, though without committing himself to, the view that events and processes have temporal parts; see 1998: 99–100).

More substantively, Lowe thinks that the only way to get some grip on the notion of temporal parts is by analogy to spatial parts. Lowe thinks that concrete things can only have spatial parts if the things are extended in space. Following the analogy, a concrete thing can only have temporal parts if it is extended in time. However, this means that the debate is no longer about endurantism or perdurantism:

the perdurance versus endurance debate doesn’t really hinge upon issues in mereology (the study of part–whole relations) as such, but rather upon the question of whether anything…is extended in time, in anything like the way in which things are extended in space. But this is at bottom a question about the nature of time, rather than a question about the nature of things existing in it. The question is whether we can properly talk about time as being some sort of dimension of reality, relevantly akin to the three dimensions of space. (1998: 102)

Indeed, Lowe, in a set of later papers (though clearly echoing the above sentiments), ultimately argues that 3D (roughly endurantist) and 4D (roughly perdurantist) descriptions of the world ‘are equivalent in the sense of being intertranslatable without remainder, and [Lowe and McCall] take the position that there is no “fact of the matter” as to whether we live in a 3D or 4D world’ (2006c: 570; see also 2003b).

The main reason for this is that in the case of some particles that have no parts, and exist at only one time, we can describe those particles ‘indifferently’ as instantaneous 4D temporal parts, or as 3D objects that exist only at one time, with a one–one relationship between such descriptions (2006c). This claim should be considered alongside Lowe’s related but separate claim that perdurance and endurance accounts of persistence are in fact equally good at handling problems such as vagueness (see 2005b, a response to views expressed in Sider 2001 and Hawley 2001).

ii. Persistence and Intrinsic Change

What then is Lowe’s view about persistence over time? To get at this, we must distinguish between the metaphysical and the semantic problems, as each requires its own answer. Indeed, Lowe thinks that Lewis’ solution fails in part because it tries to provide both a semantic and a metaphysical solution at once (1988).

The semantic problem is that of specifying the logical form of sentences ascribing temporary intrinsics to objects. Lowe’s solution is ‘adverbialism’. On this solution, the correct form of such sentences is ‘a is-at-t F’; that is, it is the having of a property that is relativised to a time. Thus, an object is not simply characterised by a property, but instead the relation of characterisation (or whatever relation holds between an object and a property that it has) is relativised to time. The main reason for Lowe’s endorsement of this is that he thinks it is the ‘least revisionary’ to our common-sense talk of objects persisting through change when compared to solutions that relativise the property, such that it is in fact a relational property, and solutions that relativise the object itself, that is, posit temporal parts.

We can see the intuitive pull of this view when we (cautiously) consider the analogous case of spatial properties: the sentence ‘The Thames is broad in London’ is best analysed, to understand what we mean by it, as ‘The Thames is-in-London broad’. Again we can see that the ascription of a property is, in this case, spatially relativised (1988: 73–75).

The metaphysical problem of intrinsic change is the problem of how there can be objects for which the semantic problem arises, that is, how there can be objects that can seemingly survive through change. To this problem, Lowe’s solution is that the identity over time of objects is founded in the preservation of certain relationships between that object’s constituent parts at any given time. Thus, a tree can survive a change in its properties because its ‘diachronic identity is consistent with a degree of replacement and/or rearrangement amongst its components, sufficient to allow for growth and maturation and so forth’ (1988: 76). This replacement or rearrangement explains how an object can change its shape and yet remain the same object, as the change in shape can be explained as a change in the relations between the object’s constituent parts, and the shape of the object supervenes on these relations between its constituent parts.

There are two main consequences of this view (see Lewis 1988 for a discussion of both). First, Lowe has to deny that constitution is identity. However, we have already seen that Lowe accepts this claim independently. Second, Lowe is committed to there being fundamental particles that have their intrinsic properties unchangeably. This, again, is something Lowe is willing to accept, arguing that classical atoms and fundamental particles of modern physics are posited as having their properties unchangeably.

Note that the question of how much change an object can persist through—its persistence conditions—has not been addressed so far. Lowe’s proposed solution, as with other notions such as identity, composition, and existence conditions, is that an object inherits its persistence conditions from the kind that that object is an instance of, and in turn the kind has those specific conditions as it is part of the essence of that kind.

The topics of persistence and change are, of course, related to questions about the nature of time. For some of Lowe’s writings on time, see (1987b, 1992, 1998) where Lowe holds an adverbial view, or (2005b) where Lowe leans towards presentism.

4. Essence

As the last paragraph of the preceding section made clear, the notion of essence plays a significant role in Lowe’s metaphysics. This section outlines what Lowe means by essence and how we might come to know the essence of some entity, and it highlights some further crucial theoretical roles that essence plays for Lowe and his ‘serious essentialism’, the thesis that essences exist, but are not further entities (see 2013a: chapter 8).

a. What Are Essences?

Lowe claimed that the closest that we have to a definition of what an essence is comes from Locke: ‘the very being of any thing, whereby it is what it is’ (Locke, 1975: III, III, §15). Alternatively, we can approach the notion via the Aristotelean idea of a ‘real definition’, as opposed to a ‘verbal definition’: ‘A real definition of an entity, E, is to be understood as a proposition which tells us, in the most perspicuous fashion, what E is—or, more broadly, since we do not want to restrict ourselves solely to the essences of actually existing things, what E is or would be’ (2012a: 104–105). To ask what the essence of an entity is is to ask for its real definition. It is to ask for a definition of that thing.

Though heavily inspired by Locke, Lowe stresses that, contra Locke, essences are not further entities, since if all essences were entities, and all entities had essences, an infinite regress would arise. Further to this, essences are also not entities, as essences are, in a sense, the identity of an entity. This is because to express the essence of an entity is to express its identity and existence conditions (from which other knowledge such as the entity’s persistence conditions can be derived). Expressing these identity and existence conditions is to express what that entity essentially depends upon, which ultimately is to express its essence.

This might seem strange given the above quote about real definitions being understood ‘as a proposition’. However, the real definition may be a proposition, but only as this proposition expresses the essence of the entity. The essence is not a further entity of any kind: not a set of identity and existence conditions, or a proposition. Therefore:

To know something’s essence is not to be acquainted with some further thing of a special kind, but simply to understand what exactly that thing is. This, indeed, is why knowledge of essence is possible, for it is a product simply of understanding, not of some mysterious kind of quasi-perceptual acquaintance with esoteric entities of any sort. And, on pain of incoherence, we cannot deny that we understand what at least some things are, and thereby know their essences. (2013a: 147)

This insistence that an essence is not a further entity is one reason that Lowe’s account can be distinguished from perhaps the best-known account of essence, especially as it derives from the work of Kripke and Putnam. Under that account, essences are discovered a posteriori, as the essence of an entity is what that entity consists of: the essence of water consists in its molecular make-up of H2O, or the essence of a living organism consists in its DNA. However, this makes the essence of an entity some further entity, opening the way for the possibility of an infinite regress once we ask what the essences of those further entities are.

Providing a clear illustrative example of an essence of an entity is difficult. Lowe thought that specifying or providing the real definition of an entity is incredibly hard, even though we can know aspects of the essence of entities. The example normally given, which Lowe borrows from Spinoza, is that of a circle:

Circle: A circle is the locus of a point moving continuously in a plane at a fixed distance from a given point. (2012a: 105; see Spinoza 1955)

This tells us what a circle is, and what Lowe termed its generating principle—what it takes for a circle to come into being. It is a necessary truth about circles because it is part of the essence of what it is to be a circle. However, importantly, not all necessary truths will be essential truths. This is because certain necessary truths, as mentioned above, are not metaphysically necessary, but only physically necessary.

That we can know something of the essence of non-existing things means that for Lowe essence precedes existence (2013a: 148). The reason that Lowe thinks this is tied to the metametaphysical claims discussed in section 2: to find out if some X exists, we must first know what X is. This is not to deny that, to understand the essence of something, we might first have to discover the existence of certain other kinds of things. For example, we knew what transuranic elements were before we discovered them, but only because we had already learned about the composition of other atomic nuclei and thus that what we were trying to find was elements with new combinations of protons and neutrons. Those transuranic elements could not even have been understood prior to the discovery of sub-atomic particles, but given that discovery we could come to know some part of the essence of some elements that we at that time had not empirically discovered.

Note that this also counts against the a posteriori nature of the Kripke–Putnam view. Given that essences are not further entities, they are not things out in the world to be discovered. As we have seen, this is not to deny that there is perhaps some empirical knowledge required prior to understanding the essences of some entities. It is only to say that some a priori grasping of an entity’s essence is required first.

b. Modality and Essence

One major theoretical role that essences play in Lowe’s ontology is that, contra Kripke–Putnam essentialism, and in line with other supporters of (broadly conceived) Aristotelean essentialism (see Fine, 1994, 1995a, 1995b; Oderberg 2007, 2011; Koslicki 2012), essence is ontologically prior to modality. Essences should not be reduced to de re modal properties: ‘essences are the ground of all metaphysical necessity and possibility’ (2013a: 152; see also 2011b).

Much of the reason for this comes from Lowe’s arguments that other accounts, most prominently those built around the notion of ‘possible worlds’, are flawed. At heart, Lowe’s objection to such views is that they do not actually explain what they set out to explain: modal truths. This is because the very notion of a ‘possible world’ upon which such views must rely is itself highly obscure. Thus, in the end, such views have to resort to a form of modal primitivism whereby modal truths have to be taken to be brutely true or false. For this reason, Lowe argues that it is better to take essence to be the more fundamental notion, as essence can both be more readily independently grasped, and used to explain modality. (There is not space here for a full overview of Lowe’s criticisms of the various versions of alternative views; for more, see 2013a: chapter 8, 2008b; 2012c.)

c. Categoricalism

A further major element in Lowe’s account of essence is his distinction between general and individual essences:

If X is something of kind K, then, X’s general essence is what it is to be a K, while X’s individual essence is what it is to be the individual of kind K that X is, as opposed to any other individual of that kind. So suppose, for example, that X is a particular cat. Then X’s general essence is what it is to be a cat and X’s individual essence is what it is to be this particular cat, X. (2013a: 145)

The individual essence is required in addition to a general essence to ensure that being a particular entity is distinct from being just some entity of a particular kind. That is, as the general essence is shared by all entities that are Ks, the individual essence allows us to individuate between different Ks. Specifying the essence of an entity is to express that entity’s identity conditions, and identity conditions (or criterion of identity) are what allow us to individuate entities (see 1989; 2013a: chapter 5).

This distinction is closely tied to Lowe’s thesis of categoricalism: the view that one necessary condition on a thinker’s ability to pick out single objects in thought is the grasping of a categorial concept under which the object is conceived to fall (2013a: 21).

The question this is answering is how we can comprehendingly have singular thoughts about objects. Categoricalism is the answer. For Lowe, we cannot have singular thoughts about, to use Lowe’s example, a cat, Oscar, unless we have already grasped that Oscar falls under the categorial concept of ‘living being’, as this would appear to be the narrowest general concept that Oscar could fall under.

Of course, Oscar falls under other categories also, such as animal and cat, but these are subcategories of the more general category of living organism. This explains why we may have singular thoughts about Oscar even if we mistakenly believe that Oscar is a dog (because, say, we have misheard our neighbours and not actually seen Oscar ourselves). Categoricalism allows that we might mis-categorise objects, as it only requires a sufficient grasp of the essence of an entity; but it does rule out situations in which we thought that Oscar was actually a piece of furniture. In such cases, it seems correct to say that we have not actually grasped the essence of Oscar at all.

One immediate objection here might be that we could use notions such as ‘thing’ or ‘entity’, in which case we would always, trivially, be able to grasp part of the essence of an entity. However, as noted above, notions like ‘thing’ and ‘entity’ are transcategorial. This means that they cannot provide us with any essential knowledge about the entity in question. Transcategorial notions cannot allow for thinking about an object comprehendingly because the terms do not express categories, and therefore do not provide implicit or explicit knowledge of the relevant object’s criteria of identity.

We can see that categoricalism has a major consequence for Lowe: we cannot think comprehendingly about any entity without first grasping some aspect of its essence. The requirements for grasping a part of an essence are minimal, and Lowe is explicit that he thinks that even young children are capable of doing this. But, crucially, we do not require the ability to grasp the full essence of an entity to think comprehendingly about it, nor do we require empirical knowledge to grasp part of the essence of an object. As seen above, the statue and the lump of bronze are empirically identical, but we can distinguish them because we know what kinds of thing they are, that is, what their identity and existence conditions are, and therefore what they essentially depend upon.

5. Mind, Persons, and Agency

Alongside and intertwining with the complex ontological system described above, Lowe defended some less commonly held positions in the metaphysics of mind. This last section outlines some of the key aspects of these positions, though, due to space limitations, it must be taken as only a survey of his thinking on these matters, and, in particular, one that overlooks significant negative arguments Lowe developed against the alternative positions.

These positions, especially his views on persons, are driven by what Lowe thinks about substances, properties, and other metaphysical and ontological topics already covered in this article. That is, there is a sense in which Lowe’s views about persons, agency, and the mind can be seen as an application to these debates of the metaphysical principles and views that Lowe defended. For example, throughout this section, the role of identity conditions is central. Similarly, in Lowe’s work on mental causation, universals play a significant role. This, of course, does not mean that we must accept Lowe’s positions in the philosophy of mind if we accept his broader ontological picture, or vice versa. Rather, this is only to highlight the intricate and systematic nature of Lowe’s philosophical views.

a. The Non-Identity of Mental and Physical States

The non-identity of mental and physical states for Lowe ultimately comes from his claim that the two have different identity conditions, and, as seen above when discussing identity more broadly, if two entities have distinct identity conditions, then they cannot be entities of the same kind.

It can, of course, subsequently be asked in what way are the identity conditions for mental and physical states different. One difference that Lowe appeals to is that a physical state is, by its essential nature, a thing whose possession makes a difference to at least part of the space that the thing that possesses it occupies. For example, the property of being sitting is physical, as in virtue of possessing that property a person fills space in a particular way. In contrast, Lowe holds that there are no such spatial connotations for mental states. As such, mental and physical states have distinct identity conditions and thus cannot be of the same kind (2008a: 22–23).

Lowe in fact wishes to go further, stating that physicalism simply cannot be true and is an unintelligible thesis. One reason for this claim is that he thinks that truths about identity cannot be exciting in the way that physicalism would require. This is because identity statements can only intelligibly hold between entities of the same kind. However, ‘exciting identifications—of physical objects with mathematical objects, or of mental states with physical states—all violate this principle by trying to identify items of quite different kinds’ (2008a: 23, chapter 5).

b. Non-Cartesian Substance Dualism (NCSD)

Lowe, as we have seen above, holds that a substance is an individual or a bearer of properties. In the case of mental properties (such as pain and desire), this bearer is the subject of experience, with human persons being a prime example of such subjects (though note that for Lowe non-human animals might also be considered subjects of experience, and as such his non-Cartesian substance dualism is not inherently restricted to humans only). As well as this subject of experience, there also exists a physical body, a substance that is the bearer of physical properties. Persons are to be identified with the subject of experience rather than the biological organism. Two distinct substances exist (the person and the body), and they are not identical with each other: ‘a human person is not identical with his or her “organized body” nor with any part of it’ (2008a: 95–96; see also 1996: chapter 2). Indeed, for Lowe, the non-identity of the self with its body or any part of it implies that the self is a simple, non-composite substance (2001).

Famously, Descartes’ dualism additionally held that a person cannot be identified with the person’s brain or body as the person can only be the bearer of mental properties, and not physical properties. Lowe is clear that his version of dualism is not committed to this additional claim. Instead, Lowe rejects the idea that persons can only have mental properties:

this sort of [non-Cartesian] substance dualist may maintain that I [a person] possess certain physical properties in virtue of possessing a body that possesses those properties: that, for instance, I have a certain shape and size for this reason, and that for this reason I have a certain velocity when my body moves. (2008a: 95)

This, though, is not to say that every physical property of the body is also possessed by the person, as otherwise the view would collapse into the view that the person is the body.

Thus there are two distinct substances, a subject of experience (a person) and the physical body that the person possesses, and, contra Descartes, the person can be the bearer of psychological and physical properties. This has an important consequence: Lowe holds that persons are not necessarily separable from their bodies, in the sense of being capable of disembodied existence. This is because Lowe thinks that it is part of the essence of what it is to be a human that we have bodies. If there were just disembodied minds, then that would not be a human.

As discussed in more detail in section 5d, Lowe’s non-Cartesian substance dualism is a form of interactionist dualism—he is committed to the claim that at least some mental events cause changes in the physical world.

c. The Unity Argument for NCSD

Above we have largely just asserted in line with substance dualism that a person is not to be identified with their body. Lowe does provide arguments for this; here we focus on an argument that Lowe described as the strongest (2008a: chapter 5.2, see also 2010, 2014). The argument is as follows:

(1) I am the subject of all and only my own mental states.

(2) Neither my body as a whole nor any part of it could be the subject of all and only my own mental states.

Therefore,

(3) I am not identical with my body nor with any part of it.

Lowe takes (1) to be a self-evident truth (see 2006b for a defence of this from responses from certain psychopathological conditions). So it is (2) that requires a defence.

The defence comes from the assertion that ‘no entity can qualify as the subject of certain mental states if those mental states could exist in the absence of that entity’ (2008a: 96). That is, mental states must have a subject, and it is not possible for the very same mental states to belong to a different subject than the one that they do in fact belong to.

However, the same cannot be said of the body. Whilst it may be true that if we were to lose some parts of our body then we might lose some mental states (we might lose certain sensations, though not always, as instances of ‘phantom pain’ show), we would still have in such cases many of the same mental states despite not having that bodily part. This means that many, if not all, of our mental states could exist even if our bodies, as a whole, did not exist. Our bodies might be different in terms of possessing different parts, but in those circumstances, we could still have the same mental states. This shows us, for Lowe, that the body as a whole cannot be the subject of all and only our specific mental states, and thus why we cannot be identified with our bodies.

If this line of reasoning is accepted, it is further apparent that the physicalist cannot respond by saying that it is the brain, and not the body, that is identical to the self, as the same argument can be run, except replacing ‘body as a whole’ with ‘brain as a whole’ with the same conclusion.

To be clear, this is not a claim that if the brain were destroyed then our mental states would continue to exist. Lowe’s account is not committed to the view that the mind could continue to exist without the brain. Rather, Lowe’s claim is that a person’s mental states do not depend on any particular part of the brain in the way that they do depend on the person continuing to exist—that there is no part of the brain which is such that were any part of it destroyed (say, one neuron destroyed), then all of the person’s mental states would cease to be. The same cannot be said for the person, especially in light of Lowe’s claim that persons are simple, non-composite objects.

The unity of mental states with the subject that those mental states belong to thus, for Lowe, shows that the body cannot be identified with the person as the subject of experience.

d. Mental Causation

Given the interactionist nature of Lowe’s dualism, two major issues arise: responding to arguments from the causal closure of the physical, and providing an understanding of the nature of mental causation.

A central part of Lowe’s case that mental causation is a real phenomenon, and not something that can be reduced to physical causation, is the recognition that mental causation, unlike physical causation, is intentional. Lowe argues that we need both sorts of causation in order to fully account for human behaviour. He writes:

Intentional causation is fact causation, while bodily causation is event causation. That is to say, a choice or decision to move one’s body in a certain way is causally responsible for the fact that a bodily movement of a certain kind occurs, whereas a neural event, or set of neural events, is causally responsible for a particular bodily movement, which is a particular event. The decision, unlike the neural event, doesn’t causally explain why that particular bodily movement occurs, not least because one cannot intend to bring about what one cannot voluntarily control—for, as I pointed out earlier, one cannot voluntarily control the precise bodily movement that occurs when one decides, say, to raise one’s arm. (2008a: 110)

The claim is this: we have voluntary control over certain actions as shown simply by our everyday experience of the world. A person cannot have voluntary control over the neural causes of a particular action, in part shown through the multiple realisability of neural causes. But, to understand why a particular event happened, it is not sufficient to know that an event of that kind occurred. For Lowe, only intentional causation can provide that kind of explanation, and, as we cannot intend to bring about what we cannot voluntarily control, it must be the case that there is a further, non-neural, mental cause of voluntary actions.

The mental decision, D, does not cause the particular bodily event, B, as it is the neural cause, N, that is causing the particular bodily movement. The occurrence of D is compatible with both B and B* occurring, as distinct particular bodily movements caused by N and N* respectively. However, D is required to fully explain why an event of the kind B occurred.

A significant upshot of this account of mental causation, Lowe argues, is that we can avoid even the strongest form of argument from the causal closure of the physical (see 2000c, 2003a, 2008a: chapters 2 and 3 for more extended discussions of the causal closure of the physical, including in-depth discussions of its various forms). The form Lowe cites is as follows:

1) No chain of event-causation can lead backwards from a purely physical effect to antecedent causes some of which are non-physical in character.

2) Some purely physical effects have mental causes.

3) Any cause of a purely physical effect must belong to a chain of event-causation that leads backwards from that effect.

Therefore,

4) All of the mental causes of purely physical effects are themselves physical in character (from 2008a: 100–101, numbers changed from original).

In this form, Lowe rejects (3). It is not only event-causation that is involved in explaining the voluntary bodily movements of humans. We also require intentional-causation, but intentional-causation is fact-causation. Mental states are thus causally efficacious in determining what kind of event occurs, and, for Lowe, this is entirely compatible with the claim that some particular physical bodily movement, B, is caused by a particular neural event, N.

e. Agent Causation

In the above, we have made use of the notion of a voluntary action, without expanding upon it. In this last section, we outline in brief Lowe’s view of willing action, and agent causation.

Lowe is clear that he thinks that, strictly speaking, there is no such thing as event causation. Rather, there is only agent causation, that is, causation by agents acting in some manner. This means that whilst agent causation is in a sense primary, it is not the case that the agent just causes the event qua agent; Lowe thereby rejects classical agent causalism and libertarianism, as both distinguish between different types of causation, whereas Lowe does not. Instead, the agent causes an event by willing, or having a volition to perform, some action.

An agent, in the above statement, can include inanimate objects. However, when it comes to humans, and voluntary human behaviour, the agent causes some event by willing to cause an event of that type. The act of willing is an event, ‘but not merely an event: it is an action of A’s—indeed, it is a primitive action of A’s, because it is not further analysable in terms of more basic actions of A’s and the consequences of such actions’ (2008a: 7; chapter 9).

Such willings, or volitions to do such-and-such, are for Lowe the most basic kind of action that a free agent can perform, and they are completely uncaused and spontaneous. The idea of uncaused events is of course controversial; however, Lowe argues that it is no more mysterious than the spontaneous decay of a radioactive atom. As before, it is important to see that Lowe is keen here to stress that in order to explain human action, uncaused volitions are a required posit, but also that such volitions are entirely consistent with modern science.

The relationship between the agent and their volitions, though, is a non-causal relation. The relation is ‘internal’: ‘to speak of “performing a volition” amounts to speaking of doing a doing, which is similarly tautologous. This is why it is less misleading simply to say something like “A willed to ϕ”, rather than “A performed a volition to ϕ”’ (2008a: 8).

Lastly, these volitions or acts of the will are performed in light of reasons. Free agents, such as humans, have a special place in the causal world precisely because our agent causation occurs in light of such reasons and rational reflection. Thus, Lowe’s account does not posit some special restricted notion of agent causation. All causation is agent causation, with agents causing events by acting in certain ways. What is special about humans (and potentially other free agents) is that they possess a distinctively rational power of willing certain events to be caused in light of reasons.

This summary of Lowe’s conception of persons and personal agency is admittedly very brief. However, it should be enough to indicate that Lowe’s views are a complex response to the apparent evidence of free choice or action in the world, whilst wishing to propose a theory that is consistent with modern science. Lowe’s views are certainly distinctive, and run counter to much of the contemporary literature. For Lowe, the claims that some might find troublesome (non-Cartesian substance dualism, his conception of persons and agent causation amongst others) are warranted in virtue of the fact that Lowe thinks that the other views available and supported in the literature cannot adequately explain what they set out to explain. That is, Lowe’s views on this (and the above discussion of essence and ontology) need to be approached from the view that there is a certain range of phenomena to be explained, and Lowe thinks that additional posits are required to explain those phenomena.

6. Other Work

As stated, this summary of Lowe’s work has focused on issues in metaphysics, and related topics in philosophy of mind, logic and philosophy of language, focusing mainly on Lowe’s positive views on these topics. Lowe’s work has had numerous influences beyond the scope of this piece.

Some specific areas include his work on Locke (1986a, 1995a, 2005a, 2013b); the ontological argument (2007a, 2012b); truth and truth-making (2003c, 2007c, 2009c); reference (1993, 2012e, 2013a); vagueness (2005c, 2011c); intentionality (1978, 1980, 1982a, 1982b); predication (1986b, 2012d, 2013a); counterfactuals (1979, 1984, 1995c); and consciousness (1995d, 1996, 2006b). Lowe also wrote highly accessible general overviews of metaphysics (2002a) and philosophy of mind (2000a).

7. References and Further Reading

a. E. J. Lowe

  • 1978. ‘Neither intentional nor unintentional’, Analysis, 38: 117-18.
  • 1979. ‘Indicative and counterfactual conditionals’, Analysis, 39: 139-41.
  • 1980. ‘An analysis of intentionality’, Philosophical Quarterly, 30: 294-304.
  • 1982a. ‘Intentionality and intuition: a reply to Davies’, Analysis, 42: 85.
  • 1982b. ‘Intentionality: a reply to Stiffler’, Philosophical Quarterly, 32: 354-7.
  • 1984. ‘Wright versus Lewis on the transitivity of counterfactuals’, Analysis, 44: 180-3.
  • 1986a. ‘Necessity and the will in Locke’s theory of action’, History of Philosophy Quarterly, 3: 149-63.
  • 1986b. ‘Noonan on naming and predicating’, Analysis, 46: 159.
  • 1987a. ‘Lewis on perdurance versus endurance’, Analysis, 47: 152-4.
  • 1987b. ‘The indexical fallacy in McTaggart’s proof of the unreality of time’, Mind, 96: 62-70.
  • 1988. ‘The problems of intrinsic change: rejoinder to Lewis’, Analysis, 48: 72-7.
  • 1989. Kinds of Being: A Study of Individuation, Identity and the Logic of Sortal Terms, Oxford and New York: Basil Blackwell.
  • 1992. ‘McTaggart’s paradox revisited’, Mind, 101: 323-6.
  • 1993. ‘Self, reference and self-reference’, Philosophy, 68: 15-33.
  • 1994. ‘Ontological dependency’, Philosophical Papers, 23(1): 31-48.
  • 1995a. Locke on Human Understanding, London and New York: Routledge.
  • 1995b. ‘Coinciding objects: In defence of the “standard account”’, Analysis, 55: 171-8.
  • 1995c. ‘The truth about counterfactuals’, Philosophical Quarterly, 45: 41-59.
  • 1995d. ‘There are no easy problems of consciousness’, Journal of Consciousness Studies, 2: 266-71.
  • 1996. Subjects of Experience, Cambridge: Cambridge University Press.
  • 1998. The Possibility of Metaphysics: Substance, Identity and Time, Oxford: Oxford University Press.
  • 2000a. An Introduction to the Philosophy of Mind, Cambridge: Cambridge University Press.
  • 2000b. ‘Locke, Martin and substance’, Philosophical Quarterly, 50: 499-514.
  • 2000c. ‘Causal closure principles and emergentism’, Philosophy, 75: 571-85.
  • 2001. ‘Identity, composition and the self’, in Soul, Body and Survival, K. Corcoran (ed.), Ithaca: Cornell University Press, pp. 139-58.
  • 2002a. A Survey of Metaphysics, Oxford: Oxford University Press.
  • 2002b. ‘Kinds, essence and natural necessity’, in Individuals, Essence and Identity: Themes of Analytic Metaphysics, A. Bottani, M. Carrara, and P. Giaretta (eds.), Dordrecht: Kluwer, pp. 189-206.
  • 2003a. ‘Physical causal closure and the invisibility of mental causation’, in Physicalism and Mental Causation: The Metaphysics of Mind and Action, S. Walter, and H.-D. Heckmann (eds.), Exeter: Imprint Academic, pp. 137-54.
  • 2003b. ‘3D/4D equivalence, the twins paradox, and absolute time’, with Storrs McCall, Analysis, 63: 114-23.
  • 2003c. ‘Metaphysical realism and the unity of truth’, in Monism, A. Bachli, and K. Petrus (eds.), Frankfurt: Ontos Verlag, 2003, pp. 109-23.
  • 2005a. Locke, London and New York: Routledge.
  • 2005b. ‘Endurance versus perdurance and the nature of time’, Philosophical Writings, 10: 45-58.
  • 2005c. ‘Identity, vagueness and modality’, in Thought, Reference, and Experience: Themes from the Philosophy of Gareth Evans, J. L. Bermudez (ed.), Oxford: Oxford University Press, pp. 290-310.
  • 2006a. The Four-Category Ontology: A Metaphysical Foundation for Natural Science, Oxford: Oxford University Press.
  • 2006b. ‘Can the self disintegrate? Personal identity, psychopathology, and disunities of consciousness’, in Dementia: Mind, Meaning and the Person, J. Hughes, S. Louw, and S. Sabat (eds.), Oxford: Oxford University Press.
  • 2006c. ‘The 3D/4D controversy: a storm in a teacup’, with Storrs McCall, Nous, 40: 570-8.
  • 2007a. ‘The ontological argument’, in The Routledge Companion to Philosophy of Religion, C. Meister, and P. Copan (eds.), London and New York: Routledge, pp. 331-40.
  • 2007b. ‘La métaphysique comme science de l’essence’, in Métaphysique contemporaine: propriétés, mondes possibles, et personnes, E. Garcia, and F. Nef (eds.), Paris: J. Vrin, pp. 85-117. Translated as ‘Metaphysics as the science of essence’.
  • 2007c. ‘Truthmaking as essential dependence’, in Metaphysics and Truthmakers, J.-M. Monnoyer (ed.), Frankfurt: Ontos Verlag, pp. 237-59.
  • 2008a. Personal Agency: The Metaphysics of Mind and Action, Oxford: Oxford University Press.
  • 2008b. ‘Two notions of being: Entity and essence’, Royal Institute of Philosophy Supplement, 83 (62): 23-48.
  • 2008c. ‘Essentialism, metaphysical realism, and the errors of conceptualism’, Philosophia Scientiæ, 12 (1): 9-33.
  • 2009a. More Kinds of Being: A Further Study of Individuation, Identity, and the Logic of Sortal Terms, Malden, MA and Oxford: Wiley-Blackwell.
  • 2009b. Truth and Truth-Making, E. J. Lowe, and A. Rami (eds.), Stocksfield: Acumen.
  • 2009c. ‘An essentialist approach to truth-making’, in Truth and Truth-Making, E. J. Lowe, and A. Rami (eds.), Stocksfield: Acumen, pp. 201-16.
  • 2010. ‘Why my body is not me: the unity argument for emergentist self-body dualism’, in Emergence in Science and Philosophy, A. Corradini, and T. O’Connor (eds.), New York and London: Routledge.
  • 2011a. ‘The rationality of metaphysics’, in Stance and Rationality, O. Bueno, and D. P. Rowbottom (eds.), Special Issue of Synthese, 178: 99-109.
  • 2011b. ‘Locke on real essence and water as a natural kind: a qualified defence’, Aristotelian Society Supplementary Volume, 85: 1-19.
  • 2011c. ‘Vagueness and metaphysics’, in Vagueness: A Guide, G. Ronzitti (ed.), Dordrecht: Springer, pp. 19-53.
  • 2012a. ‘Essence and ontology’, in L. Novak, D. D. Novotny, P. Sousedik, and D. Svoboda (eds), Metaphysics: Aristotelian, Scholastic, Analytic, Frankfurt: Ontos Verlag, pp. 93-111.
  • 2012b. ‘A new modal version of the ontological argument’, in M. Szatkowski (ed.), Ontological Proofs Today, Frankfurt: Ontos Verlag, pp. 179-91.
  • 2012c. ‘What is the source of our knowledge of modal truths?’, Mind, 121: 919-50.
  • 2012d. ‘Categorial predication’, Ratio, 25: 369-86.
  • 2012e. ‘Individuation, reference, and sortal terms’, in Perception, Realism, and the Problem of Reference, A. Raftopoulos, and P. Machamer (eds.), Cambridge: Cambridge University Press, pp. 123-41.
  • 2013a. Forms of Thought: A Study in Philosophical Logic, Cambridge: Cambridge University Press.
  • 2013b. Locke’s Essay Concerning Human Understanding, London and New York: Routledge.
  • 2013c. ‘Neo-Aristotelian metaphysics: A brief exposition and defense’, in Aristotle on Method and Metaphysics, E. Feser (ed.), Palgrave Macmillan.
  • 2014. ‘Why my body is not me: The Unity Argument for Emergentist Self-Body Dualism’, in Contemporary Dualism: A Defense, A. Lavazza and H. Robinson (eds.), New York: Routledge.
  • 2015. ‘Ontological dependence’, with Tuomas Tahko, The Stanford Encyclopedia of Philosophy.

b. Other References

  • Armstrong, D. M., 1983. What is a law of nature?, Cambridge: Cambridge University Press.
  • Bradley, F. H., 1893. Appearance and Reality, Oxford: Clarendon Press.
  • Fine, K., 1994. ‘Essence and Modality’, Philosophical Perspectives, 8: 1–16.
  • Fine, K., 1995a. ‘Senses of Essence’, in Modality, Morality and Belief. Essays in Honor of Ruth Barcan Marcus, Sinnott-Armstrong, W. (ed.), Cambridge: Cambridge University Press, pp. 53–73.
  • Fine, K., 1995b. ‘The Logic of Essence’, Journal of Philosophical Logic, 24: 241–273.
  • Geach, P. T., 1980. Reference and Generality, Ithaca, NY: Cornell University Press.
  • Griffith, A. M., 2015. ‘Do Ontological Categories Exist?’, Metaphysica, 16(1): 25–35.
  • Hawley, K., 2001. How Things Persist, Oxford: Oxford University Press.
  • Heil, J., 2010. ‘Powerful Qualities’, in The Metaphysics of Powers: Their Grounding and Their Manifestations, A. Marmodoro (ed.), New York: Routledge.
  • Koslicki, K., 2012. ‘Essence, Necessity and Explanation’, in Contemporary Aristotelian Metaphysics, T. Tahko (ed.), Cambridge: Cambridge University Press.
  • Lewis, D. K., 1986. On the Plurality of Worlds. Oxford: Blackwell.
  • Lewis, D. K., 1988. ‘Rearrangement of Particles: Reply to Lowe’, Analysis, 48(2): 65–72.
  • Locke, J., 1975. An Essay Concerning Human Understanding, P. H. Nidditch (ed.), Oxford: Clarendon Press.
  • Miller, J. T. M., 2016. ‘The Non-existence of Ontological Categories: A defence of Lowe’, Metaphysica, 17(2): 163–176.
  • Moore, G. E., 1919. ‘External and Internal Relations’, Proceedings of the Aristotelian Society, 20: 40–62.
  • Morganti, M., and Tahko, T., 2017. ‘Moderately Naturalistic Metaphysics’, Synthese, 194(7): 2557–2580.
  • Mumford, S., and Tugby, M. (eds.), 2013. Metaphysics of Science, Oxford: Oxford University Press.
  • Novotný, D. D., and Novák, L. (eds.), 2014. Neo-Aristotelian Perspectives in Metaphysics, New York: Routledge.
  • Oderberg, D., 2007. Real Essentialism, London: Routledge.
  • Oderberg, D., 2011. ‘Essence and Properties’, Erkenntnis, 75: 85–111.
  • Schaffer, J., 2009. ‘On What Grounds What’, in Metametaphysics: New Essays on the Foundations of Ontology, D. Manley, D. J. Chalmers, and R. Wasserman (eds.), Oxford University Press.
  • Sider, T., 2001. Four-Dimensionalism: An Ontology of Persistence and Time, Oxford: Oxford University Press.
  • Smith, B., 1997. ‘Of Substances, Accidents and Universals: In Defence of a Constituent Ontology’, Philosophical Papers, 26: 105–127.
  • Smith, B., 2005. ‘Against Fantology’, in Experience and Analysis, M. E. Reicher and J. C. Marek (eds.), Vienna: HPT and ÖBV.
  • Spinoza, B., 1955. On the Improvement of the Understanding, Ethics, Correspondence, trans. R. H. M. Elwes, New York: Dover.
  • Tahko, T. (ed.), 2012. Contemporary Aristotelian Metaphysics, Cambridge University Press.

 

Author Information

J. T. M. Miller
Email: jamiller@tcd.ie
Trinity College Dublin
Ireland

Explication

Explication is a method employed throughout philosophy and most sciences, as well as any cognitive endeavors which involve allocating concepts. It is also notably found in the sphere of law. Since explication is part and parcel of the traditionally philosophical subject of concept formation, philosophy is the main discipline to reflect on it extensively. Explication has sometimes been compared to a number of philosophical methods, such as logical (or conceptual) analysis, conceptual reduction, and conceptual engineering. Within a broad classification of kinds of analyses (Beaney 2014, sect. 1.1), explication is one kind of transformative analysis (as opposed to decompositional and regressive analysis; all three of them understood as analyses in a wide sense).

Historically, explication was most prominently described by Rudolf Carnap, according to whom “[t]he task of explication consists in transforming a given more or less inexact concept into an exact one […]. We call the given concept (or the term used for it) the explicandum, and the exact concept proposed to take the place of the first (or the term proposed for it) the explicatum.” (1950, 3) Carnap’s exposition remains the main reference point for scholars working on the topic of explication today. It is the most widely accepted general outline of the method of explication. As such, it allows for diverging interpretations in theoretical and procedural respects. This is demonstrated by the increase in research on explication around the turn of the 21st century.

A widely accepted instance of a simple and largely unproblematic explication is the 2006 definition of ‘planet’ by the International Astronomical Union (IAU). The discussion about that term was triggered by a number of discoveries of objects in orbit around the sun that are similar to the nine bodies that had until then been recognized as planets. Since there was no binding definition of ‘planet’ at that point, uncertainty arose about whether to call certain objects planets. The IAU member assembly established a definition according to which a planet is “a celestial body that (a) is in orbit around the Sun, (b) has sufficient mass for its self-gravity to overcome rigid body forces so that it assumes a hydrostatic equilibrium (nearly round) shape, and (c) has cleared the neighbourhood around its orbit” (IAU 2006). This disqualified Pluto as a planet, whereas the other eight planets kept their status; and to a large degree the new understanding of the term ‘planet’ incorporated key aspects of the earlier use patterns, while at the same time being much clearer (cf. Murzi 2007).

This article provides a procedural account of explication outlining each step that is part of the overall explicative effort (2). It is prefaced by a summary of the historical development of the method (1). The latter part of the article includes a rough structural theory of explication (3) and a detailed presentation of an exemplary explication taken from the history of philosophy and the foundations of mathematics (4).

Table of Contents

  1. History of the Explicative Method
    1. Pre-Analytic Reflections on Explication
    2. The Analytic Classics
    3. Recent Developments
  2. Procedure
    1. Framework
    2. Preparation of an Explicative Introduction
    3. Introducing an Explicatum
    4. Assessing Adequacy
  3. Explication Theory
    1. Constituents of Explication
    2. The Analysis Paradox
  4. An Exemplary Explication
  5. References and Further Reading

1. History of the Explicative Method

Explication is often seen as a central method in philosophy, “an activity to which philosophers are given, and scientists also in their more philosophical moments” (Quine 1951, 25). As such, it has itself become a subject of philosophy. Philosophers are therefore in the business of both performing explications and reflecting on explications. Often, either activity gives rise to the other. Explication and similar methods have been continually employed throughout all of Western philosophy since Socrates/Plato, though under other names, such as ‘definition’, ‘concept-formation’, ‘characterization’, ‘description’, and many others. In the context of definition, the term ‘explication’ was already used by Locke (1997, 378, §III.IV.8). Recent debates are ultimately caused by the methodological character of the writings of logical empiricists. Since the late 20th century revival of metaphysics and the investigation into the history of analytic philosophy, metaphilosophical questions have (re)surfaced repeatedly. Ideas of authors like Rudolf Carnap, including his conception of explication, are being revisited. This renewed activity brings with it debates about new and old issues of explication. Further investigations into the precursors of explication are to be expected in the future.

Accordingly, at least three phases of engagement with explication can be distinguished in the history of philosophy: pre-analytic reflections (1.a), the analytic classics (1.b), and recent developments (1.c).

a. Pre-Analytic Reflections on Explication

The full procedure of explication comprises both analysis of given meanings and, based on that, the stipulation of new meaning. Investigations into acts of describing or prescribing meaning can therefore be conceived as contributions to the methodology of explication in a wide sense. In a narrow sense, only considerations that relate analysis of meaning to stipulation of meaning in a specific fashion qualify as relevant to explicative methodology. In the wide sense, all comments on analysis and stipulation of meaning (by, for example, Socrates/Plato, Aristotle, Thomas Aquinas, Locke, Leibniz, and Mill) deserve to be mentioned in order to fully represent the pre-analytic history of explication. The summary that follows relies on a selection: The conception of Johann Heinrich Lambert, who contributed to the methodology of explication in the narrow sense (without using the term ‘explication’), will be described. Furthermore, Immanuel Kant’s theory will be recapitulated briefly. The assessment of some commentators, that Kant provided a relevant and even distinguished contribution, will be viewed critically.

Johann Heinrich Lambert was a mathematician, scientist, and philosopher highly regarded by Kant. Lambert contributed to a methodology of pre-explicative procedures in the preface to his book “Architectonic” (1771, VI-VII), where the procedures proposed are applied to metaphysical concepts. Lambert envisioned five pre-explicative measures: (i) Many expressions, especially philosophical ones, are highly ambiguous. This needs to be addressed in an analysis of ambiguity. (ii) Many, especially philosophical, expressions have multiple synonyms. They have to be named in an analysis of synonymy. (iii) The terroir analysis, as applied to predicates, identifies examples, counterexamples, sub- and superpredicates, neighboring predicates, etc. It is directed at clarifying the surrounding field of concepts (somewhat anticipating connective analysis; see Strawson 1992, ch. 2). (iv) The final analysis enquires into the purposes that are associated with the use of given expressions. (v) The diachronic analysis reviews the history of the use and characterization of explicanda. The order of these analyses and the interrelations between their results vary on a case-to-case basis (Siegwart 2007a, 109-112). Lambert merits special mention as his remarks predate what is usually considered the starting point of the systematic treatment of explication (Carnap 1950) by over 170 years. He assembles several of the main components of explication preparation (see sect. 2.b) in one paragraph with a clear perspective toward purpose-oriented definitions, although he does not discuss the stipulative step succeeding the preparation. To Lambert, the analysis of the preexisting concepts is never the goal; it is, however, a prerequisite for the explicative characterization.

Circumstances are different with Kant’s role in explication theory. In justifying his explication terminology, Carnap (1950, 3) superficially referred to Kant and Husserl (Boniolo 2003, 293-294). In recent years, it has been debated how accurate this ascription of parentage is. Some authors (Boniolo 2003, 294-295) even allege that Kant had a conception of explication superior to Carnap’s. In order to represent these issues faithfully, one has to pay attention to the essential synopsis regarding explanation of meaning and definition (in a wide sense) in the “Critique of Pure Reason” (Kant 1998, A 712-738/B 740-766, especially A 727-732/B 755-760). Kant started with two distinctions: (i) given and (ii) originally (to be) made concepts; (a) a priori and (b) a posteriori (or: empirical) concepts. According to Kant’s unusual terminology, explications are analyses of empirical concepts (i-b), exemplified by (the analysis of) the concept of gold. Expositions are analyses (in the sense of decompositions) or clarifications of a priori given concepts (i-a), such as the concept of substance. This is characteristic for philosophy, especially for metaphysics. Declarations create new empirical concepts (ii-b), like the concept of a chronometer. Definitions in a narrow sense create non-empirical, a priori concepts (ii-a), as in the concept of a circle. They are relevant for mathematics, but inappropriate for philosophy. The two latter kinds of concept formation are purely novative and therefore cannot be regarded as explications in Carnap’s sense. The two former kinds are analytic, according to Kant. In any case, they seem to be limited to detection and description of meanings. Unless the characterization of these activities additionally admits a finalizing prescriptive element or at least a method of concept analysis geared toward subsequent stipulative definition, Kant did not employ a concept of explication akin to the one treated below (sect. 2).

b. The Analytic Classics

Current conscious acts of explication, as well as reflections and disputes about explication, usually stand in a methodic tradition that was started by Rudolf Carnap. He viewed Menger’s explication of ‘dimension’ (Menger 1943) as a prototypical explication. Carnap’s seminal text on the topic (1950, 1-18, ch. I) is a methodological introduction to an investigation into probability. The exposition is self-contained and does not rely on his views on probability. An earlier explicit reference by Carnap to the method of explication can be found in (1947, 7-8, §2). In 1928, he already referenced the method of rational reconstruction, which is often likened to explication (2003, 158, §100 and v, preface to the second edition; see Beaney 2004, 125-128). Carnap named Husserl and Kant as sources of terminological inspiration, but Frege seems to have been a significant influence in this respect as well (Beaney 2004; Lavers 2013). Research on who influenced Carnap in what way is ongoing. At any rate, Frege did not propose any method of explication that, regarding conciseness, comes close to Carnap’s exposition. (But see Frege 1969, 224, 261-262. According to Blanchette 2012, 78: “Frege nowhere says what an adequate analysis is.”)

Carnap characterizes explication as “the transformation of an inexact, prescientific concept, the explicandum, into a new exact concept, the explicatum.” (1950, 3, §2) Though in a successful explication the explicatum is exact, Carnap does not fail to notice that the inexactness of the explicandum means that an explication cannot be said to be correct or incorrect in the sense that the explicatum exactly captures the explicandum. In order to work toward an exact explicatum, Carnap envisaged informal explicandum clarification as the first step in an explication. “[W]e must […] do all we can to make at least practically clear what is meant as the explicandum. […] Even though the terms in question are unsystematic, inexact terms, there are means for reaching a relatively good mutual understanding as to their intended meaning.” (1950, 4, §2) This involves distinguishing multiple meanings in the term associated with the explicandum (‘true’, not as used in carpentry, but as used in household language), giving examples and counterexamples (‘true’, not as in ‘a true democracy’), and naming synonyms (‘true’ in the sense of ‘accurate’). It should be noted that the clarification of the explicandum was very important to Carnap, but in (1950) a methodology of preparation of the explicative act is less explicit than in Lambert (1771). Carnap’s presentation of explication is primarily oriented toward that explicative act and its qualification with regard to certain desiderata.

Accordingly, in §3 (1950, 5-8) Carnap proceeds by naming four requirements an explicatum shall fulfill: (i) similarity to the explicandum, (ii) exactness, (iii) fruitfulness, and (iv) simplicity. All of the four requirements are partly, but not fully, explained by Carnap (see Boniolo 2003, 291-292, ch. II) and have therefore all spawned a varying amount of debate in specialized literature. The similarity of the explicatum to the explicandum has been understood as either an overlap between the extensions of both concepts or an isomorphism between both extensions (cf. Brun 2017, sect. 4-5). Due to the different readings and the elusive nature of the explicandum, the similarity requirement is often seen as the most controversial one (Beaney 2004, 139). In addition, occasional talk of the explicatum replacing the explicandum exerts some pressure on the similarity requirement, since things that replace one another usually have to be similar in certain respects in order to be replaceable. With regard to Carnap’s exposition, the question remains in what respects an explicandum and an explicatum need to be similar for the latter to replace the former.

The other three requirements do not refer to the explicandum, but only to the explicatum. They are therefore not specific to a given explication. In an intuitive sense, exactness, fruitfulness, and simplicity are signs of quality for any introduction of a term. Carnap did not specify what he means by ‘exactness’, but he sympathized with the comparative concept of precision by Arne Naess. Naess said, roughly, the expression U is more precise than the expression T if and only if the set of admissible interpretations for U is a non-empty proper subset of the set of admissible interpretations for T (Carnap 1950, 8, §3; Naess 1953, 60; on Naess’s notion of interpretation: 1953, 41-51). After Carnap, exactness of the explicatum is often viewed as either lack of vagueness or adherence to standards of formal concept formation. Fruitfulness of the explicatum is explained more specifically: a fruitful explicatum figures in a high number of universal statements. Carnap distinguished between empirical laws for non-logical and theorems for logical explicata (1950, 6-7, §3). He did not expand on the problem of individuating and counting said statements in the face of the trivial or minor modifications that can replicate any such statement infinitely many times. Simplicity is presented as being subordinate to the other requirements (1950, 7, §3). It seems to refer to a low degree of syntactical complexity of the explicative definition (or other means of explicatum introduction) and, possibly, to the acceptance and handiness of the concepts and terms that are employed to introduce the explicatum.
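Naess's comparative notion of precision can be read set-theoretically, on the understanding that the more precise expression has the narrower range of admissible interpretations. The following Python sketch is only an illustration under that reading. The function name `more_precise` and the toy interpretation sets for ‘true’ are assumptions made for this example and are not taken from Naess or Carnap.

```python
def more_precise(u_interps: set, t_interps: set) -> bool:
    """Return True if U counts as more precise than T.

    Assumed reading: U's admissible interpretations form a
    non-empty proper subset of T's admissible interpretations.
    """
    # `<` on Python sets tests for a proper subset.
    return bool(u_interps) and u_interps < t_interps


# Toy data (hypothetical): readings of the word 'true'.
t_true = {"accurate", "genuine", "aligned (carpentry)"}
u_true = {"accurate"}

print(more_precise(u_true, t_true))   # narrower range of readings
print(more_precise(t_true, u_true))   # the converse fails
print(more_precise(set(), t_true))    # an empty range does not count
```

On this sketch, an expression with no admissible interpretation at all is never "more precise", which is what the non-emptiness clause is meant to rule out.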

In the two subsequent sections, Carnap gives a number of examples and distinguishes different kinds of concepts that are conceivable as explicata. This has primarily illustrative merit to the procedure of explication. However, in the last section of his exposition (1950, 15-18, §6), Carnap adds another aspect to explication that was important to him, namely interpretation. In his view, explication is a form of formalization which is the first part of the axiomatic method to establish a formal scientific theory. The second part is the interpretation of the axioms or postulates, which determine the interpretation of the definitions as well. This was a major programmatic issue for Carnap since the 1940s that was important to him independently from the method of explication. In the literature on explication, interpretation plays only a minor role.

Together with some of his other methodological publications, Carnap’s exposition influenced other scholars in one way or another. Most notable are Nelson Goodman, Carl Gustav Hempel, Willard Van Orman Quine, and Peter Strawson. Hempel included the method in his Fundamentals of Concept Formation (1952, pt. I), where he distinguishes different ways to endow meaning. Among other things, analysis of meaning (cf. 2.b below) is described in detail. He characterizes explications, considered as real definitions (in a very wide sense), in continuity with Carnap: “Explication is concerned with expressions whose meaning in conversational language or even in scientific discourse is more or less vague […] and aims at giving those expressions a new and precisely determined meaning, so as to render them more suitable for clear and rigorous discourse on the subject matter at hand” (1952, 11). Hempel points out that the interplay between analyzed meaning on the one hand and systematic discursive interests on the other calls for a “judicious synthesis” (ibid.) of both.

Quine most notably discussed explication with reference to two set theoretic conceptions of the term ‘ordered pair’ (1960, 257-262, §53). He developed the method while discussing ontological issues related to that expression and then took the method to be representative for what happens in philosophical analysis and explication. In Quine’s own view (1960, 259, fn. 4), he followed Carnap, although he pragmatically converted the requirement of similarity: Inspired by the use of the explicandum, explicators associate the explicatum (Quine: “explicans”, see, for example, Reichenbach 1951, 49) with functional criteria of adequacy that are then supposed to be satisfied by it. In the case of the ordered pair there is just one such criterion: ∀xyzw [(x, y)=(z, w) → x=z ∧ y=w]. It is satisfied by any adequate definition of the ordered pair, for example by the one presented by Kuratowski: ∀xy [(x, y)={{x}, {x, y}}]. The choice of criteria is open to those who perform explications and is guided by systematic interests that are associated with the actual use of the explicandum. For instance, the specific criterion for the ordered pair proves important when defining ‘relation’ by means of ‘ordered pair’. “We have, to begin with, an expression or form of expression that is somehow troublesome. […] But also it serves certain purposes that are not to be abandoned. Then we find a way of accomplishing those same purposes through other channels, using other and less troublesome forms of expression.” (Quine 1960, 260, §53)
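The adequacy criterion for the ordered pair can be checked mechanically on a small finite domain. The following Python sketch is an illustration only and is not found in Quine. The names `kuratowski_pair`, `naive_pair`, and `satisfies_pair_criterion` are hypothetical, sets are modelled with `frozenset`, and testing finitely many cases of course proves nothing about the general claim.

```python
from itertools import product


def kuratowski_pair(x, y):
    """Kuratowski's definition: (x, y) = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


def naive_pair(x, y):
    """A deliberately inadequate candidate: (x, y) = {x, y}."""
    return frozenset({x, y})


def satisfies_pair_criterion(pair, domain):
    """Check (x, y) = (z, w) -> x = z and y = w over a finite domain."""
    for x, y, z, w in product(domain, repeat=4):
        if pair(x, y) == pair(z, w) and not (x == z and y == w):
            return False  # counterexample found
    return True


domain = range(4)
print(satisfies_pair_criterion(kuratowski_pair, domain))  # True
print(satisfies_pair_criterion(naive_pair, domain))       # False
```

The unordered candidate fails because it identifies (1, 2) with (2, 1). The criterion thus separates adequate from inadequate definitions without saying anything further about what an ordered pair "really" is.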

Compared to Carnap (1950), these criteria of adequacy are new. Later, Carnap presupposed the Quinean understanding of adequacy in his reply to Strawson (Carnap 1963, 939). At any rate, the pragmatic outlook with its rather liberal selection of functional criteria of adequacy is more pronounced in Quine than in Carnap. (Within Quine’s own writing, the characterization of explication in (1960) is already a liberalization. According to (1951, 25, ch. II), explication relies to a higher degree on previous usage of the explicandum, which is meant to be “preserved” through the explication. This was part of Quine’s critique of synonymy relations, which according to him are presupposed by explicators, too.)

Strawson’s contribution to the method of explication (1963; see Pinder 2017b) is critical in nature. Starting with a rather strong reading of Carnap’s approach, in which the explicatum “replaces” the explicandum (Carnap 1950, 3), Strawson maintains that replacing a non-scientific concept with a scientific concept cannot be done without distorting the locutions that employ the original concept(s). “[T]ypical philosophical problems about the concepts used in non-scientific discourse cannot be solved by laying down rules of use of exact and fruitful concepts in science. To do this last is not to solve the typical philosophical problem, but to change the subject.” (1963, 506) In effect, explication prevents philosophers from dealing with their original problems that arise in a context of unexplicated concepts. The criticism that explication is “changing the subject” is still discussed among explication scholars (1.c). It is related to one version of the paradox of analysis (3.b). In a subsequent section of his paper, Strawson drops the strong reading of replacement, but maintains that in order to show the worth of an explication, one has to somehow relate the pre-explicative framework to the post-explicative one (1963, 510-514). This, to some extent, still requires an exact grasp of the pre-explicative conceptual situation. But if an exact grasp of the explicandum is possible, what does one need an explicatum (and an explication) for? (See Carnap’s reply 1963, 933-940.)

Goodman (see Cohnitz and Rossberg 2006, ch. 3) can be credited with two contributions. First, he elaborated on the similarity criterion of adequacy by proposing a criterion of “extensional isomorphism” in his 1951 Structure of Appearance (1966, 13-22, §I,3). This is a relation between, on the one hand, a set of explicanda and their semantical interrelations and, on the other hand, a set of explicata and their semantical interrelations. The criterion is different from both Carnap’s extensional overlap and Quine’s functional criteria of adequacy. The second contribution is derived from the first one, as the criterion of extensional isomorphism is usually directed at systems of expressions or concepts instead of individual ones (see Brun 2016, 1235-1236). Carnap recognized Goodman’s proposal of extensional isomorphism, but he deferred the choice of any similarity criterion to the specific explicative scenario at hand (1963, 945-946).

An outline of the development of explication in the analytic classics would be incomplete without a reference to the definition of truth by Alfred Tarski (1944; 1956; 1969; Hodges 2014). Although he does not employ the term ‘explication’, Tarski’s own depiction of his endeavors suggests classifying them as such. With regard to the explanation of meaning, for example, he distinguishes between (a) an account of the actual use of the term and (b) a normative suggestion that the term be used in some definite way. He attributes a mixed character to his own project: “What will be offered can be treated in principle as a suggestion for a definite way of using the term ‘true’, but the offering will be accompanied by the belief that it is in agreement with the prevailing use of this term in everyday language”. (Tarski 1969, 63) The so-called semantical explication of truth deserves the attention of explication theorists because it allows for all explication components to be identified with ease. Also, the core steps in the procedure are effortlessly traceable (Greimann 2007, 263). This especially applies to the criteria of explicative adequacy and the explicatum language. The criteria are condensed in convention T (Tarski 1956, 187-188). The remarks on—and the expressive power of—the explicatum language are associated with the distinction between object and metalanguage and, as a result, with the prevention of the semantical antinomies.

c. Recent Developments

Within recent discussions on explication, the mainly systematic contributions have to be distinguished from the mainly historical ones. In addition, there are numerous systematic publications that include significant exegetical sections. However, many exegetical investigations into the writings of authors referenced in the preceding section serve the purpose of correctly ascribing positions and changes of mind to pioneers of analytic philosophy and of identifying historical influences between them (such as Beaney 2004; Boniolo 2003; Carus 2007; Creath 2012; Floyd 2012). This section concerns only those contributions that are predominantly systematic.

Publications that contribute to the method of explication in systematic respects exist in a continuum with the works of Carnap, Quine, Goodman, Strawson and others. This is not true of the historical research on explication, which is a rather new phenomenon. Beginning in the 1960s, several scholars developed, criticized, and defended explication, sometimes enriching the discussion considerably. However, except for the four philosophers mentioned above, few scholars who worked on explication before 2000 are referenced in discourse after 2000. The renewed interest in explication is related to, and partly caused by, the revival of metaphysics toward the end of the 20th century and the subsequent rise of metametaphysics. Because of this historical situation, most of the recent investigations into explication consider its main field of application to be metaphysics and ontology. However, this is too narrow a view, as is evident from the settings within which some of the classic authors raise the issue of explication, such as Carnap (1950; philosophy of science) and Quine (1960; set theory). In addition, many textbook examples lie outside metaphysics (Tarski’s explication of truth; 1956) or outside philosophy (IAU’s planet concept). Applications of explication in practical philosophy or in the special sciences are underrepresented within the research on explication (but see Hahn 2013, 34-53).

An example of the continuity between the pioneering efforts in explication methodology and later research is Hanna (1968). Hanna presumably was the first to develop an explicit procedural account of explication, which consists of five steps. (An incomplete procedural account is given by Naess (1953, 82-84).) Tillman (1965; 1967) transformed Strawson’s well known critique (see 1.b) into the method of “linguistic portrayal”, which can be seen as one step in a method of explication (2.b). Martin (1973) was the first to dedicate an entire paper to the explication of whole theories (systems of concepts, for example), as opposed to single concepts or expressions.

These examples can be seen as prototypes of some of roughly five very common kinds of systematic investigations into explication. (i) A number of contributions attempt, like Hanna, to establish a procedure of explication (Brun 2016; Greimann 2007; Siegwart 1997a) or to provide a formal theory of explication (Cordes 2017). It has to be noted that there is no canonical form of explication and that the various approaches are not directly compatible. A single procedure of explication that is both widely accepted and more detailed than the Carnapian exposition (1950) has yet to be devised.

(ii) Other scholars have focused, like Tillman, on single steps within a procedure of explication. Either widely accepted steps are spelled out, or new steps are proposed. Some mixed forms occur as well, like the explicandum clarification procedure by Shepherd and Justus (2015; Justus 2012; Pinder 2017a; Schupbach 2017), who in that context introduce experimental philosophy to explication. (For further examples see sect. 2 below.)

(iii) Conversely, others left the confines of isolated explications in favor of theorizing about the interrelations between multiple explications (see 3.a). This line of research on what might be called superexplicative structures continues Martin’s efforts. Brun (2017) refers to Martin’s emphasis on whole theories while trying to unify Carnap’s explication and Goodman’s reflective equilibrium. Brun’s final procedure can be seen as consisting of, among other components, multiple explications. Meanwhile, Siegwart considers chains of explications and types of disputes arising from rivalling explications (1997b, 263-265).

(iv) Another group of articles critically discusses various aspects of explication or the overall method without intending to enhance it. Strawson’s contributions were some of the first of this kind, and the various takes on the paradox of analysis and the changing-the-subject objection are often a centerpiece of both critical and sympathetic discussions. More recently, Maher’s express defense of explication (2007) has been widely noted. Reck (2012) exemplifies the kind of articles that are rather critical of explication. Both authors, like many others, base their assessments directly and only on Carnap’s conception of explication.

(v) Finally, there are those contributions that either relate explication to other methods or distinguish kinds of explication that the analytic classics had not differentiated. Radnitzky (1989) wants to improve on Carnap’s notion of explication by transferring it from logical empiricism to the framework of Popper’s evolutionary methodology: Successful explications are exclusively considered to be a byproduct of processes of theory enhancement or theory replacement. Equally critical of Carnap, but inspired by Gaston Bachelard, Ibarra and Mormann (1992) regard the explication of a scientific concept (e.g. number or line) as its generalization in an explicative theory (number theory or geometry, respectively). Carus (2012) is concerned with two kinds of explication: local and global. Haslanger (2012, 376) distinguishes conceptual, descriptive, and ameliorative analysis, occasionally associating the latter with Carnap-Quinean explication (2012, 367). While all three kinds of analysis may, depending on their execution, have considerable overlap with explication, ameliorative analysis notably pushes the boundaries of explication because the “target concept” is picked for social or political reasons rather than, as was presupposed by the pioneers of explication, purely theoretical reasons.

The five categories are neither exhaustive nor mutually exclusive. For example, Brun (2017) was filed under (iii) and Floyd (2012) was seen as predominantly exegetical, but both could also reasonably be read as examples of category (v). A general theme throughout most recent systematic contributions to explication is that of the constantly evolving nature of languages and concepts (see Wilson 2006). Several scholars feel this nature needs to be considered more in explication—a method that is often seen as dealing with rather rigid notions of language, expressions, and concepts.

2. Procedure

This section provides a procedural account of explication. The first sub-section briefly situates explication within the field of methods of introduction (2.a). Then, three steps are distinguished within explication (Siegwart 1997a, 29): preparation (2.b), the act of explicative introduction (2.c), and postprocessing (2.d).

The act of explicative introduction is not the explication; it is only part of it. Thus, it is highly elliptical to talk about, say, an isolated definition as an explication. Preparation and postprocessing are integral parts as well. In preparation, the need for an explication is assessed, the explicandum is identified and situated within a context, and an explicatum within another context is chosen. Also, criteria of adequacy are established. Postprocessing primarily involves an assessment of adequacy.

Note that ‘explicandum’ and ‘explicatum’ are here taken to refer to expressions in order to avoid conflicting views on concepts. This is in accordance with Carnap, who admitted both concepts and terms as explicatum/explicandum (see the quote at the beginning of this article). Any theory and procedure of term explication can be read as a theory and procedure of concept explication once a suitable theory of concepts has been applied.

a. Framework

It is an everyday phenomenon in natural and artificial languages that we hear, read, or utter words and phrases without knowing their full meaning. We are unsure about their correct use or realize that we have used them in a way we would now consider to be questionable or even incorrect. In these situations, the meaning or use of the words may have to be fixed or entirely new words have to be fitted with novel patterns of use. Setting the meaning or use of a word in one of these ways is henceforth called introducing it. Introducing a word in this sense is to be distinguished from the didactic activity of teaching its correct use to someone. An introduction should entail the commitment and subsequent adherence to the meaning or use that was set. Thus, introduction is always stipulative (see Maher 2007, 336). What is usually called ‘reportive definition’ is not an introduction in this sense as no act that sets the meaning of an expression is being performed. Rather, the meaning that has been associated with the expression in question is being reported or its use is being described. The umbrella term ‘meaning clarification’ can be taken to refer to introduction and meaning description alike.

There are various methods and forms of introduction. Forms of introduction are, for example, definitions, axioms (or meaning postulates), and various types of metalinguistic rules. Acts of introduction are thus performed by, for example, setting definitions, setting axioms, or establishing metalinguistic rules that regulate the use of an expression. Each of these forms of introduction is compatible with explication as all of them are a means of providing an explicatum with a meaning or of determining how the explicatum is to be used correctly. In the literature on explication, the most frequent form of introduction, and sometimes the only one considered, is the definition.

When introducing an expression or a concept, there are many factors that may or may not be taken into account. Depending on what is to be taken into account, one follows a certain method of introduction. For example, previous usage of certain expressions may or may not be relevant to the act of introduction. More specifically, it is possible that there are expressions in use whose application is problematic (with respect to certain aims). If under such circumstances the introduction of expressions whose meaning or use is intended to more or less mirror the former expressions is attempted, then the method of introduction is called explicative or an explication. All other methods of introduction will be called novative. Novative introductions can be found, for example, in contexts of discovery. When, for example, a new plant species is discovered, it may receive any species predicate without the prior use of that expression being substantially relevant. For more on this topic, see Siegwart (1997a, 18-29, §§ 3-10).

b. Preparation of an Explicative Introduction

The preparation of an explicative introduction is intended to provide everything that is needed in order to perform introductions that can be evaluated according to standards of explication afterwards. Some preparatory steps are inevitable in order to yield an explication; at least the explicandum, the explicandum language, the explicatum, the explicatum language, and the criteria of explicative adequacy need to be identified in some way. Together with the explicative introduction (2.c), they constitute an explication in the sense that if one is missing, it can hardly be said that an explication has taken place. Therefore, they are constituents of an explication (3.a). But note, not all constituents need to be of a formal or explicit nature—this would be rather unusual at least for the side of the explicandum—and their identification may allow for some ambiguity. Thus, the minimal procedure of explication, which consists of the allocation of the five constituents (this sect.) and the introduction of the explicatum (the sixth constituent; 2.c), does not put any constraints on how to perform an explication.  It simply acknowledges what is needed in order to be able to speak of an explication at all (see Cordes 2017). More constructively: If, after some interpretive effort, one finds six items in a text that can be understood, respectively, as explicandum, explicandum language, explicatum, explicatum language, criteria of explicative adequacy, and explicative introduction, then, and only then, it seems reasonable to speak of an explication.
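The minimal procedure can be pictured as a completeness check on a record with six slots. The following Python sketch is a loose illustration under stated assumptions, not a formalization taken from Cordes (2017) or Siegwart (1997a). The class `ExplicationRecord`, its field names, and the sample values (which reuse the ordered pair case) are hypothetical, and the interpretive effort needed to find the six items in a text is not modelled at all.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ExplicationRecord:
    """Six constituents; None means 'not (yet) identified'."""
    explicandum: Optional[str] = None
    explicandum_language: Optional[str] = None
    explicatum: Optional[str] = None
    explicatum_language: Optional[str] = None
    adequacy_criteria: Optional[List[str]] = None
    explicative_introduction: Optional[str] = None


def missing_constituents(rec: ExplicationRecord) -> List[str]:
    """Names of constituents that are absent or empty."""
    return [f.name for f in fields(rec) if not getattr(rec, f.name)]


def counts_as_explication(rec: ExplicationRecord) -> bool:
    """All six constituents present, and only then."""
    return not missing_constituents(rec)


# Hypothetical sample values, loosely based on the ordered pair case.
pair = ExplicationRecord(
    explicandum="ordered pair",
    explicandum_language="informal mathematical English",
    explicatum="(x, y)",
    explicatum_language="a language of set theory",
    adequacy_criteria=["(x, y) = (z, w) -> x = z and y = w"],
    explicative_introduction="(x, y) = {{x}, {x, y}}",
)
bare_definition = ExplicationRecord(
    explicatum="(x, y)",
    explicative_introduction="(x, y) = {{x}, {x, y}}",
)

print(counts_as_explication(pair))
print(missing_constituents(bare_definition))
```

The second record illustrates why it is elliptical to call an isolated definition an explication: four of the six constituents are still unaccounted for.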

Other steps are not obligatory but help to give an explication that establishes an explicatum fit to the explicator’s locutionary purposes: posing an explicative question, assessing the need for an explication, treating ambiguity in the explicandum, considering synonyms of the explicandum, reviewing empirical research on the use of the explicandum, and reviewing the history of explication. In this sub-section, these steps are dealt with in a suitable order.

(i) Posing of an explicative question (optional): Quite frequently, explicative endeavors are prompted by a “what-is-x” question. What are qualia? What is a norm? What is truth? (see Carnap 1950, 4, §2; Audi 2015, 209) Questions of this kind help to invoke the field in which the explication takes place. By posing this question, the field is only roughly contoured so that indeterminacy still remains. When posing an explicative question, one should emphasize that it demands a characterization; examples or non-characterizing general statements (such as natural laws) are not the required answers. This emphasis can be achieved by using semantic vocabulary within the explicative question: What is the meaning of ‘beauty’? How is the term ‘rational’ to be used and understood?

(ii) Assessing the need for an explication (optional): Questions demanding characterizations may be satisfied with reportive answers. In that case, an explication may not be in order. If a questioner demands an explication, this demand may or may not be justified. Whether the need for an explication is real can be assessed by pointing to problems with the prior usage of the explicandum. Carnap names inexactness of the explicandum as the sole reason for performing an explication (1950, 4). The concept of (in)exactness in explication is frequently discussed (Reck 2012, 99-101) and seems to be inexact itself (in some intuitive sense). Hahn (2013, 36-42) investigates three specific reasons for performing an explication: ambiguity, vagueness, and semantic gaps. According to Hahn, each of these semantic defects may be innocuous (such as the ambiguity of ‘bank’) or risky (such as the vagueness of ‘medically necessary treatment’). However, a given semantic defect may be acknowledged but at the same time be desirable for non-cognitive communicative reasons. Constructive ambiguity in diplomacy is a case in point (see Pehar 2001). An expression in a context may have other defects that constitute causes for explication, such as emotive connotation, or it may have multiple defects. Naming them helps the explicative effort in establishing criteria of adequacy. Alternatively, there may be no inherent defect within prior usage, although the explicator envisions some specific discursive purpose which calls for adjustments in the use of certain expressions. Generally, any perceived relevant defect in an explicandum can be understood as a mismatch between its existing use patterns and the purposes to which one wants to use it.

(iii) Naming the explicandum (necessary): The proposal of an explication needs some expression that is supposed to be the explicandum. Posing an explicative question already involves using or mentioning that explicandum. But with further deliberation on the subject, one may want to choose a different, but related, expression as the explicandum. For example, starting with the question as to what beauty is, an explicator may determine ‘is beautiful’ as the explicandum. Although naming an explicandum is obligatory within an explication, it is possible to revisit this step later should a different expression turn out to be more suitable. In addition, there is the possibility of naming several expressions which are simultaneously explicated in one explicative endeavor. This may lead to one unifying explicatum or to several explicata, each of which is associated with one or more explicanda. Naming the explicandum is not the same as individuating the explicandum concept. If a pre-theoretic concept is considered to be the object of an explication, there are widely noted problems in individuating it, since its inexactness is presupposed by the usual understanding of ‘explication’ (Carnap 1950, 4, §2). The next four steps represent a limited remedy for that problem.

(iv) Naming the explicandum language (necessary): Since the explicandum is just an expression, it is not sufficient to serve as the basis of an explication which is supposed to build on prior usage. Thus, an explicandum language which includes the use patterns of the explicandum has to be identified. Associating some kind of use patterns with a language does not mean that this language has to be formal or even that the use patterns are known to the explicator. In fact, in this pre-explicative state, languages are often hard to individuate. For this reason, it suffices to identify a language by referring to the relevant context of use, such as a seminal text that employs the explicandum and is supposedly written in the explicandum language. Four examples of explicandum languages are contemporary philosopher’s English, the language of applied ethics, the language of the Vienna Circle debate about protocol statements, and the scientific jargon of some discipline, like physics. The most pertinent measure of what constitutes an acceptable identification of an explicandum language is the degree to which it encompasses those prior employments of the explicandum which the explicators deem to be the relevant reference points for their explicative project. For that purpose, it might be advisable to specify an explicandum theory if the distinction between theory and language is available. To give an example of this and the preceding step in one sentence: “Here I will explicate ‘velocity’ as it is used in Newtonian mechanics for engineering science.” (Carnap 1950, 3-5.)

(v) Treating ambiguity in the explicandum (optional): Risky ambiguity constitutes a reason for explication (see step (ii)). Different meanings or uses of expressions within the same context should be distinguished in an informal but systematic fashion (Siegwart 1997a, 31). This helps to clarify which of the meanings is the one that predominantly serves as the model for the later introduction of the explicatum. In some cases, multiple and systematically related but distinct meanings of the explicandum are supposed to serve as the model. In that case, it makes sense to develop an explication plan which involves multiple explicata that will be introduced in a systematic order. When, for example, explicating ‘supererogatory action’, the explicator will recognize a distinction between a type and a token understanding of actions in general. This may suggest that two explicata (‘supererogatory action type’ and ‘supererogatory action token’) and two acts of explicative introduction are in order, and that one explicatum could be defined by means of the other. A classic Aristotelian example of disambiguation concerns ‘healthy’ in four interrelated senses: The term may be applied to things preserving health, things causing health, signs of health, and to things capable of health (Aristotle 1984, 1003a-b).

(vi) Considering synonyms of the explicandum (optional): In this step, synonyms of the explicandum are listed (Siegwart 1997a, 31). Since this is still a pre-explicative step, synonymy is taken in an intuitive sense. The list serves two purposes: (I) Synonyms may help to individuate the meaning that serves as the reference point for the explicative introduction. Thus, when explicating ‘argument’ in communication studies, one may list ‘dispute’ and ‘reason’ as two (partial) synonyms of ‘argument’ and then specify that this explication is not about ‘argument’ in the sense of ‘reason’. (II) The other purpose is to broaden the scope of explications. Some synonyms may be suitable as additional explicanda. This can lead to a refinement of the explication plan (see step (v)). Or, again, multiple explicanda can be merged into one explicatum; this then constitutes convergent explications (as in 3.a).

(vii) Reviewing empirical research on the use of the explicandum (optional): Shepherd and Justus (2015, sect. 3) argue that experimental philosophy can help with the preparation of an explication, especially with the individuation of the explicandum concept. According to them, surveys about the intuitive use of language help to uncover (I) vagueness, (II) ambiguity, (III) bias influencing intuition, (IV) non-biasing influences on intuition, and (V) central features of an explicandum concept. Data on these issues is relevant to what appears here as steps (ii) and (v). In addition, the experimental identification of central features of concepts can directly inform either the criteria of explicative adequacy or even the explicative introduction (Pinder 2017a, sect. 3). But explicators are not forced to follow the survey results in this respect. (Nor do Shepherd and Justus claim that they are.) For example, when explicating conditional expressions of natural language, experimental philosophers may find that test subjects do not generally accept modus tollendo tollens. Explicators could still decide to put that inference pattern into the criteria of explicative adequacy in order to carve out a use of conditionals that does admit of modus tollendo tollens. Thus, experimental philosophy should primarily be seen as a heuristic in explication, an extension of the classical conceptual analysis method of contemplating and trying out acceptable and unacceptable uses of the explicandum. Similar to how explicators review results from experimental philosophy, they may also want to consider studies in corpus linguistics, but this field is not yet being investigated by scholars on explication.

(viii) Reviewing the explicative history of the explicandum (optional): Like the preceding steps, reviewing already existing explications of the same explicandum may serve heuristic purposes by contributing to the content of the current explication, as with regard to establishing criteria of adequacy (step (xi)). In addition, this step allows situating one’s own enterprise within the debate (Siegwart 1997a, 33-34). If there are no prior explications of the explicandum, it may help explicators to be aware of the potential paradigm function that their own explication may or may not serve (ius primae explicationis). On the other hand, if there are prior explications and relevant debates, explicators can point out similarities and dissimilarities and—if so disposed—may add critical remarks, thus partaking in explicative debates. Even without critical remarks, explicators should explain why they do not accede to existing explications (see 3.a).

(ix) Naming the explicatum language (necessary): Explicatum languages should be chosen carefully. Potential users of explicata and addressees of explications should be taken into account. It may help to itemize languages in a systematic fashion. If, for example, the explicatum language comprises reference to parts and wholes, one should consider both mereological and set theoretic languages. At any rate, the explicatum language should be one that is accessible, that is, one that is either intuitively useable in a correct fashion or one associated with available introductory literature. If necessary, explicators themselves have to give an introduction to the languages or even construct them ab initio. That does not imply that explicatum languages have to be formal, although this might be the case quite frequently. An explicatum language can also be “a more exact part of” the explicandum language (Carnap 1963, 935). If the explicatum language is informal, the individuation strategies are similar to those for explicandum languages (see step (iv)). Note that if the form of introduction later employed needs a background theory, then such a theory should be named in step (ix) as an explicatum theory within the explicatum language.

(x) Naming the explicatum (necessary): This step is as intricate as the preceding one. First, explicators must realize that there are several syntactical categories from which to choose in most explicatum languages. Thus, when explicating ‘beauty’, one may decide to choose a nominal expression or a predicative one as the explicatum. Both options entail several subordinate options. If the explicator decides for a predicate as the explicatum, the question of arity arises, which is connected to the question of what kind of operanda are acceptable for each place of the predicate. This is true of both formal and informal languages. Second, certain options may be excluded for various reasons. Semantical intuitions or plans about the post-explicative semantical relations between some expressions are relevant to determine which expressions are potential explicata, but since this itemization is only about expressions, none of the semantic relations is binding. It should be taken into account that prospective users of the explicata may desire explicata that can precisely describe complex relations and explicata that are simple and easy to use. Third, the explicatum or explicata must be explicitly named. The syntactical category and, if applicable, arity for each explicatum must be clear.

(xi) Naming the criteria of adequacy (necessary): The criteria of adequacy constitute the explication’s measure of success. Usually, it is a number of propositions involving the explicata that are supposed to be proven true after the explicative introduction has been performed (see Quine 1960, 257-266, §§53-54). Explicators may arrive at these propositions with the help of steps (ii) and (v) to (viii), in which various senses have been distinguished and conceptual interrelations have been scrutinized. It is advisable to assemble preliminary criteria of adequacy as a kind of wish list that is still formulated in the explicandum language. Eventually, however, conflicting criteria have to be eliminated and all criteria must be formulated in the chosen explicatum language so as to yield a demonstrably successful explication (see 2.d). Because the criteria of adequacy will often derive from pre-explicative intuitions that are associated with the explicandum, they can be seen as codifying Carnap’s requirement of similarity between explicandum concept and explicatum concept (Quine 1960, §53). Traditionally, the criteria of adequacy are thought of as explicatum language propositions that are true in the given explicatum language. However, sometimes it is important to explicators that certain propositions do not follow from the explicative introduction and that their negations do not follow either. For example, agnostic explicators of the concept of god may want to avoid a scenario which allows them to derive god’s omnipotence, or lack thereof, from the introduction of the explicatum. Formulating this kind of undecidability as a criterion of adequacy turns these criteria into partially metalinguistic propositions. Explication should allow for this kind of metalinguistic criteria because they allow us to formulate the other Carnapian requirements as criteria of adequacy. Thus, from an explication of ‘planet’ one may require as per two criteria of adequacy that (I) planets are celestial bodies and that (II) planethood implies that most planetary laws discovered before 1990 still hold. The second criterion is of a metalinguistic nature and codifies Carnap’s fruitfulness requirement (1.b).

c. Introducing an Explicatum

While preparing an explication, its explicandum, explicandum language, explicatum, explicatum language, and its criteria of explicative adequacy have to be named (2.b). The first two constituents represent the connection to a preceding language practice and affect the specification of the latter three constituents. However, only the latter three are relevant to the introduction of the explicatum. The explicatum is introduced in the explicatum language in a way that supports the criteria of adequacy. The result of the act of introduction, such as a definitional proposition, is the sixth constituent of an explication (see 3.a).

In order to introduce explicata in accordance with a certain form of introduction (definitions, meaning postulates/axioms, metalinguistic rules), that form of introduction needs to be available for the explicatum language. Depending on the form of introduction, further requirements must be met—for example, some definition rules are formulated with reference to theories which would have to be provided before performing the act of explicative introduction. All this applies to formal and to informal explicatum languages though informal languages often do not explicitly determine whether certain forms of introduction are available and what relevant background theories are. Explicators should keep this in mind while preparing the explication (2.b). There are at least three typical forms of introduction that are usually accepted in both formal and informal settings:

(i) Definition: Here, definitions are understood as constituting a form of introduction that is performed within (explicatum) languages. Thus, definitions are not metalinguistic. In formal languages, defining is like inferring in that it is governed by rules that can be stated in a suitable metalanguage. A considerable number of definition rules have been developed for formal languages (Suppes 1957, ch. 8). The definition of predicates by generalized biconditionals and the definition of individual constants and function constants by (generalized) identities are quite common. An example of the latter has been provided above for the ordered pair function constant ‘(.., ..)’ (1.b). It requires that in the explicatum language, or theory, the set theoretic symbols and some logical constants are already introduced. An example of a definition of predicates applied to an informal language is represented thus: any b is a piece of knowledge if and only if b is a belief and b is true and b is justified. Setting this proposition as a definition presupposes that the predicates in the definiens have a definite meaning relative to the explicatum language, either by introduction or by implicit convention. Thus, when explicating the explicandum ‘knowledge’ by the explicatum ‘is a piece of knowledge’ through the definition just provided, explicators should settle on a language that (I) allows for this kind of definition and (II) provides the expressions occurring in the definiens, including their meaning.
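As a rough illustration, the informal definition of ‘is a piece of knowledge’ can be rendered as a generalized biconditional. The predicate letters K (‘is a piece of knowledge’), B (‘is a belief’), T (‘is true’), and J (‘is justified’) are abbreviations assumed here for the sketch; they do not stem from the literature cited.

```latex
% Sketch: definition of a one-place predicate by a generalized biconditional.
% K is the definiendum; B, T, J are assumed to be available in the language already.
\[
\forall b\,\bigl(K(b) \leftrightarrow B(b) \land T(b) \land J(b)\bigr)
\]
```

The definiendum K occurs only on the left-hand side, so that any formula of the form K(t) can be replaced by the corresponding instance of the right-hand side. This is the syntactic feature usually associated with eliminability.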

(ii) Meaning postulate/axiom: Setting axioms or, by another terminology, meaning postulates is performed within languages as well. This excludes axiom schemes which are metalinguistic devices for describing classes of object language axioms (see (iii), below). Again, being an object language form of introduction, axioms can be seen as governed by metalinguistic rules. These rules usually call for consistency, or prima facie consistency, so that the resulting language/theory or parts of language/theory are not rendered trivializable or even inconsistent. Traditionally, axioms are thought to be true or evident, though it is not always clear according to what measures they could be judged thus. Alternatively, and instrumentally, setting axioms can itself be seen as an act of qualifying propositions as true. At any rate, while definitions have a specific syntactic form (see above) which guarantees non-creativity and eliminability, axioms are less regulated syntactically and usually allow for creativity in the sense of establishing new truths in a language (or theory). Axioms come into play when expressions cannot be defined but are seen as basic concepts which nonetheless need regulation. The various axiomatizations of set theory can be seen as explications of ‘belongs to’ or ‘is an element in’. In such an explication, the explicative introduction consists of several acts with each setting one axiom. An example of an informal explication employing axiomatic introduction is the juridical explication of ‘person’ with regard to corporations that states that in a number of respects, corporations have personhood. (‘.. has personhood’ can be seen as the explicatum.)

(iii) Metalinguistic rule: Axioms are used to introduce the basic expressions of a language. However, this is only true if logical and auxiliary expressions are disregarded. Frequently, these are not regulated by axioms but by metalinguistic rules, including rules for setting axioms and definitions as well as rules of inference and assumption. This illustrates that metalinguistic rules are yet another means of giving meaning to an expression and can be used for acts of explicative introduction. Thus, establishing the rules of hypothetical derivation (conditional introduction) and modus ponendo ponens (conditional elimination) in a language may be seen as an act of explicative introduction which explicates the expressions ‘if … then …’ and ‘… provided …’ of ordinary language. Explication by metalinguistic rules also makes sense if the means of the object language are not sufficient to endow the explicatum with the full intended meaning. In mineralogy, hardness (explicatum) can be regulated by an operational rule that gives practical instructions to be followed, depending on the outcome of which propositions on comparative hardness may be asserted. Metalinguistic rules, like axioms, are rather flexible. Usually, the only (meta-meta-)rule to be followed is that the metalinguistic rules shall not lead to any kind of prescriptive dilemma with respect to what moves they allow or forbid in their corresponding object language (but also see Prior 1960 for tonk style rules to be avoided).
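The two rules for the conditional mentioned above can be sketched in a common natural deduction notation. The schematic letters A and B and the rule labels are conventions assumed for this illustration; the bracketed A marks an assumption that is discharged when the rule is applied.

```latex
% Sketch: conditional introduction (hypothetical derivation) and
% conditional elimination (modus ponendo ponens) as metalinguistic rules.
\[
\frac{\begin{array}{c}[A]\\ \vdots\\ B\end{array}}{A \rightarrow B}\;(\rightarrow\mathrm{I})
\qquad\qquad
\frac{A \rightarrow B \qquad A}{B}\;(\rightarrow\mathrm{E})
\]
```

Neither rule is a proposition of the object language. Both state which moves are permitted in it, which is why this form of introduction counts as metalinguistic.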

The forms of introduction described here are not exhaustive. For example, forms of mixed introduction that combine metalinguistic rules and object language axioms are easily conceived. When explicating ‘heavier’ for a physical theory, the setting of axioms of irreflexivity and transitivity can be combined with an operational rule directing the use of a beam scale that leads to atomic propositions about one body being heavier than another (see Siegwart 2007b, 52-56).
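The axiomatic half of such a mixed introduction might look as follows, where the two-place predicate H (‘.. is heavier than ..’) is a symbol assumed for this sketch. The operational rule for the beam scale is not captured by the formulas; it remains a metalinguistic instruction.

```latex
% Sketch: axioms of irreflexivity and transitivity for an assumed predicate H.
\[
\forall x\,\neg H(x,x)
\qquad\qquad
\forall x\,\forall y\,\forall z\,\bigl(H(x,y) \land H(y,z) \rightarrow H(x,z)\bigr)
\]
```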

d. Assessing Adequacy

After the explicatum has been introduced, the explication is commonly assessed. This can be done by verifying the criteria of explicative adequacy which have been established in the last step of the explicative preparation (2.b). A straightforward test of adequacy consists in verifying all these criteria in the explicatum language with the help of the explicative introductions performed in the preceding step (2.c) and, possibly, with the help of an explicatum theory provided beforehand (Cordes 2017). A positive test result qualifies the explication as adequate or successful. If any of the criteria is disproven in the test of adequacy, the explication is inadequate or has failed. A full test of adequacy is not always feasible. Sometimes, some criteria of adequacy (for example, metalinguistic criteria) cannot be decided by the resources available or are even undecidable. If at least all object language criteria of adequacy are proven in the test of adequacy, one may speak of a consequentially adequate explication. In any case, if there are criteria of adequacy that have neither been proven nor disproven and if all other criteria have been proven, the explication is adequate on probation (Cordes 2016, 38).
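The four qualifications just distinguished can be sketched as a small classification routine. The following Python fragment is only an illustration under stated assumptions: the names `Verdict`, `Criterion`, and `assess` are hypothetical, the verdicts assigned to the two sample criteria are invented for the example, and each label is computed independently from a literal reading of the conditions above, so that more than one label may apply at once.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PROVEN = "proven"
    DISPROVEN = "disproven"
    OPEN = "open"  # neither proven nor disproven in the test of adequacy


@dataclass(frozen=True)
class Criterion:
    name: str             # hypothetical label for the criterion
    metalinguistic: bool  # False means an object language criterion
    verdict: Verdict      # outcome of the test of adequacy (assumed given)


def assess(criteria: list[Criterion]) -> set[str]:
    """Return every qualification whose stated condition is met."""
    labels: set[str] = set()
    verdicts = [c.verdict for c in criteria]
    object_level = [c.verdict for c in criteria if not c.metalinguistic]

    if all(v is Verdict.PROVEN for v in verdicts):
        labels.add("adequate")
    if any(v is Verdict.DISPROVEN for v in verdicts):
        labels.add("inadequate")
    if all(v is Verdict.PROVEN for v in object_level):
        labels.add("consequentially adequate")
    # Some criteria are open and all remaining ones are proven.
    if Verdict.OPEN in verdicts and Verdict.DISPROVEN not in verdicts:
        labels.add("adequate on probation")
    return labels


# Illustrative input only: the verdicts are assumptions, not results.
planet = [
    Criterion("planets are celestial bodies", False, Verdict.PROVEN),
    Criterion("earlier planetary laws still hold", True, Verdict.OPEN),
]
print(sorted(assess(planet)))
# prints ['adequate on probation', 'consequentially adequate']
```

Returning a set rather than a single label reflects that the text does not present the qualifications as mutually exclusive. An explication with an open metalinguistic criterion, for example, can be both consequentially adequate and adequate on probation.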

As the criteria of adequacy constitute the salient benchmark for explications, a bad performance in tests of adequacy should motivate revisions. Accordingly, it is common for explicators to go back and forth and tweak any of the constituents of the explication. Rarely is the explicandum or the explicandum language changed in order to get the desired positive test results. Changing the explicatum, the explicatum language (or theory), the criteria of explicative adequacy, or the explicative introduction may all influence the test. Note that the to and fro of the revision process is usually omitted when an explication is presented within a publication in order to present only the final explication.

Besides the internal test of adequacy, external measures of explicative or introductive quality are conceivable. (As explicative adequacy is an internal quality measure, confusion is avoided by not attributing adequacy to explications in an external sense. Still, Stein (1992, 280) gives an externalistic view on explicative adequacy.) Carnap’s requirements of exactness, fruitfulness, and simplicity are generic, and thus external, criteria of introductive quality (which can be specified in various ways; cf. 1.b). The choice of any of the following may cause external criticism despite a positive result of the internal test of adequacy: explicandum, explicandum language (and theory), explicatum, explicatum language (and theory), criteria of adequacy, explicative introduction.

One easy way of obtaining a successful explication is to employ the criteria of explicative adequacy as the explicative introduction, often as axioms/meaning postulates. This, however, trivializes the verification of the criteria. In some cases that is acceptable, but often it can be criticized on grounds of lacking fruitfulness because nothing that exceeds the criteria of adequacy will follow and a non-creative definition securing eliminability may be preferable over the ad hoc postulation of the criteria.

External assessments of explications include comparisons to other explications. Different explications may vie with one another for various desiderata, like simplicity, conciseness of the criteria of adequacy, or syntactical parsimony. Comparative assessments of explications substantiate explicative debates (3.a). A thorough exegesis of the history of explications for a given explicandum in the preparatory steps (2.b, (viii)) is an important resource to rely on in these debates.

3. Explication Theory

Section 2 provided an outline of an explicative procedure. It can be consulted when performing explications. When reflecting on explications, a theory of explication is needed. Such a theory can be provided with varying degrees of precision and distinguishing potential. In what follows (3.a), a theory is developed that falls short of the formal rigor of a set theory in which languages, expressions, and explications are set theoretic entities (Cordes 2017). The aim is to provide some intuitive terminology in order to distinguish different kinds of explications and to relate explications to one another. Reflections on explications in this sense are often important when explicative disputes emerge and when the whole method of explication is discussed. The second subsection (3.b) can be seen as an application of the terminology and also as a treatment of the analysis paradox that is frequently revisited when the method of explication is discussed critically.

a. Constituents of Explication

As described above, explications can be conceived as made up out of six constituents: explicandum, explicandum language, explicatum, explicatum language, criteria of explicative adequacy, and the explicative introduction. If so conceived, providing an explication means specifying these six constituents (2.b, 2.c). Reflecting on an explication means describing and theorizing on any of these constituents, their interrelations, or their relations to other entities—other explications, explicators, and the speakers of explicandum and explicatum language, background theories, etc. General discussions and justifications of the method of explication are supported by its theory, which can take the form of, for example, the six-constituent conception presupposed here. Other conceptions can be provided as well, but such activities have not received much attention in the past.

(i) On occasion, explicators or recipients of explications may reflect on one or several constituents of an explication separately. Explicators do so, for example, when following a procedure of explication akin to the one elaborated in section 2. In a strict sense, explicators cease to do so as soon as they relate the different constituents to one another, for example by considering the suitability of a specific explicatum language with regard to the potential explicata it accommodates. Nonetheless, a lot of explicators’ work is indeed directed at single constituents, like individuating the explicandum language. The same holds for recipients of explications who may take issue with one individual constituent for reasons purely internal to that constituent. For example, with regard to a certain criterion of adequacy, recipients may want to criticize it as self-contradictory or they may want to voice concerns about its intelligibility quite apart from how it relates to the other aspects of the explication.

(ii) The reflection on relations between different constituents within one explication has one prime example within the procedure of explication, namely the adequacy test. It relates the explicative introduction to the explicatum language and to the criteria of explicative adequacy. Thus, qualifying an explication as adequate, inadequate, consequentially adequate, or adequate on probation (2.d) is a matter of reflecting on some relations between the components of an explication. There are other kinds of relations that belong in this category. For example, explicators may want to compare the explicandum language and the explicatum language with respect to which is stronger or more precise or whether they are sublanguages of one another. This suggests the distinction between intra-language explications, where explicandum language and explicatum language are the same, and inter-language explications, where they are not.

(iii) Transcending isolated explications, a theory of explications provides means to compare different explications to one another. Of special interest are explications which either start with the same explicandum from the same explicandum language (explication alternatives, as opposed to genetically different explications) or yield the same explicatum from the same explicatum language with equivalent explicative introductions (convergent explications). Among explication alternatives, it is then possible to distinguish linguistic alternatives (different explicatum languages), lexical alternatives (different explicata), criterial alternatives (different criteria of adequacy), and introductory alternatives (different explicative introductions). This systematic approach to explication alternatives provides a starting point to carve explicative disputes at their joints. It allows explicators to localize their dissent and develop a productive explicative debate.
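The classification of explication alternatives can be illustrated with a short Python sketch built on the six-constituent conception. All names (`Explication`, `alternative_kinds`) and the two sample explications are hypothetical. Constituents are compared by plain string or set identity, which is a crude stand-in assumed for the sketch: deciding whether two introductions or criteria are really the same would require more than textual comparison.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Explication:
    explicandum: str
    explicandum_language: str
    explicatum: str
    explicatum_language: str
    criteria: frozenset  # criteria of explicative adequacy
    introduction: str    # result of the explicative introduction


def alternative_kinds(a: Explication, b: Explication) -> Optional[set]:
    """Name the respects in which two explication alternatives differ.

    Returns None if the explications are genetically different, that is,
    if they do not share explicandum and explicandum language.
    """
    same_start = (
        a.explicandum == b.explicandum
        and a.explicandum_language == b.explicandum_language
    )
    if not same_start:
        return None

    kinds = set()
    if a.explicatum_language != b.explicatum_language:
        kinds.add("linguistic")
    if a.explicatum != b.explicatum:
        kinds.add("lexical")
    if a.criteria != b.criteria:
        kinds.add("criterial")
    if a.introduction != b.introduction:
        kinds.add("introductory")
    return kinds


# Hypothetical sample data for illustration only.
jtb = Explication(
    explicandum="knowledge",
    explicandum_language="contemporary philosophers' English",
    explicatum="is a piece of knowledge",
    explicatum_language="regimented English",
    criteria=frozenset({"every piece of knowledge is true"}),
    introduction="b is a belief and b is true and b is justified",
)
variant = Explication(
    explicandum="knowledge",
    explicandum_language="contemporary philosophers' English",
    explicatum="is a piece of knowledge",
    explicatum_language="regimented English",
    criteria=frozenset({"every piece of knowledge is true"}),
    introduction="b is a belief and b is true and b is reliably formed",
)
print(alternative_kinds(jtb, variant))
# prints {'introductory'}
```

Localizing a disagreement in this way mirrors the idea of carving explicative disputes at their joints: the two sample explications share everything except the explicative introduction, so a debate between their proponents would be an introductory one.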

Apart from explication alternatives, other phenomena are noteworthy: If an explication starts from the results of a previous explication, then one may speak of a consecutive explication. As soon as more than two explications are related to one another, more interesting structures emerge. A history of explication is just any number of explications that are mutual explication alternatives. In contrast, chains of explications are those sequences of explications where the earlier explications provide explicata that are used in the later explications to introduce the later explicata (see sect. 4). Usually, the earlier explicata just figure in the later explicata’s definientia. For more detailed distinctions see Cordes (2017, sect. 4).

(iv) Lastly, one can reflect on relations between explications on one hand and entities that are independent from explications on the other hand. For example, when several explications provide explicata in a common explicatum language, the result can be considered a theory. This theory can be analyzed in many respects that are different from its explicative genesis. Also, as pointed out in the latter part of 2.d, explications can be externally assessed. Such an assessment accounts for the three Carnapian requirements of exactness, fruitfulness, and simplicity. But in general, explicators and the recipients of explications are free to relate explications to whatever entities they see fit.

This final group includes any investigations into the social dimension of putting forward an explication and into how explicators interact either competitively or cooperatively with one another or with other agents. Under the six-constituent conception of explication, explicators are not part of their explication. Consequently, to give an illustration, when a written text is, with substantial exegetic effort, interpreted as an explication, the six-constituent conception of explication says nothing about whether the distinctness of writer and interpreter or their double effort justifies speaking of two explications. An analogous case occurs when authors develop a predecessor’s systematic-explicative ideas on some subject. These two scenarios suggest that in the vicinity of explications, the possible interactions between agents are versatile. Literature on this social-conceptual dimension of explication is rare.

b. The Analysis Paradox

The analysis paradox is a prima facie cognitive dilemma that repeatedly surfaces in, among other areas, explication theory (Stein 1992, 280; Dutilh Novaes and Reck 2017)—sometimes without explicit mention (Justus 2012). Some scholars even view the paradox of analysis as a main motivator for the development of a method of explication by Carnap (Lavers 2013, 226) or by Frege (Beaney 2004, 120; see sect. 4 below). This is a debatable view, since neither philosopher referred to the resolution of the paradox as a goal of their respective efforts. At any rate, setting up and employing any method of explication can be clearly separated from theorizing about how it relates to the paradox of analysis.

The term ‘paradox of analysis’ was first used by Langford in reference to G. E. Moore (Langford 1942, 323; see Beaney 2014, who also applies the term ‘paradox of analysis’ to some earlier problems). There are several versions of the paradox, but the one that is relevant to explication can be put as the following dilemma: Either (i) an explicator fails because the result of the explication diverges from the explicandum, thus not explicating the intended concept, or (ii) an explicator fails because the explicatum concept is identical with the explicandum, thus not improving on the pre-explicative cognitive situation.

The paradox is often associated with an underlying conflict about the role of natural and artificial (not only formal) languages in philosophy. (For explications outside philosophy, this specific issue is less pertinent.) Thus, Strawson (1963, 515) characterizes philosophical problems as problems about concepts, with those concepts being conveyed by natural language. Carnapian explication includes an explicatum concept (often conveyed by artificial language) that is allowed to diverge from the explicandum concept. This divergence prompts Strawson to his widely recognized charge that an explication “change[s] the subject” (1963, 506). Strawson’s critique could be framed so that it is directed against Carnap’s perceived acceptance of the first horn of the dilemma, while, at that point, Strawson takes the second horn. This leads him to articulate his methodology in a way that averts the appearance of failure connected to that horn.

The criticism of explication as changing the subject is thus, on the one hand, an application of the paradox of analysis to explication and, on the other hand, a manifestation of a deeper methodological dispute about the subject of philosophical problems and the language in which they are framed. Facing this situation within philosophy, explication theorists have two ways to deal with it: (i) Accept Carnapian explication and accept a central role of artificial languages, or (ii) put the emphasis in philosophy on natural language and reformulate the method of explication in divergence from Carnap.

(i) For the first way to deal with the paradox, explication theory provides a clear and intuitive understanding of explicative failure (2.d), which is different from the (negligible) failure of not exactly capturing the unaltered pre-explicative use of a term. Thus, in this approach, the first horn of the dilemma is not seen as constituting failure. In a sense, the “changing of the subject” is recognized and may in fact be encouraged. If, for example, there are inconsistencies in the explicandum language which are seen as being caused by the meaning of the explicandum, then the explicatum should diverge from the original meaning. Accordingly, when characterizing explication, Löffler (2008, 14) talks about remedy (“entstören”) instead of replacement (“ersetzen”). This is a departure from Carnap and many other scholars who portray the explicandum as being replaced by the explicatum. For another line of reasoning in partial support of conceptual change within explication (ameliorative analysis), see Haslanger (2012, 392-394).

(ii) The second way to deal with the dilemma chooses natural over artificial language. However, this alone does not fend off the complaint of “changing the subject”. Even within natural language, the sameness or the difference of explicandum and explicatum may be seen as a dilemma. Thus, even intra-ordinary-language explications are prone to the paradox of analysis if unchanging replacement is expected.

4. An Exemplary Explication

In retrospect, the history of philosophy provides scholars of explication with numerous examples, although the exact count depends on the concept of explication. Presupposing a weak understanding, nearly any conceptual clarification in a philosophical context may count as an explication. Olsson (2015), for example, considers the JTB conception of knowledge to be an example of an explication, which is the basis for his defense against objections along Gettier’s lines. He does not cite a specific text which counts as the locus explicationis (a text documenting the execution of the explication). Instead, he trusts that the genesis of the JTB conception conforms to his 2-step conception of explication (2015, 59). In this last section, Frege’s Foundations of Arithmetic (1953) is described through the lens of an explication theorist. That is, the book is taken as an explication exemplifying the methodology from sect. 2 and some of the terminology from sect. 3. (For a deeper investigation into Frege’s project from a different point of view, see Blanchette 2012, especially ch. 4.)

Although Frege’s book The Foundations of Arithmetic is usually seen as an explication of ‘number’ (Lavers 2013, 225), he notably starts his investigation by posing an explicative question with regard to ‘one’ or ‘1’. Frege defines ‘one’ in the latter part of the book (§77) after defining ‘number’. Carnap (1963, 935), however, refers to the work as an explication of the numerical words ‘one’, ‘two’, etc. The book can be considered to contain not only one explication, but a chain of explications (3.a) with the explication of ‘number’ at the center. Roughly, part II of the book (§§18-28) predominantly deals with ‘number’ and part III (§§29-54) with ‘one’. The very first sentence of the book (Introduction, p. I) contains two versions of the explicative question regarding ‘1’: ‘What is the number one? What does the symbol 1 mean?’ The second version emphasizes Frege’s conceptual or explicative intention when posing the question. Subsequently, he clarifies that he is asking for a definition of that numeral. The explicative question with respect to ‘number’ is posed a few paragraphs later: What is number? (Introduction, p. II)

Frege is aware that he carries the burden of proof regarding the need for an explication (p. III, and explicitly on p. V: “a desire for a stricter enquiry”) and so points out some problems with the talk of numbers in the course of the introduction. Ambiguity (p. I) and contradiction (p. IV) are two of his concerns. Frege is also aware of other conceptual proposals (in effect, a history of explications) and of the fact that these proposals compete with one another (p. V).

The explicanda are named numerous times throughout the book, for example in the explicative questions: ‘1’ and ‘number’. After the introduction, Frege focuses on ‘number’ to illustrate his general aims, which is why this explicandum is more prominent (§§2, 4). The explicandum language can either be taken to be the everyday talk about whole-number quantitative issues or even the scientific mathematical jargon of Frege’s time. His reference to “arithmetic” seems to suggest the latter, but several of his examples stem from everyday life.

As mentioned above, ambiguity is a reason for Frege to start the explicative enterprise. When associating ‘one’ or the article ‘a’ with the partial synonym ‘unit’ or ‘unity’, he struggles with its ambiguity for several sections of the book (§§29-39). Right from the beginning of dealing with ‘unit’, Frege is skeptical of that term and rejects it as a possible explicandum. The syntactic ambiguity of ‘ein’ in the German language plays a role as well. (It can be a numeral or an indefinite article.) With regard to the ambiguity of ‘number’, Frege restricts his investigation to non-negative integers in §2, excluding other types of numbers.

Frege does not review systematic empirical research on the use of the expressions ‘one’ or ‘number’, but he gives several examples from ordinary language which he regards as representing widely shared use patterns (e.g. §52). (This is common in philosophical explications, since detailed linguistic studies for the relevant expressions are rarely available.) Frege also refrains from identifying a definite body of texts including relevant uses of the explicanda, although part II of the book is labeled “Views of certain writers on the concept of Number” which can be understood as an incidental definition of a canon of reference texts. In any case, it seems plausible to take Frege’s consideration of some proposals by other authors as an enquiry into preceding explications. Besides his repeated criticism, he positively incorporates a few aspects from earlier explications, for example Leibniz’s definition of the number two and beyond (§§6, 55). Strictly speaking, this does not concern the explicanda identified (‘number’, ‘one’) but related expressions (‘two’, ‘three’ …). When discussing the definability of ‘number’ in general, Frege recognizes the failures of some earlier attempts (§20). This can be seen as another recognition of the explicative history.

Since Frege was writing before the dawn of metamathematics, and thus of the notion of a metalanguage, while also taking a leave from his own formal system (Begriffsschrift), it is understandable that there is no section in his book where an explicatum language is explicitly named. It seems plausible to understand Frege’s explication of ‘number’ as not necessarily intended for everyday use, but for mathematical and philosophical contexts. Therefore, the explicatum language may be the scientific mathematical jargon. (In retrospect, one may understand Frege to be employing the language of a formal set theory as the explicatum language, but this defies chronology.) Thus, the scientific mathematical jargon is both Frege’s explicandum language and his explicatum language. This yields an intra-language explication (3.a). When considered this way, it is to be expected as per Carnap that the explicatum is situated within “a more exact part” of the explicandum language (Carnap 1963, 935). This, of course, depends on the further steps. For example, Frege’s investigation ensures that grammatical categorization is more transparent on the explicatum end. It becomes clear that ‘one’ is not supposed to be taken as a property name for objects (§§29-33). Some semantical issues (“Are units identical with one another?”, §34) are also addressed explicitly. In sum, instead of providing an explicit language and a definition of ‘number’ and ‘one’, Frege creates a “more exact part” within ordinary mathematical (or philosophical) language by answering several questions that pertain to the syntactical and semantical properties of the explicandum. Some properties of the environment in which the explicatum is to be situated are provided by Frege in passing: “I assume that it is known what the extension of a concept is.” (§68, fn.)

Since this process gradually transforms the original (explicandum) setting into the setting for the explicative introduction, the explicatum is not named explicitly, except in the definitions that are viewed as the explicative introductions (§§72, 77). The explicata are ‘… is a Number’ and ‘1’, a unary predicate constant and an individual constant in modern parlance. To avoid the ambiguous picture in which not only explicandum language and explicatum language are identical, but explicandum and explicatum as well, one may read the explicata with a suppressed index: ‘1F’.

Several definitions that are strictly part of the individuation of the explicatum language (or theory) lead up to and connect the two specific acts of explicative introduction. They are both acts of definition: “‘n is a Number’ is to mean the same as the expression ‘there exists a concept such that n is the Number which belongs to it’” (§72). “1 is the Number which belongs to the concept ‘identical with 0’.” (§77) The metalinguistic character of these definitions can be discounted, as both can easily be rephrased without referring to expressions. Frege plausibly employed the quotation marks as a parsing device. The first explicative introduction introduces ‘… is a Number’ by employing the unary function constant ‘the number of …’. The second explicative introduction, which introduces ‘1’, also relies on the expressions ‘the number of …’ and ‘0’. ‘0’ is introduced in §74. The definition of ‘the number of …’ relies on ‘the extension of …’ (a base expression) and the predicate ‘… is equal to …’ (“gleichzahlig”), which is introduced in §72 as well. In the preceding illustrations, a segment of Frege’s definitional tree is sketched. If ‘the number of …’ and ‘0’ were included among the explicata, then at least two chains of explications (3.a) could be seen here: A1. ‘the number of …’ along with A2. ‘… is a Number’ comprises one chain; and the other chain is comprised of B1. ‘the number of …’, B2. ‘0’, and B3. ‘1’. Following the explicative introductions, Frege defines further expressions, like ‘… follows in the series of natural numbers directly after …’ (§76).
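
The definitional dependencies just described can be set out compactly in modern notation. The following is only a sketch: the symbols N(F) for ‘the number of …’, ext for ‘the extension of …’, the relation sign for “gleichzahlig”, and the lambda abstracts are assumptions introduced here for illustration and are not Frege’s own symbolism. The definiens given for ‘0’ (the concept ‘not identical with itself’) is taken from §74 of the book and is not quoted in the text above.

```latex
% Illustrative notation only (assumed here, not Frege's own):
%   F \approx G    : the concepts F and G are equinumerous ("gleichzahlig")
%   \mathrm{ext}   : 'the extension of ...' (the base expression)
%   \mathrm{N}(F)  : 'the number of ...', i.e. the Number which belongs to F
\begin{align*}
\mathrm{N}(F) &:= \mathrm{ext}\bigl(\lambda G.\; G \approx F\bigr) \\
\mathrm{Number}(n) &:\Leftrightarrow \exists F\,\bigl(n = \mathrm{N}(F)\bigr) && (\S 72) \\
0 &:= \mathrm{N}\bigl(\lambda x.\; x \neq x\bigr) && (\S 74) \\
1 &:= \mathrm{N}\bigl(\lambda x.\; x = 0\bigr) && (\S 77)
\end{align*}
```

Read from top to bottom, the second line corresponds to chain A (‘the number of …’, then ‘… is a Number’) and the last two lines to chain B (‘the number of …’, then ‘0’, then ‘1’).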

Even before the number definition in §72, Frege launches efforts to assess the adequacy of his explication by preparing the verification of criteria of adequacy. He does this consciously, as can be seen from the headline, “Our definition completed and its worth proved”, and the consecutive section which mentions what was to become Carnap’s generic criterion of fruitfulness (§70). The specific criteria of explicative adequacy are not explicitly formulated before the explicative introduction. In fact, they are not presented until he actually verifies some of them, leaving the others without proof. This order of things is methodologically not ideal because it gives the criteria an ad hoc character. However, explicators frequently proceed in this fashion, possibly for dramaturgical reasons. Among the specific criteria of adequacy are some theorems about equality which appear in the definiens for ‘1’. Frege comes closest to giving criteria of adequacy in §78 when he lists six theorems employing several of the expressions he defines and not only the two explicata. His motions toward full arguments supporting these theorems feature several more definitions and further theorems. Thus, Frege’s method of assessing the adequacy of his explications is not in full accord with the method of explication outlined in sect. 2. It would be better described as an argument for adequacy by use, that is, by using the explicata in the way they are usually used in arithmetic. With regard to certain theorems, Frege presupposes their truth by qualifying them as analytic in the concluding sections (§§87, 109).

5. References and Further Reading

  • Aristotle, 1984, The Complete Works of Aristotle, J. Barnes (ed.), Princeton: Princeton University Press.
  • Audi, P., 2015, “Explanation and Explication,” in The Palgrave Handbook of Philosophical Methods, C. Daly (ed.), Houndsmills, Basingstoke, Hampshire: Palgrave Macmillan, pp. 208-230.
  • Beaney, M., 2004, “Carnap’s Conception of Explication: From Frege to Husserl?,” in Carnap Brought Home: The View from Jena, S. Awodey, and C. Klein (eds.), Chicago and LaSalle, IL: Open Court, pp. 117-150.
  • Beaney, M., 2014, “Analysis,” in Stanford Encyclopedia of Philosophy, E. N. Zalta (ed.), Stanford: Stanford University. URL: https://plato.stanford.edu/entries/analysis/
  • Blanchette, P. A., 2012, Frege’s Conception of Logic, Oxford: OUP.
  • Boniolo, G., 2003, “Kant’s Explication and Carnap’s Explication: The Redde Rationem,” International Philosophical Quarterly, 43 (1): 289-298.
  • Brun, G., 2016, “Explication as a Method of Conceptual Re-engineering,” Erkenntnis, 81 (6): 1211-1241.
  • Brun, G., 2017, “Conceptual re-engineering: from explication to reflective equilibrium,” Synthese. DOI: https://doi.org/10.1007/s11229-017-1596-4
  • Carnap, R., 1947, Meaning and Necessity: A Study in Semantics and Modal Logic, Chicago and London: The University of Chicago Press.
  • Carnap, R., 1950, Logical Foundations of Probability, Chicago: The University of Chicago Press.
  • Carnap, R., 1963, “Replies and Systematic Expositions,” in The Philosophy of Rudolf Carnap, P. A. Schilpp (ed.), LaSalle, IL: Open Court, pp. 859-1013.
  • Carnap, R., 2003 [1928], The Logical Structure of the World and Pseudoproblems in Philosophy, R. A. George (transl.), Chicago, Ill.: Open Court.
  • Carus, A. W., 2007, Carnap and Twentieth-Century Thought: Explication as Enlightenment, Cambridge et al.: CUP.
  • Carus, A. W., 2012, “Engineers and Drifters: The Ideal of Explication and Its Critics,” in Carnap’s Ideal of Explication and Naturalism, P. Wagner (ed.), Houndsmills, Basingstoke, Hampshire: Palgrave Macmillan, pp. 225-239.
  • Cohnitz, D., and M. Rossberg, 2006, Nelson Goodman, Chesham: Acumen.
  • Cordes, M., 2016, Scheinprobleme. Ein explikativer Versuch, dissertation, University of Greifswald. URL: https://nbn-resolving.org/urn:nbn:de:gbv:9-002497-0.
  • Cordes, M., 2017, “The constituents of an explication,” Synthese. DOI: https://doi.org/10.1007/s11229-017-1615-5.
  • Creath, R., 2012, “Before Explication,” in Carnap’s Ideal of Explication and Naturalism, P. Wagner (ed.), Houndsmills, Basingstoke, Hampshire: Palgrave Macmillan, pp. 161-174.
  • Dutilh Novaes, C., and E. Reck, 2017, “Carnapian explication, formalisms as cognitive tools and the paradox of adequate formalization,” Synthese, 194 (1): 195-215.
  • Floyd, J., 2012, “Wittgenstein, Carnap, and Turing: Contrasting Notions of Analysis,” in Carnap’s Ideal of Explication and Naturalism, P. Wagner (ed.), Houndsmills, Basingstoke, Hampshire: Palgrave Macmillan, pp. 34-46.
  • Frege, G., 1953 [1884], Die Grundlagen der Arithmetik – Eine logisch mathematische Untersuchung über den Begriff der Zahl/The Foundations of Arithmetic – A logico-mathematical enquiry into the concept of number, J. L. Austin (transl.), Oxford: Blackwell.
  • Frege, G., 1969 [1914], “Logik in der Mathematik,” in Gottlob Frege: Nachgelassene Schriften, H. Hermes, F. Kambartel, and F. Kaulbach (eds.), Hamburg: Meiner, pp. 218-270. (English translation: Frege, G., 1979, “Logic in Mathematics,” in Gottlob Frege: Posthumous Writings, H. Hermes, F. Kambartel, and F. Kaulbach (eds.), P. Long, and R. White (transl.), Oxford: Blackwell, pp. 203-250.)
  • Goodman, N., 1966 [1951], The Structure of Appearance, Indianapolis/New York/Kansas City: Bobbs-Merrill.
  • Greimann, D., 2007, “Regeln für das korrekte Explizieren von Begriffen,” Zeitschrift für philosophische Forschung, 61 (3): 261-282.
  • Hahn, S., 2013, Rationalität – eine Kartierung, Münster: Mentis. (English edition in preparation.)
  • Hanna, J. F., 1968, “An Explication of ‘Explication’,” Philosophy of Science, 35 (1): 28-44.
  • Haslanger, S., 2012, Resisting Reality: Social Construction and Social Critique, Oxford: OUP.
  • Hempel, C. G., 1952, Fundamentals of Concept Formation in Empirical Science, Vol. II, No. 7 of International Encyclopedia of Unified Science, O. Neurath, R. Carnap, and C. Morris (eds.), Chicago and London: The University of Chicago Press.
  • Hodges, W., 2014, “Tarski’s Truth Definitions,” in Stanford Encyclopedia of Philosophy, E. N. Zalta (ed.), Stanford: Stanford University. URL: https://plato.stanford.edu/entries/tarski-truth/
  • IAU (International Astronomical Union), 2006, “IAU 2006 General Assembly: Result of the IAU Resolution votes,” Prague. URL: https://www.iau.org/news/pressreleases/detail/iau0603/
  • Ibarra, A., and T. Mormann, 1992, “L’explication en tant que généralisation théorique,” Dialectica, 46 (2): 151-168.
  • Justus, J., 2012, “Carnap on concept determination: methodology for philosophy of science,” European Journal for Philosophy of Science, 2: 161-179.
  • Kant, I., 1998 [1781], Critique of Pure Reason, P. Guyer, and A. W. Wood (transl./eds.), Cambridge et al.: CUP.
  • Lambert, J. H., 1771, Anlage zur Architectonic oder Theorie des Einfachen und des Ersten in der philosophischen und mathematischen Erkenntniß, Riga: Hartknoch.
  • Langford, C. H., 1942, “The Notion of Analysis in Moore’s Philosophy,” in The Philosophy of G. E. Moore, P. A. Schilpp (ed.), Evanston/Chicago: Northwestern University, pp. 319-342.
  • Lavers, G., 2013, “Frege, Carnap, and Explication: ‘Our Concern Here Is to Arrive at a Concept of Number Usable for the Purpose of Science’,” History and Philosophy of Logic, 34 (3): 225-241.
  • Locke, J., 1997 [1689], An Essay Concerning Human Understanding, London et al.: Penguin.
  • Löffler, W., 2008, Einführung in die Logik, Stuttgart: W. Kohlhammer.
  • Maher, P., 2007, “Explication Defended,” Studia Logica, 86: 331-341.
  • Martin, M., 1973, “The Explication of a Theory,” Philosophia, 3 (2-3): 179-199.
  • Menger, K., 1943, “What is Dimension?” The American Mathematical Monthly, 50: 2-7.
  • Murzi, M., 2007, “Changes in a scientific concept: what is a planet?” PhilSci Archive. URL: http://philsci-archive.pitt.edu/id/eprint/3418
  • Naess, A., 1953, Interpretation and Preciseness: A Contribution to the Theory of Communication, Oslo: Dybwad.
  • Olsson, E. J., 2015, “Gettier and the method of explication: a 60 year old solution to a 50 year old problem,” Philosophical Studies, 172 (1): 57-72.
  • Pehar, D., 2001, “Use of Ambiguities in Peace Agreements,” in Language and Diplomacy, J. Kurbalija, and H. Slavik (eds.), Malta: DiploProjects, pp. 163-200.
  • Pinder, M., 2017a, “Does Experimental Philosophy Have a Role to Play in Carnapian Explication?” Ratio, 30 (4): 443-461.
  • Pinder, M., 2017b, “On Strawson’s critique of explication as a method in philosophy,” Synthese. DOI: https://doi.org/10.1007/s11229-017-1614-6
  • Prior, A. N., 1960, “The Runabout Inference-Ticket,” Analysis, 21 (2): 38-39.
  • Quine, W. V. O., 1951, “Two Dogmas of Empiricism,” The Philosophical Review, 60 (1): 20-43.
  • Quine, W. V. O., 1960, Word and Object, New York/London: The Technology Press (MIT)/John Wiley & Sons.
  • Radnitzky, G., 1989, “Explikation,” in Handlexikon zur Wissenschaftstheorie, H. Seiffert, and G. Radnitzky (eds.), München: Ehrenwirth, pp. 73-80.
  • Reck, E., 2012, “Carnapian Explication: A Case Study and Critique,” in Carnap’s Ideal of Explication and Naturalism, P. Wagner (ed.), Houndsmills, Basingstoke, Hampshire: Palgrave Macmillan, pp. 96-116.
  • Reichenbach, H., 1951, “The Verifiability Theory of Meaning,” Proceedings of the American Academy of Arts and Sciences, 80 (1): 46-60.
  • Schupbach, J. N., 2017, “Experimental Explication,” Philosophy and Phenomenological Research, 94 (3): 672-710.
  • Shepherd, J., and J. Justus, 2015, “X-Phi and Carnapian Explication,” Erkenntnis, 80 (2): 381-402.
  • Siegwart, G., 1997a, “Explikation. Ein methodologischer Versuch,” in Dialog und System: Otto Muck zum 65. Geburtstag, W. Löffler, and E. Runggaldier (eds.), Sankt Augustin: Academia, pp. 15-45.
  • Siegwart, G., 1997b, Vorfragen zur Wahrheit: Ein Traktat über kognitive Sprachen, München: Oldenbourg.
  • Siegwart, G., 2007a, “Johann Heinrich Lambert und die präexplikativen Methoden,” Philosophisches Jahrbuch, 114 (1): 95-116.
  • Siegwart, G., 2007b, “Alethic acts and alethiological reflection. An outline of a constructive philosophy of truth,” in Truth and Speech Acts. Studies in the philosophy of language, D. Greimann, and G. Siegwart (eds.), New York: Routledge, pp. 41-58.
  • Stein, H., 1992, “Was Carnap Entirely Wrong, after All?” Synthese, 93 (1/2): 275-295.
  • Strawson, P. F., 1963, “Carnap’s Views on Constructed Systems versus Natural Languages in Analytic Philosophy,” in The Philosophy of Rudolf Carnap, P. A. Schilpp (ed.), LaSalle, IL: Open Court, pp. 503-518.
  • Strawson, P. F., 1992, Analysis and Metaphysics: An Introduction to Philosophy, Oxford: OUP.
  • Suppes, P., 1957, Introduction to Logic, Mineola, New York: Dover.
  • Tarski, A., 1944, “The semantic conception of Truth and the Foundations of Semantics,” Philosophy and Phenomenological Research, 4: 341-376.
  • Tarski, A., 1956, “The Concept of Truth in Formalized Languages,” in Logic, Semantics, Metamathematics. Papers from 1923 to 1938, A. Tarski, Oxford: Clarendon Press, pp. 152-278.
  • Tarski, A., 1969, “Truth and Proof,” Scientific American, 220 (6): 63-77.
  • Tillman, F. A., 1965, “Explication and Ordinary Language Analysis,” Philosophy and Phenomenological Research, 25 (3): 375-383.
  • Tillman, F. A., 1967, “Linguistic Portrayal and Theoretical Involvement,” Philosophy and Phenomenological Research, 27 (4): 597-605.
  • Wilson, M., 2006, Wandering Significance: An Essay on Conceptual Behavior, Oxford: Clarendon Press.

 

Author Information

Moritz Cordes
Email: cordesm@uni-greifswald.de
University of Greifswald
Germany

and

Geo Siegwart
Email: siegwart@uni-greifswald.de
University of Greifswald
Germany

Philosophy of Architecture

The relation between philosophy and architecture is interrogative and propositional. It is about asking questions concerning the meaning of human habitation—what it means to live in built environs—and about evaluating plans and design projects where human flourishing and social progress can best occur—in what kinds of buildings, interior spaces and urban precincts. The following sets of questions address issues—aesthetic, ethical, and political issues, as well as metaphysical and epistemological concerns—that relate philosophy to architecture. Although philosophers and architectural theorists (and often design practitioners) can each be expected to have an interest in any or all of these questions, as scholars or public intellectuals of a kind, architectural theorists have played as much, if not more, of a role in shaping the field than philosophers have. There are historical reasons for this, having much to do with the origins and evolution of different academic disciplines and critical perspectives: the questions likely to be posed by one or the other, for a given period (or perennially in some cases) and the people most concerned to ask them. Here are the questions:

  • What is the philosophy of architecture about? How is, how can, and how should philosophy be connected to architecture?
  • How and in what ways is architecture concerned with aesthetics? How and in what ways is architecture concerned with ethics? Is there a connection?
  • What are architecture’s relations to social and political concerns and what does this tell us about the knowledge and discipline of architecture?

The focus of the article is on aesthetic and ethical issues which are, on virtually all accounts, the mainstay of philosophy of architecture. A consideration of ethical issues in architecture in relation to the aesthetic ones quickly segues into architecture’s relation to social theory and political philosophy.

Table of Contents

  1. Architecture: Discourse and Practice
  2. Key Philosophical Issues
    1. Architecture: What is it?
      1. Form and Function
      2. Affective Form (Function Follows Form)
      3. Architecture as Means to Social Engineering
    2. Architecture and Aesthetics
    3. Architecture and Ethics
  3. Philosophical Movements and Ideas in Architecture
    1. Idealism and Architectural History
    2. Phenomenology and Architectural Experience
    3. Structuralism and Meaning
      1. Postmodernism
  4. Post-structuralism and Power
  5. Selected Lines of Inquiry into Philosophy of Architecture
    1. Architecture and Representation
    2. Architectural Value and Heritage
    3. Environmental Issues
    4. Design Pedagogy
  6. References and Further Reading

1. Architecture: Discourse and Practice

The mixed character of architecture comes from its being a subject of overlapping philosophical and theoretical discourses as well as a category of creative practices. Philosophy of architecture has long been associated first and foremost with aesthetics. While architecture may be an art form, it is not a branch of aesthetics. In fact (or instead), a case can be made for relocating architecture, as philosophically considered, primarily, though by no means exclusively, within ethics and social and political philosophy. From a philosophical perspective, Winters’s (2001) survey article “Architecture” appears to be located where it belongs, in The Routledge Companion to Aesthetics rather than a volume on social and political thought or ethics. But from an architectural theorist’s point of view, or the viewpoint of a geographer or city planner, classifying architecture as a topic within aesthetics may seem too narrow or conforming. Of course, treating architecture within aesthetics does not preclude consideration of it from other points of view and it would be equally legitimate to have entries on architecture in companions to social philosophy or evolutionary psychology. Nevertheless, from these other points of view, placing the philosophy of architecture primarily in aesthetics is misleading. A survey of theory journals like Assemblage, Grey Room, or Architectural Theory Review shows that philosophy remains an important source of ideas and validation, but references to philosophical aesthetics are limited. The philosophical study of architecture raises questions in several philosophical sub-disciplines, and many design practitioners show a greater interest in ethical and political issues than aesthetic ones.

The question of where to place the philosophy of architecture is not simply a matter of preference or a topographical question of little importance. Instead, the issue can be linked to architectural practice in ways that enlarge our conceptions of architecture, architectural practice, and the architect. In any case, the question of where to locate the philosophy of architecture highlights significant differences between architectural theorists and philosophers, as to how to conceptualize, analyze, and address a range of topics of common concern.

While philosophers of art and aesthetics are still more likely to consider architecture than are social philosophers and ethicists, architectural theorists see the connection with ethics and social and political issues as more relevant and important. However, by considering the concerns of both philosophy and architectural theory, philosophy of architecture enlarges its conceptual and critical domain in ways that impact both philosophical and architectural theoretical approaches to architecture.

2. Key Philosophical Issues

Architecture can be, and has been, conceived as an intrinsically philosophical enterprise, grounded in aesthetics and ethics (including theories of human nature) and also in elements of social and political philosophy. Architects, landscape architects, and designers are responsible for creating spaces and fashioning the world (materially and ideationally) in which people live and interact. In so doing they promote as well as undermine certain values, understandings, and ways of living.

One need not cite utopian characterizations of “the City” to make the point that architecture is concerned with material realizations of visions of the good and what it means to live well. Urban culture manifests itself in its architecture. Debates over the future and planning of the City, including schemes that either rehabilitate or disavow utopian traditions, reinforce this important role. Although Ballantyne and Winters both discuss the aesthetic evaluation of architecture at length, they both believe that “we should evaluate buildings according to how well they make possible desired forms of life” (Goldblatt and Paden 2011, 4). Thus, the final standard of architectural value for them is the ethical (Ballantyne 2011; Winters 2011).

Adopting a historical perspective for these concerns, in a 1949 editorial for the Architectural Review announcing a new feature of the review called the “Canon,” the editors (including Nikolaus Pevsner) bemoaned the fact that architectural theorists had long lacked the common source material “necessary for the construction of the framework within which any who are so disposed may begin to discuss the theory and philosophy of architecture.” They stressed the need for a theoretically informed architectural criticism, regarding it as essential to understanding and improving the then current state of architecture and the architectural profession in Britain.

The perceived absence of any such canon or framework for a theory and philosophy of architecture may have undermined or hindered informed architectural criticism in Pevsner’s mind, but it has not stopped speculation on the nature of the relation between philosophy and architecture, including accounts that are quite general and self-aggrandizing. Mueller (1960, 39-43) described the relation as follows.

Both philosophy and architecture are edifying … [They] make possible all other values of life or all other arts … Architecture is their spatial, philosophy their spiritual home. In one and the same act, philosophy and architecture enclose man in their shell and structure, and disclose open vistas, new horizons, spiritual possibilities of expansion and self-realization … architecture [expresses] an underlying world-view, a cultural whole, the spirit of an epoch or a people, a dominant value of life—all transcriptions of philosophy… Philosophy and architecture have the coming task of healing the split of knowledge and feeling, of individual and community…

One source of Sigfried Giedion’s often quoted but obscure claim that the main task of architecture is “the interpretation of a way of life” valid for the times (1974, xxxiii) can be found in Immanuel Kant. Although Kant (“Analytic of the Beautiful” in The Critique of Judgment (1790)) claimed that function limits the potential beauty of buildings, he also claimed that beautiful art has “‘spirit’ by means of which ‘aesthetic ideas’ can be expressed” (Goldblatt and Paden 2011, 3). Presumably, it is this “expressive” dimension of architecture that Giedion has in mind. It is however one thing to say that architecture has an expressive dimension and another to suggest that a building is capable of expressing the spirit of an age.

Modern philosophical discourse on architectural practice can be traced to Kant, John Ruskin (1849), more directly to Martin Heidegger (1951), Pevsner (1936), Giedion (1974), and more recently to Roger Scruton (1979) and Karsten Harries (1997). (See Haldane 1999.) In the 20th century, the discourse has focused on architecture’s seeking to articulate its identity and special relevance by embodying or expressing the social, political, economic and personal character of the times. Other figures, from the philosophical side, include Warwick Fox (2000), Malcolm Budd (1995), Christine Smith (1992), and Tom Spector (2001). Among the many prominent and widely discussed architectural theorists and practitioners who have contributed to a philosophy of architecture are Vitruvius (the first-century B.C.E. Roman authority), Le Corbusier, Adolf Loos, Robert Venturi, Bernard Tschumi, Frank Gehry, and Daniel Libeskind.

a. Architecture: What is it?

The key underlying question for all of the preceding philosophers and architectural theorists and practitioners is “What is the philosophy of architecture about?” At the same time, we can also ask “How is, how can, and how should philosophy be connected to architecture?” Both of these raise the further question “What is architecture?”

The question “What is architecture?” has commonly focused first on how architecture is to be distinguished from “mere” building; and second, on its relation to art. Answers have often depended on how one has sought to reconcile or prioritize Vitruvius’s three elements of architecture: firmitas (durability, firmness), utilitas (convenience, commodity, practicality, function), and venustas (beauty, delight). Given the Vitruvian perspective, it is questionable whether these three elements could ever allow one to surmise, let alone deduce, fundamental architectural principles governing every significant work. It is doubtful that such principles could ever be extrapolated from observations of the material, functional or aesthetic attributes of a building—its “durability”, “convenience” or “beauty.” In any case, Vitruvius’s The Ten Books of Architecture (c. 15 B.C.E.), has been the most common source employed by architectural theorists and philosophers concerned with articulating the nature of architecture. (Spector (2001) structures his book around Vitruvius’s three elements.)

Whether, or to what extent, architecture is to be regarded as an art has long been disputed. One could perhaps just conceive of architecture as a craft pertaining to the fashioning of useful buildings. Nevertheless, Vitruvius, like most architectural theorists, sees the aesthetic (venustas) as essential to architecture. However, if architecture is an art, it is unusual and perhaps unique in that, unlike the other arts (music, sculpture, visual arts), utilitas (function) is also regarded as an essential part of it. While music, drama and the other arts may serve many functions, including those that are purely aesthetic, and are “practical” in various ways, none of these functions is regarded as intrinsic to their status as an art or to their ontological status, that is, to what they are.

Functionality (utility, purpose, practicality, and so forth) is, however, necessary to architecture. Graham (1989, 249) says “[A]esthetic functions in music and painting [and so forth] can be abandoned without loss to their essential character as worthwhile objects of aesthetic attention … But the same cannot be said for architecture. A building which fails in the purpose for which it is built [no matter how aesthetically pleasing] is an architectural failure, whatever other merits it may have.” Even those modernists who see form as preeminent and imply that the inverse of the adage holds true (that function must follow form) may agree with Graham on this. An aesthetically designed material arrangement that has no function or extra-aesthetic purpose may be regarded as a sculpture, and possibly even a “building.” However, on the accounts stemming in theory from Vitruvius, unless it has some (not wholly aesthetic) function (utilitas), however changeable (the functions of buildings may of course change) or nondescript, it cannot be regarded as architecture.

Given that the elements in the Vitruvian triad are all ingredient in architecture to some degree, the central issue concerning the nature of architecture often rests on determining which of these indispensable elements does or should take precedence and why. Given that the aesthetic (or aesthetic concern) is always necessary, though to what extent, and whether dictated primarily by form or function is disputed, the history of architectural theory can be seen as a debate between those who place emphasis on form rather than function or vice-versa. Graham (1989, 252-3) says “The philosophical dispute between the two lines of thought suggested here—function should determine form and form should determine function—is arguably the basis of the history of architecture in the last 120 years. It is around these themes that the differences between architects of the late 19th century and the Modernists are best understood.”

Whereas the arts typically are regarded as non-functionally valuable for their own sakes, Winters is saying that not only is architecture functional, but also that its aesthetic values are integral rather than incidental to its functionality. Lagueux (2004) claims that the link between ethics and aesthetics is intrinsic. Ethical problems are at one and the same time aesthetic problems for the architect. This is one, if not “the,” defining feature of architecture.

i. Form and Function

Theory and practice may diverge in addressing this question of precedence. In practice, it is clear that some architecture is concerned primarily with function and the performance of buildings however this is conceived (occupationally, economically or in terms of durability, and so forth), and some largely with form. The symbolic and mnemonic demands placed upon commemorative architecture, for instance, typically result in designs with a strong emphasis on form. Of course, the expectation that a monument or memorial convey memories and ideas about the past also entails a kind of functionalist—that is, performative—reasoning. The two elements are not so easily distinguished.

Theoretically speaking, however, the need for a far closer connection between function and form has frequently been envisioned. Some claim that buildings in which form and function are closely related make for better architecture. Those who argue that they do (Pevsner 1937, 11) see a mismatch between form and function (for example, if a commercial establishment were designed to look like a church) as a deception, a fraud, or perhaps morally uncertain. Graham (1989, 252) claims that while “copying of styles and the extensive use of facades” may in a sense be deceptive, it is not immoral, and yet “other things being equal, [ideally] such deception is better avoided, if it can be.” But why its avoidance would be preferable is unclear, and views on the subject are mixed and have changed over time. According to the 19th century horticulturalist and architectural critic John C. Loudon (1822, 1013), “A barn disguised as a church would afford satisfaction to none but those who considered it as a trick. The beauty of truth is so essential to every other kind of beauty that it can neither be dispensed with in art nor in morals.”

Contrast this sentiment with Venturi’s celebration of the “decorated shed” 150 years later in Learning from Las Vegas (Venturi, Brown & Izenour 1972), or the notoriety granted Gehry with his “Binoculars Building” in Los Angeles and his collaboration with artists Claes Oldenburg and Coosje van Bruggen on the building’s entrance façade. Graham’s (and Loudon’s) view is likely rooted in a normative presupposition about the relation between form and function: that it is better to have them “unified” in some sense. Graham (1989, 252) says “a building which declares its functions openly and yet at the same time succeeds in conveying all those attributes which the use of a façade aimed to do, would be preferable.” But why and in what way it would be preferable remains unclear. Watkin (1984) thinks that this talk of deception is misguided and that in any case, architecturally speaking, there is nothing wrong with such practices.

Louis Sullivan, the architect responsible for developing the architecture of the late 19th century skyscraper in Chicago, is known for the principle—“form follows function”—around which possibly most debate in modern architecture and design has focused. Augustus Pugin, known for his work on the Gothic revival, made much the same point. He argued against architectural features unrelated to the purpose of a building. For Sullivan, the principle was metaphysically grounded—a kind of law of nature that was normative.

“Form follows function” was seen by some as an inviolable principle offering unique design solutions. It is closely associated with modernist architects early in the 20th century and later. Adolf Loos famously denounced building ornament as a crime for it was superfluous to function and consequently immoral. Frank Lloyd Wright, Sullivan’s assistant at one time, also adopted the principle. The debate on just how the principle is to be interpreted and applied, as well as its validity on any interpretation, is ongoing.

Granted that the proposed use of a building will naturally influence its form (its design), the idea that “use” determines the form seems far too prescriptive. After all, many quite disparate forms might equally well serve a building’s function. The idea that any particular architectural design—no matter how fitting—is more or less uniquely dictated by the precise function can be little more than a retrospective and wishful justification for one’s own design choices. The dictum “form should follow function” is more likely to be the case, or be the case to a far greater extent, with engineering or industrial design (for example, a fuel injector or a heart valve) than with architecture. Questions remain. Just how is one to justify the dictum “form should follow function”? Is it a metaphysical or a normative ethical principle and/or principle of aesthetics? Is it some other kind of irreducible architectural principle? Is the dictum’s justification meant to be logical, rational and/or affective, or in some way grounded in experience or phenomenology?

ii. Affective Form (Function Follows Form)

Le Corbusier’s modernist architecture seeks to create, influence, redefine or even determine the functions that architectural shapes and spaces (not simply “buildings”) are used for. For Le Corbusier, architecture is art. However, the idea that form can influence or even determine function and thereby shape human behavior and communities makes the architect more than, and other than, an artist. It makes the architect a social engineer and planner of sorts as well as likely a moralist and visionary (albeit not necessarily a good visionary). As Le Corbusier conceived it, the architect is, to various degrees, able to control the uses that designed space is put to—how occupants move in such spaces, how they live in them—and perhaps even the kinds of thoughts and inclinations they have as a result of experiencing designed space in certain ways. Needless to say, questions remain regarding the extent to which Le Corbusier was successful. Not everyone liked the results.

The idea that building form affects its occupants, physically and/or mentally, is not new. Different metaphysical systems postulated how this works long before the terms “form” and “function” became part of the modernist vocabulary. For Renaissance humanist architects, for example, relations of resemblance or similitude embedded in neo-Platonic doctrine suggested that church domes be designed to emulate the vault of heaven, while belief in a force of sympathetic attraction accounted for why the eye is drawn upwards upon entering the sanctuary. By comparison, in the early 19th century Loudon and his contemporaries turned to then-current theories of “associationism” to explain why buildings should look certain ways. (A barn should look like a barn, a church like an ecclesiastical building—particularly a Gothic one according to the aesthetic predilections of the day.) Arguably, the “form follows function” equation is the more empirical and deterministic, largely behavioral and socially normalizing, successor to associationism.

Le Corbusier was principally concerned with domestic habitation (housing). His architecture was not intended to service preconceived ideas about what such habitation should be, but to create new and as yet undetermined possibilities for living. The modernist realizations of these undetermined possibilities for habitation were often considered failures. Judged “ugly” or not, iconic modernist building types were found to be unpleasant to live or to work in. High-rise blocks of flats and urban housing estates like Pruitt-Igoe in St. Louis (famously dynamited in 1972, only 16 years after its completion) were routinely derided as “eyesores” for their monumental scale and visual monotony and condemned for their crowdedness, exposure to uninvited surveillance and characteristic disrepair. Architectural historian and critic Charles Jencks (1977) famously described the demolition as the day modern architecture officially died.

However, refusing to unthinkingly yield to preconceived and possibly worn-out notions of habitation can hardly be said to lead to the conclusion that a building’s function follows (or should follow) its form in some highly determinate manner. Likewise, a suburban estate clothed in neo-traditional garb, with encircling gardens, front porches and pathways for pedestrian interaction, is no more guaranteed to produce community than a block of council flats, however well-designed either may be. One can read Le Corbusier’s dictum that “the house is a machine for living in” as a useful prompt for thinking about possibilities for and limitations on architecture’s capacity to provide for human wellbeing and social vitality.

The affective capacity of architecture is the driving idea behind modernism—one that still exercises considerable influence. This can be seen particularly in the formally inventive work by Gehry, Zaha Hadid and others of contemporary architecture’s best-known designers. However, the idea has had a far more lasting influence in other design disciplines such as city planning. Thus, it is claimed that certain kinds of public spaces enhance democratic (or totalitarian or socialist, and so forth) values while others are constructed to promote alternative sets of values and ways of thinking and behaving. The broad claim is that the function as well as ethos (moods, motivations, ethics) of buildings, parks, neighborhoods or entire cities follow from their design (form).

It is easy to see how design, choice of material and planning (form) may enhance or curtail certain values and functions. But just as in the case of those who claim “form follows function,” only an ideologue would claim that function is determined by, or should be determined by, form and form alone. Innovation of form may influence, expand and otherwise alter our understanding of function in some cases, but this still falls short of strict adherence to the direct causal connection implied by Le Corbusier’s mechanical analogy. A parking lot must have a place to park cars and a laundromat a place where clothes can be cleaned. Whether one is building a home, factory or an airport, the building’s intended function needs to be taken into account. But if such function does not follow from its form alone, then any dogmatic (absolute) interpretation of Le Corbusier’s version of modernism will fail.

Furthermore, there are various kinds of built environments, communities, dwellings, public spaces, and kinds of cities that people enjoy or dislike (often at the same time). None may be intrinsically better than any other when judged by a single criterion, whether that criterion is “form follows function” or vice versa. Some may suit the “well-being” of particular individuals better than others. “Which is better, Paris or London…the city or the country?” are questions that have no determinate answer unless one regards them, as one should, as questions about preferences. Certain goals of development and the realization of one kind of community or public space will often preclude others, although they need not.

Graham (1989, 255) says “a style of architecture which satisfies both functional and aesthetic considerations and has a greater unity is intelligible as an ideal, and one to which many generations of architects have aspired.” It is not difficult to see why architecture that satisfies both kinds of considerations in some unified way, and there may be various equally fine ways of doing so, is desirable. To see this as an ideal is to see aesthetics as intrinsic to architecture (and design). But this ideal is incompatible with the ideology on either side of the form versus function debate. Architecture that achieves such unity will have succeeded in embedding its function in its form and expressing it by means of its form. The ability to achieve this in various ways and to various degrees is an essential part of the architect’s (designer’s) capability qua architect.

However, just how and to what extent buildings can convey meaning or ideas regarding function, as opposed to having meaning and ideas attributed to them, is controversial (Whyte 2006; Graham 1989, 256). For instance, Parsons and Carlson (2008) claim that aesthetic judgments of buildings depend directly upon satisfying functional requirements. Nonetheless, even when buildings appear perfectly suited to their purposes (for example, churches, sport stadiums, schools, prisons) they do not entail meaning and value independent of context and associations accrued through time and place. Meaning and value are virtually always contextualized.

iii. Architecture as Means to Social Engineering

Social engineering (the “possibility of making society”) and physical determinism (influencing or determining human behavior through space) are ideas that preceded Le Corbusier. They have been deeply embedded in modern design and urban planning from the start (see Lawhon 2009) and they continue to be influential. An important supposition underlying such ideas is captured by David Brain (2005, 233) who says, “In the context of the urban landscape, every design and planning decision is a value proposition, and a proposition that has to do with social and political relationships.” More recently these issues have re-appeared in the debate on “New Urbanism” as well as with self-reflective questions concerning the nature and aspirations of contemporary architecture and planning.

New Urbanism is a movement codified in the Congress for the New Urbanism’s (CNU) charter (Leccese and McCormick 2000) and identified by a set of 27 principles and evaluative ideas about how cities, particularly suburban cities, should be organized. The CNU sees architecture as the means to social engineering, making for genuine community. The appeal to “community” is ubiquitous in contemporary architectural discourse, partly because notions of “community” are often invoked as a justification by practitioners on behalf of favored design practices. New Urbanism is hard to pin down because just which projects meet or fail to satisfy the CNU’s principles is disputed. Many of the movement’s aims are clearly aligned with late 20th century showcase communities like “Seaside” and the Disney Corporation’s “Celebration” in Florida. The term has been applied retrospectively to the post-World War II planned suburban community “Levittown” and additional developments in Pennsylvania and New York built in the late 1940s and 1950s. New Urbanism aims to provide an alternative, a remedy to suburban sprawl and urban decay, and to bring about much needed social and political change through design and planning. In this regard it shares features with the anti-sprawl “Smart Growth” planning movement.

The CNU charter proposes that cities and towns should “bring into proximity a broad spectrum of public and private uses to support a regional economy that benefits people of all incomes” (principle 7). Proximity requires that “many of the activities of daily living should occur within walking distance, allowing independence to those who do not drive, especially the elderly and the young” (12). The movement’s followers promise to design civic buildings and public gathering places that “reinforce community identity and the culture of democracy” (25). Such principles demonstrate New Urbanism’s self-conscious concern to bring urban planning into line with certain ethical (including social and political) standards and values. These are the values that its charter delineates as consonant with what democracy, social justice, and more generally “human flourishing” require in a contemporary urban environment. This objective calls to mind Giedion’s and Harries’s view of the architect as, in equal parts, social visionary, political provocateur, and savior. It sees the principal task of the architect or planner as one of interpreting and helping to build, in Giedion’s terms, “a way of life valid for our time.” More pointedly, New Urbanism illustrates Lagueux’s (2004) contention that architecture and ethics are indissolubly joined.

This raises the question as to whether these aims can be realized unless democratic institutions and a foundation of social and economic justice are, to a degree, already operative in the spheres where decisions about development take place. Without such a foundation, development remains in the hands of the kind of “conventional development regime” that Brain (2006, 18-19) cites as (partly) responsible for urban sprawl and inner-city decay in the first place. This regime is constituted by “an interlocking system of financing formulas, measures of market feasibility, product types, zoning categories, environmental impact assessments, and routinized planning practices” that make it virtually impossible to undertake projects “that don’t fit standardized categories.”

Those theorizing the nature of urban development often insist that planning be used in ways that enhance and shape the democratic character of the city. But this is not an easy thing to define, as if democratic values were simply decided by consensus—doubtful, given that the “tyranny of the majority” is always a possibility. Moreover, how can the manipulation of the physical environment through architecture contribute to the inculcation of democratic character and values? This has been the most significant question for urban planning since its inception. It is a question only partly about technique.

Physical determinism addresses an important aspect of social engineering in relation to planning in asking how, and to what extent, certain values and ways of life (human behavior) can be inculcated through the physical (planned) environment. In city planning and urban design, physical determinism is underscored by the belief that human behavior is determined by environment. It “implies that the design influences residents’ behavior according to some pattern desired by the designer” (Lawhon 2009, 14).

Sociologist Herbert Gans describes the “fallacy of physical determinism,” which addresses a central aspect of “rational goals-means determination” in questioning the extent to which goals made explicit in the design process can be realized by the physical instantiation of a design. He explains (1968, vii) that “Planning is a method of public decision-making which emphasizes explicit goal-choice and rational goals-means determination, so that decisions can be based on the goals people are seeking and on the most effective programs to achieve them.” However, problems arise when there is a lack of explicit goal-choice, or when some superficially well-defined goal-choices turn out to be nebulous. Insofar as goal-choices involve evaluative and interpretive concepts (for example, “democratic values” or “security”)—concepts whose meanings vary widely relative to ethnic, economic, political, religious and other social groups—what may seem like clear choices may gloss over deep divisions.

The fallacy of physical determinism is meant to question the link between physical design concepts and social outcomes. Gans argues (1968), for example, “that the social homogeneity [race and income] of residential areas based on the neighborhood unit was the chief reason for the success of these neighborhoods and that physical determinism was not a chief determining factor in how successful neighborhoods actually were in forming cohesive, stable units” (Lawhon 2009, 13). But Gans’ criticism, narrowly interpreted, is a broadside that misses the significant claim (that geo-spatial environment does affect behavior) by attacking its haplessly over-generalized cousin (geo-spatial environment determines behavior in a quasi-metaphysical sense of determinism). A severe economic recession can bring down even the most successful and socially homogenous residential area, and no amount of urban planning is going to bring about happily integrated communities of different socio-economic and racial backgrounds, with the kinds of race and class divisions that existed in many cities in the near past and that still largely exist in most cities.

Given that very few design professionals hold the kind of strong determinism that the fallacy of physical determinism applies to, it is not this aspect of planning that needs to be queried. The challenges are instead twofold. First, it is the notion of “desired social outcomes” that requires and has received attention. But to reiterate, this part of the problem, that of articulating an adequate and justifiable goal-choice, is an issue that is not wholly, not even primarily, architectural. It is ethical. Jane Jacobs, in her classic critique of 1950s-style rationalist planning, The Death and Life of Great American Cities (1961), understood this, and it was this that enabled her to redefine the relation between physical design concepts and desired social outcomes.

The second challenge is fundamentally design based. Given that behavior and thought are affected by environment, the question for planning professionals is how to construct an environment in ways that help affect (not determine) behavior: that help to inculcate desirable values (for example, democratic and other social values), and that are also responsive to the values, at least some of the values, of the inhabitants. The question, at the center of modern architectural planning, remains “what is the nature and character of a range of likely effects (plural) and how do these engage with ethics?” The notion of community has had a contested part to play in New Urbanism, but so too have other central social and political ideas that relocate the principal focus of architecture in relation to philosophy from aesthetics to social and political philosophy and ethics. The Aristotelian notion of what it is to “live well” (human flourishing) has become closely connected to questions about how the built environment can either enhance or detract from a virtuous and otherwise “good” life (compare Ballantyne 2011 and Winters 2011). From this perspective, the goal and “art” of architects and other design professionals is to enhance the “good” life by adhering to established design principles, while also inventively suggesting ever “better” ways of living.

This heightened connection between philosophy and architecture—practical as well as theoretical—involves both an enlargement and reconfiguration of what we take architects, design professionals, and even engineers, to be and to do. Architects in particular, most noticeably the icons of 20th century architecture, have embraced and promoted themselves not only as arbiters and promulgators of taste (an aesthetic function), but also of value: as visionaries capable of addressing fundamental social and political issues, even spiritual ones (for example, national identity and aspirations) through innovative design in ways that others are simply unable to do—an ethical function bordering at times on the salvific.

Seen from the perspective of the design professional, living up to such a new understanding of their role may seem daunting. Design professionals are first and foremost just that. They are not, on the face of it, ethicists, nor does it seem that they need to be politically active or concerned in their own right, in order to conduct their professional lives. The issue then is whether it is only their expertise as “technicians of space” that is required, or whether “architecture” now also implies that practitioners engage with places as designers and citizens—both with a broad understanding of ethics, social philosophy, and so forth.

b. Architecture and Aesthetics

The principal, though not sole, question concerning architecture in relation to aesthetics is whether architecture, or at least some architecture, is art. Granted that at least some architecture is art, then issues relating to the connection between architecture as an art form and ethics can be and have been raised. Likewise, architecture’s concern with ethics is highlighted when asking “Is architecture an art form?”

The question seems to be of more concern to those interested in philosophical aesthetics than to either architects or architectural theorists. Nevertheless, it is central to the philosophy of architecture. Just how the aesthetician or architectural theorist responds to the question is determined by their particular accounts of what a work of art is, or their ontology of art—if they have one. If an artwork is characterized as necessarily non-functional, then there would be reason to exclude virtually all works of architecture as objects of art.

Still, one can deny that architectural objects are objects of art while maintaining that there is or should be an aesthetic dimension to architectural objects. Works of architecture can be judged on aesthetic grounds in accordance with aesthetic standards of one kind or another—though arguably they either cannot be or should not be judged on aesthetic grounds alone—without thereby being regarded as art objects proper. It is pointless, however, to deny that some buildings are “beautiful” or that they may engender an aesthetic experience, leaving aside how such an experience is to be understood.

Some architects may regard architecture as an art form. But for those that do, the reason has less to do with a preconceived idea of the nature or ontology of art, than with understanding such an assignation as honorific in some sense. If architectural objects can be art objects, then architects must be artists, along with demonstrating whatever else—technical skill for example—that may be involved in being an architect.

From the perspective of architecture as an applied practice rather than the philosophy of architecture and aesthetics as scholarly disciplines, the question of whether architecture is an art form, and buildings objects of art, could be seen as resting on an ambivalence between art or being an artist on the one hand, and being “artful” and showing due concern for enhancing the aesthetic aspect of the built environment on the other. The O.E.D. defines “artful” as “Displaying or characterized by technical skill,” or “That [which] has practical, operative, or constructive skill; dexterous, clever.” Thus, not only can surgeons and architects be artful, but so too can cooks, car mechanics and thieves.

At times, when a level of “artfulness” displayed is of a very high or remarkable standard, it might be and sometimes is said, that the product is a work of art. Julia Child was an artist in the kitchen, much in the same way that skillful and inventive surgeons might be “artists” in the operating room, teachers in the classroom, and certainly hairdressers in the salon, and so forth. And, although there is undoubtedly an aesthetic dimension in cooking (as well as an architectural dimension if a kind of structure (form) or performative value (a function) is manifest by a certain dish), the aesthetic appears to play a particularly essential role, at least as a desideratum, in architecture.

In his essay “Is Architecture Art?” Davies (1994, 37) never questions whether buildings can be artworks. He says, “it seems obvious that many works are uncontroversial both in being buildings and works of art.” The issue, however, is not whether they are so acclaimed but whether they should be. Those who proclaim such buildings as artworks do not necessarily rely on some articulated and defended notion of art. Their acclaims appear to be largely honorific; another way of saying that such buildings are beautiful and remarkable. It doesn’t necessarily follow that everything that is beautiful is a work of art. Davies’ concern is rather with what kind of artworks they might be, with their resemblance to some kinds of artworks but not others, with the role of the designer’s intentions in determining artfulness, and with technical virtuosity, site, and culturally specific contexts underpinning their status as art objects. He asks: are buildings that are artworks “singular as are hewn sculptures, or instead admit of multiple instances (as do cast bronzes, novels, symphonies, and the like)” (1994, 43)?

The claims Davies sees as uncontroversial—that architecture (buildings) may be art; that some specific buildings are artworks, and that some architects (and then only sometimes) are artists—others maintain are confused or mistaken. The view is that such claims carelessly, albeit at times on theoretical grounds, conflate the aesthetic dimension of architecture with art. The last claim is mistaken in particular for seeing architects not as “artful” practitioners, but as artists. Architects may artfully design buildings and houses that enhance the lifestyles and values of their occupants or even suggest new and alternative ones. They may design spaces that promote democratic values, sociability and neighborliness, and workplaces that are particularly well-suited to the specific needs of workers. But in so doing they are practicing architecture—applying their skills—rather than functioning as artists. Even where aesthetic concerns are predominant, it may just be a way of talking to call their products works of art.

c. Architecture and Ethics

Architecture’s concern with ethics is perhaps more clearly highlighted when asking about its relations to social and political concerns. What does this tell us about the knowledge and discipline of architecture?

Works of architecture—not just great or iconic works, but those where design is manifest in practical concerns—are also aesthetic achievements. A well-built house, for example, is not a bunker but potentially a home—where the notion of it being a home has an aesthetic and moral valence that ideally contributes to the well-being of its inhabitants. Architecture is often judged in terms of aesthetic and technical, rather than moral, criteria. Yet, the view that judgments based on aesthetic criteria are independent of those based on moral criteria has a history of being challenged. The idea that the aesthetic value of an art work, including architecture, is independent of moral considerations, and so should properly be judged apart from such considerations—the view in philosophical aesthetics known as “aestheticism” or “autonomism”—is perhaps more easily disputed in architecture than in any other aesthetic endeavor. Even on the face of it, architecture impacts our daily lives in ways that are morally significant. Architecture’s concern with aesthetics is mediated in ways that, according to some, make it an essentially ethical discipline.

As we have seen, various understandings of the relation between form and function already contain ethically normative precepts. Adolf Loos’s functionalism, as implied in his claim about ornament and crime, is ethically as well as architecturally grounded. While some architectural theory remains focused on Vitruvius’s elements and the relation between form and function, contemporary discussion about the relationship between architecture (including landscape architecture, and other planning and design professions) and ethics (including social and political philosophy) has refocused the discussion in different terms.

Thus, Lagueux (2004) argues for an intrinsic connection between architecture and ethics, distinguishing this connection from art forms and professions in which, he argues, any connection with ethics is extrinsic. He claims that architectural problems are, at one and the same time, ethical problems and that the two, being intrinsically related though not identical, must be solved at the same time and in the same way. This alleged connection between architecture and ethics may be seen to be a reformulation or evolution of the Vitruvian problem, where the notion of function or utility (or essential function) is interpreted as irreducibly ethical in part, and the “ethical” is understood to include judgments about value—about what is “good” as well as about what is right.

Even if it is true that interventions in the urban landscape have ethical implications as Brain believes, this would not necessarily substantiate Lagueux’s claim that architecture should recognize its inherently ethico-political character. The two sets of problems might best be kept separate and, to a degree, resolved separately. Nevertheless, in practice there may be reason to believe he is right, even if no sharp distinction can always be made between architecture and other disciplines (for example, medicine and biology) as to whether ethical considerations are intrinsic.

Lagueux’s claim regarding architecture and ethics as opposed to other disciplines may seem implausible. Medicine, for example, inevitably confronts its practitioners with practical moral problems and dilemmas that must be considered in relation to the concrete details of the situation. Lagueux does not deny this, but claims that such moral problems remain moral problems and that there is no fundamental or intrinsic connection between the medical and ethical aspects of the problem. One might, for instance, bring in an ethical specialist for advice—as indeed is often the case. Lagueux needs to explain why he sees architecture as having resources for dealing with the moral issues it raises that medicine lacks. If doctors not trained in ethics cannot deal with the issues, what qualifies similarly untrained architects to do so?

Lagueux would say that insofar as architects are not ethically competent, they are also not architecturally competent. In other words, unlike the case of medicine, ethical and aesthetic problems are linked in such a way that ideally they must be resolved at one and the same time—even in the absence of any unique solution. Insofar as architecture (or the architect) does not have resources for dealing with the moral issues it fails as architecture. Lagueux’s claim is that, unlike the case of medicine, the architect qua architect requires ethical training because they cannot practice architecture without it. He does not also claim that architecture is unique in this respect as against all the other arts (for example cinematography), in that it alone has ethical and aesthetic issues intrinsically linked. In any case, even if one denies that Lagueux’s claim is universally true, one might accept it as characteristic of architecture.

Since Lagueux sees aesthetics and ethics as intrinsically connected in architecture in a way they are not in other disciplines, architecture for Lagueux is characterized by the way it presents the practitioner with ethical problems linked to aesthetic ones. For example, the placement of windows and doors in a building should satisfy both aesthetic considerations, such as pleasing views, and ethical ones, such as due concern for neighbors’ privacy. A more complex example would be designing a public atrium as part of a corporate complex, where due consideration is given on the one hand to its utility as publicly accessible space (responding to the needs, desires and values of those who inhabit and traverse the space) and on the other to the perhaps conflicting concerns, or incommensurate values, of those inhabiting neighboring work environments. A more abstract case yet might be the construction of a public space, such as a park or a square, designed to be aesthetically pleasing but also, by means of its design, to promote certain civic and democratic values.

3. Philosophical Movements and Ideas in Architecture

As a field that engages multiple disciplines, philosophy of architecture can be aligned with currents of inquiry seen across the humanities. Before Pevsner penned his 1949 editorial in Architectural Review and to a greater extent since the 1960s, architectural theory has provided a philosophical gloss for architectural criticism, design practices and education. Much of what counts as scholarship on architecture has come to resemble a history of philosophical ideas. The changeable terrain and contingencies of practice have resulted in a continuing critical reappraisal of the discipline’s terms and of its intellectual and aesthetic traditions (including the Vitruvian triad and its legacy). Theory has been informed to a large extent by continental European philosophy. Movements such as German idealism, phenomenology, structuralism and post-structuralism, the Frankfurt School, neo-Marxism, psychoanalytic theory, and feminist and deconstruction (literary) theory have found an audience among architectural historians, theorists and practitioners at the “cutting edge” of design.

Arguably, the autonomy of architecture, like other creative arts (for example, film), is made questionable by what some have described as the indeterminate, mixed or “hybrid” character of the discipline and by the critical writing architecture attracts. This includes theory that treats creative disciplines as primarily demonstrative of philosophical truths, rather than productive of ethical insights into the human condition. In his effort to provide a more comprehensive account of the field, Andrew Benjamin, who has written extensively on architecture and the continental tradition, proposes to “think the particularity of the architectural” and devise a uniquely “architectural philosophy” (2000, vii).

Whether Benjamin’s undertaking, or any other, has provided the kind of framework for the philosophy of architecture that Pevsner desired, is open to question. Much depends on how “philosophy” is itself understood and where one stands in relation to history, theory, or practice. Adopting an “honorific” conception of philosophy, for instance, privileges its modes of interrogation as the means to clarify and adjudicate claims of truth arising in these areas. Another view, common in schools of architecture and shared by practitioners seeking intellectual rigor for their work, requires that a “philosophy”—guiding principles or a theoretical exegesis—accompany each design project.

a. Idealism and Architectural History

Idealism, specifically the movement with origins in late 18th and early 19th century German philosophy and bearing the imprimatur of Kant and especially Hegel, is significant for treating works of architecture as objects of our consciousness, their meaning and value being variable, though ultimately determined by the mind’s responsiveness to the material world. As McQuillan points out in his article on German Idealism, the movement is remarkable for its systematic treatment of several philosophical disciplines, including aesthetics, to which one can add art history which followed as a recognizable discipline later.

Architectural history is largely an offshoot of art history. German idealist historians writing in the mid- to late 19th and early 20th centuries (Schnaase, Semper, Wölfflin, Warburg and others; see Podro 1982) contributed much to the formation of art and architectural canons. Critical historiography on architecture developed alongside Hegelian notions of Zeitgeist (the spirit of the age manifest in art forms) and Weltanschauung (the notion that art represents a people’s worldview). Philosophical debate on the nature of architecture was given impetus by the comparative analyses fostered by this tradition and by the view that art forms could be categorised according to their purported capacities to manifest universal truths.

The influence of Hegel and idealism can be seen in Pevsner’s writing on the origins of the modern movement, notably in Pioneers of the Modern Movement (1936). In this seminal text, developments in architectural form manifest an emerging functionalist aesthetic and spirit indicative of the modern age. There is a strong sense of historical determinism behind this movement. Hence, in Pioneers there is dramatic language of “stages being set,” of heroic architects “appearing on the scene” and of designs “ahead of their time” (122, 132, 136). Historical determinism poses a particular challenge to expectations of an architect’s autonomous control of a work and to the capacity of a cohort of avant-garde architects to initiate a new direction for contemporary design. Idealism’s legacy is perhaps best seen in its contribution to subsequent philosophical movements (like Husserl’s phenomenological idealism) and in the broad expectation that art and architecture contribute to understanding the historical moment.

b. Phenomenology and Architectural Experience

The particular nature and significance of architecture has often been discussed in terms of ways that buildings (or some of them anyway) can be experienced. Among philosophers and architectural theorists and designers, there is the broad expectation that different types of buildings, and public and private spaces, engage human perceptions and feelings in ways that both shape and are shaped by patterns of human behaviour and self-consciousness. There is a corresponding and overlapping set of interests, expressed within and outside the academy, questioning how cities allow for distinctive forms of “urban experience” or how certain kinds of public or monumental architecture or “heritage” precincts make for an experience of history that is distinctive, stimulating, and productive of good citizenship. Social, political and ethical contexts for architecture and urban design are raised by such studies as well as others.

From the perspective of moral philosophy, a subset of aesthetic concerns focuses specifically on what an “aesthetic experience” of buildings might be as a means of grounding claims of value. For instance, Michael Mitias (1999) proposes that an “adequate analysis” of the experience of architecture is possible and that this is the “safest road” to a reasoned understanding of what architecture is about, to evaluating it, and to finding principles of education in architectural aesthetics. He asks:

Under what theoretical and perceptual conditions it is possible to experience, appreciate and evaluate a building as an architectural integrity, in its own terms, without appealing to, or relying on, an external or implied philosophical, ideological, political, or social agenda? (61)

Accounts of what an experience of architecture may be gain conceptual rigour in the context of phenomenology. This has been understood as either a disciplinary field in philosophy alongside other studies like ontology and epistemology, logic and ethics, or as a more specific movement in the history of philosophical ideas informed by, among others, Edmund Husserl and Martin Heidegger, Maurice Merleau-Ponty and Gaston Bachelard. Phenomenology studies the “appearances of things, or things as they appear in our experience or the ways we experience things, [and] thus the meaning things have in our experience” (Smith 2003). When studying buildings, particularly for their existential and transcendental value, the phenomenologist emphasises the subject and the subjective or first-person view of architecture as a condition of conscious awareness. In the work of Christian Norberg-Schulz (1980 [1979]), phenomenology is concerned with the concept of the “genius-loci” whereby the distinctive character or spirit of a place is reinforced by patterns of human settlement and acts of building and dwelling. Urban form, architecture and contrived landscapes that aim at “place-making” elicit a similar concept.

Phenomenology is an influential movement in architectural theory, though its interpretation and application are far from univocal. Its proponents vary in their commitment to its key terms and thinkers, and take its applications and implications (tendencies like transcendentalism or existentialism) in different directions. For Alberto Perez-Gomez (1983), for instance, transcendental phenomenology informs a particular perspective on modernism, supporting the contrast of creative “poiesis” and meaning, on the one hand, and architecture’s representation as plans and drawings (along with its rationalised construction and role as consumerist object), on the other. Theorists who contribute to this and parallel lines of thinking include Juhani Pallasmaa, Dalibor Vesely and Karsten Harries. Architectural practitioners like Steven Holl and Peter Zumthor cite the influence of phenomenology upon their designs.

Phenomenology is influential on architecture, though it provides no clear and categorical definition of architecture. This is partly because there are social grounds for experiencing buildings and semantic considerations that characterise architectural aesthetics according to cultural differences, including discriminations between “high” and “low” art. Whether the function of buildings makes for a different kind of experience from the pleasure derived from their beauty or perception of any likely “architectural integrity” they may have is also at issue. So too is the possibility there are conditions that make for an experience of “bad” architecture. Consider whether places like detention centres can be improved by designing with the genius-loci in mind.

c. Structuralism and Meaning

Widely attributed to the pioneering work of Ferdinand de Saussure, structuralism was a movement introduced into a number of academic disciplines in the 1950s and 60s. It was an outgrowth of interests in linguistics, semiotics, and allied studies of language. It was influential in anthropology, with work by Claude Lévi-Strauss. Structuralism’s subsequent appeal for architectural theorists was largely due to its promise of a more philosophical, systematic or “scientific” framework for what had long been presupposed (some believe since the Renaissance; others since Vitruvius): that architecture was akin to language and that, like written text, architectural form exhibited a grammar-like structure for conveying meaning. According to this reasoning, material details (classical orders, ornament, and so forth) of buildings or series of building facades are conceived as metonymic wholes, possessing semantic content and conceivably ethical worth (valence) for communicating meanings and values within social formations and from one generation to the next. Victor Hugo more or less espoused the idea in Notre-Dame de Paris, where he bemoaned the arrival of the printing press and cheaply reproduced books. He counterpoised the fluidity and unreliability of the written word with the heyday of architecture in the form of the gothic cathedral, on which he believed meanings were artistically manifest in stone, thus acquiring greater permanency and social relevance.

Linguistic structuralism did not so much promise a philosophy of architecture as require that the study of architectural aesthetics conform to the model and ideal of language and adhere to what amounts to an empiricist conception of knowledge. Structuralism’s methods worked to establish fundamental oppositions (i) between architectural form and function, privileging the communicative capacity of architectural aesthetics over a building’s other performative roles (as structure, shelter, or its function as a commodity, and so forth), and (ii) between architectural form as a category of signifiers and a largely pre-existing context of potentially meaningful artifacts, signified entities or referents.

Accordingly, Umberto Eco (1968) effectively recast the Vitruvian terms of form and function as elements in a culturally-grounded system of architectural signification, thereby denying the precedence and determining influence the modernists gave to one term over the other:

In other words, the principle that form follows function might be restated: the form of the object must, besides making the function possible denote that function clearly enough to make it practicable as well as desirable [emphasis in original], clearly enough to dispose one to the actions through which it would be fulfilled. (186)

Eco moves to distinguish between primary (denotative) and secondary (connotative) functions, neither more important than the other, but each dependent upon the other to form a “semiotic mechanism” (188). Hence, the form of either a barn or a church allows them to function as habitable spaces of a kind (their primary function) and these forms denote this purpose. Their doors “tell” us there is space inside; their windows “tell” us there is light with which to see and so forth. The combination and arrangement of building details work alongside cultural codes to connote (their secondary function) that the first building type, the barn, is just that, merely a building, while the second possesses architectural significance. Roland Barthes complicates the idea that architectural signs are composed by the one-to-one correspondence between signifiers and signifieds. In “Semiology and the Urban” (1971) he emphasizes the transience of urban life so that meanings are not fixed by such a correlation, but temporary and mobile.

Among architectural theorists and practitioners, renewed emphasis in the 1970s and early 80s on the meaningful interpretation of architectural and urban typologies (the classification and comparison of the formal and visual characteristics of building types and urban forms) reinforced the linguistic model. Reyner Banham (in Baird and Jencks, 1969, 101) rejected the move, believing that arguments in support of architectural semantics were merely promoting a new ideology of monumentality in the service of social elites rather than a more rational formalism and egalitarian (that is, functionalist) approach to design. Contributions to the debate over meaning versus functionalism in architecture were published in the first book in English on the subject, Meaning in Architecture (Baird and Jencks, 1969). Additional titles promoting the language of architecture appeared in quick succession, including Venturi, Brown & Izenour (1972) and Jencks (1977). Arguably, Banham’s functionalism and egalitarianism were pushed aside in preference for the stylistic eclecticism and populism allowed for in these books.

Borrowing from Noam Chomsky’s linguistics, Peter Eisenman began a series of experimental projects in the 1970s. These were primarily small houses designed with highly complex forms and models resembling abstract geometric compositions. Though the projects were often accompanied by equally complex theoretical exegeses, Eisenman nonetheless believed that his viewers were able to understand their meaning as they were purportedly derived from the same linguistic and syntactical structures used to express everyday thoughts. The architect-theoretician tried to relate formalism and linguistics logically, distinguishing between meanings that were semantic and those that were syntactical or integral to architecture’s coherence as an object. For Eisenman, formalism was the displacement of the semantic content of a design with the syntactic. The promise of freedom attributed to this displacement underscored Eisenman’s desire to create architecture that was autonomous and free from external constraints arising from pre-established meaning and practical necessity. His view of the “paradoxical nature” of architecture prefigured his subsequent interests in deconstruction and theories of conceptual and “cardboard” (unbuilt) architecture. This includes architectural drawings and plans for projects that may never be built or could not be built.

Structuralism is largely appraised today for the movements that followed and perhaps were reactions to it, variously assembled under the banners of “postmodernism” or “post-structuralism.” Its demise was perhaps due in part to the cumbersome vocabulary developed to describe systems of signification (de Saussure’s distinction between langue and parole, the division of “signs” into “signifiers” and “signifieds,” Eco’s denotative and connotative functions, and so forth). Questions also arise about the reality behind these terms and equally obscure concepts like Eisenman’s “wellness.” While the vocabulary and concepts might provide the theorist with a framework for describing architectural meanings, they are also largely ahistorical and overly formulaic. Structuralism leaves us with the question of whether the so-called “paradoxical nature” of architecture as a system of signification can be reconciled with its determination by, and determining influence on, power and politics.

i. Postmodernism

Postmodernism drew on heterogeneous writing, principally by Jean-François Lyotard and Jean Baudrillard, and was popularized by architect-critics Charles Jencks and Charles Moore; its underlying aims, scope, and methods are subject to considerable debate and contestation (Habermas 1982; Jameson 1991). Although the movement defies easy description, Hal Foster, in The Anti-Aesthetic (1983), identifies two distinct and opposing strains of thought behind postmodernism’s claims. Together, they account for the equally imprecise and ambivalent position of the movement in the history of ideas about architecture.

On the one hand, postmodernism was a reactionary movement; it encouraged opposition to certainties that grounded modernism and modern architecture; it challenged the idea that social progress was adjunct to rational design, for instance, or that building form was relatable to function in a pre-determined way or that any epistemology like semiotics could fully encompass the fluidity, ambiguity and impermanence of meaning. This variant of postmodernism accepted the status quo and rejected, notably in work by Jencks and Moore, the “high art” status of International modernism. It embraced populism based on architectural aesthetics characterized by historicist motifs and bricolage. On the other hand, postmodernism can be seen as a critical stance towards modernism that sought to reappraise its claims to truth, as well as reinforcing, perhaps indirectly, the semiotician’s undertaking to provide a more thorough account of architectural meaning.

Along with joining the chorus of scholars asking “What was postmodernism?” it is worth standing back from the particular claims of its leading figures and examining how philosophical movements such as this have engaged “the question of history” (Attridge and others 1987) and have utilized (or eschewed) forms of historical investigation to produce insightful architectural criticism or novel design styles. One can investigate how philosophical concepts are appropriated and possibly misinterpreted by practitioners when the practical demands of clients and corporate patrons intervene or when architectural media weigh in with a market for design novelty and appealing visual imagery.

d. Post-structuralism and Power

Post-structuralism is another interdisciplinary movement that emerged in the 1970s and 80s as an extension and critique of structuralism. Its multiple strains are no more easily characterized than postmodernism. Post-structuralism is associated with writing by Michel Foucault, Jacques Derrida, Julia Kristeva, Gilles Deleuze and other continental philosophers. Derrida’s work on deconstruction further popularized textual analyses for studies in the arts and humanities. Inspiring much architectural criticism and coinciding with highly publicized projects like Bernard Tschumi’s competition-winning scheme for Parc de la Villette in Paris (1982) and Zaha Hadid’s unrealized design for Hong Kong’s Peak Club (1983), deconstruction encouraged further unpacking of modernism’s traditions, particularly functionalism. It promised designers a new generative grammar based on the ambiguity, fragmentation, and collision of architectural elements in which systems of representation and habitation were recognized as fluid and contingent. However, the popular reception of deconstruction as an exciting new architectural style may have overshadowed the movement’s critical impetus to firmly position language and meaning within a social matrix. This was enlivened by the dialectic of presence and absence whereby humankind retained a measure of freedom to shape its own identity.

Foucault’s work on knowledge and power develops a key theme of post-structuralism, though he does this in a distinctive (and, for some, idiosyncratic) way using methods that challenge conventional boundaries between modes of philosophical, historical and material analyses. The uneven reception of his oeuvre among architectural historians and theorists is perhaps due to the relatively few works containing explicit references to architecture or architects. Foucault’s analysis of the Panopticon prison in Discipline and Punish (1975) is well known and inspired many studies of space, knowledge and power in the context of disciplinary society.

In one frequently cited interview, Foucault (1982) left his readers with no doubt about the limited agency architectural greats like Le Corbusier or everyday practitioners have in shaping this milieu. It is one where social engineering results not from forms that follow functions (or vice versa) but from techniques of power that engage multiple levels of human experience (material, conceptual and authoritative, and possibly others):

After all, the architect has no power over me. If I want to tear down or change a house he built for me, put up new partitions, add a chimney, the architect has no control. So the architect should be placed in another category—which is not to say that he is not totally foreign to the organization, the implementation, and all the techniques of power that are exercised in a society. I would say that one must take him—his mentality, his attitude—into account as well as his projects, in order to understand a certain number of the techniques of power that are invested in architecture, but he is not comparable to a doctor, a priest, a psychiatrist, or a prison warden. (247-48)

Other works by Foucault have a bearing on philosophy of architecture. In an early work, The Order of Things (1966), for instance, Foucault adopted a quasi-structuralist approach to write an “archaeology of the human sciences” (also the subtitle of the book). Amongst other tasks, he locates movements like phenomenology in a historical framework punctuated by ruptures in the representational structures or “epistemes” of Western discourse. Normally, the disciplines of ontology and epistemology would provide methods for this or similar analyses, but Foucault adopted a more radical approach; after all, ontology and epistemology were themselves forms of philosophical inquiry with histories of their own and were already complicit in shoring up the phenomenologist’s claims to truth. His ambition, to stand apart from philosophy in order to see its deepest workings, made for a story of changing relations between signs and the things they came to signify. It was a history in which new objects of knowledge appeared and old ones were lost. This allowed for concepts like “life,” “labor,” and “language” to emerge, to provide new foundations on which sciences (biology, economics, and linguistics) could be formed, and to describe the human condition. On Foucault’s account, these terms helped shape a distinctly modern framework for an understanding of humanity and, arguably, the built environment and architecture as well.

4. Selected Lines of Inquiry into Philosophy of Architecture

Topics and lines of inquiry into philosophy of architecture have engaged one or more of the preceding movements and illustrate the breadth, intellectual richness, and relevance of the field. Consider a few of these.

a. Architecture and Representation

The question raised early in this article, “What is architecture?”, invites further inquiry and positions philosophy of architecture as propositional. It also calls for consideration of the relations between the material substances and physical properties of buildings and their representation through various media. The key issue is how the materials, tools and techniques of architecture partly or wholly determine what architecture is or can be. Is designing and visualising buildings the same as thinking about their ethical or other value? Does visualising an ideal building or urban form contribute to its meaning or form part of an architectural experience?

The historical development, prevalence and popular appeal of wide-ranging media, including conventional design and construction drawings and models, new digital media and even film, have shaped architectural discourse and underscored forms of professional expertise. Ways of representing or producing images of buildings, like planimetric (two-dimensional), orthogonal or perspective drawings and, more recently, computer renderings of complex building forms, have supported various design and construction practices. These have also encouraged speculation on the capacity of architecture to embody ideas and entail a distinctive way of conceptualising the world. (For instance, Winters 2011 writes on the critical attitude cultivated by “paper” architecture, that is, architecture that is delineated but unbuilt or unbuildable.) This is a tendency that borrows reasoning from art history, studies of iconology and meaning and influential books like Erwin Panofsky’s Perspective as Symbolic Form (1927). The Euclidean character of visual space has been questioned and, by some accounts, superseded by new visual regimes that challenge conventional understanding of relations between the architectural object and the viewing/inhabiting subject. Even building diagrams, popularly caricatured as the architect’s calling card when scribbled on dinner table napkins and taken to indicate a unique kind of self-reflection, can raise questions about architecture’s intertwined practical and philosophical aspects.

Studies of architectural media have also prompted philosophical reflection on architecture as affording understanding of transcendental values and existential meaning. For some architectural theorists, for instance, plans and drawings are representations of a second order, seemingly distant from all sense of time and place. For others, new digital media not only allow for the easy visualisation, rapid prototyping and construction of novel architectural forms, but also provide insight into the human condition in an era of globalisation and rapid technological change. On the one hand, issues raised by Walter Benjamin’s much-cited essay “The Work of Art in the Age of Mechanical Reproduction” (1935) run parallel to concerns for architecture given the representation and mass-reproduction of building forms and their contribution to dominant global culture. On the other, the dynamism of fluid building forms, so-called architectural “blobs” and forms inspired by Deleuze’s interest in “the fold” or “folded” spaces (1988), promises unheralded opportunities for self-invention and social renewal.

The utility of “representation” as a trans-historical category of critical analysis (in architectural theory and cultural studies, generally) is accompanied and in some cases countered by philosophical reflection on time, temporality, and transience whereby two- and three-dimensional images of buildings possess only limited value in themselves. In architecture, these themes are evident in arguments for the essential timelessness and fundamental intelligibility of Classicism (Porphyrios 1982) that renders it more than a style, or studies describing the physical characteristics of building materials and emphasizing the meaningfulness of weathering (Mostafavi and Leatherbarrow 1993). Building age and the register of weather, organic and human factors on timber, stone and other materials can be valorised as providing the necessary conditions for Heidegger’s concept and state of “being-in-the-world” whereby the alienation of human subjects from the material world of objects is overcome. These relatively recent studies are worth comparing to 18th and early 19th century aesthetic treatises on ruination, the sublime, and picturesque, though their provenance is not wholly attributable to them.

Conversely, on some accounts, architecture comes into its own when distanced from strict demands for functionality, conventional delineation, and commonly-held meanings (Benedikt 1991; Harbison 1991). Visionary schemes set in “cyberspace,” so-called “virtual” and “unbuilt” (also “paper”) architecture are described and valued for their intellectual content, provocative appeal, and their potential to liberate communities from what is construed as the deadweight of the past, historical building styles, and the conservatism of much architectural heritage. From this follows another set of issues discussed in the literature. One is whether or not an architectural work or proposition is complete when drawings are finished, independent of the design’s construction and prior to the project’s occupation and evaluation. If wholly propositional or paper architecture possesses a kind of creative integrity, this raises questions about the necessary contribution (or otherwise) of the project’s sites and settings (physical and performative) to the design process. Is the architect first and foremost a visionary, rather than merely a technician? If so, how can communities come to understand, share, and assess the architect’s largely utopian mission?

b. Architectural Value and Heritage

Questions concerning the integrity of a work of architecture and the autonomy of the architect as a particular kind of expert or visionary correspond to those asked about other kinds of artworks. Can a work of virtual architecture, a painter’s cartoon or unfinished symphony make a lasting contribution to an artistic canon, or must they invariably be “read” as secondary in importance, interpreted according to existing representational or technical (that is, social) norms? Is unbuilt architecture best left as it is, unrealised, or an unfinished masterpiece best left incomplete, thereby allowing audiences, and posterity, the freedom to fill in the missing pieces? If the latter proposition is correct, then is the meaning of an artwork invariably a social construct? Does the architect, artist, or composer ever have a lasting claim (in terms of its meaning) over their work, as they intended it to be?

These questions highlight philosophical issues concerning a creative work’s contribution to culture and heritage, and they draw further attention to differences between architecture and other forms of art. Debates surrounding the preservation, restoration or adaptive re-use of iconic buildings, for instance, show up technical, social, and political contexts governing architectural value that may not be applicable to other artistic genres. The “Salk Controversy” is a case in point, where disagreement arose over plans for a building addition to Louis Kahn’s Salk Institute at La Jolla, thought by some to contradict the architect’s original design intention (Spector 2001, 166-84). The debate shows up differences in views regarding a building’s past and present integrity as an artistic object and how and to what extent the creative vision of an architect should be privileged over the needs of clients and users. Is either the heritage value of the Salk Institute building or the stature of the architect Louis Kahn diminished by such additions?

Discourse on architectural heritage was shaped by historical figures like Viollet-le-Duc, Ruskin, and Pugin. Working to refurbish France’s medieval cathedrals, Viollet-le-Duc drew and then followed a fine line between restoring building fabric to its “original” (though invariably hypothetical) condition, on the one hand, and adapting the buildings to evolve according to modern standards of function, taste, and aesthetics, on the other. Drawing together both of these divergent positions was an emerging imperative that architecture, both old and new, should be relevant for the times. This perspective is taken up in architectural theory by Giedion and Harries, among others. This and additional views on architectural ethics and heritage have been enacted by the establishment of institutions such as the National Trust (UK) and its offspring of national and regional heritage councils, the International Council on Monuments and Sites (ICOMOS), and Docomomo, charged with the protection and preservation of modern architecture and urbanism. Awareness of the fluidity of heritage as grounds for questioning the relativity of architectural values has been sharpened by debates generated by controversial demolition and rebuilding projects. Famous episodes include the protracted commercial redevelopment of Paternoster Square at St. Paul’s Cathedral, London (1980s-90s); Venturi and Brown’s postmodernist addition (1991) to the National Gallery on Trafalgar Square (replacing the design famously condemned by Prince Charles as a “monstrous carbuncle”), and the rebuilding (completed 2005) of the Frauenkirche, Dresden, to include evidence of damage from Allied carpet bombing during the Second World War.

c. Environmental Issues

While perhaps always present in some measure, the ethical dimensions of architecture have never been as public and as apropos to the civic and political climate as in the early 21st century. Warwick Fox sees this situation as mainly the result of increasing environmental problems and a concern with the built environment as a heretofore neglected aspect of environmental ethics (2000, 1–12). Fox is partly right, but to see the relation between architecture and ethics exclusively in terms of environmental ethics, as commonly understood, is too narrow. For one thing, drawing on forms of historical, theoretical and practical (also professional) knowledge, architecture, more than most other humanities disciplines, is concerned with multiple conceptions of and concerns for the environment. Viewed as subjects of philosophical inquiry, distinctions between “architecture” and the “built environment,” and between either of these terms and “nature” or “the natural environment,” beg for ontological and epistemological elucidation.

Many of the philosophical concerns about architecture may be seen as a subset or variant of concerns for the built environment. They tend to arise in a cultural sphere, bound by interpretative traditions, entailing the formative concepts, historicity, and rhetorical conventions of the discipline. The primary function of the built environment seems to be to provide for habitation and the requisites of life. The question thus arises as to whether this primary function should take precedence over the aesthetic functions of architecture, specifically expectations for its artistry or meaning.

Moreover, the challenges architects and allied design professionals (particularly planners and urban designers) face in responding to demands for environmentally sustainable buildings with reduced energy consumption and lower carbon emissions, and for cities with greater resilience to global climate change, raise additional philosophical and ethical issues that Vitruvius and his annotators could hardly have imagined. Many of these raise questions about the meaning and scope of sustainability. Is it a matter of science and building technology or behavior—or both? Can buildings be designed sustainably in societies geared for endless growth and consumption? Can a city be made resilient to environmental disaster if this requires the pre-emptive destruction of neighborhoods in vulnerable areas—and possibly worsened levels of social injustice and inequality that may result?

While it may be assumed these concerns and issues have only appeared at the beginning of the 21st century, there are broader, longstanding and overlapping conceptual and practical contexts for locating them historically. In histories of ideas bearing on philosophy, environment, and nature (for instance, Pratt et al. 1999), one learns of the importance of arguments for the uniqueness of living species based on the geographic regions and climates they inhabit. In this regard, today’s environmentalists can be seen as developing thoughts expressed by natural theologians or geographers like Alexander von Humboldt (1769-1859) or systemic botanists like John Hutton Balfour (1808-1884) who described life as a process emerging from interactions between living beings and their surroundings.

Humboldt, Balfour, Darwin, and others contributed to the scientific formulation of ecology as well as spatio-temporal frameworks whereby newly established facts of biological existence could also be used to describe urban societies and environments. Arguably, these frameworks contributed to interests in vernacular architecture and the model of “the primitive hut” (Vidler 1987) as these were interpreted as manifesting links between building forms, patterns of human settlement, and distinctive eras. Advancements in building technology over the course of the 19th and 20th centuries, particularly in the areas of sanitation, illumination, heating, and ventilation, reinforced a largely functionalist view of the interrelationship of building interiors, urban spaces and human wellbeing.

According to one line of thinking, our scientific and technological orientation towards control of the natural world is one contributing factor, not the solution, to environmental crises. The logical conflict of different criteria available to measure a building’s ecological sustainability, for instance (entailing its consumption of energy for lighting and heating versus the energy embodied in its materials and construction), demonstrates the limitations of conventional instrumental or practical reasoning. However, it seems fanciful to anticipate that another philosophy of nature and the built environment will appear—one that is more than merely functionalist and non-individualistic or post-humanistic—to underscore effective environmental activism and remediation.

The developments affecting architectural practices in the 21st century arise from the awareness of the link between the environment and human flourishing, though these developments are reducible to no one single concept about the environment. These include growing unease over hitherto unforeseen consequences of building technology and concomitant processes of industrialism and urbanization. Issues range from local ones such as “sick building syndrome,” pollution, and revelations of the toxicity of building sites, to broader concerns arising from global warming and the depletion of natural resources, including energy resources. These developments have prompted new movements among design practitioners. They include calls for “green architecture” with its emphasis on sustainability and purportedly sustainable practices such as “cradle to cradle” design where building materials are chosen with their life cycles and future recyclability in mind. On a larger scale there is the move towards the “ecological restoration” of natural and urban landscapes aimed at reversing the consequences of environmental degradation or limiting the impacts of future flooding, bushfires and other disasters.

These and other developments directed towards more complete awareness, preservation or restoration of the environment have important subjective and ethical dimensions. These are evident not only in obvious political or design movements, but in ascetic (self-disciplining, restraining, and possibly abstaining) practices involving the design, furnishing, and maintenance of the home, and the water-wise planting and rigorous inspection of the suburban garden for invasive species and noxious weeds. What emerges from such practices is a relationship between thought and experience mediated by an understanding of environs, surrounds, spaces, and choice regarding possible ways of living in them.

d. Design Pedagogy

Given the lines of inquiry outlined in this article, it should become clear there is no one single relation between philosophy and architecture. Rather, there are likely multiple connections that make this an important multi-, inter- and trans-disciplinary field, and these connections can be brought to bear to consider the formation of architectural historians, theorists, and designers as particular kinds of intellectuals and “philosophical” agents with responsibilities for the built environment.

Consequently, the opening question “What is architecture?” leads to another. “What is an architect?” There are a variety of answers. One set of responses is to describe what an architect does, namely design, as a distinctive activity that is not only creative and imaginative, comparable to other forms of “art,” but also both critically and practically oriented. Donald Schön in his book The Design Studio (1985) coined the phrase “reflection-in-action” to describe design and, like many design educators, valued the design studio as a unique arena for cultivating creativity and innovation, and for devising novel solutions to social, technological and pragmatic problems. Indeed, it is a common view that design and studio practice are means of articulating just what the pressing problems of the day are or will soon be. There are many “philosophies” and metaphors of design, including descriptors such as “problem-setting” versus “problem solving” and “lateral thinking”—that old shibboleth of many devotees of design and “the creative industries.” There are also different ways of describing what the ideal design process should be (rational, but not linear; reiterative, cyclical, and so forth).

For Winters (2011), the value of the “paper” architecture (sketch designs, drawings, and other media representations of unbuilt and perhaps unbuildable work) routinely produced in schools is that it throws into sharp relief the capacity of architecture as a visual art to be infused with a “critical attitude” combining the Apollonian and the Dionysian conceptions of aesthetics. The first entails the disinterested contemplation of the creative object as form, and the second active participation in and self-formation through an aesthetic experience, so that:

The designed environment unfolds before us requiring our occupational presence to make it whole. It is in this sense that a work of architecture displays itself as a canvas upon which to project the systematic undertakings that are constitutive of a life, but unlike the blank canvas, this canvas has marked out across its surface patterns that present themselves as suitable accommodation for our endeavors. (67)

Like the Vitruvian triad or the phenomenologist’s favored concept of “poiesis,” many of these descriptions impose—rather than merely recognize—a particular ontological and epistemological order on design acts and, more or less, stress their reasonableness, reliability, and universal applicability. Conversely, it could be argued that the epithet “design” encompasses a number of different cognitive, imaginative, and creative acts; these have histories and institutional settings that cannot be reduced to one common denominator.

Aesthetics is not only grounds for connecting philosophy to architecture in a multi-disciplinary field. It is also commonly the chief vehicle for composing and teaching histories of architecture, for teaching design and assessing design outcomes, and often, for positioning a student’s ambitions at “the cutting edge” of design. Alertness to historical, social, and political contexts impacting our understanding of “design” begs greater openness towards the domain of “aesthetico-ethical” exercises that the activity and related metaphors and methodologies routinely entail. These include perceptions and discriminations of various kinds which make the built environment something to be considered, reflected upon, and acted upon, in the design studio but also more broadly and every day, across society. Discriminations, such as between the form and function of a building, or between the “utilitas” and “venustas” of architecture, are means whereby a wholeness of character, psychological closure or renewal of community is sought among other aspirations or, conversely, whereby our passions and desires for a wholeness of the self, closure, and community are subverted. Such discriminations are exercised by individuals occupying a number of subject positions, degrees of knowledge, and authority. They are acted upon in multiple and overlapping social and political arenas.

It is clear that the field of philosophy of architecture has much work cut out for it. However, given the admittedly only partial account of its concerns as outlined here, it is likely that the significance of what is in some ways merely a nascent subfield within both philosophy and architecture will grow.

5. References and Further Reading

  • Attridge, D., G. Bennington, and R. Young, 1987, Post-structuralism and the question of history, Cambridge: Cambridge University Press.
  • Baird, B., and C. Jencks, 1969, Meaning in Architecture, London: Barrie and Rockliffe.
  • Ballantyne, A., 2011, “Architecture, Life, and Habitat,” The Journal of Aesthetics and Art Criticism, 69 (1): 43-49.
  • Barthes, R., 1971, “Semiology and the Urban,” reprinted in The City and the Sign: An Introduction to Urban Semiotics, M. Gottdiener and A. Lagopoulos (eds), New York: Columbia University Press, 1986, pp. 88-98.
  • Benedikt, M., 1991, Cyberspace: first steps, Cambridge, MA: MIT Press.
  • Brain, D., 2005, “From Good Neighborhoods to Sustainable Cities: Social Science and the Social Agenda of the New Urbanism,” International Regional Science Review, 28 (2): 217-238.
  • Brain, D., 2006, “Democracy and Urban Design: The Transect as Civic Renewal,” Places, 18 (1): 18-23.
  • Budd, M., 1995, Values of Art: Pictures, Poetry and Music, London and New York: Penguin.
  • Davies, S., 1994, “Is Architecture Art?,” in Philosophy and Architecture, M. Mitias (ed.), Amsterdam and Atlanta: Rodopi, pp. 31-47.
  • Deleuze, G., 1988, The fold: Leibniz and the Baroque, London: Continuum, 2006.
  • Eco, U., 1968, “Function and Sign: Semiotics of Architecture,” reprinted in Rethinking Architecture: A Reader in Cultural Theory, N. Leach (ed.), New York: Routledge, 1977, pp. 166-204.
  • Foucault, M., 1982, “Space, Knowledge and Power,” in The Foucault Reader, P. Rabinow (ed.), New York: Pantheon, 1984, pp. 239-256.
  • Fox, W., (ed.), 2000, Ethics and the Built Environment, London and New York: Routledge.
  • Gans, H., 1968, People and Plans: Essays on Urban Problems and Solutions, New York: Basic Books.
  • Giedion, S., 1947, Space, Time and Architecture, Cambridge, MA: Harvard University Press, 5th ed., 1974.
  • Goldblatt, D., and R. Padden, 2011, “Introduction” to “The Aesthetics of Architecture: Philosophical Investigations into the Nature of Building,” special issue of The Journal of Aesthetics and Art Criticism, 69 (1): 1-6.
  • Graham, G., 1989, “Art and Architecture,” British Journal of Aesthetics, 29 (3): 248-257.
  • Habermas, J., 1982, “Modern and Post-Modern Architecture,” trans. H. Tsoskounglou, 9H, 4: 9-14.
  • Haldane, J., 1999, “Form, meaning and value: a history of the philosophy of architecture,” The Journal of Architecture, 4: 9-20.
  • Harbison, R., 1991, The built, the unbuilt, and the unbuildable: in pursuit of architectural meaning, London: Thames and Hudson.
  • Harries, K., 1997, The Ethical Function of Architecture, Cambridge, MA: The MIT Press.
  • Heidegger, M., 1951, “Building, dwelling, thinking,” in Poetry, Language, Thought, trans. A. Hofstadter, New York: Harper and Row, 1975, pp. 145-161.
  • Jameson, F., 1991, Postmodernism, or the Cultural Logic of Late Capitalism, London: Verso.
  • Jencks, C., 1977, The Language of Post-Modern Architecture, New York: Rizzoli, 1984.
  • Lagueux, M., 2004, “Ethics Versus Aesthetics in Architecture,” Philosophical Forum, 35 (2): 117-133.
  • Lawhon, L. L., 2009, “The Neighborhood Unit: Physical Design or Physical Determinism?,” Journal of Planning History, 20: 1-22.
  • Leccese, M. and K. McCormick, (eds.), 2000, The Charter of the New Urbanism, New York: McGraw Hill.
  • Loudon, J. C., 1822, An Encyclopaedia of Gardening, London: Longman, 1835.
  • Mitias, M., 1999, “The aesthetic experience of the architectural work,” Journal of Aesthetic Education, 33 (3): 61–77.
  • Mostafavi, M., and D. Leatherbarrow, 1993, On weathering: the life of buildings in time, Cambridge, MA: MIT Press.
  • Mueller, G., 1960, “Philosophy and Architecture,” AIA Journal, 34: 38-43.
  • Norberg-Schulz, C., 1980, c1979, Genius loci: towards a phenomenology of architecture, London: Academy Editions.
  • Parsons, G., and A. Carlson, 2008, Functional Beauty, Oxford: Clarendon Press.
  • Perez-Gomez, A., 1983, Architecture and the Crisis of Modern Science, Cambridge, MA: MIT Press.
  • Pevsner, N., 1936, Pioneers of the Modern Movement; renamed Pioneers of Modern Design, partly revised and rewritten edition, Harmondsworth: Penguin, 1975.
  • Podro, M., 1982, The Critical Historians of Art, New Haven, CT and London: Yale University Press.
  • Porphyrios, D., 1982, “Classicism is Not a Style,” reprinted in Classical architecture, London: Academy Editions, 1991.
  • Pratt, V., J. Howarth, and E. Brady, 1999, Environment and Philosophy, London and New York: Routledge.
  • Ruskin, J., 1849, The Seven Lamps of Architecture, New York: Noonday, 1974.
  • Scruton, R., 1979, The Aesthetics of Architecture, Princeton, NJ: Princeton University Press.
  • Smith, C., 1992, Architecture in the Culture of Early Humanism: Ethics, Aesthetics, and Eloquence 1400-1470, New York: Oxford University Press.
  • Smith, D. W., 2003, “Phenomenology,” in The Stanford Encyclopedia of Philosophy, E.N. Zalta (ed.), (Winter edition). Online. Available HTTP: <http://plato.stanford.edu/archives/win2003/entries/davidson/> (last accessed 25 June 2010).
  • Spector, T., 2001, The Ethical Architect: the dilemma of contemporary practice, New York: Princeton Architectural Press.
  • Venturi, R., D. S. Brown, & S. Izenour, 1972, Learning from Las Vegas: The Forgotten Symbolism of Architectural Form, revised edition, Cambridge, MA: MIT Press, 1977.
  • Vidler, A., 1987, “Rebuilding the Primitive Hut,” in The Writing of the Walls, Princeton, NJ: Princeton University Press, pp. 7-21.
  • Watkin, D., 1984, Morality and Architecture, Chicago: University of Chicago Press.
  • Whyte, W., 2006, “How do Buildings Mean? Some Issues of Interpretation in the History of Architecture,” History and Theory, 45: 153-177.
  • Winters, E., 2001, “Architecture,” in The Routledge Companion to Aesthetics, B. Gaut, and D. Lopes (eds.), London and New York: Routledge, 2nd edition, pp. 655-667.
  • Winters, E., 2011, “A Dance to the Music of Architecture,” The Journal of Aesthetics and Art Criticism, 69 (1): 61-67.

Author Information

William M. Taylor
Email: bill.taylor@uwa.edu.au
University of Western Australia
Australia

and

Michael P. Levine
Email: michael.levine@uwa.edu.au
University of Western Australia
Australia

Art and Interpretation

Interpretation in art refers to the attribution of meaning to a work. A point on which people often disagree is whether the artist’s or author’s intention is relevant to the interpretation of the work. In the Anglo-American analytic philosophy of art, views about interpretation branch into two major camps: intentionalism and anti-intentionalism, with an initial focus on one art, namely literature.

The anti-intentionalist maintains that a work’s meaning is entirely determined by linguistic and literary conventions, thereby rejecting the relevance of the author’s intention. The underlying assumption of this position is that a work enjoys autonomy with respect to meaning and other aesthetically relevant properties. Extra-textual factors, such as the author’s intention, are neither necessary nor sufficient for meaning determination. This early position in the analytic tradition is often called conventionalism because of its strong emphasis on convention. Anti-intentionalism gradually went out of favor at the end of the 20th century, but it has seen a revival in the so-called value-maximizing theory, which recommends that the interpreter seek value-maximizing interpretations constrained by convention and, according to a different version of the theory, by the relevant contextual factors at the time of the work’s production.

By contrast, the initial brand of intentionalism—actual intentionalism—holds that interpreters should concern themselves with the author’s intention, for a work’s meaning is affected by such intention. There are at least three versions of actual intentionalism. The absolute version identifies a work’s meaning fully with the author’s intention, therefore allowing that an author can intend her work to mean whatever she wants it to mean. The extreme version acknowledges that the possible meanings a work can sustain have to be constrained by convention. According to this version, the author’s intention picks the correct meaning of the work as long as it fits one of the possible meanings; otherwise, the work ends up being meaningless. The moderate version claims that when the author’s intention does not match any of the possible meanings, meaning is fixed instead by convention and perhaps also context.

A second brand of intentionalism, which finds a middle course between actual intentionalism and anti-intentionalism, is hypothetical intentionalism. According to this position, a work’s meaning is the appropriate audience’s best hypothesis about the author’s intention based on publicly available information about the author and her work at the time of the piece’s production. A variation on this position attributes the intention to a hypothetical author who is postulated by the interpreter and who is constituted by work features. Such authors are sometimes said to be fictional because they, being purely conceptual, differ decisively from flesh-and-blood authors.

This article elaborates on these theories of interpretation and considers their notable objections. The debate about interpretation covers other art forms in addition to literature. The theories of interpretation are also extended across many of the arts. This broad outlook is assumed throughout the article, although nothing said is affected even if a narrow focus on literature is adopted.

Table of Contents

  1. Key Concepts: Intention, Meaning, and Interpretation
  2. Anti-Intentionalism
    1. The Intentional Fallacy
    2. Beardsley’s Speech Act Theory of Literature
    3. Notable Objections and Replies
  3. Value-Maximizing Theory
    1. Overview
    2. Notable Objections and Replies
  4. Actual Intentionalism
    1. Absolute Version
    2. Extreme Version
    3. Moderate Version
    4. Objections to Actual Intentionalism
  5. Hypothetical Intentionalism
    1. Overview
    2. Notable Objections and Replies
  6. Hypothetical Intentionalism and the Hypothetical Artist
    1. Overview
    2. Notable Objections and Replies
  7. Conclusion
  8. References and Further Reading

1. Key Concepts: Intention, Meaning, and Interpretation

It is common for us to ask questions about works of art due to puzzlement or curiosity. Sometimes we do not understand the point of the work. What is the point of, for example, Metamorphosis by Kafka or Duchamp’s Fountain? Sometimes there is ambiguity in a work and we want it resolved. For example, is the final sequence of Christopher Nolan’s film Inception reality or another dream? Or do ghosts really exist in Henry James’s The Turn of the Screw? Sometimes we make hypotheses about details in a work. For instance, does the woman in white in Raphael’s The School of Athens represent Hypatia? Is the conch in William Golding’s Lord of the Flies a symbol for civilization and democracy?

What these questions have in common is that all of them seek after things that go beyond what the work literally presents or says. They are all concerned with the implicit contents of the work or, for simplicity, with the meanings of a work. A distinction can be drawn between two kinds of meaning in terms of scope. Meaning can be global in the sense that it concerns the work’s theme, thesis, or point. For example, an audience first encountering Duchamp’s Fountain would want to know Duchamp’s point in producing this readymade or, put otherwise, what the work as a whole is made to convey. The same goes for Kafka’s Metamorphosis, which contains so bizarre a plot as to make the reader wonder what the story is all about. Meaning can also be local insofar as it is about what a part of a work conveys. Inquiries into the meaning of a particular sequence in Christopher Nolan’s film, the woman in Raphael’s fresco, or the conch in William Golding’s Lord of the Flies are directed at only part of the work.

We are said to be interpreting when trying to find out answers to questions about the meaning of a work. In other words, interpretation is the attempt to attribute work-meaning. Here “attribute” can mean “recover,” which is retrieving something already existing in a work; or it can more weakly mean “impose,” which entails ascribing a meaning to a work without ontologically creating anything. Many of the major positions in the debate endorse either the impositional view or the retrieval view.

When an interpretative question arises, a frequent way to deal with it is to resort to the creator’s intention. We may ask the artist to reveal her intention if such an opportunity is available; we may also check what she says about her work in an interview or autobiography. If we have access to her personal documents such as diaries or letters, they too will become our interpretative resources. These are all evidence of the artist’s intention. When the evidence is compelling, we have good reason to believe it reveals the artist’s intention.

Certainly, there are cases in which external evidence of the artist’s intention is absent, including when the work is anonymous. This poses no difficulty for philosophers who view appeal to artistic intention as crucial, for they accept that internal evidence—the work itself—is the best evidence of the artist’s intention. Most of the time, close attention to details of the work will lead us to what the artist intended the work to mean.

But what is intention exactly? Intention is a kind of mental state usually characterized as a design or plan in the artist’s mind to be realized in her artistic creation. This crude view of intention is sometimes refined into the reductive analysis one will find in a contemporary textbook of philosophy of mind: intention is constituted by belief and desire. Some actual intentionalists explain the nature of intention from a Wittgensteinian perspective: authorial intention is viewed as the purposive structure of the work that can be discerned by close inspection. This view challenges the supposition that intentions are always private and logically independent of the work they cause, a supposition often attributed to anti-intentionalists.

A more recent proposal holds that intentions are executive attitudes toward plans (Livingston 2005). These attitudes are firm but defeasible commitments to acting on those plans. Contra the reductive analysis of intention, this view holds that intentions are distinct and real mental states that serve a range of functions irreducible to other mental states.

Clarifying each of these basic terms (meaning, interpretation, and intention) requires an essay-length treatment that cannot be done here. For current purposes, it suffices to introduce the aforesaid views and proposals commonly assumed. Bear in mind that for the most part the debate over art interpretation proceeds without consensus on how to define these terms, and clarifications appear only when necessary.

2. Anti-Intentionalism

Anti-intentionalism is considered the first theory of interpretation to emerge in the analytic tradition. It is normally seen as affiliated with the New Criticism movement that was prevalent in the middle of the twentieth century. The position was initially a reaction against biographical criticism, the main idea of which is that the interpreter, to grasp the meaning of a work, needs to study the life of the author because the work is seen as reflecting the author’s mental world. This approach led to people considering the author’s biographical data rather than her work. Literary criticism became criticism of biography, not criticism of literary works. Against this trend, literary critic William K. Wimsatt and philosopher Monroe C. Beardsley coauthored a seminal paper “The Intentional Fallacy” in 1946, marking the starting point of the intention debate. Beardsley subsequently extended his anti-intentionalist stance across the arts in his monumental book Aesthetics: Problems in the Philosophy of Criticism ([1958] 1981a).

a. The Intentional Fallacy

The main idea of the intentional fallacy is that appeal to the artist’s intention outside the work is fallacious, because the work itself is the final arbiter of what meaning it bears. This contention is based on the anti-intentionalist’s ontological assumption about works of art.

This underlying assumption is that a work of art enjoys autonomy with respect to meaning and other aesthetically relevant properties. As Beardsley’s Principle of Autonomy shows, critical statements will in the end need to be tested against the work itself, not against factors outside it. To give Beardsley’s example, whether a statue symbolizes human destiny depends not on what its maker says but on our being able to make out that theme from the statue on the basis of our knowledge of artistic conventions: if the statue shows a man confined to a cage, we may well conclude that the statue indeed symbolizes human destiny, for by convention the image of confinement fits that alleged theme. The anti-intentionalist principle hence follows: the interpreter should focus on what she can find in the work itself—the internal evidence—rather than on external evidence, such as the artist’s biography, to reveal her intentions.

Anti-intentionalism is sometimes called conventionalism because it sees convention as necessary and sufficient in determining work-meaning. On this view, the artist’s intention at best underdetermines meaning even when operating successfully. This can be seen from the famous argument offered by Wimsatt and Beardsley: either the artist’s intention is successfully realized in the work, or it fails; if the intention is successfully realized in the work, appeal to external evidence of the artist’s intention is not necessary (we can detect the intention from the work); if it fails, such appeal becomes insufficient (the intention turns out to be extraneous to the work). The conclusion is that an appeal to external evidence of the artist’s intention is either unnecessary or insufficient. As the second premise of the argument shows, the artist’s intention is insufficient in determining meaning for the reason that convention alone can do the trick. As a result, the overall argument entails the irrelevance of external evidence of the artist’s intention. To think of such evidence as relevant commits the intentional fallacy.

There is a second way to formulate the intentional fallacy. Since the artist does not always successfully realize her intention, the inference from the premise that the artist intended her work to mean p to the conclusion that the work in question does mean p is invalid. Therefore, the term “intentional fallacy” has two layers of meaning: normatively, it refers to the questionable principle of interpretation that external evidence of intent should be appealed to; ontologically, it refers to the fallacious inference from probable intention to work-meaning.

b. Beardsley’s Speech Act Theory of Literature

Beardsley at a later point develops an ontology of literature in favor of anti-intentionalism (1981b, 1982). Reviving Plato’s imitation theory of art, Beardsley claims that fictional works are essentially imitations of illocutionary acts. Briefly put, illocutionary acts are performed by utterances in particular contexts. For example, when a detective, convinced that someone is the killer, points his finger at that person and utters the sentence “you did it,” the detective is performing the illocutionary act of accusing someone. What illocutionary act is being performed is traditionally construed as jointly determined by the speaker’s intention to perform that act, the words uttered, and the relevant conditions in that particular context. Other examples of illocutionary acts include asserting, warning, castigating, asking, and the like.

Literary works can be seen as utterances; that is, texts used in a particular context to perform different illocutionary acts by authors. However, Beardsley claims that in the case of fictional works in particular, the purported illocutionary force will always be removed so as to make the utterance an imitation of that illocutionary act. When an attempted act is insufficiently performed, it ends up being represented or imitated. For example, if I say “please pass me the salt” in my dining room when no one except me is there, I end up representing (imitating) the illocutionary act of requesting because there is no uptake from the intended audience. Since the illocutionary act in this case is only imitated, it qualifies as a fictional act. This is why Beardsley sees fiction as representation.

Consider the uptake condition in the case of fictional works. Such works are not addressed to the audience as a talk is: there is no concrete context in which the audience can be readily identified. The uttered text hence loses its illocutionary force and ends up being a representation. Aside from this “address without access,” another obtaining condition for a fictional illocutionary act is the existence of non-referring names and descriptions in a fictional work. If an author writes a poem in which she greets the great detective Sherlock Holmes, this greeting will never obtain, because the name Sherlock Holmes does not refer to any existing person in the world. The greeting will only end up being a representation or a fictional illocution. By parity of reasoning, fictional works end up being representations of illocutionary acts in that they always contain names or descriptions involving events that never take place.

Now we must ask: by what criterion do we determine what illocutionary act is represented? It cannot be the speaker or author’s intention, because even if a speaker intends to represent a particular illocutionary act, she might end up representing another. Since the possibility of failed intention always exists, intention would not be an appropriate criterion. Convention is again invoked to determine the correct illocutionary act being represented. It is true that any practice of representing is intentional at the start in the sense that what is represented is determined by the representer’s intention. Nevertheless, once the connection between a symbol and what it is used to represent is established, intention is said to be detached from that connection, and deciding the content of a representation becomes a sheer matter of convention.

Since a fictional work is essentially a representation of an insufficiently performed illocutionary act, determining what it represents does not require us to go beyond that incomplete performance, just as determining what a mime is imitating does not require the audience to consider anything outside her performance, such as her intention. What the mime is imitating is completely determined by how we conventionally construe the act being performed. In a similar fashion, when considering what illocutionary act is represented by a fictional work, the interpreter should rely on internal evidence rather than on external evidence of authorial intent to construct the illocutionary act being represented. If, based on internal data, a story reads like a castigation of war, it is suitably seen as a representation of that illocutionary act. The conclusion is that the author’s intention plays no role in fixing the content of a fictional work.

Lastly, it is worth mentioning that Beardsley’s attitude toward nonfictional works is ambivalent. Obviously, his speech act argument applies to fictional works only, and he accepts that nonfictional works can be genuine illocutions. This category of works tends to have a more identifiable audience, who is hence not addressed without access. With illocutions, Beardsley continues to argue for an anti-intentionalist view of meaning according to which the utterer’s intention does not determine meaning. But his accepting nonfictional works as illocutions opens the door to considerations of external or contextual factors that go against his earlier stance, which is globally anti-intentionalist.

c. Notable Objections and Replies

One immediate concern with anti-intentionalism is whether convention alone can point to a single meaning (Hirsch, 1967). The common reason why people debate about interpretation is precisely that the work itself does not offer sufficient evidence to disambiguate meaning. Very often a work can sustain multiple meanings and the problem of choice prompts some people to appeal to the artist’s intention. It does not seem plausible to say that one can assign only a single meaning to works like Ulysses or Picasso’s abstract paintings if one concentrates solely on internal evidence. To this objection, Beardsley (1970) insists that, in most cases, appeal to the coherence of the work can eventually leave us with a single correct interpretation.

A second serious objection to anti-intentionalism is the case of irony (Hirsch, 1976, pp. 24–5). It seems reasonable to say that whether a work is ironic depends on whether its creator intended it to be so. For instance, based on internal evidence, many people took Daniel Defoe’s pamphlet The Shortest Way with the Dissenters to be genuinely against the Dissenters upon its publication. However, the only ground for saying that the pamphlet is ironic seems to be Defoe’s intention. If irony is a crucial component of the work, ignoring it would fail to respect the work’s identity. It follows that irony cannot be grounded in internal evidence alone. Beardsley’s reply (1982, pp. 203–7) is that irony must offer the possibility of understanding. If the artist cannot imagine anyone taking it ironically, there would be no reason to believe the work to be ironic.

However, the problem of irony is only part of a bigger concern that challenges the irrelevance of external factors to interpretation. Many factors present at the time of the work’s creation seem to play a key role in shaping a work’s identity and content. Missing out on these factors would lead us to misidentify the work (and hence to misinterpret it).

For instance, a work will not be seen as revolutionary unless the interpreter knows something about the contemporaneous artistic tradition: ignoring the work’s innovation amounts to accepting that the work can lose its revolutionary character while remaining self-identical. If we see this character as identity-relevant, we should then take it into consideration in our interpretation. The same line of thinking goes for other identity-conferring contextual factors, such as the social-historical conditions and the relations the work bears to contemporaneous or prior works. The present view is thus called ontological contextualism to foreground the ontological claim that the identity and content of a work of art are in part determined by the relations it bears to its context of production.

Contextualism leads to an important distinction between work and text in the case of literature. In a nutshell: a text is not context-dependent but a work is. The anti-intentionalist stance thus leads the interpreter to consider texts rather than works because it rejects considerations of external or contextual factors. The same distinction goes for other art forms when we draw a comparison between an artistic production considered in its brute form and in its context of creation. For convenience, the word “work” is used throughout, with notes on whether contextualism is assumed.

As a reply to the contextualist objection, it has been argued (Davies, 2005) that Beardsley’s position allows for contextualism. If this is convincing, the contextualist criticism of anti-intentionalism would not be conclusive.

3. Value-Maximizing Theory

a. Overview

The value-maximizing theory can be viewed as being derived from anti-intentionalism. Its core claim is that the primary aim of art interpretation is to offer interpretations that maximize the value of a work. There are at least two versions of the maximizing position distinguished by the commitment to contextualism. When the maximizing position is committed to contextualism, the constraint on interpretation will be convention plus context (Davies, 2007); otherwise, the constraint will be convention only, as endorsed by anti-intentionalism (Goldman, 2013).

As indicated, the word “maximize” does not imply monism. That is, the present position does not claim that there can be only a single way to maximize the value of a work of art. On the contrary, it seems reasonable to assume that in most cases the interpreter can envisage several readings to bring out the value of the work. For example, Kafka’s Metamorphosis has generated a number of rewarding interpretations, and it is difficult to argue for a single best among them. As long as an interpretation is revealing or insightful under the relevant interpretative constraints, we may count it as value-maximizing. Such being the case, the value-maximizing theory may be relabelled the “value-enhancing” or “value-satisfying” theory.

Given this pluralist picture, the maximizer, unlike the anti-intentionalist, will need to accept the indeterminacy thesis that convention (and context, if she endorses contextualism) alone does not guarantee the unambiguity of the work. This allows the maximizing position to bypass the challenge posed by said thesis, rendering it a more flexible position than anti-intentionalism in regard to the number of legitimate interpretations.

Encapsulating the maximizing position in a few words: it holds that the primary aim of art interpretation is to enhance appreciative satisfaction by identifying interpretations that bring out the value of a work within reasonable limits set by convention (and context).

b. Notable Objections and Replies

The actual intentionalist will maintain that figurative features such as irony and allusion must be analysed intentionalistically. The maximizer with contextualist commitment can counter this objection by dealing with intentions in a more sophisticated way. If the relevant features are identity-conferring, they will be respected and accepted in interpretation. In this case, any interpretation that ignores the intended feature ends up misidentifying the work. But if the relevant features are not identity-conferring, more room will be left for the interpreter to consider them. The intended feature can be ignored if it does not add to the value of the work. By contrast, where such a feature is not intended but can be put in the work, the interpreter can still build it into the interpretation if it is value-enhancing.

The most important objection to the maximizing view has it that the present position is in danger of turning a mediocre work into a masterpiece. Ed Wood’s film Plan 9 from Outer Space is the most discussed example. Many people consider this work to be the worst film ever made. However, interpreting it from a postmodern perspective as satire, which is presumably a value-enhancing interpretation, would turn it into a classic.

The maximizer with contextualist leanings can reply that the postmodern reading fails to identify the film as authored by Wood (Davies, 2007, p. 187). Postmodern views were not available in Wood’s time, so it was impossible for the film to be created as such. Identifying the film as postmodernist amounts to an anachronism that disrespects the work’s identity. The moral of this example is that the maximizer does not blindly enhance the value of a work. Rather, the work to be interpreted needs to be contextualized first to ensure that subsequent attributions of aesthetic value are done in light of the true and fair presentation of the work.

4. Actual Intentionalism

Contra anti-intentionalism, actual intentionalism maintains that the artist’s intention is relevant to interpretation. The position comes in at least three forms, giving different weights to intention. The absolute version claims that work-meaning is fully determined by the artist’s intention; the extreme version claims that the work ends up being meaningless when the artist’s intention is incompatible with it; and the moderate version claims that either the artist’s intention determines meaning or—if this fails—meaning is determined instead by convention (and context, if contextualism is endorsed).

a. Absolute Version

Absolute actual intentionalism claims that a work means whatever its creator intends it to mean. Put otherwise, it sees the artist’s intention as the necessary and sufficient condition for a work’s meaning. This position is often dubbed Humpty-Dumptyism with reference to the character Humpty-Dumpty in Through the Looking-Glass. This character tries to convince Alice that he can make a word mean what he chooses it to mean. This unsettling conclusion is supported by the argument about intentionless meaning: a mark (or a sequence of marks) cannot have meaning unless it is produced by an agent capable of intentional activities; therefore, meaning is identical to intention.

It seems plausible to abandon the thought that marks on the sand are a poem once we know they were caused by accident. But this at best proves that intention is a necessary condition for something’s being meaningful; it does not prove further that what something means is what the agent intended it to mean. In other words, the argument about intentionless meaning does a better job of showing that intention is an indispensable ingredient for meaningfulness than of showing that intention infallibly determines the meaning conveyed.

b. Extreme Version

To avoid Humpty-Dumptyism, the extreme actual intentionalist rejects the view that the artist’s intention infallibly determines work-meaning and accepts the indeterminacy thesis that convention alone does not guarantee a single evident meaning to be found in a work. The extreme intentionalist claims further that the meaning of the work is fixed by the artist’s intention if her intention identifies one of the possible meanings sustained by the work; otherwise, the work ends up being meaningless (Hirsch, 1967). Better put, the extreme intentionalist sees intention as a necessary rather than a sufficient condition for work-meaning.

Aside from the unsatisfactory result that a work becomes meaningless when the artist’s intention fails, the present position faces a dilemma when dealing with the case of figurative language (Nathan, in Iseminger (1992)). Take irony for example. The first horn of the dilemma is as follows: Constrained by linguistic conventions, the range of possible meanings has to include the negation of the literal meaning in order for the intended irony to be effective. But this results in absolute intentionalism: every expression would be ironic as long as the author intends it to be. But—this is the second horn—if the range of possible meanings does not include the negation of literal meaning, the expression simply becomes meaningless in that there is no appropriate meaning possible for the author to actualize. It seems that a broader notion of convention is needed to explain figurative language. But if the extreme intentionalist makes that move, her intentionalist position will be undermined, for the author’s intention would be given a less important role than convention in such cases. However, this problem does not arise when the actual intentionalist is committed to contextualism, for in that case the contextual factors that make the intended irony possible will be taken into account.

c. Moderate Version

Though there are several different versions of moderate actual intentionalism, they share the common ground that when the artist’s intention fails, meaning is fixed instead by convention and context. (Whether all moderate actual intentionalists take context into account is controversial and this article will not dig into this controversy for reasons of space.) That is, when the artist’s intention is successful, it determines meaning; otherwise, meaning is determined by convention plus context (Carroll, 2001; Stecker, 2003; Livingston, 2005).

As seen, an intention is successful so long as it identifies one of the possible meanings sustained by the work even if the meaning identified is less plausible than other candidates. But what exactly is the interpreter doing when she identifies that meaning? It is reasonable to say that the interpreter does not need to ascertain all the possible meanings and see if there is a fit. Rather, all she needs to do is to see whether the intended meaning can be read in accordance with the work. This is why the moderate intentionalist puts the success condition in terms of compatibility: an intention is successful so long as the intended meaning is compatible with the work. The fact that a certain meaning is compatible with the work means that the work can sustain it as one of its possible meanings.

Unfortunately, the notion of compatibility seems to allow strange cases in which an insignificant intention can determine work-meaning as long as it is not explicitly rejected by the relevant interpretative constraint. For example, if Agatha Christie reveals that Hercule Poirot is actually a smart Martian in disguise, the moderate intentionalist would need to accept it because this proclamation of intention can still be said to be compatible with the text in the sense that it is not rejected by textual evidence. To avoid this bad result, compatibility needs to be qualified.

The moderate intentionalist then analyses compatibility in terms of the meshing condition, which refers to a sufficient degree of coherence between the content of the intention and the work’s rhetorical patterns. An intention is compatible with the work in the sense that it meshes well with the work. The Martian case will hence be ruled out by the meshing condition because it does not engage sufficiently with the narrative even if it is not explicitly rejected by textual evidence. The meshing condition is a minimal or weak success condition in that it does not require the intention to mesh with every textual feature. A sufficient amount will do, though the moderate intentionalist admits that the line is not always easy to draw. With this weak standard for success, it can happen that the interpreter is not able to discern the intended meaning in the work before she learns of the artist’s intention.

There is a second kind of success condition which adopts a stronger standard (Stecker, 2003; Davies, 2007, pp. 170–1). This standard for success states that an intention is successful just in case the intended meaning, among the possible meanings sustained by the work, is the one most likely to secure uptake from a well-backgrounded audience (with contextual knowledge and all). For example, if a work of art, within the limits set by convention and context, affords interpretations x, y, and z, and x is more readily discerned than the other two by the appropriate audience, then x is the meaning of the work.

These accounts of the success condition answer a notable objection to moderate intentionalism. This objection claims that moderate intentionalism faces an epistemic dilemma (Trivedi, 2001). Consider an epistemic question: how do we know whether an intention is successfully realized? Presumably, we figure out work-meaning and the artist’s intention independently of each other, and then we compare the two to see if there is a fit. Nevertheless, this move is redundant: if we can figure out work-meaning independently of actual intention, why do we need the latter? And if work-meaning cannot be independently obtained, how can we know it is a case where intentions are successfully realized and not a case where intentions failed? It follows that appeal to successful intention results in redundancy or indeterminacy.

The first horn of the dilemma assumes that work-meaning can be obtained independently of knowledge of successful intention, but this is false for moderate intentionalists, for they acknowledge that in many cases the work presents ambiguity that cannot be resolved solely in virtue of internal evidence. Moderate intentionalists reject the second horn by claiming that they do not determine the success of an intention by comparing independently obtained work-meaning with the artist’s intention (Stecker, 2010, pp. 154–5). As already discussed, moderate intentionalists propose different success conditions that do not appeal to the identity between the artist’s intention and work-meaning. Moderate intentionalists adopting the weak standard hold that success is defined by the degree of meshing; those who adopt the strong standard maintain that success is defined by the audience’s ability to grasp the intention. Neither requires the interpreter to identify a work’s meaning independently of the artist’s intention.

d. Objections to Actual Intentionalism

The most commonly raised objection is the epistemic worry, which asks: is intention knowable? It seems impossible for one to really know others’ mental states, and the epistemic gap in this respect is thus unbridgeable. Actual intentionalists tend to dismiss this worry as insignificant and maintain that in many contexts (daily conversation or historical investigations) we have no difficulty in discerning another person’s intention (Carroll, 2009, pp. 71–5). In that case, why would things suddenly stand differently when it comes to art interpretation? This is not to say that we succeed on every occasion of interpretation, but that we do so in an amazingly large number of cases. That being so, we should not reject the appeal to intention solely because of the occasional failure.

Another objection is the publicity paradox (Nathan, 2006). The main idea is this: when someone S conveys something p by a production of an object O for public consumption, there is a second-order intention that the audience need not go beyond O to reach p; that is, there is no need to consult S’s first-order intentions to understand O. Therefore, when an artist creates a work for public consumption, there is a second-order intention that her first-order intentions not be consulted; otherwise, it would indicate the failure of the artist. Actual intentionalism hence leads to the paradoxical claim that we should and should not consult the artist’s intentions.

The actual intentionalist’s response (Stecker, 2010, pp. 153–4) is this: not all artists have the second-order intention in question. If the premise that they do is false, then the publicity argument is unsound. Even if it were true, the argument would still be invalid, because it confuses the artist’s intention to create something that stands alone with the intention that her first-order intentions need not be consulted. The paradox will not hold if this distinction is made.

Lastly, many criticisms are directed at a popular argument among actual intentionalists: the conversation argument (Carroll, 2001; Jannotta, 2014). An analogy between conversation and art interpretation is drawn, and actual intentionalists claim that if we accept that art interpretation is a form of conversation, we need to accept actual intentionalism as the right prescriptive account of interpretation, because the standard goal of an interlocutor in a conversation is to grasp what the speaker intends to say. (This is a premise even anti-intentionalists accept, but they apparently reject the further claim that art interpretation is conversational. See Beardsley, 1970, ch. 1.) This analogy has been severely criticized (Dickie, 2006; Nathan, 2006; Huddleston, 2012). The greatest disanalogy between conversation and art is that the latter is more like a monologue delivered by the artist than an interchange of ideas.

One way to meet the monologue objection is to specify more clearly the role of the conversational interest. In fact, the actual intentionalist claims that the conversational interest should constrain other interests such as the aesthetic interest. In other words, other interests can be reconciled or work with the conversational interest. Take the case of the hermeneutics of suspicion for example. Hermeneutics of suspicion is a skeptical attitude—often heavily politicized—adopted toward the explicit stance of a work. Interpretations based on the hermeneutics of suspicion have to be constrained by the artist’s non-ironic intention in order for them to count as legitimate interpretations. For instance, in attributing racist tendencies to Jules Verne’s Mysterious Island, in which the black slave Neb is portrayed as docile and superstitious, we need to suppose that the tendencies are not ironic; otherwise, the suspicious reading becomes inappropriate. In this example, the artistic conversation does not end up being a monologue, for the suspicious hermeneut listens and understands Verne before responding with the suspicious reading, which is constrained by the conversational interest. A conversational interchange is hence completed.

5. Hypothetical Intentionalism

a. Overview

A compromise between actual intentionalism and anti-intentionalism is hypothetical intentionalism, the core claim of which is that the correct meaning of a work is determined by the best hypothesis about the artist’s intention made by a selected audience. The aim of interpretation is then to hypothesize what the artist intended when creating the work from the perspective of the qualified audience (Tolhurst, 1979; Levinson, 1996).

Two points call for attention. First, it is hypothesis—not truth—that matters. This means that a hypothesis of the actual intention will never be trumped by knowledge of that very intention. Second, the membership of the audience is crucial because it determines the kind of evidence legitimate for the interpreter to use.

A 1979 proposal (Tolhurst) suggests that the relevant audience be singled out by the artist’s intention, that is, the audience intended to be addressed by the artist. Work-meaning is thus determined by the intended audience’s best hypothesis about the artist’s intention. This means that the interpreter will need to equip herself with the relevant beliefs and background knowledge of the intended audience in order to make the best hypothesis. Put another way, hypothetical intentionalism focuses on the audience’s uptake of an utterance addressed to them. This being so, what the audience relies on in comprehending the utterance will be based on what it knows about the utterer on that particular occasion. Following this contextualist line of thinking, the meaning of Jonathan Swift’s A Modest Proposal will not be the suggestion that the poor in Ireland might ease their economic pressure by selling their children as food to the rich; rather, given the background knowledge of Swift’s intended audience, the best hypothesis about the author’s intention is that he intended the work to be a satire that criticizes the heartless attitude toward the poor and Irish policy in general.

However, there is a serious problem with the notion of an intended audience. If the intended audience is an extremely small group possessing esoteric knowledge of the artist, meaning becomes a private matter, for the work can only be properly understood in terms of private information shared between artist and audience, and this results in something close to Humpty-Dumptyism, which is characteristic of absolute intentionalism.

To cope with this problem, the hypothetical intentionalist replaces the concept of an intended audience with that of an ideal or appropriate audience. Such an audience is not necessarily targeted by the artist’s intention and is ideal in the sense that its members are familiar with the public facts about the artist and her work. In other words, the ideal audience seeks to anchor the work in its context of creation based on public evidence. This avoids the danger of interpreting the work on the basis of private evidence.

The hypothetical intentionalist is aware that in some cases there will be competing interpretations which are equally good. An aesthetic criterion is then introduced to adjudicate between these hypotheses. The aesthetic consideration comes as a tie breaker: when we reach two or more epistemically best hypotheses, the one that makes the work artistically better should win.

Another notable distinction introduced by hypothetical intentionalism is that between semantic and categorial intention (Levinson, 1996, pp. 188–9). The kind of intention we have been discussing is semantic: it is the intention by which an artist conveys her message in the work. By contrast, categorial intention is the artist’s intention to categorize her production, either as a work of art, a certain artform (such as Romantic literature), or a particular genre (such as lyric poetry). Categorial intention indirectly affects a work’s semantic content because it determines how the interpreter conceptualizes the work at the fundamental level. For instance, if a text is taken as a grocery list rather than an experimental story, we will interpret it as saying nothing beyond the named grocery items. For this reason, the artist’s categorial intention should be treated as among the contextual factors relevant to her work’s identity. This move is often adopted by theorists endorsing contextualism, such as maximizers or moderate intentionalists.

b. Notable Objections and Replies

Hypothetical intentionalism has received many criticisms and challenges that merit mention. A frequently expressed worry is that it seems odd to stick to a hypothesis when newly found evidence proves it to be false (Carroll, 2001, pp. 208–9). If an artist’s private diary is located and reveals that our best hypothesis about her intention regarding her work is false, why should we cling to that hypothesis if the newly revealed intention meshes well with the work? Hypothetical intentionalism implausibly implies that warranted assertibility constitutes truth.

The hypothetical intentionalist clarifies her position (Levinson, 2006, p. 308) by saying that warranted assertibility does not constitute the truth for the utterer’s meaning, but it does constitute the truth for utterance meaning. The ideal audience’s best hypothesis constitutes utterance meaning even if it is designed to infer the utterer’s meaning.

Another troublesome objection states that hypothetical intentionalism collapses into the value-maximizing theory, for, when making the best hypothesis of what the artist intended, the interpreter inevitably attributes to the artist the intention to produce a piece with the highest degree of aesthetic value that the work can sustain (Davies, 2007, pp. 183–84). That is, the epistemic criterion for determining the best hypothesis is inseparable from the aesthetic criterion.

In reply, it is claimed that this objection may stem from the impression that an artist normally aims for the best; however, this does not imply that she would anticipate and intend the artistically best reading of the work. It follows that it is not necessary that the best reading be what the artist most likely intended even if she could have intended it. The objector replies that, still, the situation in which we have two epistemically plausible readings while one is inferior cannot arise, because we would adopt the inferior reading only when the superior reading is falsified by evidence.

The third objection is that the distinction between public and private evidence is blurry (Carroll, 2001, p. 212). Is public evidence published evidence? Does published information from private sources count as public? The reply from the hypothetical intentionalist emphasizes that this is not a distinction between published and unpublished information (Levinson, 2006, p. 310). The relevant public context should be reconstrued as what the artist appears to have wanted the audience to know about the circumstances of the work’s creation. This means that if it appears that the artist did not want to make certain proclamations of intent known to the audience, then this evidence, even if published at a later point, does not constitute the public context to be considered for interpretation.

Finally, two notable counterexamples to hypothetical intentionalism have been proposed (Stecker, 2010, pp. 159–60). The first counterexample is that W means p but p is not intended by the artist and the audience is justified in believing that p is not intended. In this case hypothetical intentionalism falsely implies that W does not mean p. For example, it is famously known among readers of Sherlock Holmes adventures that Dr. Watson’s war wound appears in two different locations. On one occasion the wound is said to be on his arm, while on another it is on his thigh. In other words the Holmes story fictionally asserts impossibility regarding Watson’s wound. But given the realistic style of the Holmes adventures, the best hypothesis of authorial intent in this case would deny that the impossibility is part of the meaning of the story, which is apparently false.

However, the hypothetical intentionalist would not maintain that W means p, because p is not the best hypothesis. She would not claim that the Holmes story fictionally asserts impossibility regarding Watson’s wound, for the best hypothesis made by the ideal reader would be that Watson has the wound somewhere on his body—his arm or thigh, but exactly where we do not know. It is a mistake to presuppose that W means p without following the strictures imposed by hypothetical intentionalism to properly reach p.

The second counterexample to hypothetical intentionalism is the case where the audience is justified in believing that p is intended by the artist but in fact W means q; the audience would then falsely conclude that W means p. Again, what W means is determined by the ideal audience’s best hypothesis based on convention and context, not by what the work literally asserts. The meaning of the work is the product of a prudent assessment of the total evidence available.

6. Hypothetical Intentionalism and the Hypothetical Artist

a. Overview

There is a second variety of hypothetical intentionalism that is based on the concept of a hypothetical artist. Generally speaking, it maintains that interpretation is grounded on the intention suitably attributed by the interpreter to a hypothetical or imagined artist. This version of hypothetical intentionalism is sometimes called fictionalist intentionalism or postulated authorism. The theoretical apparatus of a hypothetical artist can be traced back to Wayne Booth’s account of the “implied author,” in which he suggests that the critic should focus on the author we can make out from the work instead of on the historical author, because there is often a gap between the two.

Though proponents of the present brand of intentionalism disagree on the number of acceptable interpretations and on what kind of evidence is legitimate, they agree that the interpreter ought to concentrate on the appearance of the work. If it appears, based on internal evidence (and perhaps contextual information if contextualism is endorsed), that the artist intends the work to mean p, then p is the right interpretation of the work. The artist in question is not the historical artist; rather, it is an artist postulated by the audience to be responsible for the intention made out from, or implied by, the work. For example, if there is an anti-war attitude detected in the work, the intention to castigate war should be attributed to the postulated artist, not to the historical artist. The motivation behind this move is to maintain work-centered interpretation but avoid the fallacious reasoning that whatever we find in the work is intended by the real artist.

Inheriting the spirit of hypothetical intentionalism, fictionalist intentionalism aims to make interpretation work-based but author-related at the same time. The biggest difference between the two stances is that, as said, fictionalist intentionalism does not appeal to the actual or real artist, thereby avoiding any criticisms arising from hypothesizing about the real artist such as that the best hypothesis about the real artist’s intention should be abandoned when compelling evidence against it is obtained.

b. Notable Objections and Replies

The first concern with fictionalist intentionalism is that constructing a historical variant of the actual artist sounds suspiciously like hypothesizing about her (Stecker, 1987). But there is still a difference. “Hypothesizing about the actual artist,” or more accurately, “hypothesizing the actual artist’s intention,” would be a characterization of hypothetical intentionalism rather than fictionalist intentionalism. The latter does not track the actual artist’s intention but constructs a virtual one. As shown, fictionalist intentionalism, unlike hypothetical intentionalism, is immune to any criticisms resulting from ignoring the actual artist’s proclamation of her intention.

A second objection criticizes fictionalist intentionalism for not being able to distinguish between different histories of creative processes for the same textual appearance (Livingston, 2005, pp. 165–69). For example, suppose a work that appears to be produced with a well-conceived scheme did result from that kind of scheme; suppose further that a second work that appears the same actually emerged from an uncontrolled process. Then, if we follow the strictures of fictionalist intentionalism, the interpretations we produce for these two works would turn out to be the same, for based on the same appearance the hypothetical artists we construct in both cases would be identical. But these two works have different creative histories and the difference in question seems too crucial to be ignored.

The objection here fails to consider the subtlety of reality-dependent appearances (Walton, 2008, ch. 12). For example, suppose the exhibit note beside a painting tells us it was created when the painter got heavily drunk. Any well-organized feature in the work that appears to result from careful manipulation by the painter might now either look disordered or structured in an eerie way depending on the feature’s actual presentation. Compare this scenario to another where a (almost) visually indistinguishable counterpart is exhibited in the museum with the exhibit note revealing that the painter spent a long period crafting the work. In this second case the audience’s perception of the work is not very likely to be the same as that in the first case. This shows how the apparent artist account can still discriminate between (appearances of) different creative histories of the same artistic presentation.

Finally, there is often the qualm that fictionalist intentionalism ends up postulating phantom entities (hypothetical creators) and phantom actions (their intendings). The fictionalist intentionalist can reply that she is giving descriptions only of appearances instead of quantifying over hypothetical artists or their actions.

7. Conclusion

From the above discussion we can notice two major trends in the debate. First, most late 20th century and 21st century participants are committed to the contextualist ontology of art. The relevance of art’s historical context, since its first philosophical appearance in Arthur Danto’s 1964 essay “The Artworld,” continues to influence analytic theories of art interpretation. There is no sign of this trend diminishing. In Noël Carroll’s 2016 survey article on interpretation, the contextualist basis is still assumed.

Second, actual intentionalism remains the most popular position of all. Many substantial monographs have been written in this century to defend the position (Stecker, 2003; Livingston, 2005; Carroll, 2009; Stock, 2017). This intentionalist prevalence probably results from the influence of H. P. Grice’s work on the philosophy of language. And again, this trend, like the contextualist vogue, is still ongoing. And if we see intentionalism as an umbrella term that encompasses not only actual intentionalism but also hypothetical intentionalism and probably fictionalist intentionalism, the influence of intentionalism and its related emphasis on the concept of an artist or author will be even stronger. This presents an interesting contrast with the trend in post-structuralism that tends to downplay authorial presence in theories of interpretation, as embodied in the author-is-dead thesis championed by Barthes and Foucault (Lamarque, 2009, pp. 104–15).

8. References and Further Reading

  • Beardsley, M. C. (1970). The possibility of criticism. Detroit, MI: Wayne State University Press.
  • Contains four philosophical essays on literary criticism. The first two are among Beardsley’s most important contributions to the philosophy of interpretation.

  • Beardsley, M. C. (1981a). Aesthetics: Problems in the philosophy of criticism (2nd ed.). Indianapolis, IN: Hackett.
  • A comprehensive volume on philosophical issues across the arts and also a powerful statement of anti-intentionalism.

  • Beardsley, M. C. (1981b). Fiction as representation. Synthese, 46, 291–313.
  • Presents the speech act theory of literature.

  • Beardsley, M. C. (1982). The aesthetic point of view: Selected essays. Ithaca, NY: Cornell University Press.
  • Contains the essay “Intentions and Interpretations: A Fallacy Revived,” in which Beardsley applies his speech act theory to the interpretation of fictional works.

  • Booth, W. C. (1983). The rhetoric of fiction (2nd ed.). Chicago, IL: University of Chicago Press.
  • Contains the original account of the implied author.

  • Carroll, N. (2001). Beyond aesthetics: Philosophical essays. New York, NY: Cambridge University Press.
  • Contains in particular Carroll’s conversation argument, discussion of the hermeneutics of suspicion, defense of moderate intentionalism, and criticism of hypothetical intentionalism.

  • Carroll, N. (2009). On criticism. New York, NY: Routledge.
  • An engaging book on artistic evaluation and interpretation.

  • Carroll, N., & Gibson, J. (Eds.). (2016). The Routledge companion to philosophy of literature. New York, NY: Routledge.
  • Anthologizes Carroll’s survey article on the intention debate.

  • Currie, G. (1990). The nature of fiction. Cambridge, England: Cambridge University Press.
  • Contains a defense of fictionalist intentionalism.

  • Currie, G. (1991). Work and text. Mind, 100, 325–40.
  • Presents how a commitment to contextualism leads to an important distinction between work and text in the case of literature.

  • Danto, A. C. (1964). The artworld. Journal of Philosophy, 61, 571–84.
  • First paper to draw attention to the relevance of a work’s context of production.

  • Davies, S. (2005). Beardsley and the autonomy of the work of art. Journal of Aesthetics and Art Criticism, 63, 179–83.
  • Argues that Beardsley is actually a contextualist.

  • Davies, S. (2007). Philosophical perspectives on art. Oxford, England: Oxford University Press.
  • Part II contains Davies’ defense of the maximizing position and criticisms of other positions.

  • Dickie, G. (2006). Intentions: Conversations and art. British Journal of Aesthetics, 46, 71–81.
  • Criticizes Carroll’s conversation argument and actual intentionalism.

  • Goldman, A. H. (2013). Philosophy and the novel. Oxford, England: Oxford University Press.
  • Contains a defense of the value-maximizing theory without a contextualist commitment.

  • Hirsch, E. D. (1967). Validity in interpretation. New Haven, CT: Yale University Press.
  • The most representative presentation of extreme intentionalism.

  • Hirsch, E. D. (1976). The aims of interpretation. Chicago, IL: University of Chicago Press.
  • Contains a collection of essays expanding Hirsch’s views on interpretation.

  • Huddleston, A. (2012). The conversation argument for actual intentionalism. British Journal of Aesthetics, 52, 241–56.
  • A brilliant criticism of Carroll’s conversation argument.

  • Iseminger, G. (Ed.). (1992). Intention & interpretation. Philadelphia, PA: Temple University Press.
  • A valuable collection of essays featuring Beardsley’s account of the work’s autonomy, Knapp and Michaels’ absolute intentionalism, Iseminger’s extreme intentionalism, Nathan’s account of the postulated artist, Levinson’s hypothetical intentionalism, and eight other contributions.

  • Jannotta, A. (2014). Interpretation and conversation: A response to Huddleston. British Journal of Aesthetics, 54, 371–80.
  • A defense of the conversation argument.

  • Krausz, M. (Ed.). (2002). Is there a single right interpretation? University Park: Pennsylvania State University Press.
  • Another valuable anthology on the intention debate, containing in particular Carroll’s defense of moderate intentionalism and Lamarque’s criticism of viewing work-meaning as utterance meaning.

  • Lamarque, P. (2009). The philosophy of literature. Malden, MA: Blackwell.
  • The third and the fourth chapters discuss analytic theories of interpretation along with a critical assessment of the author-is-dead claim.

  • Levinson, J. (1996). The pleasure of aesthetics: Philosophical essays. Ithaca, NY: Cornell University Press.
  • The tenth chapter is Levinson’s revised presentation of hypothetical intentionalism and the distinction between semantic and categorial intention.

  • Levinson, J. (2006). Contemplating art: Essays in aesthetics. Oxford, England: Oxford University Press.
  • Contains Levinson’s replies to major objections to hypothetical intentionalism.

  • Levinson, J. (2016). Aesthetic pursuits: Essays in philosophy of art. Oxford, England: Oxford University Press.
  • Contains Levinson’s updated defense of hypothetical intentionalism and criticism of Livingston’s moderate intentionalism.

  • Livingston, P. (2005). Art and intention: A philosophical study. Oxford, England: Oxford University Press.
  • A thorough discussion on intention, literary ontology, and the problem of interpretation, with emphases on defending the meshing condition and on the criticisms of the two versions of hypothetical intentionalism.

  • Nathan, D. O. (1982). Irony and the artist’s intentions. Journal of Aesthetics and Art Criticism, 22, 245–56.
  • Criticizes the notion of an intended audience.

  • Nathan, D. O. (2006). Art, meaning, and artist’s meaning. In M. Kieran (Ed.), Contemporary debates in aesthetics and the philosophy of art (pp. 282–93). Oxford, England: Blackwell.
  • Presents an account of fictionalist intentionalism, a critique of the conversation argument, and a brief recapitulation of the publicity paradox.

  • Nehamas, A. (1981). The postulated author: Critical monism as a regulative ideal. Critical Inquiry, 8, 133–49.
  • Presents another version of fictionalist intentionalism.

  • Stecker, R. (1987). Apparent, implied, and postulated authors. Philosophy and Literature, 11, 258–71.
  • Criticizes different versions of fictionalist intentionalism.

  • Stecker, R. (2003). Interpretation and construction: Art, speech, and the law. Oxford, England: Blackwell.
  • A valuable monograph devoted to the intention debate and its related problems such as the ontology of art, incompatible interpretations and the application of theories of art interpretation to law. The book defends moderate intentionalism in particular.

  • Stecker, R. (2010). Aesthetics and the philosophy of art: An introduction. Lanham, MD: Rowman & Littlefield.
  • Contains a chapter that presents the disjunctive formulation of moderate intentionalism and the two counterexamples to hypothetical intentionalism.

  • Stecker, R., & Davies, S. (2010). The hypothetical intentionalist’s dilemma: A reply to Levinson. British Journal of Aesthetics, 50, 307–12.
  • Counterreplies to Levinson’s replies to criticisms of hypothetical intentionalism.

  • Stock, K. (2017). Only imagine: Fiction, interpretation, and imagination. Oxford, England: Oxford University Press.
  • Contains a defense of absolute (the author uses the term “extreme”) intentionalism.

  • Tolhurst, W. E. (1979). On what a text is and how it means. British Journal of Aesthetics, 19, 3–14.
  • The founding document of hypothetical intentionalism.

  • Trivedi, S. (2001). An epistemic dilemma for actual intentionalism. British Journal of Aesthetics, 41, 192–206.
  • Presents an epistemic dilemma for actual intentionalism and defense of hypothetical intentionalism.

  • Walton, K. L. (2008). Marvelous images: On values and the arts. Oxford, England: Oxford University Press.
  • A collection of essays, including “Categories of Art,” which might have inspired Levinson’s conception of categorial intention; and “Style and the Products and Processes of Art,” which is a defense of fictionalist intentionalism in terms of the notion “apparent artist.”

  • Wimsatt, W. K., & Beardsley, M. C. (1946). The intentional fallacy. The Sewanee Review, 54, 468–88.
  • The first thorough presentation of anti-intentionalism, commonly regarded as the starting point of the intention debate.

 

Author Information

Szu-Yen Lin
Email: lsy17@ulive.pccu.edu.tw
Chinese Culture University
Taiwan

Plotinus: Virtue Ethics

This article focuses on the virtue ethics of Plotinus (204–270 C.E.) and its implications for later accounts of virtue ethics, particularly in Porphyry and Iamblichus. Plotinus’ ethical theory is discussed in relation to the aim of the virtuous person to become godlike, the role of disposition in the soul’s intellectualization, the four cardinal virtues, well-being, human freedom, and self-determination. Plotinus’ virtue ethics is also presented with regard to his theory of transmigration and his criticism of the Gnostics.

Plotinus was a Neoplatonist, and Plato’s ethical teaching underlies Plotinus’ conception of virtue as an intrinsic quality of human character and his conception of excellence that derives from the soul’s purity in the contemplation of the Forms. Aristotle’s ethical theory also influences Plotinus, particularly Aristotle’s recognition of the gods as purely intelligible beings that do not possess virtues. Even more important was Aristotle’s distinction between intellectual and ethical virtues.

Plotinus’ virtue ethics has been used by later Neoplatonists such as Porphyry, Iamblichus, Macrobius, and Olympiodorus. Plotinus’ treatment of virtues is also found in the ethical theories of Arabic Neoplatonists and in Neoplatonic commentaries on the Aristotelian ethics. Plotinus’ analysis of the four Platonic cardinal virtues has been systematically treated by Porphyry.

Table of Contents

  1. Philosophical Background and Reception
  2. Becoming like God
  3. Disposition and Intellectual Qualities
  4. The Cardinal Virtues
    1. Courage
    2. Self-control
    3. Justice
    4. Wisdom
  5. Well-being
  6. Human Freedom and Self-Determination
  7. Soul and Transmigration
  8. Criticism of the Gnostics
  9. Virtue Ethics after Plotinus
    1. Porphyry
    2. Iamblichus
  10. References and Further Reading
    1. Texts and Translations
    2. Commentaries
    3. Introductory Sources
    4. Ethical Theory
    5. Human Freedom and Selfhood
    6. Post-Plotinian Virtue Ethics

1. Philosophical Background and Reception

Plato’s and Aristotle’s virtue ethics are found in the background of Plotinus’ ethical theory. Plato’s ethical teaching (particularly in the Symposium, the Phaedo, the Phaedrus, and the Republic) underlies Plotinus’ conception of virtue as an intrinsic quality of human character and his conception of excellence that derives from the soul’s purity in the contemplation of the Forms. Aristotle’s ethical theory (Nicomachean Ethics) influences Plotinus, particularly Aristotle’s recognition of the gods as purely intelligible beings that do not possess virtues (NE 1178), but even more importantly Aristotle’s distinction between intellectual and ethical virtues (NE 1139).

Plotinus’ virtue ethics, mainly expounded in Ennead I 2, has been used by later Neoplatonists such as Porphyry, Iamblichus, Macrobius, and Olympiodorus (O’Meara 2003). Plotinus’ treatment of virtues is also found in the ethical theories of Arabic Neoplatonists and in Neoplatonic commentaries on the Aristotelian ethics (Smith 2004). Plotinus’ analysis of the four Platonic cardinal virtues in Ennead I 2 has been systematically treated by Porphyry in his Sententiae ad intelligibilia ducentes (section 32).

Porphyry discussed a fourfold scale of virtues in correspondence to the area where the virtues apply: (1) political virtues correspond to the practical and civic sphere, (2) purificatory virtues correspond to soul’s initial purification and ascent from the body, (3) theoretical virtues correspond to soul’s contemplation of the Forms, and (4) the paradigmatic, or exemplary virtues, correspond directly to the Forms and the divine Nous.

Porphyry, in his biographical work on Plotinus, On the Life of Plotinus and the Order of his Books, classifies the nine treatises of the first Ennead as his master’s work that is “mainly concerned with morals” (Life 24.17-18). Plotinus’ ethical theory is mainly discussed in Enneads I 2 [19] On Virtues, I 4 [46] On Well-Being, and Ennead I 3 [20] On Dialectic. Whereas Ennead I 2 offers an analysis of Plato’s four cardinal virtues, (1) courage (andreia), (2) self-control (sophrosyne), (3) justice (dikaiosyne) and (4) wisdom (phronesis / sophia), Ennead I 4 focuses on the excellence of the wise man (spoudaios) and the nature of well-being (eudaimonia; see also Ennead I 5 [36] On Whether Well-Being Increases with Time). Furthermore, Ennead I 3 follows Ennead I 2 chronologically and supplements Plotinus’ ethical analysis of the virtues with special reference to the advantages of the Platonic dialectic in contrast to the Stoic and the Aristotelian logic. Plotinus highlights the significance of Plato’s dialectic with respect to the soul’s intellectual purification and its aim of noetic ascent.

In addition, Plotinus’ discussions on the nature of evil in Ennead I 8; on the metaphysics of beauty in Ennead I 6; on the philosophical comparison between the nature of human being and that of other living beings in Ennead I 1; and the very short treatise Ennead I 7 on the Platonic Good qua the primal good of all aspiration in life, include significant elements of his ethical theory. Finally, important implications of Plotinus’ virtue ethics are highlighted in his theory of transmigration (see particularly Enneads I 1.11; III 2.15), his criticism of Gnosticism in Ennead II 9 [33] Against the Gnostics, as well as his conception of human freedom and self-determination, particularly maintained in Ennead VI 8 [39] On Free Will and the Will of the One. Plotinus’ virtue ethics is further developed and systematized by later Neoplatonists such as Porphyry (Sententiae ad intelligibilia ducentes 32) and Iamblichus (On Virtues).

Porphyry’s pupil Iamblichus, in his work On Virtues (the work is not preserved today), further developed the scale of virtues of Sententiae 32 (Finamore 2012; O’Meara 2003). Iamblichus maintained a sevenfold scale of virtues. By tracing back to the Middle Platonists (Baltzly 2004), he added two more groups of virtues below the political and one group of virtues at the highest level above the paradigmatic virtues. Iamblichus’ classification developed from low to high with the following virtues: (1) natural, (2) ethical, (3) civic, (4) purifying, (5) contemplative, (6) paradigmatic, and (7) hieratic. Iamblichus’ scale of virtues testifies to the importance of theurgy for later Neoplatonists and influences St. Augustine’s early thought (Kalligas 2014).

2. Becoming like God

Plotinus’ treatise Ennead I 2 On Virtues opens with a question on the soul’s escape from evils in the earthly world: “Since it is here that evils are, and they must necessarily haunt this region, and the soul wants to escape from evils, we must escape from here. What, then, is this escape?” (I 2.1.1-3) For Plotinus, the answer should be found in Plato’s Theaetetus 176a-b, “to become as like God as possible”, and soul’s likeness to God should be related to the virtue of wisdom qua the highest ruling principle of the universe and the world soul (I 2.1.3-10). The passage from Plato’s Theaetetus marks Plotinus’ exposition of virtues in Ennead I 2 (2.1-10; 3.1 ff.; 5.1-2; 6.1-11; 7.27-30; see also Ennead I 8.6-7 on the necessity of evils) and Armstrong, in his introductory note on the treatise, emphatically regards Ennead I 2 as a commentary on the passage from the Theaetetus. In addition to Plato’s reference, Aristotelian and Stoic elements have been identified in Plotinus’ theory of virtue as well as some Neo-Pythagorean influences. Plotinus’ approach to “becoming like god” is discussed by later Neoplatonists such as Porphyry, Iamblichus, and Proclus (Baltzly 2004).

A careful reading of the first lines of Ennead I 2 shows a divergence from Plato’s assertion in the Theaetetus 176a-b that likeness to god is achievable up to a certain point. This difference seems to be not without purpose for the Neoplatonist and explains Plotinus’ interpretation of likeness to god in the same way as the Middle Platonist Eudorus of Alexandria interpreted godlikeness “in virtue of that element in us which is capable of this”, and signifies the purpose of human life in Pythagoras, Socrates, and Plato (Dillon 1983). However, despite the omission of Plato’s qualification, Plotinus appropriately conceives the meaning of the Theaetetus passage as it is related to the soul’s purification and the divine excellence of the virtuous life (Kalligas 1984; see particularly I 2.6.9-10, 7.24 and II 9.9.50-1). Plotinus’ metaphysics of power justifies the possibility of the virtuous soul ascending to the higher intelligible realm without inherent limitations or qualifications. Plotinus actually diminishes Plato’s qualification of the Theaetetus since the soul’s noetic and complete likeness to the god is possible. Plotinus puts an emphasis on the intelligible purity of the soul and the power of virtue to lead the human mind to noetic ascent and the higher intelligible principles; our virtues are intelligible powers in the soul and derive from the divine Intellect, so the soul is able to return to the intelligible realm of the Forms and become like the divine Nous. The goal of the virtuous and wise person is to become godlike (II 9.15.40). The wise person is likened through virtue to the self-sufficient, perfect, and pure life of the intelligible world.

3. Disposition and Intellectual Qualities

Aristotle in his Nicomachean Ethics (1106b36-1107a1) defines virtue as a “disposition” (hexis) of the soul that is concerned with deliberate choice. The disposition of the soul underlies moral action in terms of moderation (mesotes), that is, the appropriate mean between the two extremes of deficiency and excess (1107a2-6). Aristotle emphasized the habitual aspect of disposition both in terms of ethical exercise (praxis) and the desired excellence of the moral agent.

Plotinus, in the fifth chapter of Ennead VI 8 On Free Will and the Will of the One, defines virtue as a hexis of the soul, but not in habitual terms. The Neoplatonist stresses the intellectual qualities of virtue not in terms of ethical practice but mainly in terms of contemplation. Virtue is a hexis not in the dispositional sense of ethical praxis but as an active state of the soul, a contemplative disposition that “intellectualises the soul” beyond ethical practice: “being in our power does not belong to the realm of action but in intellect at rest from activity” (VI 8.5.35-36). Plotinus underlines a self-directed perspective of the moral soul’s power of virtue. Virtue intellectualizes the soul in its internal contemplation of Nous and not in external considerations. The Plotinian hexis is not found in the moderation of praxis but in the soul’s conscious apprehension of being, and particularly in the middle region of the soul, in between the higher intelligible and the lower perceptible regions of the psyche. The Plotinian virtue is an active hexis that consciously directs the soul in the contemplation of the intelligible world of the Forms.

4. The Cardinal Virtues

In Ennead II 9, Plotinus acknowledges the inherent value of virtue: “if we talk about God without true virtue, God is only a name” (15.40). For Plotinus, every virtue is purification, and the purified soul becomes both form and forming principle. The virtuous soul noetically ascends without body to the divine realm of Nous, the world of true goodness, intelligence, and beauty (I 6.6). In Ennead I 6 On Beauty, Plotinus particularly refers to the four cardinal virtues found in Plato’s teaching, namely wisdom, justice, self-control, and courage: “For, as was said in old times, self-control, and courage and every virtue, is a purification, and so is even wisdom itself.… For what can true self-control be except not keeping company with bodily pleasures, but avoiding them as impure and belonging to something impure? Courage, too, is not being afraid of death. And death is the separation of body and soul; and a man does not fear this if he welcomes the prospect of being alone. Again, greatness of soul is despising the things here: and wisdom is an intellectual activity which turns away from the things below and leads the soul to those above” (I 6.6.1-13).

In Ennead I 2, Plotinus focuses on the four cardinal virtues, emphasizing their intellectual and contemplative nature. However, as Smith claims, Plotinus’ aim is not to suggest a fixed scale of virtues but an ascending schema of levels of the cardinal virtues in relation to the different levels and aspects of humanity’s ethical and intellectual life (Smith 2004). Plotinus approaches the cardinal virtues from the following aspects: (P1) civic life (I 2.1.17-21), (P2) the purification of the soul in relation to the body (3.15-9), (P3) soul’s contemplation of the higher intelligible world (6.12-27), and (P4) the intelligible purity and goodness of the Forms (7.3-6). Despite the fact that Plotinus’ treatment of the four cardinal virtues in Ennead I 2 is not necessarily systematic, the following accounts are identified in relation to the four levels above.

a. Courage

In civic life courage deals with soul’s emotions (P1); in the process of purification courage is characterized by soul’s fearlessness to depart from the body (P2); in a contemplative state, courage is the virtue that frees the soul from lower affections in likeness of soul’s higher intelligible part (P3); and at the highest intelligible level of Nous, courage is identified with “immateriality and abiding pure by itself” (P4).

b. Self-control

The agreement and harmony of the soul’s passion and reason underlies self-control as civic virtue (P1). Whereas at a purificatory level self-control means that the soul is not sharing bodily experience (P2), at the level of contemplation it means the soul’s inward turning to Intellect (P3), and at the highest intelligible level of the Forms, self-control is identified with self-concentration (P4).

c. Justice

In civic life, justice is defined in Plato’s terms as the virtue that facilitates the agreement between the different parts of the soul in “minding their own business where ruling and being ruled are concerned” (P1), while in purificatory terms, justice is purely ruled by reason and intellect “without opposition” (P2). However, at a contemplative level, justice is not found in the plurality of the soul’s parts but in the disposition of the unity to itself and so “the higher justice in the soul is its activity towards Intellect” (P3). At this pure intelligible level, justice entails the soul’s proper and paradigmatic activity in minding its own business beyond any plurality (P4).

d. Wisdom

As a civic virtue, phronesis, practical wisdom, is related to the discursive reason of the soul (P1), while as a purificatory virtue, it refers to the soul “acting alone” outside the experience of the body and mere opinion (P2). In a contemplative person, practical wisdom and theoretical wisdom (sophia) involve the contemplation of the intelligibles, that is, what the divine Intellect contains and possesses in immediate contact. Plotinus discriminates between the wisdom of Intellect and that of the soul; wisdom, as with all virtues, is not a virtue in Nous but manifests only in the soul. Wisdom in Intellect is its pure actuality (=intelligence) and what it really is (=being); in the soul, wisdom derives from Nous but is directed to other things (P3), and so the paradigm of wisdom is related to pure intelligence and knowledge manifested in soul’s direct sight towards the hypostasis of Intellect (P4).

In Ennead I 3, Plotinus further distinguishes between higher virtues and lower virtues. Plotinus maintains that the higher virtues are interrelated and correspond to intelligible Forms, which are not virtues themselves, but contribute to the noetic ascent as well as the practical and theoretical excellence of the soul. Moral philosophy is not only about intellectual virtues but also deals with the production of the appropriate dispositions and exercises (I 3.6). However, the higher virtues contribute to the purification of the soul and “moral philosophy derives from dialectic in its contemplative side.” Dialectic is the purest part of intelligence and wisdom that guides the soul to knowledge and apprehension in correct order and reason. As Plotinus maintains in Ennead I 2, if all virtues are purifications, the process of purification produces and perfects all virtues, and so the one who possesses the greater intellectual virtues must necessarily have the lesser civic virtues. The converse, however, does not hold: the one who possesses the lesser virtues does not necessarily have the greater. The intellectual virtues complete the lower virtues and not vice versa (I 3.6.5-7). For Plotinus, well-being (eudaimonia) is achieved only with the excellence of the higher virtues that lead to the intelligible world.

5. Well-being

In Ennead I 4 On Well-Being, Plotinus criticizes Aristotle’s primal relation of well-being (eudaimonia) with practical accomplishment, proper functions, and the achievement of natural ends (chapter 1). Plotinus is also skeptical of the Stoics and the Epicureans (chapter 2); eudaimonia should be related neither to the Stoic ‘extirpation of passion’ (apatheia) and the “study of primary natural needs which perfects reason”, nor to the pleasure of an unworried state of mind (ataraxia) supported by the Epicureans. As a devoted Platonist, Plotinus returns for an answer to the question of eudaimonia to the original teaching of Plato. Plotinus follows Plato’s metaphysical perspective on eudaimonia in relation to the contemplation of the Forms. For Plotinus, well-being (eudaimonia) is not achieved primarily in ethical practice (praxis), as Aristotle suggested, but mainly through the noetic ascent of the soul and in contemplation (theoria) of true being in the intelligible realm of Nous. The wise person (spoudaios) has to become godlike (see Ennead I 2.1) to be eudaimon, that is, to live the perfect life of Intellect, the life of the higher soul purely contemplating the eternal reality of Nous. The real virtue of the wise is to be aware of the perfection, self-sufficiency, and completeness of Intellect, the intelligible reality where the soul is truly purified beyond discursive reason and consciousness (I 4.3.34-41).

The excellence of virtue is achieved not by having intellect but by being intellect; the perfectly virtuous soul of the wise is self-sufficient, ascends purified to the intelligible world and so likens itself to Intellect’s divine and eternal eudaimonia (I 4.4). However, Plotinus clarifies that the meaning of likeness (homoiosis) in the wise and good person is not the likeness of two pictures in perceptible terms but the intelligible likeness of the soul to the divine model of Nous different from our perceptible self (I 2.7.28-31). Hence, the soul of the wise man, purely concentrated on the divine realm, is neither affected by the sufferings or misfortunes of the animated body (I 4.5-8) nor in any way influenced by the lower life of the material world (I 4.9). The spoudaios experiences a life in noetic purity guided only by the higher intelligible part, and any disturbances from the lower perceptible part hardly trouble the wise person (I 2.5.22 ff.). Any affections from the perceptible part of the soul are dim echoes for the mind of the wise man just because of the affinity between the two parts within the soul. The lower part is always benefited within the soul of the spoudaios “just as a man living next door to a sage would profit by the wise man’s neighborhood, either by becoming like him, or by regarding him with such respect as not to dare to do anything of which the good man would not approve” (I 2.5.25-28). However, the wise person is not careless about the perceptible body despite the fact that bodily goods will not contribute to eudaimonia; the wise person has to give to the body what the body really needs (I 4.11-16). The concern of the wise person is “not to be out of sin, but to be God” (I 2.6.2-3), and so the virtue of the wise person that leads to true well-being is to exercise the higher activity of the soul’s intelligible self; true arete frees the spoudaios and leads the soul to the ultimate goal to become like the higher intelligible and eternal life of the divine Nous (I 2.6-7).

The wise man is also not inconsiderate of others (I 4.15.21-25) but does not belong to the mass of people (II 9.9.6-11). He chooses to be acquainted with virtuous friends and he is the paradigm of excellence and contemplative life. As Plotinus notes, the spoudaios is not “unfriendly” (aphilos) or careless about others, but he cares about his own soul as he cares about his own affairs and the excellence of his companions. The wise man manifests intelligible unity and purity by being an earthly paradigm of the divine Nous, and so “renders to his friends all that he renders to himself, and so will be the best of friends due to his union with Nous” (I 4.15.21-25). The wise man shares his eudaimonia by being present at the same time to his own self and the others (see Porphyry, Life 8.19-23), and lives a friendship (philia) in the sensible world that imitates the friendship of the universal order and the higher divine realm of Nous and its unity with its Forms (VI 7.14). The power of philia traverses all the hypostases of being as it is identified with and derives from the supreme unity of the One (V 1.9.1-5).

6. Human Freedom and Self-Determination

Plotinus’ theory of virtue ethics is closely related to human freedom and self-determination. In the beginning of Ennead VI 8 On Free Will and the Will of the One Plotinus wonders, “Is there anything in our power?” In his analysis a distinction is offered between internal determinations (that is, what depends on us) and external determinations (that is, what is not dependent on us) (Eliasson 2008; Remes 2006). An action is voluntary and depends on us not only if we are free and we are not obliged to act, but also if we are not following the path of reason without critical evaluation. For Plotinus, an action depends purely on us only if the soul defines its own self as a self-determined principle (VI 8.3.20-26).

Plotinus’ notion of self-determination is related to the concept of “what depends on us” (eph’ hemin) as having the connotation of a faculty describing either the quality of action or the agent himself (Leroux 1990). Furthermore, a distinction has been suggested between an inclusive notion of “what depends on us” (that is, the moral action has its origins in the agent) and an exclusive notion (that is, the moral action has its origins in rational decisions and judgments not necessarily determined by the agent) (Eliasson 2008). For Plotinus, voluntariness and awareness of an action are not sufficient for the action to depend on us; it must also arise from our wish as it comes through the contemplation of virtue.

Furthermore, for Plotinus, moral actions that are determined by external factors are related to passive dispositions, but true virtue should be based on the internal state of the soul in relation to intellect (II 5.2.34-35). Moral agency reveals itself not primarily in ethical practice but in the excellence of the inner self in active contemplation of the Forms (II 3.9-10). The virtuous soul is purely dependent on its own self without considering external conditions or determinations; the free soul is self-determined only by internal conditions (III 1; see also VI 8.3.20-26) and acts autonomously in self-determination regardless of external parameters or situational conditions (VI 8.6.19-23). The virtuous action is underlined by three conditions: firstly, an action must be voluntary (that is, we should not be forced to act); secondly, it must be conscious (that is, we should have knowledge of what we are doing); and, thirdly, it must be self-determined (that is, we should be masters of ourselves) (Eliasson 2008). Considering a self-directed aspect of moral agency, Plotinus moves the emphasis from the outward activity of ethical practice (that is, Aristotle’s primary concern in relation to virtue ethics) to the inner activity of the contemplating soul. A free and noble action is not justified or based mainly on practice (praxis), but on the intellectual virtues of the soul as qualities of its intelligible self prior to moral action that is found in the perceptible realm (VI 8.6.20-22). Virtue is an active disposition of the soul in terms of contemplation (theoria) that ends in an established state of mind internally tuned and moderated in accordance with the perfection of the intelligible world. In light of this approach, well-being is not found in actions but in the inner contemplation of the soul. As Plotinus puts it, “To place eudaimonia in actions is to locate it in something outside virtue and the soul; the activity of the soul lies in thought, and action of this kind within itself; and this is the state of eudaimonia” (I 5.10.20-23). True happiness of a free and moral soul is not established in external situations and activities but in internal determinations and intellectual virtues (I 5.1).

Moreover, whereas Aristotle conceives of human freedom as related to the problem of choice and contingency, Plotinus conceives of human freedom in relation to the freedom of the self (Leroux 1996) and the virtuous life of the wise person, without necessarily being defined by or dependent on voluntary choice (Ennead VI 8.1-7). Plotinus emphatically argues that no outward actions are purely dependent on us: “in practical actions self-determination and being in our power is not referred to practice and outward activity but to the inner activity of virtue itself, that is, its thought and contemplation” (VI 8.6.20-22). What depends on us can be found in the realm of intellect “at rest from actions” (VI 8.5.35-37). Only virtue as an intellectual quality purifies and frees the soul, and as Plotinus states by following Plato’s expression in the Myth of Er in the Republic (617e3), virtue has “no master” as far as it intellectualizes the soul beyond any external determination (VI 8.5.30-37).

7. Soul and Transmigration

Plotinus’ virtue ethics is a self-directed ethical theory that is related to his psychology and metaphysics. His ethical theory follows his theory of the psyche and its dual-aspect nature. The higher and lower virtues correspond to the higher intelligible and sense-perceptive parts of the soul (Ennead I 3). Whereas the lower virtues are related to passions and the lower sense-perceptive part of the soul, the higher virtues are related to wisdom and dialectic and refer to the higher intelligible part of the soul (I 3.6). Plotinus aims to stress the superiority of the soul’s higher intelligible part, which is its inner self and is contrasted with the soul’s sufferings and passions of the lower sense-perceptive part, which is related to the outer self. He maintains that tragic and cruel moments in life should not be taken seriously but should be regarded as incidents in the plot of a play: “we should be spectators of murders, and all deaths, and takings and sacking of cities, as if they were on the stages of theaters, all changes of scenery and costume and acted wailings and weepings” (III 2.15.43-47). It is not the soul’s inner self that participates in the “game of life” but “the outside shadow of man” (47-50). The higher soul remains unaffected by bodily conditions and so “the outer man has to take off the play-costume in which he is dressed” (55-57). The inner man is clear from affections, and this is our true self that possesses the virtues that belong to the realm of intellect and “have their seat actually in the separate soul, separate and separable even while it is still here below” (I 1.10.7-10).

Plotinus’ dual-aspect theory of the soul is related to his account of transmigration and its ethical implications. It is noteworthy that Plotinus never uses the term metempsychosis (reincarnation) but only metensomatosis (transmigration). Plotinus adopts a monistic view of transmigration. A monistic approach to transmigration agrees with the ontological unity and homogeneity of the soul and the non-eschatological aspect of human destiny. The transmigration of the soul should be conceived of as illumination of the living bodies. The soul is not literally transmigrated, since the bodies are just shadows and images of the higher soul. The bodies are projections of the soul and so transmigration is the illumination of the light of the soul transmitted into different bodily forms and without affecting the unity of the soul.

Plotinus stresses the ethical implications of transmigration originally found in the Platonic dialogues (Phaedo 81-82; Republic X. 620; Timaeus 91-92). However, in light of the soul’s ontological unity, homogeneity, and monism, Plotinus aims to reconcile some dualistic accounts of transmigration found in Plato, the early Pythagoreans, and some Presocratics such as Heraclitus and Empedocles (VI 4.16.4-7). His intention is to abolish the barriers between different psychic classes and hierarchies. Since the soul is one, homogenous, intelligible substance of life, all transmigrations into various life forms are possible (humans, animals, plants) and by extension, all animated bodies are rational and immortal (IV 7.14.1-8; see also VI 4.16; IV 8.1; III 4.2.16-30).

Plotinus’ reconsideration of Plato’s accounts of transmigration also has an ethical side. The logos of the soul manifests at different facets of life and being: the man who exercised political virtue becomes a man again, while the one who is not active in community becomes a bee; the man who loved music a song-bird; kings who ruled stupidly into eagles; those who lived with the senses animals; even plants for those who lived with the desire of flesh coupled with dullness of perception (III 4.2). Nevertheless, for Plotinus, whereas the transmigration of human souls into animal bodies is possible, the soul’s destiny has nothing to do with transmigration (I 1.11.8-15). It is not physical condition that affects the soul but the moral quality of the soul that affects the physical order, both of individual bodies and the cosmos. Plotinus denies an eschatological approach to transmigration for the soul’s higher intelligible part. As an intelligible entity, the soul is pure and immortal logos and thus sinless in its very nature (I 1.12.1-4). Since the soul is sinless it cannot be judged or punished in after-life nor transmigrated by passing from body to body. The higher part of the soul never descends completely to the lower realm of the sensible world (IV 8.8), while the lower part is a shadow of the higher part, and the descent of the soul is an inclination of the intelligible part in the realm of becoming (I 1.12). The dual-aspect nature of the soul is vividly described in Ennead I 1, where Plotinus uses the dual image of the noble and virtuous hero Heracles who “had this active virtue and in view of his noble character was deemed worthy to be called a god—because he was an active and not a contemplative person (in which case he would be altogether in that intelligible world), he is above, but there is also still a part of him below” (35-39).

Whereas Plotinus accepted transmigration of the soul in different forms and in terms of the soul’s purity and immortality while denying the soul’s bodily affection and sin, later Neoplatonists interpreted transmigration in different ethical terms: the evil man becomes a beast-like character and the sinful soul is temporarily associated with an animal body or form. This is actually the central point of controversy between Plotinian and post-Plotinian accounts of transmigration. Whereas, for Plotinus, the ethics of transmigration is based on the non-hierarchical monism and homogenous, intelligible nature of the soul, for later Neoplatonists, transmigration is denied in terms of a hierarchical ontology in which the human soul possesses a higher ranking of existence in comparison to the other animals. On the one hand, Porphyry seems to follow Plotinus’ transmigration of human soul into animal bodies as far as both human soul and animal souls are rational, deriving from the same intelligible source of the soul as second hypostasis of being (Smith 1987; Wallis 1995). On the other hand, Iamblichus and Proclus rejected human transmigration to animals as far as human and animal souls are essentially different and even denied that animals have souls at all in the strict sense of the term (Wallis 1995).

8. Criticism of the Gnostics

In Ennead II 9 Against the Gnostics, Plotinus aims to defend Platonism against the immoral, pessimistic, and irrational doctrines of those who misinterpret Plato’s teaching and attribute evilness and darkness to the material universe (Puech 1960). Plotinus’ criticism is directed to a group of Gnostics who argued that knowledge should not be considered as a product of philosophical reasoning but of divine revelation (Wallis 1995). The Gnostics generally maintained that salvation is possible only through ‘knowledge’; gnosis is the only presupposition for the soul to find the pleroma, the spirit of the supreme God beyond this lesser and evil material universe. In order to emphasize the Gnostics’ irrational doctrines and perhaps their hypocritical and hyperbolic attitude, Plotinus describes them as speaking about Plato’s theories with “raving words” (II 9.18.20), like the Sibyl’s delirious speech, as Heraclitus vividly expresses in fr. 92. In contrast to the Gnostics and other misinterpretations of Plato, Plotinus maintains that the material universe is the most perfect possible image of the intelligible world; the material world reflects in the best possible way the beauty and goodness of the divine realm.

Plotinus evaluates the Gnostic conceptions of the world, history, and ethics in three corresponding forms of alienation: firstly, alienation from the world, secondly, alienation from history, and thirdly, alienation from society (Kalligas 1997). Moreover, Plotinus’ objections are directed to the Gnostic doctrines of the denial of the divinity of the World-Soul and the heavenly bodies, the rejection of salvation through true virtue and wisdom, the non-philosophical and irrational support of their arguments, and the arrogant view of themselves as saved by nature, that is, as privileged beings in whom alone God is interested (see Armstrong’s introductory note on Ennead II 9; cf. also Wallis 1995). For Plotinus, the Gnostics are deceived when they believe that the universe is created by a fallen soul (II 9.4-5) and when they speak of the divine creator as an ignorant or evil Demiurge who produced an imperfect material world (II 9.6). They are mistaken when they regard the creative activities of the Demiurge as the result of a spiritual fall within the intelligible hierarchy (II 9.10-12); they are melodramatic when they speak about the influence of the cosmic spheres (II 9.13); they are in the wrong direction when they lay claim to the higher powers of magic (II 9.14); and they are completely misled when they believe that immortality is achievable through the complete rejection of and abstention from the material world.

Ennead II 9.15-18 includes an important account of Plotinus’ ethical criticism of the Gnostic movement. However, it is not only concerned with a polemic against Gnosticism but also with a defence of Platonism against the immoral, irrational, and pessimistic doctrines of negative otherworldliness. Plotinus draws a line between virtue, beauty, and truth, emphasizing Plato’s teaching of ethics, aesthetics, and metaphysics. Plotinus’ criticism of Gnosticism is an abridgment of his virtue ethics where the meaning of arete is justified for its importance for the soul’s purification, unity, and self-improvement. Plotinus shows his ethical standpoint on the value of human life. The life of the wise and virtuous soul is not to abandon the material world in a disinterested way of life, but to understand through virtue the divine origins of the soul and recognize the beauty and goodness of the intelligible world in the soul’s self-perfection.

Particularly, in chapter 15, Plotinus states that “we must be particularly careful and not to let escape us” what the immoral arguments of the Gnostics do to the souls (15.1-3). He distinguishes between two theoretical directions about the “end” (telos) of life (15.4-8): whereas, for the first, the end is the pleasure (hedone) of the body, for the second, the end is nobility (kalon) and virtue (arete).  Plotinus further divides the first theoretical direction into two schools of thought: (1) Epicurus and the Epicureans, who abolish divine providence and extol pleasure and enjoyment (8-22); (2) the Gnostics, who are pessimistic about the material world and promote an ascetic life without virtue and goodness (22-40). Prima facie the classification of the Epicureans and the Gnostics into the same category is puzzling: whereas the Epicureans were known for their hedonistic views, the Gnostics were known for their ascetic and detached views. Probably, Plotinus’ aim is to offer a philosophical comparison in a dialectic form in order to answer two dissimilar schools of thought, both of which, however, omit virtue ethics and divine goodness. According to another perspective, Plotinus perhaps considers a common alienated attitude both in the Epicurean life of pleasure and in the Gnostic life of asceticism.

For Plotinus, the Gnostics are immoral for neglecting the role of virtue in human life and noetic ascent. The Gnostics omit to define virtue, and they fail to explain how to attain the higher world without virtue. No treatise is devoted to virtue, and their treatment of virtue is completely absent from their doctrines: “they do not tell us what kind of thing virtue is, nor how many parts it has, nor about all the many noble studies of the subject to be found in the treatises of the ancients, nor from what virtue results and how it is to be attained, nor how the soul is taking care of itself, nor how it is purified” (II 9.15.30-33). Plotinus argues that “looking to god” without knowing how to look is insufficient because only virtue leads the soul to the goal of divine aspiration (15.33-40).

Plotinus further relates virtue to beauty and the divine (II 9.16-18). Perceptible beauty is a reflection of the intelligible beauty, and the wise soul is able to recognize the beauty and goodness of the intelligible world through an inner sight to the perceptible world (II 9.16.39-48). Plotinus justifies the difference between Platonic and Gnostic otherworldliness. Whereas Plato’s otherworldliness accepts the beauty and goodness of the material world (in Plato’s Timaeus), Gnostic otherworldliness denies the beauty of the universe and the divine goodness of the Demiurge (II 9.17). Plotinus defends Plato and the beauty of the earthly world by using the metaphor of two people living in the same fine house, “one of whom reviles the structure and the builder, but stays there none the less, while the other does not revile, but says the builder has built it with the utmost skill, and waits for the time to come in which he will go away, when he will not need a house any longer” (II 9.18.3-9; trans. Armstrong).

Virtue forces the soul to recognize both itself and its divine origins and to guard itself against the strokes of fortune (18.26-30). The higher soul of the universe is not troubled; “it has nothing that it can be troubled by. We, while we are here, can already repel the strokes of fortune by virtue and make some of them become less by greatness of mind and others not even troubles because of our strength”; when our soul contemplates the completely untroubled state of the world soul, the universe and the stars, we become our true selves, well prepared for any possible misfortune (30-35) (see also Ennead I 4.8).

9. Virtue Ethics after Plotinus

a. Porphyry

In Sententiae ad intelligibilia ducentes (section 32) Porphyry systematized Plotinus’ treatment of the four cardinal virtues set out in Ennead I 2 (O’Meara 2003; Kalligas 2014). Porphyry stressed the importance of purification in virtue ethics and particularly the significance of purification in self-knowledge and the care of soul. He underlined the necessity of detachment from the soul’s bodily pleasure and irrational passions, its disregard of pains produced by sense-objects, and any kind of inclination on the part of the soul to the corporeal world. For Porphyry, the virtuous soul achieves impassibility by completely removing bodily dispositions (Sententiae 32.89-140).

Porphyry suggested a specific scale of the cardinal virtues following an ascending exposition of the soul’s need for purification: from the lower civil and practical life of the earthly realm to the higher paradigmatic life of the intelligible Forms. The scale of virtues begins (P1) from the level of political virtues and civic life, continues (P2) to the level of purificatory virtues and soul’s primal noetic ascent, (P3) to the theoretical virtues of the contemplative mind, and (P4) to the exemplary or paradigmatic virtues of the intelligible world. The cardinal virtues of courage, self-control, justice, and wisdom apply throughout the four levels or states of being. Whereas the object of the “civic virtues” (P1) is to moderate passions and to conform conduct to the laws of human nature, the “purificatory virtues” (P2) detach the soul completely from passions. The object of the “contemplative virtues” (P3) is to apply to the soul pure intellectual activities, without any concern about passions, while the paradigmatic virtues (P4) are the exemplars and archetypes of all other virtues (Sententiae 32.83-89).

Porphyry begins section 32 of the Sententiae with the application of virtues to different states of human experience and focuses on different expressions of virtues with respect to different levels of purification: between the virtues of the citizen, the virtues of the soul that attempts to rise to contemplation, the virtues of the soul that purely contemplates intelligence, and finally the mind that possesses pure intelligence and that is completely separated from the bodily level of the soul (1-5). As Porphyry summarizes, “the practical virtues make man virtuous (spoudaios); the purificatory virtues make man divine (daimonios), or make the good man a benign spirit (daimon agathos); the one who acts only in accordance to contemplative virtues becomes a god (theos); while the one who acts in accordance with the paradigmatic virtues is the father of gods (theon pater)” (89-94 translation Guthrie, 1988, modified).

Furthermore, Porphyry places an emphasis on the political virtues and their civic importance as the first stage of excellence in terms of moderation of the passions (metriopatheia) and appropriate moral duty underlined by pure reason. For Porphyry, political virtues contribute to the harmonious civic life with fellow human beings and “mutually unite all citizens”. The political virtues are human virtues and a necessary precondition to the noetic ascent of the soul to the higher realms. There is a necessity to exercise humanity in the self before its application to fellow-humans or the purification at higher levels of being (Sententiae 32, 6-14).

For Porphyry, the contemplative man is detached from the political sphere and the virtues possessed are called “purifications” since they aim at higher realities and genuine existences. The soul of the contemplative man is raised above the passions of the earthly life to the intelligible realm and in likeness to the divine. (15-33). However, as Porphyry clarifies, there is “a difference between purifying oneself, and being pure” (33-35). The role of purificatory virtues is twofold: they both purify the soul and coexist as qualities in the purified soul. The importance of the purificatory virtues lies in their power to release the soul completely from any form of evil, either the one related to lower things or the one related to passions. The political virtues release the soul only from passions (35-50).

At a level higher than the purificatory virtues, Porphyry places the contemplative virtues, “the virtues of the soul that contemplates intelligence”. The purified soul directs its activities to the higher intelligible realm, and the four cardinal virtues manifest different kinds of qualities of the soul in constant contemplation of the intelligible beings (51-63). Finally, Porphyry suggests a fourth kind of virtues, the paradigmatic virtues, which belong to the realm of the Forms and reside within the higher Nous.

At the intelligible level of the Forms, the virtues are identified with specific intelligibles (noeta). Porphyry’s claim has been considered a departure from Plotinus’ position in Ennead I 2 (by following Aristotle) that virtues should not be seen as archetypes in the intelligible world of the Forms. However, Porphyry follows Plotinus in claiming that one who possesses the superior virtues also possesses the lower virtues, but not vice versa. In fact, one who possesses the higher virtues is not interested in practicing the lower virtues. Furthermore, Porphyry underlines the intrinsic value of virtues by upgrading their ontological status, while Plotinus highlights their psychological value in the soul’s noetic purification.

For Porphyry, the superiority of the paradigmatic virtues, compared with the virtues of the soul, lies in the fact that the virtues of the soul are images of the “archetypal” paradigmatic virtues, and so they subsist in the divine Nous simultaneously (63-70). As Porphyry synopsizes: “1, the paradigmatic virtues, characteristic of intelligence, and of the being or nature to which they belong; 2, the virtues of the soul turned towards intelligence, and filled with her contemplation; 3, the virtues of the soul that purifies herself, or which has purified herself from the brutal passions characteristic of the body; 4, the virtues that adorn the man by restraining within narrow limits the action of the irrational part, and by moderating the passions” (70-78; translation Guthrie 1988, modified).

b. Iamblichus

Iamblichus, in his work On Virtues (not preserved today), develops the scale of virtues of Porphyry’s Sententiae 32 (O’Meara 2003; Kalligas 2014; Finamore 2012). Iamblichus added two more virtues below the political: the natural virtues (at the lowest level) and the ethical virtues (below the political virtues), as well as the hieratic virtues at the highest level of the scale. Iamblichus’ scale of virtues, following an ascending order is: (1) natural; (2) ethical; (3) civic; (4) purifying; (5) contemplative; (6) paradigmatic and (7) hieratic (apud Damascius, In Phaedo I.138-144).

Iamblichus suggested a level below the civic or political virtues in order to underline the importance of virtues and their cultivation both in children and certain animals (O’Meara 2003). He emphasized the importance of “habituation” (ethismos) at the level of ethical virtues and highlighted the classical association between virtue and habituation (hexis) found in Plato and Aristotle. However, Iamblichus, closely following Plato’s teaching, reconsidered the educational importance of the ethical virtues in molding and bringing up children. A virtue ethics education is presupposed for political virtues, which entail maturity and rationality (O’Meara 2003).

In Iamblichus’ canon, a selection (or curriculum) of twelve Platonic dialogues used to initiate the student into Plato’s original teaching, the scale of virtues is important for the soul’s purification and its progressive noetic ascent from the nature of the self (Alcibiades I) to a complete treatment of the divine nature (Parmenides). The virtues follow the purpose of the Platonic dialogues and the order of being to guide the soul’s likeness to god. For instance, the Gorgias, the second dialogue in the list, involves civic virtues, while the Phaedo, the third dialogue in the list, involves purificatory virtues. Moreover, whereas the Cratylus and the Theaetetus, fourth and fifth in the list, refer to contemplative virtues and emphasize logic, the Sophist and the Statesman, sixth and seventh in the list, refer again to contemplative virtues but with an emphasis on nature and the perceptible world. The Phaedrus, the Symposium, and the Philebus (eighth, ninth, and tenth respectively in the list of the curriculum) are related to contemplative virtues with theological purposes and the nature of the Good. Finally, the Timaeus, eleventh in the list, entails physical education with reference to the nature of the cosmos. It is noteworthy that the Timaeus and the Parmenides are considered “perfect” dialogues, which sum up the previous ten.

Iamblichus’ detailed development of the scale of virtues offers a comprehensive and insightful analysis on human morality, from the natural level of being to the highest form of divination. As in Plotinus and Porphyry, the ascending scale of virtues follows the noetic ascent and the progressive purification of the virtuous soul that achieves likeness with the divine. As O’Meara (2003) maintains, “the Iamblichean scale of virtues remains a method of progressive divinization, a process of complexity worthy of the metaphysical world-view of the later Neoplatonists” (145). Iamblichus’ fine elaboration of virtues was influential on the work of Marinus and Damascius and shows the importance of human excellence both in the practical and the theoretical sphere.

10. References and Further Reading

a. Texts and Translations

  • Armstrong, A. H. (1966-1988) Plotinus. 7 vols. Loeb Classical Library. Cambridge: Harvard University Press.
  • Henry, P., and Schwyzer, H. R. (1964, 1976, 1982) Plotini Opera. 3 vols. (editio minor) Oxford: Clarendon Press.

b. Commentaries

  • Kalligas, P. (2014) The Enneads of Plotinus: A Commentary, Volume 1, Elizabeth Key Fowden and Nicolas Pilavachi (trs.), Princeton; Oxford: Princeton University Press.
  • Kalligas, P. (1994-2009) Plotinus’ Enneads I-V: Ancient Greek text, translation and commentary. Athens: Academy of Athens.
  • Leroux, G. (1990) Traité sur la liberté et la volonté de l’Un. [VI.8 (39)] Paris: Vrin.
  • McGroarty, K. (2007) Plotinus on Eudaimonia. A Commentary on Ennead I.4. Oxford: Oxford University Press.

c. Introductory Sources

  • Gerson, L. P. (1994) Plotinus. London/New York: Routledge.
  • O’Meara, D. (2003) Platonopolis. Platonic Political Philosophy in Late Antiquity. Oxford: Clarendon Press.
  • Smith, A. (2004) Philosophy in Late Antiquity. London/New York: Routledge.
  • Wallis, R. T. (1995) Neoplatonism. London: Duckworth.
  • Wright, M. R. (2009) Introducing Greek Philosophy. Acumen.

d. Ethical Theory

  • Baltzly, D. (2004) ‘The Virtues and ‘Becoming Like God’: Alcinous to Proclus’, Oxford Studies in Ancient Philosophy 26: 297-321.
  • Dillon, J. M. (1996) ‘An ethic for the late antique sage’ in The Cambridge companion to Plotinus. Gerson, L. P. (ed.), Cambridge: Cambridge University Press, 315-335.
  • Dillon, J. M. (1983) ‘Plotinus, Philo and Origen on the Grades of Virtue’, in H.-D. Blume and F. Mann (eds), Platonismus und Christentum. Festschrift für Heinrich Dörrie, Münster Westfalen: 92-105.
  • Plass, P. (1982) ‘Plotinus’ ethical theory’, Illinois Classical Studies 7, 2: 241-259.
  • Remes, P. (2006) ‘Plotinus’ Ethics of Disinterested Interest’, Journal of the History of Philosophy 44: 1-23.
  • Rist, J. M. (1976) ‘Plotinus and Moral Obligation’ in The Significance of Neoplatonism. Harris, R. B. (ed.), Virginia: ISNS, 217-233.
  • Schniewind, A. (2003) L’Éthique du Sage chez Plotin. Le paradigme du spoudaios. Paris: Librairie Philosophique J. Vrin.
  • Smith, A. (1999) ‘The Significance of Practical Ethics for Plotinus’ in Traditions of Platonism: Essays in Honor of John Dillon. Cleary, J. J. (ed.), Aldershot, 227-236.
  • Stamatellos, G. (2015) ‘Virtue and Hexis in Plotinus’, International Journal of the Platonic Tradition 9 (2): 129-145.

e. Human Freedom and Selfhood

  • Eliasson, E. (2008) The Notion of That Which Depends on Us in Plotinus and Its Background. Leiden/Boston: Brill.
  • Leroux, G. (1996) ‘Human Freedom in the thought of Plotinus’, in Gerson, L. P., The Cambridge companion to Plotinus. Cambridge: CUP, 292-314.
  • Remes, P. (2007) Plotinus on self: the philosophy of the ‘We’. Cambridge, New York: Cambridge University Press.
  • Stern-Gillet, S. (2009) ‘Dual Selfhood and Self-Perfection in the Enneads’, Epoché 13, 2: 331-345.

f. Post-Plotinian Virtue Ethics

  • Finamore, J. (2012) ‘Iamblichus on the Grades of Virtue’ in Iamblichus and the Foundations of Late Platonism. Eugene Afonasin, John Dillon, John F. Finamore (eds.), Leiden; Boston: Brill, 113–132.
  • Guthrie, K. (1988) (tr.) Porphyry’s Launching-Points to the Realm of Mind. Phanes Press.

 

Author Information

Giannis Stamatellos
Email: istamatellos@acg.edu
The American College of Greece
Greece

Aesop’s Fables

With the possible exception of the New Testament, no works written in Greek are more widespread and better known than Aesop’s Fables. For at least 2500 years they have been teaching people of all ages and every social status lessons about how to choose correct actions and about the likely consequences of choosing incorrect ones. However, because the fables do not fit the model of philosophy that would be developed later by thinkers like Plato and Aristotle and their successors, they are often disregarded by philosophers; and because they are regarded as having been written for children and slaves, they are often not taken seriously as a source of information about practical ethics in ancient Greece.

In order to provide some context for the fables themselves, after a brief introduction the first part of this article discusses the Life of Aesop, a pseudo-biographical text about the fables’ legendary author. Next, the article considers the form and content of fables, and how these limit what the fables can do while also providing opportunities that other forms of communication do not. Finally, the article looks at some specific fables and the messages that can be taken away from them, in order to demonstrate the kinds of ethical principles that the ancient Greeks conveyed using this kind of philosophizing—and which are still present in the fables that are read and recited around the world today.

Table of Contents

  1. Introduction
  2. The Life of Aesop
  3. Aesopic Fable as a Kind of Philosophy
  4. Philosophical Values in Aesopic Fable
    1. The Strong and the Weak
    2. Friends and Enemies
    3. Intelligence/Foolishness
    4. Overambition/Failure
    5. Truth/Honesty/Lies/Deceit
    6. Gods
    7. Reciprocity
    8. Women, Family, Love
  5. Conclusion
  6. References and Further Reading

1. Introduction

This article refers to the fables under consideration as “Aesopic” fables to show that they are attributed to Aesop while also making clear that Aesop is not necessarily their actual author. The ancient Greeks believed that there had once been a man named Aesop who was the originator of the fable and author of its earliest examples, and it became traditional to attribute all fables to him, just as Americans currently tend to attribute any clever remark to Mark Twain. However, there are at least two problems with this view of Aesop as the creator and author of fables. First, there is very little evidence to suggest that Aesop ever existed. This is not surprising, given that he allegedly lived during the sixth century B.C.E., centuries before the Greeks who were writing down his fables were born, and there is very little surviving evidence from that era about anything. In addition, the ancient Greeks were not scrupulous about historical detail: if something should have been written or said or done by a particular person, then they attributed it to that person. (For example, the Athenians attributed to Solon many laws that are documented as having been enacted well after his death.) There is a surviving pseudo-biography of Aesop that is discussed below, not for its historical accuracy or value, but in order to bring out some of the beliefs that the Greeks had about the kind of person who should have written the fables, because these beliefs tell us something important about the fables themselves. Second, we know that Aesop could not have been the originator of the fable form because fables predate the Greek civilization of which he was supposed to have been a part by many centuries. Their origins are lost, in part, because they were orally transmitted for an unknown period of time before being written down, but stories that are clearly recognizable as fables have been found on tablets written in ancient Sumer.

2. The Life of Aesop

Even though Aesop probably never existed, it is helpful in understanding how the ancient Greeks thought about the fables to understand who Aesop was thought to have been, and how he was thought to have lived his life. We can reasonably assume that the “life story” of the inventor of the fables developed along the lines that would have been found most compatible with what the Greeks thought the fables were. Therefore, by learning what the Greeks thought about the author of the fables, we can expect to learn something about what they thought about the fables themselves.

So, who was Aesop to the ancient Greeks? We know that Aesop was widely known in the ancient Greek world. We find references to him and his life in Herodotus, Plato, Aristotle, and Aristophanes, and while those references may not be historically accurate, they do show that the audiences for the works of these four men (a historian, two philosophers, and a comic playwright), which would have included citizens from a wide range of social classes, knew who Aesop was and could be expected to respond to references to him in predictable ways. It also shows that he was well known and important enough for these authors to decide that he was worth including in their writings in the first place, and this can only be because his life and fables were believed to be useful cultural material and worthy of attention.

Setting aside the references mentioned above, an extended account of Aesop’s life can be found in the pseudo-biographical Life of Aesop, which is believed to have been written in roughly the 2nd century C.E., although much of it is a compilation of older stories that were part of oral tradition (for example, the Life of Ahiqar). The details of his life, although they may be entirely fictional, are important because while today we tend to draw sharp distinctions between how a philosopher does their job and how they live their life, in ancient Greece and Rome this was much less the case. The philosopher was expected to live their life according to their principles, and accordingly what one did (or was believed to have done) had a real impact on how one’s philosophy was received. Therefore, Aesop’s life can be seen as an embodiment of the principles he lives by, and vice versa: we can learn about fables through the “biography” of the person who wrote them, whether or not Aesop ever actually existed. Rather than analyzing the entire text in detail, this article will offer a short summary, and then look in more detail at four especially salient aspects of his life. First, he is said to have begun his life as a slave; second, he is said to have been extremely ugly, as though he were not entirely human; third, he begins his life unable to speak; and, finally, his rise from slavery to greatness also leads to his destruction. As we will see, each of these qualities marks him as being on the boundary between human beings and the other animals that feature so prominently in Aesop’s fables.

Several versions of the Life of Aesop have survived the centuries, and while they have differences, they are the same in broad outline. Aesop, we are told by the unnamed author, was a slave from Samos, a Greek island in the Northern Aegean. He had a number of distinctive traits. He was remarkably ugly, and is frequently compared to animals in terms of his appearance. He was born mute, entirely unable to speak, which is another trait usually associated with animals, who can make sounds but cannot make words or speeches. However, he was also remarkably intelligent and resourceful. This is illustrated by an incident early in the Life in which he is successfully able to defend himself from a false accusation of eating stolen figs by getting the slaves who were the actual culprits to unwillingly reveal their guilt even though he is unable to tell the master what has happened. Aesop does this by drinking warm water and vomiting, which reveals that he had not recently eaten figs. He then gets their master to make the other slaves drink warm water and vomit, which leads to them vomiting up the evidence. He is spared, and they are beaten. He is also pious: One day he helps a priestess of the goddess Isis who has strayed from the road and become lost, and Isis and the Muses repay him for his help by “conferring on him the power to devise stories and the ability to conceive and elaborate tales in Greek.” (The version of the Life used in this article is the one found in Daly’s book referenced below. It is probably the most widely available source of the Life). Shortly after this the slave overseer realizes that if Aesop can speak, he is in a position to convincingly relate the overseer’s abuse of slaves and other wrongdoing to the master. (Since the other slaves, who can speak, have not already reported the overseer, we are already being made aware that there is something exceptional about Aesop’s insistence on being well treated – as though he were a human and not an animal).

The overseer is able to get a slave dealer to pay him a pittance and take Aesop away, but when the dealer takes Aesop to the slave market to sell him, he is at first unable to find a buyer because Aesop is so ugly. The slave dealer is eventually able to sell him, for almost nothing, to the philosopher Xanthus. (There is a connection here, which may be intentional, between Aesop and speaking animals. In Homer’s Iliad at XIX.400, it is a horse named Xanthus who is briefly given the power of speech by Hera in order to reply to Achilles’ demand that his horses do a better job of keeping him from harm than they had done with Patroclus.) The next section of the Life describes Aesop’s activities while a slave of Xanthus, and in a number of different episodes Aesop demonstrates that he is in fact wiser than his master. Although he is legally a slave and has no formal education, when it comes to wisdom, cleverness, and proper use of language, the qualities that philosophers like Xanthus claim make them superior to other human beings (and to the animals), Aesop is in fact the master. In all of these episodes, Aesop is not merely showing off his superiority. All of his efforts are turned toward gaining his freedom, but largely due to Xanthus’ arrogance and dishonesty they always fall short. Aesop is finally freed only when Xanthus’ fellow citizens call on Xanthus to free him so that he may interpret a portent before a meeting of the Assembly. Xanthus had promised to interpret the portent himself before realizing that he could not do so and being driven to the brink of suicide, and Aesop had helpfully (and ironically) advised the citizens that it is not proper for a slave to address free men in the Assembly. After Aesop correctly interprets the portent, he gains fame and fortune, skillfully solves problems and riddles for famous and powerful figures, and occasionally tells fables along the way. However, in the end it is his very success that leads to his ruin.
Although he is successful in his service to the king of Babylon, so much so that the king raises a golden statue in his honor, Aesop decides to travel to Delphi. On the way, he visits many cities and demonstrates his wisdom, receiving payment from cities whose citizens have been impressed by these demonstrations. But when he does the same at Delphi, the people there do not give him any reward for his performance. In return, Aesop mocks the Delphians as being like driftwood, which seems like something worthwhile at a distance but is revealed to be worthless when seen up close. He goes further and tells them that it is not surprising that they are worthless, because their ancestors were slaves (apparently forgetting that he himself was once a slave). The Delphians are outraged by his abuse, hide a golden cup from the temple of Apollo in his luggage, arrest him as he leaves town for allegedly trying to steal it, and sentence him to death. He is unable to persuade them not to kill him, and in the end he is either thrown off of a cliff by the Delphians or, in another tradition, jumps from the cliff himself instead of dying at their hands. The Life ends by noting that the Delphians were afflicted by a famine for killing Aesop and were subsequently punished by the Greeks, Babylonians, and Samians.

What can we take away from this story about what fables are and how they were regarded in ancient Greece? First, it is widely accepted that attributing authorship of the fables to a slave means that the messages of the fables were primarily intended for slaves, or that they were created by slaves, or both. Why would slaves be thought to be particularly appropriate as the creators and audience for animal fables? Two arguments, which are not mutually exclusive, have been put forward. First, many authors have noted that fables allow for the possibility of hidden messages. They allow slaves to tell stories to one another about the cruelty of slavery and how its effects can be mitigated or evaded, without communicating in a way that will get them caught and punished by their masters. The fables can also provide messages about how to successfully survive in a world in which the odds are stacked against you. (Another example of this would be the Uncle Remus stories, which allowed African-Americans to criticize and make fun of whites, as well as share advice about how to survive, without suffering unwanted consequences.) Second, it is important to recall that as an ugly slave, unable to speak, Aesop himself is on the boundary between human and animal at the beginning of his life. His slave status would by itself mark him as being on this boundary. Athenians commonly referred to slaves as “boy”: like the animals in the fables, they had no individual identities. In fact, slaves were also sometimes called “andrapodon,” man-footed animal, a word related to “tetrapodon,” four-footed animal, used to describe cattle. And slaves, like animals, were considered unable to speak in that they had no legal identities. They could not represent themselves in public because speaking in public is a characteristic of human beings (hence Aesop’s insistence that if he is to speak freely to the Samian Assembly in interpreting the portent he must have his freedom).
So, fables, which so often feature animals in order to teach lessons to humans, are believed to have been invented by an author who is himself on the border of the animal and the human. It is only once he reaches the pinnacle of fame, wealth, and influence—when he has left his beginnings as almost more animal than human behind and moved from the low end of the human hierarchy to the high end—that he makes the errors in judgment that lead to his death in Delphi. His life story reinforces a significant theme in the fables: that of being unable to change one’s nature and status—although he succeeds for a time, his destruction ultimately comes as a result of these changes. For an example of a fable with a similar message, see Gibbs 327 (Perry 123).

In addition, Aesop’s biography shows us that the fables are related to the animal side of human beings. It is all well and good for Aristotle to suggest that the happiest life is one spent in pure intellectual contemplation or for Plato to tell us that the best life is one spent pursuing knowledge about the Forms of the good and the just and the beautiful, but for most people this kind of philosophy is unavailable, because they do not have the resources to pursue academic philosophy. For some few, linking the human to the divine is an enticing intellectual activity; most of us are closer to the animal than the divine and will benefit more from advice that is framed accordingly. For such people, fables which bring the animal and the human together will be much more valuable than Platonic or Aristotelian philosophy, because fables are focused on practical and embodied philosophy rather than the theoretical and abstract.

3. Aesopic Fable as a Kind of Philosophy

The word “fable” comes from Latin. It ultimately means “story” and is derived from the word fari which simply means “to speak.” Theon famously called it “a false discourse depicting the truth.” Although not all fables are about animals—humans, plants, inanimate objects, and the gods all make appearances—animals certainly predominate, and understanding what fable is requires understanding something about why animals have such a prominent role in them. (Indeed, if we remember that fables were, for a long time, written down on animal skins, it would be fair to say that the ancient fables would not exist if not for animals, either intellectually or physically.)

It’s important to keep in mind that animals were much more important as a part of the life of ancient Greeks than they are for most people in the Western world in the twenty-first century. As they are for many of us today, animals were sources of food and clothing and companionship for the Greeks. However, for the Greeks, they were in addition forms of transportation and conveyance, entertainment, and prestige; they were valued as hunting animals, were used in war, were sources of personal protection, and were an important part of sacrificial rituals linking the human, animal, and divine. Since animals were so deeply involved with their day-to-day physical life, it makes sense that the Greeks would incorporate them into their intellectual life as well. Animals live in a variety of different locations, sometimes in herds and sometimes alone; they engage in a wide range of behaviors and act differently in different settings. Often it would seem to be a simple matter of selecting the right animal in order to evoke a particular understanding of the setting and motivations for the participants in the fable. This allows the author to suggest or imply a lot of backstory in a format which is partially defined by its brevity. So, whereas establishing that a human character is clever might take considerable effort, if the author chooses a fox as one of the characters in the fable, then cleverness is already established as a trait for that character. Similarly, it takes less time to say “this fable is about a mouse” than to establish the timidity of a particular human being.

Of course, stories about animals are only useful lessons for human beings if human beings have traits in common with other animals. For the analogy between human beings and other animals to hold up, human beings must be understood as being a kind of animal themselves. There is a fable that makes this point:

Following Zeus’s orders, Prometheus fashioned humans and animals. When Zeus saw that the animals far outnumbered the humans, he ordered Prometheus to reduce the number of the animals by turning them into people. Prometheus did as he was told, and as a result those people who were originally animals have a human body but the soul of an animal. (Perry 240)

Animals in fable do have one significant difference from animals in the real world as the Greeks saw them: they have the ability to speak, which in the real world is restricted to human beings. Aristotle, who makes the point in Book 1 of the Politics, is perhaps the best-known exponent of this view. (There is disagreement today about whether or not animals can speak, as well as what it means to be able to speak in the first place, but those debates need not concern us here.) Connected to animals’ inability to speak is their inability to reason (the word logos captures both meanings); Aristotle says at Metaphysics 1.1: “The animals other than man live by appearances and memories, and have but little of connected experience; but the human race lives also by art and reasonings.” And at Nicomachean Ethics X.8, he explains that animals do not partake in contemplation and so cannot be said to be happy. Only if someone can make a conscious choice can their actions be in accordance with happiness and virtue (thus Aristotle also indicates that children (and, presumably, slaves) cannot be happy, because they lack the adult ability to make choices). By giving other animals the ability to speak, the fables blur the lines between humans and those other animals, making it easier for humans to learn from the stories fables tell.

With regard to form, fables have a number of distinguishing characteristics: they are usually very short, typically only a few sentences long; they lack any specific setting in time or place; they typically (although not always) involve animals, who are not named or described; the main character acts so as to bring about some outcome, usually through conflict with another character, but often fails to achieve what they intend to do; finally, the character typically makes some kind of a statement acknowledging where they went wrong and accepting the consequences of their error (which can be anything up to and including death). On the one hand, these characteristics limit what the fable can convey. There is no plot, there is no character development, there is typically only one action, and there does not even need to be any dialogue. On the other hand, the characteristics of the form of fable are perfectly suited for widespread oral transmission, which was for centuries the only way in which they were or could be transmitted, and they continued to be transmitted in that way even after the development of widespread literacy, as indeed they still are today. Their simplicity makes them memorable and helps give them their power. Although the fables lack abstraction, they provide a rich stock of philosophical resources for people who are in need of practical philosophical principles to be used in their day-to-day life. The simplicity of the fable is not a sign of the ignorance or limited abilities of the author or the audience; indeed, the opposite is true because creating an effective fable requires stripping the action and language of the story down to the bare minimum needed to convey the truth it seeks to convey.

Lester Hunt says that “though this sort of speech [fable] is not characteristic of philosophy as we know it, that may be because it represents a form of argument that does not seem to be well suited to serve certain purposes that philosophers characteristically pursue, and not because it fails to be an argument” (Hunt 371). In part, it does not serve those purposes because it pre-dates Socrates, who is seen as the first philosopher in the Western tradition, and Plato, who did more than anyone to fix the boundaries of Western philosophy and to define what it was. As presented by Plato, Socrates was deeply interested in the definitions of words. He wanted to know the answers to questions like “What is justice?” and “What is piety?” and these kinds of questions are what many people associate with philosophy to this day. These questions and others like them are indeed not well suited to the form and content of the fables. As has been said, the fables serve to illustrate the consequences of certain kinds of behavior. Their message is practical rather than theoretical, and simple rather than complex. In the Platonic dialogues, Socrates rejects examples of behavior as suitable definitions for words: a list of actions that are just or pious is not the same as a definition of justice or piety, and Plato’s Socrates insists that we cannot reliably come up with examples of a virtue unless we are able to give an accurate definition of what that virtue is. This would seem to exclude fables from the category “philosophy” because they are specific individual examples of behavior and consequences and not concerned with creating systems or defining terms. (It is worth noting here that Socrates himself often uses myths and other stories, such as the Ring of Gyges in Republic, to advance his philosophical arguments.)

But Socrates is only the founder of philosophy if one accepts that philosophy is the thing that Socrates was the first person to do. If one believes, as Socrates apparently did, that one reason to examine one’s life is to be able to be more self-aware, or to live more happily or successfully, then the earlier traditions of wisdom literature such as Aesopic fables, which aim at these goals, should certainly count. Fables may not be able to tell you about the Form of Justice, but they can suggest some likely consequences of unjust behavior; they may not be able to define Virtue and Vice, but they can give you some examples of what these things look like and suggest which of the two should be chosen in particular situations and what the outcome of that choice is likely to be. It is true that they are not suitable for complex forms of reasoning or logic, or extended argument, but why should these set boundaries on what we believe philosophy is or does? Hunt adds: “Because of the limitations of [fable]—that is, that it must be a short, simple narrative making a clear and memorable point that can reach a wide audience—its interest tends to be overwhelmingly practical” (Hunt 379). This does not, however, make fables less philosophical, especially for the Greek audience that they were originally addressed to. Aristotle tells us that the purpose of practical knowledge (by which he means knowledge about ethics and politics) is to enable people to act properly. Leading people to act properly may sometimes require complicated arguments, but it does not mean that only complicated arguments are philosophy.

In addition, fables deliver their messages through analogy, which is a recognized form of philosophical argument. Not every fable does this, but then not every dialogue is a Platonic dialogue—the form allows, but does not compel, philosophical meanings. Perhaps the best starting point for a consideration of how fables worked as analogies can be found in Book II, Chapter 20 of Aristotle’s Rhetoric, where he discusses how they can be used effectively in persuading people to take political action:

Fables are suitable for addresses to popular assemblies; and they have one advantage—they are comparatively easy to invent, whereas it is hard to find parallels among actual past events. You will in fact frame [fables] just as you frame illustrative parallels: all you require is the power of thinking out your analogy, a power developed by intellectual training.

That is, the speaker shows that the situation the assembly currently faces is similar to a situation described by fable, and shows what happens to the characters in the fable, leaving it to the audience to conclude that if they want a different outcome they must act differently than the characters in the story they have just heard (or, if they want the same outcome, they must act in the same way). This requires the audience to actively take part in constructing the argument: they have to analyze the fable, analyze the current situation, determine whether and how they are similar, and come up with a conclusion regarding how they ought to act. The speaker does not tell the listeners how to act; instead, they leave it to the listeners to reach their own conclusions about the right thing to do—which, again, fits with the methods of practical philosophy.

The listeners can then carry the fable with them in their minds—since fables are written to be short and memorable—so that it can be used in other situations. Someone who knows a lot of fables can probably find one to fit any situation—but in order to use the fable effectively, they must be able to choose the appropriate one for the particular situation they are in. For example, is this a situation which calls for determination and persistence, such as that exhibited by the tortoise in the race with the hare? Or is it a situation which calls for someone to recognize that the goal is unattainable and to walk away, as when the fox realizes that the grapes are not within his reach and decides that they must be sour anyway? Again, the fable’s value as an analogy is dependent on the ability of the person using it to properly determine what the appropriate analogy is and what the fable tells that person about the situation they find themselves in. This practice of reflection seems worthy of being described as philosophical activity in this person.

That analogy can be used within other kinds of philosophy and not just fables can be shown with reference to Plato. Socrates also frequently uses analogy as a form of argument, perhaps most famously in the Apology. For example, after he gets his accuser Meletus to say that, out of all the Athenians, only Socrates makes the young men worse, he responds as follows:

I am very unfortunate if that is true. But suppose I ask you a question: Would you say that this also holds true in the case of horses? Does one man do them harm and all the world good? Is not the exact opposite of this true? One man is able to do them good, or at least not many; the trainer of horses, that is to say, does them good, and others who have to do with them rather injure them? Is not that true, Meletus, of horses, or any other animals?

And Meletus agrees. From this Socrates concludes that Meletus is wrong in his accusation of Socrates and is not even taking the trial seriously—anyone who thought things through would easily see that, just as only a few know how to improve horses, only a few would know how to make human beings better. Of course, as many people have noted, this may not be a good analogy. Knowing when an analogy applies and when it does not is an important part of taking it seriously and using it properly. Whether this is a valid analogy or not is not important for our point here, which is that it is a form of argument requiring the listener’s active participation in reaching the correct ethical and political judgment about Socrates’ guilt or innocence. And, of course, in the Republic, Socrates offers his famous cave analogy as a way of explaining the nature of human existence. So Plato is willing to use analogy within the realm of higher philosophy when it seems to be the most effective way to communicate what he is trying to explain.

Perhaps the best statement regarding the content of fables is that of Zafiropoulos, who says that fable offers an “exemplary and popular message on practical ethics and which comments, usually in a cautionary way, on the course of action to be followed or avoided in a particular situation” (Zafiropoulos 1). Practical ethics for the Greeks, as exemplified in the writings of Aristotle, was considered an aspect of politics and political education, so that we can see the fables as not only philosophy but political philosophy, telling people not only how they should live but how they should live together, what to expect from other people if they behave in certain ways, how to have successful social interactions, and so on. In this way the fables can be regarded as similar to Greek plays and epic poetry. Both the plays and the epic poems offer examples of fictional characters conducting themselves in particular ways and the consequences of their conduct so that the audience can learn from their choices (and, most significantly, their mistakes). Raaflaub says with regard to the oral tradition of the epic poems of Homer that “It was important not only to the community but also to the elite to propagate positive patterns of behavior and to illustrate the disastrous consequences of negative ones” (Raaflaub 565). Fables had the same function, while being more accessible to everyone in the community.

4. Philosophical Values in Aesopic Fable

The message (or messages) of a particular fable depends on where it is found. If it is located within a particular story, it will derive its message from the story in which it is found, although even then it may have more than one meaning. If it stands on its own, or is found in a collection of fables, its meaning becomes even more fluid. Nevertheless, if we look at the early fable collections, there do seem to be particular themes that emerge.

Many authors have discussed the themes to be found in the fables; what follows draws on the list found in Morgan, Chapter 3, but similar themes can be found in, for example, Zafiropoulos. Gibbs’ collection of the fables organizes them along thematic lines as well, although her categories differ from the ones given below. Included with each category is an example fable, which will be used to show the way in which the fables generally deal with the topic. Also included is the fable number from Laura Gibbs’ edition of the fables, as well as the Perry number, which is the standard reference number for each fable. The text of each fable is copied from Gibbs’ edition, as found on her website. Taken together, the fables provide a useful set of principles for conducting oneself appropriately according to ancient Greek moral beliefs.

a. The Strong and the Weak

Gibbs 131. The Hawk and the Nightingale
Perry 4 (Hesiod, Works and Days 202 ff.)

This is how the hawk addressed the dapple-throated nightingale as he carried her high into the clouds, holding her tightly in his talons. As the nightingale sobbed pitifully, pierced by the hawk’s crooked talons, the hawk pronounced these words of power, “Wretched creature, what are you prattling about? You are in the grip of one who is far stronger than you, and you will go wherever I may lead you, even if you are a singer. You will be my dinner, if that’s what I want, or I might decide to let you go.”

Perhaps the predominant theme in fable is also the oldest. It can be found in the first recorded fable in the Aesopic fable tradition, from Hesiod’s Works and Days (which significantly pre-dates the supposed dates for Aesop’s life). There is some disagreement about the lesson to be taken from this fable, but it seems clear that the opposition is between the strength of the hawk and the words of the nightingale, who has nothing but words to counter that strength. It is the classic statement of “might makes right,” and those who have little power of their own must necessarily learn this lesson quickly and well. In the poem, Hesiod goes on to claim that the exercise of unjust power is wrong and that Zeus will punish it. Whether or not this is true, it is clear that the thought of future divine punishment will not necessarily deter the strong or protect the weak.

b. Friends and Enemies

Gibbs 70. The Lion and the Mouse
Perry 150 (Ademar 18)

Some field-mice were playing in the woods where a lion was sleeping when one of the mice accidentally ran over the lion. The lion woke up and immediately grabbed the wretched little mouse with his paw. The mouse begged for mercy, since he had not meant to do the lion any harm. The lion decided that to kill such a tiny creature would be a cause for reproach rather than glory, so he forgave the mouse and let him go. A few days later, the lion fell into a pit and was trapped. He started to roar, and when the mouse heard him, he came running. Recognizing the lion in the trap, the mouse said to him, “I have not forgotten the kindness that you showed me!” The mouse then began to gnaw at the cords binding the lion, cutting through the strands and undoing the clever ingenuity of the hunter’s art. The mouse was thus able to restore the lion to the woods, setting him free from his captivity.

The theme here in some ways qualifies the previous example, as sometimes those who seem to be powerless turn out to have more power than one might expect. Although the mouse is weak, the lion’s decision to free the mouse ends up working in his favor in the end, as the mouse repays one kindness with another. There is no way to know in advance who might be able to help you in the future, and so it pays to show kindness and benefit others in the hope of future reciprocity.

c. Intelligence/Foolishness

Gibbs 434. The Man and the Golden Eggs
Perry 87 (Syntipas 27)

A man had a hen that laid a golden egg for him each and every day. The man was not satisfied with this daily profit, and instead he foolishly grasped for more. Expecting to find a treasure inside, the man slaughtered the hen. When he found that the hen did not have a treasure inside her after all, he remarked to himself, “While chasing after hopes of a treasure, I lost the profit I held in my hands!”

Here we have the stereotypical example of foolishness: someone who has a good situation but does not properly appreciate it and, in trying to get still more, loses what they have. Throughout the fables, foolish decisions are punished, often by death. Intelligence, on the contrary, gets a good reputation in the fables. Those who are smart, or at least clever, can turn situations to their advantage—as, for example, in Gibbs 104/Perry 124, “The Fox and the Raven,” in which the fox is able to steal dinner from the raven by the crafty use of flattery. They can also sometimes use their intelligence to find ways to protect themselves from those who have superior power and strength, as in Gibbs 18/Perry 142.

d. Overambition/Failure

Gibbs 342. The Jackdaw and the Eagle
Perry 2 (Syntipas 9)

There was a jackdaw who saw an eagle carry away a lamb from the flock. The jackdaw then wanted to do the very same thing himself. He spied a ram amidst the flock and tried to carry it off, but his talons got tangled in the wool. The shepherd then came and struck him on the head and killed him.

This fable and others like it illustrate the importance of not overreaching. In a society such as the majority of ancient Greek cities, which were extremely hierarchical and which did not allow for social mobility, trying to become more than what one is by nature or birth is a strategy not for climbing to the top but for being destroyed. It is this that arguably destroys Aesop in the Life of Aesop: though a slave by birth, he ends up aspiring to be the adviser of kings, and in the end, his change of status leads him to Delphi and thereby to his death.

e. Truth/Honesty/Lies/Deceit

Gibbs 117. The Wolf and the Sleeping Dog
Perry 134 (Chambry 184)

A dog was sleeping in front of the barn when a wolf noticed him lying there. The wolf was ready to devour the dog, but the dog begged the wolf to let him go for the time being. “At the moment I am thin and scrawny,” said the dog, “but my owners are about to celebrate a wedding, so if you let me go now, I’ll get fattened up and you can make a meal of me later on.” The wolf trusted the dog and let him go. When he came back a few days later, he saw the dog sleeping on the roof. The wolf shouted to the dog, reminding him of their agreement, but the dog simply said, “Wolf, if you ever catch me sleeping in front of the barn again, don’t wait for a wedding!”

This fable provides a nicely Machiavellian lesson about promising one’s enemies whatever is necessary while they have you at a disadvantage and then abandoning those promises when the conditions that made you promise no longer exist. Conversely, the lesson may be that when you are in a position of advantage over an enemy, you should not be too quick to accept their promises about their future behavior.

f. Gods

Gibbs 481. Heracles and the Driver
Perry 291 (Babrius 20)

An ox-driver was bringing his wagon from town and it fell into a steep ditch. The man should have pitched in and helped, but instead he stood there and did nothing, praying to Heracles, who was the only one of the gods whom he really honoured and revered. The god appeared to the man and said, “Grab hold of the wheels and goad the oxen: pray to the gods only when you’re making some effort on your own behalf; otherwise, your prayers are wasted!”

The gods do not appear especially frequently in the extant fables, but when they do appear they are usually there to either reward appropriate conduct (or punish inappropriate conduct), or else to serve to remind people that prayers without effort generally do no good. As the Christian proverb has it, “God helps those who help themselves.” Greek religion provides a wider selection of deities, but reaches a similar conclusion.

g. Reciprocity

Gibbs 167. The Murderer and the Mulberry Tree
Perry 152 (Chambry 214)

A robber had murdered someone along the road. When the bystanders began to chase him, he dropped the bloody corpse and ran away. Some travellers coming from the opposite direction asked the man how he had stained his hands. The man said that he had just climbed down from a mulberry tree, but as he was speaking, his pursuers caught up with him. They seized the murderer and crucified him on a mulberry tree. The tree said to him, “It does not trouble me at all to assist in your execution, since you tried to smear me with the murder that you yourself committed!”

This is an unusual fable in that it features not a talking animal but a talking plant. However, the lesson is not an uncommon one: if you attempt to harm others, they will undoubtedly respond in kind. The fable of the lion and the mouse quoted above would also fit here, as the lion’s kindness is reciprocated by the mouse.

h. Women, Family, Love

Gibbs 496. The Thief and His Mother
Perry 200 (Chambry 296)

A boy who was carrying his teacher’s writing tablet stole it and brought it triumphantly home to his mother who received the stolen goods with much delight. Next, the boy stole a piece of clothing, and by degrees he became a habitual criminal. As the boy grew older and became an adult, he stole items of greater and greater value. Time passed and the man was finally caught in the act and taken off to court where he was condemned to death: woe betide the trade of the thief! His mother stood behind him, weeping as she shouted, “My son, what has become of you?” He said to his mother, “Come closer, mother, and I will give you a final kiss.” She went up to him, and all of a sudden he bit her nose, tugging at it with his teeth until he cut it clean off. Then he said to her, “Mother, if only you had beaten me at the very beginning when I brought you the writing tablet, then I would not have been condemned to death!”

Violence and death are commonplace in the fables, but this one is unusual for the graphic depiction of the violence. Nevertheless, it provides a clear example of how mothers ought to behave: they need to provide clear moral guidance to their children (perhaps through the use of instructive fables?), lest the wayward child turn into a criminal as an adult.

5. Conclusion

This article has described what fable is and the characteristics of the man who was allegedly its inventor in order to make the case that the form and content of Aesopic fable as it existed in ancient Greece were philosophical in nature and taught those who learned the fables valuable moral and intellectual lessons for survival. Although fable is not well suited to complicated or abstract arguments, its brevity and use of argument by analogy provide useful food for thought for those who are looking for simple, effective, and memorable moral principles by which to guide their behavior. Fable is therefore well suited to deliver practical life-lessons that can be applied by anyone who is able to think through their situation and draw on the appropriate fable and the lesson that it teaches. In the Greek world, those lessons were oriented toward the day-to-day lives of people who were often in positions of powerlessness and low status, but even for those who were higher on the socioeconomic ladder, fables could provide valuable instruction. In the modern world, as communications become shorter and more immediate (as on Twitter, Facebook, and other social media), we may see a renaissance of the fable form, although of course the lessons it will communicate in today’s world may be very different from those of ancient Greece.

6. References and Further Reading

  • Adrados, Francisco Rodriguez. History of the Graeco-Latin Fable. Vols. 1 and 3. Leiden, NL: Brill, 2003.
  • Arnheim, M. T. W. “The World of the Fable.” Studies in Antiquity 1979–1980, 1–11.
  • Blackham, H. J. The Fable as Literature. London: Athlone Press, 1985.
  • Carnes, Pack. Fable Scholarship: An Annotated Bibliography. New York: Garland Publishing, Inc., 1985.
  • Compton, Todd. Victim of the Muses. Cambridge, MA: Center for Hellenic Studies, 2006.
  • Daly, Lloyd. Aesop Without Morals. New York: Thomas Yoseloff, 1961.
  • Hägg, Tomas. “A Professor and His Slave: Conventions and Values in The Life of Aesop.” In Conventional Values of the Hellenistic Greeks, edited by Per Bilde, Troels Engberg-Pedersen, Lise Hannestad, and Jan Zahle. Aarhus, DK: Aarhus University Press, 1997.
  • Holzberg, Niklas. The Ancient Fable: An Introduction. Translated by Christine Jackson-Holzberg. Bloomington: Indiana University Press, 2002.
  • Hunt, Lester. “Literature as Fable, Fable as Argument.” Philosophy and Literature 33:2 (2009): 369–385.
  • Katsadoros, George C. “Aesopic Fables in the European and the Modern Greek Enlightenment,” Review of European Studies 3:2, 2011.
  • Kurke, Leslie. Aesopic Conversations. Princeton: Princeton University Press, 2011.
  • Lignell, David. Aesop in a Monkey Suit: Fifty Fables of the Corporate Jungle. New York: iUniverse, 2006.
  • Lissarrague, François. “Aesop, Between Man and Beast: Ancient Portraits and Illustrations.” In Not The Classical Ideal, edited by Beth Cohen, 132–149. Leiden, NL: Brill, 2000.
  • Morgan, Teresa. Popular Morality in the Early Roman Empire. New York: Cambridge University Press, 2007.
  • Nagy, Gregory. The Best of the Achaeans. Baltimore: Johns Hopkins University Press, 1979.
  • Noonan, David C. Aesop & the CEO: Powerful Business Insights from Aesop’s Ancient Fables. Nashville, TN: Thomas Nelson, 2005.
  • Papademetriou, I. -Th. A. Aesop as an Archetypal Hero. Athens: Hellenistic Society for Humanistic Study, 1997.
  • Patterson, Annabel. Fables of Power: Aesopian Writing and Political History. Durham, NC: Duke University Press, 1991.
  • Perry, B. E. Aesopica. Vol. 1. Urbana: University of Illinois Press, 1952.
  • Perry, B. E. Babrius and Phaedrus. Cambridge, MA: Harvard University Press, 1965.
  • Perry, B. E. Studies in the Text History of the Life and Fables of Aesop. Chico, CA: Scholars Press, 1981.
  • Pervo, Richard. “A Nihilist Fabula: Introducing the Life of Aesop.” In Ancient Fiction and Early Christian Narrative, edited by Ronald F. Hock, J. Bradley Chance, and Judith Perkins. Atlanta: Scholars Press, 1998.
  • Plato. Phaedo. Translated by C. J. Rowe. Cambridge: Cambridge University Press, 1993.
  • Plato. Apology. http://classics.mit.edu/Plato/apology.html
  • Raaflaub, Kurt A. “Intellectual Achievements.” In Raaflaub, Kurt A., and Hans van Wees, A Companion to Archaic Greece. New York: Blackwell Publishing, 2009.
  • Short, Jeremy C., and David J. Ketchen Jr. “Teaching Timeless Truths through Classic Literature: Aesop’s Fables and Strategic Management.” Journal of Management Education 29 (2005): 816–832.
  • Van Dijk, Gert-Jan. Ainoi, Logoi, Mythoi: Fables in Archaic, Classical, and Hellenistic Greek Literature. Leiden, NL: Brill, 1997.
  • Winkler, John J. Auctor and Actor. Berkeley: University of California Press, 1985.
  • Zafiropoulos, Christos A. Ethics in Aesop’s Fables: The Augustana Collection. Leiden, NL: Brill, 2001.

 

Author Information

Edward W. Clayton
Email: edward.clayton@cmich.edu
Central Michigan University
U. S. A.

Boredom: A History of Western Philosophical Perspectives

The essayist Joseph Epstein has remarked, “Boredom is after all part of consciousness, and about consciousness the neurologists still have much less to tell us than do the poets and the philosophers.”

Although boredom has not been a major topic for Western philosophers, some important ones have discussed it and regarded it as a major philosophical theme of human life. They have addressed the following issues: (1) what boredom is, which can be taken as the problem of producing an analysis of the concept of boredom, or as the problem of giving a typology of boredom, or as a phenomenology of the experience of it; (2) what to do about boredom, how to overcome it, lessen it, or learn to live with it; (3) what, if anything, the phenomenon of boredom reveals about matters metaphysical or otherwise deep, such as God, being, the meaning of life, human nature, the nature of the self, or the nature of some culture or other; (4) what boredom produces, and what it explains; (5) whether and how boredom represents a fundamental mood or “attunement” to the world for a reflective human being; (6) what conditions produce boredom, and what sorts of beings tend to feel it; and (7) ethical issues that relate to the phenomena of boring others and being bored oneself.

Why is boredom a philosophical issue? The preceding sketch should indicate how boredom may be regarded not only as a legitimate philosophical issue but as a major one. Moreover, there are several aspects of the problem of boredom which prevent its exhaustive treatment in a straightforward biological, psychological, sociological, or statistical way. There is a problem of identifying what boredom essentially is—a part of which is the problem of determining whether it is one thing or something that comes in a variety of importantly different forms or modes. Whatever scientific studies may be able to contribute to this problem, progress toward its solution will inevitably require contributions from conceptual and phenomenological investigations. There is also the fact, emphasized by Lars Svendsen, that most people have difficulty saying whether they are bored or not, both at the moment and in general throughout their lives—a fact that points to obvious limitations of statistical studies that begin from survey questions about, say, whether people tend to be bored and what bores them, and issue in claims about boredom’s prevalence, objects, typical conditions, cures, and so forth. Finally, it seems clear that if any academic discipline has much to say concerning the metaphysical or ethical implications of boredom, it is more likely to be philosophy than any of the empirical sciences.

The main philosophical texts on boredom are A Philosophy of Boredom by L. Svendsen, Boredom: A Lively History by the classicist P. Toohey, and Fundamental Concepts of Metaphysics by M. Heidegger.

Table of Contents

  1. Ancient and Medieval Times
    1. Solomon
    2. Seneca
    3. Acedia
  2. Early Modern Period
    1. Pascal
    2. Kant
  3. Nineteenth Century
    1. Schopenhauer
    2. Thoreau
    3. Kierkegaard
    4. Nietzsche
    5. James
  4. Twentieth Century
    1. Russell
    2. Heidegger
    3. Bernard Williams
  5. Philosophical Work since the 1990s
  6. References and Further Reading

1. Ancient and Medieval Times

There is a debate among scholars, including philosophers, about how far back in history boredom goes. Several philosophers claim that boredom has always plagued human beings, while others hold that it is peculiarly a malady of the modern world. Those holding the latter view do generally admit, however, that there were pre-modern precursors of boredom. It is with discussion of three of these precursors that this study begins.

a. Solomon

Qoheleth (c. 200 B.C.E.), traditionally “Solomon”, author of Ecclesiastes, certainly sounds as though he is speaking in large part of something like boredom, and as though he suffers from the condition himself. What we actually get in Ecclesiastes is nothing like a philosophical analysis of boredom or reflections on any deep implications it might have. Rather, we get expressions of the condition itself, partial identification of its causes or reasons, as well as advice concerning how to reduce it, or anyway how to live a halfway decent life in spite of it.

Expressions of boredom or tedium vitae run throughout the book. “Vanity of vanities, all is vanity.” “The eye is not satisfied with seeing, nor the ear with hearing.” “All I labored to do was vanity and vexation of spirit.” “I hated life because all my work was grievous to me.” These are the sounds a seriously bored (and rather depressed) man makes.

The reasons for boredom in Ecclesiastes seem to be primarily that nothing satisfies and the same old things keep getting repeated, within an individual life, and over countless generations. That which has been is now; and that which is to be has already been. There is nothing new under the sun.

And yet there does seem to be a moral Qoheleth draws.

Go thy way, eat thy bread with joy, and drink thy wine with a merry heart . . . . Let thy garments be always white; and let thy head lack no ointment. Live joyfully with the wife whom thou lovest all the days of the life of thy vanity, which he hath given thee under the sun, all the days of thy vanity: for that is thy portion in this life, and in thy labour which thou takest under the sun. Whatsoever thy hand findeth to do, do it with thy might; for there is no work, nor device, nor knowledge, nor wisdom, in the grave, whither thou goest. (King James Version, 9: 7-10)

That is, live life with gusto, and get enjoyment from what you do, if you can. One might well wonder how effective this advice could be to one truly suffering from a bad case of severe boredom.

b. Seneca

Lucius Annaeus Seneca (4 BCE – 65 CE), the Roman Stoic philosopher, talks about boredom or tedium in his essay, “On Tranquillity,” addressed to his friend “Serenus”, who always seems to need a lot of advice. Based on some things Serenus says, Seneca apparently thinks his friend is on the verge of lapsing into boredom, or at least has gotten himself into a mode of living that leads straight to it. To the modern reader, Serenus’s confessions do not sound like confessions of anything like boredom.

In any event, Seneca takes them as such and proceeds straightway to the following pronouncement.

All are in the same case, both those, on the one hand, who are plagued with fickleness and boredom and a continual shifting of purpose, and those, on the other, who loll and yawn.

Notice here that Seneca includes two central elements in the phenomenon of boredom. On the one hand there is fickleness and restlessness, and on the other a lack of motivation and interest, a weariness that expresses itself in lolling and yawning. His subsequent remarks provide a pretty apt description of the phenomenology of boredom—what the bored person feels, and how he or she is inclined to act (or fail to act). Seneca shall be quoted at length here because of the delightfulness of his prose and the aptness of his portrait of bored people.

[T]hen creeps in the agitation of a mind which can find no issue, because . . . of the hesitancy of a life which fails to find its way clear, and then the dullness of a soul that lies torpid amid abandoned hopes. And all these tendencies are aggravated when men have taken refuge in solitary studies, which are unendurable to a mind that is intent upon public affairs, desirous of action, and naturally restless, because assuredly it has too few resources within itself. When, therefore, the pleasures have been withdrawn which business itself affords to those who are busily engaged, the mind cannot endure home, solitude, and the walls of a room, and sees with dislike that it has been left to itself. From this comes that boredom, dissatisfaction, the vacillation of a mind that nowhere finds rest, and the sad and languid endurance of one’s leisure.

Thence comes mourning and melancholy and the thousand waverings of an unsettled mind, which its aspirations hold in suspense, and then disappointment renders melancholy. Thence comes that feeling which makes men loathe their own leisure and complain that they themselves have nothing to be busy with.

[T]heir mind becomes incensed against Fortune, and complains of the times, and retreats into corners and broods over its trouble until it becomes weary and sick of itself. For it is the nature of the human mind to be active and prone to movement. Welcome to it is every opportunity for excitement and distraction.

Hence men undertake wide-ranging travel, and wander over remote shores, and their fickleness, always discontented with the present, gives proof of itself now on land and now on sea. They undertake one journey after another and change spectacle for spectacle. They began to be sick of life and the world itself, and [think]: “How long shall I endure the same things?”

You ask what help, in my opinion, should be employed to overcome this tedium. The best course would be . . . to occupy oneself with practical matters, the management of public affairs, and the duties of a citizen.

So what do we get from Seneca that will help us in our attempts to understand boredom? We get three things: first, a rather compelling phenomenological account of what the state is like; second, an indication that it can lead to states worse than itself (for example, melancholy, jealousy, and envy); and, third, some advice about how to eliminate or at least ameliorate the condition, namely, through work and immersion in practical affairs.

c. Acedia

Acedia, the “disease that wasteth at noonday” or the demon responsible for the infliction of the disease, was a form of pre-boredom or boredom with sloth that afflicted innumerable practitioners of the religious life (priests, monks, hermits, and the like) in the Christian Middle Ages. Since our concern here is with philosophical thought on boredom, this fascinating chapter in the book of boredom must be largely passed over. But before it is passed over altogether, it should be noted that there was an ethical overtone to acedia/boredom. The overtone was negative. If God, God’s world, and the life God has ordained for you seem boring to you, there is almost certainly something wrong with your soul, something you had better hasten to fix.

Readers who wish to understand more about acedia should consult the excellent treatment of it in Toohey 2011.

2. Early Modern Period

a. Pascal

But let us move on to the seventeenth-century French philosopher Blaise Pascal (1623-1662). With Pascal, we are in the era of history termed “modernity” or “early modernity” or the “beginning of the modern world.” Pascal treats boredom with more depth, passion, and insight than any of his predecessors. He does this in his book, Pensées, especially in section II, “The misery of man without God.”

Most of what we get from Pascal are observations of human nature, or of people in general. His primary and oft-repeated point concerning them is that, without diversions and distractions, human beings are naturally bored. Boredom is the natural state of the human being left to his or her own devices.

Diversion is a dominant theme in Pascal’s thought. People cannot live in quiet, peace, and rest with themselves, and so they seek distractions and diversions to draw away their attention from their own empty selves and lives. The diversions do not really work, and so people find themselves returning again and again to perception of the emptiness and nothingness of their own lives, and to a pervasive sense of ennui or boredom, the fit response to their own emptiness and nothingness.

Pascal’s description of the bored and weary person is apt and insightful. He is noteworthy for the claim that boredom and ennui are the natural state of the human being. But his message is not entirely negative. Boredom is the natural state of a human being without God. A life in relation to an infinite God fills the emptiness of the soul and obliterates the restlessness, weariness, and boredom which naturally afflict people.

b. Kant

Immanuel Kant (1724-1804) speaks of boredom in passing. His remarks about it occur primarily in his Lectures on Ethics. Kant believes that boredom plagues the person who is inactive and has nothing to do. His cure for it is activity, either work or participation in activities of recreation and diversion. The person who just loafs and does not engage in activity can find no rest at the end of the day, while the one who has been active can.

It is interesting to note that Kant’s solution to boredom does not carry the theological overtones that Pascal’s does. Pascal advises one to overcome boredom by establishing a relationship with God; Kant just recommends activity, whether of work or play.

3. Nineteenth Century

a. Schopenhauer

We now come to a philosopher who makes boredom a centerpiece of his philosophy. He is the great German pessimist Arthur Schopenhauer (1788-1860). Several things stand out in Schopenhauer’s treatment of boredom.

First, there is his claim that boredom is one of the twin poles of human life. The other pole is need, want, lack, or desire. Here is the way it works. We feel that we lack something, something we need. We pursue it and, if we are fortunate, capture it. But the capture does not bring the satisfaction we had expected. What we get instead is a strong dose of boredom, and we find ourselves casting about to identify another object of pursuit, somehow convincing ourselves that if we can get it, we will experience satisfaction. Neither want nor boredom is a particularly pleasant state to be in; in fact, both are forms of misery. And so life may be viewed as a pendulum that passes back and forth between one bad state and another.

Second, Schopenhauer offers something like a definition of boredom, a brief analysis of the concept, which may be the first offered in Western thought. Boredom, he says, is a “tame longing without any particular object.”

Third, Schopenhauer offers not just a definition but also a substantive account of what boredom is. Boredom, he says, is the sensation of the worthlessness of existence. Boredom may even be regarded as evidence or proof that existence is worthless. If life itself had any real, positive value, there would be no such thing as boredom. Simply being alive would delight us. But, as things are, we can find no modicum of relief from our misery, except when we are diverted or distracted from our lives.

Fourth, Schopenhauer reflects on what boredom or its absence reveals about the intelligence and complexity of the one who suffers from it. His general claim here is that a propensity to be bored is a sign of intelligence. Animals, he speculates, feel very little boredom. Humans are prone to it in proportion to how smart they are. It takes a rich and varied world to hold the interest of a genius, and the real world often doesn’t measure up. As for those who are content with something like mere everyday existence, they are the stupidest of people, not much, if any, above the level of the brutes.

It should be added that there are exceptions in Schopenhauer to his intelligent bored person. One of these is the human being who is lost in the contemplation and enjoyment of art, especially music. The other is the sage, saint, or mystic who has thoroughly denied the will to live and exists in nirvana, or something like it. But very few can even conceive of such a state, let alone achieve it. The vast majority of intelligent people simply have to put up with long stretches of boredom throughout their lives.

Finally, Schopenhauer stresses the seriousness of boredom more than any of his predecessors. It is a form of misery, and a real scourge on the human race. It can lead to the death of the bored one; it can make him or her hang himself or herself. Or, to overcome it, he or she may find himself or herself the instigator of wars, massacres, and murders.

b. Thoreau

The American Transcendentalist philosopher Henry David Thoreau (1817-1862) does not write about boredom as such at any length. But he does make some remarks about it (he calls it “ennui” or “tedium”) which are striking because they are very nearly the opposite of what Schopenhauer claims about the link between boredom and intelligence, and because they provide one answer to a question that would become prominent in the debates about boredom of later philosophers and other scholars.

Thoreau writes, “Undoubtedly the very tedium and ennui which presume to have exhausted the variety and the joys of life are as old as Adam.” Thoreau here anticipates an issue discussed by philosophers more than a century later. The issue is whether boredom is a natural state that has been around ever since there were humans, or whether it developed, or was invented, in the early modern period and is uniquely an affliction of modernity. Thoreau’s answer is that it is as old as Adam.

Moreover, in contrast to Schopenhauer, Thoreau seems to think that it is those who are less intelligent, less mentally active, and more “asleep” who tend to suffer most from boredom. In his Walden essay “Where I Lived and What I Lived For” Thoreau says:

Moral reform is the effort to throw off sleep. Why is it that men give so poor an account of their day if they have not been slumbering? . . . . The millions are awake enough for physical labor; but only one in a million is awake enough for effective intellectual exertion, only one in a hundred millions to a poetic or divine life. To be awake is to be alive. . . . If we respected only what is inevitable and has a right to be, music and poetry would resound along the streets. When we are unhurried and wise, we perceive that only great and worthy things have any permanent and absolute existence, that petty fears and petty pleasures are but the shadow of the reality. This [reality] is always exhilarating and sublime.

Thoreau’s point about boredom is that it is the state of one of limited mental capacities, or of one “asleep”. An intelligent and alert mind is never bored. In its surroundings there are always a thousand things that are fascinating and sublime. The hum of a mosquito, if alertly attended to, is as fascinating and enthralling as an Iliad or an Odyssey.

c. Kierkegaard

The Danish philosopher Søren Kierkegaard (1813-1855) has several things to say about boredom. Four of them will be mentioned here.

First, Kierkegaard shares with Schopenhauer the idea that boredom is quite a serious matter. He claims at one point that boredom is the root of all evil. The consensus among philosophers is that, while there may be something to this, it is a bit of an exaggeration.

Second, Kierkegaard’s conception of boredom is that it is a kind of nothingness, a nothingness that permeates all reality. He calls it “demonic pantheism”—demonic, because it is that which is empty, pantheism because it is all-pervasive.

Third, in spite of (or because of?) its nature of nothingness, boredom functions as a highly effective impetus to action. This strikes Kierkegaard as something of a paradox. He finds it strange that something as staid and solid—staidness and solidity somehow apparently being compatible with nothingness—as boredom could serve as a motivator and stimulus of action. Desire certainly stimulates to action, which is not so puzzling. But boredom is the opposite of desire, not attraction but repulsion. It is some kind of negative stimulus to action. Hence Kierkegaard speaks of boredom’s action-instigating character as “magical”.

Fourth, for Kierkegaard, boredom is a sort of status symbol. It belongs to persons of rank. He writes, “Those who bore others are the plebeians; . . . those who bore themselves are the chosen ones, the nobility.”

d. Nietzsche

The German philosopher Friedrich Nietzsche (1844-1900) nowhere gives an extended treatment of boredom as such, but he does speak of it here and there throughout his writings, and much of what he says about it is thought-provoking. Here are some of his especially interesting points.

First, boredom is part of the explanation of Christian, saintly, or ascetic ideals and practices. These ideals are created, and these practices are followed, largely in an attempt to combat boredom. They are ways to fight it, ways to find a remedy for it. In On the Genealogy of Morals Nietzsche writes:

What do ascetic ideals mean? . . . . [A]mong physiologically impaired and peevish people (that is, among the majority of mortals) they are an attempt to imagine themselves as “too good” for this world, a holy form of orgiastic excess, their chief tool in the fight with their enduring pain and boredom.

He makes the same kind of point in The Antichrist:

In Christianity the instincts of the subjugated and oppressed come to the fore: here the lowest classes seek their salvation. The casuistry of sin, self-criticism, the inquisition of the conscience, are pursued as a pastime, as a remedy for boredom.

Second, boredom explains not only saints and ascetics. It explains virtually everything. In The Antichrist, Nietzsche says that, according to the story in Genesis at the beginning of the Bible, the old God is bored. So he invents man. Man is entertaining to God, but man himself is bored. God then creates animals for him to play with, but they do not entertain him. So, God creates woman.

If he is serious here, Nietzsche implies that, according to the Biblical story anyway, boredom is powerful indeed. It gave rise to the entire human and animal world!

Third, Nietzsche sometimes speaks positively of boredom. He agrees with Schopenhauer that boredom is a sign of vitality and intelligence in the one who has it. Anticipating Heidegger, Nietzsche says that a person who blocks all boredom from his or her life also blocks access to his or her deepest self and the water that flows from its fountain. In another place, Nietzsche makes a claim about boredom that would make it sound like a good thing to many of us, if not to Nietzsche himself. The claim is that, although normally we look away from those who are suffering, we sometimes attend to them and help them in order to rid ourselves of our own boredom.

But, fourth, Nietzsche also speaks of boredom as something we do not want. In Beyond Good and Evil he writes:

[L]et us be careful lest out of pure honesty we eventually become saints and bores! Is not life a hundred times too short for us—to bore ourselves?

Finally, an amusing remark of Nietzsche’s in his late Twilight of the Idols (which, unfortunately, may have some truth in it) is worth quoting:

“What is the task of all higher education?” To turn men into machines. “What are the means?” Man must learn to be bored. “How is that accomplished?” By means of the concept of duty. “Who serves as the model?” The philologist: he teaches grinding.

e. James

The American Pragmatist philosopher and psychologist William James (1842-1910) has a couple of interesting things to say about boredom in “The Perception of Time,” Chapter XV of his massive Principles of Psychology (1890).

First, James tells us what boredom is and the conditions under which it arises. Boredom, he says, is an experience or sensation that “comes about whenever, from the relative emptiness of content of a tract of time, we grow attentive to the passage of the time itself.” When bored, you attend closely to the mere feeling of time per se.

Second, James notes that the experience of boredom is unpleasant, even odious, and he offers an explanation of why that is so. The odiousness of the experience of boredom arises from its insipidity. The feeling of bare time is the least stimulating experience we can have, and stimulation is a necessary component for any pleasure we might find in, or get from, an experience. James quotes with apparent approval the statement of another psychologist who says that “the sensation of tedium is a protest against the entire present.”

4. Twentieth Century

a. Russell

The great British analytic philosopher Bertrand Russell (1872-1970) devotes an entire chapter of his popular book The Conquest of Happiness (1930) to boredom. The chapter is rich in ideas, despite some apparent confusion.

Russell offers a view of what boredom essentially is. “Boredom,” he says, “is essentially a thwarted desire for events.” And besides this thwarted desire, there are two additional essentials of boredom. One is a contrast between present circumstances and some other more agreeable circumstances which force themselves irresistibly upon the imagination. The other is that one’s faculties must not be fully occupied. The opposite of boredom is excitement.

Russell moreover begins the tradition of distinguishing different kinds of boredom (unless we can regard Seneca as the initiator of the tradition). Russell’s distinction is rather odd. The two kinds of boredom are the kind that arises from the absence of drugs and the kind that arises from the absence of vital activity.

There are at least two things in this account that many philosophers would take issue with. Those who think of boredom as a kind of empty longing, or a tame longing without an object, or a desire for a desire, would not accept Russell’s suggestion that in boredom circumstances other than the bored person’s present ones force themselves irresistibly upon his or her imagination. And some philosophers would think it obvious that the opposite of boredom is not excitement but interest. An excited person is no doubt interested in something, but a person whose interest is captured by something need not be excited by anything. It seems wrong to consider a person who is interested in something (for example, the crossword puzzle she is quietly working) as bored just because at the moment there is nothing in her that is exactly excitement.

Russell surmises that boredom (or fear of it and a desire to get rid of it) has been a great motivator throughout human history. It has produced wars, persecutions, quarrels with neighbors, and witch-hunts. Russell even speculates that “more than half” of the sins of humankind have been caused by fear of boredom.

Russell observes that boredom is unpleasant and thus it is natural to want to get rid of it. There is a deep-seated desire for excitement in human beings.

But boredom is not all bad. Some boredom may be a necessary ingredient in life. In fact, Russell says that a certain power of enduring boredom is essential to a happy life. He even claims that:

All great books contain boring portions, and all great lives have contained uninteresting stretches. . . . [A] quiet life is characteristic of great men, and . . . their pleasures have not been of the sort that would look exciting to the outward eye. No great achievement is possible without persistent work, so absorbing and so difficult that little energy is left over for the more strenuous kinds of amusement.

The idea is apparently that great lives require those who live them to endure a lot of quietness and boredom; and the same must be endured by those who make great achievements.

One issue much discussed in the twentieth-century interdisciplinary literature on boredom is whether there is more or less boredom in the modern era than in previous times. The usual claim is that there is more boredom in modernity, that is, now. Russell weighs in on this issue and offers the suggestion that there is actually less boredom now than in prior eras. Here is his view. Long ago, in the hunting stage of humanity, there was much excitement and little boredom. Early man was constantly involved in exciting activities that kept him entertained: hunting, fighting, courting, and so forth. But then several millennia ago the agricultural era began, an era that lasted right up to the modern period. Life in the agricultural era was incredibly boring. Work in the fields was generally solitary and repetitive. Life at home in the evenings was as dull as could be. There was no electricity; there were no books, music, or movies; there wasn’t much of anything to do except hunt occasionally for witches. The farmer and his family lived lives of perpetual boredom. But things changed drastically with the coming of the machine age and advances in technology. True, factory workers’ jobs are repetitive and sometimes tedious. But at least the workers usually have company. And during their time off work, there are many things for them to do. Modern life is much less boring than it was for centuries in the agricultural past.

Russell wrote this in 1930. It’s pretty clear that he would say that life is even less boring in the current computer and iPhone age. Most people simply aren’t bored very often these days. They have much excitement and little boredom.

But Russell makes this observation. Although we experience less boredom than our ancestors did, we are more afraid of boredom than they were. They just accepted it; we think that an ideal human life should be completely free of boredom. We tend to think that boredom is not a part of a natural human life.

And so many of us seek one exciting stimulation after another. Russell thinks that this approach to life won’t work. Too much excitement leads to the need for more and more excitement, which results in the end in the inability to be excited at all—and also in the inability to experience many or most of the joys of life. A too zealous quest for excitement leads, paradoxically, to boredom.

Russell isn’t opposed to the pursuit of excitement altogether. If it is engaged in only rarely and then in moderation, it can contribute greatly to the happiness of a life. But in the end Russell recommends a quiet life, one in sync with “the rhythm of the earth”. “A happy life,” he says, “must be to a great extent a quiet life, for it is only in an atmosphere of quiet that true joy can live.”

Russell sometimes speaks as though, in recommending a quiet life, what he is recommending is a life that contains a large amount of boredom. But there may be some confusion on his part here. It seems clear that what he is really recommending is a life that looks boring from the outside, and would be boring to one who needed a lot of stimulation and excitement to be happy, not a life that is boring to the one living it. For the quiet life is one of true joy.

b. Heidegger

The German philosopher Martin Heidegger (1889-1976) discusses boredom at length in his 1929-30 lecture series The Fundamental Concepts of Metaphysics. What Heidegger says there will be the focus of the present summary of his conception of boredom and its significance.

This summary can be little more than a sketch—for several reasons: Heidegger’s treatment of boredom is complex and subject to various interpretations; it cannot be fully understood apart from the vast body of his philosophical work as a whole; and much of it is couched in technical terminology, or in ordinary terms to which Heidegger gives special meanings, which makes it difficult to render into readily intelligible English prose.

But we can say the following things with some confidence. They will necessarily be disjointed. Their relationship is unclear even to many of Heidegger’s close readers.

(1) Heidegger writes about boredom more than any other major philosopher, and he perhaps attributes greater significance to it than any other major thinker has.

(2) The central concept of Heidegger’s philosophy is Dasein, literally, “being there”. Dasein is the kind of being we are. We are beings who are there, in the world. Throughout his long academic career, Heidegger was preoccupied with the question of the meaning of being. In everyday German language the word “Dasein” means “life” or “existence.” Dasein, that being which we ourselves are, is distinguished from all other beings by the fact that it makes an issue of its own being. As Da-sein, it is the location, “Da”, for the disclosure of being, “Sein.”

(3) Fundamental moods or “attunements” figure prominently in Heidegger’s thought. They reveal Being to us. But moods are not to be thought of as mere subjective feelings, inner happenings, or responses to objective facts. A mood is neither internal nor external; a mood goes beyond such a distinction and is a basic characteristic of being-in-the-world. It is by way of a mood that we relate to our surroundings. Moods have epistemic as well as merely subjective significance. They reveal the world to us as much or more than our senses do.

(4) Boredom, Langeweile, is a fundamental attunement, a mood. Along with anxiety, it is one of the most important and profound ones. Heidegger makes a distinction between being bored with something and boring oneself with something. The latter is a more profound and useful form of boredom. There may be an even more profound form of boredom. Normally, it is there, in us, but asleep. Heidegger wants to wake it up.

(5) Heidegger wants to awaken boredom rather than let it slumber through various forms of everyday pastime. Boredom, and we ourselves, are asleep in our everyday pastimes in our actual life. We like being asleep. We like lives of slumbering distractions. We seek to be occupied because it liberates us from the emptiness of boredom. But why on earth would we want to wake up, and especially to awaken in a mood as dreary and empty as boredom? Boredom removes an illusion of meaning from things and allows them to appear as what they are: emptiness and nothingness. Who in her right mind would want to remove such an illusion?

(6) Heidegger’s main answer to these questions may be: Boredom prepares the mind for profound vision. Svendsen writes:

By awakening the mood of boredom Heidegger believes we will be in a position to gain access to time and the meaning of being. For Heidegger, boredom is a privileged fundamental mood because it leads us directly into the very problem complex of being and time.

Profound boredom can set us on the road to authenticity. When boredom works its magic, what is left is nothing less than Being itself and its meaning—if it has any. But Dasein is still there, and Being can reveal itself to Dasein.

(7) Heidegger has other answers to the questions raised above (about why one would want boredom and its insights). Here are two of them: (a) accompanying sober boredom is a strange kind of calm joy; and (b) “[p]hilosophy is born in the nothingness of boredom.”

(8) Finally, let us mention three of Heidegger’s points about boredom that have some interest in their own right. They are: (a) normally time is transparent, but in profound boredom we experience time as time; (b) boredom is a mood that in many respects is like an absence of mood; it is indeed a mood, a fundamental attunement, but it is also, paradoxically, a kind of a non-mood; and (c) Heidegger’s answer to the question of what exactly in the world it is that bores us is that it is the Boring. Wrestling with these three claims, wondering if they make sense and, if they do, what sense it is, would make a fit pastime for whiling away a slow slumbering evening.

c. Bernard Williams

The famous essay by the prominent English moral philosopher Bernard Williams (1929-2003), “The Makropulos Case: Reflections on the Tedium of Immortality,” is not about the nature of boredom as such. Its thesis is, roughly, that it would not be a good thing to live forever, for eventually immortal life would become boring. But in the essay, Williams makes several important points about boredom.

Williams’s conception of boredom is apparent from his language. He thinks of boredom as indifference, detachment, coldness, and inner death.

Long-lasting sameness is what brings it on. Williams explores the issue by reference to his test case, a woman called “EM” in a play Williams uses as a springboard for his reflections. EM had taken an elixir at age 42 which kept her at 42 and continued to do so every year she took it. At the time of the action in the play, EM has been 42 for 300 years, and she is bored to death. She refuses this time to take the elixir, and she dies. EM’s boredom is connected with the fact that everything that could happen and make sense to one particular human being had already happened to her.

One implication of this is that, in Williams’s view, boredom is a rather serious thing that can motivate, not suicide exactly, but the choice of death over life.

Williams seems to think that EM’s boredom (and consequent choice of death over life) is entirely understandable, proper, and fitting.

In all this there is the idea, stated more or less explicitly at certain points, that there are circumstances in which one ought to be seriously bored. Not being bored suggests an impoverishment in one’s consciousness of one’s circumstances. Williams writes that “not being bored can be a sign of not noticing or not reflecting enough.” So, Williams may think that sometimes we have a moral, or more broadly ethical, or still more broadly human, reason to be bored. There are times when we ought to be bored, and not being so represents some kind of ethical, intellectual, or human failure.

5. Philosophical Work since the 1990s

It is impossible to do justice here to current writing on boredom. Work on it by philosophers (and others) is thriving. The major philosophical work, as has been mentioned, is Svendsen’s Philosophy of Boredom. Other important works are: Toohey’s Boredom: A Lively History; the Routledge Boredom Studies Reader: Frameworks and Perspectives, edited by M. E. Gardiner and J. J. Haladyn, a rich anthology by a diverse group of contributors exploring multiple issues, many of them philosophical, that the phenomenon of boredom raises; W. O’Brien’s “Boredom” in the 2014 volume of the journal Analysis; and several works on boredom by Andreas Elpidorou, probably the most prolific and certainly one of the most interesting of the writers on the subject at present. Only two of the issues in the current debate will be mentioned here.

First, there is discussion about the question of whether there really is such a thing as “existential” or “profound” boredom (as distinct from everyday situational boredom, whose reality nobody questions). Svendsen, following Heidegger, argues that existential boredom is indeed real and important. Toohey, in contrast, denies that there is any such state, arguing that what has been misidentified as such is actually a particular form of depression. Although there are exceptions, “analytical” philosophers tend to side with Toohey on this matter while “continental” philosophers tend to side with Svendsen. (Perhaps this is due to the great influence of Heidegger on continental thought, an influence largely absent in analytical circles.)

Second, an analysis of the concept of boredom has been proposed by Wendell O’Brien in the journal Analysis. O’Brien suggests that boredom is an unpleasant or undesirable mental state of weariness, restlessness, and lack of interest in something to which one is subjected, a state in which the weariness and restlessness are causally connected in some way to the lack of interest. The extent to which O’Brien’s analysis is satisfactory remains to be seen. Those on the Toohey side of the disagreement just mentioned are likely to accept something like it. Those on the Svendsen side are likely to find it deeply problematic, though they may concede that some variation of the sort of analysis O’Brien proposes might capture the notion of everyday situational boredom.

If the reader wishes to pursue study of this current work, it is suggested that he or she begin by consulting post-1990 sources listed in the “References and Further Reading” below.

6. References and Further Reading

  • Ecclesiastes. KJV.
  • Elpidorou, A. 2014. “The Bright Side of Boredom,” Frontiers in Psychology 5: 1245.
  • Elpidorou, A. 2016. “The Significance of Boredom: A Sartrean Reading,” in Philosophy of Mind and Phenomenology: Conceptual and Empirical Approaches, ed. D. O. Dahlstrom, A. Elpidorou, and W. Hopp, pp. 268-285. London: Routledge.
  • Frankfurt, H. 1999. “On the Usefulness of Final Ends,” in Necessity, Volition, and Love, 82-94. Cambridge and New York: Cambridge University Press.
  • Gardiner, M. E. and J. J. Haladyn, eds. 2016. Boredom Studies Reader: Frameworks and Perspectives. London: Routledge.
  • Healy, S. D. 1984. Boredom, Self, and Culture. Madison, N. J.: Fairleigh Dickinson University Press.
  • Heidegger, M. 1995. The Fundamental Concepts of Metaphysics, W. McNeill & N. Walker trans. Bloomington: Indiana University Press.
  • James, W. 1890. The Principles of Psychology. (Now in the public domain and readily accessible online.)
  • Kant, I. 1963. Lectures on Ethics, Louis Infield trans. London: Harper & Row.
  • Kierkegaard, S. 1992. Either/Or, Alastair Hannay trans. & abridged. London: Penguin Books.
  • Lombardo, N. E. 2017. “Boredom and Modern Culture,” Logos: A Journal of Catholic Thought and Culture, 20: 2, 36-59.
  • Millgram, E. 2004. “On Being Bored Out of Your Mind,” Proceedings of the Aristotelian Society, New Series, Vol. 104, pp. 165-186.
  • Nietzsche, F. 1954. The Portable Nietzsche, W. Kaufmann trans. & ed. New York: Viking Penguin.
  • Nietzsche, F. 1968. Basic Writings of Nietzsche, W. Kaufmann trans. & ed. New York: The Modern Library, Random House.
  • O’Brien, W. 2014. “Boredom,” Analysis 74:2, 236-243.
  • Pascal, B. 1958. Pensées, W. F. Trotter trans. New York: E. P. Dutton.
  • Raposa, M. L. 1999. Boredom and the Religious Imagination. Charlottesville: University of Virginia Press.
  • Russell, B. 1930. The Conquest of Happiness. London: Liveright.
  • Schopenhauer, A. 1970. Essays and Aphorisms, R. J. Hollingdale trans. London: Penguin Books.
  • Seneca, L. 1917. Epistles: Vols. IV-VI. Loeb Classical Library, R. M. Gummere, trans. Cambridge, MA.: Harvard University Press.
  • Spacks, P. M. 1995. Boredom: The Literary History of a State of Mind. Chicago: The University of Chicago Press.
  • Svendsen, L. 2005. A Philosophy of Boredom, John Irons trans. London: Reaktion Books.
  • Thoreau, H. 1983. Walden and Civil Disobedience. New York: Penguin Books. (Originally published in 1854.)
  • Toohey, P. 2011. Boredom: A Lively History. New Haven: Yale University Press.
  • Williams, B. 1973. “The Makropulos Case: Reflections on the Tedium of Immortality,” in Problems of the Self, pp. 81-100. Cambridge: Cambridge University Press.
  • Yao, V. 2015. “Boredom and the Divided Mind,” Res Philosophica, 92:4, 937-957.

 

Author Information

Wendell O’Brien
Email: w.obrien@moreheadstate.edu
Morehead State University
U. S. A.

Mind and the Causal Exclusion Problem

The causal exclusion problem is an objection to nonreductive physicalist models of mental causation. Mental causation occurs when behavioural effects have mental causes: Jennie eats a peach because she wants one; Marvin goes to Harvard because he chose to, etc. Nonreductive physicalists typically supplement adherence to mental causation with the view that behavioural effects have distinct sufficient physical causes as well: Jennie eats a peach because the muscles in her arms contracted as a result of the innervations of muscle fibres, which were in turn caused by the release of neurotransmitters from the motor neurons at the neuromuscular junction, and so on and so forth. Nonreductive physicalists, therefore, argue that behavioural effects have sufficient physical causes and distinct mental causes. The causal exclusion problem is the leading objection to this view, and it is based on the causal exclusion principle, which stipulates that events cannot have more than a single sufficient cause. The causal exclusion principle conflicts with the nonreductive physicalist view that behavioural effects have a sufficient physical cause and a distinct mental cause. Critics typically add that the sufficient physical cause of the behavioural effect excludes the mental cause of the same effect, so nonreductive physicalism also fails to secure mental causation.

Various responses to the causal exclusion problem have been suggested. Some overcome the causal exclusion problem by undermining certain metaphysical foundations supporting the problem. For example, some adopt differing models of events and properties, thereby avoiding the thrust of the causal exclusion problem. Others turn to differing models of causation to dissipate exclusion pressures. There are those who resolve the causal exclusion problem by providing robust nonreductive physicalist models of mental causation. These models include supervenience-based nonreductive physicalism, emergentism, functionalism, and the realization strategy. Each of these models attempts to show that the causal exclusion problem does not defeat it. Still others respond to the causal exclusion problem by rejecting one of the principles undergirding the causal exclusion problem. For example, the epiphenomenalist rejects the principle of mental causation, the interactionist dualist rejects the principle of physical causal completeness, the reductionist abandons the principle of irreducibility, and the compatibilist rejects the principle of causal exclusion. These views must demonstrate the viability of rejecting one of these widely accepted principles.

This article introduces and motivates the causal exclusion problem, and considers the merits and demerits of these avenues of response to the causal exclusion problem. The stakes are high. Not only does the causal exclusion problem pose difficulties for reconciling autonomous agency with neuroscientific advances that increasingly establish neural causes of behavioural effects, but it also threatens to leak out into other domains. This article closes with a discussion of the possibility that the causal exclusion problem applies to the realm of explanation as well, and the viability of the view that the causal exclusion problem may generalize to other special science disciplines such as sociology, economics, geology, and biology.

Table of Contents

  1. Introduction
  2. The Causal Exclusion Problem
    1. The Principle of Mental Causation
    2. The Principle of Physical Causal Completeness
    3. The Principle of Irreducibility
    4. The Principle of Causal Exclusion
  3. Solutions to the Causal Exclusion Problem
    1. The Metaphysics of Events and Properties
    2. The Metaphysics of Causation
    3. Supervenience
    4. Emergentism
    5. Functionalism
    6. Realization
    7. Epiphenomenalism and Autonomy
    8. Interactionist Dualism
    9. Reductionism
    10. Compatibilism
  4. Explanatory Exclusion
  5. The Generalization Problem
  6. Conclusion
  7. References and Further Reading

1. Introduction

While disputes about mental causation arise in ancient philosophy, the locus classicus of the problem of mental causation is René Descartes. Descartes argued that the mind is a thinking substance that is distinct from the body, which is an extended substance. Descartes supplemented this substance dualism with a principle of interactionism, according to which the mind causally interacts with the body. For example, Jennie’s wanting a peach, which is distinct from the physical processes in her brain, causes her to eat a peach. Descartes’ interactionist dualism faced numerous difficulties. Princess Elizabeth of the Palatinate and others argued that thinking substance, which is not extended in space, cannot come into causal contact with the extended body. Henry More and others argued that a distinct thinking substance that causally interacts with a body would violate conservation principles by increasing the motion of the universe.

Contemporary discussion on the problem of mental causation typically begins with the Type Identity Theorists of the mid-twentieth century. They argued that mental states are type identical with causally efficacious physical states, thereby securing mental causation. For example, Jennie’s wanting a peach is a causally efficacious physical process in her brain, so Jennie’s peach eating has a mental cause, which is the sufficient physical cause. The type identity theory faced numerous difficulties as well. Hilary Putnam and others argued that mental properties are multiply realizable, so they cannot be identical with specific physical properties. David Chalmers and others argued that mental states have qualitative or intentional properties that are irreducible to physical processes.

The failure of type reductionism led to the currently dominant solution to the problem of mental causation, nonreductive physicalism. Nonreductive physicalists argue that behavioural effects have sufficient physical causes and distinct mental causes. For example, Jennie’s peach eating has a mental cause that supervenes upon a distinct sufficient physical cause of the behaviour. In recent years, this nonreductive hegemony has been likewise threatened. The causal exclusion problem is the principal weapon fashioned against nonreductive physicalist solutions to the mental causation problem. The causal exclusion problem is most thoroughly exposited in a series of articles and books by Jaegwon Kim (Kim, 1998; Kim, 2005). He argues in favour of the causal exclusion principle, which states that: “No single event can have more than one sufficient cause occurring at any given time…” (Kim, 2005, 42). Accordingly, Jennie’s peach eating cannot have a mental cause and a distinct sufficient physical cause, as nonreductive physicalism posits. As a result of this causal exclusion problem, nonreductive physicalist solutions to the mental causation problem are currently threatened.

2. The Causal Exclusion Problem

In brief, the causal exclusion problem amounts to the difficulty of establishing the nonreductive physicalist view that behavioural effects have sufficient physical causes and distinct mental causes, over and against the plausibility of the view that the sufficient physical cause of the behaviour excludes the mental event from causally influencing the behaviour. If mental events are excluded from causally influencing behavioural effects, nonreductive physicalism fails to secure mental causation, a shortcoming which is probably fatal. More formally, according to a common though not universal presentation, the causal exclusion problem is the conjunction of the following four individually plausible, but (seemingly) jointly inconsistent principles.

a. The Principle of Mental Causation

The first of the four principles constituting the causal exclusion problem is the principle of mental causation. Broadly construed, the principle of mental causation stipulates that some events have mental causes:

The Principle of Mental Causation: some events have mental causes.

This initial definition is subject to considerable nuance, including the following three distinctions. First, there are questions about whether mental causes should be construed as substances, events, or properties of events. Traditional models of substance dualism, most famously espoused by René Descartes, suppose that mental causes are substances, such as a soul or a disembodied mind. Most contemporary philosophers reject this view in favour of the view that mental phenomena are events. However, there are deep, yet relevant, disagreements about the nature of events, and whether events, in virtue of certain properties, are causally efficacious. These issues will be dealt with in detail in Section 3.a.

Second, there are questions about whether the events that have mental causes are mental effects, physical effects, or both. Some philosophers endorse the autonomist view, according to which mental events cause mental effects but do not cause physical effects (Gibbons, 2006). Devin’s sadness is caused by his belief that his gecko died, but his crying has physical causes. This view will be dealt with in Section 3.g.

Third, many philosophers distinguish between autonomous mental causation and reduced mental causation. Nonreductive physicalists endorse autonomous mental causation, which is the conjunction of the principle of mental causation and the principle of irreducibility. Autonomous mental causation is the view that the mental-as-mental causes effects. Reductive physicalists endorse reduced mental causation, which is the conjunction of mental causation with a rejection of irreducibility. Reduced mental causation is the view that the mental-as-physical causes effects. Most, even some reductionists (Kim, 2005, 159), agree that autonomous mental causation would be preferable, though the result of the causal exclusion problem may be that autonomous mental causation is not possible within a physicalistic metaphysic.

There are numerous arguments in support of the principle of mental causation. First, Donald Davidson overthrew the consensus against mental causation by highlighting the plausible distinction between having a reason for acting, and acting for a reason (Davidson, 1963). A student may have a desire to impress the teacher as a reason for asking a question, but the student may actually ask the question because he wants to know the answer. This plausible distinction presumes that reasons are causes, which supports the principle of mental causation. Second, the moral responsibility argument: an ought implies a can, and a can implies mental causation. Someone locked up in chains, literally unable to move, is not morally responsible for not helping someone who has fallen. Similarly, if humans are unable to act, since they lack mental causation, they lack moral responsibility (Kim, 2005, 9). Third, the epistemic argument: knowing implies that the justification relation between premise and conclusion, or sensation and belief, is a causal relation. Imagine a random syllogism generator that spits out a million invalid syllogisms in a row: monkeys like bananas, tomatoes are red, therefore the sky is blue. Then, at last, it gets lucky: humans are animals, animals are mortal, therefore humans are mortal. There is justified true belief here, but not knowledge. What is missing? Among other things, the conclusion does not occur because of the reasonableness of the premises, where “because” is taken literally: the premises must be the cause (Brewer, 1995, 242). Finally, an evolutionary argument (Jackson, 1982, 133): organisms typically inherit traits that enhance fitness, so, probably, mental events enhance fitness. If mental events lack causal efficacy, they do not enhance fitness. So, probably, mental events are causally efficacious.
For these reasons, many think the principle of mental causation must be taken as a “truism” (Ney, 2007, 486), whose rejection would amount to “the end of the world” (Fodor, 1989, 77).

b. The Principle of Physical Causal Completeness

The second of the four principles constituting the causal exclusion problem is the principle of physical causal completeness:

The Principle of Physical Causal Completeness: every physical event that has a cause has a sufficient physical cause.

This principle is subject to several possible modifications. First, the principle of physical causal completeness is stated deterministically. If it turns out that the completed microphysics is indeterministic, it would be easy to reframe this principle in terms of a probabilistic model of microphysics. For example, every physical event has its probability fixed by entirely physical antecedents (Papineau, 1993, 22; Bennett, 2008, 281). Nothing of substance rides on adopting the deterministic or probabilistic version, but the deterministic reading is often used for the sake of simplicity. Second, this principle, as presently defined, merely stipulates that physical events have sufficient physical causes, but does not require that mental events have sufficient physical causes. Thus, it is possible that physical events have sufficient physical causes, but mental events do not. This possibility is ruled out by the addition of a strong supervenience principle, which, for the purposes of the mental causation debate, can be defined as follows:

The Principle of Supervenience: physical events determine every mental event, and every mental event depends upon physical events.

As a general example of supervenience, imagine a picture of Mona Lisa printed out by a dot-matrix printer. The Mona Lisa depends upon the dots on the page, and the dots on the page determine that the Mona Lisa arises. Likewise in the case of mental causation: Jennie’s desire for a peach is determined by, and dependent upon, some series of neural events in Jennie’s brain. The supervenience principle, combined with the principle of physical causal completeness, says that not only does every physical event have a sufficient physical cause, but every event, including mental events, is determined by physical events. This can serve as an adequate definition of physicalism:

Physicalism: every physical event has a sufficient physical cause, and every event is determined by, and depends upon, physical events.

Not only do many physicalists supplement the principle of physical causal completeness with a supervenience principle, but some also strengthen the principle of physical causal completeness into a principle of physical causal closure. The principle of physical causal closure indicates that physical events only have sufficient physical causes. On physical causal completeness, distinct mental causes are not definitively barred from causally interacting with physical events (Kim, 2009, 38; Montero, 2003, 174). That is, one can admit that every physical event has a sufficient physical cause, while continuing to add a mental cause for the event as well (Marcus, 2005, 19ff; Crane and Mellor, 1990, 206). This possibility is closed off by stipulating that physical events only have physical causes (Vicente, 2006, 150; Kim, 2005, 50; Montero, 2003, 175). Most, however, do not endorse this stronger principle of physical causal closure, since it appears to exclude nonreductive physicalist models of mental causation (Kim, 2005, 52; Lowe, 2000, 572).

The principle of physical causal completeness is supported by two arguments. First, the appeal to conservation laws. There are a number of sources that extensively discuss the historical ascension of conservation laws in modern physics (Harbecke, 2008, 19ff; Papineau, 2001, 13ff). In brief, Descartes introduced the law of the conservation of motion, according to which the total mass times speed of any set of bodies remains constant. Descartes maintained, however, that the mind could alter the direction of bodies without altering their speed. Leibniz, however, established the law of the conservation of linear momentum, according to which the total mass times speed and direction of any set of bodies remains constant, regardless of how they interact. Leibniz argued that this conservation law closed the physical world off from mental causes. Several centuries later, Hermann von Helmholtz added the law of conservation of energy: the total energy, or force, of any system of interacting bodies is conserved, or, remains the same across time. The result of these conservation laws is that every physical event has a sufficient physical cause. Those endorsing physical causal closure also use this argument to yield the stronger conclusion that every physical event has only a sufficient physical cause, as distinct mental causes cannot add energy to a closed physical system. Second, physical causal completeness is supported by the success of neuroscience. In the past one hundred years, neuroscientists have successfully mapped neuronal processes responsible for a wide range of behavioural effects. Brain regions associated with mental states such as emotions, cognitive capacities, and perceptual capacities have been discovered. While neuroscience is not yet complete, these findings provide increasingly compelling evidence that every behavioural effect has a sufficient physical cause.
For these reasons, many think that the principle of physical causal completeness is “fully established” (Papineau, 2001, 33).

c. The Principle of Irreducibility

The third principle constituting the causal exclusion problem is the principle of irreducibility, according to which mental causes of behavioural effects are distinct from physical causes of behavioural effects.

The Principle of Irreducibility: mental causes of behaviour are distinct from physical causes of behaviour.

The principle of irreducibility, like the previous two principles, has different readings. Some take the principle of irreducibility to mean that mental properties are distinct from physical properties, though mental events are identical with physical events (Davidson, 1993, 3; Fodor, 1974, 100). That is, a brain event in Jennie’s brain has neural properties, such as its cascading neural activity, and distinct mental properties, such as a felt desire for a peach. On this view, mental causation typically occurs when the brain event, in virtue of its mental properties, causes behavioural effects. Others take the principle of irreducibility to mean that mental properties are distinct from physical properties, and mental events are distinct from physical events (Kim, 2005, 42). In this case, Jennie’s neural activity is a distinct event from her felt desire for a peach. On this view, mental causation typically occurs when the mental event causes behavioural effects. This distinction is central to a strategy for overcoming the causal exclusion problem, as discussed in Section 3.a.

There are two leading arguments in support of the principle of irreducibility. First, Leibniz’s doctrine of the indiscernibility of identicals stipulates that if two entities are identical they share all the same properties. Thus, a moose is not identical with a bear if the moose has antlers but the bear does not, and mental causes are not identical with physical causes if mental causes have distinct properties from physical causes. Some argue that mental events are subjective experiences such as itches or pains while physical events are objective chemical interactions (Chalmers, 1996). Others argue that mental events are purposive, intentional, and rational, while chemical activity in the brain is not (Silberstein, 2001, 85). Still others contrast free agency with physical determinism. Second, the principle of irreducibility is supported by the multiple realizability argument (Putnam, 1967), according to which mental properties can be realized by a number of different physical property instances. For instance, the same hunger for fish is realized by different neural activity in humans and sharks. Since a self-identical property must always be present wherever it is, hunger cannot be identical with a specific neural activity in humans, since hunger is also present where this specific neural property is absent. Some philosophers add that the same mental event can be multiply realized over time as well (Pereboom, 2002, 503). In this way, Jennie’s belief that Cheerios are yummy has persisted since she was three, despite slight alterations to the neural correlates constituting her belief over time. For these reasons, many take the failure of the principle of irreducibility to be “simply inconceivable” (Slors and Walter, 2002, 1), as evidenced by reductionists who acknowledge the “grip” of the “compelling intuition” (Papineau, 2002, 3).
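The appeal to the indiscernibility of identicals can be put schematically. What follows is only a sketch in standard second-order notation, not a formalism used by the authors cited; the symbols m (a mental cause), p (a physical cause), and F (an arbitrary property) are introduced here purely for illustration.

```latex
% Indiscernibility of identicals: identical things share all properties.
\[
  \forall x \,\forall y \,\bigl( x = y \;\rightarrow\; \forall F \,( Fx \leftrightarrow Fy ) \bigr)
\]
% Contrapositive, applied to a mental cause m and a physical cause p:
% if some property F holds of m but not of p, then m and p are distinct.
\[
  \exists F \,( Fm \wedge \neg Fp ) \;\rightarrow\; m \neq p
\]
```

On this reading, each of the contrasts listed above (subjective versus objective, intentional versus non-intentional, free versus determined) is offered as a candidate value of F.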

d. The Principle of Causal Exclusion

The final principle constituting the causal exclusion problem is the principle of causal exclusion. In its contemporary formulation, the causal exclusion principle is introduced by Norman Malcolm (1968), but is most thoroughly exposited by Jaegwon Kim. Here is Kim’s articulation of the principle:

The Principle of Causal Exclusion: “No single event can have more than one sufficient cause occurring at any given time—unless it is a genuine case of causal overdetermination” (Kim, 2005, 42).

It is worth highlighting several important features of this definition. First, the causal exclusion principle comes with the caveat that an event can have more than one sufficient cause if the event is genuinely overdetermined. Genuine overdetermination occurs when two independent causal processes converge on the same effect—the house burns down because the lit match drops in the garbage at the same time as the lightning strikes the house. The nonreductive physicalist posits that mental events supervene on physical events, so there are not two independent causal processes, so behavioural effects are not genuine cases of overdetermination. As a result, the causal exclusion principle directly opposes the nonreductive physicalist view that behavioural effects can have sufficient physical causes and dependent mental causes.

The causal exclusion principle specifies that events cannot have more than one sufficient cause occurring at a given time. This caveat is added in order to set aside instances involving causal chains where a is a sufficient cause of b, and b is a sufficient cause of c, thereby indicating that a is also a sufficient cause of c. It is acceptable for c to have both a and b as sufficient causes in this way. But, since the nonreductive physicalist argues that mental events supervene on physical events, these events occur simultaneously, so the causal exclusion principle applies straightforwardly.

The causal exclusion principle can be interpreted as stating that behavioural events cannot have two sufficient causes (Arnadottir and Crane, 2013, 254). This interpretation allows for the following trivial solution to the causal exclusion problem: behavioural effects have one sufficient physical cause and a distinct but insufficient mental cause, which does not violate the causal exclusion principle. Indeed, the principle of mental causation does not stipulate that mental events must be sufficient causes, so this solution would be available. The alternative reading of the causal exclusion principle rules out this scenario by stating that behavioural events cannot have a single sufficient cause and any other cause, partial or sufficient (Kim, 2005, 17).

While the causal exclusion principle rules out more than a single sufficient cause of an effect, it does not rule out the possibility that the effect has a sufficient physical cause and a non-causal determinant. This is an especially pertinent point, since physical events non-causally determine supervening mental events. Thus, the following trivial solutions to the causal exclusion problem are available: mental effects have sufficient mental causes, and are determined by subvening physical events (Thomasson, 1998, 183-186); physical effects have sufficient physical causes, and are determined by distinct mental events. These moves are repelled in two ways. First, by appealing to a broader principle of determinative exclusion, according to which effects can have no more than a single determinant, causal or otherwise (Kim, 2005, 17). Second, by appealing to Edwards’s Dictum, which says sufficient synchronic determination relations exclude causal relations (Kim, 2005, 36ff). For example, the existence of a university at a time is what ultimately determines that Dr. Smith is a professor at the university at that time, not the fact that she got tenure two years ago. Similarly, subvening physical bases determine mental events, thereby excluding prior mental events as causes of those mental events.

There are four arguments in support of the causal exclusion principle. First, the massive coincidence argument: an effect’s having multiple sufficient causes is a rare coincidence, since barns infrequently burn down by the simultaneous occurrence of a lightning strike and a dropped match. Yet, mental causation is ubiquitous: agents perform acts based on reasons hundreds of times a day and there are billions of agents in the world. Thus, the view that behavioural effects have more than a single sufficient cause is a view that stipulates massive amounts of coincidence, which is implausible (Kim, 1998, 53). Second, the parsimony argument: according to the venerable principle of parsimony, one ought not multiply causes beyond necessity, where necessity is eclipsed once sufficient causation is established (Kim, 1989, 98). Third, the necessity argument: overdetermining causes are individually sufficient, so not individually necessary. Billy and Suzy both throw stones, simultaneously breaking the window. Billy’s throw is individually sufficient (his throw alone, without Suzy’s, would have broken the window), so Suzy’s throw is individually unnecessary. If behaviour is likewise overdetermined, neither cause is individually necessary. But, physical causal completeness insists that some physical cause is necessary, and mental causation requires a mental cause (Moore, 2017). Fourth, the additivity argument: if causation involves production, and one sufficient cause packs all the punch required to produce the effect, then a second cause would push the effect too far, or be incapable of producing the effect at all, on account of the fact that the effect has already been fully produced (Carey, 2011, 253; Kim, 1998, 53).

To briefly summarize, each of the four principles constituting the causal exclusion problem is substantially motivated. But, it is difficult to imagine how one can consistently endorse all four principles. How can one agree that behavioural effects can have no more causes than the single sufficient physical cause, while simultaneously arguing that behavioural effects nevertheless have distinct mental causes as well? The nonreductive physicalist endorses the first three principles, leaving the fourth principle as an objection to the nonreductive physicalist view. The next section canvasses a variety of resolutions to this causal exclusion problem.
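The apparent inconsistency can be laid out schematically for a single behavioural effect. This is a sketch under stated assumptions rather than Kim’s own formalization: e is a behavioural effect, m a mental event, p a physical event occurring at the same time as m, C(c, e) abbreviates “c is a cause of e”, S(c, e) abbreviates “c is a sufficient cause of e”, and O(e) abbreviates “e is genuinely overdetermined”. All of these symbols are introduced here for illustration, and premise (4) uses the stronger reading of exclusion discussed above, on which a sufficient cause excludes any other simultaneous cause.

```latex
\begin{align*}
 &(1)\quad C(m, e)
   && \text{mental causation}\\
 &(2)\quad S(p, e)
   && \text{physical causal completeness}\\
 &(3)\quad m \neq p
   && \text{irreducibility}\\
 &(4)\quad \bigl( S(p, e) \wedge \neg O(e) \bigr) \rightarrow
           \forall c \,\bigl( c \neq p \rightarrow \neg C(c, e) \bigr)
   && \text{causal exclusion (strong reading)}\\
 &(5)\quad \neg O(e)
   && \text{$m$ supervenes on $p$, so the causes are not independent}\\
 &(6)\quad \neg C(m, e)
   && \text{from (2), (3), (4), (5)}\\
 &(7)\quad \bot
   && \text{from (1), (6)}
\end{align*}
```

Each family of responses surveyed in the next section can be read as targeting one of the lines: rejecting one of the four principles corresponds to denying one of (1) through (4), while the metaphysical strategies question how the terms “event” and “cause” in those premises should be understood.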

3. Solutions to the Causal Exclusion Problem

Numerous responses to the causal exclusion problem have arisen. Some solutions modify the metaphysical foundations underlying the causal exclusion problem (3.a-3.b). Others propose models of nonreductive physicalism that secure mental causation (3.c-3.f). Still others reject one of the four principles constituting the causal exclusion problem (3.g-3.j).

a. The Metaphysics of Events and Properties

The causal exclusion problem is grounded in a set of metaphysical assumptions. Some critics undermine the causal exclusion problem by refuting or simply rejecting these metaphysical assumptions. One such strategy focuses on the nature of events and properties presumed by the causal exclusion problem. Kim frames the causal exclusion principle in terms of events: no single event can have more than a single sufficient cause. Likewise, the principle of physical causal completeness indicates that physical events have sufficient physical causes, while the principle of mental causation states that events sometimes have mental causes. Clearly, the nature of events is central to the causal exclusion problem.

Kim endorses the property exemplification model of events, according to which an event is the instantiation of a property by an object at a time (Kim, 1976). Sebastian’s stroll at noon, or Brutus’ stabbing at sunset, are prototypical events as are the brain’s neural process at dawn and Joe’s pain at dawn. While each event has one constitutive property, events can have other properties as well. Sebastian’s stroll at noon has the constitutive property of ‘being a stroll’, while it also has the properties of being long and winding. Events are identical if they have the same object, constitutive property, and time. Thus, Sebastian’s stroll at noon is identical with the man’s stroll at 12:00 PM, but is not identical with Grace’s stroll at noon, or with Sebastian’s sleep at noon, or with Sebastian’s stroll at sunset. These identity conditions on events entail that a mental event is only identical with a physical event if, among other things, the mental property of the event is identical with the physical property of the event (Kim, 2005, 42). This is called the single-instantiation thesis, and a number of authors agree with it (Whittle, 2007, 64; Gibb, 2004, 469). The implication is that one cannot yoke event identity with property dualism.

Cynthia MacDonald and Graham MacDonald modify the property exemplification model in a manner that opens up a solution to the causal exclusion problem (MacDonald and MacDonald, 2006). They note that the same event can be the instantiation of both a constitutive property and numerous other properties as well. Sebastian’s stroll is the same event as Sebastian’s walk at noon, and Sebastian’s moving at noon, and Sebastian’s exercising at noon. As they say, “there can be just one instance of distinct properties” (MacDonald and MacDonald, 2006, 562). This co-instantiation thesis, applied to the mental causation debate, suggests that the same event can be an instance of a physical property and a distinct mental property. In other words, one can yoke event identity with property dualism. This affords the following solution to the causal exclusion problem: mental causes are identical with sufficient physical causes. This means there are not two sufficient causes, while mental properties remain irreducible to physical properties.

This solution not only rests upon the successful modification of Kim’s framework, but also faces a version of the quausal problem (Honderich, 1982, 63-64; Sosa, 1984, 277; Kim, 1984, 267). The quausal problem stipulates that events are caused by virtue of causally relevant properties. For example, while the heavy, green pear causes the scale to tip to one pound, it is in virtue of the pear’s heaviness that the scale tips to one pound, not in virtue of the pear’s greenness. Likewise, while the event causes behavioural effects, it is plausible that the event, by virtue of its constitutive physical properties, rather than its mental properties, causes behavioural events. MacDonald and MacDonald avoid the quausal problem by suggesting that events cause as ontological simples. That is, they assume that the event that is a physical instance is the event that is a mental instance, and this event does not cause in virtue of its being a physical instance or mental instance, but it causes in virtue of being an event (MacDonald, 2007, 243). Some worry that this response would allow every property instanced as the event to be causally efficacious (Wyss, 2010, 174).

Donald Davidson goes a step further than MacDonald and MacDonald, rejecting the Kimian framework of events entirely (Davidson, 1980, 163ff). For Davidson, events are ontologically simple. Events are not instantiations of constitutive properties, nor do they have other properties as ontological constituents. Thus, the building’s falling at noon is not, essentially, a falling, so it can truly be re-described using non-equivalent language, such as ‘the event reported about on pg. 5 of the Times’, or ‘that fateful event’. Or, the neural process can be truly described using physical vocabulary or non-equivalent mental vocabulary, such as ‘Jennie’s desire for a peach’. This amounts to event identity: the mental event is the physical event, yoked together with conceptual dualism, and the physical description is irreducible to the mental description (Davidson, 1980, 207ff). This model solves the causal exclusion problem as follows: mental causes are identical with sufficient physical causes, so there is no more than a single sufficient cause, while mental predicates are irreducible to physical predicates.

Critics typically level the quausal problem against this Davidsonian solution (Honderich, 1982). Again, the quausal problem suggests that events cause in virtue of causally relevant properties: while the fleecy, pink slippers provide warmth, it is in virtue of their fleeciness, not their pinkness, that they provide warmth. Likewise, while mental causation may be secured by the fact that the mental event is the efficacious physical event, mental quausation fails by virtue of the fact that events cause in virtue of their lawlike physical properties, not in virtue of their mental properties. Davidson responds by stating that events cause as events, no matter whether the events are described in physical vocabulary or mental vocabulary (Davidson, 1993). Most philosophers are unsatisfied with Davidson’s response, as they find it plausible that events cause effects through causally relevant properties (Kim, 1993b).

b. The Metaphysics of Causation

The causal exclusion problem is, prima facie, a problem pertaining to causation. The principle of mental causation implies that there are mental causes, while the principle of physical causal completeness implies that there are physical causes that are sufficient causes. At the same time, the principle of causal exclusion stipulates that no effect can have more than a single sufficient cause. Questions about the nature of causation are paramount, and numerous solutions to the causal exclusion problem advert to modifying the metaphysics of causation upon which the problem rests.

Jaegwon Kim crafts the causal exclusion problem from within a productive model of causation, according to which “a cause is something that produces, or generates, or brings about its effects, something from which the effects derive their existence or occurrence” (Kim, 2007, 235). This means that causes push, pull, strike, transfer momentum or energy, or in some other way produce their effects. The additivity argument for the causal exclusion principle is largely motivated by the productive model of causation. If the physical cause packs all the punch required to produce the effect, then a distinct mental cause would push the effect further or harder or be incapable of producing the effect at all on account of the fact that the effect has already been fully produced.

Numerous philosophers undermine the causal exclusion problem by attacking this productive model of causation. They argue, for example, that productive notions of causation do not appear in contemporary physics (Loewer, 2007). Emboldened by these considerations, numerous critics resolve the causal exclusion problem by operating within different models of causation. For example, the nomological model of causation stipulates that causes nomologically necessitate their effects. Fire is a cause of smoke since there exists a law such that ‘if fire occurs, then smoke occurs’. On this view, mental events are causes of behavioural effects if there exists a law such that ‘if the mental event occurs, then the behavioural event occurs’. This law can be established, as the physical event, which necessitates the behavioural effect, also necessitates the occurrence of the mental event. So, all things being equal, the presence of the mental event necessitates the occurrence of the behavioural effect (Fodor, 1989, 66). Thus, behavioural effects can have nomologically sufficient physical causes and nomologically sufficient mental causes.

Critics level several objections at this nomological causation solution. First, the nomological model may fail as an account of causation. Numerous events stand in nomological relations with other events without being causes of those events (Kim, 2007, 231): the gun’s sound nomologically necessitates the hole in the wall, the knife’s shadow nomologically necessitates the gash in the screen. Similarly, mental events may be like shadows that nomologically necessitate without causing behavioural effects. One can object to this shadow analogy on the grounds that mental causation stipulates that mental events are efficacious whereas shadows are not. A more appropriate analogy may be where fire causes Joe’s death, but fire necessitates the appearance of smoke, which is clearly also a nomologically sufficient cause of Joe’s death. However, given that the causal exclusion principle stipulates that there cannot be two sufficient causes of behavioural effects, those committed to the causal exclusion principle will not allow two nomologically sufficient causes of behavioural effects.

Some philosophers resolve the causal exclusion problem by appealing to a counterfactual model of causation. According to the counterfactual model, effects are counterfactually dependent on their causes. Fire is a cause of smoke since there is a counterfactual dependency such that ‘Had the fire not occurred, the smoke would not have occurred’ (Lewis, 1986, 166-167). Counterfactual dependency is established if the nearest possible world where fire does not occur is also a world where smoke does not occur. On this view, mental events are causes of behavioural effects if the nearest possible world where the mental event is absent is a world where the behavioural effect is absent as well. This counterfactual dependency is established by virtue of the fact that the nearest possible world where the mental event does not occur is also, given supervenience, a world where the subvening physical event does not occur, which indicates the behavioural effect does not occur either (Loewer, 2007; Kroedel, 2015).
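The reasoning in the preceding paragraph can be put schematically. What follows is only a sketch in a Lewis-style notation, not a formulation taken from the authors cited: the letters $M$ (the mental event occurs), $P$ (the subvening physical event occurs), $B$ (the behavioural effect occurs), the world label $w$, and the box-arrow for the counterfactual conditional are all labels assumed here for illustration.

```latex
% Assumes the amsmath and amssymb packages.
% \cf is a hypothetical macro for the counterfactual conditional.
\newcommand{\cf}{\mathrel{\Box\!\!\rightarrow}}
\begin{align*}
& \text{Let } w \text{ be the nearest possible world in which } M \text{ does not occur.} \\
& \text{(1) Given supervenience: } w \models \neg P \\
& \text{(2) Without the physical event, no behaviour: } w \models \neg B \\
& \text{(3) Hence the counterfactual dependency holds: } \neg M \cf \neg B
\end{align*}
```

The sketch evaluates every claim at the single world $w$ rather than chaining two counterfactuals together, since counterfactual conditionals are not in general transitive.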

This accounts for mental causation, but establishing the truth of physical causal completeness is more complex on the counterfactual account. Some demonstrate that the behavioural event has a physical cause by appealing to the truth of the counterfactual ‘Had the physical event not occurred, the behavioural effect would not have occurred’ (Loewer, 2002). While this establishes that behavioural effects have physical causes, physical causal completeness requires that behavioural effects have sufficient physical causes. Unfortunately, the counterfactual model lacks a clear criterion for sufficient causation. Here is one possibility: the physical event is a sufficient cause of the behavioural effect if the nearest possible world where the physical event occurs without the mental event is a world where the behavioural effect occurs. The difficulty with this possibility is that nonreductive physicalists typically say that the physical event metaphysically necessitates the mental event, so there are no worlds where only the physical event occurs (Loewer, 2002, 658; Bennett, 2003, 479; Kallestrup, 2006, 472). Hence it is not established that the physical event is a sufficient cause. Perhaps this difficulty is avoided by saying that the physical event is a sufficient physical cause of the behavioural effect if the nearest possible worlds where the physical event occurs are also worlds where the behavioural effect occurs (Menzies, 2013, 63; Kroedel, 2015, 366).

The counterfactualist solution to the causal exclusion problem faces additional difficulties as well. Like the difficulty facing nomological accounts, some complain that counterfactual dependency is established among non-causal events: had the knife’s shadow not occurred, the wound would not have occurred, but the shadow is not a cause of the wound; had the gun’s sound not occurred, the hole in the wall would not have occurred, though the sound is not a cause of the hole (Kim, 2007, 234). Similarly, mental events may be like shadows that behaviour counterfactually depends upon, but that nevertheless make no causal contribution.

Some philosophers have recently deployed a difference-making model of causation to solve the causal exclusion problem (List and Menzies, 2009, 482). The difference-making model, which bears similarities with a blend of the nomological account and the counterfactual account, says that causes must ‘make a difference’ for their effects. The fire makes a difference for the smoke if ‘Had the fire occurred, the smoke would have occurred’ is true, and if ‘Had the fire not occurred, the smoke would not have occurred’ is true. Mental events cause behavioural effects, since ‘Had the mental event occurred, the behaviour would have occurred’ and ‘Had the mental event not occurred, the behaviour would not have occurred’ are both true. Physical events are sufficient causes of behavioural effects, because ‘Had the physical event occurred, the behaviour would have occurred’ is true. But ‘Had the physical event not occurred, the behaviour would not have occurred’ is false, because the mental event could have a different physical realizer, in which case the behaviour would still have occurred. So, the physical event is a sufficient cause, securing physical causal completeness; meanwhile, the behaviour has no more causes than the distinct mental cause, which satisfies the causal exclusion principle while sustaining distinct mental causes of behaviour.
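The truth-value pattern described above can be laid out side by side. The following is a rough sketch rather than List and Menzies’ own formalism: $M$, $P$, and $B$ stand for the occurrence of the mental event, its actual physical realizer, and the behaviour, and the box-arrow is an assumed symbol for the counterfactual conditional.

```latex
% Assumes the amsmath and amssymb packages.
% \cf is a hypothetical macro for the counterfactual conditional.
\newcommand{\cf}{\mathrel{\Box\!\!\rightarrow}}
\[
\begin{array}{lll}
\text{Candidate} & \text{Positive conditional} & \text{Negative conditional} \\
\hline
M & M \cf B \ \text{(true)} & \neg M \cf \neg B \ \text{(true)} \\
P & P \cf B \ \text{(true)} & \neg P \cf \neg B \ \text{(false)}
\end{array}
\]
```

On this reading only $M$ satisfies both conditionals and so counts as a difference-making cause, while $P$ satisfies the positive conditional alone, which is the sense in which it is sufficient for the behaviour without making a difference to it.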

Critics level several objections at this difference-making solution. First, since the physical event is not a cause of the behaviour, physical causal completeness may fail (Bermudez and Cahen, 2015, 53). While some take this issue to defeat the model, others take it to be a virtue of the model, since it turns the causal exclusion problem on its head by establishing that mental causes exclude physical causes of behavioural effects (Menzies, 2015, 39-40). It is also possible to avoid the failure of physical causal completeness by arguing that behavioural effects are realization-sensitive. That is, if the occurrence of a different physical realizer yields a different behavioural effect, then the nearest worlds where the physical event does not occur are worlds where the behavioural effect does not occur. This establishes that the physical event is a cause of the behavioural effect. But, this solution now violates the causal exclusion principle, as it postulates more than a single sufficient cause of the behaviour.

Some philosophers appeal to the related interventionist model of causation to solve the causal exclusion problem (Woodward, 2003). On interventionism, fire is a cause of smoke if we intervene on fire, or, bring it about that fire does or does not happen, while holding all other variables constant, and the result is that the smoke does or does not happen. On this view, mental events are causes, since, when one intervenes on the mental event, the behavioural event shifts. For example, changing a mental event from a belief that carrots are healthy to a belief that carrots are poisonous would eliminate carrot eating behaviour. Likewise, the physical event is a cause of the behaviour, since taking the physical event away would take the behavioural effect away. The result: the interventionist model articulates a manner in which behavioural effects legitimately have more than a single sufficient cause, thereby falsifying the principle of causal exclusion.

Some resist this interventionist solution to the causal exclusion problem on the grounds that it is impossible to establish mental causation on interventionism. Interventionist mental causation requires intervening on the mental event while holding all other events, including the subvening physical event, fixed. But, by virtue of supervenience, it is impossible to intervene on the mental event without also altering the subvening physical event, so interventionist mental causation is not established (Baumgartner, 2010). Some respond by stipulating that the special nature of the supervenience relation entails that one need not – since one cannot – hold the physical event fixed while intervening on the mental event. As such, there is no difficulty establishing mental causation within an interventionist framework (Woodward, 2015).

c. Supervenience

The most longstanding model of nonreductive physicalist mental causation is mind-body supervenience. According to this view, mental events supervene upon, or are determined by, physical events. This notion of supervenience implies a tight enough dependency relation for mental events to inherit the efficacy of their subvening bases. For example, Jennie’s desire for a peach causes her to eat a peach, while the subvening physical cause also causes her to eat the peach (Zangwill, 1996). It is worth noting that Kim once appealed to the supervenience relation to secure mental causation (Kim, 1993, 106). It is common to depict the situation with the aid of the following diagram, where m stands for a mental cause, p is a physical cause, m* is a mental effect, p* is a physical effect, horizontal and diagonal lines indicate causation, while vertical lines indicate supervenience:

Diagram 1: Supervenient Mental Causation

According to this diagram, p is a sufficient cause of p*, but p necessitates the presence of m, ensuring that m is present to cause p* and m* as well.

The causal exclusion principle is fashioned to directly confront this supervenience model of mental causation. Indeed, sometimes the causal exclusion problem is called the supervenience argument, indicating that supervenience-based solutions are clear targets of the causal exclusion principle. The causal exclusion principle stipulates that events cannot have more than a single sufficient cause, yet p* has two causal arrows converging on it, so one of the causes must be excluded. The principle of physical causal completeness stipulates that p must be a sufficient physical cause of p*, so m is excluded as a cause. Furthermore, m* is fully determined by its supervenience base p*, so m is excluded as a cause of m* (Kim, 2005, 39ff). There is no work for the mental event to do, which undermines the principle of mental causation, thereby calling supervenience-based nonreductive physicalism into question.

d. Emergentism

The difficulties associated with the supervenience solution led many philosophers to explore fresh methods of accommodating a sufficient physical cause and a distinct mental cause of the same effect. Chief among these new methods is renewed interest in the doctrine of emergentism (O’Connor and Wong, 2005; Humphreys, 1997). Emergentists argue that novel properties emerge, or arise, out of a base level. At the base level, microphysical particles arrange in such a way as to compose, and give rise to, higher level mereological wholes such as molecules or molecular compounds. The molecule has novel properties, or, properties that its particles, in isolation, lack. Molecules then arrange in such a way as to compose higher level biological wholes such as cells, hearts, reproductive mechanisms, and organisms. These biological wholes have novel properties as well, such as the ability to pump an organism’s blood, to reproduce or to chew. Similarly, some biological wholes, specifically, brains, give rise to persons with novel mental properties such as beliefs and desires. And, persons, when appropriately grouped together, compose social structures and institutions with novel properties that the persons, in isolation, lack, for example, being a university professor, or being the baseball team’s shortstop.

Higher level emergent properties are capable of downward causation, which means that emergent properties influence lower level domains. Thus, the puppy-wise organization of bones and muscles influences these bones and muscles: if the bone were not arranged as a puppy jaw, the bone would not be able to bite down or move. Emergent properties secure mental causation by exercising downward efficacy on lower-level behaviours. The emergentist conceives of mental properties as emergent properties that arise out of, and are not reducible to, their neural parts. Hence, the principle of irreducibility is secured. At the same time, emergent mental properties arise out of, and are dependent upon, lower level physical properties, which may establish the physicalist view that all events are dependent upon physical events.

It is common to object to the emergentist solution to the mental causation problem on the grounds that emergent properties supervene upon their bases. This is problematic because, as discussed above, supervenient properties are susceptible to exclusion pressures. In other words, since the lower level bases are sufficient causes of behavioural effects, supervening emergent properties are excluded from causing behavioural effects (Kim, 1999; McLaughlin, 1997, 16).

Some emergentists respond by rejecting the view that emergent properties are supervenient properties, hence rejecting the view that the same exclusion pressures facing supervenient properties face emergent properties (Silberstein and McGeever, 1999). Other emergentists concede that emergent properties lack downward causation. They endorse what is sometimes called weak emergentism, or epistemic emergentism, according to which higher level descriptions cannot be explained by, and are not predictable from, lower level descriptions of the same phenomenon (Bedau, 1997). This view bears certain affinities with conceptual dualism yoked with ontological monism. Other emergentists respond by abandoning, or significantly nuancing, the principle of physical causal completeness (Hendry, 2010). Because emergent properties have novel, downward causal influence on behaviour, the lower level parts must not be sufficient causes of behaviour. However, higher level wholes, such as persons, rabbits, and mountains, are still broadly physical (Kim, 1997, 293), so higher level causes are still broadly physical causes. Thus, behavioural effects have sufficient physical causes, where sufficient physical causes include both the lower level microphysical processes and the higher level broadly physical processes. This position, while it may secure a principle of broad physical causal completeness, seems to abandon microphysical causal completeness.

e. Functionalism

Functionalism is a dominant model of nonreductive physicalist mental causation that construes mental properties as functional properties of the mind (Witmer, 2003; Block, 1990). Jennie’s belief that it will rain is defined as her being in whatever state is caused by rain-indicating perceptual inputs and, given a background psychology that is familiar with rain, causes her rain preparation behaviour. These functional properties in turn have various physical realizers which carry out the specific task defined by the causal role. Jennie’s belief that it will rain is realized by some neural structure in her brain. Functionalism secures irreducibility and mental causation by defining mental events by their causal profiles, which are distinct from the physical realizers that implement the causal profile and cause behaviour.

The causal exclusion problem poses difficulties for this functionalist model of mental causation. Since, according to physical causal completeness, the physical realizer does all the work in bringing about behavioural effects, the mental state as defined by an abstract causal role is excluded from efficacy (Block, 1990, 155; Kim, 1998, 51). For example, the pill’s functional property of ‘being dormitive’ is realized by some chemical property of the pill, where the chemical does all the work in producing sleep in patients, leaving the dormitivity of the pill with no work left to do.

Functionalists respond to the causal exclusion problem in several ways. Some functionalists, including Kim, endorse realizer functionalism, which is the view that functional states are identical with their efficacious realizers, thereby inheriting the causal efficacy of their physical realizers. Thus, Jennie’s belief that it will rain is identical to the specific neural structure in her brain that realizes the functional role specified as the belief. This view secures mental causation via ontological reductionism, while it typically also endorses property dualism. For example, while the dormitive property of one pill is secobarbital, the dormitive property of another pill is phenobarbital, so not all dormitivity is realized by secobarbital. This functional reductionist position will be discussed in Section 3.i. Other functionalists endorse role functionalism, which is the view that functional states are distinct from their efficacious realizers. These functionalists must explain how functional properties play a role over and above the role played by their realizers. One currently viable possibility is the realization approach, as detailed in Section 3.f.

f. Realization

The realization strategy agrees with the functionalist view that mental properties are realized by physical properties. Its proponents add, however, that the causal powers associated with the realized mental property are distinct from the causal powers associated with the realizing physical property. Typically, the causal powers associated with the mental property are taken to be a proper subset of the causal powers associated with the physical property (Shoemaker, 2011; Wilson, 2011). For example, the causal powers associated with pain include the disposition to produce winces and groans, while the causal powers associated with pain’s realizer, C-Fibre firing, include the disposition to produce winces and groans as well as other capabilities, such as the disposition to slightly tip sensitive scales and the disposition to nourish hungry lions. This model preserves irreducibility, since the causal powers associated with the mental property are more limited than, and hence distinct from, the causal powers of the physical property. This model secures mental causation by noting that it is the causal powers associated with the mental property that are causally efficacious in bringing about the behavioural effect. This model secures physical causal completeness since the mental property is realized by, hence is nothing over and above, the physical property instance that causes the effect. And, this solution does not violate the causal exclusion principle, as parts do not compete with their wholes for causal efficacy: a salvo of shots fired at Smith does not exclude the single shot in that salvo that strikes Smith as a cause of Smith’s death (Shoemaker, 2007, 64).
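The proper-subset claim can be written out for the pain example. This is only an illustrative sketch: the function name $\mathrm{Powers}$ is a label assumed here, and the sets list just the dispositions named in the example rather than a complete inventory of either property’s causal powers.

```latex
% Assumes the amsmath and amssymb packages.
% \mathrm{Powers} is a hypothetical label for "the causal powers associated with".
\begin{align*}
\mathrm{Powers}(\text{pain}) &= \{\text{produce winces},\ \text{produce groans}\} \\
\mathrm{Powers}(\text{C-Fibre firing}) &= \{\text{produce winces},\ \text{produce groans},\ \text{tip sensitive scales},\ \text{nourish hungry lions}\} \\
\mathrm{Powers}(\text{pain}) &\subsetneq \mathrm{Powers}(\text{C-Fibre firing})
\end{align*}
```

Because the inclusion is strict, the two sets are distinct, which is what the model relies on for irreducibility; because every member of the first set is also a member of the second, the mental property’s powers are nothing over and above those of the realizer.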

The realization strategy faces several difficulties. Of central concern is whether tightly related but distinct tokens compete for causal efficacy. The suggestion is that the mereological relation is so tight that exclusion pressures do not arise. This is similar to the view that the supervenience relation is so tight that exclusion pressures do not arise. In Section 3.j., the compatibilist will similarly argue that the exceeding tightness of the relation dodges exclusion pressures. As discussed in the case of supervenience, however, the advocate of the causal exclusion principle will not be convinced that exclusion pressures are avoided in these cases. A second worry with the realization strategy is whether tokens realize both that subset of causal powers that is the mental property and the complete set of causal powers that is the physical property. If the causal powers associated with pain include the disposition to produce winces, and the causal powers associated with C-Fibre firing include many things including the disposition to produce winces, and both of these causal powers are realized in the same instance, there seems to be a double counting of causal powers (Audi, 2012, 661). This double counting problem seems to re-introduce worries about overdetermination. It is possible to avoid this difficulty by positing token identity with property irreducibility (Wilson, 2011). That is, while mental properties are distinct from physical properties by virtue of their distinct causal profiles, realized mental properties are identical with their realizing physical instances. This move, however, like realizer functionalism, salvages mental causation at the price of abandoning token irreducibility. A third worry is that Shoemaker’s solution takes mental properties to be parts of physical causes of behavioural effects. Since wholes depend on their parts, physical causes of behavioural effects would be dependent upon mental properties, which may not be consistent with physicalism (Pineda and Vicente, 2017).

g. Epiphenomenalism and Autonomy

The epiphenomenalist resolves the causal exclusion problem by abandoning the principle of mental causation while endorsing the principles of physical causal completeness, irreducibility, and causal exclusion. Thus, effects can have only their sufficient physical causes, leading to the view that distinct mental events are not causally efficacious (Robinson, 2006; Gadenne, 2006). A common argument in support of epiphenomenalism is the joint strength of the principles of physical causal completeness, irreducibility, and causal exclusion, which together imply that mental causation fails.

The challenge for the epiphenomenalist is to overcome the argumentation supporting the principle of mental causation. They typically do this in two ways. First, they argue that, due to supervenience, the physical cause of behavioural effects necessitates the presence of a mental event as well. Thus, while mental events are not causal, they necessarily precede behavioural effects (Robinson, 2004, 165). Jennie’s eating of the peach is preceded by her desire for a peach, even though her desire is not a cause of her eating. This is a non-causal account of the common sense fact that the appropriate mental event precedes behavioural effects. Critics typically point out that epiphenomenalists do not think the physical event metaphysically necessitates the mental event. This makes it possible for the physical cause of Jennie’s peach eating to have given rise to a desire for broccoli instead, thereby breaking the link between the mental event and the appropriate behavioural response (Pauen, 2006).

Some endorse a weakened version of epiphenomenalism called the autonomy solution. While epiphenomenalism states that mental events lack causal efficacy tout court, the autonomy solution states that mental events do not causally interact with physical events, but do causally interact with mental effects (Gibbons, 2006). Jennie’s desire for a peach does not cause her to eat peaches but does cause her to believe she desires a peach. This move secures physical causal completeness and irreducibility while simultaneously establishing that behavioural effects are not overdetermined, and that mental events can cause mental effects. Critics argue that this solution leads to the unfortunate result that Jennie’s pain causes her to believe she is about to scream but does not cause her to actually scream (Dennett, 1991, 403). This drawback may be avoided by endorsing a wider autonomist solution, according to which mental events cause both mental effects and behavioural effects, but do not cause microphysical effects (Zhong, 2014, 349-350). This solution, however, like all autonomist solutions, faces worries that subvening microphysical processes determine behavioural effects and mental effects, thereby excluding mental events from causing mental effects or behavioural effects (Kim, 2005, 36-37).

h. Interactionist Dualism

The interactionist dualist solves the causal exclusion problem by accepting the principles of mental causation, irreducibility and causal exclusion, which jointly lead to the falsity of the principle of physical causal completeness. Because distinct mental events cause behavioural effects and behavioural effects are not overdetermined, the physical cause of the behaviour is not a sufficient cause (White, 2017; Meixner, 2008). Again, the plausibility of the three endorsed principles provides support for the conclusion that physical causal completeness is false.

This model must overcome the arguments in support of physical causal completeness. Some do so by providing models according to which mental causation is ‘invisible.’ This means that while mental events are causally efficacious, the efficacy of mental events is not detectable at the physical level (Lowe, 2008, 74; Gibb, 2015). One will not find gaps in the physical causal process that mental events must fill in, so behavioural effects have sufficient physical causes, so physical causal completeness is true. While this solution may not violate physical causal completeness, it does violate physical causal closure, as physical effects do not have only physical causes. This solution also faces exclusion pressures—the behavioural effect has a sufficient physical cause, thereby excluding purported distinct mental causes.

i. Reductionism

The reductionist solves the causal exclusion problem by rejecting the principle of irreducibility, leaving them open to embrace the principles of mental causation, physical causal completeness, and causal exclusion (Kim, 2005, 101). Thus, mental causes are identical with sufficient causes of behavioural effects, thereby establishing that behaviour has no more than a single sufficient cause. It is common to argue in support of reductionism by appeal to the joint strength of the other three principles: because behaviour cannot have more causes than the sufficient physical cause, the only way for mental causation to be true is to identify mental events with physical events.

The challenge is to demonstrate how the identity can be sustained, given the distinctions between mental and physical events, and the multiple realizability of the mental. Reductionists typically argue that the appearance of distinction is explained by the fact that the same event can be known by direct qualitative experience and by third-person description. Kim avoids the multiple realizability issue by emphasizing event reductionism: Jennie’s hunger is identical to an increase in a specific type of ghrelin in her gastrointestinal tract, while the shark’s hunger is identical with some other physiological state. While some supplement event reductionism with property dualism, this move is unavailable to Kim since event identity implies property identity on his model of events. Kim concludes that the property of hunger exists only as a functional concept (Kim, 2010, 207ff). The increase of a specific type of ghrelin in Jennie’s gastrointestinal tract is actually an instance of the property of being an increase of a specific type of ghrelin, and this event can be truly described using the functional concept of hunger as well. Critics worry that this view amounts to ontological monism yoked with conceptual dualism, which is troublesome because Kim has previously argued against the Davidsonian model of event identity with conceptual dualism (Moore and Campbell, 2015). It also seems that events must cause in virtue of their physical properties, not in virtue of their mental properties, since mental properties do not exist, so mental quausation fails.

j. Compatibilism

Compatibilism, as coined by Terence Horgan (1997, 166), is the view that endorses the principles of mental causation, irreducibility, and physical causal completeness, thereby disputing the causal exclusion principle in some manner (Bennett, 2003; Shoemaker, 2007). Thus, there is some benign way of showing how behavioural effects can have sufficient physical causes and distinct mental causes. Support for compatibilism arises from the plausibility of the three endorsed principles, which constitutes evidence that the causal exclusion principle is false. Compatibilists argue that there obtains an exceedingly tight relation between the physical cause and mental cause. As discussed in previous sections, the exceedingly tight relation may be a relation of strong supervenience, or realization, or some other such relation in which the physical cause metaphysically necessitates the mental cause. As it is impossible for blue to occur without a colour occurring, and it is impossible for a horse-wise arrangement of horse parts to occur without a horse occurring, so it is impossible for the physical cause to occur without the mental cause.

This exceedingly tight relation is deployed as a resolution to the arguments supporting the causal exclusion principle. The massive coincidence involved with two independent causal processes converging on the same effect is replaced with the requirement that physical causes and their dependent mental causes must converge on behavioural effects (Loewer, 2002, 658). This tight relation implies that the physical cause necessitates the mental cause, so mental events are necessary for behavioural effects, overcoming the necessity argument. Likewise, the mental event guarantees the presence of some physical cause, so physical causes are necessary for behavioural effects (Kallestrup, 2006, 472; Arnadottir and Crane, 2013, 255), overcoming the necessity argument as well. Similarly, the parsimony argument stipulates one should not countenance more causes than necessary, but both the physical cause and the mental cause are necessary.

The compatibilist view faces a number of difficulties. Some worry that the compatibilist solution is ad hoc as the only instances of dependent overdetermination in nature appear to be those very instances of mental and physical dependent overdetermination that compatibilists suggest (Pineda, 2002). Compatibilists reply that ubiquitous, naturally occurring part-whole relations are also dependently overdetermined (Arnadottir and Crane, 2013, 258). The boxer’s knuckles and fist both strike the punching bag, simultaneously causing the punching bag to move; the baseball and the baseball’s parts both cause the window to shatter. Secondly, while compatibilists argue that exclusion pressures dissipate once the dependency relation between physical events and mental events is established, critics argue that the exclusion principle precisely applies to only those situations in which there are two dependent causes of the same effect (Kim, 2005, 48). Moreover, while compatibilism establishes that mental events are necessarily present prior to behavioural effects, it is not clear that the mental event is a cause of the behavioural effect. It is possible, for example, that the mental event is like a necessarily present epiphenomenal shadow that does no causal work. This leaves the sufficient physical cause as the single sufficient cause of the behavioural effect. Compatibilists reply by stating that the mental event is not akin to an epiphenomenal shadow, but rather is a cause of the effect. However, the more the compatibilist insists that the behavioural effect necessarily has a mental cause, the more difficult it is to show that the physical cause is an individually sufficient cause of the effect (Moore, 2017, 36).

4. Explanatory Exclusion

The causal exclusion principle has a “companion principle” (Kim, 2005, 17) in the realm of explanation called the principle of explanatory exclusion. The principle of explanatory exclusion states: “There can be no more than a single complete and independent explanation for any one event” (Kim, 1988, 233). In fact, it is of historical worth to note that Kim’s inaugural articulation of the exclusion problem occurs in the context of excluding superfluous explanations (Kim, 1988; Kim, 1989). It is important to note that the principle of explanatory exclusion allows for dependent explanations in excess of the complete explanation of the same event, while causal exclusion specifically bans dependent causes in excess of the sufficient cause of the same event. The viability of this dissimilarity is questionable: if distinct but dependent explanations of the same event are permissible, why are distinct but dependent causes of the same event not permissible? Or, if distinct but dependent causes of the same event are not permissible, why are distinct but dependent explanations of the same event permitted?

Like the causal exclusion principle, the explanatory exclusion principle is supported by a parsimony argument (Kim, 1989, 98): explanations should not be multiplied beyond necessity, where one complete explanation of an event is necessary, so additional independent explanations of the same event can be excluded. The explanatory exclusion principle is also supported by appeal to explanatory realism, which says that explanations track objective relations and that these objective relations are the content of explanations (Kim, 1988, 226). Because there can be no more than a single sufficient cause of events, and explanations track objective relations, there can be no more than a single complete and independent explanation of events.

The explanatory exclusion principle poses problems for models claiming that the same event has a complete physical explanation and an independent mental explanation, which includes the popular model of ontological monism yoked with conceptual dualism (Davidson, 1993; Papineau, 2002). Jennie’s increased heart rate has a complete physical explanation in terms of a release of hormones from her amygdala, but this same event is also explained by her fear of the approaching bear. Since the physical explanation is a complete explanation, the intensionally independent mental explanation can be excluded as unnecessary—an unpalatable result.

Numerous responses to this explanatory exclusion problem have been proposed. First, some reduce mental explanations to physical explanations by endorsing an extensionalist model of explanatory individuation (Kim, 1988, 233). The physical explanation of Jennie’s increased heart rate refers to the same causal relation as the mental explanation of Jennie’s increased heart rate. Moreover, the causal relation is the content of both explanations; therefore, the explanations state the same thing, so there is really only one explanation. Explanatory exclusion pressures only arise when there are two explanations of the same event, so explanatory exclusion pressures do not arise. This response is accused of endorsing a counterintuitive model of explanatory individuation, whereby two clearly distinct explanations are considered the same explanation. For example, ‘The earthquake caused the collapse of the building’ does not seem to state the same explanation as ‘The event that caused the collapse caused the collapse of the building’, since one is explanatory, and the other is not (Marras, 1998).

Nonreductive physicalists typically solve the explanatory exclusion problem in one of two ways. First, as discussed, the explanatory exclusion principle allows for two explanations of the same event, so long as the explanations are not independent. Thus, if the mental explanation is dependent upon the distinct complete physical explanation, then explanatory exclusion pressure need not arise. Plausibly, mental explanations are dependent upon physical explanations by virtue of the fact that the mental ontologically supervenes upon the physical (Melnyk, 1996). Thus, ‘Jennie’s fear explains her increased heart rate’ is a distinct but ontologically dependent explanation of the same causal relation that is explained by ‘hormone release from her amygdala explains her increased heart rate’. This solution bears certain similarities to the compatibilist solution to the causal exclusion problem, which itself posits distinct but dependent mental causes of the same behavioural effects. Likewise, it is open to the charge that distinct but dependent mental explanations can be excluded on account of the fact that behavioural effects have a complete physical explanation.

Second, rather than attempting to resolve the explanatory exclusion problem, many nonreductive physicalists dismiss the explanatory exclusion principle as a needlessly stringent constraint on explanation. They argue that there is no difficulty with describing the same event in multiple ways (Arnadottir and Crane, 2013, 256). The red rose can be re-described as the red-or-green rose, which can be re-described as the red flower, which can be re-described as the apple coloured rose, etc. Similarly, the same causal relation between events can be described in microphysical terms, neuroscientific terms, or psychological terms, so mental explanations need not be excluded. This solution relies upon acceptance of the contestable view that the value of parsimony does not apply to explanation. Moreover, even if it is true that behavioural events can have physical and mental explanations, it is worrisome that behavioural events need not have mental explanations, given that they already have complete physical explanations.

5. The Generalization Problem

To this point, the causal exclusion problem has been restricted to the domain of mental causation, so only mental events have been in danger of being excluded from causal efficacy. Numerous philosophers worry, however, that the causal exclusion problem might generalize. That is, if mental causes are excluded by the sufficiency of subvening neural causes of behavioural effects, then perhaps neural causes are excluded by the sufficiency of the subvening chemical causes of behavioural effects. These chemical causes are, in turn, excluded by the sufficiency of the subvening microphysical causes of behavioural effects (Kim, 1997; Burge, 1993, 102). This generalization problem leads to the following two problems. First, not only are mental causes threatened by the causal exclusion problem, but the causal efficacy of all special science properties is now threatened. Second, if there is no bottom level to physics, then all causal efficacy may drain away, since microphysical causation would be excluded by lower level quantum processes, which would in turn be excluded by lower level processes, and so on and so forth (Block, 2003; Walter, 2008). These problems are so severe that they are sometimes treated as reductio ad absurdum arguments against the causal exclusion problem.

Kim’s initial response to the generalization problem is that the causal exclusion problem does not generalize, since the exclusion-engendering relation holding between mental and physical events is dissimilar to the relations holding between special science entities. The relation between the mental and the physical is a relation between higher order properties and lower order properties, where both of these properties are instantiated by the same substance. In this case, exclusion pressures arise, as the lower order properties of the substance do all the work, excluding the efficacy of other properties of the substance. By contrast, the relation between special science properties and their bases is a relation between higher level structural properties of a whole and properties of lower level parts, where these properties are instantiated by different substances. Properties instantiated in different substances need not causally compete: the 10kg weight of the table causes the scale to tip to 10kg, and the table’s weight is not excluded by the 6kg weight of the top and the 4kg weight of the pedestal, since neither of those parts can make the scale tip to 10kg.

Numerous critics reject this response to the generalization problem (Block, 2003; Noordhof, 1999). Of central concern is that higher level structural properties are supervenient upon the properties and relations of the lower level parts taken together. The combined properties and relations of the lower level parts are a sufficient cause of whichever effect occurs, thereby excluding supervening higher level structural properties of wholes. The 10kg weight of the table is excluded from causing the scale to tip to 10kg by the combined weight of the pedestal and top. After all, the scale does not tip to 20kg when the pedestal and top, and the table as well, sit upon it.

There are two other replies to the generalization problem that are worth discussing. First, the reductionist response claims that the structural property of the higher level whole is identical to the properties and relations of the lower level parts. Since the higher level structural property is identical with the causally efficacious lower level state of affairs, the higher level structural property is causally efficacious (Kim, 2005, 69). And, because the identity between higher level structural properties and lower level states of affairs holds all the way down the mereological scale, there is no fear of causal powers draining away to a bottomless level (Kim, 2005, 68).

There are several considerations weighing against this reductive solution to the generalization problem. First, the higher level structural property is singular while the lower level properties of, and relations among, the parts are a plurality. A water molecule, for example, is singular. The hydrogen atom, the oxygen atom, the other hydrogen atom, and the binary bonding relations holding between individual atoms are a plurality. It is difficult to see how a singularity can be identical with a plurality (Moore, 2010). Second, higher level wholes are multiply composable (Block, 2003, 145). For example, the same bicycle can have a Mavik tire or a Michelin tire functioning as its front wheel. If, however, the bicycle is identical to its lower level parts and relations, which include the Mavik tire, and then the Mavik tire is replaced by a Michelin tire, then the bicycle is not the same bicycle after this alteration. This discussion not only intersects with debates in the philosophy of science, but also interacts with longstanding debates in mereology, such as the questions of mereological essentialism and whether composition is identity.

It is also possible to adopt a nonreductivist response to the generalization problem, according to which higher level structures are distinct from their lower level parts and relations. This response bears an affinity with the emergentist response to the causal exclusion problem. The task of the nonreductivist is to demonstrate how higher level structures have causal powers above and beyond the causal powers of the parts and their relations. To this end, it is uncontroversial that structure is efficacious. For example, fructose and glucose are isomers, both with the molecular formula C6H12O6. The fundamental elements are the same, and they have the same properties. Fructose and glucose, however, are structured differently, and so they have different properties. Fructose is sweeter than glucose, and causes less insulin secretion in humans than glucose, for example. So, plausibly, higher level structure provides novel efficacy. The question is whether the lower level parts of fructose, with their properties, in their specific relations, are sufficient causes for these effects. If they are not, then nonreduced higher level structure has novel causal powers, but the completeness of the lower physical level is questioned. If they are, then the completeness of the lower physical level is established, but the efficacy of the higher level structure may be excluded.

6. Conclusion

While no solution to the causal exclusion problem has enjoyed widespread acclaim, there are several flourishing avenues of response. Chief among them are the compatibilist response and appeals to differing models of causation, though discussion in other areas is ongoing as well. The causal exclusion problem, however, has a way of re-establishing itself after it seems to have been solved, so it is unlikely that the problem will dissipate in the short term. It is clear, however, that the overriding aspiration of philosophers is to find a nonreductive, physicalist solution to the causal exclusion problem, as few follow Kim’s reductive conclusions.

7. References and Further Reading

  • Arnadottir, S., and Crane, T. (2013). “There is No Exclusion Problem”. In Mental Causation and Ontology, edited by S. Gibb, and R. Ingthorsson, p. 248–265. Oxford: Oxford University Press.
  • Audi, P. (2012). “Properties, Powers, and the Subset Account of Realization”. Philosophy and Phenomenological Research, 84, 3, p. 654-674.
  • Bedau, M. (1997). “Weak Emergence”. Philosophical Perspectives, 11, p. 375-399.
  • Baumgartner, M. (2010). “Interventionism and Epiphenomenalism”. Canadian Journal of Philosophy, 40, 3, p. 359-383.
  • Bennett, K. (2003). “Why the Exclusion Problem Seems Intractable, and How, Just Maybe, to Tract it”. Noûs, 37, p. 471-49.
  • Bennett, K. (2008). “Exclusion Again”. Being Reduced.  Hohwy J. and Kallestrup J. (Eds.) Oxford: Oxford University Press, p. 280-305.
  • Bermudez, J. & Arnon, J. (2015). “Mental Causation and Exclusion”. Humana Mente, 29, p. 47-68.
  • Block, N. (1990). “Can the Mind Change the World?” Meaning and Method: Essays in Honor of Hilary Putnam. Cambridge: Cambridge University Press.
  • Block, N. (2003). “Do Causal Powers Drain Away?” Philosophy and Phenomenological Research, 67, p. 133-150.
  • Brewer, B. (1995). “Mental Causation: Compulsion by Reason”. Aristotelian Society Supplementary, 69, p. 237-253.
  • Burge, T. (1993). “Mind-Body Causation and Explanatory Practice”. Mental Causation. Heil, John and Mele, Alfred (eds.). Oxford: Clarendon Press, p. 97-120.
  • Carey, B. (2011). “Overdetermination and the Exclusion Problem”.  Australasian Journal of Philosophy, 89, 2, p. 251-262.
  • Chalmers, D. (1996). The Conscious Mind. New York, Oxford University Press.
  • Crane, T. and Mellor, D. (1990). “There is No Question of Physicalism”. Mind, 99, p. 185-206.
  • Davidson, D. (1963). “Actions, Reasons and Causes”. Journal of Philosophy, 60, p. 685-700.
  • Davidson, D. (1980). Essays on Actions and Events. Clarendon Press: Oxford.
  • Davidson, D. (1993). “Thinking Causes”. Mental Causation. Heil, John and Mele, Alfred (Eds.) Oxford: Clarendon Press, p. 3-18.
  • Dennett, D. (1991). Consciousness Explained. Penguin Press.
  • Fodor, J. (1989). “Making Mind Matter More”. Philosophical Topics, 17, 1, p. 59-79.
  • Fodor, J. (1974). “Special Sciences, or the Disunity of Science as a Working Hypothesis”. Synthese, 28, p. 77-115.
  • Gadenne, V. (2006). “In Defense of Qualia Epiphenomenalism”. Journal of Consciousness Studies, 13, 1-2, p. 101-114.
  • Gibb, S. (2004). “The Problem of Mental Causation and the Nature of Properties”. Australasian Journal of Philosophy, 82, p. 464-475.
  • Gibb, S. (2009). “Explanatory Exclusion and Causal Exclusion”. Erkenntnis, 71, p. 205-221.
  • Gibb, S. (2015). “Defending Dualism”. Proceedings of the Aristotelian Society, 115, 2, p. 131-146.
  • Gibbons, J. (2006). “Mental Causation Without Downward Causation”. Philosophical Review, 115, p. 79-103.
  • Harbecke, J. (2008). Mental Causation: Investigating the Mind’s Powers in a Natural World, Frankfurt: Ontos Verlag.
  • Hendry, R. (2010). “Emergence vs. Reduction in Chemistry”.  Emergence in Mind, MacDonald, C. and MacDonald, G. (eds.), Oxford: Oxford University Press, p. 205-221.
  • Honderich, T. (1982). “The Argument for Anomalous Monism”. Analysis, 42, p. 59-64.
  • Horgan, T. (1997). “Kim on Mental Causation and Causal Exclusion”. Nous Supplement: Philosophical Perspectives, 11, p. 165-184.
  • Horgan, T. (2001). “Causal Compatibilism and the Exclusion Problem”. Theoria, 16, p. 95-116.
  • Humphreys, P. (1997). “How Properties Emerge”. Philosophy of Science, 64, p. 1–17.
  • Jackson, F. (1982). “Epiphenomenal Qualia”. The Philosophical Quarterly, 32, 127, p. 127-136.
  • Johansen, M. (2014). “Causal Contribution and Causal Exclusion”. Philosophers’ Imprint, 14, 33, p. 2-16.
  • Kallestrup, J. (2006). “The Causal Exclusion Argument”. Philosophical Studies, 131, p. 459-485.
  • Kim, J. (1976). “Events as Property Exemplifications”. Supervenience and Mind, Cambridge: Cambridge University Press, p. 33-52.
  • Kim, J. (1988). “Explanatory Realism, Causal Realism, and Explanatory Exclusion”. Midwest Studies in Philosophy, 12, p. 225-239.
  • Kim, J. (1989). “Mechanism, Purpose, and Explanatory Exclusion”. Nous-Supplement: Philosophical Perspectives, 3, p. 77-108.
  • Kim, J. (1993). Supervenience and Mind. Cambridge: Cambridge University Press.
  • Kim, J. (1993b). “Can Supervenience and ‘Non-Strict Laws’ Save Anomalous Monism”. Mental Causation. Heil, John and Mele, Alfred (Eds.). Oxford: Clarendon Press, p. 18-26.
  • Kim, J. (1997). “Does the Problem of Mental Causation Generalize?” Proceedings of the Aristotelian Society, 87, p. 281-297.
  • Kim, J. (1998). Mind in a Physical World. Cambridge: MIT Press.
  • Kim, J. (1999). “Making Sense of Emergence”. Philosophical Studies 95, p. 3-36.
  • Kim, J. (2005). Physicalism, or Something Near Enough.  Princeton: Princeton University Press.
  • Kim, J. (2007). “Causation and Mental Causation”. Contemporary Debates in Philosophy of Mind. McLaughlin, B. and Cohen, J. (eds). Victoria: Blackwell.
  • Kim, J. (2009). “Mental Causation”. Oxford Handbook of Philosophy of Mind. McLaughlin, B. and Beckermann, A. and Walter, S. (Eds.). Oxford: Oxford University Press, p. 29-52.
  • Kim, J. (2010). Essays in the Metaphysics of Mind. Oxford: Oxford University Press.
  • Kroedel, T. (2015). “Dualist Mental Causation and the Exclusion Problem”. Noûs 49 (2): 357–375.
  • Lewis, D. (1986). Philosophical Papers: Volume II. Oxford: Oxford University Press.
  • List, C. & Menzies, P. (2009). “Nonreductive Physicalism and the Limits of the Exclusion Principle”. Journal of Philosophy, 106, 9, p. 475-502.
  • Loewer, B. (2002). “Comments on Jaegwon Kim’s Mind and the Physical World”. Philosophy and Phenomenological Research, 65, p. 655-662.
  • Loewer, B. (2007). “Mental Causation, or Something Near Enough”. Contemporary Debates in Philosophy of Mind. McLaughlin, B. and Cohen, J. (Eds). Malden: Blackwell Publishing, p. 243-264.
  • Lowe, E. (2000). “Causal Closure Principles and Emergentism”. Philosophy, 75, p. 571-586.
  • Lowe, E. (2008). Personal Agency. Oxford: Oxford University Press.
  • MacDonald, C. and MacDonald, G. (2006). “The Metaphysics of Mental Causation”. The Journal of Philosophy, 103, p. 539-576.
  • MacDonald, G. (2007). “Emergence and Causal Powers”.  Erkenntnis, 67, p. 239-253.
  • Malcolm, N. (1968). “The Conceivability of Mechanism”. Philosophical Review. 77, p. 45-72.
  • Marcus, E. (2005). “Mental Causation in a Physical World”. Philosophical Studies, 122, p. 27-50.
  • Marras, A. (1998). “Kim’s Principle of Explanatory Exclusion”. Australasian Journal of Philosophy, 76, p. 439-451.
  • Meixner, U. (2008). “New Perspectives for a Dualistic Conception of Mental Causation”. Journal of Consciousness Studies, 15, p. 17-38.
  • Melnyk, A. (1996). “Testament of a Recovering Eliminativist”. Philosophy of Science, 63, p. S185-S193.
  • Menzies, P. (2013). “Mental Causation in a Physical World”.  S. Gibb & R. Ingthorsson (eds.). Mental Causation and Ontology. Oxford: Oxford University Press.
  • Menzies, P. (2015). “The Causal Closure Argument is No Threat to Non-Reductive Physicalism”. Humana Mente, 29, p. 21-46.
  • Montero, B. (2003). “Varieties of Causal Closure”. Physicalism and Mental Causation. S. Walter and H. Hackmann (Eds.). Exeter: Imprint Academic, p. 173-187.
  • Moore, D. (2010). “The Generalization Problem and the Identity Solution”. Erkenntnis, 72 (1): 57-72.
  • Moore, D. & Campbell, N. (2015). “On the Metaphysics of Mental Causation”. Abstracta, 8, 2, p. 3-16.
  • Moore, D. (2017). “Mental Causation, Compatibilism, Counterfactuals”. Canadian Journal of Philosophy, 47, 1, p. 20-42.
  • Ney, A. (2007). “Can an Appeal to Constitution Solve the Exclusion Problem?” Pacific Philosophical Quarterly, 88, p. 486-506.
  • Noordhof, P. (1999). “Micro-Based Properties and the Supervenience Argument”. Proceedings of the Aristotelian Society, 99, p. 109-114.
  • O’Connor, T., and Wong, H. (2005). “The Metaphysics of Emergence”. Noûs, 39, p. 658-678.
  • Pauen, M. (2006). “Feeling Causes”. Journal of Consciousness Studies, 13, 1, p. 129-152.
  • Paul, L. & Hall, N. (2013). Causation: A User’s Guide. Oxford: Oxford University Press.
  • Papineau, D. (1993). Philosophical Naturalism. Oxford: Blackwell.
  • Papineau, D. (2001). “The Rise of Physicalism”. Physicalism and its Discontents. Gillett, C. and Loewer, B. (Eds.) Cambridge: Cambridge University Press, p. 3-36.
  • Papineau, D. (2002). Thinking About Consciousness. Oxford: Oxford University Press.
  • Pereboom, D. (2002). “Robust Nonreductive Materialism”. Journal of Philosophy, 99, p. 499-531.
  • Pineda, D. (2002). “The Causal Exclusion Puzzle”. European Journal of Philosophy 10: 26-42.
  • Pineda, S. & Vicente, A. (2017). “Shoemaker’s Analysis of Realization: A Review”. Philosophy and Phenomenological Research 94: 97-120.
  • Putnam, H. (1967). “Psychological Predicates”. Art, Mind and Religion, Capitan, W. and Merrill, D. (Eds.). Pittsburgh: University of Pittsburgh Press, p. 37-48.
  • Robinson, W. (2006). “Knowing Epiphenomena”. Journal of Consciousness Studies, 13, 1-2, p. 85-100.
  • Sider, T. (2003). “What’s So Bad About Overdetermination”. Philosophy and Phenomenological Research, 67, p. 719 – 726.
  • Silberstein, M. (2001). “Converging on emergence”. Journal of Consciousness Studies, 8, p. 61-98.
  • Silberstein M., and McGeever, J. (1999). “The Search for Ontological Emergence”. The Philosophical Quarterly, 49, p. 182-200.
  • Shoemaker, S. (2007). Physical Realization. Oxford: Oxford University Press.
  • Shoemaker, S. (2011). “Realization, Powers and Property Identity”. The Monist, 94, 1, p. 3-18.
  • Slors, M. and Walter, S. (2002). “Introduction”. Mental Causation, Multiple Realization, and Emergence, Slors, M. and Walter, S. (eds).  Rodopi.
  • Sosa, E. (1984). “Mind-Body Interaction and Supervenient Causation”. Midwest Studies in Philosophy, 9, p. 271-281.
  • Thomasson, A. (1998).  “A Nonreductivist Solution to Mental Causation”. Philosophical Studies, 89, p. 181-195.
  • Vicente, A. (2006). “On the Causal Completeness of Physics”. International Studies in the Philosophy of Science, 20, 2, p. 149-171.
  • Walter, S. (2008). “The Supervenience Argument, Overdetermination, and Causal Drainage”. Philosophical Psychology, 21: 673-696.
  • Wilson, J. (2011). “Non-Reductive Realization and the Powers-Based Subset Strategy”. The Monist, 94, 1, p. 121-154.
  • Woodward, J. (2003). Making Things Happen. Oxford: Oxford University Press.
  • Woodward, J. (2015). “Interventionism and Causal Exclusion”. Philosophy and Phenomenological Research, 91, 2, p. 303-347.
  • White, B. (2017). “Conservation Laws and Interactionist Dualism”. Philosophical Quarterly, 67, 267, p. 387-405.
  • Whittle, A. (2007). “The Co-Instantiation Thesis”. Australasian Journal of Philosophy, 85, p. 61-79.
  • Witmer, G. (2003). “Functionalism and Causal Exclusion”. Pacific Philosophical Quarterly, 84, p. 198-214.
  • Wyss, P. (2010).  “Identity With a Difference”. Emergence in Mind, MacDonald, C. and MacDonald, G. (eds.), Oxford: Oxford University Press, p. 169-179.
  • Zangwill, N. (1996). “Good Old Supervenience: Mental Causation on the Cheap”. Synthese, 106, 1, p. 67-101.
  • Zhong, L. (2014). “Sophisticated Exclusion and Sophisticated Causation”. Journal of Philosophy 111: 361–380.

 

Author Information

Dwayne Moore
Email: dwayne.moore@usask.ca
University of Saskatchewan
Canada

Two-Dimensional Semantics

Two-dimensional (2D) semantic theories distinguish between two different aspects, or ‘dimensions’, of the meaning of linguistic expressions. Many other theories identify the meaning of an expression with a dependency of its extension on the state of the world. (The extension of a sentence is its truth-value, and the extension of a sub-sentential expression is the object or objects it applies to.) Consider the following, true sentence:

(1) Anand is a chess player.

If Anand had decided to spend his time very differently, sentence (1) would be false. Which extension this sentence has thus depends on whether a specific individual plays a particular game. One could hold, in line with a common view, that the meaning of sentence (1) is captured by this dependency of its truth-value on Anand’s relation to chess. But notice that there is more than one way in which an expression’s extension depends on the state of the world. For example, in counterfactual circumstances in which the speakers in our linguistic community use the word ‘chess’ so that it exclusively applies to what we call ‘tennis’, (1) would be false. 2D semantic theories identify two kinds of dependencies of extension on the world, both of which are meant to represent important aspects of meaning.

2D semantics is a version of possible worlds semantics. Such theories standardly capture dependency of extension on the world by means of an intension, that is, a function from possible worlds to extensions. 2D semantic theories postulate at least two intensions that capture two kinds of dependencies of extension on the world. One intension, which is sometimes called the ‘2-intension’, corresponds to the first way of construing the dependency of the extension of (1) on the world just outlined. This intension returns ‘True’ for all and only those worlds in which Anand plays chess. In addition, all 2D semantic theories introduce another intension, sometimes called the ‘1-intension’.

Proponents of 2D semantic theories generally agree about how to construe 2-intensions. However, they differ in how they construe 1-intensions, and in what they take to be the theoretical purposes of 1-intensions. Concerning the latter issue, 1-intensions have generally been taken to capture either epistemic features associated with linguistic expressions—such as apriority or cognitive significance—or matters related to context-dependency. There is also disagreement about what kinds of items the theory should be applied to. It is widely accepted that a 2D semantics can be fruitfully applied to some kinds of expressions, such as indexicals. Accounts that apply a 2D semantics only to specific kinds of expressions are called ‘local accounts’ in what follows. Other researchers have argued, more controversially, that 2D semantics is a useful tool for characterizing the meanings of all kinds of expressions, or even the contents of mental states. Accounts that apply a 2D semantics to all kinds of expressions are called ‘global accounts’ in what follows.

The philosophical significance of 2D semantics extends far beyond the philosophy of language. Issues concerning 2D semantics and its interpretation have been at the heart of debates about the mind-body problem and philosophical methodology.

Table of Contents

  1. Introduction to 2D Semantics
    1. From Extensions to Intensions
    2. Rigid Designators and Externalism
    3. From 1D to 2D
  2. Local Accounts
    1. Content and Character (Kaplan)
    2. Superficial and Deep Modality (Evans, Davies & Humberstone)
  3. Global Accounts
    1. Metasemantic 2D Semantics (Stalnaker)
    2. Epistemic 2D Semantics (Chalmers, Jackson)
      1. Modal Rationalism
      2. The Scrutability of Truth
      3. Philosophical Methodology
  4. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Introduction to 2D Semantics

a. From Extensions to Intensions

Meaningful linguistic expressions have extensions. The extension of a sentence, such as (1), is its truth-value, in this case True. The extension of a general term, such as ‘chess player’, is the class of individuals to which the term applies, in this case the class of people who play chess. The extension of a singular term, such as ‘Anand’, is the individual denoted by the term, in this case Anand. Extension is not the same as meaning. ‘Anand’ is not synonymous with ‘the 15th world chess champion’, even though the expressions denote the same individual. Even more obviously, (1) is not synonymous with ‘Serena Williams is a tennis player’, even though the expressions have the same truth-value.

A popular idea is to characterize meanings as truth- or application-conditions, that is, the conditions under which a sentence is true or the conditions under which an expression is correctly applied. For instance, (1) is true if and only if Anand is a chess player. Given a particular state of the world, these truth-conditions then determine the truth-value of the sentence. Truth- and application-conditions capture the (or a) dependency of an expression’s extension on the state of the world. In possible world theories of meaning, they are modeled as intensions. An intension is a function that assigns extensions with respect to possible worlds. For example, the intension of (1) assigns True with respect to all and only those worlds in which Anand is a chess player; and the intension of ‘chess player’ assigns, with respect to a particular world, the class of all and only those individuals that play chess in that world. According to such possible worlds accounts, the meaning of an expression is thus its intension.

One way to motivate this idea is by arguing that the primary use of language is to exchange information, which suggests that the meaning of an expression (or at least a crucial aspect of its meaning) is its information content. Furthermore, information can be defined as the exclusion of possibilities, and possibilities are commonly characterized by means of possible worlds: Something is possible if and only if there is a possible world in which it is the case. Accordingly, the information conveyed by sentence (1) excludes all possible worlds in which it is not the case that Anand is a chess player. The remaining worlds are precisely those to which the intension of (1) assigns True. Hence, the intension of an expression is well suited for capturing its information content.

Another virtue of possible worlds theories of meaning is that they allow us to assign different meanings to expressions that share the same extension, thus respecting intuitive judgments about synonymy. For example, there is a possible world in which Anand never plays chess but in which Serena Williams is a tennis player. With respect to this world, (1) and ‘Serena Williams is a tennis player’ are assigned different truth-values, which implies that their intensions differ. Likewise, there is a possible world in which Gelfand wins the 2007 world chess championship tournament, thereby becoming the 15th world champion. With respect to this world, ‘Anand’ and ‘the 15th world chess champion’ have different extensions, which implies that their intensions differ as well.
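The two comparisons just made can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three world labels and the facts stipulated for them are invented for the example, an intension is modeled simply as a function from a world label to an extension, and the name ‘Anand’ is treated as picking out the same individual in every world (an assumption discussed in the next subsection).

```python
# Hypothetical mini-model: three possible worlds with stipulated facts.
# "actual" stands for our world; the other two are counterfactual worlds
# of the kind described in the text.
WORLDS = {
    "actual": {"chess_players": {"Anand", "Gelfand"},
               "tennis_players": {"Williams"},
               "champion_15": "Anand"},
    "anand_no_chess": {"chess_players": {"Gelfand"},
                       "tennis_players": {"Williams"},
                       "champion_15": "Gelfand"},
    "gelfand_wins_2007": {"chess_players": {"Anand", "Gelfand"},
                          "tennis_players": {"Williams"},
                          "champion_15": "Gelfand"},
}

def anand_is_chess_player(world):
    """Intension of sentence (1): world -> truth-value."""
    return "Anand" in WORLDS[world]["chess_players"]

def williams_is_tennis_player(world):
    """Intension of 'Serena Williams is a tennis player'."""
    return "Williams" in WORLDS[world]["tennis_players"]

def anand(world):
    """Intension of the name 'Anand' (assumed to denote Anand everywhere)."""
    return "Anand"

def fifteenth_champion(world):
    """Intension of 'the 15th world chess champion'."""
    return WORLDS[world]["champion_15"]

def same_intension(f, g):
    """Two intensions coincide if they agree at every world in the model."""
    return all(f(w) == g(w) for w in WORLDS)

# Same extension in the actual world, yet different intensions.
print(anand_is_chess_player("actual") == williams_is_tennis_player("actual"))
print(same_intension(anand_is_chess_player, williams_is_tennis_player))
print(anand("actual") == fifteenth_champion("actual"))
print(same_intension(anand, fifteenth_champion))
```

In this toy model, each pair of expressions agrees in extension at the actual world, while `same_intension` returns False for both pairs, which mirrors the point that co-extensive expressions need not be synonymous.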

b. Rigid Designators and Externalism

Saul Kripke (1980) forcefully argued that some types of expressions, among them proper names and indexicals, are rigid designators, that is, they refer to the same individual with respect to every possible world. Suppose, for instance, that Gelfand claims ‘Anand could have become a professional tennis player’. It seems obvious that this statement is about the same individual that (1) is about, that is, Anand. The statement could not be made true by someone else doing something in some possible world. Hence, when one uses the name ‘Anand’ to talk about counterfactual circumstances, one still talks about the same person. Likewise, Anand can use the indexical ‘I’ to talk about what he himself would have done in counterfactual situations. This suggests that proper names such as ‘Anand’, and indexicals such as ‘I’, ‘here’, and ‘now’, are rigid designators.

Most philosophers have accepted Kripke’s claim that names and indexicals are rigid designators. In fact, the claim made above that (1) is true in all and only those worlds in which Anand is a chess player already presupposed that the name ‘Anand’ is a rigid designator. This illustrates that in a possible worlds account of meaning, the existence of rigid designators has immediate consequences for the meanings of expressions containing them.

A less obvious consequence that many have drawn from the existence of rigid designators, following Kripke (1980) and Hilary Putnam (1975), is that the meaning of linguistic expressions is not determined by a subject’s intrinsic properties, that is, that meaning externalism is true. Take the name ‘Anand’, as used by Gelfand. A person who is intrinsically identical to Gelfand might refer to a different person by saying ‘Anand’—this might be because his ‘Anand’-utterances are causally related to this other person. Hence, the utterances of Gelfand and his twin have different extensions. Assuming that names are rigid designators, they have different intensions as well—one utterance picks out Anand with respect to every possible world, the other picks out some other person with respect to every possible world. If intensions are meanings, the two subjects’ utterances also have different meanings.

Probably the most famous argument for externalism is provided by Putnam’s ‘Twin Earth’ thought experiment (1975, 139–144). Suppose that on Twin Earth, which is a planet in a remote part of our galaxy (or in another possible world), there is a substance that is called ‘water’ by the inhabitants of this planet and that shares all of its superficial properties with our water: It is colorless, odorless and drinkable, it falls out of grey clouds and is the dominant substance in rivers and lakes, and so forth. However, this substance has a different molecular structure than water, viz. XYZ. According to Putnam, XYZ is not water, and the term ‘water’ has a different meaning on Earth than it does on Twin Earth. One way to support these claims is by appealing to the fact that ‘water’ and other terms for natural kinds, such as ‘tiger’, ‘electron’, and ‘gold’, rigidly denote the kind they pick out in the actual world. Given this, ‘water’ as used by Oscar on Earth picks out H2O with respect to all possible worlds and thus has a different intension from ‘water’ as used by Twoscar, who lives on Twin Earth. At the same time, as Putnam notes, Oscar and Twoscar might be intrinsically identical. Hence, the intension of ‘water’ is not determined by the intrinsic properties of a speaker, and so, it seems that neither is the expression’s meaning.

All the major proponents of 2D semantics accept that proper names, indexicals, and natural kind terms are rigid designators. They also accept that at least one aspect of meaning is not determined by a speaker’s intrinsic properties.

c. From 1D to 2D

The possible worlds account just sketched in effect yields a one-dimensional (1D) semantic theory, in which the meaning of an expression is modeled by means of a single intension. Proponents of 2D semantics, by contrast, hold that at least for some kinds of expressions, a single intension is not enough to capture their meaning. A natural way to motivate this claim is by considering indexical expressions. Take the following sentence:

(2) I am a chess player.

Assuming that (2) is uttered by Anand, and given that ‘I’ is a rigid designator, the intension of (2) is true with respect to all and only those worlds in which Anand is a chess player. It is thus identical to the intension of (1). However, it seems clear that Anand’s utterance is not synonymous with an utterance of (1). Furthermore, assume that Serena Williams utters (2). The intension of this utterance is true in all and only those worlds in which Williams is a chess player. It thus differs from the intension of Anand’s utterance. So, the intension of (2) varies between different tokens of (2)’s type. In a way, this seems adequate: there is a sense in which Williams and Anand express something different by uttering (2). But it seems that there is also a sense in which what they express is the same. More generally, it is natural to think that indexicals such as ‘I’, ‘here’, and ‘now’, have a stable meaning, even if they are uttered by different people, at different places, and at different times. Accordingly, the sentence (2), that is, the sentence type, and other indexical expressions should also have stable meanings that do not vary with their producers.

The intension of an utterance involving an indexical and, consequently, its extension systematically depends on the circumstances in which the utterance is produced. For example, the intension of (2) is true with respect to a world w if and only if the individual who in fact (but not necessarily in w) utters this sentence is a chess player in w. According to proponents of 2D semantics, this kind of dependency must be captured when giving an account of the meaning of indexical expressions. Here is one way to systematize this dependency. The circumstances in which an expression is uttered are often called ‘contexts of use’. The worlds with respect to which (according to 1D accounts) an expression’s intension outputs an extension are often called ‘circumstances of evaluation’.

Now assume, first, that one wants to capture how the truth-value of indexical expressions varies depending on the expressions’ context of use. In 2D semantics, this is done by means of a 1-intension:

(1-Intension) A 1-intension is a function from contexts of use to extensions.

For example, relative to a context in which Serena Williams is not a chess player and utters (2), the 1-intension of her utterance is false. In 1-intensions, the context of use also serves as the circumstances of evaluation. In 1D accounts, in which expressions only have one intension, these intensions are construed differently. There, it is assumed that the context of use is fixed, that is, an expression is uttered by a specific individual, at a particular place, and at a particular time. Then, the expression is assigned an extension relative to different circumstances of evaluation. Within a 2D account, this kind of intension is a 2-intension:

(2-Intension) A 2-intension is a function from circumstances of evaluation to extensions.

For example, if Serena Williams utters (2), then her utterance is true with respect to counterfactual circumstances of evaluation in which Serena Williams is a chess player. The 2-intensions of indexical expression types vary between different contexts of use. This is precisely why one needs to appeal to 1-intensions to characterize the meanings of indexical expressions. However, every expression token, that is, every utterance, has a context of use and therefore also has a 2-intension. How 2-intensions of expressions vary between different contexts of use, that is, how they depend on contexts of use, can itself be captured by another, 2D-intension. It is a function from contexts of use to 2-intensions. Equivalently, it can be defined as a function that takes pairs of a context of use and circumstances of evaluation as input and delivers an extension as output:

(2D-Intension) A 2D-intension is a function from pairs of contexts of use and circumstances of evaluation to extensions.

A note concerning the construal of contexts of use and circumstances of evaluation: Circumstances of evaluation can simply be understood as possible worlds. It is natural to understand contexts of use as possible worlds as well. However, there is a catch. In many possible worlds, a great number of utterances are made. Consider, for instance, a world in which both Anand and Williams utter (2). If contexts of use are just possible worlds, then it is impossible to identify the utterance that is to be assigned an extension. A common way to solve this problem is to construe contexts of use as centered worlds (compare Lewis 1979). A centered world is a triple of a possible world, an individual, and a time. A centered world can thus serve to pick out the relevant utterance by specifying, or ‘marking’, the producer of the utterance and the time at which it is uttered.
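The three definitions, together with the centered-world construal of contexts, can be sketched as follows. This is a minimal illustration, not a formalization taken from the literature: the world labels `w_a` and `w_b`, the time label `t0`, and the stipulated facts are all hypothetical. The sketch shows that a 2D-intension can be given either as a function from pairs to extensions or, equivalently, as a function from contexts to 2-intensions, and that centering is what separates two utterances of (2) made in one and the same world.

```python
from typing import Callable, NamedTuple

class CenteredWorld(NamedTuple):
    """A context of use: a world plus a marked individual and time."""
    world: str
    individual: str
    time: str

# Hypothetical facts: who plays chess in which world.
CHESS_PLAYERS = {"w_a": {"Anand"}, "w_b": {"Williams"}}

def two_d_intension(context: CenteredWorld, evaluation_world: str) -> bool:
    """2D-intension of (2): (context, circumstances) -> truth-value."""
    return context.individual in CHESS_PLAYERS[evaluation_world]

def two_intension_in(context: CenteredWorld) -> Callable[[str], bool]:
    """Equivalent curried form: context -> 2-intension."""
    return lambda evaluation_world: two_d_intension(context, evaluation_world)

def one_intension(context: CenteredWorld) -> bool:
    """1-intension of (2): the context also serves as the circumstances."""
    return two_d_intension(context, context.world)

# Two utterances of (2) in the same world, told apart by their centers.
anand_ctx = CenteredWorld("w_a", "Anand", "t0")
williams_ctx = CenteredWorld("w_a", "Williams", "t0")

print(one_intension(anand_ctx))              # True
print(one_intension(williams_ctx))           # False
print(two_intension_in(williams_ctx)("w_b")) # True
```

Modeling contexts as bare world labels would make `anand_ctx` and `williams_ctx` indistinguishable, which is the problem the centered-world construal is meant to solve.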

All of the intensions just defined can be represented in a 2D matrix. Figure 1 below depicts a snippet of the 2D matrix of sentence (2). In each centered world, the individual at the center utters (2) at the marked time. The worlds involved have the following character—notice that centered worlds are flagged by a ‘*’, and that any wn* differs from any wn only in that the former involves a center.

w1* is centered on Anand. In w1, Anand and Gelfand are chess players, and Williams is not.

w2* is centered on Gelfand. In w2, Gelfand and Williams are chess players, and Anand is not.

w3* is centered on Williams. In w3, Anand is a chess player, and Gelfand and Williams are not.

Figure 1

The worlds on the left side, marked with an ‘*’, are contexts of use, understood as centered possible worlds. The worlds on the top are circumstances of evaluation, understood as possible worlds. Notice that the class of contexts of use and the class of circumstances of evaluation are identical, the only difference being the presence or absence of centering. For the purposes of illustration, here is how the second row of the matrix is evaluated. In this row, w2* is assumed to be the context of use, in which Gelfand utters (2). Now this utterance is evaluated with respect to different circumstances of evaluation. Since Gelfand is a chess player in w1 and in w2, but not in w3, the first two cells in this row get assigned a ‘T’ (for True), while the third one gets assigned an ‘F’ (for False). Now assume that w1* is the actual world. That is to say, the utterance of (2) that we consider is in fact produced by Anand, who is a chess player. Then the three kinds of intensions identified above are represented in the matrix as follows. The top row of the matrix, with respect to which the actual world, centered on Anand, is the context of use, represents the 2-intension of (2). The other rows represent 2-intensions that (2) could have had, if it had been uttered in different contexts. The 1-intension is represented by the diagonal that runs from the top left to the bottom right of the matrix. The matrix itself represents the utterance’s 2D-intension.
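The evaluation procedure just described can be reproduced mechanically. The following sketch computes the matrix for sentence (2) from the descriptions of w1, w2, and w3 given above, and then reads off the 2-intension (top row), the 1-intension (diagonal), and the 2D-intension (whole matrix). The variable and function names are illustrative choices, and the marked time of each centered world is omitted for brevity.

```python
# Facts taken from the descriptions of w1, w2 and w3 in the text.
CHESS_PLAYERS = {
    "w1": {"Anand", "Gelfand"},
    "w2": {"Gelfand", "Williams"},
    "w3": {"Anand"},
}

# Each centered world wn* is wn plus a marked individual (time omitted).
CENTERS = {
    "w1*": ("w1", "Anand"),
    "w2*": ("w2", "Gelfand"),
    "w3*": ("w3", "Williams"),
}
WORLDS = ["w1", "w2", "w3"]

def truth_value(context, world):
    """Extension of 'I am a chess player' uttered in context, evaluated at world."""
    _, speaker = CENTERS[context]
    return speaker in CHESS_PLAYERS[world]

# The 2D-intension: one row per context of use, one column per world.
matrix = {c: [truth_value(c, w) for w in WORLDS] for c in CENTERS}

# With w1* as the actual context, the top row is the 2-intension.
two_intension = matrix["w1*"]

# The 1-intension is the diagonal: each context doubles as the circumstances.
one_intension = [truth_value(c, CENTERS[c][0]) for c in CENTERS]

for context, row in matrix.items():
    print(context, " ".join("T" if value else "F" for value in row))
# w1* T F T
# w2* T T F
# w3* F T F
```

The second printed row matches the evaluation of the w2* row given in the text, and the diagonal comes out as T, T, F.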

According to the 2D account just sketched, indexical expressions are associated with three intensions. This raises the question: Which of these intensions represents the meaning of these expressions? The 1-intensions and 2D-intensions of indexical expressions do not vary with their context of use. This distinguishes them from 2-intensions and makes them more suitable for representing the meaning of such expressions. Relatedly, it is plausible that subjects who know the meaning of an indexical expression are able to evaluate 1-intensions and 2D-intensions. For example, a speaker is able to say that if, in the context of use, a is the speaker, and if, in the circumstances of evaluation, a is a chess player, (2) is true. An expression’s 2-intension, however, cannot be evaluated on the basis of mere semantic competence, because semantic competence does not provide knowledge of the context of an utterance. These considerations suggest that both the 1-intension and the 2D-intension of an indexical expression are better candidates for representing its meaning. But 2D accounts are not committed to the claim that one of these intensions represents the meaning of such an expression. Rather, proponents of 2D semantics could say that all three intensions represent important aspects of meaning. For instance, as was mentioned above, when Williams and Anand utter (2), one would like to say that in one sense, they expressed the same thing, and in another sense, they expressed something different. The sense in which what is expressed differs is reflected in the differences between the 2-intensions of the respective utterances. 2D semantics thus provides the resources to capture all these aspects of meaning.

All 2D semantic theories share the basic structure represented by a 2D matrix. This structure is 2D because the worlds involved play two different roles. Above, these roles were introduced as the contexts of use on the one hand, and the circumstances of evaluation on the other. However, notice that, while this is a natural and popular understanding of the two roles of possible worlds in 2D semantics, some 2D accounts understand these two roles in slightly different ways. In particular, the construal of the worlds involved in the ‘first dimension’, that is, the centered worlds listed on the left of each row in the 2D matrix, is contested. As a consequence, the construal of 1-intensions is contested as well. 2D accounts also differ in other respects. There is widespread agreement that a 2D account is well suited to describe the meanings of indexicals, and of indexical expressions. However, whether or not a 2D account should also be applied to other kinds of expressions, and if so, to what kinds of expressions, is controversial.

2. Local Accounts

a. Content and Character (Kaplan)

We just saw that indexicals provide a natural motivation for adopting a 2D theory. It is therefore unsurprising that the first 2D theories were introduced as accounts of indexical expressions (compare, for example, Kamp 1971). One such account that has been particularly influential is David Kaplan’s general semantic theory of indexicals (compare Kaplan 1989). According to Kaplan, expressions have contents. These contents are supposed to correspond to ‘what is said’ by the relevant expression. Furthermore, the content of a sentence token is a proposition. Kaplan’s way of construing propositions, and contents in general, requires some elaboration. On his view, propositions are structured entities. The content of Anand’s utterance of (2), for instance, is a singular proposition that consists of Anand himself and the property of being a chess player. (A proposition is singular if it has an individual as its constituent.) According to Kaplan, the content of a singular expression, such as ‘Anand’, is an individual: in this case, Anand. The content of a general expression, such as ‘chess player’, is a property: in this case, the property of being a chess player. The contents of composite expressions then systematically depend on the contents of their parts.

This account of content seems very different from a possible worlds account. However, the contents postulated by Kaplan can be taken to determine intensions. And in fact, Kaplan often appeals to intensions to characterize contents. For Kaplan, an intension is a function from circumstances of evaluation to extensions. Kaplan’s intensions are basically the 2-intensions introduced above, except that Kaplan favors a different characterization of circumstances of evaluation. For Kaplan, circumstances of evaluation are not just possible worlds. They also include a designated time and potentially other features. Now consider again Anand’s utterance of (2), which expresses a singular proposition containing Anand and the property of being a chess player. This proposition determines an intension that is true with respect to all circumstances of evaluation in which Anand is a chess player.

According to Kaplan, indexicals are directly referential, which is to say that the only contribution they make to the contents of the expressions they figure in is their referent. He takes this to imply that indexicals pick out the same individual with respect to all circumstances of evaluation. Kaplan thus seconds Kripke’s claim that indexicals are rigid designators.

Up to this point, the account described is just a standard 1D account. However, Kaplan argues that content is not all there is to meaning. He therefore introduces another aspect of meaning, character. The character of an expression can be understood as a rule that specifies how the content of the expression depends on the context. For example, for the indexical ‘I’, the rule would be something like this: ‘If x is the producer of the utterance in the relevant context, then x is the content of ‘I’’. Similar rules apply to other indexicals. More formally, characters can be defined as functions from contexts to contents. Hence, on the possible worlds understanding of content, characters are 2D-intensions. The inclusion of characters thus makes Kaplan’s account a type of 2D semantics. Now consider three contexts, in all of which someone utters ‘I’. In w1*, it is Anand; in w2*, it is Gelfand; in w3*, Williams. Given these contexts, Kaplan’s account entails the following snippet of the matrix for the indexical ‘I’:

Figure 2

This matrix illustrates Kaplan’s claim that ‘I’ is a rigid designator: First, the reference of ‘I’ is determined by the context, and then the expression picks out the same individual with respect to all circumstances of evaluation.
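The two-step picture (context fixes the referent, which then stays constant across circumstances) can be sketched in code. The sketch assumes a simplified context consisting only of a world and the producer of the utterance, and it treats circumstances of evaluation as plain world labels; both simplifications, like all names below, are made for illustration and are not part of Kaplan’s own formalism.

```python
from typing import Callable, NamedTuple

class Context(NamedTuple):
    """Simplified context: a world and the producer of the utterance."""
    world: str
    speaker: str

def character_of_i(context: Context) -> str:
    """Character of 'I': a function from contexts to contents."""
    return context.speaker

def intension_of(content: str) -> Callable[[str], str]:
    """The content of a directly referential term determines a constant intension."""
    return lambda circumstance: content

CONTEXTS = [
    Context("w1", "Anand"),
    Context("w2", "Gelfand"),
    Context("w3", "Williams"),
]
CIRCUMSTANCES = ["w1", "w2", "w3"]

# One row per context: step 1 fixes the content, step 2 evaluates it everywhere.
matrix = {
    f"{c.world}*": [intension_of(character_of_i(c))(w) for w in CIRCUMSTANCES]
    for c in CONTEXTS
}

for context, row in matrix.items():
    print(context, row)
# w1* ['Anand', 'Anand', 'Anand']
# w2* ['Gelfand', 'Gelfand', 'Gelfand']
# w3* ['Williams', 'Williams', 'Williams']
```

Every row is constant, which is how rigidity shows up in the matrix, while the rows differ from one another, which is how context-dependence shows up.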

For ease of exposition, it has so far been assumed that contents are assigned to utterances in Kaplan’s account, and that Kaplan’s contexts are just the contexts of use introduced in § 1. However, neither assumption is entirely correct. According to Kaplan, contents are assigned to expressions with respect to contexts. (Characters, on the other hand, are assigned to expressions without relativization to anything.) The subject that is the content of, for instance, ‘I’, need not in fact have produced an utterance in a Kaplanian context. On his account, the context does not have to involve an utterance at all. Kaplan states that every context has an agent, a time, and a location within a possible world. (Contexts can thus be understood as centered worlds.) The content of the expression ‘I’ with respect to a context is the agent of the context, where this agent may or may not produce an utterance in the relevant context. This feature of Kaplan’s account has implications for the evaluation of some expressions. For instance, the sentence ‘I utter nothing’ is true with respect to some Kaplanian contexts, while it comes out as false with respect to all contexts of use as they were construed in § 1.

The character of an indexical expression corresponds to what a competent speaker can know in virtue of understanding the expression. The same does not hold for contents, since they vary between contexts. Should one therefore say that the character of an indexical is its meaning? While Kaplan affirms this in several places, he stresses elsewhere that content is also an important aspect of meaning. Again, it does not seem too important to settle the question of what the meaning of indexicals is. What is clear is that both characters and contents play crucial roles in Kaplan’s account of the semantics of indexicals.

On Kaplan’s account, all meaningful expressions can be assigned a character and a content. However, he believes that the characters of many expressions are not very interesting, since they assign the same content with respect to every context. Expressions of this type thus have a constant content. According to Kaplan, proper names fall into this category. In Kaplan’s view, we can say with respect to such expressions that their meaning is just their content. In any case, it would not be theoretically very fruitful to apply Kaplan’s 2D account to expressions with a constant content.

b. Superficial and Deep Modality (Evans, Davies & Humberstone)

Gareth Evans (1979) and Martin Davies & Lloyd Humberstone (1980) applied ideas from 2D semantics to give accounts both of contingent truths that can be known a priori and of necessary truths that can only be known a posteriori. The existence of both kinds of truths seems to follow straightforwardly from the fact that some expressions are rigid designators. For instance, take the names ‘Hesperus’ and ‘Phosphorus’, both of which refer to the same object, the planet Venus. Since both names are rigid designators, they refer to Venus with respect to every possible world. And this implies that ‘Hesperus = Phosphorus’ is necessarily true. At the same time, it took substantial astronomical research to establish that Hesperus = Phosphorus, and it seems clear that no amount of a priori reasoning could have sufficed to come to know it. Hence, ‘Hesperus = Phosphorus’ is an example of a necessary a posteriori truth (or at least it is a necessary truth that if Hesperus exists, then Hesperus = Phosphorus). A simple way of formulating contingent a priori truths is by drawing on sentences that contain the expression ‘actual’. This expression is standardly taken to be a device that turns non-rigid expressions into rigid designators. For instance, take the definite description ‘the 15th world chess champion’. This expression picks out Anand; but with respect to a world in which Gelfand wins the 2007 world championship tournament, it picks out Gelfand. The description at issue is therefore not rigid. However, if one adds the word ‘actual’ to it, changing the description to ‘the actual 15th world chess champion’, the new description will pick out the person who in our (the actual) world is the 15th world chess champion (that is, Anand) with respect to every possible world. With this in mind, consider the sentence ‘The actual 15th world chess champion is the 15th world chess champion’. This sentence is contingent: for instance, it is false with respect to the world just mentioned, in which Gelfand, and thus someone other than the actual 15th world chess champion, is the 15th world chess champion. At the same time, the sentence can be known a priori. Hence, ‘The actual 15th world chess champion is the 15th world chess champion’ is a contingent a priori truth.
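The behavior of ‘actual’ as a rigidifying device can be illustrated with a small sketch. The two world labels are hypothetical, and each expression is modeled as a function of two arguments: the world taken to be actual and the world of evaluation. The final check, which evaluates the sentence at each world while treating that same world as actual, is offered only as one way of making vivid why no empirical information is needed to see that the sentence is true; it is not meant as an analysis of apriority.

```python
# Hypothetical two-world model: "w_actual" stands for our world; in
# "w_gelfand", Gelfand wins the 2007 world championship tournament.
CHAMPION_15 = {"w_actual": "Anand", "w_gelfand": "Gelfand"}
WORLDS = list(CHAMPION_15)

def the_champion(actual, world):
    """Non-rigid description: 'the 15th world chess champion'."""
    return CHAMPION_15[world]

def the_actual_champion(actual, world):
    """Rigidified description: 'the actual 15th world chess champion'."""
    return CHAMPION_15[actual]

def sentence(actual, world):
    """'The actual 15th world chess champion is the 15th world chess champion'."""
    return the_actual_champion(actual, world) == the_champion(actual, world)

# Holding our world fixed as actual, the sentence fails at w_gelfand,
# so it is not true with respect to every possible world.
contingent = not all(sentence("w_actual", w) for w in WORLDS)

# Whichever world is treated as actual, the sentence is true at that world.
true_wherever_actual = all(sentence(w, w) for w in WORLDS)

print(the_actual_champion("w_actual", "w_gelfand"))  # Anand
print(the_champion("w_actual", "w_gelfand"))         # Gelfand
print(contingent, true_wherever_actual)              # True True
```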

Many people have found it puzzling that there could be contingent a priori truths and necessary a posteriori truths. If a sentence is contingent, then it seems that its truth depends on features that are not shared by all worlds. It is thus natural to think that to find out whether our world has these features, one needs to do empirical research. On the other hand, if a sentence is necessary, then it seems that its truth does not depend on specific features of our world. It is thus natural to think that to find out whether such a sentence is true, purely a priori reasoning is sufficient.

Evans tries to explain how there can be contingent a priori truths, focusing on examples that arise from what he calls “descriptive names”. To introduce such a name, he stipulates that the name ‘Julius’ is to refer to whoever invented the zipper (Evans 1979, 163). A descriptive name is thus a name whose reference is fixed by a description—in this case, the description ‘the inventor of the zipper’. Evans argues that since descriptive names are names, they are rigid designators. The descriptive name ‘Julius’ thus refers to the same person with respect to every possible world, unlike the definite description ‘the inventor of the zipper’. With this in mind, consider the following sentence:

(3) Julius invented the zipper.

With respect to a possible world in which someone other than the actual inventor of the zipper (Whitcomb Judson) invented the zipper, (3) is false. Hence, (3) is contingent. At the same time, according to Evans, someone who understands the expression ‘Julius’ knows its associated description and is thus in a position to know a priori that (3) is true. Therefore, (3) is a contingent a priori truth. To account for such sentences, Evans introduces a distinction between superficial and deep contingency. Superficial contingency corresponds to the ordinary understanding of contingency—as Evans puts it, whether a sentence is superficially contingent depends on how it “embeds in the scope of modal operators” (1979, 161). Deep contingency, on the other hand, depends on what makes a sentence true: A sentence is deeply contingent only if the world needs to satisfy some condition for this sentence to be true, that is, only if there is some feature that the world needs to have to make it true. What makes a sentence true, according to Evans, is in turn related to the sentence’s content. Accordingly, a deeply necessary sentence is one whose content guarantees its truth. Superficial and deep contingency can come apart because the notion of content is not tied to metaphysical modality. Evans’s notion of content thus differs from the one invoked, for instance, by Kaplan. Following Gottlob Frege (1892/1952), Evans holds that there are epistemic constraints on content: If two sentences have the same content, then a subject who understands both of them cannot believe what one of the sentences says without also believing what the other one says. Evans calls such sentences “epistemically equivalent”.

According to Evans’s distinction, (3) is superficially contingent. Evans also holds that ‘Julius’ and ‘the inventor of the zipper’ have the same content; therefore, (3) is deeply necessary. In his view, there can be no a priori sentences that are deeply contingent. Accordingly, contingent a priori truths are those sentences that are superficially contingent but deeply necessary, that is, those whose truth is guaranteed by their content, even though they are not true in all possible worlds. One might have doubts that such a separation of content and modality is sensible. Take the two sentences ‘Julius is male’ and ‘The inventor of the zipper is male’. These sentences place different demands on a possible world with respect to which they are to be true: ‘Julius is male’ is true with respect to some world if and only if the individual who invented the zipper in the actual world is male, while ‘The inventor of the zipper is male’ is true with respect to some world if and only if the individual who invented the zipper in that world is male. So how could these sentences have the same content, as Evans’s account has it? In response to this kind of worry, Evans points out that the sentences nevertheless place the exact same demands on the actual world: They are both true with respect to the actual world if and only if the individual who invented the zipper in that world is male. Since believing something means believing that it is actually the case, the two sentences are epistemically equivalent. And this, in turn, implies that they have the same content.

By distinguishing between two kinds of modality, Evans draws on a central idea of 2D semantics. Accordingly, deep and superficial modality could, in principle, be used to define 1- and 2-intensions, respectively. The connection to 2D semantics becomes even clearer once one considers the account of Davies & Humberstone (1980), who try to characterize Evans’s distinction between superficial and deep necessity in formal terms. They start from a standard modal logic, with ‘□’ as the sentential operator expressing necessity. Then they introduce the sentential operator ‘A’, which stands for ‘it is actually the case that’. In line with what was said above about ‘actually’-involving expressions, a sentence AS is true with respect to a world if and only if S is true with respect to the actual world. Accordingly, if S is true, then AS is necessarily true, that is, true with respect to every world. But as Davies & Humberstone note, there is an intuitive sense in which some other world might have been actual, and thus, we can consider different worlds as actual. Based on this idea, they introduce another sentential operator, F (for ‘fixedly’), such that FS is true with respect to a world w if and only if S is true with respect to w irrespective of which world is considered as actual. Combining these two operators, one can derive another operator, FA, such that FAS is true if and only if, for every world w, S is true with respect to w when w itself is considered as actual. As Davies & Humberstone point out, the resulting logic can also be characterized in 2D terms. Accordingly, one can evaluate a sentence S with respect to pairs of a world considered as actual and a possible world; this way of evaluating expressions thus yields a kind of 2D-intension.

Davies & Humberstone argue that the distinction between □-truth and FA-truth captures Evans’s distinction between superficial and deep necessity. Accordingly, FAS is true if and only if S is deeply necessary. On Evans’s account, this implies that if S is a priori, then FAS is true. Davies & Humberstone hypothesize that all contingent a priori truths are A-involving. For instance, assume that it is part of the meaning of Evans’s descriptive name ‘Julius’ that it rigidly refers to the inventor of the zipper. Then we can take ‘Julius’ to abbreviate ‘the actual inventor of the zipper’. Given this, (3) is clearly FA-true: No matter which world w is considered as actual, the person who invented the zipper in w actually invented the zipper in w. Other examples of sentences that are FA-true and contingent a priori are easy to come by. These include all sentences of the form S ↔ AS, such as ‘Grass is green if and only if grass is actually green’. Davies & Humberstone also hold that there are many A-involving necessary a posteriori truths. For instance, if S is an ordinary (superficially and deeply) contingent truth, such as ‘Grass is green’, then AS is necessary and a posteriori.
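
The interaction of these operators can be made concrete with a small model. The following Python sketch is only an illustration under stated assumptions, not Davies & Humberstone’s own formalism: the two-world model, the facts about grass, and all function names are hypothetical. A sentence is evaluated at a pair consisting of a world considered as actual and a world of evaluation.

```python
from typing import Callable, Dict

# Hypothetical toy model (an assumption for illustration): two worlds, one atomic fact.
WORLDS = ["w1", "w2"]
GRASS_IS_GREEN: Dict[str, bool] = {"w1": True, "w2": False}

# A sentence maps (world considered as actual, world of evaluation) to a truth-value.
Sentence = Callable[[str, str], bool]


def atom(facts: Dict[str, bool]) -> Sentence:
    """An ordinary sentence: its truth depends only on the world of evaluation."""
    return lambda actual, w: facts[w]


def iff(p: Sentence, q: Sentence) -> Sentence:
    """Material biconditional, evaluated pointwise."""
    return lambda actual, w: p(actual, w) == q(actual, w)


def box(s: Sentence) -> Sentence:
    """Necessity: s holds at every world of evaluation, with the actual world held fixed."""
    return lambda actual, w: all(s(actual, v) for v in WORLDS)


def actually(s: Sentence) -> Sentence:
    """A: evaluate s at the world considered as actual."""
    return lambda actual, w: s(actual, actual)


def fixedly(s: Sentence) -> Sentence:
    """F: s holds at w whichever world is considered as actual."""
    return lambda actual, w: all(s(a, w) for a in WORLDS)


def fixedly_actually(s: Sentence) -> Sentence:
    """FA: s holds at every world when that world is itself considered as actual."""
    return fixedly(actually(s))


grass = atom(GRASS_IS_GREEN)
biconditional = iff(grass, actually(grass))  # S <-> AS
actually_grass = actually(grass)             # AS

# Evaluate from the standpoint of w1 taken as the actual world.
results = {
    "box(S <-> AS)": box(biconditional)("w1", "w1"),
    "FA(S <-> AS)": fixedly_actually(biconditional)("w1", "w1"),
    "box(AS)": box(actually_grass)("w1", "w1"),
    "FA(AS)": fixedly_actually(actually_grass)("w1", "w1"),
}
for label, value in results.items():
    print(f"{label}: {value}")
```

In this toy model, S ↔ AS comes out FA-true but not □-true, while AS comes out □-true but not FA-true. This matches the pattern described above for contingent a priori and necessary a posteriori sentences, respectively.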

As was noted above, Davies & Humberstone follow Evans in holding that all a priori truths are deeply necessary, which in their framework means that they are FA-true. According to Davies & Humberstone, contingent a priori truths involve a divergence between □-truth and FA-truth—such sentences are FA-true but not □-true—that is due to the involvement of an (implicit) A-operator. If this indeed applies to all contingent a priori truths, then these can be given a unified explanation in their framework. But it is not obvious that all contingent a priori truths are A-involving. Take, for instance, ‘The local theater is a theater’. Given that the expression ‘local’, like other indexicals, is a rigid designator, this sentence is contingent. It is also clearly a priori. But it is less clear that ‘local’, or any other expression in the sentence at hand, is even implicitly A-involving. It is therefore disputable both that Davies & Humberstone can explain all contingent a priori truths and that they can preserve Evans’s claim that all a priori truths are deeply necessary. Nevertheless, there is some plausibility to the claim that ‘The local theater is a theater’, and indeed all contingent a priori truths, involve some kind of implicit or explicit reference to actuality.

Davies & Humberstone tentatively suggest that many other expressions are also A-involving, among them natural kind terms, such as ‘water’. Recall that, since water is composed of H2O molecules, ‘water’ rigidly refers to H2O. This implies that ‘Water = H2O’ is necessarily true. Since this sentence cannot be known a priori, it represents another example of a necessary a posteriori truth. Davies & Humberstone’s suggestion is that ‘water’ and other natural kind terms can be understood analogously to descriptive names. For instance, the description associated with ‘water’ could be something like ‘the actual chemical kind exemplified by the liquid that falls from clouds, flows in rivers, is colorless and odorless, …’ (compare Davies & Humberstone 1980, 18), which rigidly refers to H2O. If this is correct, then sentences containing the term ‘water’ are A-involving, and the fact that ‘Water = H2O’ is a necessary a posteriori truth can be explained by Davies & Humberstone’s account. However, Davies & Humberstone believe that ordinary proper names are not even implicitly A-involving, and thus that true identity statements involving names, such as ‘Hesperus = Phosphorus’, are both □-true and FA-true. This is in line with Evans’s view, according to which such sentences are both superficially and deeply necessary. Hence, not all necessary a posteriori sentences are given a unified treatment in the account of Evans and Davies & Humberstone.

3. Global Accounts

a. Metasemantic 2D Semantics (Stalnaker)

Robert Stalnaker (1978) introduces his 2D account as a part of a theory of assertions and their role in communication. According to Stalnaker, the contents of assertions are propositions, which he construes as intensions, that is, functions from possible worlds to extensions. Every proposition thus corresponds to a set of possible worlds, viz. those worlds with respect to which the extension is True. In a conversation, each of the participants makes certain assumptions. These speaker presuppositions are those propositions that the participants in the conversation believe to be true, or at least accept for the purposes of the conversation, and that they believe to be accepted by all the other participants in the conversation. Those speaker presuppositions that are indeed shared, and known to be shared, by all participants in a conversation constitute their common knowledge. This common knowledge is characterized by the context set—the set of those possible worlds that are not ruled out by the common knowledge of the participants in the conversation. Now if a speaker in a conversation asserts a proposition that is accepted by the hearers, then this proposition is added to their common knowledge, which means that those possible worlds not compatible with it are eliminated from the context set. On this account, the goal of communication is to reduce the context set by means of making assertions.
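
The update mechanism just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the four worlds, the two sample propositions, and the function name `update` are hypothetical and serve only to show how an accepted assertion shrinks the context set.

```python
from typing import FrozenSet

World = str
Proposition = FrozenSet[World]  # a proposition is identified with the set of worlds where it is true


def update(context_set: FrozenSet[World], asserted: Proposition) -> FrozenSet[World]:
    """Accepting an assertion eliminates every world incompatible with it."""
    return context_set & asserted


# Hypothetical worlds and propositions, chosen only for illustration.
ALL_WORLDS: FrozenSet[World] = frozenset({"w1", "w2", "w3", "w4"})
it_is_raining: Proposition = frozenset({"w1", "w2"})
it_is_cold: Proposition = frozenset({"w2", "w3"})

context = ALL_WORLDS
context = update(context, it_is_raining)  # worlds w3 and w4 are eliminated
context = update(context, it_is_cold)     # world w1 is eliminated
print(sorted(context))

# A proposition true in every world leaves the context set unchanged.
unchanged = update(ALL_WORLDS, ALL_WORLDS) == ALL_WORLDS
print(unchanged)
```

On this way of modeling things, an assertion is informative exactly when it removes at least one world from the context set.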

One problem about this very natural account of assertion and communication is that it seems unable to explain the use of certain perfectly sensible assertions. For example, there are many conceivable circumstances in which a speaker successfully communicates something by asserting ‘Hesperus = Phosphorus’. However, the proposition expressed by this utterance has a necessary intension (an intension that is constantly True in all possible worlds). Therefore, no matter what the common knowledge of the participants in such a conversation consists in, the utterance cannot eliminate any possibilities from the context set. Stalnaker thus needs to explain how an utterance of ‘Hesperus = Phosphorus’ and other utterances of this type can be informative. His explanation relies on the fact that which proposition a specific sentence expresses depends on features of the world. Suppose, for instance, as seems plausible, that if some celestial body other than Venus had been the brightest object in the evening sky (BOE), then that object would have been called ‘Hesperus’, and likewise that if some celestial body other than Venus had been the brightest object in the morning sky (BOM), then that object would have been called ‘Phosphorus’. Given this, if Mars had been the BOE and Venus the BOM, ‘Hesperus = Phosphorus’ would have expressed a different proposition that is necessarily false. In Stalnaker’s account, this dependency of the proposition expressed by an utterance on the state of the world is captured by a propositional concept. A propositional concept is a function from possible worlds to propositions or, equivalently, from pairs of possible worlds to truth-values. A propositional concept is thus a 2D-intension; it corresponds to a 2D matrix. Below is a snippet of the 2D matrix of ‘Hesperus = Phosphorus’, involving the following worlds:

w1: BOE = Venus; BOM = Venus

w2: BOE = Mars; BOM = Venus

w3: BOE = Mars; BOM = Mars


Figure 3

As was just noted, the whole matrix represents a propositional concept and thus a 2D-intension. Given that w1 is the actual world, the upper row of the matrix represents the intension actually expressed by ‘Hesperus = Phosphorus’. In 2D terminology, this horizontal intension is a 2-intension. The diagonal of the matrix running from the upper left to the bottom right is what Stalnaker calls a ‘diagonal proposition’. The diagonal proposition of ‘Hesperus = Phosphorus’, which in 2D terms is its 1-intension, is true with respect to a world if and only if the sentence expresses a true proposition in this world.

While the 2-intension, or horizontal proposition, of ‘Hesperus = Phosphorus’ is necessary, its 1-intension, or diagonal proposition, is contingent, which reflects the fact that (for all that is presupposed in a certain context) the sentence could have expressed a different, false proposition. This is crucial for Stalnaker’s explanation of the informativeness of assertions such as ‘Hesperus = Phosphorus’, because he argues that in uttering one of them, a speaker communicates the expression’s diagonal proposition. Assume, for instance, that w1, w2, and w3 are in the context set in a conversation, when the speaker utters ‘Hesperus = Phosphorus’. Interpreted according to its 2-intension—which for Stalnaker corresponds to literal interpretation—this utterance is uninformative and thus violates an important conversational rule. Moreover, a hearer who trusts this utterance knows that it is uninformative. According to Stalnaker, the utterance should thus be reinterpreted. What it really communicates is that the sentence ‘Hesperus = Phosphorus’ expresses something true. Assuming that it is common knowledge in the conversation that Hesperus is the BOE and Phosphorus the BOM, the utterance also conveys that the BOE is identical to the BOM. This content is captured by the utterance’s diagonal proposition. Hence, if the hearer accepts the speaker’s utterance, then w2, with respect to which the diagonal proposition is false, is eliminated from the context set.
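
The propositional concept, its horizontal and diagonal propositions, and the two candidate updates of the context set can be computed directly from the three worlds listed above. The following Python sketch assumes, as in the text, that ‘Hesperus’ and ‘Phosphorus’ rigidly name whatever is the BOE and the BOM in the world of utterance; the function and variable names are hypothetical.

```python
# Worlds from the example: which body is the brightest object in the
# evening sky (BOE) and in the morning sky (BOM).
WORLDS = {
    "w1": {"BOE": "Venus", "BOM": "Venus"},
    "w2": {"BOE": "Mars", "BOM": "Venus"},
    "w3": {"BOE": "Mars", "BOM": "Mars"},
}


def propositional_concept(utterance_world: str, evaluation_world: str) -> bool:
    """Truth-value at evaluation_world of the proposition that
    'Hesperus = Phosphorus' expresses when uttered in utterance_world."""
    hesperus = WORLDS[utterance_world]["BOE"]    # reference fixed by the utterance world
    phosphorus = WORLDS[utterance_world]["BOM"]
    # Both names are rigid, so the identity has the same truth-value at every
    # evaluation world; evaluation_world is kept only to show the 2D signature.
    return hesperus == phosphorus


# Rows: world of utterance. Columns: world of evaluation.
matrix = {u: {e: propositional_concept(u, e) for e in WORLDS} for u in WORLDS}
horizontal_w1 = matrix["w1"]                  # 2-intension, given that w1 is actual
diagonal = {w: matrix[w][w] for w in WORLDS}  # 1-intension (diagonal proposition)


def update(context_set, proposition):
    """Keep only the worlds in which the communicated proposition is true."""
    return {w for w in context_set if proposition[w]}


context_set = {"w1", "w2", "w3"}
after_horizontal = update(context_set, horizontal_w1)
after_diagonal = update(context_set, diagonal)

for u, row in matrix.items():
    print(u, " ".join("T" if row[e] else "F" for e in WORLDS))
print("literal reading keeps:", sorted(after_horizontal))
print("diagonal reading keeps:", sorted(after_diagonal))
```

The rows for w1 and w3 are true throughout and the row for w2 is false throughout, so the diagonal is true at w1 and w3 and false at w2. Updating with the horizontal proposition eliminates nothing, whereas updating with the diagonal proposition eliminates w2.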

There are some important differences between Stalnaker’s 2D account and the accounts considered in § 2. For a start, the accounts discussed previously were introduced to explain the behavior of specific kinds of expressions, such as indexicals (Kaplan), descriptive names (Evans; Davies & Humberstone), ‘actually’ (Evans; Davies & Humberstone), and natural kind terms (Davies & Humberstone). But since the proposition expressed by any sentence depends on the state of the world, Stalnaker’s 2D account can be sensibly applied to all kinds of sentences. Furthermore, Stalnaker (1987, 182f) stresses that his account concerns expression tokens, not types. The reason for this is that Stalnaker’s 2D account is not semantic, but metasemantic: Its 1-intension and its 2D-intension are not aspects of the meaning of expressions, but capture how their meanings depend on features of the world. And as Stalnaker notes, the latter dependency can vary between tokens of an expression type.

Stalnaker’s diagonal propositions have several further uses, such as capturing the contents of mental states. For example, Stalnaker (1981) argues that diagonal propositions can serve to resolve puzzles raised by so-called ‘indexical’ or ‘egocentric’ beliefs, such as ‘I am sleepy’ or ‘It is dark here’, that are essentially about the believer and her relation to the world. The following story, loosely based on a case devised by John Perry (1977, 492), illustrates one such puzzle. Suppose that Anand has lost his memory and does not remember who he is. From a book about the history of chess, he learns that Anand is the 15th world chess champion. But Anand is quite sure that he himself never even played in a world championship. Hence, Anand believes both ‘I am not the 15th world chess champion’ and ‘Anand is the 15th world chess champion’. On the possible worlds account of content endorsed by Stalnaker, the former of these beliefs is true in all and only those worlds in which Anand is not the 15th world chess champion. Accordingly, the two beliefs are contradictory. But this does not seem right since, from Anand’s perspective, there is a clear sense in which his beliefs could both be true. Stalnaker’s solution is to ascribe to the subject the diagonal proposition of one of the beliefs in such cases. In the case at hand, one option is to reinterpret the belief of Anand’s that he would express by saying ‘I am not the 15th world chess champion’, such that he in fact believes not the horizontal proposition, that is, the 2-intension associated with this utterance, but rather the 1-intension associated with it, that is, the diagonal proposition. This belief is compatible with Anand being the 15th world chess champion because there are, for instance, worlds in which the amnesiac reading a book about chess history is not Anand, or has simply never played a match for the world championship. By ascribing the diagonal proposition to Anand, one can thus escape the undesirable conclusion that his belief state is inconsistent.

We saw above that Stalnaker’s 2D account can be applied to a posteriori necessities, such as ‘Hesperus = Phosphorus’. Stalnaker (2001, 155) suggests that his account can provide a general explanation for this phenomenon. This is a surprising claim, since the explanation he offers for the informativeness of ‘Hesperus = Phosphorus’ and other necessary a posteriori truths can be applied just as well to necessary a priori sentences, such as the following sentence that states Fermat’s last theorem: ‘No three positive integers a, b, and c satisfy a^n + b^n = c^n for any n greater than 2’. It is very plausible that mathematical truths, such as Fermat’s last theorem, are necessary. At the same time, it is intuitively obvious that the above sentence can be informative for a subject. On Stalnaker’s account, this is explained in the same way that the informativeness of necessary a posteriori truths is explained, by the fact that the diagonal proposition expressed by the above sentence is contingent. Intuitively, one might think that the informativeness of necessary a priori truths and that of necessary a posteriori truths are different kinds of phenomena that demand different explanations. However, whether one should consider this as a problem for Stalnaker’s account depends on one’s theoretical commitments. Stalnaker himself is skeptical about the existence of a priori truths. From his perspective, there is thus no deeper theoretical reason to provide structurally different explanations for the informativeness of, say, a statement of Fermat’s last theorem on the one hand and of ‘Hesperus = Phosphorus’ on the other.

One may have doubts that it is always adequate to ascribe diagonal propositions in cases that concern seemingly informative necessary (or necessarily false) statements or contents. For instance, is Anand’s belief, expressed by ‘I am not the 15th world chess champion’, really about the truth-value of this particular expression? Similarly, is the information a subject acquires upon hearing an utterance of ‘Hesperus = Phosphorus’ really metalinguistic? Notice, however, that the information conveyed in cases that involve the ascription of diagonal propositions need not be (at least not purely) metalinguistic. For instance, in the case discussed above, the speaker managed to convey to the hearer that the BOE is identical to the BOM by uttering ‘Hesperus = Phosphorus’. This is enabled by their common knowledge, in particular by the fact that in all the worlds in the context set, the BOE and the BOM are called ‘Hesperus’ and ‘Phosphorus’, respectively. This illustrates how diagonal contents can capture ordinary object-level information. Nevertheless, diagonal propositions do involve metalinguistic information, and their transmission in communication reflects a kind of ignorance of meaning or content on the side of the hearer. To motivate the view that such ignorance is quite common, it is useful to consider it in the context of Stalnaker’s general approach to linguistic meaning and mental content, which is externalist. The accounts of Kaplan, Evans, and Davies & Humberstone (and even more so the accounts discussed in the following section) can be interpreted as attempts to at least partially retain an internalist type of meaning or content (in the form of a 1-intension or a 2D-intension). Stalnaker (2004) rejects this interpretation of 2D semantics. In his view, there is no viable internalist component of meaning or content, inter alia because he believes that one needs to appeal to features of the external world to obtain determinate content. From a purely externalist perspective, it is to be expected that even competent speakers often lack knowledge of the meaning of the expressions they use. It therefore makes sense to characterize the information they gain from utterances that express necessary propositions as (partly) metalinguistic, that is, as information about the meanings of certain expressions. But of course, this externalist viewpoint is contested. In the next section, we will consider a very different account of meaning and content, and accordingly, a very different interpretation of 2D semantics.

b. Epistemic 2D Semantics (Chalmers, Jackson)

Epistemic 2D semantics is a particularly ambitious, and also particularly controversial, theory that relies on an epistemic understanding of 1-intensions. These 1-intensions are supposed to capture the role of linguistic expressions for a subject’s reasoning and in a subject’s cognition more generally, and thus serve as the basis for a general internalist semantics. Epistemic 2D semantics also provides general explanations for the occurrence of contingent a priori truths and necessary a posteriori truths, in a way that promises to retain systematic a priori access to modality. The two main proponents of epistemic 2D semantics are David Chalmers and Frank Jackson, who have defended the account in a great number of writings (for example, Chalmers 2004; Jackson 2004).

Epistemic 2D semantics is based on the idea that there are two ways of considering a possible world: One can consider it as actual or as counterfactual. Putnam’s Twin Earth scenario serves to illustrate this distinction. Assume first that the scenario does not represent the state of our world and the planet we live on. In the actual world, the odorless, drinkable substance in our rivers and lakes is H2O, and hence the Twin Earth scenario represents a way the world is not, but could have been. Considering the Twin Earth world as counterfactual in this way lends plausibility to the view that the substance on Twin Earth is not water, because its molecular structure differs from that of the substance in our rivers and lakes. However, one can also consider Putnam’s scenario in a different way. To do this, suppose that the scenario describes the actual world. That is to say, suppose that what you have been told about the molecular structure of the odorless, drinkable substance in our rivers and lakes is wrong. The stuff that we drink every day, that comes out of our faucets, that we call ‘water’, and so forth, is really XYZ. It seems natural to say that under this assumption, one should conclude that water is XYZ. As Chalmers often puts it: If it turns out that the watery stuff (that is, the odorless, drinkable substance in our rivers and lakes) is XYZ, then water is XYZ. In epistemic 2D semantics, these two ways of considering possible worlds are used to define two intensions:

A primary intension is a function from possible worlds considered as actual to extensions.

A secondary intension is a function from possible worlds considered as counterfactual to extensions.

The distinctive claim of epistemic 2D semantics is that every linguistic expression that is eligible for having an extension has both a primary intension and a secondary intension. Note that one can also define the epistemic version of a 2D-intension, as follows:

An epistemic 2D-intension is a function from pairs of possible worlds considered as actual and possible worlds considered as counterfactual to extensions.

Secondary intensions are closely related to the standard notion of modality, that is, to what Evans called ‘superficial modality’ and what, following Kripke (1980), is today usually called ‘metaphysical modality’: A sentence is metaphysically necessary if and only if it has a necessary secondary intension, that is, if and only if its secondary intension outputs True with respect to every world. Primary intensions, on the other hand, are closely connected to apriority: One of the key theses of epistemic 2D semantics is that a sentence is a priori if and only if it has a necessary primary intension. The worlds involved in primary intensions thus represent epistemic possibilities, that is, ways the world could be for all one can know a priori. This thesis is based on the idea that primary intensions are a priori accessible, which can be motivated as follows. To consider a possible world as actual, one may need to bracket one’s empirical knowledge, such as one’s knowledge that the substance in our rivers and lakes is H2O. But, most importantly for epistemic 2D semantics, one does not need empirical knowledge to determine the extensions of one’s expressions with respect to worlds considered as actual. This is because any lack of empirical knowledge is ignorance of features of the actual world, and such ignorance is irrelevant if one assumes that the world one is considering is the actual world: In considering a possible world as actual, only information that is hypothetically assumed is brought to bear. The question of how the information about these possible worlds is presented to a subject, such that it is sufficient to determine the extensions of the expressions she uses, is discussed in more detail below.

Epistemic 2D semantics assigns intensions to linguistic tokens. One obvious reason for this is that, as with other kinds of 2-intensions, secondary intensions can vary between different linguistic tokens of the same type, for instance, when indexical expressions are involved. A less obvious reason is that, as we will see, primary intensions can also vary between tokens of the same type. The worlds involved in primary intensions are centered worlds. Again, this can be motivated by their usefulness in dealing with indexical expressions, in the way described in §§ 1.c and 2.a.

In what follows, the most important philosophical implications of epistemic 2D semantics will be discussed. Epistemic 2D semantics has been used to defend modal rationalism, that is, the view that we have a priori access to what is possible or necessary (compare § 3.b.i.). Another important claim made by proponents of epistemic 2D semantics is the thesis of scrutability, according to which all truths can be derived a priori from a narrowly constrained description of the world (compare § 3.b.ii.). Based on these epistemic theses, proponents of epistemic 2D semantics have argued that philosophical practice involves (or even has to involve) a central a priori element (compare § 3.b.iii.).

i. Modal Rationalism

According to epistemic 2D semantics, all expressions that are eligible for having an extension have both a primary and a secondary intension. Given the connections between primary intensions and apriority on the one hand, and between secondary intensions and metaphysical modality on the other, this implies that the 2D structures of all contingent a priori truths and of all necessary a posteriori truths can be described as follows:

(Contingent a priori) A sentence S is contingent a priori if and only if S has a necessary primary intension and a contingent secondary intension.

(Necessary a posteriori) A sentence S is necessary a posteriori if and only if S has a contingent primary intension and a necessary secondary intension.
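
These two biconditionals can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: the worlds are uncentered, the two worlds and their facts are hypothetical (the pairing of Twin Earth with a different zipper inventor is arbitrary), ‘Julius’ is treated as rigidly naming the inventor of the zipper in the world considered as actual, and all function names are invented for illustration.

```python
from typing import Callable, Dict

# Hypothetical toy model: two uncentered worlds with assumed facts.
WORLDS: Dict[str, Dict[str, str]] = {
    "earth": {"watery_stuff": "H2O", "zipper_inventor": "Judson"},
    "twin": {"watery_stuff": "XYZ", "zipper_inventor": "Annie"},
}
ACTUAL = "earth"

# A 2D-intension maps (world considered as actual, world considered as
# counterfactual) to an extension; for sentences, a truth-value.
TwoDIntension = Callable[[str, str], bool]


def water_is_h2o(actual: str, counterfactual: str) -> bool:
    """'Water = H2O': 'water' rigidly picks out the watery stuff of the actual world."""
    return WORLDS[actual]["watery_stuff"] == "H2O"


def julius_invented_zipper(actual: str, counterfactual: str) -> bool:
    """'Julius invented the zipper', with 'Julius' fixed by the actual world."""
    julius = WORLDS[actual]["zipper_inventor"]
    return julius == WORLDS[counterfactual]["zipper_inventor"]


def primary(s: TwoDIntension) -> Dict[str, bool]:
    """Primary intension: evaluate each world considered as actual."""
    return {w: s(w, w) for w in WORLDS}


def secondary(s: TwoDIntension, actual: str = ACTUAL) -> Dict[str, bool]:
    """Secondary intension: hold the actual world fixed, vary the counterfactual world."""
    return {w: s(actual, w) for w in WORLDS}


def classify(s: TwoDIntension) -> str:
    """Apply the two biconditionals above to a sentence that is true in the actual world."""
    modal = "necessary" if all(secondary(s).values()) else "contingent"
    epistemic = "a priori" if all(primary(s).values()) else "a posteriori"
    return f"{modal} {epistemic}"


print(classify(water_is_h2o))
print(classify(julius_invented_zipper))
```

In this model, ‘Water = H2O’ has a contingent primary intension and a necessary secondary intension, while ‘Julius invented the zipper’ has a necessary primary intension and a contingent secondary intension.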

On the standard construal, the worlds involved in primary and secondary intensions are the same, the only difference being that the worlds involved in primary intensions are centered. Any kind of divergence between epistemic and metaphysical modality can thus occur only when expressions are involved whose primary and secondary intensions yield different extensions with respect to some worlds, and this difference in extensions, in turn, must be due to the fact that it makes a difference whether the world in question is considered as actual or as counterfactual. The example of ‘water’, discussed above, suggests that this can indeed make a difference: If one considers the Twin Earth scenario as actual, the substance in its rivers and lakes falls under the extension of ‘water’, but if one considers it as counterfactual, it does not. But one might still wonder why an expression’s extension with respect to some possible world should depend on the way one considers this world. To explain this, it is helpful to reconsider ‘actually’-involving expressions, such as the following:

(4) The actual inventor of the zipper is male.

Let our world, in which Whitcomb Judson invented the zipper, be w@, and let w1 be a world in which his wife Annie invented the zipper. If one considers w1 as actual, then (4) is false with respect to it, because the inventor of the zipper in w1 is Annie. If, however, one assumes that w@ is the actual world and thus considers w1 as a counterfactual world, then (4) is true with respect to w1, because the occurrence of the term ‘actual’ makes the truth-value of (4) depend on features of the actual world, that is, w@. Inter alia, its truth-value depends on features of Whitcomb Judson, since he invented the zipper in w@. Davies & Humberstone already suggested that the occurrence of expressions that either explicitly or implicitly refer to features of the actual world can give rise to both contingent a priori truths and necessary a posteriori truths. Epistemic 2D semantics generalizes this idea. Accordingly, all cases in which primary and secondary intensions diverge involve expressions that depend in some way on the actual world.

Since primary intensions involve the same worlds as secondary intensions, every epistemic possibility corresponds to a metaphysical possibility. Chalmers puts this as follows:

Metaphysical Plenitude: For all S, if S is epistemically possible, there is a centered metaphysically possible world that verifies S. (Chalmers 2006, 82)

(Roughly speaking, that a world w verifies an epistemic possibility S means that w makes S true provided that w is considered as actual.) Metaphysical plenitude may still seem like a surprising claim, since the existence of a posteriori necessities implies that there are epistemic possibilities that are metaphysically impossible. So how can a metaphysical impossibility, such as ‘water = XYZ’, be verified by a metaphysically possible world? The world that verifies ‘water = XYZ’ can be described as follows: It is a world in which the odorless, drinkable substance in rivers and lakes, that is, the watery stuff, is XYZ. Such a world is clearly metaphysically possible. However, if one describes it as a world in which water = XYZ, this misleadingly suggests that the world described is one in which the substance that is the watery stuff in the actual world (that is, H2O) is XYZ, due to the actuality-dependence of the term ‘water’. The latter scenario is indeed metaphysically impossible. To avoid any confusion, one should thus describe the possibility in question by using expressions that do not involve any actuality-dependence, that is, expressions whose primary and secondary intensions cannot come apart.

To sum up, epistemic 2D semantics provides the following account of our epistemic access to metaphysical modality. Both contingent a priori truths and necessary a posteriori truths are explained by the occurrence of expressions whose extension with respect to some worlds depends on whether these worlds are considered as actual or as counterfactual. Such expressions must involve some kind of explicit or implicit reference to the actual world, such that their 2-intensions vary depending on the actual world’s characteristics. This explanation of necessary a posteriori truths allows that whenever some hypothesis cannot be ruled out a priori, that is, whenever it is epistemically possible, there is a metaphysical possibility that corresponds to it. To correctly identify this metaphysical possibility, one just needs to make sure to describe it by using only expressions whose primary and secondary intensions cannot come apart. Because it postulates that we have a priori access to metaphysical modality, this account is often called ‘modal rationalism’.

Since philosophical inquiry is traditionally thought to proceed a priori, and since philosophy very often deals with sentences that are necessarily true (or necessarily false), it would be of great philosophical importance if modal rationalism could be established. However, the epistemic 2D account of modal knowledge is highly controversial. Objections to the account can be divided into two categories. Objections of the first category state that epistemic 2D semantics does not successfully explain the standard examples of a posteriori necessities, such as ‘Water = H2O’ and ‘Hesperus = Phosphorus’—or at least not all of them. It seems hard to deny that if epistemic 2D semantics accurately captures the semantics of the expressions involved, then its explanation of a posteriori necessities is compelling. The crucial question is thus whether the relevant semantic account is indeed accurate. The most pressing issue here is whether all the relevant expressions—including, for instance, proper names—really have primary intensions. This issue is discussed in the next section.

According to objections of the second category, there are other kinds of necessary truths that are different from the cases that have usually been discussed and that cannot be explained in the same way. One example sentence that has been brought up in this context is ‘God exists’. It has been argued that God exists necessarily. However, it is plausibly not a priori that God exists, and it is also plausible that the sentence does not exhibit any actuality-dependence. If all of this is correct, then ‘God does not exist’ is epistemically possible, but it is not verified by any metaphysically possible world. The example is of course highly controversial—the vast majority of philosophers do not believe that there is a necessarily existing God. But it illustrates what general form a counterexample of the type at issue would need to have. Modal rationalists have argued that there are general reasons to deny that such epistemic possibilities that do not correspond to metaphysical possibilities exist (for example, Chalmers 2009).

ii. The Scrutability of Truth

Primary and secondary intensions are defined in terms of how one considers a possible world—as actual or as counterfactual. But it is not obvious what it means to consider a possible world. Since one cannot perceive merely possible worlds, it is natural to assume that they are given to us via a description. Chalmers has explained in detail what such descriptions, which he calls “canonical descriptions”, should involve (for example, Chalmers 2006, 86–93). To begin with, a canonical description has to be complete, in the sense that it must not leave out any facts that might be relevant to the extension of some expression. However, completeness should not be achieved at the cost of triviality. Suppose, for instance, that one wonders whether a sentence S is true with respect to world w considered as actual. If S is part of the canonical description of w, then it is trivial that one can derive the extension of S a priori. According to Chalmers, the canonical description of a world should thus involve only a limited vocabulary. Furthermore, there are constraints on the kinds of expressions that may be used in a canonical description. For instance, ‘There is water in rivers and lakes’ should come out as true with respect to a world in which the only watery stuff is XYZ if that world is considered as actual, and it should come out as false with respect to such a world if that world is considered as counterfactual. However, if the canonical description of this world involves the word ‘water’, then this difference between the sentence’s primary and its secondary intension cannot be maintained, on pain of contradiction (compare Chalmers 2006, 86). Therefore, the limited vocabulary in which canonical descriptions are phrased should involve only expressions that are semantically neutral, that is, expressions whose primary and secondary intensions cannot come apart.
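The contrast between considering a world as actual and considering it as counterfactual can be made concrete with a toy model. The following Python sketch is only an illustration under simplifying assumptions: the two worlds, the dictionary keys, and all function names are hypothetical and are not part of Chalmers’s framework. Each world is described in neutral vocabulary (‘watery stuff’, chemical kinds) and never by means of the word ‘water’ itself, in line with the neutrality constraint just described.

```python
# Toy two-dimensional evaluation of 'There is water in rivers and lakes'.
# World names, keys, and function names are illustrative assumptions.

# Each world is described in neutral terms only (no use of the word 'water').
WORLDS = {
    "earth": {"watery_stuff": "H2O", "in_rivers_and_lakes": {"H2O"}},
    "twin_earth": {"watery_stuff": "XYZ", "in_rivers_and_lakes": {"XYZ"}},
}


def water_reference(world_as_actual: str) -> str:
    """'water' picks out the watery stuff of the world considered as actual."""
    return WORLDS[world_as_actual]["watery_stuff"]


def sentence_true(world_as_actual: str, world_of_evaluation: str) -> bool:
    """One cell of the two-dimensional matrix for the sentence."""
    substance = water_reference(world_as_actual)
    return substance in WORLDS[world_of_evaluation]["in_rivers_and_lakes"]


def primary_intension(world: str) -> bool:
    """Evaluate the sentence at a world considered as actual (the diagonal)."""
    return sentence_true(world, world)


def secondary_intension(world: str, actual: str = "earth") -> bool:
    """Evaluate at a world considered as counterfactual, holding the actual world fixed."""
    return sentence_true(actual, world)


if __name__ == "__main__":
    print(primary_intension("twin_earth"))    # XYZ world considered as actual
    print(secondary_intension("twin_earth"))  # XYZ world considered as counterfactual
```

In this sketch the sentence comes out true at the XYZ world considered as actual and false at the same world considered as counterfactual, while both intensions agree at the actual world.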

According to epistemic 2D semantics, primary intensions—that is, the extensions of our expressions with respect to worlds considered as actual—are a priori. Given this, the understanding of what it is to consider a world as actual in terms of canonical descriptions just explained leads to another central thesis of epistemic 2D semantics, the scrutability of truth. Here is one formulation of this thesis that concerns the actual world. (Notice that with respect to the actual world, primary and secondary intensions always yield the same extensions.)

(Scrutability of Truth) There is a description of the world in a limited and semantically neutral vocabulary from which every truth can be derived a priori.

Chalmers & Jackson (2001) make a specific proposal as to what such a description could look like. They argue that all truths follow a priori from a description they call ‘PQTI’, for Physics, Qualia, That’s all, and Indexicals. P is a complete microphysical description of the world, in the language of a completed future physics. Q is a complete description of the phenomenal states of all subjects, that is, of their subjective experiences. The component I adds indexical information, which picks out a subject and a time, in order to determine the truth-values of sentences such as ‘I am a chess player’ or ‘Today is Tuesday’. Finally, T is a totality clause, which states that this is all there is in the world. This clause is necessary to rule out things not entailed by PQI. Suppose, for instance, that there are no ghosts, and thus, ‘There are no ghosts’ is true. PQI does not entail the truth of this sentence, because it does not state that PQI provides a complete description of the world. The inclusion of T adds this information. Notice that, plausibly, ghosts could have existed. This illustrates that microphysical, phenomenal, and indexical information plus a totality clause is insufficient to derive canonical descriptions of many other possible worlds. To describe these, one may thus need to expand one’s vocabulary.
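The role of the totality clause can be illustrated by a rough analogy with a closed-world assumption. The Python sketch below is a deliberately crude model and not Chalmers and Jackson’s own formulation: the set of listed kinds and the function name are hypothetical. It shows only that a list of positive facts leaves a negative existential such as ‘There are no ghosts’ open until a ‘that’s all’ clause is added.

```python
from typing import Optional

# Hypothetical stand-in for the positive facts stated by a description like PQI.
LISTED_KINDS = {"electron", "quark", "chess player"}


def exists(kind: str, totality_clause: bool) -> Optional[bool]:
    """Return True or False if the description settles the matter, None if it is left open."""
    if kind in LISTED_KINDS:
        return True
    # Without a "that's all" clause, silence about a kind settles nothing.
    # With the clause, whatever is not listed does not exist.
    return False if totality_clause else None


if __name__ == "__main__":
    print(exists("ghost", totality_clause=False))  # left open by the PQI-like list
    print(exists("ghost", totality_clause=True))   # settled once T is added
```

The sketch treats existence as simple membership in a list, which is far cruder than a priori entailment from a complete microphysical and phenomenal description; it is meant only to show why the clause does logical work.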

The scrutability thesis brings out an important commitment of epistemic 2D semantics. According to this account, all of our expressions have a priori associations that are extensive enough to determine the expressions’ extensions. In fact, epistemic 2D semantics even entails that these a priori associations determine the extensions of our expressions with respect to every world considered as actual. However, there are highly influential arguments, originating in Kripke (1980), that seem to show that some kinds of expressions—most notably proper names—have no such a priori associations. The epistemic arguments suggest that everything a speaker associates with a name could turn out to be false in the light of additional empirical information—one might thus call these arguments ‘arguments from empirical defeasibility’. For instance, even our most central beliefs about Kurt Gödel—for instance, that he discovered the incompleteness of arithmetic—could be empirically defeated if, say, we got compelling evidence that the incompleteness proofs were developed by a man named ‘Schmidt’ and then later stolen and published by Gödel (Kripke 1980, 83f.). Thus, not even ‘discoverer of the incompleteness of arithmetic’ is a priori associated with the name ‘Kurt Gödel’, and it is hard to see what else could be. The epistemic arguments are compatible with the view that speaker associations determine the extension of our expressions—after all, Gödel is the unique discoverer of the incompleteness of arithmetic, even though it is not a priori that he is. However, the semantic arguments, also called ‘arguments from ignorance and error’, suggest that speaker associations need not suffice to determine the (correct) extension of an expression. Speaker associations might fail to do so in a given case either because they are insufficiently specific (in a case of ignorance), or because the speaker associations attribute features the referent does not have (in a case of error). 
The following examples due to Kripke exemplify these cases. First, a case of ignorance: Many people know that Feynman and Gell-Mann are physicists. But they know nothing to distinguish Gell-Mann from Feynman. Nevertheless, when these people say ‘Feynman’, they refer to Feynman, and when they say ‘Gell-Mann’, they refer to Gell-Mann. Second, a case of error: The only thing many people believe about Einstein is that he invented the atomic bomb. Nevertheless, when these people use the name ‘Einstein’, they refer not to Oppenheimer or Szilard, but to Einstein (Kripke 1980, 81). The example illustrates that the semantic arguments go one step further than the epistemic arguments: While cases like that of Gödel and Schmidt aim only to show that what speakers associate with a name could be erroneous, for all the speaker knows a priori, the case of Einstein is supposed to show that these associations are sometimes erroneous.

Similar kinds of arguments have been given concerning other kinds of expressions, such as natural kind terms (for example, Putnam 1975, 226). In order to defuse them, proponents of epistemic 2D semantics have to make a case that there are nevertheless speaker associations that are sufficient to determine an expression’s extension (including the extension with respect to other possible worlds considered as actual), and that these associations are also a priori. To do this, they have pointed out that the associations that determine primary intensions need not correspond to what first comes to a speaker’s mind. For example, in the case of a proper name such as ‘Gödel’, it is indeed plausible that ‘discoverer of the incompleteness of arithmetic’ is not part of the primary intension ordinary speakers associate with the name. Both Chalmers (2002, 617; 2003, 62–64; 2006, 91) and Jackson (1998b, 209–212; 2004, 270–271) suggest that speaker associations often concern other people’s usage of an expression. Accordingly, the primary intension of ‘Gödel’ could be approximately equivalent to the definite description ‘the individual called ‘Gödel’ by those from whom I acquired the name’. Other expressions can be treated in the same kind of way. If a speaker knows, for instance, that Gödel is called ‘Gödel’ by those from whom they acquired the name, then this proposal suffices to repudiate the arguments from ignorance and error. If the associations at issue are also a priori, then the proposal suffices to repudiate the arguments from empirical defeasibility. The claim that such associations are a priori is especially controversial, however, and will require further investigation.

Even if one accepts that linguistic expressions have primary intensions, and thus speaker associations that determine their extensions with respect to the actual world and with respect to other worlds considered as actual, one may still have doubts about the scrutability of truth. For instance, it seems extremely bold to claim that microphysical, qualitative, and indexical information, in conjunction with a ‘that’s all’ clause, is sufficient to derive a priori that Anand is a chess player—in any case, it is far beyond anyone’s cognitive capacities to perform such a derivation. Chalmers and Jackson try to explain in broad outline how certain kinds of truths, such as ordinary macroscopic truths like ‘Water covers most of the Earth’, are in principle a priori derivable from PQTI (Chalmers & Jackson 2001; Chalmers 2012). Another way of arguing for the scrutability thesis is by appealing to modal rationalism. It is plausible that the information in PQTI metaphysically determines all truths, in the sense that there is no other metaphysically possible world described by PQTI with respect to which some sentence has a different truth-value. More generally, the majority of philosophers believe that all facts are metaphysically determined by the facts in a small number of domains—most prominently, physical facts (according to physicalists), combined with some irreducible mental facts (according to dualists), and possibly some few other kinds of facts (perhaps including normative facts). Given this, if one adds to these facts indexical information and a ‘that’s all’ clause—for reasons explained above—a complete description stating these facts will metaphysically determine all truths. Now assume that some world w is considered as actual and described in terms of a semantically neutral vocabulary. 
If the explanation of necessary a posteriori truths provided by epistemic 2D semantics is correct and complete, then everything that is metaphysically determined by the features given in the description of w is also epistemically determined, that is, a priori entailed. If one adds to this the assumption that our world has relatively few fundamental ingredients, and that one thus needs only a limited vocabulary to describe it, the scrutability of truth follows: All truths are a priori entailed by a complete description of the world in a limited and semantically neutral vocabulary.

iii. Philosophical Methodology

Epistemic 2D semantics has sparked many debates that revolve around the question whether there is an essential a priori component to philosophical inquiry. One instance of this general issue that has received particularly close attention concerns the issue of physicalism in the debate about the mind-body problem. Physicalism involves the claim that all facts about our world are metaphysically determined by the collection of physical facts. Both Chalmers (2009) and Jackson (2005) have argued that the truth of physicalism would entail that mental facts follow a priori from physical facts. Their claim is motivated by modal rationalism, which entails that metaphysical determination amounts to epistemic determination, given that the world is considered as actual and described in semantically neutral vocabulary. But many have argued that phenomenal facts are not a priori entailed by physical facts. Hence, modal rationalism seems to raise a problem for physicalism. (Notice that while Chalmers uses the epistemic 2D account to argue against physicalism, Jackson endorses physicalism and hence believes that there is an a priori entailment between physical and phenomenal facts.) Many physicalists have resisted the idea that they are committed to the existence of an a priori entailment between the physical and the mental. In making their case, they have either appealed to some special features of phenomenal concepts, or they have resisted modal rationalism on more general terms. At this point, no consensus regarding this issue is in sight.

Proponents of conceptual analysis hold that we can gain philosophical knowledge in virtue of our grasp of philosophical concepts and our understanding of philosophical expressions. Semantic externalism presents a challenge for this view, since externalism seems to imply that linguistic understanding and concept possession need not involve any significant knowledge about meaning. Now, one of the key claims of epistemic 2D semantics is that, against standard externalist views, our expressions come with a priori associations that determine the expressions’ extensions—that is to say, expressions have primary intensions. It is natural to suggest that these a priori associations can underpin the method of conceptual analysis. And indeed, Chalmers and Jackson have used insights from 2D semantics to defend this traditional philosophical method (Jackson 1998a; Chalmers & Jackson 2001). As they point out, conceptual analysis often involves thought experiments. A famous example is Edmund Gettier’s contribution to the analysis of knowledge. To undermine the claim that knowledge is justified true belief, Gettier (1963) describes two hypothetical cases in which a subject has a justified true belief, but no knowledge. If the central claims of epistemic 2D semantics are tenable, they can serve as a theoretical underpinning for this practice of doing conceptual analysis via thought experiments. If modal rationalism is correct, any hypothetical scenario that is epistemically possible corresponds to a metaphysical possibility. Furthermore, if the scrutability thesis is correct, this can explain our ability to determine the extension of expressions, such as ‘knowledge’, with respect to a hypothetical scenario.

For conceptual analysis to be fruitful as a philosophical method, the speaker associations that constitute primary intensions need to be shared with regard to at least some of those expressions that are subject to philosophical scrutiny. Otherwise, a philosopher engaged in conceptual analysis may acquire knowledge, but this knowledge will be based only on the primary intensions that this particular individual associates with the relevant expressions, and thus not be readily shareable with the philosophical community. Jackson (1998b; 2004) has argued that primary intensions are indeed usually shared among competent speakers in a linguistic community, along the following lines. Language is primarily used to convey information to others, that is, to communicate. This is possible only if linguistic expressions can be used to represent the state of the world, which requires there to be associations between words and properties. Furthermore, for speakers to be able to make use of this information, these associations between words and properties must be known to them. Jackson holds that the known associations between words and properties are primary intensions. On this view, it would be of little use if different speakers within a linguistic community ‘knew’ different associations between words and properties. Hence, primary intensions must be shared.

It seems plausible that it would be a serious hindrance for communication if the associations between words and properties differed greatly between subjects. But it is doubtful that successful communication requires perfect alignment of these associations. And indeed, Jackson concedes that often, communication can succeed if a speaker’s and a hearer’s primary intensions are sufficiently close to identical (Jackson 1998b, 214f.). However, this concession raises a worry regarding conceptual analysis. The cases discussed in conceptual analysis are often highly contrived and not very relevant to ordinary communication. Their evaluation is, however, often crucial for the evaluation of the underlying philosophical issues. Jackson’s argument thus leaves it open that there are often, or even always, divergences in the primary intensions associated by different speakers that, while irrelevant to ordinary communication, become apparent in the evaluation of philosophical thought experiments. During the early 21st century, experimental philosophers have collected a wealth of empirical data about subjects’ intuitions concerning philosophically relevant hypothetical cases. The results of these wide-ranging studies are varied and not easily summarized. In some cases, they have confirmed the philosophical consensus; in others, they have not.

4. References and Further Reading

a. Primary Sources

  • Chalmers, David. 2002. The Components of Content. In D. Chalmers (ed.), Philosophy of Mind: Classical and Contemporary Readings. Oxford: Oxford University Press, 608–633.
    • Motivates a 2D account of mental content.
  • Chalmers, David. 2003. The Nature of Narrow Content. Philosophical Issues 13(1), 46–66.
    • Argues that primary intensions can serve as a type of mental content that is determined by a subject’s intrinsic state.
  • Chalmers, David. 2004. Epistemic Two-Dimensional Semantics. Philosophical Studies 118(1–2), 153–226.
    • An extensive elaboration and defense of epistemic 2D semantics.
  • Chalmers, David. 2006. The Foundations of Two-Dimensional Semantics. In M. Garcia-Carpintero & J. Macia (eds.), Two-Dimensional Semantics: Foundations and Applications. Oxford: Oxford University Press, 55–140.
    • A survey of 2D theories, but with a focus on whether they yield a connection between 1-intensions and apriority.
  • Chalmers, David. 2009. The Two-Dimensional Argument Against Materialism. In B. McLaughlin & S. Walter (eds.), The Oxford Handbook of Philosophy of Mind. Oxford: Oxford University Press.
    • Articulates the 2D argument against physicalism in detail and gives an overview of the debate.
  • Chalmers, David. 2012. Constructing the World. Oxford: Oxford University Press.
    • A defense of various scrutability theses.
  • Chalmers, David & Frank Jackson. 2001. Conceptual Analysis and Reductive Explanation. The Philosophical Review 110(3), 315–361.
    • Argues that all truths are a priori implied by PQTI, and that physicalists should hold that phenomenal truths are a priori implied by physical truths.
  • Davies, Martin & Lloyd Humberstone. 1980. Two Notions of Necessity. Philosophical Studies 38(1), 1–30.
    • Proposes to capture Evans’s distinction between deep and superficial contingency by means of the sentential operators ‘□’, ‘fixedly’, and ‘actually’.
  • Evans, Gareth. 1979. Reference and Contingency. The Monist 62, 161–189.
    • Explains the occurrence of contingent a priori truths on the basis of the distinction between deep and superficial contingency.
  • Frege, Gottlob. 1892/1952. Über Sinn und Bedeutung. Translated in P. Geach & M. Black (eds.), Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1952.
    • Introduces the distinction between sense and reference, the former of which yields a type of content that is intimately connected to epistemic notions.
  • Gettier, Edmund. 1963. Is Justified True Belief Knowledge? Analysis 23(6), 121–123.
    • Presents two counterexamples against the traditional view that knowledge is justified true belief.
  • Jackson, Frank. 1998a. From Metaphysics to Ethics: A Defence of Conceptual Analysis. Oxford: Clarendon Press.
    • Uses 2D semantics to argue that conceptual analysis plays a vital role in answering philosophical questions.
  • Jackson, Frank. 1998b. Reference and Description Revisited. Philosophical Perspectives 12, Language, Mind, and Ontology, 201–218.
    • A defense of the view that speaker associations determine reference and meaning.
  • Jackson, Frank. 2004. Why We Need A-Intensions. Philosophical Studies 118(1–2), 257–277.
    • Argues that primary intensions capture the representational content of what is communicated from speaker to hearer.
  • Jackson, Frank. 2005. The Case for A Priori Physicalism. In C. Nimtz & A. Beckermann (eds.), Philosophy–Science–Scientific Philosophy. Main Lectures and Colloquia of Gap 5, Fifth International Congress of the Society for Analytical Philosophy. Paderborn: Mentis.
    • Argues that physicalists should hold that mental facts follow a priori from physical facts.
  • Kamp, Hans. 1971. Formal Properties of ‘Now’. Theoria 37(3), 227–274.
    • An early 2D account of the indexical ‘now’.
  • Kaplan, David. 1989. Demonstratives. In J. Almog, J. Perry & H. Wettstein (eds.), Themes from Kaplan. Oxford: Oxford University Press, 481–563.
    • Develops an account of indexicals on the basis of the distinction between character and content.
  • Kripke, Saul. 1980. Naming and Necessity. Cambridge, MA: Harvard University Press.
    • Argues that proper names, indexicals, and natural kind terms are rigid designators and spells out the consequences for the epistemology of modality.
  • Lewis, David. 1979. Attitudes De Dicto and De Se. Philosophical Review 88(4), 513–543.
    • Introduces a theory of mental content that can account for the phenomenon of egocentric belief.
  • Perry, John. 1977. Frege on Demonstratives. Philosophical Review 86(4), 474–497.
    • Argues that demonstrative expressions present a problem for Frege’s philosophy of language, and suggests an alternative account.
  • Putnam, Hilary. 1975. The Meaning of ‘Meaning’. Minnesota Studies in the Philosophy of Science 7, 131–193.
    • Argues for an externalist account of meaning.
  • Stalnaker, Robert. 1978. Assertion. In P. Cole (ed.), Syntax and Semantics 9: Pragmatics. New York, NY: Academic Press, 315–332.
    • Develops an account of communication that introduces 1-intensions as a way of reinterpreting certain kinds of utterances.
  • Stalnaker, Robert. 1987. Semantics for Belief. Philosophical Topics 15, 177–190.
    • Defends a possible world account of mental content, drawing on 2D semantics.
  • Stalnaker, Robert. 2001. On Considering a Possible World as Actual. Aristotelian Society Supplementary Volume 75, 141–156.
    • Defends a metasemantic account of 2D semantics and argues that this account suggests skepticism about apriority.
  • Stalnaker, Robert. 2004. Assertion Revisited: On the Interpretation of Two-Dimensional Modal Semantics. Philosophical Studies 118(1–2), 299–322.
    • Suggests that epistemic 2D semantics is based on an internalist view of intentionality, in contrast with metasemantic 2D semantics, and argues that internalism is untenable.

b. Secondary Sources

  • Chalmers, David. 2002b. Does Conceivability Entail Possibility? In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press, 145–200.
    • Offers a comprehensive statement and defense of modal rationalism.
  • Chalmers, David. 2002c. On Sense and Intension. Philosophical Perspectives 16, 135–182.
    • Argues that epistemic 2D semantics represents a viable account of meaning in the tradition of Frege.
  • Garcia-Carpintero, Manuel & Josep Macia (eds.). 2006. Two-Dimensional Semantics: Foundations and Applications. Oxford: Oxford University Press.
    • A collection of articles on 2D semantics.
  • Jackson, Frank. 2010. Language, Names, and Information. Oxford: Wiley-Blackwell.
    • Argues that the role of proper names in transmitting information is best explained by an account on which their meaning is given by speaker associations.
  • Kipper, Jens. 2012. A Two-Dimensionalist Guide to Conceptual Analysis. Frankfurt a.M.: Ontos.
    • Defends epistemic 2D semantics and discusses the role that conceptual analysis can play on the basis of this account.
  • Nimtz, Christian. 2017. Two-Dimensional Semantics. In B. Hale, C. Wright & A. Miller (eds.), The Blackwell Companion to the Philosophy of Language, 2nd Edition. Oxford: Blackwell, 948–969.
    • A survey of 2D semantic theories, with an emphasis on how they relate to Kripke’s semantic and metasemantic views.
  • Schroeter, Laura. 2017. Two-Dimensional Semantics. In The Stanford Encyclopedia of Philosophy (Summer 2017 Edition), Edward N. Zalta (ed.).
    • A survey of 2D semantic theories that includes an extensive elucidation of Chalmers’s 2D argument against physicalism.

 

Author Information

Jens Kipper
Email: jkipper@ur.rochester.edu
University of Rochester
U. S. A.

Health Care Ethics

Health care ethics is the field of applied ethics that is concerned with the vast array of moral decision-making situations that arise in the practice of medicine, in addition to the procedures and the policies that are designed to guide such practice. Of all the aspects of the human body, and of a human life, that are essential to one’s well-being, none is more important than one’s health. Advancements in medical knowledge and in medical technologies bring with them new and important moral issues. These issues often come about as a result of advancements in reproductive and genetic knowledge as well as innovations in reproductive and genetic technologies. Other areas of moral concern include the clinical relationship between the health care professional and the patient; biomedical and behavioral human subject research; the harvesting and transplantation of human organs; euthanasia; abortion; and the allocation of health care services. Essential to the comprehension of moral issues that arise in the context of the provision of health care is an understanding of the most important ethical principles and methods of moral decision-making that are applicable to such moral issues and that serve to guide our moral decision-making. To the degree that moral issues concerning health care can be clarified, and thereby better understood, the quality of health care, as both practiced and received, should be enhanced.

Table of Contents

  1. A Brief History of Health Care Ethics
  2. Methods of Moral Decision-Making
    1. Virtue Ethics: Aristotle
    2. Utilitarian Theories: Mill
    3. Deontological Theories: Kant
    4. Principlism
    5. Casuistry
    6. Feminist Ethics
    7. The Ethics of Care
  3. Ethical Principles
    1. Autonomy
    2. Beneficence
    3. Nonmaleficence
    4. Justice
  4. Ethical Issues
    1. The Health Care Professional-Patient Relationship
      1. Truth-Telling
      2. Informed Consent
      3. Confidentiality
    2. The Question of a Right to Life
      1. Human Life: Abortion
      2. Human Death: Euthanasia and Physician-Assisted Suicide
    3. Human Subject Research
      1. The Rights of Subjects
      2. Vulnerable Populations
    4. Reproductive and Genetic Technologies
      1. Reproductive Opportunities for Choice
      2. Genetic Opportunities for Choice
    5. The Allocation of Health Care Resources
      1. Organ Procurement and Transplantation
      2. The Question of Eligibility in Health Care
    6. Health Care Organization Ethics Committees
  5. Conclusion
  6. References and Further Reading

1. A Brief History of Health Care Ethics

While the term “medical care” designates the intention to identify and to understand disease states in order to be able to diagnose and treat patients who might suffer from them, the term “health care” has a broader application to include not only what is entailed by medical care but also considerations that, while not medical, nevertheless exercise a decided effect on the health status of people. Thus, not only are bacteria and viruses (which are in the purview of medicine) of concern in the practice of health care, so too are cultural, societal, economic, educational, and legislative factors to the extent to which they have an impact, positive or negative, on the health status of any of the members of one’s society. For this reason, health care workers include not only professional clinicians (for example, physicians, nurses, medical technicians, and many others) but also social workers, members of the clergy, medical facility volunteers, to name just a few, and, in an extended sense, even employers, educators, legislators, and others.

For a person to be considered healthy, in the strictest sense of the term, is for that person to exhibit a state of well-being from which any effects of disease, illness, or injury are absent, whether these would concern the person’s physiological, psychological, mental, or emotional existence. It is fair to say that no one could ever achieve this level of “complete health.” Consequently, the health status of any given person, at any given time, is best understood in terms of the degree to which that person’s health status can be said to approximate this ideal standard of health.

In the preamble to the Constitution of the World Health Organization, “health” is defined as: “…a state of complete physical, mental and social well-being[,] and not merely the absence of disease or infirmity.” This definition of “health” can also be said to embrace an ideal, but it does so by representing health as a positive, rather than as a negative, concept.

Additional distinctions concerning definitions of “health” include that between what is sometimes referred to as a natural, or biological, view of health (and of disease) and a socially constructed view. The former view entails that health, for all natural organisms (to include the biological status of human beings), is to be correlated with the degree to which the natural functions of the organism comport with its natural evolutionary design. On this interpretation, disease is to be correlated with any malfunctions, that is, any deviations of the organism’s natural functions from what would be expected given its natural evolutionary design. The adoption of this view of health by health care practitioners results in identifiable standards, or ranges, of “normalcy” concerning health care diagnostics, such as blood pressure, cholesterol levels, and so forth, the upshot of which is that any deviation from these norms is sufficient to pronounce the patient “unhealthy,” if not “diseased.” By contrast, the socially constructed view of health is determined by some social value(s) such that any deviation from the socially accepted norm, or average, for our species is considered to be a disease or a disability if the deviation is viewed as a disvalue, that is, as something to be avoided. A case in point is the question of whether homosexuality is to be seen as a disease state, specifically, as a mental disorder, as the American Psychiatric Association officially held it to be for much of the 20th century, until it reversed its position in 1973. Based on the association’s own explanations of each of these definitional decisions, it would appear that its former official position was value-based in a way that its later position corrected (Tong, 2012).

Similar distinctions concerning the concept of health, and its resultant definition, include the representation of health as “normative,” as contrasted with a “normal biological functioning” representation. Anita Silvers argues that organizations that set public health policy by their very nature incorporate (even if unconsciously) any of a number of social dimensions of health in their official definitions of “health.” Of course, to do this has practical effects that typically serve the interests of the organization in question. Any definition of “health” that uses a limited standard, and that might be appropriate for some segments of the larger human population to which the definition is being applied, but that of necessity is not reflective of some other of the segments of that same human population might render people in these latter segments of the human population as “pathological,” literally, by definition, despite the fact that with a more objective definition of “health” they would be deemed members of the healthy population.

Moreover, some such organizations implement classification systems that allow for both biological and social considerations to measure health outcomes for the purpose of determining the effectiveness of health care programs when compared to each other. Such comparisons are then used to decide, for example, what type of disease prevention measure(s) to implement or which particular sub-populations get selected for curative measures. According to Silvers, whatever the consensus in any particular society is, concerning what the word “health” designates, determines the health care services to be provided as well as the specific beneficiaries of such services. This conflation of normative and biological factors of consideration in the conceptualization and the ultimate definition of “health” by these organizations that set public health policy leads one to believe that such a definition is exclusively biological, that is, objective, and thereby to be accepted without question (Silvers, 2012).

Michael Boylan surveys a good number and variety of what he calls recent popular paradigms concerning the concept of health, as follows: 1) functional approaches to health, including “objectivism,” as associated with an “uncompromised lifespan,” and the “functionalism/dysfunctionalism” debate; 2) the public health approach to health; and 3) subjectivist approaches to health, which do not restrict themselves to physiological health but focus more broadly on human “well-being.” After demonstrating respects in which each of these approaches to our understanding of health fails, he proposes a “self-fulfillment approach” to human health. Central to this approach, and as a first-order metaethical theory, is the “personal worldview imperative,” which requires each of us to develop a worldview that is both comprehensive and internally coherent but that is also good and one that we would strive to actualize in our daily lives. In other words, according to this imperative, such a worldview must 1) be comprehensive, 2) be internally coherent, 3) connect to a normative ethical theory, and 4) be, at a minimum, aspirational and acted upon. This personal worldview imperative is designed as an independent and objective means of assessment in order to avoid some of the inherent flaws of the well-being approach. In conjunction with what Boylan recommends as a “personal worldview of cooperation” (as a more holistic way of viewing the world), this personal worldview imperative would, arguably, constitute the most comprehensive and objective approach to our understanding of human health (Boylan, 2004 and Boylan, 2012).

Despite the fact that “health care” is a term that reflects the more recent phenomenon of the practice of health care as expanded beyond the practice of medical care, ethical concerns related to health care can be traced back to the beginnings of medical care. While this would take us back to primitive cultures at the time of the origin of human life as we know it, the first known evidence of ethical concerns in the practice of medicine in Western cultures is what has been handed down as the Corpus Hippocraticum, which is a compilation of writings by a number of authors, including a physician known as Hippocrates, over at least a few centuries, beginning in the 5th century, B.C.E., and which includes what has come to be known as the Oath of Hippocrates. According to these authors, medical care should be practiced in such a way as to diminish the severity of the suffering that illness and disease bring in their wake, and the physician should be acutely aware of the limitations concerning the practical art of medicine and refrain from any attempt to go beyond such limitations accordingly. The Oath of Hippocrates includes explicit prohibitions against both abortion and euthanasia but includes an equally explicit endorsement of an obligation of confidentiality concerning the personal information of the patient.

Additional codes of ethics concerning the practice of medicine have also come down to us: from the 1st century A.D., known as the Oath of Initiation, attributed to Caraka, an Indian physician; from (likely) the 6th century A.D., known as the Oath of Asaph, written by Asaph Judaeus, a Hebrew physician from Mesopotamia; from the 10th century A.D., known as Advice to a Physician, written by Haly Abbas (Ahwazi), a Persian physician; from the 12th century A.D., known as the “Prayer of Moses Maimonides,” Maimonides being a Jewish physician in Egypt; from the 17th century A.D., known as the Five Commandments and Ten Requirements, written by Chen Shih-kung, a Chinese physician; from the 18th century A.D., known as A Physician’s Ethical Duties, written by Mohamad Hosin Aghili, a Persian; and many more.

In 1803, Thomas Percival in England published his Medical Ethics: A Code of Institutes and Precepts, Adapted to the Professional Conduct of Physicians and Surgeons, which included professional duties on the part of physicians in private or general practice to their patients. The founding of the American Medical Association in 1847 was the occasion for the immediate formulation of standards for an education in medicine and for a code of ethics for practicing physicians. This Code of 1847 included not only “duties of physicians to their patients” but also “obligations of patients to their physicians,” and not only “duties of the profession to the public” but also “obligations of the public to physicians.” From the 19th century to well into the 20th century, societies or associations of medical doctors formulated and published their own codes of ethics for the practice of medicine.

A good number of medical codes of ethics were formulated and adopted by national and international medical associations during the middle part of the 20th century. In an effort to modernize the Oath of Hippocrates for practical application, in 1948 the World Medical Association adopted the Declaration of Geneva, followed the very next year by its adoption of the International Code of Medical Ethics. The former included, in addition to an enumeration of a physician’s moral obligations to one’s patients, an explicit commitment to the humanitarian goals of medicine. Since then, virtually every professional occupation that is health care-oriented in the U. S. has established at least one association for its membership and a code of professional ethics. In addition to the American Medical Association, there is the American Nurses Association, the American Hospital Association, the National Association of Social Workers, and many others.

2. Methods of Moral Decision-Making

Methods of moral decision-making are concerned, in a variety of ways, not only with moral decision-making but also with the people who make such decisions. Some such methods focus on the actions that result from the choices that are made in moral decision-making situations in order to determine which of such actions are right, or morally correct, and which of such actions are wrong, or morally incorrect. Other methods of moral decision-making concentrate on the persons who commit actions in moral decision-making situations (that is, the agents) in order to determine those whose character is good, or morally praiseworthy, and those whose character is bad, or morally condemnable. The theorists of such methods deal with such questions as: Of all of the available options in a particular moral decision-making situation, which is the morally correct one to choose?; What are the particular virtues of character that, in conjunction, constitute a good person?; Are there certain human actions that, without exception, are always morally incorrect?; What is the meaning of the language used in specific instances of moral discourse, whether practical or theoretical?; What is meant by a specific moral concept?; and many others.

What follows is a look at some of the most influential methods of moral decision-making that have been offered by proponents of such methods and that have been applied to ethical issues in the field of health care.

a. Virtue Ethics: Aristotle

While Aristotle was not the first of the Ancient Greeks to articulate in writing a theory of virtue ethics, his version of virtue ethics, as it has come down to us, has been one of the most influential versions, if not the most influential version of all. According to Aristotle, a person’s character is the determinative factor in discerning the extent to which that person is a good person. To the extent to which a person’s character is reflective of the moral virtues, to that same extent is that person a good person. Moral virtues include but would not be limited to courage, temperance, compassion, generosity, honesty, and justice. The person in whom these moral virtues are to be found as steadfast dispositions can be relied on to exhibit a good character and thereby to commit morally correct actions in moral decision-making situations. For example, a courageous soldier will neither run headlong into battle in the belief that “war is glory” nor run away from the battle out of fear of being injured or killed. The soldier who does the former has chosen to be rash during the heat of battle while the soldier who does the latter has chosen to be a coward. By contrast, the courageous soldier holds his position on the battlefield and chooses to fight when he is ordered to do so. The fundamental difference between the courageous soldier on the one hand and the rash and cowardly soldiers on the other is that, of the three, only the courageous soldier actually knows why he is on the battlefield and chooses to do his duty to defend his comrades, his country, and his family while recognizing, at the same time, the realistic possibility that he might be injured, or even killed, on the battlefield (Aristotle, 1985).

Virtue ethics is directly applicable to health care ethics in that, traditionally, health care professionals have been expected to exhibit at least some of the moral virtues, not the least of which are compassion and honesty. To the extent that the possession of such virtues is a part of one’s character, such a health care professional can be relied on to commit morally correct actions in moral decision-making situations involving the practice of health care.

b. Utilitarian Theories: Mill

The preeminent proponent of utilitarianism as an ethical theory in the 19th century was John Stuart Mill. As a normative ethical theorist, Mill articulated and defended a theory of morality that was designed to prescribe moral behavior for all of humankind. According to Mill’s utilitarian theory of morality, human actions, which are committed in moral decision-making situations, are determined to be morally correct to the extent to which they, on balance, promote more happiness (as much as possible) than unhappiness (as little as possible) for everyone who is affected by such actions. Conversely, human actions, which are committed in moral decision-making situations, are determined to be morally incorrect to the extent to which they, on balance, produce more unhappiness rather than happiness for those who are affected by such actions. Mill hastens to acknowledge that the agent in the moral decision-making situation must count oneself as no more, or less, important than anyone else in the utilitarian calculation of happiness and/or unhappiness.

However, unlike virtually all of his utilitarian predecessors, Mill offered a version of utilitarian ethics that was designed to accommodate many, if not most, of the same ethical concerns that Aristotle had expressed in his version of virtue ethics. In other words, even after the utilitarian calculation of the ratio of happiness to unhappiness, in a particular moral decision-making situation, has identified an option that is deemed to be morally correct, an additional calculation might be in order to determine the ratio of happiness to unhappiness in the event that such an option, in future like cases, would consistently be deemed the appropriate one. If this latter calculation would likely result in a ratio of unhappiness over happiness, then the option in the original case might be rejected (despite its having been recommended by the utilitarian calculation for the original moral decision-making situation). For example, in a moral decision-making situation in which an employed blue-collar worker witnesses a homeless person dropping a twenty-dollar bill on the sidewalk, the utilitarian calculation would recommend, as the morally correct option, to return the twenty-dollar bill to the homeless person rather than to keep it for oneself. However, consider the same exact moral decision-making situation except that, rather than a homeless person dropping a twenty-dollar bill on the sidewalk, the twenty-dollar bill is dropped by a universally known and easily recognizable multi-billionaire. Even if the utilitarian calculation were to determine that the blue-collar worker should keep the twenty-dollar bill for oneself, the additional calculation would involve the question of the likely negative effect of such an action, if repeated in a habitual way, on the agent’s own character over a period of time.

Another possible reason to reject an otherwise recommended option, based on the utilitarian calculation, would be if the same option were to be repeatedly chosen routinely by others in society, as influenced by the action in the original case in question. To the extent that the action in question, if repeated routinely by others in society, would result in unfavorable consequences for the society as a whole, that is, it would run counter to the maintenance of social utility, then the agent in the original moral decision-making situation in which this action was an option should choose to refrain from committing this action. For example, if a prominent citizen of a small town, upon learning that the local community bank was having financial problems due to an unusually bad economy, decided to withdraw all of the money that he had deposited in his accounts with this bank, the utilitarian calculation, applied to this single case alone, would, presumably, sanction such an action. However, precisely because this man is a well-known citizen of this small town, it can be predicted, reasonably, that word of his bank withdrawal would spread throughout the town and would likely cause many, if not most, of his fellow citizens to follow suit. The problem is that if the vast majority of the townspeople did follow suit, then the bank would fail, and everyone in this town would be worse off than before. In other words, this would serve to undermine social utility, and so, once these wider effects are taken into account, the original action would not be recommended by the utilitarian calculation.

As applicable to health care ethics, utilitarian considerations have become fairly standard procedure for large percentages of health care professionals over the past several generations. It is not at all uncommon for decisions to be made, by health care professionals at all levels of health care, on the basis of what is in the best interest of a particular collectivity of patients. For example, officials at the U. S. Centers for Disease Control (CDC) learn of an outbreak of a serious, potentially fatal communicable disease. These officials decide to quarantine hundreds of people in the geographic area in which the outbreak occurred and to mandate that health care professionals across the country who diagnose patients with this same communicable disease must not only take similar measures but also must report the names and other personal information of the affected patients to the CDC. These decisions are, themselves, decisions of moral (if not also legal) decision-making, and these decisions raise additional moral issues. At any rate, the fundamental reason for taking such measures, under the specified circumstances, is for the protection of the health of the citizens in those areas where the outbreaks occurred, but, ultimately, such measures are taken for the protection of the health of American citizens in general, that is, to promote social utility (Mill, 1861).

c. Deontological Theories: Kant

A deontological normative ethical theory is one according to which human actions are evaluated in accordance with principles of obligation, or duty. The most influential of such theories is that of Immanuel Kant, whose categorical imperative, as his fundamental principle of morality, was first formulated as, “Act only on that maxim whereby you can, at the same time, will that it should become a universal law.” In application to any particular moral decision-making situation, the agent is being asked to entertain the question of whether the action that one has chosen to commit is sufficiently morally acceptable to be sanctioned by a maxim, or general principle. In other words, the agent is asked to attempt to universalize the maxim of one’s chosen action such that all rational beings would be morally allowed to commit the same action in relevantly similar circumstances. If this attempt to universalize the maxim were to result in a contradiction, such a contradiction would dictate that the maxim in question cannot be universalized; and if the maxim cannot be universalized, then one ought not to commit the action. Kant asks his reader to consider the case of a man who stands in need of a loan of money but who also knows well that he will not be able to repay such a loan in the appropriate amount of time. The maxim of his action would be: Whenever I find myself in need of a loan of money but know that I am unable to repay it, I shall deceitfully promise to repay the loan in order to obtain the money. To attempt to universalize this maxim, this man would need to entertain a future course of events in which all rational beings would also routinely attempt to act on this same maxim whenever they might find themselves in relevantly similar circumstances. However, as a rational being, this man would come to realize that this maxim could not be universalized because to attempt to do so would result in a contradiction. 
For, if such an action were to become a routine practice, on the part of all rational beings in relevantly similar circumstances as those in this man’s case, then those who loan money (either as loan officers for financial institutions or as private financiers themselves) would almost immediately wise up to the fact that people are routinely attempting to borrow money on deceitful promises, that is, with no intention to repay such loans. Thus, the loaning of money would, at least temporarily, come to a halt. As Kant points out to his reader, because of the contradiction involved in attempting to universalize this maxim, neither the promise (deceitful as it is) itself nor the end to be achieved by the promise (that is, the loan of money) would be realizable. So, the fact that a contradiction results from the attempt to universalize the maxim reveals the impossibility of the maxim being able to be universalized, and because the maxim cannot be universalized, then the man ought not to commit the action.

Another formulation of the same categorical imperative reads, “Act in such a way that you treat humanity, whether in your own person or in the person of any other, never as a means only, but always as an end.” With this formulation, Kant is calling attention to his belief that all rational beings are capable of exhibiting a “good will,” which he claims is the only thing in the universe that has intrinsic value, that is, inherent value, and because a good will can only be found in rational beings, they have a singular type of dignity that must always be respected. In application to any specific moral decision-making situation, the agent is being asked to respect rational beings as valuable in, and for, themselves, or as ends in themselves, and, thereby, to commit to the principle never to treat a person (either oneself or any other) as merely a means to some other end. To apply this formulation of the categorical imperative to the same example as before is to realize that, once again, one ought not to make a deceitful promise. For, to make a deceitful promise to repay a loan of money in an effort to obtain such a loan is to treat the person to whom such a promise is made as a means only to the end of obtaining the money. To be faithful to this formulation of the categorical imperative is to never commit any action that treats any person as a means only to some other end (Kant, 1989).

Deontological theories, in general, and Kant’s categorical imperative (in either of these two formulations), in particular, can be applied to any number of issues in the practice of health care. For example, if a patient who had been prescribed an opioid for only a short period of time, post-surgery, were to contemplate whether to feign the continued experience of pain during the follow-up visit with the surgeon in an effort to obtain a new prescription for the same opioid in order to abet the opioid addiction of a friend, then the patient would be attempting to treat the surgeon as a means only to another end. Because any attempt to universalize the maxim of such an action would result in a contradiction, Kant’s categorical imperative would allow one to see that such an action ought not to be taken.

d. Principlism

One approach to health care ethics was actually developed as a result of its originators’ belief that, especially, utilitarian and deontological ethical theories were inadequate to deal effectively with the issues that had arisen in medical ethics in particular. Tom Beauchamp and James Childress introduced their “four principle approach” to health care ethics, sometimes referred to as “principlism,” in the final quarter of the 20th century. Central to their approach are the following four ethical principles: 1) respect for autonomy, 2) nonmaleficence, 3) beneficence, and 4) justice. These four ethical principles, in conjunction with what are identified as moral rules and moral virtues, together with moral rights and emotions, provide a framework for what they call the “common morality.” This common morality is put forward as the array of moral norms, which are acknowledged by all people who take seriously the importance of morality, regardless of cultural distinctions and throughout human history and so are said to be universal. However, given the abstract nature of these ethical principles, it is necessary to instantiate them with sufficient content so as to be able to be practically applicable to particular cases of moral decision-making. This is what is referred to as an application of the method of specification, which is designed to restrict the range and the scope of the ethical principle in question. In addition, each ethical principle, again, in order to be practically applicable, needs to be subjected to another methodological procedure, namely that of balancing according to which the principle, as a moral norm that is competing with others, and in order to be eligible for application to a particular case of moral decision-making, needs to be deemed to be of sufficient weight or strength, as compared to its competitors (Beauchamp and Childress, 2009).

None of the four ethical principles has been designated as enjoying superiority over the others; in fact, it is explicitly acknowledged that any of the four principles can, and would, reasonably be expected to conflict with any other. Because of this, it has been pointed out that this method of moral decision-making is subject to the problem of having no means by which to adjudicate such conflicts. Moreover, to the extent that, in practice, the application of principlism can be reduced to a mere checklist of ethical considerations, it is not sufficiently nuanced to be, ultimately, effective (Gert and Clouser, 1990).

e. Casuistry

Another method of moral decision-making that explicitly rejects the use of any ethical theory or any set of ethical principles is known as “casuistry.” Although not a new method of moral decision-making, it was re-introduced by Albert Jonsen and Stephen Toulmin in the last quarter of the 20th century within the context of ethical issues in the field of health care. This method of moral decision-making is not unlike what is normally referred to in the Western system of jurisprudence as “case law,” which makes almost exclusive use of what are considered to be “precedent-setting cases” from the past in an effort to decide the present case. In other words, like the method of decision-making that is used by judges who must render decisions in the law, casuists insist that the best way in which to make decisions on specific cases as they arise in the field of health care, and which raise significant moral issues, is to use prior cases that have come to be viewed as paradigmatic, if not precedent setting, in order to serve as benchmarks for analogical reasoning concerning the new case in question. For example, if a new case were to come about in the field of health care that raised the moral issue of how the health care professionals of a hospice organization should treat a woman who is five months into her pregnancy but who also has been diagnosed with stage four pancreatic cancer and has a life expectancy between two and three months, the casuist would advise that the moral decisions concerning the treatment of this woman should be made by seeking out as large a number as possible of cases that had occurred prior to this one and that exhibited as many as possible relevantly similar salient characteristics in addition to as many as possible of the same moral issues. 
To render moral assessments concerning how these previous cases were handled (some more morally acceptable and others not, or even more instructive would be at least one that stands out as reflective of either decisions determined to have been obviously morally correct or decisions determined to have been blatantly morally objectionable) is to have established guideposts for moral decision-making in the present case under consideration (Jonsen and Toulmin, 1988).

According to the proponents of casuistry, normative ethical theories and ethical principles can take moral decision-making only so far because, first, the abstract nature of such theories and principles is such that they fail to adequately accommodate the particular details of the cases to which they are applied, and second, there will always be some cases that serve to confound them, either by failure of the theory or principle to be practically applicable or by suggesting an action that is found to be morally unsatisfying in some way. However, casuistry, as a method of moral decision-making, seems to make use of various sorts of moral norms or rules, if only in a subconscious or nonconscious way. For, to reason, analogically, from a paradigmatic or precedent-setting past case to a current case, which exhibits even a good number of relevantly similar salient characteristics and even a good number of the same moral issues, is to base one’s judgment on some norm or rule that serves as the moral standard by which to draw out the points of agreement or disagreement between the past and the present cases. Furthermore, this moral norm or rule, itself, will almost certainly turn out to have been reflective of popular societal or cultural bias because of the conscious methodological decision to refrain from the use of normative ethical theories and ethical principles, both of which carry with them standards of objectivity (Beauchamp and Childress, 2009).

f. Feminist Ethics

Not unlike the proponents of casuistry, the proponents of what has come to be called “feminist ethics” shun the use of ethical theories; however, because feminist ethics is distinctively different from traditional methodological approaches to ethics in general, and health care ethics in particular, its proponents are also skeptical of traditional ethical concepts, including the concept of autonomy. In an effort to focus more particularly on issues concerning gender equality, including the social and political oppression of women as well as the suppression of women’s voices on social, political, and ethical issues, the concept of autonomy, in an abstract sense, is thought to be less meaningful for women who are socially and politically oppressed, by virtue of their gender, than for men. For example, even though, theoretically and even legally, women, by a particular point in time during the first half of the 20th century, were eligible for admission to medical schools should they have chosen to exercise their autonomous rights to apply for such admission, in practice and in fact, both the social conditioning of women and the gender bias of the men who administered medical schools, and who made decisions on which applicants would satisfy the requirements for admission, ensured that medical schools would graduate, almost exclusively, men (with only single digit exceptions in America). The point is that the concept of autonomy, in its theoretical sense, is too abstract to have had any practical application to women, in this case, whose eligibility for acceptance to medical schools was denied on the basis of gender. Rather, the social realities of the day-to-day existence of women, within their social, political, and cultural confinements, must be addressed in such a way that the specific circumstances concerning a particular woman’s relationships with other people, in all of their varieties of dependence, if not interdependence, are taken into account. 
Thus, in addition to this concept of relational autonomy, concepts of responsibility and compassion as well as those of freedom and equality are essential to the majority of the proponents of feminist ethics (Holmes and Purdy, 1992 and Sherwin, 1994).

While among those who consider themselves to be proponents of feminist ethics there exists a range of perspectives concerning not only some of the most important ethical issues within the framework of this school of thought but also concerning the very nature of this school of thought, agreement can be found in the need to reflect on both the oppression and the suppression of women that has been inherent in most every culture throughout human history.

g. The Ethics of Care

Yet another method of moral decision-making is usually referred to as the ethics of care. It grew out of feminist ethics and is sometimes thought of as a sub-field of feminist ethics, but in the early 21st century it has come to be seen as a methodology in its own right. Like the proponents of feminist ethics, the proponents of the ethics of care have decided that any methodology of moral decision-making that is based on abstract theories or principles, rights or duties, or even objective decision-making turns out to be unsatisfying in terms of interacting with others in moral decision-making situations. Instead, the focus should be, again as with feminist ethics, on the specific circumstances of the personal relationships of individual people, with particular attention to be paid to compassion, sympathy/empathy, and a sincere concern for caring for others with whom one shares any intimate relationship. The upshot of this methodology is that “caring” is a necessary constituent of all moral decision-making but that it has been absent from the traditional methodologies of moral decision-making. For, traditional normative ethical theories render objectivity an essential ingredient in moral decision-making but in so doing leave no place for the care (that is, the compassion, sympathy/empathy, and kindness) that is necessary for our inter-personal relationships to be morally successful (Held, 2006).

Nursing, as a profession, has been, traditionally, a profession of the nurturing of, as well as the caring for, the patient. Until the latter part of the 20th century, nursing was also, historically, a profession for women. It should come as no surprise, then, that an “ethics of care” approach to moral decision-making would be embraced by nurses, as well as by women in other health care professions, up to, and including, the profession of medical doctors. (In no way is this to suggest that this ethics of care would, either intentionally or in practice, preclude men from identifying with it also.) In the early 21st century, this approach to patient care in medical facilities, as well as in allied health care facilities, became almost mainstream in many societies across the globe, with accrediting agencies offering their respective “seals of approval” for those medical organizations that are successful in treating patients holistically. It should go without saying that many are the health care professionals who would choose to nurture, and to care for, their own patients in this same way, with or without the existence of any such accrediting agencies (Kuhse, 1997).

3. Ethical Principles

In addition to the application of a variety of methods of moral decision-making to the practice of health care, ethical principles are also so applicable, but not procedurally in the same way as in the method of moral decision-making identified above as principlism. As concerning normative ethical theories, in particular, regardless of the particular method of moral decision-making and its moral standard for action that one might choose to apply to moral decision-making situations, and even in the absence of any such theory or standard being applied in the day-to-day practice of professionals in the field of health care, ethical principles serve to guide one’s actions in moral decision-making situations by identifying those important and relevant considerations that must be taken into account in order for one to be able to think about such situations in a serious way. In other words, ethical principles operate on a different level of moral decision-making than do normative ethical theories or other methods of moral decision-making; nonetheless, ethical principles, like normative ethical theories and these other methods of moral decision-making, are prescriptive, that is, they offer recommendations for moral action. In theory, ethical principles can be used as one measure of how effective normative ethical theories are in their application to moral decision-making situations. For, any proposed normative ethical theory that is incapable of accommodating the requirements of the most fundamental ethical principles can be called into question on that very basis.

a. Autonomy

Patient autonomy, in the clinical context, is the moral right on the part of the patient to self-determination concerning one’s own health care. Conversely, whenever a health care professional restricts, or otherwise impedes, a patient’s freedom to determine what is done, by way of therapeutic measures, to oneself, and attempts to justify such an intrusion by reasons exclusively related to the well-being, or needs, of that patient, that health care professional can be construed to have acted paternalistically. In practice, autonomy, on the part of the patient, and paternalism, on the part of the health care professional, are mutually exclusive; that is, to the extent that one of these two is present in decision-making, and in the attendant actions, within the clinical relationship of the patient and the health care professional, to that same extent is the other one absent. In other words, for the health care professional to act paternalistically is for that same health care professional to have failed to respect the patient’s autonomy, and conversely, for the health care professional to respect the patient’s autonomy is for that same health care professional to have refrained from acting paternalistically.

For example, if a physician were to offer a patient only one recommendation as a remedy for a particular medical malady, when, in fact, the physician knows of more than one prospective remedy (even if the different prospective remedies would be expected to address the medical issue in question to varying degrees and/or might have distinct reputations for varying degrees of success), then the physician in question would be said to have acted paternalistically and, thereby, to have failed to respect that patient’s moral right to autonomous decision-making. In such a case, the physician might hasten to call attention to the fact that, in the typical clinical situation, the physician’s knowledge of the medical issue in question is both qualitatively and quantitatively superior to that of the patient. This fact, while not in dispute, fails to change the nature of this physician’s act of paternalism.

Some health care professionals continue to profess their own personal beliefs that patient autonomy is over-rated because, in their own clinical experience, patients continue to make poor decisions concerning what is in the best interest of their own health care. Certainly, this is a realistic concern, and it probably always will be. However, in some cases, the poor decisions, on the part of the patient who has exercised the right to self-determination concerning one’s own health care, can be explained, at least in part, by the fact that the health care professional in question has failed to engage in the necessary amount of “patient education” in an effort to ensure that better quality decisions can be made, by the patient. In too many cases, this failure, on the part of the health care professional, is due to the language in which the patient education takes place relative to the patient’s ability to comprehend language at a certain level of sophistication. That is, not every adult patient has the ability to comprehend medical explanations even if such explanations are cast in the language of the native tongue of the patient and even if the ability of comprehension that is necessary for a proper understanding is at the level of, say, an average high school graduate. The point is that a genuine respect for the patient’s right to autonomous decision-making concerning one’s own health care demands that each and every health care professional make a sincere effort to ascertain the level of language comprehension of each and every patient, and to convey, in language that is understandable by the patient, all of the relevant medical information that is necessary in order for the patient to be able to make, in consultation with the health care professional, better quality health care decisions than might otherwise be the case.

Starting in the latter part of the 20th century, and continuing into the 21st century, many, if not most, health care professionals (including physicians) have come to believe that the patient’s moral right to self-determination concerning their own health care is of fundamental importance to the success of the delivery of health care. This transition, from the practice of health care being extremely paternalistic, with virtually no recognition of the patient’s right to autonomy, to the practice of health care in the 21st century, and especially in Western cultures, being such that patient autonomy is respected by health care professionals in general, as being of prime importance in the clinical context, has been painstakingly incremental. However, a fundamental problem concerning this respect for the patient’s autonomy persists, and this is the problem of the inconsistency with which it is applied. Many health care professionals are the first to sing the praises of the need to respect the patient’s autonomous preferences in their own health care; however, they are all too willing to make exceptions in situations in which they, themselves, are fundamentally opposed to the autonomous decision made by a particular patient. Reasons for these so-called exceptional cases vary from cultural or religious differences between the health care professional, on the one hand, and the patient, on the other, to the patient in question being a close relative, or friend, of the health care professional (even in a clinical situation in which the health care professional has no part in the practice of health care for this close relative or friend). In both of these types of cases (and many like ones), these so-called exceptional cases are not exceptional cases at all. Rather, subjective considerations have taken the place of the more objective considerations on which the health care professional in question normally acts; that is, in every such case, the health care professional is imposing one’s own personal beliefs on the patient (albeit, usually, for the patient’s own good, that is, as an act of paternalism), and thereby, is failing to actually respect the patient’s autonomy. In the final analysis, the health care professional’s respect for autonomous decision-making on the part of the patient, in order for it to be sincere and objective, cannot be something that is adhered to only when it is convenient for the health care professional and suspended when it is inconvenient, again, for the health care professional. On the contrary, for a health care professional to respect a patient’s autonomy is to respect that patient’s autonomous goals and preferences, even if the health care professional does not agree with them. At its most fundamental level, a true respect for autonomous decision-making on the part of the patient demands that it be honored, objectively, even in the tough cases.

b. Beneficence

To act beneficently toward others is to behave in such a way as to “do good” on behalf of, or to benefit, someone other than oneself. To the extent to which health care professionals serve their patients by helping them to maintain or improve their health status, health care professionals can be said, to the same extent, to be acting beneficently toward the patients they serve. In theory, every action performed by a health care professional, in a professional relationship with a patient, can be expected to be guided by the ethical principle of beneficence. Moreover, the respect for patient autonomy and the practice of beneficent medical care can be considered to be mutually complementary. For, it is difficult to imagine a health care professional who is committed to the principle of beneficence, on behalf of one’s patients, without also respecting the right to autonomous decision-making on the part of those same patients.

However, despite the complementary nature of the ethical principle of autonomy and that of beneficence, it is not uncommon for these two ethical principles to conflict with each other. It is possible for a patient’s autonomous preference to appear to conflict with what is in that same patient’s own best interest(s). For example, a young adult patient who has only recently suffered a ruptured appendix (such that it is still early in the progression of pain) might refuse to undergo an appendectomy for the reasons that the patient has never undergone surgery before and claims to be deathly afraid of hospitals. To respect this patient’s autonomy is to allow the patient, inevitably, to die, which, reasonably, is not in the patient’s own best interest. On the other hand, to coerce this patient into agreeing to the appendectomy, and thereby to prevent the patient’s death, would be to fail to respect the patient’s autonomous preference. It is also possible for a patient’s autonomous preference to appear to conflict with the best interest(s) of someone else. For example, a patient who has only recently been diagnosed with a serious sexually transmitted infection (STI) might agree to treatment for this STI only on the condition that the health care professional in question promise to refrain from telling the patient’s spouse about the STI (as the patient’s attempt to invoke the privilege of confidentiality that is considered to be inherent in the health care professional-patient relationship). To respect this patient’s autonomy is to place at risk the health status of the patient’s spouse, at the very least, regardless of whether the patient is provided treatment for this STI.

Such cases of conflict between these two ethical principles would normally be adjudicated according to which right (that is, that of autonomy or that of beneficence) can reasonably, and objectively, be determined to supersede the other in importance. In the former example, the patient, after recovering from the life-saving appendectomy, might be appreciative of the fact that the principle of beneficence was allowed to prevail over the principle of autonomy. In the latter example, the right to know, on the part of the patient’s spouse, of one’s own potential health risks involved in the patient’s having contracted the serious STI in question would allow for the principle of beneficence (concerning another rather than the patient) to take precedence over the principle of autonomy. Of course, many are the occasions on which the principle of respect for autonomy might take precedence over the principle of beneficence. Take, for example, a patient who is similar to the one in the above-mentioned case of a ruptured appendix in that the patient is, once again, deathly afraid of hospitals, but this time is elderly and has had only one surgery, although a major one. This time the surgery is recommended to remedy a leaky mitral valve in the patient’s heart. If the patient has had a series of patient education sessions such that the cardiologist can, reasonably, determine that the patient is sufficiently aware of the ramifications of both options (that is, the likelihood that the mitral valve repair would be successful and the equal likelihood that refraining from undergoing this mitral valve repair would, within a relatively short period of time, result in the patient’s death), then respect for this patient’s autonomous decision to refrain from undergoing this surgical procedure might reasonably be seen as superseding this patient’s right to beneficence, that is, to actually undergo this surgical procedure.

c. Nonmaleficence

An ethical principle that is typically traced back to the Oath of Hippocrates is to “first, do no harm,” or to refrain from engaging in any acts of maleficence in the clinical context, that is, acts that would result in harm to the patient. Acts of maleficence can be intentional or unintentional, and a large percentage of the latter kind happen as a result of either negligence or ignorance on the part of the health care professional. An example of the former would be a surgeon who fails to exercise due diligence in scrubbing prior to surgery, the result of such negligence being that the surgical patient contracts an infection. An example of the latter would be a primary care physician who fails to scrutinize sufficiently the recent medication history of a patient prior to prescribing a new medication, the result of such ignorance being that the patient suffers a new health issue due to the adverse interaction of the newly prescribed medication with a previously prescribed one that is still being taken by the patient.

Because of the intimate relationship between the principle of nonmaleficence and that of beneficence, it is possible (at least in some cases) to construe the violation of either as a violation of the other. In other words, it might be possible to construe the failure to act in such a way as to benefit someone not only as a violation of the principle of beneficence but also as a violation of the principle of nonmaleficence. Conversely, it might be possible to construe the committing of an action that, reasonably, would be expected to actually cause harm to someone, not only as a violation of the principle of nonmaleficence but also as a violation of the principle of beneficence. To leave a surgical patient under general anesthesia longer than is medically necessary would be an example of the former, and to allow surgery to be performed on a patient by a surgeon who is under the influence of drugs or alcohol to the extent that the surgeon’s skills and judgment have been seriously impaired would be an example of the latter.

Raising the question of whether the principle of nonmaleficence has been violated would also include clinical situations in which it can be determined, objectively, that the potential risks of the recommended treatment option, be it a procedure or a medication, actually outweigh the expected benefits, all things considered. To avoid this possibility, a calculation of the ratio of potential risks to expected benefits (sometimes referred to as a risk-benefit analysis) in the case of both medical procedures and the prescribing of medications is always necessary. For a health care professional to fail to render such a calculation is, at least in theory, to violate the principle of nonmaleficence.

d. Justice

In the clinical context, the ethical principle of justice dictates the extent to which the delivery of health care is provided in an equitable fashion. As such, justice is not applicable to particular decisions, or their attendant actions; rather, the principle of justice is intended to provide the guidance that is necessary to ensure that, considered in conjunction with one another, one’s decisions, and their attendant actions, are consistent each with the others. Consequently, the hallmarks of the concept of justice are fairness and impartiality. In the context of health care, the question of justice is concerned with the degree to which patients are treated in a fair and impartial manner. Justice, as an ethical principle, demands that the actions taken by health care professionals, in their professional relationships with patients, be motivated by a consistent set of standards concerning the relevance of the variety of factors that are taken into consideration for such actions. For example, the recommendation, on the part of a health care professional, of two different primary treatment options for two different patients, each of whom has presented with the exact same symptoms to approximately the same extent, and with no known other relevant differences between the two patients except for one demographic distinction (say, age, gender, or race), would, when the two recommendations are taken together, appear to be unjust.

Of course, it is possible for a health care professional to be the subject of an unsubstantiated and erroneous charge of injustice concerning two, or more, clinical cases that might appear to be relevantly the same. Typically, the reason for such an accusation, should the accusation be inaccurate, is that the accuser is lacking the requisite knowledge of the cases in question in order to be able to determine that, although these two, or more, cases do, indeed, appear to be relevantly similar, in fact, they are not. For example, a physician assistant might prescribe two different antibiotics (one of which has been proven to be highly effective but the other of which has an inconsistent success rate, each for the same medical malady) to two different patients who have been diagnosed with the medical malady in question. Learning of these facts, someone might accuse the physician assistant of being unfair, that is, unjust, in the treatment of these two patients. However, what this accuser does not know is that the patient for whom the less effective antibiotic was prescribed is deathly allergic (that is, subject to anaphylactic shock) to the antibiotic with the higher success rate.

In the final analysis, the ethical principle of justice demands that cases that are relevantly similar be treated the same and that cases that are relevantly different be treated in appropriately distinct ways in recognition of such differences.

4. Ethical Issues

The practice of every profession reveals ethical issues that are endemic to the professional field in question. The practice of health care is no different. What follows is a look at some of the most pervasive ethical issues that are encountered in the practice of health care.

a. The Health Care Professional-Patient Relationship

Any ethical issues that can arise within the clinical relationship between the health care professional and the patient are of the utmost importance if only because this relationship represents the front line of the provision of health care. The most important part of this relationship is trust on the part of each of the participants in this relationship. This is why the issues of truth-telling, informed consent, and confidentiality are essential to the success of any relationship between a patient and a health care professional.

i. Truth-Telling

The most important value of telling the truth is that, under ordinary circumstances, the recipient of a claim, offered by someone else, has reasonable expectations that the claim is true, and for that reason, will, more often than not, adopt such a claim (it is to be hoped only after subjecting it to sufficient scrutiny), incorporate it into one’s own belief system, and eventually act on it. To act on this formerly received claim, which, subsequently, has become one’s own belief, is to engage in autonomous decision-making. However, should it turn out that such a belief is objectively inaccurate because the claim (from which this belief was derived) was not true, then the person who is acting on this belief will have had one’s own capacity for autonomous decision-making compromised. True, or genuine, autonomous decision-making is possible only if the beliefs on which such decisions are made are accurate; in other words, any decision that is based on an inaccurate belief (even if the belief is not recognized as such), cannot be a true autonomous decision. Thus, every person can be said to be under a moral obligation to tell the truth, especially on topics the claims about which are important and relevant to the lives of their recipients. For, in such cases, the recipients of such claims, who choose to accept them, will, eventually, hold them as beliefs, and will act on them in order to pursue what they take to be interests of their own, and, perhaps, too, the interests of others.

To respect another person, as a person, is to respect that other person’s right to autonomous decision-making, especially when such decisions concern their own interests that bear, in important and relevant ways, on the quality of their own lives. For, the quality of one’s life is a pre-requisite for human happiness, and of the entire range of interests that one might identify as essential to one’s own happiness, good health is arguably the most fundamental. Not only can the moral right to autonomy be said to be the most important right of a patient, in a clinical setting, it also can be said to be the foundational right for all of the other rights that a patient can be said to have. In order for a patient to be able to protect one’s own interest in promoting, or regaining, one’s own health, that patient’s moral right to autonomy demands to be respected.

To the extent to which any health care professional, in a professional relationship with a patient, fails to be honest with a patient (concerning that patient’s diagnosis, the recommended treatment options, the identification of realistic potential risks and expected benefits associated with such treatment options, or the patient’s prognosis by virtue of the diagnosis in relationship to each of the recommended treatment options), that patient’s autonomy can be said to have been compromised. If this compromised autonomy were to result in the patient’s inability to protect one’s own interest in promoting, or regaining, one’s own health, then this failure to be honest with the patient would represent a moral failure on the part of the health care professional. For example, a physician who, when asked explicitly by a patient what the potential adverse side-effects of the medication that the physician is in the process of prescribing might be, responds in such a way as either to play down the number and severity of such adverse side-effects or to suggest that there are none, can reasonably be considered to have failed one’s patient by having been dishonest. Any attempt, on the part of the physician, to justify such deception as an act of beneficence toward the patient is doomed to failure because, by definition, such deception, resulting from such a motive, would constitute an act of paternalism, that is, an act that would disregard the patient’s right to autonomous decision-making.

ii. Informed Consent

Concerns about patient autonomy give rise to the concept of “informed consent.” For, if one believes that the patient, indeed, does have a moral right to self-determination concerning one’s own health care, then it would seem to follow that health care professionals, especially physicians, ought not to prescribe any therapeutic measure in the absence of the patient’s informed consent.

Informed consent is intended to be not only a moral but also a legal safeguard for the respect of the patient’s autonomy. Furthermore, informed consent is designed to promote the welfare of the patient (that is, to ensure the patient’s right to beneficence) and to avoid the causing of any harm to the patient (that is, to ensure the patient’s right to nonmaleficence). In the clinical context, informed consent is a reference to a patient’s agreement to, and approval of, any recommended treatment or procedure that is intended to be of therapeutic value to the patient but only on the condition that the patient has an adequate understanding of all of the most important and relevant information concerning the treatment or procedure in question.

Typically, the concept of “informed consent” arises in the context of a patient (or either a patient advocate or a patient surrogate) who asserts a right to informed consent; it is usually articulated as the patient’s “right to know” any, and all, relevant information in the therapeutic relationship (usually) with the physician. A patient enters a therapeutic relationship with a physician either in an effort to maintain one’s current status of optimal health (perhaps, with an annual visit for a physiological examination in conjunction with a series of laboratory, or other, diagnostic tests) or in an effort to regain the lost status of optimal health that the patient might have previously enjoyed. To fail to respect the patient’s right to informed consent, by refraining from providing any specific important and relevant information to the patient, is to fail to uphold either the principle of beneficence or the principle of nonmaleficence, if not both.

For example, a physician might choose to knowingly, and intentionally, refrain from informing a patient of the potential risks of a certain procedure that has been recommended, up to and including a realistic risk of death. Other examples would include specific anesthetics that have a risk, small though it might be, of causing the death of the patient. To genuinely respect the patient’s right to informed consent in cases like these would be for the physician to fully inform the patient of such risks and to inform the patient, too, of the most recent statistics on how probable such risks might be. This would provide the patient with the opportunity to make a more informed decision in consultation with the physician.

Consequently, for informed consent to be truly meaningful, from the patient’s perspective, the physician has not only an obligation to provide any and all important and relevant information concerning recommended treatments and procedures but also an obligation to refrain from interfering, without justification, with the patient’s ultimate decision.

Julian Savulescu and Richard W. Momeyer argue, effectively, that not only does being insufficiently informed of relevant information restrict a patient’s autonomous decision-making, so too does the holding of irrational beliefs, which could result in irrational deliberation. To illustrate this point, they choose the case of a patient who is a Jehovah’s Witness and who, on grounds of religious beliefs, refuses a prospective life-saving blood transfusion. They argue that, rather than viewing such a case as one in which the health care professional ought to exercise deference to the patient’s right to autonomous decision-making, out of respect for a patient whose value system differs from one’s own, the health care professional has a moral obligation not only to attempt, as best one can, to inform the patient of all of the important details that are relevant to the patient’s current health care situation, but also to spend the time that is necessary to help guide the patient through a process of rational deliberation concerning those details in an effort to make the best possible treatment decision. To attempt to accomplish both of these tasks is to demonstrate respect for the patient’s right to autonomous decision-making in a way in which to merely address the former task is not. Savulescu and Momeyer recognize the risk of, and advise against, the exercise of paternalism, if not coercion, when it comes to both the providing of important and relevant information and the guiding of the patient through a process of, theoretically, rational deliberation because, as they say, to compel the patient either to accept medically justified information or to engage in practical rational deliberation concerning such information would be counter-productive in many respects (Savulescu and Momeyer, 1997; Savulescu, 1995).

In the case of any non-emergency medical procedure of any significance, there is a moral obligation to obtain the informed consent of the patient by written signature authorization of an informed consent document. In the case of any emergency medical procedure of any significance, there is a moral obligation to make every reasonable effort to obtain the informed consent of the patient, in like manner. Failing that (for example, due to the mental incapacity, or incompetence, of the patient), every reasonable effort should be made to obtain the informed consent, in like manner, of either a patient surrogate (if the patient has a durable power of attorney for health care decisions) or a patient advocate (in the absence of such an advance directive). Only in cases of an emergency medical procedure of any significance in which the nature of the illness, or injury, of the patient is such that proper treatment requires urgent medical attention, in addition to which it is not possible (again, due to the mental incapacity, or incompetence, of the patient) to obtain the written signature authorization of the patient, and there is insufficient time to secure the written signature authorization of either a patient surrogate or a patient advocate, would it be morally justified to proceed with such a medical procedure in the absence of any written signature authorization.

Adolescent patients represent a special case in that while, in many cases, the cognitive ability of the adolescent patient is sufficient to comprehend most, if not all, of the important and relevant information concerning their own health care needs as well as the recommended options for treatment, normally, they are not recognized as competent medical decision-makers in the law. To accommodate both of these facts, and in addition to the written signature authorization by a parent or guardian, every reasonable effort should be made to inform adolescent patients of all of the important and relevant information concerning their own health care needs and the recommended treatment options, including the approved one, in order to obtain their assent to the latter. An exception to this is the case of emancipated minors, that is, minors who are in the military, married, pregnant, already a parent, self-supporting, or who have been declared to be emancipated by a court; emancipated minors, in most legal jurisdictions, are granted the same legal standing as adults for health care decision-making.

iii. Confidentiality

There is a moral obligation to protect from dissemination any and all personal information, of any type, that has been obtained on the patient by any and all health care professionals at any medical facility. The justification for the protection of this right is integral to the very provision of health care itself. It is essential that there exist a relationship of trust between the patient and any health care professional. This is so because there is a direct correlation between the trust that a patient places in a health care professional to keep in confidence any and all information of a personal nature that surfaces within the context of their clinical relationship and the extent to which that patient can be expected to be forthcoming with full and accurate information about oneself, which is necessary in order for the proper diagnosis and treatment of the patient to even be possible. In fact, the absence of such trust, either well-founded or not, in the mind of a person who is considering whether to enter a patient-health care professional relationship can be sufficient to keep that person from entering such a relationship at all.

Adding to the concern that a patient in any medical facility has, with respect to the extent to which personal information about oneself can reasonably be expected to be kept in confidence, is the number of employees of such a facility (especially a hospital) who have access to such information. Even limiting the number of such employees to those who need access to such information in order to properly perform their own medical duties, and even allowing for relevant distinctions between, for example, small community hospitals in rural areas and large metropolitan medical centers that serve as “teaching hospitals” for medical schools, there are literally dozens of people who have such legitimate access. For example, it is not atypical for the personal information on a surgical patient in a hospital to be accessed by attending physicians as well as physicians who are specialists and who serve as case consultants, nurses (for example, in the operating room, in the post-anesthesia care unit, in a step-down unit, on a medical-surgical floor, and perhaps, in other clinical areas), therapists (respiratory, physical, and other types), laboratory technicians (of a variety of kinds), dieticians, pharmacists, and others, including, but not limited to, patient chart reviewers (for example, for quality assurance), and health insurance auditors. Eventually, a point is reached at which the very concept of “confidentiality” either no longer applies or loses any meaning that it might have originally had. Moreover, the greater the number of people who have access to the personal information on a patient, the greater is the possibility that such information might be compromised in any of a number of ways.

In order for the respect of the patient’s moral right to the confidential maintenance of personal information in the clinical setting to have any real credibility and in order to ensure that the patient receive the best possible quality of health care, there is a moral responsibility on the part of any and all health care professionals to exercise the utmost care in the handling of the personal information on the patient such that the access to, and the use of, such information is strictly limited to what is necessary for the proper medical care of the patient. Furthermore, patients, themselves, have the right to request access to their own medical records in any medical facility (including medical offices as well as hospitals and long-term care facilities) and should be allowed (to the extent to which it is reasonably possible) a voice in who else has access to such information. To allow the patient this kind of input in one’s own medical care can foster, in any of a number of ways, the relationship of trust between the patient and the various health care professionals that is necessary for the proper medical care of the patient. (Confidentiality rights for patients in America received a comprehensive make-over with the implementation, in 2003, of the Health Insurance Portability and Accountability Act (HIPAA).)

Despite the fact that the patient’s moral right to confidentiality, concerning personal information, is of the utmost importance and despite the fact that the physician-patient relationship has traditionally enjoyed a privileged status, even in the law, there is at least one exception to this moral right: the oral or written expression of the intention, in a serious and credible way, on the part of the patient, to harm another. Such a communication imposes on the health care professional not only a moral, but also a legal, obligation to notify the proper authorities. In such a case, the right of another to not be harmed supersedes the otherwise obligatory moral right to confidentiality on the part of the patient. (The Supreme Court of the State of California decision in the Tarasoff v. Regents of the University of California case (1976) held that mental health professionals have a legal obligation to warn anyone who is threatened, in a serious way, by a patient.)

Another possible exception to the patient’s moral right to confidentiality is to be found within the context of the policies and programs of public health organizations. Given that the primary goals of such organizations are to foster and to protect the health of the members of entire populations, or societies, of people, the fundamental means by which to accomplish these goals are policies and programs the intent of which is either to prevent illness and injury or to provide health care services. In their efforts to prevent illness, public health policies sometimes come into conflict with a patient’s moral right to confidentiality. For example, a person’s right to know, for reasons of self-protection, that one’s spouse has contracted a sexually transmitted infection, by virtue of this spouse’s extra-marital relationship with one, or more, other sexual partners, might be given precedence (on moral, if not legal, grounds) over this spouse’s moral right to confidentiality, which normally would be protected within the physician-patient relationship (in this case, the same physician-patient relationship in which this sexually transmitted infection was discovered). Depending on the severity of the particular type of sexually transmitted infection, and the degree to which it is wide-spread in the population in question, the fact that this spouse has contracted this particular sexually transmitted infection might reasonably be not only a matter of individual concern but also, properly, a public health matter.

b. The Question of a Right to Life

Of all of the ethical issues that can be encountered in the practice of health care, none have been more controversial than abortion, euthanasia, and physician-assisted suicide. Despite the debates that are waged, with an abundance of passion, concerning the specific moral aspects of each of these ethical issues, a reasoned analysis of each might be expected to provide new opportunities for a better appreciation of the complexities of each.

i. Human Life: Abortion

At least since the time of the Oath of Hippocrates, with its explicit prohibition against abortion, there have been admonishments against the practice of the aborting of a human fetus together with arguments on both sides of this issue. Abortion is a perennial moral issue in most societies that ebbs and flows in its importance as an issue that serves to inform, if not incite, social debate and social action. However, over the late 20th century and early 21st century in America, stark differences between the opinions on each fundamental side of this issue have been voiced by people in the society at large, as compared to the reasoned debates waged by philosophers as a result of their attempts to bring clarity to the relevant moral issues, to the concepts that are inherent in such issues, and to the language that is used to express such issues and concepts. Historically, some theologians and some legal theorists have made moral and legal distinctions, respectively, that are relevant to the practice of abortion based on the concept of “quickening,” that is, the point in time (usually 16 to 20 weeks after conception) during a pregnancy at which the expectant mother is first able to discern fetal movement in the womb, and on the concept of “viability,” that is, the stage of development of the fetus (usually taken to be 24 weeks into the pregnancy) after which the fetus is expected to be able to survive outside of the womb (despite the likelihood of under-developed body organs and physiological, if not also mental disabilities).

The U. S. Supreme Court decision in the case of Roe v. Wade (1973) upheld a woman’s legal right to an abortion in accordance with the “due process” and “equal protection” clauses of the Fourteenth Amendment of the U. S. Constitution, rendering illegal any outside attempts to the contrary (usually by state governments), during the initial trimester of the pregnancy, but allowing state governments to limit, although not prohibit, a woman’s decision to have an abortion during the second trimester of the pregnancy. From the end of the second trimester to the time of delivery, that is, after viability, state governments were granted the authority not only to limit but also to prohibit abortions.

Despite the fact that those who adopt what are usually referred to as conservative positions and those who adopt what are usually referred to as liberal positions on the issue of abortion sometimes take the same position on related moral issues, for example, that murder is morally unacceptable and that people have a moral right to their own lives, many disagree, fundamentally, on the question of whether the act of abortion is also an act of murder and on the question of whether a fetus has a right to life. Since the Roe v. Wade landmark decision, most of the theoretical ethical debates have attempted to address each of these issues by focusing on the concept of “personhood,” as central to this debate.

Mary Anne Warren, in an influential essay in which she responds to many of the significant arguments in the literature to that point in time, makes an important distinction between what it is to be a human being as compared to what it is to be a person. According to Warren, the classic argument against abortion relies on a logical argument that depends on the fallacy of equivocation in order to attempt to be successful. The argument is as follows: since it is morally incorrect to kill innocent human beings, and since fetuses are innocent human beings, then it follows that it is morally incorrect to kill fetuses. Warren points out that the proponent of this argument is equivocating on the term “human being.” For, in its occurrence in the initial premise, “human being” is intended to mean something like “a full-fledged member of the moral community,” that is, the moral sense of the term “human being,” but, in its occurrence in the second premise, “human being” is intended to mean something like “a member of the species, Homo sapiens,” that is, the genetic sense of the term “human being.” Because the term “human being” shifts its meaning from its occurrence in the initial premise to its occurrence in the second premise, the conclusion, in fact, fails to follow from its premises; in other words, because the proponent of this argument is guilty of the fallacy of equivocation, this argument (which in order to succeed would need a different term in the place of “human being,” the meaning of which would be preserved in both of its occurrences) fails.

Warren argues that “moral humanity” and “genetic humanity” are not synonymous in meaning because the membership of these two classes is not the same. In other words, persons are viable candidates to be “full-fledged members of the moral community” in a way in which human beings, merely as such, are not. Consequently, the moral community consists of all, but only, persons. She then entertains the question concerning what characteristics an entity must have in order to be considered a person and launches a search for what might constitute the criteria for personhood. In the final analysis, she identifies five such criteria, which she offers as “most central to the concept of personhood,” as follows: “1) consciousness (of objects and events external and/or internal to the being), and in particular the capacity to feel pain; 2) reasoning (the developed capacity to solve new and relatively complex problems); 3) self-motivated activity (activity which is relatively independent of either genetic or direct external control); 4) the capacity to communicate, by whatever means, messages of an indefinite variety of types, that is, not just with an indefinite number of possible contents, but on indefinitely many possible topics; and 5) the presence of self-concepts, and self-awareness, either individual, or racial, or both” (Warren, 1973). Warren acknowledges that it should not be required of an entity that it must exhibit all five criteria in order to qualify as a person, nor should any particular one of these criteria be deemed necessary for personhood. However, she does identify the first two criteria, followed closely by the third, as the most important. Finally, she insists that any entity that exhibits none of these five criteria is, definitely, not a person, and that a human fetus is just such an entity.

Yet another argument against the right of a woman to have an abortion stems from the claim that, even if it can be demonstrated that a fetus is not, strictly speaking, a person, a human fetus is, after all, “potentially” a person. That is, if a fetus is allowed to develop, over the course of a normal pregnancy, it becomes more and more likely to become a person the closer that it gets to its time of delivery. The question is whether this potentiality for personhood should be considered to guarantee the fetus some rights akin to the rights of a person, for example, a right to life. Warren takes up this issue and concludes that, while the fact that the human fetus is a potential person might entail, on moral grounds, that women ought not to wantonly have abortions, in the final analysis, whenever the question comes down to the right to life of the fetus as opposed to the right of a woman to have an abortion, the right of the woman must always supersede the claimed right on behalf of the fetus because the rights of actual persons always outweigh the rights of potential persons.

Don Marquis takes on the question of the morality of abortion in a way that is separate and apart from any considerations of whether a fetus can be determined to be a person and even whether a fetus can be considered to be potentially a person. Rather, Marquis’s argument is an attempt to avoid the logical pitfalls of each of these other types of arguments. According to Marquis, the one factor that allows us to consider the taking of a human life to be morally objectionable is that to do so is to take away that individual’s life experiences, activities, projects, and enjoyments, which, had that individual’s life not been taken away, would have constituted that individual’s future personal life, all of which (that is, one’s experiences, activities, projects, and enjoyments) would have been either intrinsically valuable or, at least, valuable as means to ends, such ends being intrinsically valuable to that individual. To take a human life is to deprive the individual not only of what one values at present but also of what one would have come to value over time had one been allowed to live on, that is, to deprive one of all of the value that one’s future continued life had promised, a future that now will not exist. It is, says Marquis, this loss that makes the taking of a human life morally incorrect. This argument against the taking of a human life would apply not only to adults but also to young children and babies who, arguably, also have a future of value concerning life experiences, activities, projects, and enjoyments to which to look forward. In the same way, a human fetus has a similar future, one that, if the fetus were aborted, would never be able to come to pass (Marquis, 1989).

An obvious criticism of this argument concerning the moral status of abortion is that Marquis’s argument suggests, at least, that the reason he identifies to support the claim that the taking of a human life, in the case of human adults, young children, babies, and even fetuses, is morally incorrect is, if not the only reason, then at least far and away the most important reason to support this claim. However, this is to minimize the importance of other such reasons, the plausibility of which also seems likely, such as the varying degrees of emotional pain and grief as suffered by the friends and loved ones of the victim and the denigrating effects on the perpetrator’s character, if only in terms of a desensitization to the value of human life itself. Finally, and notwithstanding the concept of personhood, Marquis’s argument, again, at least suggests that the prospective future of a human fetus is, if not identical to, then on a fundamental par with that of not only a baby or a young child but also a human adult. However, surely, there are relevant differences, not the least of which would be the capacity, more so for a young child than for a baby and more so for an adult than for a young child, to envision and to have anticipatory thoughts about one’s own prospective future and the value that it might hold, a capacity that, in theory, a fetus just does not have.

At least since the Roe v. Wade U. S. Supreme Court decision, the spectrum of positions on the issue of the moral status of abortion has been represented by an extreme conservative position, namely, that, without any exception, abortions of human fetuses ought never to be allowed; by an extreme liberal position, namely, that abortions of human fetuses ought always to be allowed, and for any reason whatsoever; and by more moderate positions, like, for example, that abortions of human fetuses ought not to be allowed, in general, but ought to be allowed in cases in which the following circumstances serve as the exceptions: in cases in which pregnancies have occurred as a result of the act of rape or the act of incest, or in cases in which the life of the expectant mother is seriously jeopardized by the pregnancy itself.

Although the Roe v. Wade U.S. Supreme Court decision was settled precedent for the previous half century, on June 24, 2022 the U.S. Supreme Court handed down its opinion in the Dobbs v. Jackson Women’s Health Organization case, which explicitly overturned both Roe v. Wade and Planned Parenthood of Southeastern Pennsylvania v. Casey (1992), the latter of which had upheld a woman’s right to access an abortion, in effect reaffirming Roe v. Wade. The 6-3 vote in the Dobbs v. Jackson Women’s Health Organization case held that “the [U.S.] Constitution does not confer a right to abortion. Roe and Casey must be overruled, and the authority to regulate abortion must be returned to the people and their elected representatives.” In effect, this means that the individual states have the authority to regulate abortion for “legitimate reasons,” up to and including the banning of abortion with or without any exceptions. As yet another landmark decision, Dobbs v. Jackson Women’s Health Organization has proven to be at least as controversial as Roe v. Wade ever was.

It is likely that people, in societies throughout the world, will continue to stake out positions on this issue as influenced by their cultural and/or religious beliefs, by the beliefs of their ancestors and/or living relatives, by their own ignorance or knowledge on the subject, and for all other manner of reasons, but it is unlikely that the spectrum of positions on the issue of the moral status of abortion will change.

ii. Human Death: Euthanasia and Physician-Assisted Suicide

Euthanasia is an intervention in the standard medical course of treatment of a patient who is reasonably considered to be terminally, or irreversibly, ill or injured for the express purpose of causing the imminent death of that patient, normally for reasons of mercy.

Whenever a patient who is competent to make health care decisions for oneself and who, under no coercion from anyone else, makes an explicit request (oral or written) to be euthanized, the case in question is one of “voluntary euthanasia.” Moreover, whenever a patient is not competent to make health care decisions for oneself but on behalf of whom an advance directive has been properly provided, one that was properly executed by the patient prior to becoming incompetent to make health care decisions for oneself and that explicitly expresses (in the case of a living will) or explicitly authorizes a surrogate to express (in the case of a durable power of attorney for health care decisions) the request to be euthanized under certain specified conditions, and these conditions are present, the case in question is also one of “voluntary euthanasia.”

Whenever a patient who is not competent to make health care decisions for oneself and on behalf of whom no advance directive has been properly provided but for whom a patient advocate (that is, a close relative whose decision-making authority is recognized in the law or, failing that, a more distant relative or friend) makes an explicit request (oral or written) that the patient in question be euthanized, the case in question is one of “non-voluntary euthanasia.”

Whenever a patient who is competent to make health care decisions for oneself but for whom someone other than the patient makes the decision that the patient be euthanized and does so without the consent of the patient (either because the patient was never consulted on the matter or because the patient was consulted but chose not to give consent), the case in question is one of “involuntary euthanasia.” While both voluntary and non-voluntary euthanasia remain matters of moral debate, it is impossible to imagine a situation in which involuntary euthanasia, by its very nature, could ever be morally justifiable.

When an instance of euthanasia takes the form of the committing of an action, it is usually referred to as “active euthanasia;” when such an instance takes the form of refraining from the committing of an action, it is usually referred to as “passive euthanasia.” The administering of a lethal injection would be an example of the former; the withholding of a regular course of medical treatment in order for a fatal injury, illness, or disease to take its natural toll would be an example of the latter. This distinction between active and passive euthanasia has been, historically, the focal point of the most controversy concerning the practice of euthanasia.

Traditionally, all health care-related professional codes of ethics find passive euthanasia to be morally allowable but active euthanasia to be tantamount to murder; the relevant laws in all of the legal jurisdictions in America follow suit. However, an argument can be made that terminally ill or injured patients ought to be allowed, both morally and legally, to decide when their own lives should end and whether this should be by means of active or passive euthanasia; the justification for such allowances would be a true respect for the right of such patients to self-determination concerning not only their own health care, but also the duration of their own lives as well as the means by which their lives are to end, which would be an instance of a true respect for the autonomy of such patients. Indeed, an additional argument can be advanced in an effort to uphold the patient’s right to beneficent health care. That is, in an effort to attempt to “do good” on behalf of, or to benefit, a terminally ill or injured patient, once again, one could argue that such patients should be allowed to decide their own fate and the means by which to achieve their chosen fate, that is, by the method of either active or passive euthanasia.

James Rachels, in a famous article on this very question (Rachels, 1975), attempts to demonstrate that this controversy represents a distinction without a difference. That is, Rachels argues that there are, indeed, no relevant moral differences between active and passive euthanasia, and that, in order to be consistent in one’s thinking, one has to acknowledge that active and passive euthanasia are either both morally allowable or both morally condemnable. Winston Nesbitt argues that Rachels fails to prove that the ordinary interpretation of responses to the two agents in Rachels’s famous comparative examples would be the same, which is the heart of the case that Rachels sets forth (Nesbitt, 1995 and Callahan, 1989).

Related to the topic of active euthanasia is what has come to be known as “the doctrine of double effect.” This doctrine has a long and rich history in the doctrine of the Roman Catholic Church but has only been applied to cases of terminally ill patients in the early 21st century. In its application to patients with terminal diagnoses who receive palliative care, the doctrine of double effect is typically invoked in an effort to justify on moral (if not legal) grounds the commission of an action by a medical professional the intention of which is to relieve the patient’s usually excruciating physiological pain while being fully cognizant of the likely, but unintended, consequence of causing the death of the patient. For example, a cancer patient, with a prognosis of only a matter of days to live, continues on a regimen of the sedative lorazepam and the opioid morphine. With increasing frequency, the patient has complained of the worsening of the pain and has repeatedly requested ever-higher doses of the morphine drip. In response to each of these requests, the physician has complied, knowing full well that there will be a threshold beyond which the dosage of morphine will be sufficient (in conjunction with a myriad of other causal factors that are idiosyncratic to this patient) to kill the patient. This, then, comes to pass. If asked by a nurse on this case whether anyone was culpable for the patient’s death, the physician would, typically, reply that no one was so culpable because, even with the final increase in the dosage of morphine, the intention was not to kill the patient; rather, the intention was to alleviate the patient’s pain.

The myriad of other causal factors that can, mutually, hasten such a patient’s death include (but would not be limited to) the patient’s body weight, the status of the patient’s immune system, the effects of the progression of the cancer, the effects of other medications, and whether the patient is still receiving nutrition and hydration. The key factor in the doctrine of double effect is the intention on the part of the medical professional in question. As long as the action in question is deemed a good one, the intention was the beneficial effect (alleviating the patient’s pain) rather than the harmful effect (killing the patient), the beneficial effect stemmed from the action directly rather than as a result of the harmful effect, and the beneficial effect outweighed, in importance, that of the harmful effect, then the action in question is determined to have been morally (if not also legally) allowable by the doctrine of double effect. However, the most fundamental criticism of the application of the doctrine of double effect to such cases is that there is no relevant moral distinction between the action in question and an instance of active euthanasia.

Palliative sedation, as the monitored use of medications, including sedatives and opioids, among others, to provide relief from otherwise unmitigated and excruciating physiological, among other types of, pain or distress by inducing any of a number of degrees of unconsciousness, can be similarly problematic depending on whether and to what extent the pain or distress of the patient in question is managed appropriately. If managed well, palliative sedation need not be a causal factor in hastening the death of the patient; however, if it is not managed well, in theory, palliative sedation can be such a causal factor.

If “suicide” were to be understood as one’s pursuit of a plan of action the effect of which is expected to be the intentional premature death of oneself, then “assisted suicide” can be understood to be one’s pursuit of a plan of action the effect of which is expected to be the intentional premature death of oneself but the effect of which, in order to be successful, needs to be facilitated in some way, shape, or form by someone else. If that someone else were to be a physician, then it would constitute a case of “physician-assisted suicide.” Public attention was brought to bear on the issue of physician-assisted suicide in America by Dr. Jack Kevorkian who, throughout the final decade of the 20th century, as a retired pathologist, offered to help fatally ill patients to end their lives prematurely. Prior to his fifth, and final, prosecution, which was for second degree murder, and for which he was convicted (having avoided this fate the first four times), he claimed to have assisted approximately 130 patients to end their lives, something that, as he had claimed throughout his entire medical career, patients ought to have a right (both morally and legally) to do. Despite the fact that all health care-related professional codes of ethics have consistently condemned, and still do condemn, physician-assisted suicide, currently, at least five of the fifty states in America have legalized physician-assisted suicide. Among those European nations that had legalized both active euthanasia and physician-assisted suicide by the early 21st century, the Netherlands has led the way (Kevorkian, 1991).

c. Human Subject Research

Theoretically, the most fundamental reason to conduct research involving human subjects is to add to our existing knowledge concerning the physiological and the psychological constitution of the human body and the human mind, respectively, in an effort to improve the quality of life of people as determined by the status of their bodily and mental health. Thus, the principle of beneficence should lie at the heart of all research that is conducted with human subjects. The history of such research is one of major achievements, typically incremental and over time, each of which has played a part in the extension of not only the duration of human life but also the quality of the day-to-day existence of members of the human race, virtually all over the planet. However, many moral issues have arisen due to the mistreatment to which many such human subjects have been subjected, mistreatment that has occurred in any of a number of important ways, from physiological abuse to mental and emotional abuse to the abuse of human rights. The history of human subject research is replete with examples of such abuses. By the middle of the 20th century, enough people in sufficiently important roles in Western societies began to codify what they took to be some of the most basic moral rights that would need to be respected in order for human subject research to be recognized as morally acceptable.

i. The Rights of Subjects

Over many decades throughout the second half of the 20th century, a variety of codes of ethics were developed for the protection of the rights of people who serve as human research subjects. In virtually every case, the codes that were of the most importance were formulated in response to specific cases of human subject research during the course of which at least some of the people who served as participants had some of their fundamental rights abused. A few examples follow.

The Nuremberg Code (1949) was formulated in response to experiments that were performed on people who were members of demographic groups that were targeted for extermination by Hitler in Nazi Germany and that were conducted by medical doctors and biomedical researchers, some of whom had little to no expertise or experience in either the practice of medicine or the conducting of biomedical research. In the judgment of those who prosecuted two dozen of these experimenters in what came to be known as the “Doctor Trials,” held in Nuremberg after the more famous Nuremberg trials in which the Third Reich’s major suspected war criminals were prosecuted, the main charge for which the defendants were tried was the murderous and torturous human experiments that were conducted in many of the concentration camps and the prisoner of war prisons. Of the ten principles in the Code, the emphasis, in general, was on the need for biomedical researchers to obtain the voluntary informed consent of the prospective human subjects prior to the commencement of any such experimentation. The second most important right of human subjects of such research to be emphasized in the Code was the human subject’s right to protect oneself by determining whether, and when, it is in one’s own interest to end one’s own participation in such an experiment, without fear of any penalty or punishment. Despite having no legal force, the Nuremberg Code has had profound effects on the ethics of human experimentation and has spawned a good number of other such codes since its formulation.

The Declaration of Helsinki (1964, and with multiple revised versions since) was adopted by the World Medical Association’s World Medical Assembly with the title, “Recommendations Guiding Medical Doctors in Biomedical Research Involving Human Subjects.” This code of ethics consists of a host of recommendations, the result of which is the establishment of the following moral principles: 1) a competence requirement for research investigators, 2) a requirement that the significance and importance of any expected positive outcomes of the research outweigh any anticipated risks to the human subjects, 3) a requirement of informed consent on the part of the human subjects, and 4) a requirement for the external review of all of the research protocols.

The National Research Act (1974) created the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research and did so in direct response to the infamous “Tuskegee Syphilis Study” (1932-1972). This was a study of approximately 400 African-American male sharecroppers, each of whom suffered from this most serious of venereal diseases. Its stated purpose was to ascertain whether there were any significant differences between the progression of syphilis in African-American men and its progression in Caucasian men. The participants in this study, begun during the throes of the Great Depression and in one of the economically poorest regions of America, were promised free food and free medical care for their participation. However, rather than being informed of the venereal disease from which they suffered, they were told only that they had “bad blood.” Most of these men were married and continued to have conjugal relations with their wives and to produce children; many of these wives and newborn babies were infected with syphilis. Worse, even after penicillin was discovered and approved as modern medicine’s first antibiotic (and found, by the late 1940s, to be effective against a variety of bacterial infections in humans, including syphilis), not only were these men never informed about this “miracle cure,” but the health care professionals who were conducting this study knowingly and intentionally refrained from administering any penicillin to any of this study’s participants (Brandt, 1978).

The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects of Research (1979) was generated by the above-mentioned commission. It identified boundaries between the practice of routine medical care and biomedical research protocols (again, as a direct result of the “Tuskegee Syphilis Study”), identified moral guidelines for the process by which research subjects are selected and for their informed consent, and emphasized the moral principle of respect for research subjects as persons as well as the ethical principles of beneficence and justice in the treatment of human subjects.

The Public Health Services Act (1985) mandated that every research facility in America that conducts either biomedical or behavioral research on human subjects have an Institutional Review Board (IRB) for the protection of the rights of human research subjects. This requirement that each such research institution (academic or otherwise) obtain IRB approval for each and every biomedical or behavioral research study was a result of many instances of research protocols that, for a variety of reasons, were thought, at least in retrospect, to have violated the human rights of their participants. For example, the “Stanford Prison Experiment” (1971) was a behavioral study, the purpose of which was to identify and analyze the psychological effects of the relationship between prison guards and prisoners on members of each group, but which took on a life of its own and resulted in a good number of human rights violations. As for biomedical research, in the famous case of Henrietta Lacks, her HeLa cells made possible dozens of medical breakthroughs in the curing of diseases in the latter half of the twentieth century and made large amounts of money for some people and some institutions in the research process. Meanwhile, most of her descendants, including some of her own children, lived their entire lives without health insurance, and some of them were, even if temporarily, homeless. Only recently has attention been brought to her story, and to this situation, by her biographer (Skloot, 2010).

The composition of the membership of all Institutional Review Boards (IRBs) is mandated to be reflective of diversity with respect to gender, race, and culture or heritage as well as a diversity of social experiences and an appreciation for issues (relevant to the research involving human subjects) that reflect the standards and values of society, if not also of the local community. The fundamental goal of all IRBs is to determine the acceptability of all research proposals, involving human subjects, based on the extent to which such proposals adhere to all relevant federal, state, and local laws, the research institution’s own policies and regulations, and all relevant standards of professional conduct, as mandated by the federal government. Moreover, IRBs are obligated to ensure that all proper procedures are followed for the voluntary informed consent of all of the subjects of all research projects.

In addition to enforcing stringent standards in order to ensure that the consent of prospective human participants be truly informed, IRBs are mandated to enforce equally strict standards concerning the following: that potential risks as well as expected benefits of the research protocols are made clear to prospective participants; that information of a personal nature that is obtained on research participants is kept in strict confidence; and that any research participant who is, simultaneously, a patient (whether in a medical facility or not) under medical treatment, is made sufficiently aware of the differences between those practices that are a part of one’s medical treatment as compared to those practices that are a part of the research protocol. In other words, researchers, in such situations, are morally obligated to exercise what sometimes might constitute supererogatory measures in an effort to help the research participant to be aware of which procedures that one is subjected to are a part of one’s medical treatment and which procedures that one is subjected to are a part of the research study, which might or might not be expected to be of therapeutic value.

The moral issues that have arisen, over decades, concerning human subjects in both biomedical and behavioral research are many and varied. In biomedical research, such issues include the exclusion of the members of specific demographic groups from even being considered eligible to become participants in such research. For example, until the latter part of the 20th century in America, biomedical research on breast cancer was almost nonexistent. Not until women, in decent numbers, had entered the field of medicine and the field of biomedical research did research proposals into various aspects of breast cancer begin to compete for funding with research proposals into various aspects of prostate cancer. Furthermore, even biomedical research into, for example, the correlative, if not causal, factors involved in heart disease solicited only Caucasian males as prospective research participants. In response to what some viewed as unjust funding priorities and unfair funding criteria, Congress passed the National Institutes of Health (NIH) Revitalization Act of 1993, which mandates that women and members of minority groups be included in all research that is funded by the NIH unless there is a “clear and compelling” reason that their inclusion in such research is “inappropriate” with respect to the health of the prospective subjects themselves or the purpose(s) of the research. Examples of appropriate exclusionary practices would be biomedical research into testicular cancer, which would properly exclude women, just as biomedical research into sickle-cell anemia would properly exclude Caucasians.

One of the most popularly known moral issues concerning both biomedical and behavioral research is the use of placebos. The classic case of the use of placebos is the clinical drug trial, in which researchers are attempting to determine, first, the effectiveness of the experimental drug, and second, the extent to which potential adverse side-effects of the experimental drug are significant, if not fatal. Typically, the study includes two groups of participants: those who are administered the experimental drug and those who are administered a placebo (popularly known as a “sugar pill” due to the fact that it is designed to have no relevant effect, at all, on the research participant to whom it is administered). In order to attempt to ensure credibility concerning the use of a placebo, the participants in both groups are intentionally deceived as to which group of participants is receiving the experimental drug and which is receiving the placebo. To attempt to ensure even more credibility concerning the use of a placebo, the researchers orchestrate not only a blind study, as just mentioned, but a double-blind study, in which not only do the researchers withhold from the participants of each group the knowledge of which group’s participants are receiving the experimental drug and which are receiving the placebo, but the researchers themselves do not know this information either. The main reason for a blind study is to attempt to avoid any possibility of what we might refer to as suggestive bias on the part of the participant concerning the possible effectiveness of the experimental drug. The main reason for a double-blind study is to attempt to avoid any possibility of what we might call expectation bias on the part of the researchers themselves concerning either the effectiveness, or the lack thereof, of the experimental drug.

The use of placebos in biomedical or behavioral research does raise questions concerning the ethical principle of beneficence in addition to the moral right to be told the truth. First, in theory, the participants in many, if not most, clinical trials, including drug trials, have reasonable expectations of benefitting in any of a number of ways from their participation in such research. At least in cases in which such a participant is, simultaneously, a patient with a terminal illness who ends up in the placebo designated group, it would appear that the right to beneficent treatment is being thwarted. In such a situation, and by the nature of the case, such a participant would be, perhaps, literally, betting one’s life on, in this case, the experimental drug. Second, to the extent to which participants in human subject research are being deceived, knowingly and intentionally by the researchers, which is a necessary part of any research study involving the use of placebos, a case can be made that the moral right to be told the truth, on the part of the research participant, has been violated (regardless of whether such participants are also, simultaneously, patients who are receiving medical treatment). Of course, the response to either of these criticisms of research protocols that make use of placebos is that the participants agree to the use of placebos and know, full well and in advance, that they have an equal opportunity to be members of the group who receive the placebo or members of the group who do not.

ii. Vulnerable Populations

By the nature of the case, there are some groups of people in society who are especially susceptible to abuse, concerning their rights, whenever they are the subjects of human research. Such vulnerable populations are as follows: babies, including neonates (as well as human fetuses and the subjects of human in vitro fertilization, at least in theory); children; pregnant women; prison inmates; undergraduate and graduate students; the members of any demographic minority group; and anyone who is cognitively challenged, physiologically challenged, educationally disadvantaged, economically disadvantaged, significantly compromised in one’s health, terminally ill, injured, or disadvantaged in any other relevant way.

Of particular concern in the recruitment of human research subjects, especially in cases involving prospective participants who are known to be vulnerable in any important and relevant respect(s), is the issue of coercion, whether explicit or implicit. With the exception of the first category, people in every category of vulnerable population enumerated above would be susceptible, for a variety of reasons, to the influence of coercion by recruiters for human subject research. Whenever possible, biomedical and behavioral researchers should refrain from even attempting to recruit, as a prospective participant, anyone who is reasonably identifiable as a member of any vulnerable population. In the event that a biomedical or behavioral researcher needs to recruit any such vulnerable prospective participants (by virtue of the nature of the research itself), the researcher has a moral obligation to be aware of the likelihood that the prospective participants in question will feel coerced (either explicitly or implicitly, and whether they are aware of it or not) to “voluntarily” consent to participate in the research project in question. In such a situation, the researcher is morally obligated to engage in supererogatory efforts to attempt to minimize, as best one can, the effects of the coercion involved.

Once a participant has been recruited, the most fundamental concern of the biomedical or behavioral researcher is the need to ensure, as best one can, that the participant (as a member of a vulnerable population) is as fully informed as possible, with respect to all relevant information concerning the proposed research project and the participant’s role in it, in an effort to approximate, once again, as best one can, truly informed consent on the part of the participant. The main reason for this concern is that any particular research participant who is vulnerable in any important and relevant respect(s) might find it difficult, if not impossible, to comprehend any, much less all, of the relevant information concerning the proposed research project and one’s own role in it. This might be so for any of a number of reasons, for example, insufficient comprehension abilities, insufficient familiarity with the language spoken by the researchers, inadequate cognitive abilities, chronic pain of such intensity as to inhibit one’s cognitive processes in the case of a research participant who is also a patient with at least one acute health issue, and more.

d. Reproductive and Genetic Technologies

Throughout the history of the practice of health care, the acquisition of knowledge and the innovation of medical technologies have brought with them new moral issues. Beginning in the last quarter of the 20th century and continuing into the 21st century, advancements in knowledge and technologies concerning human reproduction and human genetics have spawned whole new types of moral questions and moral issues, many of which involve even more complexities than the previous ones.

i. Reproductive Opportunities for Choice

The last quarter of the 20th century brought with it major advances in biological knowledge and in biological technology that allowed, for the first time in human history, for the birth of human offspring to result from biological interventions in the birthing process. For those whose ability to procreate was biologically compromised, new scientific methods were developed to facilitate success in the birthing process. Such methods include artificial insemination (AI), in vitro fertilization (IVF), and surrogate motherhood (SM).

Artificial insemination is the process by which the sperm is manually inserted inside of the uterus during ovulation. In vitro fertilization is the process of uniting the sperm with the egg in a petri dish rather than allowing this process to take place in utero, that is, in the uterus. To increase the probability of success, multiple embryos are transferred to the uterus. As a result, pregnancies with more than one fetus are not uncommon. Such pregnancies increase the probability of premature births, which usually result in low birth weight, under-developed organs, and other health issues. As to the embryos that are not chosen for transfer, the normal practice is to freeze them for possible future use because the success rate for any given round of IVF is only approximately 1 in 3.

Many opponents of IVF focus on the probability of the resultant health issues; in other words, to bring into the world, in a contrived way, children who stand a reasonable chance of suffering any of a number of health problems is unfair to such children (Cohen, 1996), if not also to the society into which they are born. Others disagree and argue that to be the recipient of the gift of life would more than outweigh the usual health issues that might result from IVF (Robertson, 1983). Some commentators argue that reproductive technologies, such as AI and IVF, allow women the opportunity to realize their potential for autonomous decision-making when it comes to their own reproductive preferences (Robertson, 1994 and Warren, 1988). Another criticism is the likelihood that the children, so produced, will be viewed as, somehow, inferior to children who are born as a result of the traditional process of procreation. There are also moral issues concerning frozen embryos. First, the longer that an embryo is maintained in a frozen state, the more likely it is that it will become degraded to the extent that either it is no longer capable of being used for its intended purpose or it is no longer alive. Second, there are serious questions as to what the fate of these frozen embryos should be when, for example, because of the splitting up of the relationship of the biological parents or the death of one, or both, of these parents, such embryos are left in a state of limbo. Should they be used for scientific research, should they be offered to other people, whose compromised procreative abilities dictate a need for such embryos to be brought to fruition through the process of IVF, or should such embryos merely be discarded?

Surrogate motherhood is the process by which one woman carries to term a fetus for someone else (typically a couple). The surrogate mother is impregnated by the method of either AI (traditional surrogacy, according to which the surrogate mother’s egg is fertilized) or IVF (gestational surrogacy, according to which an embryo is transferred to the uterus of the surrogate mother). Not only in the former case (in which the surrogate mother is also the genetic mother) but also in the latter case (in which the surrogate mother is not the genetic mother), one of the most important moral, if not also legal, issues has always been whether the surrogate mother has any proprietary rights to the newborn baby, regardless of whether a legal contract applies and regardless of whether any money changes hands.

Another fundamental moral issue occurs in cases in which there is a contractual relationship as a legal guarantee for a financial agreement. Such cases raise the moral issue of whether fetuses and newborn babies should be treated as commodities, and indeed, whether the womb of the surrogate mother should be rented out as a service for someone else, that is, also treated as a mere commodity (Anderson, 1990). However, not all commentators on this subject agree that surrogate motherhood can, of necessity, be reduced to the crass practice of baby selling or that women who serve as surrogate mothers are, necessarily, exploited. On the contrary, it can be argued that women who serve as surrogate mothers are willing to forgo any parental right that they might have to begin, much less to maintain, an inter-personal relationship with the babies they deliver. In the same way in which this forgoing of any parental right to engage in any type of inter-personal relationship with the baby appears to not be offensive in cases of surrogate motherhood, when engaged in for altruistic reasons, consistency would seem to demand that no such offense should enter into the situation just because an exchange of money is involved; in other words, the motive is not relevant to the moral assessment of the process of surrogate motherhood (Purdy, 1989).

Artificial insemination, in vitro fertilization, and surrogate motherhood have been defended on the ground that the right to reproductive freedom, including the right to exercise one’s autonomy concerning procreation, allows for any such means to bring children into the world.

Cloning is the asexual reproduction of an organism that is genetically identical to the organism that serves as its progenitor. Cloning has always been a natural process of reproduction for many bacteria, plants, and even some insects, and it has been used as an intervention in the reproduction of plants for hundreds of years. However, since the successful cloning of a sheep named Dolly in 1996, major moral concerns have been voiced concerning the ability of scientists to clone, not only other animals, but also human beings. Despite some claims to the contrary, none of which has ever been verified, the cloning of human beings is not yet feasible.

The purpose of therapeutic cloning is to create an embryo, the stem cells of which are identical to its donor cell and are able to be used in scientific research in order to better understand some diseases, from which can be derived treatments for such diseases. The same moral issues concerning the use and ultimate fate of human embryos, as aforementioned, apply to these cloned human embryos.

The purpose of reproductive cloning is to create an embryo, which if brought to fruition will become a member of the animal kingdom. In the successful attempts to clone a variety of animals to date, a consistent problem has been health issues related to significant defects in major organs, including the heart and the brain; in addition, the duration of the lives of these cloned animals has been, on average, only half of the number of years of the normal life expectancy of such species. Moreover, each successful attempt to clone these animals has been preceded by literally dozens, if not hundreds, of unsuccessful attempts. These same problems would represent major moral concerns in any attempt to clone human beings. However, were any such attempt to be successful and were the resultant cloned human being to be of sufficiently good health to lead anything like a normal existence, new moral issues would arise. Would such cloned human beings be viewed as second class members of the human race? Would they be deprived, either socially or legally, of some of the fundamental freedoms that are normally afforded people, for example, the right to exercise one’s own autonomy? Would cloned human beings have been robbed of the exact same uniqueness (in terms of their physiology, their personality characteristics, and their character traits) that every human being in the history of humankind has hitherto enjoyed? (Just because a cloned human being would be identical, genetically, to its progenitor does not mean, by virtue of its idiosyncratic experiences in utero and in life in a large number and variety of ways, that it would, of necessity, have exactly the same life as its progenitor) (National Academy of Sciences, 2002). This last point notwithstanding, would cloned human beings be denied rights to their own identity (Brock, 1998)?

Any scientific researcher who has aspirations to clone a human being would be well advised to read, carefully, Mary Shelley’s Frankenstein; or, the Modern Prometheus. Published in 1818, this work of science fiction leaves the reader with the not too subtle warning that one ought to keep one’s hubris in check; for to create anything, much less an artificial man, is, almost certainly, to fail to be willing, or, perhaps, to be unable, to anticipate many of the important untoward consequences of one’s actions, and equally problematic, to over-estimate one’s ability to exercise control over one’s own creation.

ii. Genetic Opportunities for Choice

The molecular structure of deoxyribonucleic acid (DNA), the molecule that contains the genetic instructions that are necessary for all living organisms to develop and to reproduce, was discovered in 1953. Some fifty years later came the completion of the mapping of the human genome, popularly known as the Human Genome Project, that is, the identification of the complete and exact sequencing of the billions of elements that make up the DNA code of the human body. Since these two milestones, a vast amount of research has been conducted in the area of disease-causing mutations as causes of many human genetic disorders. This research has also allowed for the creation of literally thousands of genetic tests, the purpose of which is to detect, both in the case of prospective parents and at the fetal stage of the development of human offspring, those genetic mutations that are responsible, in part or in whole, for many non-fatal and fatal conditions and diseases. Furthermore, this research has allowed for the editing of human genes, in an effort to proactively disable some genetic mutations, in the case of adults, children, and newborns as well as in the fetal stage of development. The information derived from genetic testing, more often than not, is anything but definitive; in other words, the results of the vast majority of genetic tests indicate only the probability that the disease or condition for which the testing was done will actually develop. Whether such probabilities are low, moderate, or high, many other factors, especially environmental ones, can also be contributing factors. Further, while many genetic tests are available for the detection of conditions and diseases for which there is, at present, a cure, many other genetic tests are able to be conducted for conditions and diseases for which there are no cures. This fact raises the obvious question of whether specific individuals do or do not want to know that there is a probability, to whatever degree, that they will fall victim to a particular condition or disease for which there is no cure.

Each of the advances in genetic knowledge, genetic technologies, and biomedical capabilities concerning genetics brings in its train its own set of moral concerns. Genetic disorders such as amyotrophic lateral sclerosis (ALS, popularly known as Lou Gehrig’s Disease), a motor neuron disease, which is always fatal, can be familial, that is, one who has inherited the gene mutation for ALS has a 50% chance of passing the mutated gene on to any of their offspring. However, one who inherits the mutated gene might or might not fall victim to the ravages of the disease. It is conceivable that an individual, who has begun to exhibit some of the early symptoms of ALS, might choose to be tested for any of the four gene mutations that are thought to be causal. If such testing reveals the presence of one or more such mutations, and if this individual has children, the moral issue of whether any such children should be informed, immediately, and if they are so informed, the moral issue of whether such children should choose, themselves, to be tested, both become of paramount importance, if only because, depending on the outcome of the genetic testing of these children, the fate of any of their children (already in existence or as future possibilities) would be a concern.

Another moral issue that continues to arise in the context of genetic testing occurs when an adult or a child is tested for one condition or disease and a mutated gene is discovered for another potentially fatal condition or disease. This situation can occur because much genetic testing, at present, is sufficiently broad in its application as to include a variety of different genes. So, it sometimes happens that genetic testing of a toddler, for example, for one or more genetic mutations (which are suspected due to the presence of specific relevant symptoms) might reveal one or more other genetic mutations for conditions, diseases, or even specific cancers, or for young adult-onset cardiomyopathy, about which neither the researcher nor the pediatrician was even concerned. In such a case, questions arise as to whether such health risks (again, not anticipated but discovered by the genetic tests) for the toddler should be shared with the toddler’s parents, and if so, when they should be shared, that is, immediately or when the toddler is older (and if when the toddler is older, at what age). If it is not known whether the offending gene mutations are inherited or are merely spontaneous (which is a common occurrence), does the timing of informing the toddler’s parents become a moral issue, in the event that the toddler’s parents might expect to bring additional children into the world? And, what about the toddler: from the perspective of the pediatrician or the parents, at what age should the toddler be so informed (Wachbroit, 1996)?

The moral issues identified, concerning each of these two hypothetical situations, are reflective of the ethical issues that are most fundamental in health care, namely, cases of conflict involving the ethical principles of respect for the patient’s right to autonomous decision-making as compared to acts of paternalism on the part of health care professionals and as compared to the patient’s right to beneficence in one’s relationship with health care professionals.

In addition to therapeutic reasons for genetics research and its application to health care, there are non-therapeutic reasons for such research and applications, for example, genetic enhancement, that is, the application of genetic knowledge and technologies to improve any of a number of physiological, mental, or emotional human characteristics. Some commentators argue that genetic enhancement, as compared to genetic therapy, is morally objectionable for a number of reasons, not the least of which is that, in a free-market economic system in which genetic enhancement is not provided by the state to each citizen who might choose it, those who could afford to pay for it would have a decided advantage over those who could not (Glannon, 2001). Other commentators do not agree, arguing that any attempt to use gene therapy to cure any type of human dysfunction is, in no way, morally different from any attempt to use gene therapy to enhance human function in cases in which such enhancements serve to protect one’s health or life (Harris, 1993).

Julian Savulescu goes even further by arguing for what he calls “procreative beneficence,” which is that anyone who is making use of genetic testing for non-disease human traits should make selections in favor of a child, from among other available selections in favor of other possible children, who can be expected to have, based on all of the available genetic information, what he calls “the best life,” that is, “the life with the most well-being,” or a life that would be at least as good as the lives that any of the other possible children would be expected to have. For, according to Savulescu, some non-disease-related genes influence the probability of one’s leading the best life; there is good reason to use information, which is at our disposal and which concerns such genes; and one should select embryos or fetuses which, in accordance with the available genetic information (including such information concerning non-disease genes), have the best opportunity for leading to the best life. He does make clear that, consistent with the moral requirement to make selections in favor of the child who can be expected to have the best life, those individuals who are making such selections may be subjected to persuasion but ought not to be subjected to any coercion (Savulescu, 2001).

Stoller contends that Savulescu fails to make his case because the examples that he offers as ostensibly analogous to pre-implantation genetic diagnosis (PGD), a procedure that is used to screen IVF-created embryos for genetic disorders or diseases prior to their implantation, differ from it in morally relevant ways and consequently fail to justify his theory (Stoller, 2008).

Stem cell research, since its inception, has been the subject of much controversy. The pluripotent qualities of embryonic stem cells, that is, their ability to differentiate or to be converted into the cells that make up any of the human body’s parts, render them superior to adult stem cells when it comes to their use in genetic therapeutic research. Hence, many of the same moral issues, mentioned above, that arise whenever embryos are used for research purposes apply to the use of embryonic stem cells. This is despite the fact that embryonic stem cells hold out much promise in their application to minimize the negative effects of, if not cure, many previously incurable conditions and diseases, for example, coronary disease, diabetes, Parkinson’s disease, Alzheimer’s disease, spinal cord injuries, and many others.

As genetic research progresses to the point at which gene therapy is able to make use of not only somatic-cell therapy (that is, the modification of genes in the cells of any of a number of human body parts for therapeutic reasons) but also germ-line therapy (that is, the alteration of egg cells, sperm cells, and zygotes for therapeutic reasons), the health care applications are expected to increase exponentially in number. However, the most important moral concern raised by the prospect of being able and willing to eventually engage in germ-line therapy is that this type of gene modification, by its very nature, will affect an unknown number of people in the future as they inherit these genetic changes. By contrast, somatic-cell therapy can only affect the person whose genes are so modified.

e. The Allocation of Health Care Resources

Health care resources have never been unlimited in any society, regardless of the type of health care system that was employed. At least for the foreseeable future, this fact is unlikely to change, but it is this fact that necessitates some form of what is normally referred to as the rationing of health care resources. Health care resources include not only the availability of in-patient hospital (and other medical facility) beds, emergency room beds, surgical units, specialized surgical units, specialized treatment centers, diagnostic technology, and more, but also personnel resources, that is, health care professionals of every description.

Whenever the demand for health care resources exceeds their availability, the financial costs of such resources will rise; to the extent that, historically, demand has consistently exceeded availability, the financial costs of health care have also consistently risen. Because there are many other causal factors for this financial phenomenon, the rise in the financial costs of health care has been consistently exponential, in many countries, since the latter part of the 20th century. By the nature of the case, this occurs to a greater extent, and at a more rapid pace, in any country whose politicians and public policy makers decide to employ a health care system that does not provide universal coverage.

i. Organ Procurement and Transplantation

The procurement of human organs for transplantation in order to save the lives of those who otherwise would not survive represents what many consider to be a modern medical miracle, which became possible only in the latter half of the 20th century. However, like all such advances in medical knowledge and in medical technologies, human organ transplantation raises some fundamental moral issues. Throughout the brief history of human organ transplantation, a problem that is expected to continue is the fact that there are many, many more people who need organ transplants in order to survive than there are human organs available to be transplanted. Consequently, the available organs, at any point in time, must be rationed, which raises the question of determining the relevant factors to be considered in deciding who receives transplanted organs and who does not.

Human organs that are necessary for human life, for example, hearts, lungs, or livers, are harvested from the bodies of people who are only recently deceased so that they can be transplanted into the bodies of people who will not survive without such a transplant. However, a single kidney or bone marrow, for example, is usually harvested from the body of a donor who is alive and, presumably, well. In either case, in most countries, permission is required to be granted, legally and arguably also morally, in order for the harvesting to take place. Organ donor organizations exist to enlist as many citizens as possible, in countries in which organ harvesting has been legalized, to be organ donors so that, once such donors are deceased, health care professionals are authorized to harvest any of a number of viable organs from the deceased donor’s body. As is the case for any invasive medical procedure, permission is necessary for one to donate one’s kidney or bone marrow as well.

One of the most important moral issues concerning the recipients of human organs is the issue of the criteria that are used for the selection of human organ recipients. It should come as no surprise that one of the major factors to determine which prospective organ recipients are given priority on the waiting list is the age of the prospective recipient. With only rare exceptions, a young adult, as a prospective heart transplant recipient, will rank higher on the heart transplant waiting list than will an elderly adult, if the latter is even deemed to be eligible. Additional criteria that are used to determine both eligibility and ranking for organ transplantation include: 1) the extent to which the need for organ transplantation is urgent in order to save the prospective organ recipient’s life; and 2) the likelihood that, and the extent to which, the candidate for transplantation will benefit from the procedure, that is, its probability for success; but also, 3) the candidate’s history of deleterious health-related habits (for example, whether the candidate for a lung transplant has ever smoked cigarettes or other tobacco products, or currently does so); 4) the candidate’s ability to pay (either outright or through private or federally funded health insurance) for the procedure; and 5) the value of the candidate, by virtue of, for example, one’s occupation, to society (for example, a cancer biomedical researcher as compared to a high school custodian), and more. If the first two criteria do not seem to raise any moral concerns, each of the latter three, almost certainly, does.

While each of the first two of these criteria could be reflective of egalitarian principles of justice, according to which each candidate, as a person, is viewed as having equal value, each of the latter three of these criteria could be seen as beneficial to the best interests of society, that is, as promoting social utility. As such, egalitarian principles of justice do not necessarily promote what is in the best interests of society any more than social utility considerations necessarily promote what is in the best interests of the individual. However, the application of either of the first two criteria is far less controversial than is the application of any one of the latter three criteria. It might be reasonable for people to disagree as to whether a candidate for a lung transplant, who smoked a pack of cigarettes each day for twenty years, is less deserving of such a transplant than another such candidate who has never smoked. It might be reasonable for people to disagree as to whether a person who is otherwise a good candidate for an organ transplant should be rejected solely because this person cannot afford to pay for the procedure and has no access to health insurance. Finally, it might be reasonable for people to disagree as to whether a candidate for an organ transplant, who happens to be a cancer biomedical researcher, is any more deserving of such a transplant than is another medically qualified candidate, who happens to be a high school custodian.

Adding to the dissatisfaction that some people express concerning the rationing of human organs for transplantation, in America and in other countries, is the deference that is sometimes offered to people of social prominence. Publicly documented in America are cases in which, for example, a prominent former professional sports figure, who had cirrhosis of the liver due to decades of alcohol abuse, was offered a liver transplant despite being, at that time, far down on the waiting list, and a governor of an East Coast state, who was offered and received both a heart and a lung transplant, again despite being, at the time in question, far down on the waiting list due, at least in part, to his age and his health status. In fact, he died less than a year later.

Another moral issue that is endemic to the human organ transplant industry is the buying and selling of human organs for the purpose of transplantation. In some Central American and some South American countries as well as in some Mideast countries, for the past several decades, there has been a thriving illegal market for human organs. More recently, this practice has spread to some European countries and even to America, when financially impoverished people find themselves in need of money for their own sustenance. Typically, such individuals are promised the equivalent of thousands of dollars for a kidney or bone marrow but find themselves at the mercy of the organ dealer for payment after the fact. Worse, too many times, such medical procedures are performed in non-clinical environments and sometimes by non-clinically trained harvesters.

Raising additional moral concerns is the practice of what is sometimes referred to as the “farming” of human organs, that is, to conceive and to bring to fruition a newborn (or, in some cases, the harvesting of human organs or tissue can be done at the fetal stage) or to maintain on life support the body of someone who has been determined to be brain dead in order to be able to harvest an organ or bone marrow for transplantation. In the former case, questions arise concerning the moral propriety of bringing a child into the world for the express purpose of harvesting some of its body parts. Depending on which specific organs might be harvested, the death of this newborn might be inevitable. In the latter case, anyone, from an anencephalic newborn to a child or an adult of any age, who, as a result of either a non-traumatic or a traumatic event, has been declared to be in a state of unresponsive wakefulness (popularly referred to as a “permanent vegetative state”), that is, a patient whose state of consciousness, due to severe damage to the brain, is not indicative of actual awareness but, at best, only partial awareness or arousal, and whose condition has lasted for three to six months, in the case of a non-traumatic cause, or at least twelve months, in the case of a traumatic cause, might be maintained on life support for the express purpose of harvesting any of a variety of human organs. Any such case introduces questions concerning any of the following moral issues: Is it ever morally allowable to keep the body of an otherwise brain dead person alive for the sole purpose of harvesting some of its organs? Even if the individual is brain dead, does such a practice violate any moral rights or interests of that individual? Even if the answer to these questions is in the negative, because this individual might be deemed to have the same physiological, and thereby moral, status as one who has died, does proper respect for the body of the dead dictate that this practice is morally improper?

Both the retail sale of human organs and the farming of human organs continue to raise the moral issue of whether, and to what extent, human organs should be treated as commodities to be bought and sold in the marketplace (legally or not) and grown for the express purpose of harvesting for transplantation. Twenty-first century stem cell research holds out the promise, incrementally and over time, to eventually be able to produce, in theory, any human body part from a single cell of one’s own body. To the extent that these prospects become realities, many of the moral issues that are raised by the procurement and the transplantation of human organs will become moot.

ii. The Question of Eligibility in Health Care

The question of who, in a given society, should be eligible to receive health care is one of the most important ethical issues concerning the provision of health care in the 21st century. This is because of the stark contrasts that exist concerning the distribution of health care when comparing America to other nations. America is the only one of the thirty or more wealthiest nations on the planet that continues not to provide universal health care. Universal health care, by the nature of the case, leaves out of its financing equation private health care insurance providers. By contrast, in America, these private health care insurance providers are the primary drivers of the health care system, determining who is eligible for health care insurance coverage; what particular health care services they choose to finance, and for whom, including not only diagnostic procedures but also surgical and other invasive medical procedures; the lengths of stays in hospitals or other medical facilities, for both surgical and non-surgical patients; the cost of health insurance premiums as well as financial deductibles and co-payments to be paid by their customers; the fees for services for physicians, surgeons, and other health care professionals, and the percentage of such fees that they will pay; the particular prescription medications that they deem eligible for payment by themselves and how much, in co-payments, their customers have to pay; and many additional factors that affect both the health and the finances of those who maintain such insurance coverage.

In fact, there is a direct relationship, due to the effects of this type of health care system, between the health care and the finances of all members of society (both those with health insurance and those without). Many members of American society with health insurance, by virtue of their own personal financial situations, face the choice, usually on a regular basis, as to whether they can afford to pay the financial deductibles and/or the co-payments for their own health care because their earned weekly wages, all too often, preclude them from making these payments in addition to paying for rent, food, and other necessities for their families and for themselves. Added to these issues is the fact that not all health insurance plans are the same concerning which services and procedures they cover and which they do not, the practical effect of which is that many families with working parents do not have health insurance coverage for many important and significant health care services and procedures, or even prescription medications. Worse, a large percentage of wage earners, and some salaried employees, cannot, reasonably, afford to pay the costs of health insurance premiums, and so, have no health insurance coverage at all. The practical effect of this is that in addition to not being able to afford, out of pocket, health care services or procedures that serve to maintain one’s reasonably good health status, these individuals cannot afford to seek medical attention when they experience symptoms even of a dire nature.

All of these facts concerning the health care system in America as compared to the health care systems in virtually every other reasonably wealthy nation in the world raise the following questions of a moral nature. Does each and every citizen of any society have a moral right to health care? If so, does the government of any society have a moral obligation to provide each and every one of its citizens with health care? These questions, by their very nature, raise the issue of the extent to which the ethical principle of justice can be realized in any given society. At the societal level, the ethical principle of justice is applicable, fundamentally, to the ways in which goods and services as well as rights, liberties, opportunities for social and economic advancement, duties, responsibilities, and many other entities (both tangible and intangible) are distributed to citizens. The application of the ethical principle of justice to these questions concerning health care provides a benchmark for the determination of which types of health care systems are more, or less, just than others.

While any of the methods of moral decision-making, as delineated above, could be applied in fruitful ways to such questions, it might be more instructive to apply two public policy perspectives: libertarianism and egalitarianism. Those politicians and public policy makers who are responsible, over many decades, for the health care system in America, have, for the most part, done so based on libertarian principles of justice, while those politicians and public policy makers who are responsible, again, over many decades, for the health care systems in those countries with universal health care coverage, have, by and large, done so based on egalitarian principles of justice.

According to libertarian principles of justice, citizens might or might not have any kind of right to health care, but even if they do, it should not result in the placing of financial burdens on wealthier citizens to fund, in part or in whole, the health care of their less financially well-off counterparts. Rather, health care, like food, clothing, the cost of shelter, and the costs of all other goods and services available in society, should be distributed by the dictates of a free-market economic system. Those who are wealthier, and who are able to buy more expensive goods and services of superior quality, will also be able to afford to buy not only health care services and procedures themselves, but also a superior quality of such health care commodities. Those who are less wealthy, and who are able to buy less expensive goods and services of comparatively inferior quality, will be able to afford health care services and procedures, but only of a comparatively inferior quality. Finally, those who are financially impoverished will not be able to afford health care services or procedures at all. Under the public policy dictates of this type of health care system, the ethical principle of the autonomy of citizens to make their own choices, as citizens in society, takes precedence over the ethical principle of beneficence.

According to egalitarian principles of justice, each citizen in society has an equal right to health care services and procedures because each citizen in society has equal value as a person. Because the status of one’s health is foundational for one to even be able to enjoy a reasonably good quality of life (and all that that entails), the government is obligated to provide each and every one of its citizens with access to health care services and procedures. Unlike most of the goods and services the distribution of which is dictated by a free-market economic system, health care is essential to the well-being of every citizen. Of course, the politicians and public policy makers, in accordance with this type of health care system, would have to adjudicate the question of whether all health care services and procedures would be available to all of the members of society, in equal measure, or the ways in which, and the degrees to which, such services and procedures would be made available to the members of society. Under the public policy dictates of this type of health care system, the ethical principle of beneficence supersedes, in importance, the ethical principle of the autonomy of its citizens to make their own choices.

In the final analysis, the ways in which, and the degrees to which, particular health care services and procedures are distributed among the citizens of a given society depend on the dictates of the principles of justice not only as they are applied to the society’s economic system but also as they are applied to the society’s governmental system.

f. Health Care Organization Ethics Committees

The Joint Commission is the comprehensive accrediting agency for health care programs and organizations, of all types, throughout America, and has, for some time, mandated the inclusion of ethics committees as an accreditation requirement. The purpose of any health care organization ethics committee is to develop, to engage in an on-going process of the review of, and to ensure the proper application of the medical ethics policies of the health care organization in question. Such policies would normally include such significant issues in health care ethics as informed consent, confidentiality, euthanasia, assisted suicide, the withholding and withdrawing of medical treatment, the harvesting and transplantation of human organs, and many others depending on the specific type of health care organization. While there is a wide latitude concerning the membership composition of health care ethics committees, typically, the following professions are represented: physicians, nurses, social workers, senior administrators, risk managers, chaplains, and ethicists, in addition to lay people from the local community, among others.

Functions of a health care ethics committee include the following: to become informed about, and to maintain a credible level of awareness of, significant issues in health care ethics, generally, and their relationships to the needs of both the patients and the health care professionals who are associated with the health care facility in question; to educate, on an on-going basis, the health care professionals of the facility in question, in addition to the members of the ethics committee, on significant issues in health care ethics as well as the ethics committee’s policies concerning such issues; and to be responsible for the particular cases of the facility’s patients that warrant either a review by, or a consultation with, the ethics committee. The health care ethics committee is, usually, the final authority on ethics policy concerning medical issues, subject to approval by the facility’s Board of Trustees.

5. Conclusion

Health care ethics is a multi-faceted and fundamentally important issue for the citizens of any society because the provision of health care is essential to the well-being of each person, and the ways in which people are treated, concerning their health care, bear importantly on their health status. The many moral issues that arise out of the provision of health care (from those that are inherent in the relationship between the health care professional and the patient to those associated with abortion and euthanasia, from those to be encountered in biomedical or behavioral human subject research to those that have come about as a result of reproductive and genetic knowledge and technologies, and from those concerning the harvesting and transplantation of human organs to those that stem from public policy decisions as determinative of the allocation of health care services and procedures) are perennial issues. To attempt to clarify these moral issues by use of the philosophical analysis of the language and the concepts that underlie them is, at least in theory, to provide a framework in accordance with which to make better quality decisions concerning them.

6. References and Further Reading

  • Anderson, E. S. (1990) “Is Women’s Labor a Commodity?” in Philosophy and Public Affairs, 19: Winter, pp. 71-92.
  • Aristotle (1985) Nicomachean Ethics, trans. by Terence Irwin, Hackett Publishing Co.
  • Beauchamp, T. L. and Childress, J. F. (2009) Principles of Biomedical Ethics, 6th ed., New York: Oxford University Press.
  • Beauchamp, T. L., Walters, L., Kahn, J. P., and Mastroianni, A. C. (2014) Contemporary Issues in Bioethics, 8th ed., Boston: Cengage.
  • Boylan, M. (2004) A Just Society, Lanham, Maryland, and Oxford: Rowman and Littlefield.
  • Boylan, M. (2012) “Health as Self-Fulfillment,” in the Philosophy and Medicine Newsletter, 12:4. (Reprinted in Boylan, M. (2014) Medical Ethics, 2nd ed., Malden, Massachusetts: Wiley-Blackwell, pp. 44-57.)
  • Boylan, M. (2014) Medical Ethics, 2nd ed., Malden, Massachusetts: Wiley-Blackwell.
  • Brandt, A. M. (1978) “Racism and Research: The Case of the Tuskegee Syphilis Study,” in the Hastings Center Report, 8:6, pp. 21-29.
  • Brennan, T. (2007) “Markets in Health Care: The Case of Renal Transplantation,” in the Journal of Law, Medicine & Ethics, 35:2, pp. 249-255.
  • Brock, D. W. (1998) “Cloning Human Beings: An Assessment of the Ethical Issues Pro and Con,” in Clones and Clones: Facts and Fantasies About Human Cloning, edited by Nussbaum, M. C. and Sunstein, C. R., W. W. Norton & Co.
  • Callahan, D. (1989) “Killing and Allowing to Die,” in the Hastings Center Report, 19 (Special Supplement), pp. 5-6.
  • Chadwick, R. F. (1989) “The Market for Bodily Parts: Kant and Duties to Oneself,” in the Journal of Applied Philosophy, 6:2, pp. 129-140.
  • Cohen, C. B. (1996) “‘Give Me Children or I Shall Die!’ New Reproductive Technologies and Harm to Children,” in the Hastings Center Report, 26:2, pp. 19-27.
  • Gert, B. and Clouser, K. D. (1990) “A Critique of Principlism,” in The Journal of Medicine and Philosophy, 15:2, pp. 219-236.
  • Glannon, W. (2001) “Genetic Enhancement,” in Genes and Future People: Philosophical Issues in Human Genetics, Glannon, W., Westview Press, pp. 94-101.
  • Harris, J. (1993) “Is Gene Therapy a Form of Eugenics?” in Bioethics, 7:2/3, pp. 178-187.
  • Held, V. (2006) The Ethics of Care, New York: Oxford University Press.
  • Holmes, H. B. and Purdy, L. M. (1992) Feminist Perspectives in Medical Ethics, Bloomington: Indiana University Press.
  • Jonsen, A. R. and Toulmin, S. (1988) The Abuse of Casuistry: A History of Moral Reasoning, Berkeley: University of California Press.
  • Kant, I. (1989) Foundations of the Metaphysics of Morals, edited and translated by Lewis White Beck, Library of Liberal Arts: Pearson.
  • Kevorkian, J. (1991) Prescription: Medicide: The Goodness of Planned Death, Prometheus Books.
  • Kuhse, H. (1997) Caring: Nurses, Women and Ethics, Oxford: Blackwell.
  • Kuhse, H., Schuklenk, U., and Singer, P. (2015) Bioethics: An Anthology, 3rd ed., Malden, Massachusetts: Wiley Blackwell.
  • MacKay, D. and Danis, M. (2016) “Federalism and Responsibility for Health Care,” in Public Affairs Quarterly, 30:1, pp. 1-29.
  • Marquis, D. (1989) “Why Abortion is Immoral,” in the Journal of Philosophy, LXXXVI:4, pp. 183-202.
  • Mill, J. S. (1861) Utilitarianism, in Collected Works of John Stuart Mill. Edited by J. M. Robson, Vol. X, Toronto: University of Toronto Press, 1969.
  • National Academy of Sciences (2002) Committee on Science, Engineering, and Public Policy, Scientific and Medical Aspects of Human Reproductive Cloning, Washington, D. C.: National Academy Press.
  • Nesbitt, W. (1995) “Is Killing No Worse than Letting Die?” in the Journal of Applied Philosophy, 12:1, pp. 101-105.
  • Noonan, J. T. (1968) “Deciding Who Is Human,” in the American Journal of Jurisprudence, 13:1, pp. 134-140.
  • Noonan, J. T. (1970) “An Almost Absolute Value in History,” in The Morality of Abortion: Legal and Historical Perspectives, John T. Noonan, Cambridge: Harvard University Press, pp. 51-59.
  • Purdy, L. M. (1989) “Surrogate Mothering: Exploitation or Empowerment?” in Bioethics, 3:1, pp. 18-34.
  • Rachels, J. (1975) “Active and Passive Euthanasia,” in the New England Journal of Medicine 292, pp. 78-80.
  • Ram-Tiktin, E. (2012) “The Right to Health Care as a Right to Basic Human Functional Capabilities,” in Ethical Theory and Moral Practice, 15:3, pp. 337-351.
  • Robertson, J. (1994) “The Presumptive Primacy of Procreative Liberty,” in Children of Choice: Freedom and the New Reproductive Technologies, Princeton: Princeton University Press, pp. 22-42.
  • Robertson, J. A. (1983) “Procreative Liberty and the Control of Conception, Pregnancy, and Childbirth,” in the University of Virginia Law Review, 69, pp. 405-464.
  • Savulescu, J. (1995) “Rational Non-Interventional Paternalism: Why Doctors Ought to Make Judgments of What Is Best for Their Patients,” in the Journal of Medical Ethics, 21, pp. 327-331. (Reprinted in Medical Ethics, 2nd ed. (2014), ed. by Michael Boylan, Malden, Massachusetts: Wiley-Blackwell, pp. 83-90.)
  • Savulescu, J. (2001) “Procreative Beneficence: Why We Should Select the Best Children,” in Bioethics, 15:5/6, pp. 413-426.
  • Savulescu, J. and Momeyer, R. W. (1997) “Should Informed Consent Be Based on Rational Beliefs?” in the Journal of Medical Ethics, 23, pp. 282-288. (Reprinted in Medical Ethics, 2nd ed. (2014), ed. by Michael Boylan, Malden, Massachusetts: Wiley-Blackwell, pp. 104-115.)
  • Shaw, D. (2009) “Euthanasia and Eudaimonia,” in the Journal of Medical Ethics, 35:9, pp. 530-533.
  • Sherwin, S. (1992) No Longer Patient: Feminist Ethics and Health Care, Philadelphia: Temple University Press.
  • Sherwin, S. (1994) “Women in Clinical Studies: A Feminist View,” in the Cambridge Quarterly of Healthcare Ethics, 3:4, pp. 533-539.
  • Silvers, A. (2012) “Too Old for the Good of Health?” in the Philosophy and Medicine Newsletter, 12:4. (Reprinted in Boylan, M. (2014) Medical Ethics, 2nd ed., Malden, Massachusetts: Wiley-Blackwell, pp. 30-43.)
  • Skloot, R. (2010) The Immortal Life of Henrietta Lacks, New York: Crown/Random House.
  • Steinbock, B., London, A. J., and Arras, J. (2013) Ethical Issues in Modern Medicine: Contemporary Readings in Bioethics, 8th ed., Columbus, Ohio: McGraw-Hill.
  • Stoller, S. (2008) “Why We Are Not Morally Responsible to Select the Best Children: A Response to Savulescu,” in Bioethics, 22:7, pp. 364-369.
  • Thomson, J. J. (1971) “A Defense of Abortion,” in Philosophy and Public Affairs, 1:1, pp. 47-66.
  • Tong, R. (1997) Feminist Approaches to Bioethics: Theoretical Reflections and Practical Applications, Boulder: Westview Press.
  • Tong, R. (2002) “Love’s Labor in the Health Care System: Working Toward Gender Equity,” in Hypatia, 17:3, pp. 200-213.
  • Tong, R. (2012) “Ethics, Infertility, and Public Health: Balancing Public Good and Private Choice,” in the Newsletter on Philosophy and Medicine, 11:2, pp. 12-17. (Reprinted in Boylan, M. (2014) Medical Ethics, 2nd ed., Malden, Massachusetts: Wiley-Blackwell, pp. 13-30.)
  • Wachbroit, R. (1996) “Disowning Knowledge: Issues in Genetic Testing,” in Report from the Institute for Philosophy and Public Policy, 16:3/4, pp. 14-18.
  • Warren, M. A. (1973) “On the Moral and Legal Status of Abortion,” in The Monist, 57:1, pp. 43-61.
  • Warren, M. A. (1988) “IVF and Women’s Interests: An Analysis of Feminist Concerns,” in Bioethics, 2:1, pp. 37-57.
  • Warren, V. L. (1992) “Feminist Directions in Medical Ethics,” in the HEC Forum, 4:1, pp. 73-87.
  • World Health Organization, Preamble to the Constitution of the World Health Organization, New York, June 19-July 22, 1946 (New York: Adopted by the International Health Conference, and signed on July 22, 1946.)

 

Author Information

Stephen C. Taylor
Email: staylor@desu.edu
Delaware State University
U. S. A.

Science and Ideology

This article illustrates some of the relationships between science and ideologies. It discusses how science has been enlisted to support particular ideologies and how ideologies have influenced the processes and interpretations of scientific inquiry.

An example from the biological sciences illustrates this. In the early 20th century, evolutionary theory was used to support socialism and laissez-faire capitalism. Those two competing ideologies were justified by appeal to biological claims about the nature of evolution.

Those justifications may seem puzzling. If science claims to generate only a limited set of facts about the world—say, the mechanisms of biological diversification—it is unclear how they could inform anything so far removed as economic theory. Part of the answer is that the process of interpreting and applying scientific theories can generate divergent results. Despite science’s capacities to render some exceedingly clear and well-verified central cases, its broader uses can become intertwined with separate knowledge claims, values, and ideologies. Thus, the apparently clear deliverances of natural sciences have been leveraged to endorse competing views.

Rightly or wrongly, this leveraging has long been part of the aims and practice of scientists. Many of the Early Modern progenitors of natural science hoped that science would apply to large swaths of human life. They believed that science could inform and improve politics, religion, education, the humanities, and more. One fictional version of this ideal, from Francis Bacon in the 17th century, imagined scientists as the political elites, ruling because they are best equipped to shape society. Such hopes live on today.

It is not only in its applications that science can become ideological; ideologies also can be part of the formation of sciences. If natural sciences are not hermetically sealed off from society, but instead are permeable to social values, power relations, or dominant norms of an era, then it is possible for science to reflect the ideologies of its practitioners. This can have a particularly pernicious effect when the ideologies that make their way into the science are then claimed to be results derived from the science. Those ideologies, now “naturalized,” have sometimes been granted added credibility because of their supposedly scientific derivation.

Not all sciences seem equally susceptible to ideological influence or appropriation. Ideologies seem to have closer connections to those sciences investigating topics nearer to human concerns. Sciences that claim to bear upon immigration restrictions, government, or human sexuality find wider audiences and wider disputes than scientific conclusions limited to barnacle morphology or quantum gravity.

The potential for science to become entwined with ideology does not necessarily undermine scientific claims or detract from science’s epistemic and cultural value. It hardly makes science trivial, or just one view among others. Science must be used well and taken seriously in order to solve real-world challenges. Part of taking science seriously involves judicious analysis of how ideologies might influence scientific processes and applications.

The topic is vast, and this article confines itself to some historical cases that exemplify significant interactions between science and ideologies.

Table of Contents

  1. Terminology
  2. Science and Political Economy
  3. Science and Race
  4. Science and Gender
  5. Science and Religion
  6. Science as Ideology: Scientism
  7. Conclusion
  8. References and Further Reading

1. Terminology

First, a brief note about definitions. What exactly is meant by “science” and by “ideology”? Much has been written attempting to define these concepts, but we only need the broad outlines of such attempts before moving on.

The word “science” derives from the Latin scientia, or knowledge. It has historically been closely associated with philosophy. At least since the Renaissance, the term has acquired connotations of theoretical, organized, and experiential knowledge.

In the 17th century, a constellation of practices, ideas, and institutions among natural philosophers contributed to what most historians recognize as the advent of modern science. Galileo Galilei, René Descartes, Francis Bacon, Robert Boyle, and Isaac Newton (all of whom considered themselves philosophers) wrote texts that subsequent practitioners lifted up as exemplary of the “new philosophy.” While there was no universal agreement on exactly what this new philosophy consisted of, some of the most salient elements included the rejection of Aristotelian forms and final causes; the attempt to account for most natural phenomena in terms of efficient causes operating according to laws of nature; the identification and quantification of objective “primary qualities” such as mass and velocity; and the introduction of experimental practices using the controlled operation of idealized or contrived events as evidence for nature’s operation.

Science encompasses two distinctive strands: a body of knowledge and a coordinated set of instrumental activities that generate technological or engineering solutions. The former continues the legacy of natural philosophy through its aim to understand, explain, and predict the world. The latter strand has the more pragmatic aim of building tools and solving problems. Perhaps unsurprisingly, philosophers have paid most attention to the first, natural philosophical, strand of science.

In the mid-20th century, philosophers launched a vigorous campaign to correctly characterize science and thus distinguish it from illegitimate forms of knowledge or pseudoscience. If the scientific method could be correctly identified, they supposed, then the right method for knowledge generation could be secured, and there would be a better way to jettison dubious, nonscientific, or merely ideological claims. For example, Karl Popper was famously keen to exclude Marxist historiography and Freudian psychoanalysis from the province of science. Along with Popper, Imre Lakatos and others contributed to a sophisticated body of literature on scientific method, attempting to square the idea of characteristic and rational rules of science with the historical record of dynamic, changing scientific theories and practices. Paul Feyerabend, by contrast, urged abandoning the search for rules of science altogether; he argued that, since science is a creative and evolving enterprise, there is no specific method it ever did, or should, follow.

The campaign to distinguish science from pseudoscience has now largely subsided with no clear resolution. Some philosophers see scientificity as a matter of degree that can be instantiated to a greater or lesser extent according to how systematic the study may be. Nonetheless, a single definition of science remains elusive. The diversity of activities and methods used across the natural sciences makes it difficult to find anything that neatly separates sciences from other human activities not typically considered scientific, like auto mechanical work. As one philosopher put it, “Why should there be the method of science? There is not just one way to build a house, or even to grow tomatoes. We should not expect something as motley as the growth of knowledge to be strapped to one methodology” (Hacking 1983).

Much like science, “ideology” is notoriously difficult to pin down as a single, determinate concept. The term was originally proposed around the year 1800 to be, quite literally, a science of ideas: a way to rigorously study humans’ ideas as part of natural history. The term’s creator, Destutt de Tracy, even imagined this new science as a branch of zoology.

But the word has since changed its meaning and today frequently carries a negative connotation. In informal discourse, “being ideological” is often a pejorative label used to accuse someone of being blinkered to reality by a particular set of beliefs. This pejorative sense of ideology comes largely from classical social theorists, especially Karl Marx. For Marx, to be in the grip of a false ideology was to naively adopt ruling class ideas about art, religion, ethics, or politics, which are actually explained by that society’s economic structure. Those ideologies, Marx believed, generated a false consciousness about one’s own world and diverted one’s attention from true sources of oppression (Marx and Engels 1938). While ideologies claim to describe the way things are, Marx claimed that in reality they function to defend political structures underpinning class hierarchies. Marx diagnosed and critiqued such ideologies, hoping thereby to liberate individuals from self-oppression and to bring about social reforms. In this tradition, ideology was often seen as antithetical to science. This conceptual contrast between science and ideology has largely been passed down to us today, for example, when science is imagined to be quintessentially nonideological.

Following Marx, subsequent theorists extended views of ideology and why it might be harmful. Political philosopher Hannah Arendt criticized ideology for the way it short-circuits substantive political debate. Ideologies posit basic tenets or first principles, such as racial purity, class struggle, or free markets, from which other ideas automatically follow. According to Arendt, ideologies have a pernicious role in replacing genuine ethical debate with their own abstract and internal logic. Promising certainty, ideologies run roughshod over tradition, concrete historical particulars, and the difficult business of moral deliberation (Arendt 1973).

This article does not adhere solely to theoretical frameworks that criticize ideology, and so it treats “ideology” in its broader and more neutral sense, as a description of the organizing beliefs of a population. This second, broader use accords with the practices of empirical anthropology, which might seek to describe the organizing beliefs of a foreign culture. When conceived of in this descriptive sense, ideologies may be understood as necessary or positive for many political purposes. Ideologies in this sense are merely ways of interpreting or “mapping” our political and social environments (Freeden 2003).

Some important features are common to both the pejorative and more neutral senses of ideology. First, ideologies are beliefs that legitimate or stabilize social power structures. Broadly speaking, ideologies relate to politics because they have a social function, and as such they can engender a sense of group identity or motivate the need for action. Second, ideologies are not always transparent to those who hold them. It is often easier to recognize ideology in others than in oneself. Third, ideologies involve beliefs that are closer to the center of one’s web of belief. That is to say, they are not easily acquired and released, because they play a structural role in how we see things, what is construed as evidence, and sometimes even personal identity. Fourth, there is typically a complex admixture of descriptive and prescriptive elements to ideologies: Their defense would appeal to the way things are and how things ought to be (Seliger 1976).

We need not dwell on these attempts to define such complex terms as science and ideology. It is worth noting, however, that particular definitions of the terms would render an analysis of science and ideology much less significant, or even meaningless. If science were just descriptive and ideology just prescriptive, then perhaps they would be two radically different sorts of things, and the two should never meet, since, according to some philosophers working in the tradition of David Hume, an “is” cannot generate an “ought.” On this view, they could not overlap without some improper transgression of one into the rightful territory of the other. However, ideologies are not just wishful desires; they are informed by some facts and make claims about the way the world is. Conversely, some philosophers argue that science is not accurately characterized as a body of value-free, purely descriptive facts, but is instead laden with values (Douglas 2009).

A second set of definitions that might render the topic of science and ideology less meaningful would treat science as essentially or only ideological in nature, so that the two terms wholly collapse into one another. If science were just politics by other means, then perhaps “science” would not add anything new to an investigation of “science and ideology.” But this collapse can be resisted. While we can fruitfully analyze the generation and transmission of scientific knowledge in its purely social and anthropological dimensions (that is, without reference to truth or to any unconditioned external reality), this does not make science nothing but ideology. Ignoring the distinctiveness of the world from human cognition risks an untenable relativism.

Accordingly, we may rest content with broad and common notions of science and ideology, recognizing that they label many different things and that their boundaries are not precise. This need not hinder investigation. Prototypically at least, sciences are not just ideologies. There may be overlap in the real-world history of science, but the terms regularly and usefully label distinct notions.

2. Science and Political Economy

Many well-known discussions of ideological influence on science illustrate how ideology can warp science. One notorious episode frequently construed as an ideological distortion of science is from mid-20th century Soviet biology, when the agricultural research of Trofim Lysenko was at the center of a broader effort to shape a uniquely Soviet biology (Roll-Hansen 2005; Graham 2016). Lysenko and others claimed that grain growth and heredity could be significantly influenced by environmental alterations such as treating the seeds with cold and moisture, and that such alterations could lead to improved crop yields and the reformulation of genetics writ large. The claims about temperature effects are true, while the latter claims are contested and more problematic. The ideological forces contributing to the rise of Lysenko’s science were at least twofold. First was a Soviet concern that natural science should address practical problems and contribute to the common good of the people; the connection with agriculture was obvious in this period of scarcity and famine. Second was the Marxist precept that organisms are shaped primarily by their environments rather than determined by innate biological traits. Some Soviet scientists and politicians of the period understood Western genetics to be corrupted by capitalist notions of competition, innateness, and individualism, while they saw Western science more generally as unduly prioritizing pure theoretical science disconnected from the needs of the masses. While there was some merit in such critiques, Lysenkoist science was a failure on its own terms: Crop yields were not radically improved. Moreover, and perhaps most importantly, Stalin’s explicit approval of Lysenkoism as officially Soviet, and the ensuing eradication of a critical research community (including the imprisonment of dissenting scientists), contributed to the precipitous decline of Soviet genetics in this period. Political power structures that hinder open and critical debate damage science.

Ideological influence is exerted not only upon scientific research but also upon the dissemination of that research. Popular understanding of science is crucial for public policy formation, and that understanding can be shaped by any number of forces. For example, multiple independent lines of evidence established a link between cigarette smoking and lung cancer in the 1940s and 1950s, yet the tobacco industry, aware of these health effects, lobbied think tanks, academics, and media executives to disseminate a message that this science was inconclusive. The industry’s efforts were immensely successful: For decades afterwards, many Americans, including medical doctors, reported believing that science had no conclusive evidence for such a link (Michaels 2008; Brandt 2012; Proctor 2012). The same tactics of purposefully manufacturing scientific uncertainty have been deployed to spread ignorance about scientific knowledge of acid rain, ozone depletion, and greenhouse gas emissions (Oreskes and Conway 2010). Behind this campaign of manufactured doubt has been a political concern that some science could be used to support environmental or public health regulations, thus threatening the unregulated markets that some groups find central to political economics.

While ideologies can distort science and its popular understanding, it is important to point out that many of the classic studies of science and ideology investigated which ideologies provided the best contexts for scientific advance (Bernal 1939, Merton 1942). An important thesis concerned whether Western-style liberal democracies could be the best political arrangements for the production of quality science. One idea here was that good science may require a kind of openness to critique that is essentially a political ideal, and that such openness also underpins liberal democracies. One contrast, during this time period, was the Soviet Union’s communism, which excelled in centralized planning of science. State direction of scientific activities contributed to the Soviet Union’s Cold War successes, such as Sputnik, and such strategies were also sometimes used by the US, for example in its Manhattan Project. Political ideologies shape science through funding, planning, institutionalization, and their political ethos.

Much discussion has also been generated by the question of which political or economic ideologies might be supported by particular scientific theories. To take just one example, the theory of evolution by natural selection has been used to legitimate multiple and incompatible political ideologies, from conservative politics and laissez-faire capitalism to socialism.

Biology has often been used to reinforce essentialist, individualist, and conservative doctrines. If people are who they are because of innate traits, and society is the way it is because of those traits too, then it seems as if nature itself underwrites the political order. On this view, class structure has its particular form because the upper classes have the right stuff in their blood. Attempts to change the political order, then, would mean not just fighting a status quo, but fighting nature itself. Such ideas, sometimes called “biological determinism,” minimize the influence of environments, history, and culture in shaping societies or individuals and are typically used to oppose efforts to shape society through education, welfare programs, or other promotions of social mobility.

Biology has also been used to bolster a specifically capitalist ideology that places competition in the center of its worldview. The idea here is that just as organisms’ competition for scarce resources eventually generates evolutionary change by weeding out the unfit, so also individual competition should yield social and economic progress. One source for this view in the 19th century was scientific naturalist Herbert Spencer, the pre-Darwinian popularizer of evolution who coined the term “survival of the fittest.” Spencer’s view of evolution was all-encompassing and ardently progressive, positing competition at the center of a process yielding a more harmonious “social organism.” Spencer imagined a biological process responsible for progress in social, political, economic, and even racial dimensions. While Spencer did not intend to justify corporate or state rapaciousness, his popular evolutionary narrative was adopted by others to justify laissez-faire capitalism. Upon studying Spencer, American industrialist Andrew Carnegie testified, “I remember that light came as in a flood and all was clear… I had found the truth of evolution. ‘All is well since all grows better’ became my motto, my true source of comfort” (Carnegie 1920). Such ideas apparently meshed with Carnegie’s objection to government influence in commerce, his repudiation of workers’ unions, and his insistence that the concentration of capital by industrialists like himself was essential for social progress. Capitalists were confident nature was on their side.

Socialists were too. Many socialists seized on the materialist implications of evolution (that biological history could be explained in terms of natural laws) to support their view that social history was likewise governed by laws. Some said that Marx had anticipated Darwin by developing an evolutionary picture of social change. The philosopher Georgi Plekhanov went further, practically equating the two theories: “Marxism is Darwinism in its application to social science” (1956). Friedrich Engels thought that evolutionary theory provided evidence for the dialectical nature of historical change, which he argued was key to understanding social and natural history alike. Others saw evolution as evidence for socialism only when purged of its problematic framing as essentially competitive. The Russian scientist and philosopher Peter Kropotkin emphasized the centrality of cooperation in biological evolution; his (1902) study of mutual aid argued that a variety of mutualistic and altruistic behaviors had been largely underrepresented in contemporary biology in favor of the more gladiatorial frameworks deployed by British naturalists. For Kropotkin, the extent of cooperative behaviors in nature bore lessons for social organization writ large: While the “unsociable species” were “doomed to decay,” the more sociable ones were invariably “more prosperous,” open to “further progress,” “higher intellectual development,” and “further progressive evolution.” In turn, Kropotkin advocated a distinctive version of small-scale communism based on voluntary cooperative living.

Indeed, many have found nature replete with lessons about social order, and nature’s authority has been claimed by reactionaries and revolutionaries alike. Darwinism has been grafted onto political economics by various institutions and individuals to serve distinct ends. These combinations of Darwinism and political economics were then no longer straightforwardly scientific theories, but malleable cultural resources that could serve various interests.

Darwin’s evolutionary theory, postulating common descent and natural selection as a mechanism of change, has been accepted in broad outline by contemporary biologists. Moreover, there is a widespread expectation that evolution should inform and enrich many other areas of science and human life. How to use that theory, and what it means for our understanding of economics or politics, remain topics of continued debate. In particular, there is considerable ambiguity in the scope of evolutionary generalizations. Questions remain as to what phenomena evolution applies to, what it does or does not explain, and whether certain forms of social organization are more natural than, and therefore preferable to, others. Such questions are not settled by the biological data that were so influential in the theory’s adoption, and they remain contested today.

3. Science and Race

Racist societies have generated racist sciences. If, as was hinted above, science is sometimes permeable to social values, then it makes sense that racist ideologies could make their way into the questions, methods, and analyses of some scientists. Decades of diverse research programs were devoted to establishing the natural basis of European racial supremacy. In the 20th century, eugenics continued the legacy of racist science in its widespread adoption throughout Europe and North America.

Eighteenth and 19th century anthropologists regularly described non-European peoples and cultures as “savage,” “primitive,” and “uncivilized.” Their subjects were typically described in opposition to the “advanced” cultures that anthropologists imagined themselves part of. Early anthropology was closely linked with the colonial projects of Europe, and the notion that foreign peoples were incompetent to look after themselves fit well with the drive to colonize foreign places to extract their resources, bodies, and labor. This period gave rise to the notion that races are biological categories. While theorists continue to debate whether there are viable biological notions of race—for example, as lineages whose geographical isolation is responsible for superficial phenotypic differences (Kitcher 2007)—many contemporary anthropologists, biologists, and philosophers reject the notion that folk categories of race are real biological divisions (Baker et al. 2017; Gannett 2004; Witherspoon et al. 2007; Yudell et al. 2016; Winther and Kaplan 2013).

But if races were distinct biological populations, as many scientists of the 19th century believed, then one scientific task was to classify these distinct groups. An important question among these biologists was whether races descended from a single source—assumed to be Adam and Eve, according to their Christian beliefs—or from multiple, separate sources, perhaps from different places or different Adams. These hypotheses were labeled monogenism and polygenism. Polygenists found an important spokesperson in Harvard biologist Louis Agassiz. Quantitative evidence for Agassiz’s polygenism came from Samuel George Morton’s renowned biometrical measurements of cranial volumes. In this period, skull sizes were believed to be indicators of mental capacity, and Morton’s studies “found” just the answers he expected to find: Europeans had the largest cranial volumes. Such studies were later discovered to be badly compromised by selection bias, but not before they had a significant impact on social policies that disenfranchised non-Europeans. Agassiz, one of the most influential American biologists of the 19th century, used those studies to argue for polygenism, the innate inferiority of “colored races,” and by extension, for separate educational regimes for different ethnicities (Gould 1996).

Darwin hoped that the monogenism inherent to his own theory—this time evolutionary in character rather than creationist—would have remedial social effects. Because evolution posited common descent, emphasizing humans’ shared history, Darwin hoped it would diminish the scientific arguments for racial hierarchy, and therefore contribute to the demise of the slave trade that he abhorred (Desmond and Moore 2009). However, many scientists found that their racism was compatible with multiple scientific theories, including Darwin’s: If we all evolved from a common ancestor, they reasoned, then some of us are more evolved than others. Because evolutionary theory was widely understood as a kind of progressive force molding better and better organisms, it was sometimes used to separate the putatively advanced from less advanced humans, and such scientific hypotheses aligned with common social hierarchies of the time.

Some of the racist proclivities visible in the biometrical programs of cranial measurement persisted into later strands of psychology, including intelligence measurement. Intelligence tests, originally designed by Alfred Binet for diagnostic and remedial purposes, were later transformed by Henry Goddard, who interpreted the tests as indicators of an innate general intelligence. Goddard and many others in his wake used such tests to articulate the social “menace” posed by those of low intelligence, and also to argue for immigration restrictions. His IQ tests were administered to newly arrived immigrants at Ellis Island, where Goddard claimed they showed that about 80% of Jews, Hungarians, and Italians—groups that were often considered inferior races—were officially “feeble-minded.” Goddard concluded, “[T]he immigration of recent years is of a decidedly different character from the early immigration… We are now getting the poorest of each race” (cited in Gould 1996).

Underpinning many lines of such nativist and racist science was a belief in hereditarianism, the doctrine that heredity, rather than environmental influences, decisively shapes or even determines human character traits, including personality and intelligence. For example, many scientists believed that traits like criminality could be passed on from one generation to the next. This hereditarian doctrine, when combined with the modernist political will for social engineering and optimism that the nascent science of genetics would discover discrete underpinnings of traits like criminality, contributed to the rise of eugenics in the early 20th century.

Darwin’s cousin Francis Galton coined the term eugenics, meaning “good breeding,” in 1883 to describe the application of hereditary science to human improvement. The idea was to improve society through more selective reproduction; it could be manifest in positive eugenics, encouraging reproduction among the “right” kind of people; or negative eugenics, discouraging or prohibiting reproduction among the “wrong” kind of people. It was implemented around the world but especially in Europe and North America; records show that 20,000 people were sterilized against their wills in the state of California alone. While eugenics reinforced multiple social prejudices against the disabled, the poor, and the “feeble-minded,” racism was a central element of its broad agenda.

Eugenics garnered widespread support from many corners of public life, including conservatives, progressives, scientists, and the religious. As just one measure of its broad scientific backing, consider that no fewer than five presidents of the American Association for the Advancement of Science were members of the advisory board for the American Eugenics Society. Eugenics flourished in different forms of governments, including socialist, liberal democratic, and authoritarian (Mottier 2010). Galton hoped that eugenics might one day obtain the mass social appeal of “orthodox religion,” and this hope was not far off: Eugenics enjoyed broad support among Protestants, and there was even a sermon competition for best sermons supporting eugenics in America (Rosen 2004). While there was disagreement about how to implement eugenics, there were few institutional voices questioning whether eugenics should be implemented until the 1930s, when the Catholic Church voiced its opposition. British Catholic and public intellectual G. K. Chesterton (1922) was a noteworthy exception to the broad consensus favoring eugenics.

Madison Grant’s (1916) Passing of the Great Race extended hereditarian thinking with explanations of how climate molded Nordic superiority, leading to an advanced race of humans. Grant combined this notion of Nordic supremacy with the leitmotif of white fragility. Whiteness, in this tradition, was fashioned as dominant and innately superior, but at the same time fragile and threatened with imminent demise. Grant was an American amateur anthropologist, but he found a wide audience, including an overseas admirer, none other than Adolf Hitler, who mailed him a personal note of praise and called the book “my Bible.”

Hitler’s Third Reich was largely founded upon a biomedical ideology of “racial hygiene” (Proctor 1988). The regime is most infamous for its anti-Semitism, but its targeted killings began with the disabled, Roma people, homosexuals, and others who were thought to threaten the purity of the Nordic ideal advanced by Grant and others. Such ideals were construed as public health policies in Germany, backed by physicians in the name of national health. Those policies were continuous with—and in fact sometimes based on—policies arising from American eugenic programs (Kühl 1994, Whitman 2017). As late as 1934, American physicians in favor of forced sterilization laws lamented that “The Germans are beating us at our own game” (cited in Kevles 1985).

The eventual reaction against eugenics was based partly on collective horror of the atrocities of the Holocaust. In addition to this political change in temperament, there were also scientific repudiations of eugenics, notably from anthropologist Franz Boas and biologist Theodosius Dobzhansky. Dobzhansky argued that natural selection maintains variation in populations, and that such variation is biologically beneficial. Accordingly, the reduction of such genetic variation via eugenics would be disastrous (Beatty 1994, Paul 1994). In this way, Dobzhansky became one of the predominant critics of eugenics and defenders of human diversity.

4. Science and Gender

Gender ideologies are often visible in the history of theorizing the natural basis of sex (Tuana 1989, Keller and Longino 1989). Aristotle, a progenitor of biological science, writes that being a woman is essentially a deficiency, being a kind of incomplete male. In a series of psychological, anatomical, and physiological comparisons, he contrasts male and female organisms, typically highlighting females’ inferiority. Women are not only “less perfectly formed” than men, but they are even “mutilated” versions of men. Bewilderingly, given that he was such a careful observer, he even wrote that women have fewer teeth than men. For Aristotle, being female is often defined in terms of the female’s incapacities: to concoct blood, to produce semen, or to convert menses into something better. On the topic of reproductive contributions of males and females, he theorized that men pass on the “active principle” of the human form through their semen, whereas women contribute the passive material causes of the embryos.

Aristotle’s biological work was hugely influential for many centuries, and even later scientists noteworthy for challenging Aristotle’s authority still reaffirmed his traditional Greek view that women are biologically inferior to men (Lloyd 1983, Merchant 1990). The case of reproductive physiology is again illustrative. The Roman physician Galen, for example, attributed formal and material causes to both males and females, but nevertheless insisted on female inferiority because of their “imperfect” semen and because their genitalia were internal. Seventeenth century thinkers continued this line of research bolstering male superiority. William Harvey, most famous for his discovery of blood circulation, assigned efficient causes to both male and female reproductive powers, but still insisted that the male was “the superior and more worthy progenitor” (cited in Merchant 1990). Such work supported a predominant belief in Early Modern Europe that males were progenitors while females were essentially incubators.

These cases also illustrate how being female was interpreted as a deviation from the norm, the best, or the perfect. That womanhood was theorized as an alterity reflects an important fact about the homogeneous population doing the theorizing for most of the history of science, namely, that they were all men.

According to some 19th century psychologists, paleontologists, and anthropologists, women are more infantile, immature versions of men. Whether based on measurements of cranial volume or psychological development, the view here was that women remain in a childlike stage that males outgrow. Moreover, according to this thinking, women are biologically closer to animals and the “savage.” German zoologist and physiologist Carl Vogt wrote, “The female European skull resembles much more the Negro skull than that of the European man…[W]henever we perceive an approach to the animal type, the female is nearer to it than the male” (quoted in Russett 1989). Notice the confluence here with the above section on race, where evolutionary narratives were used to establish European supremacy; similar narratives were used to establish male supremacy (Milam 2010).

The physical sciences were also relevant for investigations into gender. In the wake of successful developments in thermodynamics and energy conservation, proponents of “limited energy theory” sought to explain sex differences in the human developmental process. Harvard physician Edward Clarke theorized that strenuous work in one part of the body limited ability and development of other parts of the body. “The brain cannot take more than its share without injury to other organs. It cannot do more than its share without depriving other organs of that exercise and nourishment which are essential to their health and vigor” (Clarke 1873). Limited energy theory had important ramifications for educational practices, according to Clarke, since women who sought the same educations as men diverted their energies from their bodies to mental work, thus risking “neuralgia, uterine disease, hysteria, and other derangements of the nervous system” (1873). Clarke warned that giving men and women equal educations threatened the very survival of the human species. While such theories might seem humorously arcane today, they were partly responsible for excluding generations of women from higher education.

More recent biological sciences, too, have been liable to rely on cultural gender prejudices when describing reproductive behavior and anatomy. Many have detected common Victorian gender prejudices in Darwin’s work, especially his writing on sexual selection (Roughgarden 2009, Richards 2017). The stereotype of the passive female and the adventurous, competitive male has proved remarkably enduring, apparently making its way into late 20th century cell biology. One consequence was an overemphasis on the passivity of the female egg during fertilization: The most influential cell biology textbook of the era described how “an egg will die within hours unless rescued by the sperm” (cited in Martin 1991). Such stereotypical metaphors, aligning with widespread gender ideologies, could impede science to the extent that they hinder investigations or descriptions at odds with culturally entrenched ideas. Indeed, subsequent discoveries of the egg’s active roles in fertilization were nevertheless slow to change biologists’ descriptions. Alternatively, such metaphors could unwittingly naturalize human cultural norms and make them seem unquestionable: “That these stereotypes are now being written in at the level of the cell constitutes a powerful move to make them seem so natural as to be beyond alteration” (Martin 1991).

One further aspect of gender is sexuality, and psychiatric science has both shaped and been shaped by sexual norms and ideologies. Twentieth century typologies of disease, notably the official manual of mental health known as the Diagnostic and Statistical Manual of Mental Disorders (DSM), pathologized homosexuality in an era when it was considered deviant. According to that standard, homosexuality was officially a psychiatric illness in the United States from 1952 to 1973, and variant categories of homosexuality persisted in the DSM through 1987. While homosexuality has since been de-pathologized in the medical community, some religious communities continue to advocate “reorientation therapy” to treat what they consider the malady of homosexuality (Waidzunas 2015). The history of many mental health disorders has been closely associated with social trends; being mentally healthy may often depend on social attitudes about the acceptable range of normalcy and variation.

5. Science and Religion

Religions can form the basis of totalizing belief systems encompassing cosmology, theology, politics, and ethics, and so for some theorists, religion is the quintessential ideology. Marx famously called religion “the opiate of the masses” and thought it was precisely the kind of ideology from which people needed liberation in order to understand power dynamics as they truly are. He thought that religions like Christianity served the interests of the ruling classes by placating adherents, making them less willing to acknowledge and confront manifest injustices by deferring justice to an afterlife rather than establishing a more equitable society on earth.

Accordingly, if religion is a typical ideology, then a familiar narrative contrasts religion with science, supposing they are locked in essential conflict with each other. This notion looms large in the popular imagination, and conflict is especially apparent as it has related to the interpretation of religious scriptures. Galileo’s condemnation by the Catholic Church partly involved the church’s resolution to control the interpretation of scripture, which was especially salient during the Counter-Reformation following the Council of Trent. The book of Joshua records that God stopped the sun (presumably from moving around the Earth), which the Church interpreted as evidence for a geocentric planetary order. Galileo suggested an alternative interpretation of the passage that was compatible with heliocentrism, but religious authorities of the 17th century were reluctant to let an outspoken astronomer dictate the correct meaning of scripture.

While strictly literal interpretations of scripture have not been standard in the Christian tradition, some Christians’ opposition to evolutionary theory today likewise hinges on their literal interpretation of religious texts, which they say describe how the world was created in seven days in the year 4004 BC, according to a traditional 17th century chronology by Bishop James Ussher. Evolutionary theory, positing species transmutation and an enormously extended historical timescale, found mixed reception among Christians in different times and places. In America, Darwinian evolution did not meet much resistance until the 1920s, when some Christian evangelicals and fundamentalists linked evolution with threats to favored theological and moral orders. At that time, there was little debate about the status of organic evolution among professional biologists or among most religious leaders, but its tenability was soon called into question especially as a way to influence secondary school curricula. It was in this connection that evolution became the topic of globally publicized courtroom dramas: first as the 1925 Scopes “Monkey” trial on whether evolution was allowed in a Tennessee classroom, and later as the 1982 federal district court decision on whether creationism was allowed in Arkansas classrooms (Ruse 1988). When creationism was judged to be a religious rather than scientific theory, and thus ruled out of biology classes, it morphed into intelligent design theory, which focused less on advancing specifically Biblical explanations, and more on challenging the status of evolutionary theory. Since the 1970s, creationists and intelligent design theorists alike have sought intellectual support from scientific and philosophical resources including Francis Bacon, Karl Popper, and Thomas Kuhn to argue that their preferred version of science was on equal footing with evolutionary theory (Numbers 2006).

Antievolutionism was not led primarily by churches but by individuals like William Jennings Bryan and George McCready Price. Bryan was the populist politician, a three-time presidential candidate of the Democratic party, who battled evolution at the Scopes trial. For Bryan, evolution was associated with moral decay and a decline of Biblical authority. Bryan thought Darwinism was implicated in the militant German nationalism of World War I and the decrease in religious belief among college-educated Americans. Despite Bryan’s renown as an opponent of evolution, he was primarily concerned with protecting the supernatural origin of humans, and in fact had no qualms with evolution in general or the standard reading of “days” in the book of Genesis as extended periods of time compatible with geological findings. Young-Earth creationism was born as the “flood geology” of Seventh-day Adventist George McCready Price, who posited a literal six-day creation narrative and a young-Earth chronology. While this was not the traditional Christian interpretation of Genesis, Price advocated for this strictest version of creationism because he was following the teachings of Adventist founder Ellen G. White, who claimed divine inspiration for her view that the fossil record was the result of the Noachian flood. While much of the rhetoric among creationists has focused on matters of Biblical interpretation, the fact that such strident literalist antievolutionism took form only in the 1920s, and did not catch on with a broader public until the 1960s, suggests that creationism is at least partly explained by social and political conditions unique to those periods, such as some Christians’ rejections of what they considered modernity’s excesses.

The supposition that there is an essential conflict between science and religion is often founded on the premise that they are pursuing the same goals (say, the true description of the world) and so are competing for the same territory. One narrative based on that notion of shared goals has it that science is displacing religious explanations of natural phenomena: Where mythological or religious explanations once sufficed, we now have true scientific explanations. However, the premise that science and religion share the same goals has been disputed from various quarters. Biologist Stephen Jay Gould argued that science and religion are “non-overlapping magisteria,” two realms concerned with two separate subject matters: science with facts and religion with values (Gould 1999, see also Brooke 2016). Reformed theologian Karl Barth, arguing from a very different perspective, theorized how science and religion rest on wholly separate foundations: science on empirical reality and religion on revelation. Such arguments are sensitive to the ways that sciences and religions evince distinctive ends and practices; perhaps they do not share the same goals after all.

If science and religion sometimes pursue separate goals with separate methods, then this diminishes the emphasis on conflict. Historically, at least, the emphasis on conflict is an incomplete way to tell the story of science and religion. It was not a common way to think of the relationship of science and religion until recently. The “conflict narrative,” as it is known by historians, dates only from the late 19th century, from influential if methodologically flawed history texts by John William Draper and Andrew Dickson White. No such totalizing conflict was perceived for most of the history of science (Brooke 1991, Numbers 2009, Harrison 2010).

While the sources of modern sciences are diverse, reaching back to ancient Greek and medieval Arabic and European roots, modern sciences were institutionalized in an overwhelmingly Christian Europe in the 17th century (see also Effron 2010). It would have been quite surprising, then, if this new “mechanistic philosophy,” as it was then known, was considered irreligious. It was not. Many of the architects of modern sciences were themselves Christians of one stripe or another, in whose minds there was no conflict between their own scientific and religious practices. To the contrary, for most of these early scientists, doing science was a pious activity especially befitting the religious, insofar as coming to know God’s creation was a way of coming to know the Creator. The tradition of natural theology, which sought to infer the existence or attributes of the Creator through the design apparent in the creation, was a religious framework for doing science for centuries (Re Manning 2013, Topham 2010). Kepler, Galileo, Newton, and many others believed that doing science amounted to deciphering the “book of nature”—a common theological metaphor that placed scientific investigation alongside the study of religious scripture. Robert Boyle, the 17th century chemist and namesake of Boyle’s Law, labored to ensure that the new mechanistic philosophy was not seen as threatening religious belief, but rather as more compatible with Christian theology than the reigning Scholastic approach of his time. In one passage, Boyle even advocated performing experimental science on the Sabbath, as it could be considered a form of worship (Davis 2007).

Accordingly, the conflict narrative does not capture most of the history of science and religion. Science advanced not despite, but often because of its religious significance to early scientists. As one historian writes, “a distinctive feature of the Scientific Revolution is that, unlike other earlier scientific programs and cultures, it is driven, often explicitly, by religious considerations: Christianity set the agenda for natural philosophy in many respects and projected it forward in a way quite different from that of any other scientific culture” (Gaukroger 2006). Impulses arising from within religious movements spurred and shaped the formation of natural sciences (Harrison 1998).

If contemporary historians reject the conflict view relating science and religion, they have adopted a more nuanced position known simply as the complexity thesis, which states that there is no single relation between science and religion. Such complexity should be entirely expected if science and religion are not stable, monolithic entities with timeless essences, but instead are labels for diverse, dynamic traditions of thought and practice. Consider briefly that there is no essential element shared across all religions—not even a general one such as belief in gods. It should not be surprising, then, that all those things called religions might not have a single relationship with science. Such complexity, then, provides a warning sign for all studies of science and religion: Sweeping narratives that so readily lend themselves to ideological or rhetorical purposes often ignore complexity at the cost of historical accuracy.

6. Science as Ideology: Scientism

Finally, it is worth noting a sense in which science itself can form a basis of an ideology. When science is credited as the one and only way we have to describe reality, or to state truth, such restrictive epistemology might graduate into scientism. According to this view, the only rationality is scientific rationality. Poetry, literature, music, fine art, religion, or ethics could not be considered sources of knowledge, according to this view, because they are not generated by scientific methods. Such fealty to the deliverances of science, especially at the expense of other ways of knowing, can become ideological, and scientism is the preferred description of such a view. While enthusiasm for science has been a part of its ethos since the Enlightenment, scientism goes beyond enthusiasm in its insistence that whatever falls outside the scope of science is not knowledge. Alternatively, scientism is sometimes used to refer more specifically to the uses of science to inform policy. If political issues are framed as scientific, so that scientific evidence alone can adjudicate the right policies, it constitutes a strongly technocratic move to replace politics with science, and such replacement can also be a form of scientism.

The use of the label “scientism” typically implies a negative judgment about a problematic fidelity to science, but a few theorists have embraced the label as well. There is no simple relationship between science and scientism. Many scientists reject scientism, while some humanities scholars promote it. When humanists decide they ought to work within a metaphysics they imagine to be scientific, they may feel compelled to adopt a materialist or reductionist framework rejecting traditional categories of humanistic inquiry, such as person, will, freedom, judgment, or agency. Insofar as natural sciences might not recognize those categories, some humanistic scholarship has been transformed—some would say attenuated—by the loss of such concepts (Pfau 2013).

We can identify at least four challenges for scientism. First, an overweening loyalty to science and rejection of nonscience may presuppose that such categories have discrete boundaries. As noted in Section 1, however, the longstanding attempt to characterize science through a definition or definitive methods has been largely unsuccessful. It has proven incredibly difficult to specify exactly what makes an approach to the world scientific, which obviously problematizes the derogation of nonscience. Second, the appeal to science can obscure the question of which parts of science are being drawn upon. If science consists of a variety of distinctive practices, answering many different questions with many different methodological approaches, then appeals to science simpliciter can obfuscate important questions about which science is being included, which omitted, and how it is analyzed. This is important because different scientific studies and methods often do not align to provide straightforward results: Separate analyses even of the very same data can yield remarkably divergent conclusions (Stegenga 2011). Third, proponents of scientism sometimes marshal their own scientific credentials to back their claims. In a society that grants so much cultural authority to scientists, those credentials can easily bestow rhetorical power. Nonetheless, scientific expertise does not automatically entail expertise in other areas, and it has proved all too easy for, say, some biologists to make philosophical and theological pronouncements without training in, or even appreciation for, those other fields of study. A fourth challenge faces scientism as a replacement for politics; the problem is that political debates are typically not exhausted by their scientific dimensions. Issues like climate change or race relations, for example, involve more than scientific results; they also include conceptions of justice, freedom, economics, and even religion, which are each infused with ethical concerns. Politics cannot be reduced to technical scientific problems, and so the attempt to convert essentially ideological debates into straightforward scientific hypotheses can misconstrue what is at stake and overlook important issues under debate (Oakeshott 1962, Bernstein 1976, Seliger 1976).

Insofar as science’s powers are rooted in methods aimed at studying nature independent of any ideologies, this also represents a limit to its application. While scientific inquiry can contribute to nearly any problem we face, science typically cannot determine the solutions to those problems on its own; to think otherwise is to fall prey to scientism. Most real-world problem solving involves more than just applying scientific results; it also involves complex philosophical and ethical judgments, whether or not those are explicitly articulated.

7. Conclusion

Although the politicization of science is often lamented, this article shows how frequently scientific knowledge has been intertwined with broader social and political concerns. History does not entail that such politicization is acceptable or inevitable. History does suggest it is nothing new. So long as we believe that science will matter to the things we care about most deeply, we should expect such contestations to continue in the future. Seen this way, ideological debates over science illustrate just how central science is in the modern world. Ideologically contested science is not a sign that we fail to value science; to the contrary, it shows us just how much all partisans agree that science is central to their advocacy. Of course, this can be problematic if science is misrepresented in order to justify particular interests.

Ideologues have often claimed science to be on their side. That is not surprising, given the cultural status of science, and given that ideologies are usually informed by some factual, putatively scientific claims. This article has shown how science has been used to support various ideologies.

It has also shown how ideologies can make their way into science. In the West, science has often been shaped by dominant ideologies which have privileged the white, the male, and the heterosexual, while demoting or pathologizing non-Europeans, women, and homosexuals. It seems clear that scientists have sometimes drawn on widely shared social beliefs when they are doing science, and that such ideologies can influence their science. Thus, it is problematic, to say the least, when those scientific results are then cited as independent evidence for the ideologies themselves (Lewontin 1992).

On the other hand, science has also been used as a check or bulwark against inhumane ideologies, such as Darwin’s fight against the slave trade or Dobzhansky’s arguments against eugenics. In these ways, ostensibly scientific disputes can also be sites of adjudicating ideological conflict, though such adjudication necessarily draws on more than just scientific data.

If ideologies can be assimilated into science, science has also challenged traditional beliefs and ideologies. As one classicist argues, “Ancient science is from the beginning strongly marked by the interplay between, on the one hand, the assimilation of popular assumptions, and, on the other, their critical analysis, exposure and rejection, and this continues to be a feature of science to the end of antiquity and beyond” (Lloyd 1983). Science and ideologies can adjust to one another, and this process is ongoing.

A close look at the history of science makes any clean-cut division between science and ideology appear artificially imposed. The history of science instead engenders a sense for the complex assortment and rearrangement of ideas that can problematize any straightforward isolation of the scientific from the ideological. Indeed, most contemporary historians and sociologists of science make sense of scientific changes partly by recognizing science’s permeability to cultural pressures. Political and religious frameworks can influence the questions scientists ask, which research they take to be significant, how they assess its importance, and even how long particular problems are worth pursuing.

As one historian put it, “The lines between science, ideology and world view are seldom tightly drawn” (Greene 1982). The point is that science has historically been enmeshed with social trends and beliefs that include ideologies. Historian Bob Young went so far as to claim that ideology is pervasive: “Ideology is an inescapable level of discourse” (Young 1971).

While the historical cases sketched above are well documented, the philosophical conclusions we might draw from them remain contested. For instance, one view is that they are unfortunate instances of science gone bad. Another is that perhaps they are cases where science is corrupted or objectivity is compromised. Optimistically, we might learn from them and try to remain more unbiased or ideologically neutral in the future. Perhaps self-awareness about our own social and political values will help secure more objective science.

However, it is possible that it will remain difficult to fully recognize exactly how broader patterns of thought, including background assumptions that are ideological in nature, influence scientific theorizing. Recent cognitive studies of implicit bias indicate that humans operate with biases they often do not recognize and which are difficult or impossible to eliminate. It remains to be seen how such biases might influence scientific theorizing. As was noted in section 1, ideologies are often difficult to recognize—especially in oneself—but their critical analysis is important not just for politics but for science as well.

Because ideologies are held by everyone, including scientists, they can sometimes explain why some scientific hypotheses are not pursued, while others are pursued or accepted uncritically. In his published writings at least, Darwin seems to have rejected out of hand the hypothesis that women could be cognitively equal to men; such equality would seem extremely implausible given the Victorian gender norms that Darwin generally shared. For other scientists, hypotheses such as the genetic determination of intelligence have been uncritically accepted because they fit a favored ideological narrative (Richardson 1984).

It is possible that ideologies find their way into science more effectively among homogeneous groups of scientists. Examples such as the longstanding research program of white men asking why women and minorities were so much less intelligent are at the very least suggestive. Who is doing the science may very well influence what scientific questions are asked, which of course relates to what conclusions are reached. Some philosophers argue that more diverse groups of inquirers can foster objectivity. On this view, the lack of diversity in science is no mere political or moral problem, but an epistemic problem. Insofar as modern sciences are no longer primarily the pursuit of individuals, but a collective enterprise to be analyzed at the community level, objectivity might best be achieved among groups with different backgrounds or life experiences (Longino 1990). Analyses of the relationship between social position and scientific knowledge were pioneered by feminist philosophers but have since become mainstream (Richardson 2010). Some empirical evidence indeed suggests that ethnic and geographic diversity among researchers can improve scientific results (Adams 2013; Freeman and Huang 2014).

8. References and Further Reading

  • Adams, Jonathan. 2013. “Collaborations: The fourth age of research.” Nature 497: 557-560.
  • Arendt, Hannah. 1973. The Origins of Totalitarianism. New York: Harcourt Brace Jovanovich.
  • Baker, Jennifer L., Charles N. Rotimi, and Daniel Shriner. 2017. “Human ancestry correlates with language and reveals that race is not an objective genomic classifier.” Scientific Reports 7: 1572.
  • Beatty, John. 1994. “Dobzhansky and the Biology of Democracy: The Moral and Political Significance of Genetic Variation.” In The Evolution of Theodosius Dobzhansky, edited by Mark B. Adams. Princeton: Princeton University Press.
  • Bernal, J. D. 1939. The Social Function of Science. New York: The Macmillan Company.
  • Bernstein, Richard J. 1976. The Restructuring of Social and Political Theory. New York: Harcourt Brace Jovanovich.
  • Brandt, Allan M. 2012. “Inventing Conflicts of Interest: A History of Tobacco Industry Tactics.” American Journal of Public Health 102 (1): 63–71.
  • Brooke, John Hedley. 1991. Science and Religion: Some Historical Perspectives. Cambridge: Cambridge University Press.
  • Brooke, John Hedley. 2016. “Order in the Relations Between Religion and Science? Reflections on the NOMA Principle of Stephen J. Gould.” In Rethinking Order, edited by Nancy Cartwright and Keith Ward. London: Bloomsbury Academic.
  • Carnegie, Andrew. 1920. Autobiography of Andrew Carnegie. Boston: Houghton Mifflin.
  • Chesterton, G.K. 1922. Eugenics and Other Evils. London: Cassell and Company, Limited.
  • Clarke, Edward. 1873. Sex in Education. Boston: James R. Osgood and Company.
  • Davis, Edward B. 2007. “Robert Boyle’s Religious Life, Attitudes, and Vocation.” Science & Christian Belief 19 (2): 117-138.
  • Desmond, Adrian and James Moore. 2009. Darwin’s Sacred Cause. Boston: Houghton Mifflin Harcourt.
  • Douglas, Heather. 2009. Science, Policy, and the Value-Free Ideal. Pittsburgh: University of Pittsburgh Press.
  • Effron, Noah. 2010. “The Myth that Christianity Gave Birth to Modern Science.” In Galileo Goes to Jail and Other Myths about Science and Religion, edited by Ronald L. Numbers. Cambridge, MA: Harvard University Press.
  • Freeden, Michael. 2003. Ideology: A Very Short Introduction. Oxford: Oxford University Press.
  • Freeman, Richard B. and Wei Huang. 2014. “Collaboration: Strength in diversity.” Nature 513: 305.
  • Gannett, Lisa. 2004. “The Biological Reification of Race.” The British Journal for the Philosophy of Science 55 (2): 323–345.
  • Gaukroger, Stephen. 2006. The Emergence of a Scientific Culture: Science and the Shaping of Modernity, 1210-1685. New York: Oxford University Press.
  • Gould, Stephen Jay. 1996. The Mismeasure of Man. New York: W.W. Norton & Company.
  • Gould, Stephen Jay. 1999. Rocks of Ages: Science and Religion in the Fullness of Life. New York: Library of Contemporary Thought.
  • Graham, Loren. 2016. Lysenko’s Ghost: Epigenetics and Russia. Cambridge: Harvard University Press.
  • Hacking, Ian. 1983. Representing and Intervening. Cambridge: Cambridge University Press.
  • Harrison, Peter. 1998. The Bible, Protestantism, and the Rise of Natural Science. Cambridge: Cambridge University Press.
  • Harrison, Peter (ed.) 2010. Cambridge Companion to Science and Religion. Cambridge: Cambridge University Press.
  • Keller, Evelyn Fox and Helen E. Longino, eds. 1996. Feminism and Science. Oxford: Oxford University Press.
  • Kitcher, Philip. 2007. “Does ‘Race’ Have a Future?” Philosophy and Public Affairs 35 (4): 293-317.
  • Kevles, Daniel. 1985. In the Name of Eugenics. Cambridge, MA: Harvard University Press.
  • Kühl, Stefan. 1994. The Nazi Connection: Eugenics, American Racism, and German National Socialism. New York: Oxford University Press.
  • Lewontin, R. C. 1992. Biology as Ideology. New York: HarperCollins.
  • Lloyd, G. E. R. 1983. Science, Folklore and Ideology. Cambridge: Cambridge University Press.
  • Longino, Helen. 1990. Science as Social Knowledge. Princeton: Princeton University Press.
  • Martin, Emily. 1991. “The Egg and the Sperm.” Signs 16 (3): 485-501.
  • Marx, Karl and Friedrich Engels. 1938. The German Ideology. London: Lawrence & Wishart.
  • Merchant, Carolyn. 1990. The Death of Nature. New York: Harper Collins.
  • Merton, Robert K. 1942. “A Note on Science and Democracy.” Journal of Legal and Political Sociology 1: 115-126.
  • Milam, Erika Lorraine. 2010. “Beauty and the beast? Conceptualizing sex in evolutionary narratives.” In Biology and Ideology from Descartes to Dawkins, edited by Dennis R. Alexander and Ronald L. Numbers. Chicago: University of Chicago Press.
  • Mottier, Véronique. 2010. “Eugenics and the State: Policy-Making in Comparative Perspective” in Bashford, Alison and Philippa Levine, eds. The Oxford Handbook of the History of Eugenics. Oxford: Oxford University Press.
  • Numbers, Ronald L. 2006. The Creationists. Cambridge: Harvard University Press.
  • Numbers, Ronald L., ed. 2008. Galileo Goes to Jail and Other Myths about Science and Religion. Cambridge: Harvard University Press.
  • Oakeshott, Michael. 1962. Rationalism in Politics and Other Essays. London: Methuen & Co Ltd.
  • Oreskes, Naomi and Erik Conway. 2010. Merchants of Doubt. New York: Bloomsbury Press.
  • Paul, Diane B. 1994. “Dobzhansky in the “Nature-Nurture” Debate.” In The Evolution of Theodosius Dobzhansky, edited by Mark B. Adams. Princeton: Princeton University Press.
  • Pfau, Thomas. 2015. Minding the Modern. Notre Dame: University of Notre Dame Press.
  • Plekhanov, Georgi. 1956. The Development of the Monist View of History. Moscow: Foreign Languages Publishing House.
  • Proctor, Robert N. 1988. Racial Hygiene. Cambridge, MA: Harvard University Press.
  • Proctor, Robert N. 2012. “The history of the discovery of the cigarette-lung cancer link: evidentiary traditions, corporate denial, global toll.” Tobacco Control 21 (2): 87-91.
  • Re Manning, Russell. 2013. The Oxford Handbook of Natural Theology. Oxford: Oxford University Press.
  • Richards, Evelleen. 2017. Darwin and the Making of Sexual Selection. Chicago: University of Chicago Press.
  • Richardson, Robert C. 1984. “Biology and Ideology: The Interpenetration of Science and Values.” Philosophy of Science 51 (3): 396-420.
  • Richardson, Sarah S. 2010. “Feminist philosophy of science: history, contributions, and challenges.” Synthese 177 (3): 337–362.
  • Roll-Hansen, Nils. 2005. The Lysenko Effect: The Politics of Science. New York: Humanity Books.
  • Rosen, Christine. 2004. Preaching Eugenics. New York: Oxford University Press.
  • Roughgarden, Joan. 2009. The Genial Gene: Deconstructing Darwinian Selfishness. Berkeley: University of California Press.
  • Ruse, Michael (ed). 1988. But Is It Science? The Philosophical Question in the Creationism/Evolution Controversy. Buffalo: Prometheus Books.
  • Russett, Cynthia Eagle. 1989. Sexual Science. Cambridge: Harvard University Press.
  • Seliger, Martin. 1976. Ideology and Politics. London: George Allen & Unwin Ltd.
  • Stegenga, Jacob. 2011. “Is meta-analysis the platinum standard of evidence?” Studies in History and Philosophy of Biological and Biomedical Sciences 42 (4): 497–507.
  • Topham, Jonathan R. 2010. “Biology in the Service of natural theology: Paley, Darwin, and the Bridgewater Treatises.” In Biology and Ideology from Descartes to Dawkins, edited by Dennis R. Alexander and Ronald L. Numbers. Chicago: University of Chicago Press.
  • Tuana, Nancy. 1989. Feminism and Science. Bloomington: Indiana University Press.
  • Waidzunas, Tom. 2015. The Straight Line. Minneapolis: University of Minnesota Press.
  • Winther, Rasmus and Jonathan Kaplan. 2013. “Ontologies and Politics of Biogenomic ‘Race.’” Theoria 136 (60), No. 3: 54-80.
  • Witherspoon, D. J., S. Wooding, A. R. Rogers, E.E. Marchani, W. S. Watkins, M. A. Batzer, and L. B. Jorde. 2007. “Genetic Similarities Within and Between Human Populations.” Genetics 176 (1): 351–359.
  • Young, Bob. 1971. “Evolutionary Biology and Ideology: Then and Now.” Science Studies 1: 177-206.
  • Yudell, Michael, Dorothy Roberts, Rob DeSalle and Sarah Tishkoff. 2016. “Taking race out of human genetics.” Science 351 (6273): 564-565.

 

Author Information

Eric C. Martin
Email: eric_martin@baylor.edu
Baylor University
U. S. A.

Analytic Perspectives in the Philosophy of Music

The philosophy of music attempts to answer questions concerning the nature and value of musical practices. Contemporary analytic philosophy has tackled these issues in its characteristically piecemeal approach, and has revived interest in questions about the ontological nature of musical works, the experience of musical expressiveness, the value of music, and other considerations. Priority is normally granted to the philosophical clarification of pure (or absolute) music, that is, music that is not accompanied by lyrics or a program and is otherwise lacking any reference to extra-musical reality. This is because most of the puzzles in the philosophy of music arise with particular strength in the case of pure music. For instance, although it is easy to explain why we would describe as “sad” a song with lyrics conveying a sad story, it is harder to see why we would call a piece of instrumental music “sad.” Unless otherwise stated, the word “music” in this article refers to pure music, that is, instrumental music.

While it would be hard to point to uncontroversial solutions to any of these problems, this is not to deny that substantial conceptual clarifications have been made. In the case of musical expressiveness, a fundamental distinction has been traced, and is widely accepted, between the expression of emotions as the manifestation of psychological states and expressiveness as the mere presentation of the outward characteristics associated with emotions. Conflating the former with the latter gives rise to the mistaken assumption that emotional descriptions of music must refer to an actual emotional state either in the listener or perhaps in the composer.

The field of musical ontology is largely a reflection of debates in general ontology, although some issues are peculiar to the musical case. For instance, philosophers have debated whether the differences in appreciative focus across musical traditions warrant a different ontological characterisation of works in those traditions. Consider the case of rock music: the main focus is often the record as opposed to the live performance of the piece, which is arguably the critical focus in the Western classical tradition. This may suggest that we ought to construe the work of rock music as ontologically different from the work of classical music, as the former is a track, whereas the latter is a work for performance.

Finally, analytic philosophy of music has attempted to solve the riddle of musical value: how is pure music valuable to our lives if it makes no reference whatsoever to our world? The most original solutions to this problem have tried to show that it is precisely the music’s abstractness that explains its value and appeal.

Table of Contents

  1. Definitions of Music
    1. Definitional Proposals
    2. Related Issues
  2. Musical Expressiveness
    1. Two Basic Distinctions
    2. Accounts of Musical Expressiveness
      1. Arousal Theory
      2. Resemblance Theories
      3. Persona Theory
      4. Other Accounts
    3. Literalism vs. Metaphoricism
    4. Emotions Aroused by Music
      1. The Sceptical View
      2. Emotional Contagion
      3. Negative Emotions
  3. Ontology of Music
    1. Fundamental Ontology
      1. Nominalism
      2. Platonism
      3. Sceptical Views
    2. Comparative Ontology
      1. Rock
      2. Jazz
      3. A Sceptical View
    3. Performance Authenticity
  4. Musical Understanding
    1. Concatenationism
    2. Architectonicism
  5. Musical Value and Profundity
    1. Values of Music
    2. Profundity
  6. References and Further Reading

1. Definitions of Music

a. Definitional Proposals

In comparison to the extensive scrutiny devoted to the general definition of art, the definition of music has received little attention. One may be tempted to dismiss the need for a philosophical definition, as music textbooks routinely present definitions of music that are taken to be relatively uncontroversial. However, while music textbooks may be unanimous in defining music as sound sequences that present elements such as melody, harmony, and rhythm, none of these features is necessary for something to count as a piece of music. Moreover, the occurrence of melodic intervals and rhythmic patterns in natural contexts suggests that these features are also insufficient to make something music: there are melodic intervals in birdsong, pitched sounds produced by the howling of the wind, and rhythmic patterns in heartbeats, but none of these should count as music (at least under the reasonable assumption that music requires human agency).

Examined here are two prominent attempts at a definition of music, a sceptical view of those attempts, and issues broadly related to the definitional problem.

Jerrold Levinson starts from the intuitive notion that music is organized sound (“The Concept of Music” 269). While this may seem correct, it does not yield a definition with the intended scope, as it would include human speech, Morse code, animal calls, and countless other non-musical phenomena. A possibility is to amend the definition by specifying that the organized sounds in question are produced for the purpose of aesthetic appreciation. While this would exclude some of the examples mentioned above, it would also fail to include what are arguably central cases of music the purpose of which is not that of being appreciated aesthetically. This is the case for military music, some music accompanying ritual, at least some film music, and other instances of music in which its main function is not related to its aesthetic appreciation. The amended definition would also problematically include sound arts other than music, such as poetry. Levinson believes that these shortcomings may be resolved if we define the purpose of music as the enrichment or intensification of experience achieved through an active engagement with it, where the active engagement may include activities ranging from attentive listening, to dancing, and to marching to the music. Music for dancing, marching, or praying would thus be included in the definition, as our experience is heightened, intensified, or otherwise enriched by our active engagement with organized sounds. To this qualification we must add another one: in music we engage with sounds primarily as sounds. This further caveat is necessary to exclude cases such as spoken poetry, where our engagement with the sounds primarily aims at the linguistic meaning they convey. 
From these observations we arrive at a definition of music as “sounds temporally organized by a person for the purpose of enriching or intensifying experience through active engagement (for example, listening, dancing, performing) with the sounds regarded primarily, or in significant measure, as sounds” (“The Concept of Music” 273).

Against Levinson’s proposal, Andrew Kania observes that the above definition is too narrow (“Definition” 8). A musician’s daily practice of scales, or a violin tune played to startle a friend in the middle of the night, ought intuitively to count as music, yet they fail to meet the requirements set out by Levinson’s definition: scale practising is not meant to enrich or intensify experience, nor is one’s playing the violin to play a prank on a sleepy friend. More problematically, the whole category of Muzak is excluded by Levinson’s definition (by Levinson’s own admission), as Muzak is not produced with the purpose of enriching or intensifying experience, but rather with that of inducing a particular mood or attitude. Kania observes that this seems to confuse classificatory and evaluative issues: Muzak may be bad music, but it certainly is music.

These cases may tempt one to include in the definition features such as pitch and rhythm, as these may allow us to include the examples unduly excluded by Levinson. But to make these a necessary feature would make the definition too restrictive, in that it would exclude avant-garde music that lacks pitched sounds or a rhythm, such as Yoko Ono’s Toilet Piece (1971), which is constituted by the sound of a flushing toilet. Kania’s strategy to get out of this impasse is a disjunctive definition (“Definition” 12). His proposal reads as follows: “Music is (1) any event intentionally produced or organized (2) to be heard, and (3) either (a) to have some basic musical features, such as pitch or rhythm, or (b) to be listened to for such features” (Kania, “Definition” 11).

Note that the disjunction allows us to include both a musician’s practice routine, which meets condition 3(a), and cases such as Ono’s Toilet Piece, which lack such elements but presuppose that we would listen for such features, as they are typical of most music.
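The logical shape of Kania’s definition, two required conditions plus a third that can be met in either of two ways, can be made explicit in a short sketch. What follows is only an illustration under stated assumptions: the record type, its field names, and the truth values assigned to each example are hypothetical readings of the cases discussed above, not part of Kania’s own text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoundEvent:
    """Hypothetical record of the facts Kania's three conditions ask about."""
    name: str
    intentionally_produced_or_organized: bool   # condition (1)
    meant_to_be_heard: bool                     # condition (2)
    meant_to_have_musical_features: bool        # condition (3a)
    meant_to_be_listened_to_for_features: bool  # condition (3b)


def is_music(event: SoundEvent) -> bool:
    """Conditions (1) and (2) are both required; (3) is met by either disjunct."""
    condition_3 = (event.meant_to_have_musical_features
                   or event.meant_to_be_listened_to_for_features)
    return (event.intentionally_produced_or_organized
            and event.meant_to_be_heard
            and condition_3)


# The truth values below are assumed readings of the examples in the text.
scale_practice = SoundEvent("daily scale practice", True, True, True, False)
toilet_piece = SoundEvent("Ono's Toilet Piece", True, True, False, True)
birdsong = SoundEvent("birdsong", False, False, False, False)

for event in (scale_practice, toilet_piece, birdsong):
    print(event.name, "->", is_music(event))
```

On these assumed inputs, the practice routine qualifies through 3(a) and Toilet Piece through 3(b), while birdsong fails already at condition (1), since it is not intentionally produced.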

Against these attempts, Jonathan McKeown-Green has argued that definitions attempting to preserve our pre-theoretical intuitions as to what music is may fall short of providing what we reasonably expect from a definition of something. He suggests that definitions such as Kania’s and Levinson’s are ill-equipped to provide a “future-proof” definition of music, as further developments of current musical practices may change folk intuitions in such a way as to make their current definitions unable to include things that future folk intuitions would consider music. While McKeown-Green leaves open the possibility of future methodological refinements that may address these issues, his view casts a sceptical doubt on the definitional enterprise.

b. Related Issues

In addition to these disputes, which target clearly and specifically the definitional issue, other contributions address the question of what music is in more peripheral ways. For instance, Stephen Davies (“John Cage’s 4’ 33””) and Julian Dodd (“What 4’ 33” Is”) discuss the issue of whether silent pieces, such as John Cage’s famous 4’ 33”, should indeed count as music. While they both hold it should not, and prefer to classify it as a non-musical work for performance, they disagree about the nature of the work. According to Davies, 4’ 33” contains the environmental sounds that occur while it is being performed—he compares this to “an empty picture frame that is presented by an artist who specifies that her artwork is whatever can be seen through it” (459). Against this, Dodd holds that the work is merely about those environmental sounds. For, if the work is a work of performance art—something Davies grants—then it is impossible for it to include, as part of its content, sound events that are not performed by the work’s performers (6–8).

Other philosophers have focused on the distinction between natural and musical sounds, or, more generally, non-musical and musical sounds. Roger Scruton (19) distinguishes the latter two by the way we listen to them: we attend to non-musical sounds causally, as we are interested in the sounds’ sources, whereas musical sounds are listened to acousmatically, that is, independently from their sources.

John Andrew Fisher considers causal listening a possibility both in the case of musical sounds and natural sounds, but he draws the distinction between the two by specifying that they are produced by different objects: whereas natural sounds are produced by ecologically natural objects, musical sounds are produced by artefactual objects, such as musical instruments (“The Value of Natural Sounds”). This distinction grounds the otherness that is typical of natural sounds and the experience of inevitability that is associated with them. Additionally, Fisher characterises natural sounds as being attentionally unframed (a natural soundscape does not prescribe privileged focus on a foreground, whereas this happens regularly in the musical case), temporally unframed (a natural soundscape does not have a beginning, midpoint, or end), and unrepeatable (unlike most musical works) (“What the Hills Are Alive With”).

John Dyck has challenged both Scruton’s and Fisher’s accounts, on the ground that they leave unexplained the way in which natural and musical sounds coexist in sound art. Consider for instance works such as Jon Hopkins and King Creosote’s album Diamond Mine (2011), in which musical moments unfold over a background of environmental sounds. In mixed contexts such as this, we cannot appeal to incompatible ways of listening (causal vs. acousmatic) or incompatible standards of evaluation (attentionally and temporally unframed vs. framed). In other words, a suitable account of the distinction should not explain just the difference between the two types of sounds, but also their interaction. Dyck proposes the following dual distinction: natural and musical sounds differ causally, in that the former are caused by natural objects, the latter by artefactual objects, and acousmatically, in that the former “tend to have a greater variation of microtones, microrhythms, and microtimbres than human environments” (Dyck 298).

2. Musical Expressiveness

a. Two Basic Distinctions

Discussions of musical expressiveness are likely to begin by distinguishing between expressing an emotion and being expressive of an emotion. The distinction has been standard since at least Kivy (The Corded Shell, 1980), although it can be found earlier in Tormey (1971). Expressing an emotion means outwardly manifesting a felt emotional state. For instance, I feel sad and express my sadness by weeping and being downcast. For something to be expressive of an emotion, on the other hand, means merely to display the outward manifestations of such an emotion. For instance, a Saint Bernard’s face is expressive of sadness because its snout presents the drooping features associated with sadness, although the dog may be perfectly happy. Similarly, an actor’s behaviour on the stage over the course of a play is expressive of a number of emotions without the actor necessarily going through these emotions himself. This opposition distinguishes expressive contexts that require an actual emotional state (my behaviour is expressing sadness only if I am actually sad) from expressive contexts that do not require such a state (for the actor and the Saint Bernard to look sad, nobody needs to feel actual sadness). Contemporary analytic philosophers are inclined to take music to be an example of the latter case. While the emotions expressed by the music may often be related to actual emotions, such as when listening to a sad song leads us to feel sad, the music is expressive of emotions independently of anyone’s felt emotional state.

Another important, related distinction is between the emotions in the music and those in the listener. Lay people are inclined to confuse conceptually (if not phenomenologically) the emotions aroused by the music with the emotions expressed by the music. Consider this example: a happy song at a party makes someone feel cheerful. The lonely guy in the corner hears the cheerfulness of the song too, yet his depressed mood isn’t affected by it. Or if it is, the music’s happiness may even be a source of frustration. The contrast is between happiness as a state the music induces in the listener and happiness as a state attributed to the music itself. Section 2.b deals with accounts of the latter phenomenon, whereas section 2.d examines philosophical issues related to the former.

b. Accounts of Musical Expressiveness

i. Arousal Theory

While the previous section distinguishes the music’s emotional expressiveness from emotional arousal, an elegant view describes the former as an instance of the latter. In its crudest form, the idea explains the music’s expressiveness of an emotion in terms of the music’s disposition to arouse such an emotional state in a listener. This is the arousal theory of musical expressiveness.

In this basic form, the theory is doomed to failure. On the one hand, some listeners who perceive the music’s expressive character deny ever being moved to feel such emotions themselves. On the other hand, the emotions a piece of music has a disposition to arouse may differ from those we ascribe to the music itself: think again of the guy in the corner, who was frustrated by the music’s happiness. Additionally, the theory cannot explain the way in which expressiveness contributes to the music’s value: if expressiveness is reduced to emotional arousal, then a suitable emotion-inducing drug could supply whatever value is provided by the music’s expressive character. This goes against the intuition that the value of a musical piece’s expressiveness is intrinsically linked to the music and could not be retrieved otherwise. Finally, the arousal theory fails to explain why we would listen to music that is expressive of fear, anguish, or other negative states: if these expressive properties were to be analysed as the music’s disposition to arouse similar emotional states in us, we would probably refrain from listening to such music altogether (more about this in section 2.d.iii).

Derek Matravers defends a version of the arousal theory that he believes capable of facing these difficulties. He claims that the emotions aroused by music are not full-blown emotions, but rather feelings, as they are deprived of the cognitive component typical of emotions. Moreover, Matravers denies that the feeling aroused by the music is always, and only, the one ascribed to the music. Rather, the listener’s emotional response may vary, as does our emotional response to emotions in human beings. Sad music, for instance, is music which normally arouses emotional responses of the sort that would constitute an appropriate reaction to someone’s expression of sadness. These responses are arguably limited, but certainly are not restricted to sadness only. We may for instance appropriately react to sadness with compassion or pity.

While Matravers’ work remains a classic reading in contemporary analytic philosophy of music, his view is normally deemed incapable of solving at least some of the problems that threaten cruder versions of the arousal theory. Justine Kingsbury observes how in other contexts we hardly ever run together the expression of an emotion (or feeling) and its arousal. One may be saddened by other people’s happiness, or worried by someone’s continuous expressions of anger, or feel some sort of Schadenfreude when confronted with expressions of distress. Given the commonplace nature of the conceptual distinction between emotional expression and arousal, it would be weird to think that these should be analysed as equivalent in the musical case.

Matravers would presumably respond to this objection by saying that the two cases are akin as in both cases the appropriate response to emotional expression is an emotional response. We react with sadness (or pity) to someone’s sadness, and we react to sad music in a similar way. But this reply would need to deal with Kingsbury’s other objection: on what grounds can Matravers disqualify as inappropriate the reaction of the listener who does not feel appropriate emotional reactions to sad music? While there are reasons to describe as appropriate the emotional reaction to the misery of another human being, it is unclear in what sense the expressive character of inanimate objects such as musical works requires or invites an emotional reaction.

ii. Resemblance Theories

Resemblance theories of musical expressiveness hold that the music’s expressive properties are due to their resemblance to human expressive behaviour. This is probably the most widely supported philosophical theory of musical expressiveness, and it was first independently proposed by both Stephen Davies (“The Expression of Emotion in Music”) and Peter Kivy (The Corded Shell).

While the two versions of the theory are often discussed together, it is worth stressing their differences. In order to do so, I consider Kivy’s theory first, and then move on to Davies’. After that, I consider objections raised against both views.

The resemblance theory defended by Kivy is known as the contour theory of musical expressiveness. It owes its name to the intuition that the reason why music is expressive of emotions is to be found in the resemblance between melodic contour and human emotional prosody. In other words, music expressive of sadness sounds like human speech when we are in the grip of sadness, and so it acquires its expressive character. According to Kivy, resemblances between music and human behaviour are not limited to vocal behaviour, but also include resemblances to bodily behaviour. Music that is sad moves downwards and slowly, whereas happy music is sprightly and often proceeds by leaps.

According to Kivy, resemblance is not the only source of musical expressiveness. He claims that it is impossible to make sense of the expressive character of some elements of the Western musical tradition on the grounds of their resemblance to human expressive behaviour. His example is that of major and minor chords, which do not resemble in any salient way the vocal or bodily behaviour of happy and sad people, yet are consistently described as happy and sad respectively. Kivy’s solution is to assume that some musical features acquire their expressive character by convention (The Corded Shell 80). This is not unproblematic: how could we successfully establish the conventional connection between sadness and minor chords if these sounded entirely neutral at first?

Davies calls his own theory appearance emotionalism. It holds that music is expressive because it resembles emotion characteristics in appearance, that is, the outward manifestations of human emotions. Davies is inclined to stress the importance of the music’s resemblance to human bodily expressive behaviour, as opposed to vocal behaviour (“Artistic Expression” 182). The theory shares with Kivy’s contour theory the idea that music’s expressive character depends on its resemblance to human expressive behaviour and is independent from any actual emotion in the composer or in the listener. Davies points out three main differences between his view and Kivy’s (Musical Meaning 260–267). First, he denies that music, strictly speaking, expresses emotions, as it merely presents the aural appearance of expressive behaviour, and this does not warrant talk of expression. Second, he concedes that music may express Platonic attitudes, that is, emotional states that require an object, such as admiration, pride, or hope. According to Davies, this may be achieved by suitably long and complex musical passages, which convey the succession of feelings and behavioural components typical of such attitudes. Third, Davies claims that music may be about the emotion it expresses, whereas Kivy holds to the formalist view that music isn’t about anything at all. While emotion characteristics in appearance do not by themselves refer to the emotion they are expressive of, they may do so in the appropriate context. Think of using a picture of a Saint Bernard’s sad-looking face to show how you are feeling. In this case, the emotion characteristic in appearance presented by the picture would be referring to your emotional state. Likewise, music can be about the emotions it presents.

A historical note: the intuition that the music’s expressive power lies in its resemblance to human expressive behaviour is an old one and can be traced back to Plato. The resemblance theories proposed by Kivy and Davies advance this idea while at the same time detaching it from the assumption that the emotions in the music had to be related to actual emotional states either in the listener or in the composer. In this resides their main element of novelty.

Resemblance theories have been criticised on numerous grounds. Various commentators have argued that, while they may correctly characterise resemblance to human expressive behaviour as (part of) the explanation of why we hear music as expressive of emotions, they fail to characterise the experience of musical expressiveness (see Levinson “Musical Expressiveness” 195–199, and Matravers ch. 7).

Additionally, Levinson has argued against Davies that appearance emotionalism is unable to describe what would count as the musical presentation of emotion characteristics: “(w)e can give content to ‘sad human appearance’ by glossing it as ‘the appearance sad humans typically display.’ But we can’t analogously give content to ‘sad musical appearance.’ There is no such thing as the appearance or kind of appearance that sad music typically displays” (“Musical Expressiveness” 197).

iii. Persona Theory

Levinson has defended the view that musical expressiveness is essentially the expression of a fictional musical agent, or “persona.” His assumption is that expressiveness can make sense only if it is reduced to some kind of expression: the puzzle of expressiveness is to understand how it is possible for some objects deprived of a psychological life, such as works of music, to be described as possessing psychological properties like happiness or sadness. The riddle is readily solved if we postulate that whenever we hear expressive music, we are hearing it as a fictional musical agent’s expression of emotions.

Critics of Levinson’s view tend to stress how competent listeners seem to be able to detect and appreciate the music’s expressive character without any imaginative engagement with a fictional agent they hear in the music (Davies, “Artistic Expression” 189). Levinson’s reply to this is that these processes may often not be conscious. A second, more radical objection to the persona theory holds that, even granting for the sake of the argument that we do in fact hear music as the expression of fictional individuals, a piece of pure music is typically unable to constrain a plausible and coherent narrative about its development. Is the work the expression of a single persona or multiple ones? Is the dialogue between the strings and winds a fight between two imaginary agents or the internal struggle of a single one? In other words, a work of music underdetermines the coherent narratives in terms of musical personae it may elicit. The problem with this is that it is unlikely that all of these narratives will result in a similar verdict with regard to the piece’s expressive character (Davies, “Artistic Expression” 190).

iv. Other Accounts

Jenefer Robinson’s account in Deeper than Reason is noteworthy for the attention it devotes to empirical research on emotions, as well as for its attempt to develop a notion of expressiveness that could be applied to art forms other than music. According to Robinson, highly expressive works of art allow the appreciator to feel what it is like to be in the emotional state the work is expressive of (see, for instance, Robinson 290).

Unlike Levinson, Robinson does not believe that all expressiveness requires an expressing persona. She contends, however, that some music in the Western canon invites such a listening. Relatedly, it is also noteworthy that Robinson is willing to make concessions to the discredited expression theory of expressiveness, according to which a work of art’s expressive properties are due to its creator’s emotional state. As it is, this theory is untenable: we know that artists have created exuberant and joyful works while being depressed, and it is in any case unlikely that an artist will remain in a single emotional state throughout the creation of a complex work of art such as a symphony. Robinson concedes, however, that some musical works, particularly those in the Romantic tradition, may present an emotional state felt by their authors (325). In these cases, we may be justified in identifying the persona in the music as the work’s author.

Charles Nussbaum has defended a sophisticated version of the arousal theory built around the idea that we form a mental representation of a musical work as a virtual terrain. Just like the ordinary space surrounding us, musical space offers affordances, that is, action possibilities. On this view, the arousal of feelings by music is due to off-line motor states that the music puts us into in virtue of our spatial representation of the musical surface (214). Nussbaum’s theory is ambitious and has probably not yet received the sustained consideration it deserves. Some critics have doubted whether it can fend off standard objections to cruder versions of the arousal theory (see for instance Trivedi 47–48).

Saam Trivedi in 2017 defended an imaginationist account of musical expressiveness. According to him, the experience of musical expression centrally involves imagination, although it may do so in different ways. The basic way we use imagination in relation to music is to imagine the music itself as a sentient being expressing its emotional states, but other types of imaginative engagement are available (133–139). For instance, we could imagine that the music is the expression of emotions of an indeterminate persona, or that we are ourselves in the emotional states the music is expressive of (139–143).

c. Literalism vs. Metaphoricism

A debate parallel to that concerning musical expressiveness is the one regarding the status of our descriptions of music in emotional terms. When we describe music as “sad,” “happy,” and the like, are we speaking literally or, rather, using metaphors in order to grasp aspects of the music that we cannot quite describe in literal terms? The former option is dubbed literalism, whereas the latter can be called metaphoricism.

An early metaphoricist proposal is the one by Nelson Goodman. He claimed that music metaphorically exemplifies expressive properties (85). Suppose you have a new suit made. The tailor shows you swatches of fabric to let you choose your preferred colour and material. The swatches possess a variety of properties, but exemplify only some of them—for instance, they exemplify colour and thickness, but not size. A way to put this is to say that exemplification is possession plus reference. Goodman builds his account of expressiveness on this basic notion of exemplification, with the relevant difference that expressive properties, unlike properties such as colour or size, are not literally possessed by inanimate objects. In the case of expressiveness, then, exemplification is reference to a property that is metaphorically possessed by an object. For instance, a work of music is expressive of sadness if it refers to the property of sadness that it possesses metaphorically.

Goodman’s view has been frequently criticised, especially for the rather obscure notion of metaphoric possession that is central to it (see for instance Davies, Musical Meaning 145–150).

Roger Scruton holds common descriptions of music in spatial and emotional terms to be irreducibly metaphorical. They are metaphorical because they describe in spatial terms something that is not literally extended in space and in emotional and psychological terms something that has no mental states. These metaphors cannot be paraphrased into literal statements, yet they are indispensable because they describe the way in which we imaginatively engage with music. This claim receives support from Scruton’s broader account of musical understanding (see section 4; see Trivedi 67–72 for criticism of Scruton’s metaphoricism).

Against these theorists, Stephen Davies defends a literalist position (“Music and Metaphor”). His strategy is to appeal to the secondary meaning taken by emotion terms when they are used to describe the outward manifestations of emotions. For instance, we may describe a tragic mask as “sad,” and by this we would mean not that the mask is in some actual state of sadness, but rather that it displays the physiognomy associated with sadness. Emotional descriptions of music work in a similar way. When we call a piece of music “sad,” we are using the term in the secondary sense referring to the outward manifestations of sadness, its behavioural correlates, rather than in the primary sense referring to a psychological state. Davies clarifies his view by stressing that the connection between the two uses of the word “sad” (the psychological one, and the behavioural one) is not one of mere homonymy (as in the use of “bank” to indicate both a financial institution and a riverside), but rather an instance of polysemy, that is, of distinct but related meanings (as in the use of “mole” to refer to both a burrowing animal and an undercover agent).

d. Emotions Aroused by Music

There are two main issues related to the emotions aroused by music in listeners. The first is the question as to whether instrumental music may arouse emotions (at least some emotions) and how it may do so. The second is the question as to whether any of these emotions are relevant to the appreciation of music qua music.

i. The Sceptical View

I start from a sceptical view of emotional arousal defended by Peter Kivy (Music Alone ch. 8). While he does not deny that listening to music regularly arouses garden-variety emotions (happiness, sadness, and so on), Kivy denies that any emotion of this sort is relevant to the appreciation or understanding of music as music. This apparently sweeping claim is best understood in light of Kivy’s preferred theory of emotions, that is, a cognitive theory according to which emotions always come with a feeling-state component, an intentional object, and an appropriate belief. Pure music is deprived of the propositional content or extra-musical references necessary to supply a relevant intentional object and belief. So music alone cannot arouse such emotions in us. However, music often gives rise to all sorts of idiosyncratic associations in the listener’s mind. It is these that, according to Kivy, provide the material necessary to the arousal of happiness, sadness, and the like. It is a short step from here to a sceptical position: if garden-variety emotions are aroused by music because of associated content brought to mind by the listening experience, then it is, properly speaking, that content that does the arousal and not the music. Moreover, if the emotional arousal in question is prompted by content that is contingently related to the piece that calls it to mind, then the emotions aroused are irrelevant to the appreciation of the piece. They may in fact be of a completely different character to two different listeners who associate different contents with the piece in question.

There is only one sort of emotion that, according to Kivy, is connected to our appreciation of music. Unsurprisingly, this emotion fits the cognitive view of emotions in that it has an intentional object and a corresponding belief. More precisely, this nameless emotion takes the music as its object and is accompanied by the belief that the piece is beautiful, well-crafted, skilful, and so forth. Among the properties of the piece that may give rise to such an emotional response are also expressive properties. A sad musical work may be beautifully sad, that is, it may express sadness by particularly poignant and well-suited musical means. But this is not to say that the appreciation of such a characteristic is going to arouse sadness in us. Rather, the emotion aroused in these cases is the very same nameless emotion mentioned earlier, a response that takes the music as an object and is sustained by the belief that the music is skilfully, beautifully, and powerfully expressive.

ii. Emotional Contagion

Against the sceptical view, some philosophers hold that arousal of garden-variety emotions is possible without the aid of extra-musical associations. In particular, those who hold a more liberal view than Kivy’s are inclined to think that music may arouse in the listener the very emotions it expresses, through a process of emotional contagion from music to listener. This is the position defended by Stephen Davies, who rejects the cognitive theory of emotions. While some emotions may fit neatly into the template described by the cognitive theory, others do not. For instance, we may experience an objectless anxiety or a phobia that lacks the support of any relevant belief. Emotional contagion from music to listener is another example: we catch the music’s emotional state, but the music is not the intentional object of our emotional response (we are not sad about the music, but merely saddened by it; Davies, “Emotional Contagion” 51–52).

Jenefer Robinson’s view is similar to Davies’ in that she holds music to be capable of arousing emotional responses of a mirroring sort. However, she is critical of Davies’ description of the arousal process. In particular, she claims that Davies is mistaken in holding that emotional contagion is the result of a listener’s experience of musical expressiveness (392). According to Robinson, things are quite the opposite, as music is able to induce the emotional states it expresses both before we realise that it expresses them and independently of our capacity to do so. From this point of view, Davies’ description of the mirroring process (or emotional contagion) is unduly heavy on the cognitive side, as it describes contagion as dependent on the listener’s capacity to recognize the music’s expressive character.

Robinson provides an intriguing and empirically informed account of the contagion process. First, music expressive of an emotion e may elicit psychological and physiological changes typical of certain moods. Subsequently, the listener may latch onto environmental cues that may supply an intentional object to her emotion. For instance, I may be listening to a happy piece of music, and this may arouse a cheerful mood in me. The mood will convert into a full-blown emotion of happiness when I see something on my desk that reminds me of a friend who is far away but whom I will soon get to see. Robinson calls this process the “Jazzercise” effect (391).

Davies is sceptical regarding both Robinson’s objection and her account of the contagion process. Against the worry that he may give too prominent a role to the listener’s recognition of the music’s expressive character, he replies that he does not rule out what he calls “non-attentional contagion,” that is, the unconscious, emotional attuning to expressive features of the environment. He merely believes this to be less central a case than its attentional counterpart (Davies, “Emotional Contagion” 56).

Davies’ criticism of Robinson’s Jazzercise effect questions whether this is a genuine case of contagion from music to listener. If the music merely occasions physiological changes and the corresponding objectless mood, and if these need to be supplemented by environmental cues in order to result in the arousal of an emotion, then the object of our emotion is whatever feature of the environment aroused it. In the above example, if the happiness is prompted by seeing the item on my desk, then it would seem that we are in the presence of a standard emotion of the cognitive sort, one that does not take the music as an object (Davies, “Emotional Contagion” 58–60).

iii. Negative Emotions

Recall that one of the standard objections against the arousal theory questions the willingness of listeners to put themselves in negative emotional states by listening to music expressive of such states. And, as we have seen, various philosophers who reject the arousal theory claim nonetheless that music may in fact arouse in the listener the emotions it expresses. It then remains to be seen how they justify the listener’s toleration of, or even attraction to, deeply sad music, if such music has the disposition to arouse in them the negative emotional states it expresses. I examine two prominent answers to this problem.

Levinson considers the music’s expressive character as capable of arousing in the listener the feeling component of emotional states. This falls short of what is required to have a full-blown emotional state, which would require an intentional object and a relevant belief. It is exactly this that makes the musical arousal of emotions a rewarding experience, as the absence of the usual contextual implications for our lives of negative states allows us to relish and explore the phenomenological aspect of these emotions, that is, the feeling component aroused by the music. As Levinson puts it, “(w)e become cognoscenti of feeling, savoring the qualitative aspect of emotional life for its own sake” (“Music and Negative Emotions” 324).

Levinson further claims that additional benefits may be available to listeners who imaginatively engage with the feeling component aroused by the music and imagine themselves to be in a full-blown state of despair, sadness, or any other negative emotion (“Music and Negative Emotions” 326–329).

Davies is drawn to a more modest but perhaps more effective solution. He observes how many human activities that are valuable and sought after possess an intrinsically unpleasant or painful element (think of weight training or running). Listening to music expressive of negative emotions is one such activity: one of the ways in which we listen to music with understanding is by reacting emotionally to its expressive character, such as when we are made cheerful by happy music or sad by sad music. Because he describes the negative emotional response to sad music as an integral part of our understanding of such music, Davies avoids characterising negative emotional responses as something we endure in order to pursue some goal. He writes: “The response is not an incidental accompaniment but rather something integral to the understanding achieved. It is not something with which one puts up for the sake of understanding; it is an element of that understanding” (Davies, Musical Meaning 312).

3. Ontology of Music

Philosophical reflection on the ontological status of music has tackled three main problems: the fundamental ontological nature of musical works, the possible differences in ontological status of works belonging to different musical traditions, and the issue of what counts as an authentic instance of a piece. The three following sections examine these issues.

a. Fundamental Ontology

We know that the Mona Lisa is a painting hanging in a large room in the Louvre; likewise, we know that the Palazzo Vecchio is a building in Piazza della Signoria in Florence. These objects seem relatively easy to locate and classify. Musical works, however, are elusive entities. Where is Bach’s Musical Offering, and what kind of thing is it?

Fundamental ontology is mainly concerned with the question as to what sort of entity musical works are, that is, to what ontological category they belong. Dodd calls this the categorial question (“Musical Works” 1114). Are works of music collections of particulars, or are they types that are instantiated by various performances?

Alongside this basic question, musical ontology addresses what Dodd has named the individuation and the persistence questions (“Musical Works” 1114–1116). The former deals with identity conditions: when are we to consider two works as the same? Is identity of notation sufficient? Should we include historical factors, such as a work’s date of composition? The persistence question, on the other hand, concerns a musical work’s coming into being as well as its possible destruction. Do composers create works, or do works exist prior to their composition, in which case they are merely discovered? And under what circumstances, if any, would a musical piece cease to exist?

Views of musical ontology are normally grouped according to the way in which they answer the categorial question—a practice I follow here. However, it is worth observing that pre-theoretical intuitions regarding a work’s identity and its creation or discovery are often decisive in accounting for a philosopher’s preference for one ontological category over another.

i. Nominalism

An early proposal advanced by Nelson Goodman is that we should consider a musical work to be a collection of particulars and, more specifically, a set including all of the work’s correct performances. This view is appealing to those who, like Goodman, intend to avoid commitment to entities other than particulars. However, it runs into rather obvious problems. First, nominalism seems to convert contingent facts regarding a work’s performances into facts about the work of music they are performances of. For instance, suppose I write a piece of music for guitar this afternoon. I then perform it three times, but every time my performance contains a wrong note in bar 8, as the passage exceeds my technical abilities. The nominalist view would seem forced to draw the absurd conclusion that the piece itself contains a wrong note, as the composition exists only as the set of my three defective performances. Alternatively, the nominalist might embrace the equally counterintuitive view that the work in question has never been performed, for all of the available performances are defective and so do not really count as performances of the work.

A second, even more serious worry takes the form of a modal objection. It is arguably contingent for a work of music to have been performed a certain number of times. There is a possible world in which Thelonious Monk’s Straight, No Chaser has been performed two more times than it has in ours, and others in which it has been performed eight times fewer. But if we construe works of music as sets, a problem arises, for sets necessarily have just the members they do. The inability to account for this modal characteristic of the relation between a work and its performances seems to doom the nominalist project to failure.

ii. Platonism

Kivy has proposed what may be considered the most elegant way to account for the relation between a work and its performances. He suggests that the musical work is an eternal type and is realised in its various performances (Kivy, “Platonism in Music”).

This view has been questioned by Levinson, who stresses its inability to account for two rather central intuitions regarding musical works (“What a Musical Work Is” 65–78). First, we would consider two pieces identical in their sound structure but composed at different times to be two different pieces. This intuition is arguably grounded in the different properties we would ascribe to these two pieces (the earlier piece may be ground-breaking, the later one scholastic). Kivy’s view does not respect this intuition, as it identifies musical works with their sound structure and would therefore consider the two pieces to be identical. Second, we consider composers as the creators of the pieces they compose, whereas Kivy’s view holds that composers merely discover pre-existing sound structures.

Jerrold Levinson has advanced an alternative proposal intended to better accommodate these intuitions. Consider first a non-musical example: the case of the Tarte Tatin. While this type of cake is certainly instantiated by a variety of tokens, it does not serve our intuitions well to hold that this model has always existed in the Platonic realm of eternal forms alongside mathematical entities and the like. The Tarte Tatin is a repeatable entity that was created at some point in time by someone who specified its ingredients, preparation, and so on. The case of musical works may be ontologically akin to the one just presented. We need to make sense of a musical piece as something that has been specified in its sound structure and performance means by some agent at a certain time. Levinson calls this ontological category an indicated type (“What a Musical Work Is” 79). More precisely, a musical work is a sound/performance-means structure-as-indicated-by-X-at-t, where X is the composer and t the time of composition. This characterisation of a piece’s ontological nature is also capable of accounting for the two intuitions mentioned above: the intuition that we should consider as separate works two pieces with an identical sound structure but composed at different times in the history of music, and the intuition that musical works are created rather than discovered.

Julian Dodd in 2007 revived the standard Platonist view, according to which musical works pre-exist their composition (“Works of Music”). I focus here on his rejection of Levinson’s arguments. Dodd’s first point concerns Levinson’s claim that a full-fledged Platonist view would fail to make sense of our intuition that composers, in composing a work, engage in a creative process. Dodd takes this objection to conflate the psychological notion of creativity with the metaphysical claim that something is created by composers. While the view that composers are creative is arguably correct, this view is simply expressing the idea that composers are engaging in a creative process, not that they are bringing something into existence. Dodd considers discoveries in the field of mathematics or logic as a useful parallel to the musical case: we do not deny that Pythagoras was a creative individual, even though we may well hold that the theorem that bears his name is an abstract entity that pre-existed its discovery by the Greek mathematician.

Dodd’s second objection to Levinson’s account questions its capacity to strike the intended compromise between the type/token view and our intuition that musical works are created. Indicated types, Dodd observes, are just as problematic as their non-indicated cousins in that they also pre-exist their discovery. Levinson is making the mistake of considering the impossibility of a type’s instantiation as equivalent to the type’s non-existence. But this is metaphysically suspicious, to say the least. As Dodd exemplifies, the type “child born in 1999” could not have been instantiated in the year 1066, yet we would not consider this as a reason to deny its existence in 1066. Ditto for works of music as indicated types. But if indicated types also pre-exist the act of composition, it would seem that we fall back into the idea that musical works are discovered rather than created.

iii. Sceptical Views

In a famous study, Lydia Goehr claimed that the concept of musical work we are familiar with emerged only around 1800, as earlier musical practice had looser criteria regarding a piece of music’s identity as well as a more diluted conception of authorship. For instance, scores were less precise in indicating embellishments and performance dynamics, if they did so at all. According to Goehr, this shows how the search for a fundamental ontological category is mistaken when it comes to a culturally variable and historically mutable practice such as music making.

Goehr’s dismissal of musical ontology is not typically welcomed by analytic philosophers of music. For instance, Stephen Davies observes that Goehr’s examples show that pieces of music composed prior to 1800 may have had a higher degree of indeterminacy in that they left more choices to the performer. But this falls short of supporting the view that the composers of such pieces were not creating works of music of a sort ontologically akin to those composed later on (Davies, Musical Works 123).

b. Comparative Ontology

Often referred to as “higher-order ontology,” comparative ontology explores alleged differences in the sort of musical works that lie at the centre of different musical traditions. By way of example, I present here two debates concerning the correct ontological characterisation of works in two such traditions: rock and jazz.

i. Rock

Theodore Gracyk pioneered philosophical reflection on rock music with his monograph Rhythm and Noise (1996). He argues that records are the primary artistic object produced by rock artists as they represent the focus of appreciation for rock fans and critics. While Gracyk is ready to concede that rock musicians also create songs, he denies to songs the central critical place he accords to records. The view he puts forward construes the rock tradition as fundamentally different from the classical one, as in the latter the object of attention is the work as determined by the score, quite apart from its instantiations. In the rock tradition, on the contrary, the particular manifestation of the song found on the relevant recording is the ultimate object of critical attention.

Stephen Davies agrees with Gracyk that works in the rock tradition rely heavily on studio wizardry, but he is unwilling to give up the idea that rock pieces are works for performance. After all, some rock bands only exist as garage bands, playing small venues and never recording their songs, while other major bands play a song live for quite some time before recording an album version. According to Davies, we can make sense of these practices if we describe rock songs as works for studio performance (as opposed to works for live performance, such as the works in the Western classical tradition). What distinguishes works for studio performance is that they are created with the studio as a privileged performance venue, as the studio allows the manipulation of the musical material that, as shown by Gracyk, is so central to the rock tradition (Davies, Musical Works).

In this way, Davies is able to accommodate the important intuition that there could be performances of rock works—rather than just playbacks of the relevant tracks—while still preserving the strength of Gracyk’s claim that studio production plays a paramount role in the sonic identity of rock pieces.

A fundamental difference between this view and Gracyk’s is that Davies intends to stress the continuity between rock and classical music: both traditions produce works for performance, although with different performance contexts in mind.

Christopher Bartel argues that both Gracyk and Davies are mistaken in considering the record as the primary object of appreciation. Grounding his claim on evidence concerning appreciative practices, he argues that several artists in the rock scene are appreciated for their skills as performers or songwriters. As an illustration of this, consider the contrast between the two hard rock bands Led Zeppelin and Deep Purple. These two iconic bands produced their most important records around the same time and played a relatively similar kind of music. However, while the former is appreciated for the polished, layered, and modern character of the tracks it recorded, the latter is considered by fans to be at its finest as a live band. Yet other rock artists are credited for the songs they have written over and above their value as performers or recording artists, Leonard Cohen being a case in point. Bartel concludes that “there are (at least) three practices central to the rock tradition, and musicians will place varying degree of emphasis on each” (153).

ii. Jazz

Various philosophers have examined the ontological peculiarity of jazz, with particular focus on the nature of jazz standards. Here I focus mainly on a debate between Andrew Kania and Julian Dodd.

Kania claims that jazz standards cannot fit the standard token/type ontology that seems apt to describe the relation between a work and its instantiations typical of the Western classical tradition. Jazz, according to Kania, is a workless musical tradition (“All Play and No Work”). While the claim may appear counterintuitive, Kania holds that no available realist view about jazz works could make sense of jazz performance practice. Kania offers three main reasons in favour of his view.

First, he argues that variation in the performance of a standard is too great to identify a core musical material that is common to every performance of that standard. Jazz standards, it would seem, cannot be located in the way works in the classical tradition can. Second, and relatedly, jazz standards do not constrain performance as classical Western works do. Third, Kania claims that jazz standards are not the focus of critical attention. Rather, it is their performances, and their improvisational elements in particular, that are normally subject to the greatest critical scrutiny. This is in stark contrast with what happens in the Western classical tradition, in which the work is the focus of attention.

Against Kania, Dodd holds jazz standards to be ontologically akin to works in the classical tradition. While they may be ontologically thinner, in that performers have more freedom with regard to the piece’s structure, instrumentation, length, and other features, works of jazz are repeatable works, the identities of which are grounded in instructions for performance determined by the composers. Central to Dodd’s rebuttal of Kania’s view is the idea that performance authenticity plays a more peripheral role in jazz than in classical music: we are interested mainly in what musicians do with a standard rather than in their correct performance of it, but this is not to say that the standard is ontologically different from works in the classical tradition (Dodd, “Upholding Standards”).

iii. A Sceptical View

Consider the pluralist view of rock ontology proposed by Bartel in 2017 and summarised above. It is a short step from the acceptance of this sort of pluralism to a full-fledged scepticism with regard to the enterprise of comparative ontology. For if there is no entity that is accorded pride of place when it comes to the appreciation and evaluation of rock music, then perhaps the whole idea of exploring the nature of the rock work rests on the mistaken assumption that there is one such thing. Lee B. Brown has suggested just that. He argues that, both in rock and jazz, what we have are multiple directions of critical and appreciative interest, and no ontological investigation could possibly identify a single ontological category as critically privileged without abandoning a descriptivist approach. He writes: “The truth is that rock history has not deposited any well-entrenched concept of the work of rock music” (174).

c. Performance Authenticity

Agreement concerning the fundamental nature of musical works does not imply agreement with regard to how they are correctly instantiated in performances. This section examines answers to the question as to what counts as a correct performance of a piece. In examining this issue, philosophers of music have mainly taken as a point of reference the tradition, starting in the 20th century, of historically informed music performance. Broadly construed, this tradition holds that pieces of music ought to be performed in a way sensitive to the period in which they were composed. While versions of this thesis are widely accepted in musical practice, philosophers and musicologists have debated the justification of such approaches.

As with other issues in the field, Kivy has been one of the earliest and most influential contributors, with a monograph on the topic (Authenticities). His book appeared in the same year as musicologist Richard Taruskin’s seminal Text and Act (1995), and both works share a degree of scepticism regarding the philological reconstruction of the original sound of past music. In particular, they share the intuition that the self-proclaimed objective, evidence-based treatment of performance and instrumentation choices attempts to remove from music-making a central aspect, at least in the Western classical tradition: the performer’s interpretive contribution to a piece. Kivy describes this as an unfortunate trade-off between personal authenticity and other kinds of authenticity, particularly sonic authenticity, which is defined as the attempt at replicating the sound of past performances.

Recall that Levinson considers musical works to be structures comprising both sounds and performance means. As a consequence of this view, Levinson holds an instrumentalist position with regard to the instantiation of a work: a work of music is correctly instantiated only if it is performed with the musical instruments (or, more generally, performance means) prescribed by its score.

Stephen Davies distinguishes between ontologically thick and thin works of music (Musical Works). Thin works are comparatively less specific in prescribing performance means and other properties of a correct performance, whereas thicker works leave relatively less freedom to the performer. As an example, popular songs in the American songbook are thinner works than a Mahler symphony. One way to interpret Davies’ suggestion is to consider it a compromise between the sonicist and instrumentalist positions just mentioned: there is no absolute standard when it comes to performance requirements, as works in certain traditions prescribe specific instrumentations, whereas other musical practices are more liberal.

Dodd has argued for a pluralist account of performance authenticity, distinguishing between compliance authenticity and interpretive authenticity (“Performing Works of Music”). Whereas the former is concerned with the accurate performance of the piece as specified by the score, the latter is a way of performing that displays a deep understanding of the piece. These two ways of performing may at times be in tension: there are occasions in which disregarding compliance concerns may help a performer produce a persuasive performance of a piece. In these cases, musical practice shows that concerns for interpretive authenticity may override concerns for compliance, as when a piece’s indicated tempo is disregarded by a performer because she deems a different tempo to be more suited to the piece’s character (Dodd, “Performing Works of Music” 9).

A sceptical view of historically informed performance has been expressed by James O. Young. He considers various formulations of the authenticity ideal animating historically informed performances of music and dismisses them as either unattainable or unattractive. Young concludes that contemporary performers engaging in historically informed performance are valuable for their artistic achievements and for their capacity to present the music they play under a new, stimulating light, and not because of their ability to retrieve the “authentic” version of a piece.

A final observation: although I started by noting that the debate concerning the basic ontological status of musical works does not settle performance authenticity issues, it is worth stressing that the two problems are presumably connected. Historically loaded characterisations of a musical work’s fundamental ontological nature are often paired with authenticity requirements that include the means of production of a musical structure (for example instrumentation, as in Levinson’s case), whereas fundamental ontologies of a platonic sort tend to set the bar low, in that parameters such as timbre or instrumentation are irrelevant to the instantiation of a musical work.

4. Musical Understanding

Music isn’t simply sounds we hear. It is sounds we listen to. Analogously to natural languages, the process of listening to music involves understanding it as music. But how exactly should this understanding be characterised? Contemporary analytic philosophy has produced a debate regarding the way in which we should describe basic musical understanding. The intent is to describe the minimum requirements for the appreciative understanding of a musical piece. The two main opposing views, championed by Levinson and Kivy, are termed concatenationism and architectonicism.

a. Concatenationism

According to Levinson, basic musical understanding is defined by our ability to follow the music’s development from one moment to the next (Music in the Moment).

In order to describe this process, Levinson introduces the concept of quasi-hearing. This refers to the process of attentive listening that encompasses the moments immediately preceding the present one, and that, on this basis, anticipates the music’s short-term development.

Basic musical understanding, as characterised by Levinson, does not include a grasp of large-scale structures, such as the exposition-development-recapitulation characteristic of sonata form. While Levinson does not deny that many educated listeners do pay attention to formal musical features, he denies that awareness of these aspects is required in order to satisfactorily understand a piece of music.

In a 2015 defence of his view, Levinson also appeals to empirical research showing how even accomplished musicians are insensitive to significant changes in large-scale structure, as long as they are able to follow the music’s flow from one moment to the next (“Concatenationism” 42).

b. Architectonicism

Against Levinson, Kivy observes that part of the Western classical music canon is impossible to understand without some degree of awareness of large-scale musical structure (“Music in Memory”). Kivy agrees that momentary listening is basic in the sense of being presupposed by any other kind of musical understanding. If one cannot follow the music’s moment-to-moment progress, one cannot understand music at all. But to concede this is not to say that all music may be understood by listeners who follow only the music’s unfolding in the short span covered by quasi-hearing.

A third party in this dispute, Stephen Davies, offers a criticism of Levinson’s view that downplays the difference between concatenationism and architectonicism (“Musical Understandings”, 95–99). He observes that Levinson seems to present momentary listening and structural listening as distinct psychological processes, the former involving perceptual awareness and the latter some sort of cognitive appraisal. But this need not be the case. For instance, our recognition of a theme as it returns after several minutes from its first appearance is perceptual in that it does not involve explicit knowledge regarding the work’s structure, yet it arches back to a part of the piece that clearly lies outside the scope of our quasi-hearing capacities. Accordingly, Davies claims that “Levinson shows not that grasping a work’s overarching form is irrelevant to musical understanding but that such awareness must arise from the listening experience” (“Musical Understandings” 97).

5. Musical Value and Profundity

Is there a value intrinsic to pure instrumental music? For the purpose of this section, I define an intrinsic value of a work of art w as a value that is unavailable to those who do not experience w. This means a work of art’s intrinsic value is not merely instrumental, as is, for instance, the work’s capacity to generate wealth if sold at an auction. While it may be conjectured that representational art-forms possess a value related to their representational content, this move is impossible in the case of pure music, as this lacks by definition any ties to the real world. Where, then, does the value of music reside?

a. Values of Music

But we may be moving too quickly, for it is not beyond dispute that pure music indeed lacks any extra-musical reference. For one thing, as we have seen, many philosophers believe music to be expressive of emotions. (Though they may not agree on whether music may also be about the emotions it expresses—recall that this is a major difference between Kivy’s and Davies’ accounts.) If pure music indeed does have an emotional character, it may be that part of its value as an art form is related to this feature. As an example of this, recall Robinson’s view that a musical piece’s expressive character articulates and individualises an emotion and may allow a listener to feel what it is like to be in that emotional state. If correct, this account offers an elegant insight into the value of expressive music, for it shows that music has the capacity to make us understand what it is to feel an emotion without our having to undergo the full-blown emotion.

Let us now restrict our focus to those who seek to explain the value of music apart from whatever value may ensue from the music’s capacity to be expressive of emotions. This move is necessary because, regardless of how optimistic one may be with regard to the value of musical expressiveness, one will be forced to admit that much great music is lacking in expressive power. Any value possessed by music of this kind would have to be of a sort different from the value connected to the music’s expressive character.

The most promising strategy for explaining the value of music, despite its lack of any evident connection to our world, is to bite the bullet of abstraction and claim that pure music is valuable precisely because of its abstract nature. This is the strategy pursued by Alan H. Goldman. He observes that it would not be sufficient, in order to establish the peculiar value we accord to music, to point out the ways in which it expresses emotions. For literature and the visual arts surely do so with greater precision, and the scope of emotional states they are able to represent lies outside the possibilities for pure music. The real value of music, according to Goldman, resides in its capacity to fully engage us in the exploration of an alternative world. Goldman fleshes out this proposal by noting that musical tones are experienced as independent from their material sources and constitute a virtual musical space (39–40). Moreover, the development of music is experienced as purposive: the music goes through struggles and developments and finally finds rest, at least in tonal pieces. But this solves only part of the enigma, for Goldman has so far suggested only that the experience of music is the experience of an alternative world. He hasn’t yet explained why such an experience is valuable to us. His suggestion is that in addition to the capacity music has to allow us to escape our daily concerns regarding the actual world, there is a particularly welcome feature to the alternative world music opens to us. The musical world is designed, and its dissonances and hesitations are finally resolved as the piece comes to an end. The world of music is “a totally human world in which threats are tamed even when tinged with pathos or other negative emotions throughout” (42).

Malcolm Budd also attempts to explain the value of music through its abstract nature. He likens the appeal of music to that of other natural and artefactual objects featuring abstract patterns (165). The peculiarity of music is that these abstract gestalten are offered as developing in time. Moreover, music presents us with formal structures that reach levels of complexity hardly imaginable in other contexts, in that the formal structures of music are hierarchically organized and related to other structures within the same piece; think of an arpeggio as a sonic gestalt, embedded in a chord progression, which we experience as a larger gestalt (168). While we may be confronted with similar levels of formal complexity in the case of logic or mathematics, this abstract complexity is rarely given perceptually, and the formal structures we deal with in those cases arguably do not have as their primary goal the exploration of aesthetically rewarding structures.

Budd also observes how abstractness does not preclude references to the extra-musical altogether: pure music may exemplify relations that are not, qua relations, exclusive to music. (The concept of exemplification used by Budd is the one introduced by Goodman and presented in section 2.c.) Consider the simple case of imitation in a contrapuntal piece. The relation of imitation instantiated by the piece is one that has application outside of the musical domain, and a work of music may exemplify this relation by prominently showcasing it.

b. Profundity

A related debate concerns the sense in which pure music may be described as profound. We routinely say of novels, poems, movies, and even paintings, that they are profound, and mean by this that they convey some sort of insight or give us food for thought. It has been a matter of debate whether pure music could do the same. If it does, then this would arguably constitute a further way in which music may be valuable.

Kivy is sceptical (Music Alone ch. 10). Pure music lacks the minimum requirements for profundity, namely the capacity of denoting something extra-musical, as well as the capacity to communicate profound propositions about that thing.

Kivy had excluded the possibility of musical profundity before others claimed against him that music may indeed be profound. He later likened his early expression of scepticism to the story of the man who told the children not to stick beans up their nose: they would have never had the idea had he not suggested it to them (“Another Go” 410). Much like the children in the story, philosophers of art have tried to do exactly what Kivy said wasn’t advisable, that is, show that music may be profound. What follows is a presentation of two relevant attempts.

Stephen Davies develops a notion of musical profundity that does not commit him to claims about the music’s possession of propositional content. He suggests an analogy with a game of chess (“Profundity” 348). A cleverly played game or an unexpected and brilliant move may be described as profound because of its capacity to illustrate the impressive potential of human ingenuity and inventiveness. A game of chess is profound not by communicating profound propositions but rather by showing profound analytical skills, problem-solving abilities, and so on. Similarly, music is sometimes profound because it displays a composer’s cleverness in handling the musical material, from the tonal development to the details of the orchestration.

Kivy remains unconvinced by this attempt, and notes that Davies’ criterion for profundity does not seem to reflect the intuitive claim that not all great works of art are profound. If profundity is a display of astonishing ingenuity, then all musical masterpieces should be described as profound. That we would refuse to do so counts against Davies’ view of profundity (Kivy, “Another Go” 407).

According to Dodd, Kivy is right in holding music incapable of communicating propositional content, but he is mistaken in considering this a requirement for profundity. In fact, both requirements he sets out for something to be profound are misleading (Dodd, “The Possibility” 301). First, profundity does not require denotation but mere reference, and reference may be achieved in ways other than through denotation. Dodd’s suggestion is that display may be the relevant relation in the musical case. Among the properties it has, a work of music displays those that a sensitive listener perceives as crucial to the work’s point. In doing so, it may elicit in such a listener a deeper understanding of its subject matter, that is, the displayed properties. According to Dodd, Kivy is misled from the start by his insistence on a quasi-semantic characterisation of profundity, a tendency he undoubtedly owes to his choice to treat literary profundity as paradigmatic (Dodd, “The Possibility” 302).

6. References and Further Reading

  • Bartel, Christopher. “Rock as a Three‐Value Tradition.” The Journal of Aesthetics and Art Criticism, vol. 75, no. 2, 2017, pp. 143–154.
  • Brown, Lee B. “Do Higher-order Music Ontologies Rest on a Mistake?” The British Journal of Aesthetics, vol. 51, no. 2, 2011, pp. 169–184.
  • Budd, Malcolm. Values of Art: Painting, Poetry, and Music. Penguin, 1995.
    • This work, while not uniquely concerned with music, has insightful discussion on both the musical expression of emotions and the value of music as art.
  • Davies, Stephen. “Artistic Expression and the Hard Case of Pure Music.” Contemporary Debates in Aesthetics and the Philosophy of Art, edited by Matthew Kieran, Blackwell, 2006, pp. 179–191.
  • Davies, Stephen. “Emotional Contagion from Music to Listener.” In his Musical Understandings & Other Essays on the Philosophy of Music, Oxford University Press, 2011, pp. 47–65.
    • Davies rejects Robinson’s view of emotional contagion and offers an alternative model.
  • Davies, Stephen. “The Expression of Emotion in Music.” Mind, vol. 89, no. 353, 1980, pp. 67–86.
  • Davies, Stephen. “John Cage’s 4′ 33″: Is It Music?” Australasian Journal of Philosophy, vol. 75, no. 4, 1997, pp. 448–462.
  • Davies, Stephen. “Music and Metaphor.” In his Musical Understandings & Other Essays on the Philosophy of Music, Oxford University Press, 2011, pp. 21–33.
  • Davies, Stephen. Musical Meaning and Expression. Cornell University Press, 1994.
    • A reference work for the debate on musical expressiveness. Like Kivy, Davies defends a resemblance theory of musical expressiveness, although their views do not overlap completely.
  • Davies, Stephen. “Musical Understandings.” In his Musical Understandings & Other Essays on the Philosophy of Music, Oxford University Press, 2011, pp. 88–128.
    • An overview of issues concerning musical understanding.
  • Davies, Stephen. Musical Works and Performances: A Philosophical Exploration. Clarendon Press, 2001.
    • An important work on musical ontology, with a focus on comparative ontology and authenticity.
  • Davies, Stephen. “Profundity in Instrumental Music.” The British Journal of Aesthetics, vol. 42, no. 4, 2002, pp. 343–356.
  • Dodd, Julian. “Musical Works: Ontology and Meta-ontology.” Philosophy Compass, vol. 3, no. 6, 2008, pp. 1113–1134.
  • Dodd, Julian. “Performing Works of Music Authentically.” European Journal of Philosophy, vol. 23, no. 3, 2015, pp. 485–508.
  • Dodd, Julian. “The Possibility of Profound Music.” The British Journal of Aesthetics, vol. 54, no. 3, 2014, pp. 299–322.
  • Dodd, Julian. “Upholding Standards: A Realist Ontology of Standard Form Jazz.” The Journal of Aesthetics and Art Criticism, vol. 72, no. 3, 2014, pp. 277–290.
  • Dodd, Julian. “What 4′ 33″ Is.” Australasian Journal of Philosophy, 2017, doi: 10.1080/00048402.2017.1408664.
  • Dodd, Julian. Works of Music: An Essay in Ontology. Oxford University Press, 2007.
    • Dodd presents a defence of the Platonist view of musical ontology.
  • Dyck, John. “Natural Sounds and Musical Sounds: A Dual Distinction.” The Journal of Aesthetics and Art Criticism, vol. 74, no. 3, 2016, pp. 291–302.
  • Fisher, John Andrew. “The Value of Natural Sounds.” Journal of Aesthetic Education, vol. 33, no. 3, 1999, pp. 26–42.
  • Fisher, John Andrew. “What the Hills Are Alive With: In Defense of the Sounds of Nature.” The Journal of Aesthetics and Art Criticism, vol. 56, no. 2, 1998, pp. 167–179.
  • Goehr, Lydia. The Imaginary Museum of Musical Works: An Essay in the Philosophy of Music. Clarendon Press, 1992.
    • A take on the analytic perspective on musical ontology that is well known even outside philosophical circles.
  • Goldman, Alan. “The Value of Music.” The Journal of Aesthetics and Art Criticism, vol. 50, no. 1, 1992, pp. 35–44.
  • Goodman, Nelson. Languages of Art: An Approach to a Theory of Symbols. Bobbs-Merrill, 1968.
    • From an historical perspective, this work is fundamental in setting the stage for future debates concerning musical ontology and expressiveness.
  • Gracyk, Theodore. Rhythm and Noise: An Aesthetics of Rock. Duke University Press, 1996.
    • A seminal work in the discussion of comparative ontology, specifically regarding the ontological status of rock works.
  • Kania, Andrew. “All Play and No Work: An Ontology of Jazz.” The Journal of Aesthetics and Art Criticism, vol. 69, no. 4, 2011, pp. 391–403.
  • Kania, Andrew. “Definition.” The Routledge Companion to Philosophy and Music, edited by Theodore Gracyk and Andrew Kania, Routledge, 2011, pp. 3–13.
    • Regarding the definition of music, this chapter in The Routledge Companion to Philosophy and Music is an excellent starting point. It is also worth noting that the entire Companion offers an excellent and up-to-date overview of most topics in the analytic philosophy of music.
  • Kingsbury, Justine. “Matravers on Musical Expressiveness.” The British Journal of Aesthetics, vol. 42, no. 1, 2002, pp. 13–19.
  • Kivy, Peter. “Another Go at Musical Profundity: Stephen Davies and the Game of Chess.” The British Journal of Aesthetics, vol. 43, no. 4, 2003, pp. 401–411.
  • Kivy, Peter. Authenticities: Philosophical Reflections on Musical Performance. Cornell University Press, 1995.
  • Kivy, Peter. The Corded Shell: Reflections on Musical Expression. Princeton University Press, 1980.
    • A reference work for the debate on musical expressiveness, defending a resemblance theory of musical expressiveness similar to Davies’, although their views do not overlap completely.
  • Kivy, Peter. Music Alone: Philosophical Reflections on the Purely Musical Experience. Cornell University Press, 1991.
    • Kivy discusses a variety of issues concerning musical value and profundity and defends a formalist view of the appreciation of Western classical music.
  • Kivy, Peter. “Music in Memory and Music in the Moment.” In his New Essays on Musical Understanding, Oxford University Press, 2001, pp. 183–217.
  • Kivy, Peter. “Platonism in Music: A Kind of Defense.” Grazer Philosophische Studien, vol. 19, 1983, pp. 109–129.
  • Levinson, Jerrold. “Concatenationism, Architectonicism, and the Appreciation of Music.” In his Musical Concerns: Essays in Philosophy of Music, Oxford University Press, 2015, pp. 32–44.
  • Levinson, Jerrold. “The Concept of Music.” In his Music, Art, and Metaphysics: Essays in Philosophical Aesthetics, Cornell University Press, 1990, pp. 267–278.
  • Levinson, Jerrold. “Musical Expressiveness as Hearability-as-expression.” Contemporary Debates in Aesthetics and the Philosophy of Art, edited by Matthew Kieran, Blackwell, 2006, pp. 192–204.
    • A clear formulation of the persona theory of musical expressiveness.
  • Levinson, Jerrold. “Music and Negative Emotions.” In his Music, Art, and Metaphysics: Essays in Philosophical Aesthetics, Cornell University Press, 1990, pp. 306–335.
  • Levinson, Jerrold. Music in the Moment. Cornell University Press, 1997.
    • Levinson offers the first formulation and defence of the concatenationist view.
  • Levinson, Jerrold. “What a Musical Work Is.” In his Music, Art, and Metaphysics: Essays in Philosophical Aesthetics, Cornell University Press, 1990, pp. 63–88.
    • An important work on musical ontology.
  • Matravers, Derek. Art and Emotion. Oxford University Press, 1998.
  • Nussbaum, Charles O. The Musical Representation: Meaning, Ontology, and Emotion. MIT Press, 2007.
  • Robinson, Jenefer. Deeper than Reason: Emotion and its Role in Literature, Music, and Art. Oxford University Press, 2005.
    • An ambitious and empirically informed study on emotional expression in the arts. The section on music defends a hybrid view that combines arousalist elements with Levinson’s persona theory. This work also presents Robinson’s model of emotional contagion from music to listener.
  • Scruton, Roger. The Aesthetics of Music. Oxford University Press, 1997.
    • A highly original and influential take on many of the issues discussed in this article, from definitional concerns to problems of expressiveness and value.
  • Taruskin, Richard. Text and Act: Essays on Music and Performance. Oxford University Press, 1995.
  • Tormey, Alan. The Concept of Expression: A Study in Philosophical Psychology and Aesthetics. Princeton University Press, 1971.
  • Trivedi, Saam. Imagination, Music, and the Emotions: A Philosophical Study. State University of New York Press, 2017.
  • Young, James O. “The Concept of Authentic Performance.” The British Journal of Aesthetics, vol. 28, no. 3, 1988, pp. 228–238.

 

Author Information

Matteo Ravasio
Email: mrav740@aucklanduni.ac.nz
University of Auckland
New Zealand

Bertrand Russell: Logic

For Russell, Aristotelian syllogistic inference does not do justice to the subject of logic. This is surely not surprising. It may well be something of a surprise, however, to learn that in Russell’s view neither Boolean algebra nor modern quantification theory does justice to the subject. For Russell, logic is a synthetic a priori science studying all the kinds of structures there are. This thesis about logic makes up the lion’s share of Russell’s philosophy of logic until the late 1920s, and we shall have little to say of his flirtations with the naturalization of mind thereafter. We shall have much to say about his views on the ontology of structures, for they underwent extensive changes in the time between his writing The Principles of Mathematics (1903) and the publication of three of the four projected volumes of Principia Mathematica (1910, 1912, 1913), coauthored with Alfred North Whitehead. The fourth volume, on geometry, never appeared. Much of this article’s presentation of Russell’s logic will concern Russell’s various logical systems as they pertain to his Logicism. In “Mathematics and the Metaphysicians” (1901), Russell heralds his logicist thesis, observing that mathematics has enjoyed a conceptual revolution. One of the chief triumphs of modern mathematics, he explains, consists in having discovered that mathematics studies relational structure and is therefore free of commitment to the metaphysicians’ abstract particulars such as numbers and spatial figures. This revolutionary conception of mathematics was made possible by advances in geometry, especially in non-Euclidean geometry, and advances in analysis, where real numbers, limits and continuity were newly defined by thinkers such as Cantor, Dedekind, and Weierstrass. On the new conception of mathematics, it is relational order, not magnitude, that is the focus.

Meanwhile, Logic was also enjoying a conceptual revolution due to Gottlob Frege, who maintained that with the impredicative comprehension of functions, logic (that is, comprehension principle logic, ‘cp-Logic’ hereafter) is an informative science. Russell took this new science to be a study of relational structure, conducted by studying relations independently of whether they are exemplified. The branches of mathematics, in Russell’s view, are studies of different sorts of relations, which structure their fields. Mathematics, then, is a branch of the cp-Logic of relations.

Table of Contents

  1. Russell’s Logicism
  2. The Simple Type Syntax of PrincipiaL
  3. Developments: Principia Mathematica’s Section 8, Definite Descriptions, Class Expressions
  4. The Ramified-Type Syntax of Church’s PrincipiaC
  5. The Quantification Theory of Propositions in Theory of Implication (c. 1905)
  6. The Substitutional Theory of Propositions (1905)
  7. The Substitutional Theory Without General Propositions (1906)
  8. Church’s PrincipiaC and Russell’s Orders of Propositions (c. 1907)
  9. Appendix A: Quantification Theory in The Principles of Mathematics
  10. Appendix B: The 1925 experiment of Principia MathematicaW
  11. References and Further Reading
    1. Works by Russell
    2. Books and Articles

1. Russell’s Logicism

In The Art of Philosophizing, Bertrand Russell offered the following admonishment:

If you wish to become a logician, there is one piece of advice which I cannot urge too strongly, and that is: Do Not learn the traditional formal logic. In Aristotle’s day it was a creditable effort, but so was the Ptolemaic astronomy. To teach either in the present day is a ridiculous piece of antiquarianism.

Russell’s own logics are tailored to his Logicism. Care must be taken in using the word ‘Logicism,’ however, since its advocates have had quite different agendas and quite different conceptions of what it entails. Carnap’s characterization presents Logicism as wedded to a deductive thesis according to which all the truths of mathematics can be derived as theorems from a consistent axiomatic foundation that captures all and only logical truths. This use of ‘Logicism’ can lead to confusion. This form of Logicism belongs neither to Frege’s nor to Russell’s conception. Though both held that a system of cp-Logic is consistently recursively axiomatizable, neither made it definitive of Logicism. Gödel showed that a consistent axiomatic calculus adequate to represent every recursive natural number theoretic function is negation incomplete. That is, for each such calculus there is a wff (well-formed formula) G such that neither G nor ~G is a theorem. Since either G or ~G is true in the standard model, the consistent axiomatic system must leave out a truth of arithmetic. But this is also irrelevant to Logicism as Frege and Russell understood it.
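The incompleteness result just mentioned can be put schematically. The following is only a sketch in modern notation: the label T for the calculus, the turnstile for theoremhood, and the use of the standard model of arithmetic are assumptions of this presentation, not notation used by Frege, Russell, or Gödel.

```latex
% Requires amsmath and amssymb (for \text, \nvdash, \mathbb).
% T (assumed label): a consistent, recursively axiomatized calculus adequate
% to represent every recursive number-theoretic function.
% G: the undecidable wff whose existence the result guarantees for T.
\[
  T \nvdash G \quad\text{and}\quad T \nvdash {\sim}G,
  \qquad\text{although}\qquad
  \mathbb{N} \models G \ \text{ or } \ \mathbb{N} \models {\sim}G .
\]
```

Read this way, the result concerns what a given calculus can derive, which is why it bears on the deductive thesis rather than on Logicism as Frege and Russell understood it.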

Let us put forth the following definition that altogether separates the deductive thesis from the Logicist thesis. Russell’s Logicism is expressed by this definition:

RLogicism =df pure mathematics is a branch of cp-Logic.

Russell’s Logicism is the thesis that all branches of mathematics, including geometry, Euclidean or otherwise, are studies of relational structures and therefore are studies that can be subsumed within the cp-Logic of relations. Cp-Logic is not modern quantification theory with identity. It is a quantification theory that enables the binding of predicate variables as well as individual variables and which embraces the impredicative comprehension of relations independently of whether these relations are exemplified. Its impredicativity indicates that no restrictions are to be placed on the quantifiers occurring in the wffs which give the exemplification conditions for comprehension of universals.
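For orientation, impredicative comprehension for relations can be sketched as a schema in modern second-order notation. This is an illustrative rendering under stated assumptions, not Russell’s or Frege’s own symbolism: the letter R, the variables, and the schematic wff are labels chosen here.

```latex
% Impredicative comprehension schema for an n-place relation.
% Assumption: \varphi is any wff in which the predicate variable R does not
% occur free. No restriction is placed on the quantifiers in \varphi, which may
% themselves bind predicate variables; this is the impredicativity.
\[
  \exists R \, \forall x_1 \cdots \forall x_n \,
  \bigl( R(x_1, \ldots, x_n) \leftrightarrow \varphi \bigr)
\]
```

Each instance asserts that there is a relation with the stated exemplification conditions; it does not assert that anything exemplifies that relation.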

Two important revolutions, one due to Cantor and another due to Frege, are behind Russell’s Logicism and it would be inconceivable without them. Henri Poincaré, a prominent mathematician, never could embrace the revolutions. Poincaré thought of logic and mathematics in the old ways, with mathematics about metaphysical abstract particulars, numbers and spatial figures, and logic constrained to proper inference in reason, that is, a theory of a deductive consequence relation. Poincaré thought Russell’s Logicism entailed that mathematicians are to change their creative practices and tailor proof techniques into the p’s and q’s of a canonical logistic. Russell’s Logicism entails no such transformation. It simply maintains that, as a study of relational structures, mathematics is a part of cp-Logic as the synthetic a priori science studying all the kinds of relational structures there are by studying the way relations, exemplified or not, order their fields. This is not a movement coming from outside of mathematics. It comes from within. It implies that mathematicians are doing cp-Logic, that is, studying relational structures, when they do mathematics.

In Russell’s view, Cantor’s revolution, together with such figures as Weierstrass, Dedekind and Pieri, was responsible for inaugurating the transformation of all branches of mathematics into studies of kinds of relational structures. Russell’s agenda was to demonstrate that abstract particulars are nowhere needed in any branch of mathematics. Frege’s revolution was no less central to Russell’s unique Logicism. It was responsible for transforming the field of logic into cp-Logic, which, as Frege saw it, embraces the informative impredicative comprehension of functions. It was precisely this impredicative comprehension that enabled his new cp-Logic to be an informative science capable of capturing the notions of the ancestral and cardinal number, and to arrive at a theorem of mathematical induction. Frege had seen this already in his Begriffsschrift (1879). Russell came to appreciate it slowly. Frege never quite embraced what Russell regarded as the Cantorian revolution and certainly did not have the Russellian agenda of eliminating abstract particulars, not from geometry and certainly not from the arithmetic of numbers (cardinal, natural, and so on). Quite to the contrary, Frege was adamant in maintaining that cardinal numbers are objects.

In The Principles of Mathematics (1903) Russell’s aim is to explain his Logicism.

The Principles of Mathematics operates with an ontology of logically necessary abstract particulars that are called ‘propositions’. They are mind- and language-independent entities, some of which have the unanalyzable property of being true while others are false. The work was to have a second volume which worked out in a technically formal symbolic way the doctrines of the first volume. The second volume would also solve paradoxes such as Cantor’s paradox of the greatest cardinal, the Burali-Forti paradox of the greatest ordinal, and Russell’s paradoxes of classes and attributes (The Principles of Mathematics, p. xvi). The second volume was to have been coauthored with Alfred North Whitehead, who had been a long-time mentor of Russell in mathematics and whose work on abstract algebra is a natural ally of the logicist agenda. But the project was abandoned.

Instead, Whitehead and Russell produced Principia Mathematica. The Preface goes so far as to say that the work of Principia Mathematica had begun in 1900, even prior to the publication of The Principles of Mathematics. It explains that instead of a second volume for The Principles of Mathematics couched in an ontology of logically necessary existing propositions, the work offers a fresh start, avoiding abstract particulars not only in all the branches of mathematics but also in the field of cp-Logic itself (Principia Mathematica, p. v). Ultimately, Russell went on to endeavor to eliminate abstract particulars from philosophy altogether. This is the agenda of his book Our Knowledge of the External World as a Field for Scientific Method in Philosophy (1914), which offered a research program that made Principia Mathematica’s cp-Logic the essence of philosophy. The program, Russell thought, held promise for solving all philosophical problems. He took those problems to arise from a paucity of imagination among speculative metaphysicians, which results in an inadequate logic that in turn produces indispensability arguments for abstract particulars and for kinds of non-logical necessity governing them.

Though Russell’s transition from The Principles of Mathematics to Principia Mathematica is quite complicated, the logicist thesis of the former has not changed at all in the latter. Ample evidence can be found in Principia Mathematica in the following:

Section A: The theory of Deduction (p. 90).

Summary of Principia’s Part I (p. 87).

Summary of Part II, Section A: Prolegomenon to Cardinal Arithmetic (p. 329).

Principia Mathematica says, for example, that the subject of cardinal arithmetic is regarded as different only in degree from the subject matter of logic discussed in Part I. Principia Mathematica is surely advocating Logicism just as in The Principles of Mathematics, but some quite striking changes occur between the two works. For example, in Principia Mathematica Whitehead and Russell no longer regard the infinity of natural numbers to be a subject for mathematics to decide. This result so surprised Boolos (1994) that he concluded that the work no longer advances Logicism. But quite to the contrary, it stems from the same source as the discovery in non-Euclidean geometry that not all right triangles obey the Pythagorean theorem. The agenda is to reject indispensability arguments for abstract particulars; the results follow from there. Similarly, that the infinity of the natural numbers is not a mathematical issue follows from the rejection of classes or sets as abstract particulars. There are many such surprises in Principia Mathematica. Another is the discovery that Hume’s Principle, which asserts that the cardinals of two classes are the same if and only if the classes are similar, admits of exceptions (see Landini 2016). Though the conception of Logicism has not changed, it is easy to see that quite a lot happened in the interim between The Principles of Mathematics and Principia Mathematica.
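Hume’s Principle, as glossed above, can be written schematically as follows. This is only a sketch in modern notation: the symbols Nc (for “the cardinal number of”) and ≈ (for similarity, that is, one-to-one correspondence) are assumptions made here for illustration and are not claimed to be the notation of Principia Mathematica.

```latex
% Hume's Principle: two classes have the same cardinal
% if and only if they are similar (equinumerous).
\[
  \mathrm{Nc}(\alpha) = \mathrm{Nc}(\beta)
  \;\leftrightarrow\;
  \alpha \approx \beta
\]
% Similarity, on the usual reading: some relation R correlates
% the members of alpha one-to-one with the members of beta.
\[
  \alpha \approx \beta
  \;:\leftrightarrow\;
  \exists R \,\bigl( R \text{ is a one-to-one relation from } \alpha \text{ onto } \beta \bigr)
\]
```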

For a great many years the interim period has been akin to the dark ages whose role in modern science has only recently come to light. In this period, Russell worked steadfastly to emulate the impredicative comprehension of cp-Logic in an ingenious substitutional logic of propositional structure. The foundations of the idea of finding a substitutional theory to emulate a simple type theory of universals (and thereby classes) are already manifest in Appendix B of The Principles of Mathematics itself. But it used the substitution of denoting concepts (‘all a’, ‘some a’, ‘the a’, ‘an a’, ‘any a’ and ‘every a’). The theory of denoting concepts of The Principles of Mathematics proved to be a quagmire, and without the 1905 theory of definite descriptions, Russell could not execute the plan for a substitutional theory (see Landini 1998b). The substitutional theory finally became viable in 1905 and it pervaded Russell’s work until 1908, but most of it was almost completely unknown until the 1980s. Happily, much of Russell’s work during this time has become clear, so that we can better understand the evolution of Principia Mathematica and Russell’s apparently sudden abandonment of propositions. Contrary to years of misunderstanding, the evolution of Russell’s mathematical logic toward Principia Mathematica was not driven by a misguided interest in finding a common solution of both logical and semantic paradoxes. What ended Russell’s substitutional theory of propositional structure was not problems of unity, not problems concerning Liar paradoxes of propositions, and certainly not semantic paradoxes of naming or denoting or defining characteristic of the Richard paradox or the Berry or the Grelling. What ended the substitutional theory was a paradox, here called Russell’s ‘p₀/a₀ paradox’. Unlike Liars and semantic paradoxes, it is a Cantorian diagonal paradox grounded in the fact that the emulation of simple types of attributes in the substitutional theory is inconsistent with Cantor’s power-theorem, which assures that there can be no function from objects (propositions being themselves objects) onto properties of those objects.
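The diagonal reasoning behind Cantor’s power-theorem can be illustrated on a small finite domain. The following is a minimal sketch, not a reconstruction of Russell’s paradox itself: it treats “properties of objects” extensionally as subsets of a toy domain, and the function names are hypothetical. For every map f from the domain to its subsets, the diagonal set D = {x : x ∉ f(x)} is shown to be missing from the values of f, so no such map is onto.

```python
from itertools import combinations, product


def powerset(elements):
    """Return every subset of `elements` as a frozenset."""
    items = list(elements)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def diagonal_set(domain, f):
    """Return D = {x in domain : x not in f(x)} for a map f: domain -> subsets."""
    return frozenset(x for x in domain if x not in f[x])


def some_map_is_onto(domain):
    """Brute-force check: does any function map `domain` onto its power set?"""
    items = list(domain)
    subsets = powerset(items)
    for images in product(subsets, repeat=len(items)):
        f = dict(zip(items, images))
        # Diagonal step: D differs from f(x) at x, for every x,
        # so D can never be one of the values of f.
        assert diagonal_set(items, f) not in f.values()
        if set(f.values()) == set(subsets):
            return True
    return False


# Toy domain (an assumption for illustration); checks all 8**3 candidate maps.
print(some_map_is_onto({"a", "b", "c"}))
```

The brute-force search is feasible only for tiny domains; the diagonal step inside the loop is what generalizes to infinite ones.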

In summary, the whole of Russell’s philosophical work in mathematical logic may be seen in terms of his trials and tribulations at emulating an impredicative simple-type regimented cp-Logic of universals. Our focus, therefore, is squarely on the evolution of the cp-Logic of Principia Mathematica. In what follows, we shall outline the major logical systems that led Whitehead and Russell to Principia Mathematica’s syntax and formal theory and the informal semantic interpretation they gave it. Since Russell’s work toward a substitutional theory in The Principles of Mathematics ended in a quagmire and did not yield a formal system, we shall not pause to discuss it. The basic quantification theory of The Principles of Mathematics was replaced by the 1905 “Theory of Implication,” which formed the quantification theory for the logic of substitution that was to appear in Whitehead and Russell’s second volume of The Principles of Mathematics.

When Russell abandoned the propositions of his substitutional theory, he abandoned the idea of a second volume for The Principles of Mathematics. But he did not abandon hope that an emulation of an impredicative simple-type stratified regimentation of the cp-Logic of universals might still be found. In the introduction to the first edition of Principia Mathematica, Whitehead and Russell propose an informal nominalistic semantic interpretation of the object-language bindable predicate variables. But by 1920, Russell had come to realize that such a nominalistic semantics could not validate impredicative comprehension axioms. Only a Realist semantics can validate the comprehension principles of Principia Mathematica’s impredicative simple-type regimented cp-Logic. Russell never stopped trying, however. In the 1925 second edition, he experimented with Wittgensteinian ideas for emulating impredicative comprehension, imagining an altered grammar to accommodate extensionality. Whitehead was not happy with this experiment being included in the new edition since neither he nor Russell intended to advocate it. Alas, Whitehead was right (see, for example, Lowe 1990, Monk 1996). The ideas of the 1925 second edition are sketched in an appendix and end our discussion of Russell’s logics.

2. The Simple Type Syntax of PrincipiaL

pdf

3. Developments: Principia Mathematica’s Section 8, Definite Descriptions, Class Expressions

pdf

4. The Ramified-Type Syntax of Church’s PrincipiaC

pdf

5. The Quantification Theory of Propositions in Theory of Implication (c. 1905)

pdf

6. The Substitutional Theory of Propositions (1905)

pdf

7. The Substitutional Theory Without General Propositions (1906)

pdf

8. Church’s PrincipiaC and Russell’s Orders of Propositions (c. 1907)

pdf

9. Appendix A: Quantification Theory in The Principles of Mathematics

pdf

10. Appendix B: The 1925 experiment of Principia MathematicaW

pdf

11. References and Further Reading

a. Works by Russell

  • The Collected Papers of Bertrand Russell, Vol. 4, Foundations of Logic: 1903-1905, ed. by Alasdair Urquhart (London: Routledge, 1994).
  • The Collected Papers of Bertrand Russell, Vol. 6, Logic and Philosophy Papers: 1901-1913, ed. John G. Slater (London, Routledge, 1992).
  • The Principles of Mathematics (PoM), second edition (New York: W. W. Norton & Co., 1937, 1964). First edition (London: Allen & Unwin, 1903).
  • “On Denoting,” in Essays in Analysis, pp. 103-119. First published in Mind 14 (1905), pp. 479-493.
  • “On Fundamentals,” Collected Papers Vol. 4, pp. 359-413.
  • “On The Logic of Relations,” in Logic and Knowledge, pp. 3-38. First published as “Sur la logique des relations,” Rivista di Matematica, Vol. vii (1901), pp. 115-148.
  • “On the Relation of Mathematics to Logic,” in Essays in Analysis, pp. 260-271. First published as “Sur la Relation des Mathématiques à la Logistique,” in Revue de Métaphysique et de Morale 13 (1905), pp. 906-917.
  • “On Some Difficulties in the Theory of Transfinite Numbers and Order Types,” in Essays in Analysis, pp. 135-164. First published in Proceedings of the London Mathematical Society 4 (March 1906), pp. 29-53.
  • “On the Substitutional Theory of Classes and Relations,” in Essays in Analysis, pp. 165-189. Manuscript received by the London Mathematical Society on 24 April 1905.
  • “On ‘Insolubilia’ and Their Solution By Symbolic Logic,” in Essays in Analysis, pp. 190-214. First published as “Les Paradoxes de la Logique,” Revue de Métaphysique et de Morale 14 (1906), pp. 627-50.
  • “Mathematical Logic as Based on the Theory of Types,” in Logic and Knowledge, pp. 59-102. First published in The American Journal of Mathematics 30 (1908), pp. 222-62.
  • Philosophy (New York: W. W. Norton & Co., 1927).
  • Principia Mathematica (coauthored by A. N Whitehead), second edition (Cambridge, 1925, 1962); First edition, Cambridge, Vol. 1 (1910), Vol. 2 (1911), Vol. 3 (1913).
  • Principia Mathematica to *56 (Cambridge, 1964).
  • Introduction to Mathematical Philosophy (London: Allen & Unwin, 1919, 1953).
  • My Philosophical Development (New York: Simon & Schuster, 1959).
  • The Art of Philosophizing (New York: Philosophical Library, 1968).

b. Books and Articles

  • Blackwell, Kenneth. “The Early Wittgenstein and the Middle Russell,” in Irving Block ed., Perspectives on the Philosophy of Wittgenstein (Cambridge, MIT Press, 1981), p. 27, fn. 3.
  • Boolos, George. 1994 “The Advantages of Honest Toil over Theft,” in Alexander George, ed., Mathematics and Mind (Oxford: Oxford University Press), pp. 27-44.
  • Church, Alonzo. 1956 Introduction to Mathematical Logic (New Jersey: Princeton University Press).
  • Church, Alonzo. 1976 “Comparison of Russell’s Resolution of the Semantical Antinomies with that of Tarski,” Journal of Symbolic Logic 41, pp. 747-760.
  • Church, Alonzo. 1984 “Russell’s Theory of the Identity of Propositions,” Philosophia Naturalis 21, pp. 513-22.
  • Cocchiarella, Nino. 1987 “The Development of the Theory of Logical Types and the Notion of a Logical Subject in Russell’s Early Philosophy,” Synthese 45 (1980), pp. 71-115. Reprinted in Logical Studies in Early Analytic Philosophy (Columbus: Ohio State University Press), pp. 19-63.
  • Cocchiarella, Nino. 1987 “Logical Atomism and Modal Logic,” in Logical Studies in Early Analytic Philosophy (Columbus: Ohio State University Press), pp. 222-243.
  • Cocchiarella, Nino. 1987 “Logical Atomism, Nominalism and Modal Logic,” Philosophia, Philosophical Quarterly of Israel 4 (1974), pp. 41-44. Reprinted in Logical Studies in Early Analytic Philosophy (Columbus: Ohio State University Press), pp. 244-284.
  • Cocchiarella, Nino. 1987 “Russell’s Theory of Logical Types and the Atomistic Hierarchy of Sentences,” in Logical Studies in Early Analytic Philosophy (Columbus: Ohio State University Press), pp. 193-221.
  • Copi, Irving. 1971 The Theory of Logical Types (London: Routledge & Kegan Paul).
  • Frege, Gottlob. 1884 The Foundations of Arithmetic, translated by J. L. Austin (Northwestern University Press, 1980). First published as Die Grundlagen der Arithmetik: eine logisch-mathematische Untersuchung über den Begriff der Zahl (Breslau, 1884).
  • Frege, Gottlob. 1893 Grundgesetze der Arithmetik, Vol. I (Jena, 1893), Vol. II (Jena, 1903). Reprinted (Darmstadt, Hildesheim: Georg Olms Verlag, 1962).
  • Frege, Gottlob. 1892 “On Concept and Object,” in Peter Geach and Max Black, eds., Translations from the Philosophical Writings of Gottlob Frege (Oxford: Basil Blackwell, 1977), pp. 21-41. First published as “Über Begriff und Gegenstand,” in Vierteljahrsschrift für wissenschaftliche Philosophie, vol. XIV (1892), pp. 192-205.
  • Frege, Gottlob. 1980 Philosophical and Mathematical Correspondence, edited by Gottfried Gabriel, Hans Hermes, Friedrich Kambartel, Christian Thiel, and Albert Veraart, abridged from the German edition by Brian McGuinness and translated by Hans Kaal (Chicago: University of Chicago Press).
  • Frege, Gottlob. 1964 The Basic Laws of Arithmetic: Exposition of the System, translated with an editor’s introduction by Montgomery Furth (Berkeley: University of California Press).
  • Galaugher, Jolen. 2013 “Substitution’s Unsolved Insolubilia,” Russell 3, pp. 5-30.
  • Geach, P. T. 1956 “Frege’s Way Out” Mind 65, pp. 408-409.
  • Gödel, Kurt. 1944 “Russell’s Mathematical Logic,” in ed., Paul Arthur Schilpp, The Philosophy of Bertrand Russell (Evanston: Northwestern University Press), 125-153.
  • Grattan-Guinness, Ivor. 1977 Dear Russell- Dear Jourdain (London: Duckworth).
  • Grattan-Guinness, Ivor. 2001 The Search for Mathematical Roots 1870-1940: Logic, Set Theories and the Foundations of Mathematics from Cantor Through Russell to Gödel (Princeton University Press).
  • Griffin, Nicholas. 1981 “Russell on the Nature of Logic” Synthese 45, pp. 117-188.
  • Griffin, Nicholas. ed., 2003 The Cambridge Companion to Bertrand Russell (Cambridge University Press).
  • Hatcher, William. 1982 The Logical Foundations of Mathematics (Oxford: Pergamon Press).
  • Hazen, Allen. 2004 “A ‘Constructive’ Proper Extension of Ramified Type Theory; The Logic of Principia Mathematica, Second Edition, Appendix B,” in ed., Godehard Link, One Hundred Years of Russell’s Paradox (Berlin: Walter de Gruyter), pp. 449-480.
  • Holroyd, Michael. 1967 Lytton Strachey (London: Heinemann).
  • Landini, Gregory. 1996 “The Definability of the Set of Natural Numbers in the 1925 Principia Mathematica,” Journal of Philosophical Logic 25, pp. 597-615.
  • Landini, Gregory. 1998a Russell’s Hidden Substitutional Theory (New York: Oxford University Press).
  • Landini, Gregory. 1998b “On Denoting Against Denoting,” Russell 18, pp. 43-80.
  • Landini, Gregory. 2000 “Quantification Theory in *9 of Principia Mathematica,” History and Philosophy of Logic 21, pp. 57-78.
  • Landini, Gregory. 2004a “Logicism’s ‘Insolubilia’ and Their Solution By Russell’s Substitutional Theory,” in ed., Godehard Link, One Hundred Years of Russell’s Paradox (New York: De Gruyter), pp. 373-399.
  • Landini, Gregory. 2004b “Russell’s Separation of the Logical and Semantic Paradoxes,” in Philippe de Rouilhan, ed., Russell en héritage (Revue Internationale de Philosophie 3), pp. 257-294.
  • Landini, Gregory. 2005 “Quantification Theory in *8 of Principia Mathematica and the Empty Domain,” History and Philosophy of Logic, 25, pp. 47-59.
  • Landini, Gregory. 2007 Wittgenstein’s Apprenticeship With Russell (Cambridge: Cambridge University Press).
  • Landini, Gregory. 2013a “Zermelo and Russell’s Paradox: Is there a Universal set?” Philosophia Mathematica 21, pp. 180-199.
  • Landini, Gregory. 2013b “Review of Bernard Linsky, The Evolution of Principia Mathematica: Bertrand Russell’s Manuscripts and Notes for the Second Edition,” History and Philosophy of Logic 34, pp. 79-97.
  • Landini, Gregory. 2016 “Whitehead’s Badly Emended Principia,” History and Philosophy of Logic 37, pp. 1-56.
  • Linsky, Bernard. 1999 Russell’s Metaphysical Logic (Stanford: CSLI Publications).
  • Lowe, Victor. 1990 Alfred North Whitehead: The Man and His Work. Volume II: 1910-1947, edited by J. B. Schneewind (Baltimore: Johns Hopkins University Press).
  • Monk, Ray. Bertrand Russell: The Ghost of Madness 1921-1970 (The Free Press, 2001).
  • Myhill, John. 1974 “The Undefinability of the Set of Natural Numbers in the Ramified Principia,” in George Nakhnikian, ed., Bertrand Russell’s Philosophy (New York: Barnes & Noble), pp. 19-27.
  • Quine, W.V.O. 1954 “Quantification and the Empty Domain,” Journal of Symbolic Logic 19, pp. 177-179.
  • Quine, W.V.O. 1955 “Frege’s Way Out,” Mind 64, pp. 145-159.
  • Quine, W.V.O. 1980 Set Theory and Its Logic (Cambridge: Harvard University Press).
  • Ramsey, Frank. 1925 “The Foundations of Mathematics,” in R. B. Braithwaite, ed., The Foundations of Mathematics and Other Essays by Frank Plumpton Ramsey (Harcourt, Brace and Co., 1931), pp. 1-61. First published in the Proceedings of the London Mathematical Society 25 (1925), pp. 338-84.
  • Rouilhan (de), Philippe. 1996 Russell et le cercle des paradoxes (Paris: Presses Universitaires de France), p. 275.
  • Schmid, Anne-Françoise. 2001 ed., with commentary, Bertrand Russell: Correspondance sur la Philosophie, la Logique et la Politique avec Louis Couturat 1897-1913 (Paris: Éditions Kimé, volumes I, II).
  • Van Heijenoort, Jean. 1967 “Logic as Calculus and Logic as Language,” Synthese 17, pp. 324-30.
  • Whitehead, A. N. 1911 An Introduction to Mathematics (London: Williams and Norgate).
  • Wittgenstein, Ludwig. 1914 Notebooks 1914-1916, ed. by G. H. von Wright and G. E. M. Anscombe (Chicago: University of Chicago Press, 1979).
  • Mays, Wolfe. 1967 “Recollections of Wittgenstein,” in K. T. Fann, ed., Ludwig Wittgenstein: The Man and His Philosophy (New Jersey).

 

Author Information

Gregory Landini
Email: gregory-landini@uiowa.edu
University of Iowa
U. S. A.

Natural Kinds

A large part of our exploration of the world consists in categorizing or classifying the objects and processes we encounter, both in scientific and everyday contexts. There are various, perhaps innumerable, ways to sort objects into different kinds or categories, but it is commonly assumed that, among the countless possible types of classifications, one group is privileged. Philosophy refers to such categories as natural kinds. Standard examples of such kinds include fundamental physical particles, chemical elements, and biological species. The term natural does not imply that natural kinds ought to categorize only naturally occurring stuff or objects. Candidates for natural kinds can include man-made substances, such as synthetic elements, that can be created in a laboratory. The naturalness in question is not the naturalness of the entities being classified, but that of the groupings themselves. Groupings that are artificial or arbitrary are not natural; they are invented or imposed on nature.  Natural kinds, on the other hand, are not invented, and many assume that scientific investigations should discover them.

To say that a kind is natural, rather than artificial or arbitrary, means, minimally, that it reflects some relevant aspects of the world and not only the interests of, or facts about, the classifiers. The expression “footwear under $100,” for instance, describes an artificial kind reflecting some categorizer’s interest—their budget—and not some relevant feature of the classified objects.

Another feature of natural kinds is that they allow many important inferences about the entities grouped within them. Take gold: All entities classified as gold share a property, their atomic structure, that uniquely identifies a chemical element. This property also accounts for gold’s other observed properties, such as its color, malleability, and so forth. Identifying something as gold therefore warrants many inferences and generalizations that apply to all samples of gold, for example, that it dissolves in mercury at room temperature and is unaffected by most acids.

More problematic, but still debated as possible instances of natural kinds, are categories in higher-level sciences: psychological categories, such as emotion; psychiatric conditions, such as depression; and social categories, such as money. We might not be able to identify anything like the atomic structure of a chemical element for depression. However, one might still wonder whether people suffering from it share properties that account for their behaviors and help us explain the condition’s causes and how it might be treated. Few people, perhaps, will consider most higher-level categories, such as psychiatric conditions, to be candidates for natural kinds. Nonetheless, what makes depression a legitimate scientific category, unlike hysteria, remains to be examined.

This article describes the three most prominent accounts of natural kinds: essentialism, cluster kinds, and promiscuous realism. It spells out some of the features standardly associated with natural kinds and then examines the three views on natural kinds via specific examples of candidates for natural kinds in chemistry, biology and psychiatry. The final section discusses the metaphysics of natural kinds and offers a systematization of the possible views.

Table of Contents

  1. What Makes a Kind Natural?
    1. Natural Kind Monism vs. Pluralism
    2. How to Identify Natural Kinds: Their Role in Inductive Generalizations, Scientific Laws, and Explanations
    3. Natural Kinds and Functional Kinds
    4. The Increase of Interest in Natural Kinds in the Twentieth Century
  2. Three Views on Natural Kinds: Essentialism, Cluster Kinds, and Promiscuous Realism
    1. Essentialism: The Case of Chemical Elements
    2. Natural Kinds as Property Clusters: The Case of Biological Species
    3. Promiscuous Realism: The Case of Psychiatric Categories
  3. Metaphysics of Natural Kinds
    1. What Does It Mean that a Kind is Real?
    2. The Relationship Between Scientific Realism and Natural Kinds Realism
    3. Natural Kinds Realism
    4. Natural Kinds Antirealism
  4. Conclusion
  5. References and Further Reading

1. What Makes a Kind Natural?

The philosophical tradition has long held that we ought to search for natural classifications in our investigation of the world. The nature of this demand can be difficult to spell out. It is often illustrated with Plato’s famous metaphor about “carving nature at its joints.” In Phaedrus, he says that we should “divide into forms, following the objective articulation; we are not to attempt to hack off parts like a clumsy butcher” (Plato 1952, 265e). The underlying intuition here is that the natural world is divisible into objective categories and that we should strive to discover such divisions. That is, our exploration of the world should model itself on the practice of a competent butcher who, when cutting the meat, follows its natural divisions and does not clumsily hack parts off.

Questions arise as to how we identify suitable candidates for such “natural openings” and where we should draw divisions between objects in the world. One good place to look for them would be in the discipline of particle physics because it appears that, if there are some objective divisions in nature, they will surely be found at the level of fundamental entities that comprise all existing things: protons, neutrons, electrons, or even smaller particles like quarks. That kind of reasoning was already present in ancient Greece, where attempts were made at discovering the true nature of all things, whether it was elements that everything else is composed of, like water or fire, or whether it required finding the smallest indivisible building blocks of matter, like atoms. In this respect, contemporary scientific research might be seen as a continuation of the same project.

Alternatively, one might argue that the approach of finding the most basic constituents of matter is too restrictive and that there are many other objective categories to be discovered. In geology, for instance, different rocks can be divided according to their qualities—mineral and chemical composition, permeability, texture of the constituent particles, particle size—and these can be taken as objective parameters for classification. Moreover, some authors make a case that there are natural kinds in the higher-level or special sciences such as biology, psychology, or linguistics (Fodor 1974). It could be argued, for example, that certain basic emotions, such as fear and anger, are identified and recognized across different cultures, which makes them suitable candidates for natural kinds. Similar reasoning might be applied to nonnatural or artificial entities, including cultural artifacts, such as language. The fact that certain linguistic patterns occur systematically across all natural languages may indicate that groupings of such patterns represent objective linguistic categories.

a. Natural Kind Monism vs. Pluralism

Cross-cultural convergence in classification, as in the example above of common linguistic patterns, can be interpreted in two ways. One is to say that it indicates the existence of objective categories that rational investigators will eventually discover. The other is the notion that we group things in such a way because our cognitive makeup makes those groupings especially salient to us. In this case, the grouping would not only reflect the objective structure of the world, but also our cognitive dispositions. This issue is examined in the section entitled “Metaphysics of Natural Kinds.”

In many cases, however, the classification systems are not shared, but rather vary cross-culturally, or across different scientific disciplines. Facing such situations, one might wonder whether there is one correct system or whether different ones can be equally valid. Going back to Plato’s metaphor, if different butchering traditions produced meat that is carved up differently, so that there are no T-bone steaks in England and no roasts in the US, would it mean that one of those traditions is doing it wrong, or that there are different ways to carve the meat at its joints? On this issue, we can distinguish between the position of the monists and that of the pluralists.

Natural kind monists hold that there is only one correct way of dividing the world into natural kinds, of carving nature at its joints. In such a view, no crosscutting classifications should be considered natural kinds. In case there is any overlap between different kinds, one must be a sub-kind of the other. This claim is known as the hierarchy thesis regarding natural kinds (Khalidi 2013). The isotopes of hydrogen (protium, deuterium, and tritium), for instance, can be said to constitute sub-kinds of the kind hydrogen. That is, they have the same atomic number, but different numbers of neutrons in the nucleus. Accordingly, a monist either claims that there is one natural categorization of entities in the world, and it must apply only to the lowest possible level of classification, or, if there are higher-level natural kinds, they should form a hierarchy that bottoms out at the lowest level. From this we can see that monists do not necessarily need to endorse the hierarchy thesis.

Natural kind pluralists, on the other hand, countenance different ways of classifying entities into natural kinds. In their view, entities can be cross-classified in different ways, depending on the purposes that these classifications serve. We can classify biological organisms, for instance, into species if we are interested in their ancestry or breeding patterns. But we can also classify them into ecological groupings, for instance, that of detritivores, which refers to organisms that consume decomposing organic matter and encompasses a wide array of organisms, from fungi and worms to some bacteria. In the pluralist view, we cannot claim that one of these classifications is superior and ought to be endorsed at the expense of the other. Rather, both can be useful and equally valid depending on the purposes and contexts of scientific investigation. Pluralists are not typically associated with endorsing the abovementioned hierarchy thesis, since they have no problem allowing crosscutting classifications. But a pluralist can hold the view that different classification systems, responding to diverse scientific interests, still have to be hierarchically ordered. Even if the hierarchy thesis is normally associated with a monistic approach, therefore, the monism-versus-pluralism question and the idea of a strict hierarchy of natural kinds are conceptually distinct.
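The hierarchy thesis can be made concrete with a small sketch. Assuming, purely for illustration, that a kind is represented by the set of its members, two kinds crosscut when they overlap without either containing the other, and a classification satisfies the hierarchy thesis when no two of its kinds crosscut. The function names and the toy samples below are hypothetical and are not drawn from any of the cited authors.

```python
from itertools import combinations


def crosscutting_pairs(kinds):
    """Return pairs of kind names that overlap without one containing the other."""
    pairs = []
    for (name_a, members_a), (name_b, members_b) in combinations(kinds.items(), 2):
        overlap = members_a & members_b
        nested = members_a <= members_b or members_b <= members_a
        if overlap and not nested:
            pairs.append((name_a, name_b))
    return pairs


def satisfies_hierarchy_thesis(kinds):
    """True when any two overlapping kinds stand in a kind/sub-kind relation."""
    return not crosscutting_pairs(kinds)


# Toy data (assumed): isotopes as sub-kinds of hydrogen.
chemical = {
    "hydrogen": {"protium_sample", "deuterium_sample", "tritium_sample"},
    "deuterium": {"deuterium_sample"},
    "tritium": {"tritium_sample"},
}

# Toy data (assumed): a taxonomic grouping and an ecological grouping.
# Only some bacteria are detritivores, so the two groupings crosscut.
biological = {
    "bacteria": {"decomposer_bacterium", "photosynthetic_bacterium"},
    "detritivores": {"decomposer_bacterium", "earthworm"},
}

print(satisfies_hierarchy_thesis(chemical))
print(crosscutting_pairs(biological))
```

On this representation, a monist who endorses the hierarchy thesis would have to deny that at least one of the two biological groupings is a natural kind, whereas a pluralist can accept both.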

b. How to Identify Natural Kinds: Their Role in Inductive Generalizations, Scientific Laws, and Explanations

So far, we have been dealing with very general questions concerning whether the world can be divided into certain privileged groupings. If indeed there are such groupings, that is, natural kinds, then it is worthwhile to establish the criteria for something to be a natural kind. Different accounts of natural kinds ascribe different features to them, but all of them, at a minimum, presuppose the following: The entities classified into a kind should share a set of common properties by which they are grouped together. This grouping of common properties ought not to be accidental. To illustrate this, we can think about cases in which we group entities together based on observable properties and then establish that there is a common cause that accounts for those properties. We note, for instance, that sunflowers (Helianthus annuus) share common observable properties—a large, usually yellow, flower head, a tall, erect stem, broad and rough leaves, and so on—and conclude that there must be an underlying explanation for such a clustering of properties. This explanation draws on the fact that all sunflower plants belong to the same species, which points to a common cause for the common properties. Regarding species in general, this common element might stem from shared ancestry or an ecological niche, exchanging genetic material through interbreeding with other species members, and so on.

The properties shared by members of a natural kind need not be directly observable. In many accounts, chemical elements are considered to be standard examples of natural kinds for which important properties shared by members of the kind are not directly observable. Take carbon, for instance. It is well known that different structures of carbon atoms constitute materials of extremely different properties, such as diamonds and graphite. Nevertheless, both diamonds and graphite are taken to be composed of the same element because they share a deep property, namely, the microstructure.

These features of natural kinds can help us see why it is useful to classify the world into such categories and indicate why natural kinds are commonly taken to play an important role in inductive inferences, scientific laws, and explanations. Let us briefly examine how the debate on natural kinds is entangled with these key issues in the philosophy of science. Classifying things into kinds according to their shared properties is theoretically and practically significant because it normally countenances inductive inferences about the members of kinds. Our previous encounters with sunflowers, for instance, allow us to infer some properties and behaviors related to this species, such as that they grow best when exposed to plenty of sun, in fertile, moist and well-drained soil; that they can be used to extract some toxic ingredients from the soil, such as arsenic or lead, and so on. Establishing the existence of stable, clustered properties associated with sunflowers thus underpins the inductive inference that future observed instances of this kind will also share some or all of those properties. This enables us to formulate relatively precise instructions for plant cultivation.

Natural kinds also play an important role in laws of nature or scientific laws. How this role is characterized and explained depends on the exact account of scientific laws one endorses (see the article on Laws of Nature). Consider copper as a candidate for a natural kind. All instances of copper share some common properties: They are soft, malleable, and ductile, with a reddish-orange color. These observable features can be accounted for by the atomic structure of copper, namely that it has a nucleus containing 29 protons and, in its two stable isotopes, 34 or 36 neutrons, and that the nucleus is surrounded by 29 electrons localized in 4 shells. Like other metals, it consists of a lattice of atoms and has a single electron in the outer shell that does not remain connected to particular atoms but forms an electron cloud spreading through the lattice. This cloud, containing many dissociable electrons, makes the conduction of electric currents possible. These facts about the atomic structure of copper allow us not only to infer that a subsequently observed instance of copper will conduct electricity, but also to establish it as a scientific law of the following form: “All pure copper conducts electricity.”

The plausibility of this assumption about natural kinds depends on how stringently we construe natural laws. For instance, it is often taken that laws are necessary, exceptionless, and universal. Specifically, natural kind essentialists, as further explained in section 4.a, hold that there ought to be some common properties, that is, essences, shared by all and only members of a kind. The existence of these unique properties would, in turn, ground the idea that laws of nature necessarily hold with respect to members of natural kinds. In this view, it also follows that natural kinds ought to be categorically distinct; that is, there can be no continuum or smooth transition between different kinds. Rather, there should be some natural boundaries between them. Many authors argue, however, that essentialism is not the best account of natural kinds because it excludes many scientific categories such as those in biology, psychology, and other special sciences that do not fulfill its demanding criteria (Dupré 1981, Khalidi 2013).

The assumption that natural kinds play an important role in inductive inferences and scientific laws explains the widespread belief that natural kinds are important for scientific explanations. We saw how the atomic structure of copper explains its observable properties, such as electric conductivity. Establishing a common cause or mechanism that accounts for the grouping of properties in nature also provides an explanation for the behavior of entities thus classified. It must be noted, however, that the role natural kinds play in scientific explanations also depends on the notion of scientific explanation that one endorses (see the article on Theories of Explanation).

Thus far, the assumption has been that natural kinds are characterized by shared common properties, which in turn account for their role in inductive generalizations, scientific laws, and explanations. However, if we start with the assumption that natural kinds are those categories that play important roles in scientific inferences and theories, we ought to address the question of whether functional kinds, which are important in many scientific disciplines, are natural kinds. This issue is addressed in the next subsection.

c. Natural Kinds and Functional Kinds

Functional kinds are defined as groups of entities united by a common function—that is, by their activities and causal roles. Common examples include biological kinds, such as predator and prey; psychological kinds, such as pain; and artifact kinds, such as knives. What connects all these examples is that the entities in question are grouped together because of something they do, and not because they share similar underlying properties. Very different species of animals can belong to the predator category, such as jaguar, human, rattlesnake, or stork. Similarly, very different kinds of things can be used as a knife, from a piece of a sharp stone or glass to steel blades specifically manufactured for cutting food. This phenomenon is referred to as multiple realizability of functional types or kinds (see the article Mind and Multiple Realizability), and it has been a widely discussed topic in the philosophy of mind with regard to mental kinds, like pain.

On the one hand, one might argue that mental kinds, such as pain, cannot be taken to be natural kinds because they cannot be reduced to paradigmatic physical kinds. It is plausible that very different types of creatures can feel pain. For instance, it is plausible that humans, squids, and snakes can experience pain, although they have very different types of neurophysiological architectures. If pain can be realized by different physical states, however, then it seems that pain could only be a “widely disjunctive” and disunified kind, in the sense that in humans it is realized by one set of neurophysiological states, in squids by another, in snakes by still another set, and so on for different species. Some authors have concluded on these grounds that it is impossible to unify or reduce categories of special sciences to the more basic categories that we find in the physical sciences, which provide paradigmatic examples of natural kinds (Fodor 1974).

On the other hand, functional kinds, such as pain, play important roles in scientific explanations in various disciplines of special sciences, psychology being the most prominent example. Thus, if they play such an important role in the special sciences, it is worthwhile to examine them as candidates for natural kinds in such disciplines. Some see the fact that functional kinds play such an important role in scientific explanations as a reason to assume that they are not really multiply realizable and thus widely disjunctive, and that the properties important for realizing a function need to be shared by the entities grouped together. Alternatively, other authors argue that natural kinds can be multiply realizable and that functional kinds can be considered instances of natural kinds (Ereshefsky and Reydon 2015).

d. The Increase of Interest in Natural Kinds in the Twentieth Century

The topic of natural kinds gained momentum in the second half of the twentieth century in relation to two philosophical debates: the debate on paradoxes of confirmation and inductive inferences in the philosophy of science and the debate on theories of reference in the philosophy of language. Let us start with the first issue, since it relates to the aforementioned role that natural kinds play in inductive inferences.

Views of natural kinds that emphasize their role in inductive inferences face Goodman’s new riddle of induction (see the article on Confirmation and Induction). Nelson Goodman (1983) argued that there are innumerable ways to draw inductive inferences from a given data set. For instance, from the same data set consisting of green emeralds, we can infer either that all emeralds are green or that all emeralds are “grue,” a word Goodman invented for the purpose of this argument. “Grue” is a predicate that is defined relative to some fixed time: Something is grue if it is observed prior to the year 2050 and is green, or it is observed after the year 2050 and is blue. Since every emerald observed so far satisfies this definition, the same observations seem to allow us to conclude that all emeralds are grue. This leads to a paradoxical situation in which observing instances of green emeralds in the past can serve as an inductive basis for inferring that in the future, blue emeralds will be observed. We consider the induction based on the concept “green” to be acceptable and reject the induction based on “grue.” This indicates that the choice of kind concepts matters for preferring certain inductive inferences. The question arises as to how, or on what basis, we can draw the line between concepts that are suitable for inductive generalizations and those that are not.
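The structure of the riddle can be made concrete with a minimal sketch. It assumes the definition of “grue” given above with 2050 as the fixed time; the names `Observation`, `is_grue`, and `predicted_color`, and the sample observation years, are hypothetical and serve only as illustration. The sketch shows that the same past observations fit both predicates, while the two corresponding generalizations diverge about the future.

```python
from dataclasses import dataclass

CUTOFF_YEAR = 2050  # the fixed time used in the definition of "grue" above


@dataclass(frozen=True)
class Observation:
    year: int   # year in which the emerald is observed (illustrative data)
    color: str  # observed color, e.g. "green" or "blue"


def is_green(obs: Observation) -> bool:
    return obs.color == "green"


def is_grue(obs: Observation) -> bool:
    # Observed before the cutoff and green, or observed after it and blue.
    # Assumption: the cutoff year itself is grouped with "after".
    if obs.year < CUTOFF_YEAR:
        return obs.color == "green"
    return obs.color == "blue"


def predicted_color(hypothesis: str, year: int) -> str:
    """Color that each generalization predicts for an emerald observed in `year`."""
    if hypothesis == "green":
        return "green"
    if hypothesis == "grue":
        return "green" if year < CUTOFF_YEAR else "blue"
    raise ValueError(f"unknown hypothesis: {hypothesis}")


# Every emerald observed so far is green, so it is also grue.
past_emeralds = [Observation(year, "green") for year in (1900, 1975, 2024)]
fits_green = all(is_green(o) for o in past_emeralds)
fits_grue = all(is_grue(o) for o in past_emeralds)

# The two generalizations nevertheless disagree about a later observation.
later_year = CUTOFF_YEAR + 1
green_forecast = predicted_color("green", later_year)
grue_forecast = predicted_color("grue", later_year)
```

Nothing in the data distinguishes the two hypotheses; the preference for “green” has to come from somewhere else, which is the point at which appeals to natural kinds, similarity, or entrenchment enter the debate.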

Willard Van Orman Quine (1969) introduced natural kinds as a solution to Goodman’s grue paradox and argued that what makes concepts projectable and suitable for inductive generalizations is the fact that they refer to kinds. Natural kinds are sets whose members share similar properties. This does not entirely solve Goodman’s problem, however, since, according to Quine, natural kinds rest on an even more problematic notion of similarity. That is, to know how to classify objects into kinds, we already need to have an account of what makes properties similar in relevant respects. In his view, our standards for judging similarity are preset, that is, they are a part of our cognitive setup, and are needed for any learning to occur. The main question is why we should assume that our similarity standards track some real groupings in nature. Quine’s answer is that we are successful in making inductions because our similarity spaces have evolved through natural selection, by a process of trial and error. Goodman’s solution to the problem he articulated is, simply, that certain concepts, for example, “green,” are better entrenched in our usage and language than others, such as “grue.” This means that we have used them more and have been successful in doing so. Thus, groupings that have proven to be inductively successful have become entrenched in our language.

A natural kind essentialist answers this problem by claiming that concepts suitable for inductive generalizations are those that correspond to the real, mind-independent groupings in nature and are characterized by shared essences. Non-essentialists, however, cannot endorse this answer because they contend that we do not have access to mind-independent divisions, even if they exist. They do not think that we can identify certain properties that all and only members of a kind share and in virtue of which they belong to natural kinds.

A different route to the topic of natural kinds ran through the debates on theories of reference, specifically, Saul Kripke’s (1972) and Hilary Putnam’s (1975) essentialist views on natural kinds. These views were inspired by the problems of the semantics of natural kind terms. Both Kripke and Putnam argue against descriptivist theories of meaning of natural kind terms (see the article on Gottlob Frege) that identify the meaning of a term with the description of properties associated with that term. In the case of the term “water,” for instance, the description that it is a clear, odorless, colorless, and drinkable liquid fixes its meaning. Kripke and Putnam argue, instead, that even if all the descriptions we associate with a natural kind term are false, we can still refer to that kind.

In Putnam’s Twin Earth thought experiment (see the article on Internalism and Externalism in the Philosophy of Mind and Language), he asks us to imagine a situation in which there is Twin Earth, a planet that is exactly like Earth except for one difference: Instead of water, there is a superficially identical liquid with a different chemical composition. That is, instead of H2O, it consists of XYZ. People on Twin Earth also refer to this liquid as “water.” But if we ask whether people on Earth and people on Twin Earth refer to the same stuff when they say “water,” the answer seems to be no. This means that there is more to reference than the description associated with a kind term. Putnam shows that something external to the user, namely the objective causal relations with the referent, is relevant for fixing the meaning of natural kind terms. What characterizes all instances of a kind is the fact that they bear some relation to other members of a kind; in the case of water, this is the relation of being the same liquid, that is, having the same chemical microstructure as other samples of water.

Kripke and Putnam advanced and popularized an essentialist view of natural kinds that many considered to be acceptable because it did not construe kind essences as elusive properties, but as something discoverable by scientific inquiry. This view, however, provoked reactions from philosophers dealing with special sciences, such as biology, psychology, psychiatry, and so forth, where scientific classifications do not fulfill the essentialist criteria (for an exhaustive overview and criticism of the Kripke-Putnam version of essentialism, see LaPorte 2003). This has led to accounts of natural kinds that aim to loosen the criteria that determine which categories can constitute them. The most popular are the clustering accounts, which take natural kinds to pick out clusters of properties: members of a kind do not need to share unique essences but, rather, a certain number of common properties, where these properties are shared for nonarbitrary reasons. Section 3.b further discusses clustering accounts.

Other, more metaphysically minded philosophers, inspired by the work of Kripke and Putnam, started to develop an approach that has been termed scientific essentialism (Bird 2007, Ellis 2001). This view claims that the fundamental laws of nature hold because of essential properties of natural kinds. Thus, given that natural laws are grounded in the natural kind structure of the world, it is their essences that explain why the laws of nature are, in fact, metaphysically necessary. Roughly put, entities in the world must behave the way they do because of their natures. Scientific essentialists are usually concerned with fundamental kinds such as electrons, whose essential properties, like electric charge and mass, cause all their lawful behaviors.

The abovementioned contrasting reactions to essentialist views on natural kinds reflect a more general juxtaposition on how to approach the natural kinds debate. On the one side, there are authors, such as the scientific essentialists, who are more interested in the metaphysical problems and conceive of natural kinds as the most fundamental groupings of entities in the world. They tend to endorse very rigorous views on what it takes for a kind to be natural. Certain interpretations of essentialism are compatible with such an approach. On the other side, there are authors who are mainly oriented toward actual scientific practice and tend to assume that successful scientific classifications can be used as paradigmatic cases of natural kinds, and that the job of the philosophical accounts of natural kinds is to track the main features of such classifications and offer an account of natural kinds that will be able to encompass the scientific practice (Kendig 2015).

The next section provides an overview of the three most prominent accounts of natural kinds, starting with essentialism. The overview follows this general tendency to start with a strict philosophical account of natural kinds, and then to offer more relaxed criteria that take into consideration the data coming from the practice of scientific classification. Even essentialism, as the most demanding view, has been interpreted in different ways with the aim of capturing existing scientific categories. After essentialism, two more encompassing views are presented: cluster kinds, a view that emphasizes the clustering of properties specific to members of a kind without requiring the possession of unique kind essences, and, finally, the category of promiscuous kinds, which is the most liberal, allowing for members of a kind to have a small number of shared properties if they serve certain explanatory purposes.

2. Three Views on Natural Kinds: Essentialism, Cluster Kinds, and Promiscuous Realism

The three main views on natural kinds—essentialism, cluster kinds, and promiscuous kinds—are illustrated using specific examples from different scientific disciplines. The chemical elements are used to exemplify essentialism, since they are the most commonly used example of essentialist categories. The cluster kind view has been advanced as a reaction to the inadequacy of essentialism to capture many scientific classifications; biological species, being the most prominent among them, will be used to illustrate this account. Lastly, promiscuous realism, the most relaxed account of natural kinds, will be illustrated by invoking the example of psychiatric categories, which many consider to be highly disputable candidates for natural kinds. Since promiscuous realism allows even folk categories to count as natural kinds and allows for a vast range of interests to play an important role in establishing what constitutes a natural kind, psychiatric categories represent an interesting case study in which both scientific and practical concerns may be taken for establishing which classifications ought to be taken as relevant. It needs to be emphasized that the decision to illustrate the main accounts of natural kinds with these specific examples does not imply that these accounts are suitable only for those categories or those disciplines. Often, even though not necessarily so, authors proposing an account of natural kinds assume that it can be applied to all instances of natural kinds, regardless of the scientific discipline in question.

a. Essentialism: The Case of Chemical Elements

According to essentialism, natural kinds are groupings of entities that share a common essence, that is, intrinsic properties or structure(s) uniquely possessed by all and only members of a kind. An intrinsic property is a property that an entity has independently of any other things, while an extrinsic property is one that a thing has in virtue of some relations or interactions with other entities. The basic idea is that the essence causes and explains all other observable shared properties of the members of a kind and allows us to draw inductive inferences and formulate scientific laws about them. Chemical elements are used as standard examples of paradigmatic candidates for essentialist natural kinds. Their intrinsic properties, that is, the structures of their atoms, determine their observable properties. Take the case of hydrogen: Its atoms consist of a single proton in the nucleus and a single electron in the atomic shell. The structure of hydrogen atoms determines the bonds they can form with other entities, such as the bond in the diatomic molecule H2. These molecular forms then determine other properties of hydrogen, such as its colorlessness, odorlessness, tastelessness, and high combustibility at normal temperatures. They also account for its prevalence in molecular forms, such as water and organic compounds, because it has a disposition to form covalent bonds with nonmetallic elements. This makes the atomic structure of hydrogen its essence, a property that is shared by all hydrogen atoms and not shared by atoms of any other element.

Many essentialists think of the periodic table of elements as a perfect illustration of how things in the world are divided into natural kinds. In our exploration of nature, we can find different substances with a range of properties, but a closer examination shows that they all belong to some basic categories clearly distinct from one another, and that this distinctiveness is a consequence of their intrinsic properties. The fact that chemical elements form natural kinds in virtue of their shared essences accounts for the fact that they ground scientific laws and inductive generalizations. For instance, knowing that something is hydrogen gas allows us to infer that it will spontaneously react with chlorine and fluorine at room temperature, thus forming potentially hazardous acids.

Essentialism requires natural kinds to be discrete or categorically distinct. If there were smooth or continuous transitions from one kind to another, we would have to decide, perhaps arbitrarily, where to draw the line of demarcation between them. The essentialist holds that essences are supposed to provide us with an objective criterion for where to draw such lines. It is precisely the requirement that members of each kind share a unique essence that excludes vague or unclear cases in which we cannot clearly determine to which kind an entity belongs. Brian Ellis (1999), for instance, takes the discreteness of chemical categories as scientific evidence that the world is structured into essentialist kinds. He contends that if there were a smooth transition between different kinds, then the demarcation between them would not be drawn by nature; rather, we would have to decide where to draw the line.

Essentialists are typically, but not necessarily, monists. They typically hold that there is a single correct way of dividing the world into natural kinds. In this view, it might seem that categories are natural only if they constitute a unique way of organizing phenomena under investigation. In that case, there could be no crosscutting categories in the domain under investigation. Humans and dogs, for example, are classified into the category mammals, but dogs and crocodiles (which are not mammals) can be classified into the category quadruped. In such cases, a monist ought to claim that one of these categories is not a natural kind, that is, for instance, that dogs and crocodiles are natural kinds, while quadrupeds are not. It appears, though, that monists can accept overlapping classifications if they are hierarchically ordered, which means that in cases in which there is overlap between two different kinds, one must be a sub-kind of the other. Linnaean taxonomy is an example of a hierarchically ordered classification with seven different ranks of classification, starting with species at the lowest level, and ending up with kingdom at the top as the widest category, encompassing all the others. Humans are thus classified into the species Homo sapiens, and also into the class mammals and the kingdom animals; here, species is a subcategory of class, and class is a subcategory of kingdom.
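The condition that overlapping kinds must be nested can be stated as a simple check. The following is a minimal sketch, assuming that each kind is represented by a toy extension (a set of three made-up individuals); the function names `is_hierarchical` and `crosscutting_pairs` are hypothetical and are not drawn from any taxonomy software.

```python
from itertools import combinations


def crosscutting_pairs(kinds: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return pairs of kinds that overlap without one containing the other."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(kinds.items(), 2):
        overlaps = bool(a & b)
        nested = a <= b or b <= a
        if overlaps and not nested:
            pairs.append((name_a, name_b))
    return pairs


def is_hierarchical(kinds: dict[str, set[str]]) -> bool:
    """True if every two kinds are either disjoint or nested."""
    return not crosscutting_pairs(kinds)


# Toy extensions (illustrative only): three individuals stand in for whole taxa.
linnaean = {
    "Homo sapiens": {"human"},
    "mammals": {"human", "dog"},
    "animals": {"human", "dog", "crocodile"},
}
mixed = {
    "mammals": {"human", "dog"},
    "quadrupeds": {"dog", "crocodile"},
}

linnaean_ok = is_hierarchical(linnaean)
mixed_ok = is_hierarchical(mixed)
mixed_conflicts = crosscutting_pairs(mixed)
```

On this toy data the Linnaean ranks pass the check, while mammals and quadrupeds share the dog without either containing the other, which is the crosscutting case a monist has to rule out.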

Nonetheless, the idea of crosscutting categories is not necessarily incompatible with essentialism. In nuclear physics, if we focus on patterns of radioactive decay and the stability of elements undergoing decay, then chemical elements can be classified in a way that crosscuts standard classification as captured by the periodic table. Radionuclides, for instance, are unstable atoms with excess nuclear energy that undergo radioactive decay. They can occur naturally or artificially. Examples include tritium, a radionuclide and an isotope of hydrogen, and carbon-14, a radioactive isotope of carbon. If we were to build a classification system that is based on the stability of radioactive atoms, it would be different from the standard chemical classification into elements, but one can still argue that it would track certain essences or essential properties.
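As a rough illustration of how a stability-based classification crosscuts the classification into elements, the sketch below groups a handful of nuclides in two ways. The proton and neutron counts are standard values; the selection of nuclides, the tuple layout, and the helper `group_by` are assumptions made for the example.

```python
from collections import defaultdict

# (name, protons, neutrons, radioactive)
NUCLIDES = [
    ("hydrogen-1", 1, 0, False),
    ("tritium", 1, 2, True),
    ("carbon-12", 6, 6, False),
    ("carbon-14", 6, 8, True),
    ("uranium-235", 92, 143, True),
    ("uranium-238", 92, 146, True),
]


def group_by(records, key):
    """Group nuclide names by the value that `key` returns for each record."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record[0])
    return dict(groups)


# Classification by nuclear charge (atomic number): the chemical elements.
by_element = group_by(NUCLIDES, key=lambda r: r[1])

# Classification by stability: radionuclides versus stable nuclides.
by_stability = group_by(NUCLIDES, key=lambda r: r[3])

hydrogen = set(by_element[1])
radionuclides = set(by_stability[True])

# The two groupings overlap (tritium) without either containing the other.
shared = hydrogen & radionuclides
crosscuts = bool(shared) and not (hydrogen <= radionuclides or radionuclides <= hydrogen)
```

Each grouping is fixed by a property of the nucleus, which is why one can argue that both track essential properties even though they divide the same entities differently.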

In the philosophy of chemistry, microstructuralism is the essentialist view according to which chemical kinds ought to be individuated solely according to their microstructural properties (Hendry 2006), like the nuclear structure represented by the atomic number for chemical elements. While higher-level, observable properties can be used to identify what kind some entity belongs to, the microstructure has explanatory priority, and is the real arbiter of whether something belongs to a kind, because it is responsible for all the other properties and relations into which the entity can enter. The problem that microstructuralists face, however, is whether they can demonstrate that microstructural properties really have this potential and specify what the relevant microstructural similarities are. In addition, they need to explain why, in general, we should privilege groupings based on microstructure, as opposed to some other way of classifying things.

Let us grant, for example, that the essence of water is its H2O molecular structure. If we take an individual molecule of water, it will not have the observable properties we commonly associate with water. Moreover, water is more accurately described as containing H2O, OH−, H3O+ and some other less common ions. The problem is not that we do not know what the microstructure of water is; the problem is that there is no one microstructure responsible for the observed properties. In fact, the observed properties are a result of very complex and ever-changing interactions. It is correct to say that the average ratio of atoms of H and O is 2:1 but the observable properties of water do not depend upon this ratio. Rather, they depend upon the interactions between the dissociated ions.

It is far from straightforward to specify what exactly structural similarity amounts to, since this appears to be a matter of degree. It is unclear how much microstructural similarity is enough to individuate a natural kind. If we are focusing on the nuclear properties of atoms, we can target nuclear charge, where the atomic number—that is, the number of protons in the nucleus—is relevant for establishing a kind, in this case the kind chemical element. Alternatively, we can target nuclear mass, and reach a classification into isotopes. Isotopes have the same nuclear charge and undergo the same reactions at different rates, but the differences between them can be significant. Take the example of isotopes of uranium: uranium-235 and uranium-238. These isotopes differ not only in the number of neutrons, but also in other important properties, for instance, in how radioactive they are. Furthermore, take the example of chiral molecules, which have a similar structure but different dispositions due to their components being differently geometrically configured, one being a mirror image of the other. The question can be posed as to whether enantiomers—molecules that are mirror images of chiral molecules—form a separate kind according to microstructuralism.

When we turn to the classification of macromolecules such as proteins, we face the problem of justifying a classification based on microstructural properties, since macromolecules are standardly individuated by their functions. This has led some authors to propose a pluralism about macromolecular classification (Slater 2009). Microstructuralists can accept a form of pluralism as long as the kind essences are microstructural properties. Introducing functional properties of the macromolecules, however, goes beyond the scope of microstructuralism. Essentialists, more generally, as was already noted, can accept such pluralistic positions and allow, for instance, that the classification of chemical elements based on their atomic number stems from an interest in explaining particular material transformations and that chemical classifications might have been very different if we had started out with different interests (Hendry 2010). Thus, if we are interested in the behavior of biological macromolecules, we can classify them according to function rather than structural properties.

Such pluralist forms of essentialism can encompass a much wider range of scientifically interesting categories, but at a cost of reducing the importance of the role played by essences in causing and explaining all other properties typically associated with kind members. If we allow that a diverse range of interests tracks different essences, and we group the same entities into many crosscutting classifications, then essences would not play as important a role as has been assumed. The basic essentialist idea is that when we know which natural kind an entity belongs to, we can infer many important properties of that kind, exactly because the essence is responsible for all those shared properties. If, on the other hand, there are many different essences that we can track, and that, accordingly, enable the grouping of the same entities into different, crosscutting categories, then knowing the essence and which category an entity belongs to would not give us full information about the entity we are investigating. Rather, it would give us only partial information, depending on specific interests that lead us to investigate some group of entities.

Perhaps the most powerful objection leveled against essentialism is that it is inapplicable to kinds in many non-fundamental sciences. Biological species, for instance, which were taken as standard examples of natural kinds, do not fulfill essentialist requirements. Moreover, essentialism seems to be incompatible with the Darwinian theory of evolution. There are no properties of species that all and only members of a species share. But even if we were to find some, we would expect that they could easily be changed by evolutionary mechanisms, such as mutation, recombination, and random drift. These considerations have led many authors to conclude that essentialism is not a satisfactory view of natural kinds (see, for instance, Sober 1994, Wilson, Barker and Brigandt 2007) and to declare “the death of essentialism” (Ereshefsky 2016).

The essentialists respond by restricting natural kinds to the more fundamental sciences, such as physics and possibly chemistry (see, for instance, Ellis 2008). The idea is that natural kinds refer to the groupings discovered by those sciences and the scientific classifications of higher-level sciences do not refer to natural kinds. Other philosophers reconceptualized essentialism to countenance essences as extrinsic or relational properties and not only as intrinsic ones. The property of being a descendant of a certain ancestor, for example, might be essential for belonging to a species. Acceptance of such a view, however, represents a significant departure from standard essentialism. Even though being a descendant of a Canis lupus familiaris might be a necessary and sufficient condition for belonging to the kind dog, this relation does not seem to play the role standardly associated with kind essences. The main motivation for essentialism is that possessing an essence accounts for the similarity of members of a kind. If we have a case where we can point to some common essence, but this essence does not guarantee that members of a kind will share important properties, then the essence does not play the role it was supposed to play. The fact that members of some species share a certain ancestor can cause many similarities between them, but there might also be many significant differences between them. Different breeds of dog, for instance, share a common ancestor, which makes them similar in certain respects, but they are also dissimilar in salient respects. For example, Siberian Huskies, with their double layered coats, are adapted for cold environments, while Border Collies are well equipped to withstand heat. Thus, sharing a common ancestor does not in any way guarantee that members of a kind will share a certain set of properties and thereby does not play the role that essence is supposed to play.

The problems that arise when we try to apply an essentialist account to many categories in the special or higher-level sciences, most notably, to the category of species in biology, have prompted relaxing the constraints proposed by essentialist views. The most established of such reactions is the proposal that natural kinds should be identified with clustered properties and not essential ones. In the next section, the cluster approaches to natural kinds are presented through the example of biological species, one of the main examples used to illustrate the adequacy of cluster approaches, especially in opposition to essentialism.

b. Natural Kinds as Property Clusters: The Case of Biological Species

Cluster kind approaches offer a less strict view of natural kinds. According to these views, the members of a kind need not share a set of necessary and sufficient properties; it is enough that they share some subset of properties that tend to cluster together due to some underlying common causes. The main idea is that nature is structured in such a way that properties are not randomly distributed across space-time; rather, they are systematically “sociable” (Chakravartty 2007), in the sense that families of properties form stable clusters. Natural kinds are categories that pick out such clusters of properties. This is a much more encompassing view than essentialism because none of the properties is necessary for kind membership: it is sufficient that some of them are shared, and there is no requirement for a clear-cut division between members of a kind and nonmembers.
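The difference between requiring every property and requiring only a sufficient share of a cluster can be sketched as follows. This is a deliberately crude illustration: the numeric threshold `min_shared` is an assumption introduced for the example (cluster accounts do not fix a number), the sketch ignores the underlying common causes that these accounts also require, and the property lists and function names are hypothetical.

```python
# Observable properties of sunflowers used as a toy cluster (illustrative only).
SUNFLOWER_CLUSTER = frozenset({
    "large yellow flower head",
    "tall erect stem",
    "broad rough leaves",
})


def shares_all(properties: set[str], cluster: frozenset[str]) -> bool:
    """Strict test: every property in the cluster is required."""
    return cluster <= properties


def shares_enough(properties: set[str], cluster: frozenset[str],
                  min_shared: int = 2) -> bool:
    """Cluster-style test: at least `min_shared` cluster properties suffice.

    The threshold is an assumption of this sketch, not part of any theory.
    """
    return len(cluster & properties) >= min_shared


typical_plant = {"large yellow flower head", "tall erect stem", "broad rough leaves"}
red_variety = {"large red flower head", "tall erect stem", "broad rough leaves"}
oak = {"woody trunk", "lobed leaves"}

strict_results = [shares_all(p, SUNFLOWER_CLUSTER)
                  for p in (typical_plant, red_variety, oak)]
cluster_results = [shares_enough(p, SUNFLOWER_CLUSTER)
                   for p in (typical_plant, red_variety, oak)]
```

The red-flowered plant fails the strict test but passes the cluster-style one, since no single property is treated as necessary. Raising or lowering `min_shared` moves the boundary, which reflects the absence of a clear-cut division between members and nonmembers.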

Many philosophers of biology recognized the inadequacy of essentialism to account for species (Hull 1965, Sober 1994). It is hard to find traits that are uniquely shared by all and only members of a single species. According to evolutionary theory, any common trait can easily be changed through mutation, drift, or recombination. Since selection acts upon differences between traits, variation, rather than similarity, is the rule in the biological world and the fuel of evolution. Thus, practicing biologists do not classify organisms by identifying something like an essence that species members share; they do it by tracing phylogenetic relations (that is, ancestor-descendant relations or the evolutionary history of species members), interbreeding patterns, ecological niches, and so forth.

These considerations prompted a view that species are individuals, rather than biological kinds (Ghiselin 1974, Hull 1978). Similarly to functioning organisms, individuals are ontologically characterized by having spatiotemporally restricted and causally interconnected parts. In this view, to belong to a species does not mean that its members share some common properties but, rather, that they belong to an evolving lineage whose parts causally interact.

Richard Boyd (1991, 1999) introduced the Homeostatic Property Cluster (HPC) theory as an alternative view that can accommodate the idea that species are natural kinds. HPC characterizes natural kinds as clusters of co-occurring properties underpinned by homeostatic mechanisms that cause and sustain the property clusters. According to Boyd, biological kinds are good candidates for a natural kind cluster; species members share many, but not necessarily all, properties that are caused by various mechanisms, such as sharing a common ancestor, sharing an ecological niche, gene exchange, or common developmental mechanisms. This view allows for the possibility that there are many variations and differences between members of a species, while acknowledging that traits and properties of members of the same biological species are clustered due to the aforementioned mechanisms.

A problem for the HPC view is that, in some cases, properties of members of a kind need not be products of underlying homeostatic mechanisms. Since members of species vary in traits, for example, they might also vary in the underlying mechanisms that cause them. Consequently, we can have different underlying mechanisms distributed across a species that cause different traits in species members, such as human blood types, which are caused by different underlying genetic mechanisms. Unless we have a criterion for which of the traits and their underlying mechanisms are somehow more important or essential, it is up to us whether we will focus on the shared mechanisms that cause similarities between species members, or on the ones that are heterogeneous and cause differences between species members.

In addition, it has been claimed that the focus on underlying mechanisms is too restrictive and diverts attention from what really needs to be explained, that is, the stability and cohesiveness of properties that occur together. Thus, even less restrictive accounts have been offered, such as the Stable Property Cluster (SPC) view (Slater 2014). In this view, a grouping is considered a natural kind if it consists of clusters of stable properties and this stability is due to the instantiation of some of the properties that warrant a probabilistically reliable inference that other properties are instantiated as well. For this inference, it is not necessary to trace the underlying causes of such stability.

The main advantage of HPC and related clustering views—that they are much more permissive than essentialism—can also prompt worries. Unless one is very strict on how to individuate clusters of properties and/or their underlying causes, these accounts have the problem of determining how many clustered properties are enough to consider something a natural kind. A potential worry is that these accounts are overly liberal and that any clustering of properties might comprise a natural kind. This would go against the commonsense intuition that natural kinds pick out groupings that are in some sense privileged.

There are authors, nonetheless, who do not see this as a problem and who defend a view that natural kinds should not be considered some privileged subset of categories (Dupré 1981). According to such views, there are many sameness relations in the world that we pick out depending on our interests, and they all can qualify as natural kinds. The next section reviews one such account, promiscuous realism. As its name suggests, this account allows that many diverse interests can play a role in determining which grouping should count as a natural kind, thereby substantially expanding the set of categories considered natural kinds. Promiscuous realism is illustrated through psychiatric classifications. Many consider psychiatric categories to be problematic because there is often much heterogeneity among category members and there are many possible interests and relevant criteria for picking out psychiatric groupings. However, one cannot deny that those categories are scientifically useful and play an important role, both in scientific research and in practical contexts, which makes it interesting to examine whether there is a suitable philosophical account that might capture such categories, and promiscuous realism appears to be a fitting candidate.

c. Promiscuous Realism: The Case of Psychiatric Categories

According to promiscuous realism, depending on our interests and aims, there are many ways of classifying entities into kinds. This position was introduced by John Dupré (1981) and a similar view was also proposed, under the name of pluralistic realism, by Philip Kitcher (1984). Dupré holds that there are many sameness relations that can be used to distinguish different natural kinds and that none of those relations are privileged. That is, different entities can share some similarities with members of one group and some with members of another group, and which group we pick out as relevant will depend on our interests. This view is realist because it involves the criterion that something counts as a natural kind if its members share at least some similarities, even if minimal. Those similarities need to be some objective features of the world and not facts about us. For example, the fact that we group some things together would not count as a common property that can serve as a basis for classification.  This view therefore excludes as nonnatural classifications of entities which do not share any common properties. Different aims and interests will tend to produce different classifications, and those classifications can be taken as natural kinds if the members share at least some common properties that cause those entities to be categorized together in the first place.

While the cluster kind approaches have the problem of specifying where, exactly, to draw the line between clusters of properties that correspond to natural kinds and those that do not, promiscuous realism sidesteps this issue. If divisions between kinds come on a continuum and there are no clear cutoff points, promiscuous realism allows us to regard as natural kinds all classifications that group together entities that have at least some objective properties in common. This does not mean that all classifications are on equal footing. We can still consider some to better serve our purposes than others or to be better used in different contexts, but all of them can be considered natural in this minimal sense. This is a much more sweeping account of natural kinds because it countenances a wider range of categories as natural.

Dupré introduced this view by offering the example of different crosscutting categorizations into species, depending on which species concept is used in various biological subdisciplines, and classification practices outside biology. One of the hallmarks of promiscuous realism is that it does not prioritize scientific classifications over folk categories. Dupré (1981) provides examples of cases in which folk classifications do not correspond to biological classifications. For instance, our classification into butterflies and moths cross-classifies with the biological one. In fact, in many cases our classifications will be coarser-grained or finer-grained depending on our interests. What we call lilies, for instance, belong to the numerous genera of the lily family (Liliaceae), but our folk naming practice does not include the entire family, since we exclude onions and garlics that also belong to the same family. Dupré’s argument is that we should not try to change our folk categorizations to correspond to scientific ones because they often serve different purposes. We sort some plants of the lily family together because of their aesthetic properties, while we exclude garlics and onions because they serve culinary or other purposes. All these classifications can be considered natural and we can use one or the other depending on our interests and aims.

The promiscuous kind account has also been recognized as suitable for psychiatric classifications. It might seem that a psychiatric classification normally picks out a homogeneous group of symptoms whose underlying cause(s) can be discovered and consequently treated, as described by the cluster kind accounts. This, however, is not what we often find in the actual practice of psychiatric classification. In this context, it seems even harder than in biology to find a stable cluster of common properties like symptoms or behaviors that are underpinned by a joint causal mechanism (Cooper 2012). Derek Bolton (2012) argues that the standard approach of classifying psychiatric conditions, starting with surface characteristics and then looking for their etiology to ensure reliability, is not as fruitful as was initially assumed and that we should stop hoping that the etiology of psychiatric conditions will deliver one optimal classification scheme. Depending on our interests and pragmatic considerations, different types or subtypes of psychiatric categories will be taken as relevant. Thus, we might start using different sets of criteria to identify schizophrenia depending on our research interests. Those who are interested in treatment might parse out the symptoms and other criteria differently than those searching for the genetic causes of the condition. For this reason, it seems that promiscuous kinds can better account for classifications in psychiatry.

The fruitfulness of this approach can be illustrated by the introduction of biomarkers for marking biological correlates of different psychiatric conditions. The idea is to identify a biological causal chain or its correlates (for example, specific brain activation patterns) that underlie psychological and social characteristics associated with a psychiatric condition. The paradigmatic success story of this approach is the case of neurosyphilis, a disease that is characterized by psychiatric symptoms that are caused by the spirochete bacterium Treponema pallidum. Another example is a large project aimed at collecting genetic, biochemical and imaging data for a population that has a high risk for Alzheimer’s disease. This has led to proposals for new classification schemes, based on biological features that measure the presence of a disease.

The problem with this type of approach is that it is not justified to expect that for every familiar psychiatric condition, normally identified by behavioral and psychological symptoms, we will find a common pathway that underpins those psychiatric symptoms, as in the case of neurosyphilis. In fact, not many have been found (Buckholtz and Meyer-Lindenberg 2012). More often, we find a diversity of symptoms with diverse etiologies constituting one psychiatric condition. While for some this might constitute a reason to discard psychiatric conditions, or most of them, as candidates for psychiatric natural kinds (for a discussion, see Murphy 2017), promiscuous realists would allow that even a minimum of shared properties is enough to consider something a natural kind, if the classification serves some purpose.

While promiscuous realism has the advantage of encompassing many classifications that would not be considered natural on the cluster accounts, it can be objected that it is too liberal in doing so. All categorizations that are at least minimally grounded in the causal structure of the world can be considered natural kinds. This might seem odd, since one of the starting points of the debate was the intuition that the objective structure of the world allows us to pick out some privileged groupings. Ordinarily, it is taken that those are the ones discovered through scientific inquiry, and that such groupings are superior to our everyday folk categories. In the promiscuous realist view, we can still privilege certain groupings, like the scientific categories, as being more explanatory or predictive, but it is not so that these groupings are natural and that the folk ones, for example, are not. All those categories can be considered natural kinds, and to prioritize some over others will have to be justified by invoking our interests.

This does not necessarily present a problem since, in some contexts, we do not need scientific classifications. When cooking, for example, we might have more use for “The Scoville Heat Scale,” a measure of the hotness of chili peppers according to the concentration of capsaicin, a chemical compound that produces heat sensation, than for the botanical classification of chili pepper plants. In the context of scientific research, however, we could use some further guidance for favoring classifications that are better grounded. Thus, it seems desirable to have a further requirement that goes beyond some minimum of shared properties that can serve some or other purpose or interest. Promiscuous realists can respond by stating that the relevant properties and interests ought to be related in systematic ways. Additionally, we can refine the demands on scientific classifications by adding constraints on the purposes that classifications serve. While the classification of people into right-handed and left-handed, for example, is based on a property that members of these groups share, and it can serve some minimal purposes like informing us what kind of scissors to produce, it is not a very useful category because it is minimally informative. Thus, we can add that we should favor those scientific categories that are information rich and that can accommodate many of our interests.

To go back to psychiatric conditions, while, for some purposes, it might be useful to group together shy people or anxious people, these groups are commonly considered to be too heterogeneous to be considered natural kinds. The introduction of constraints on classifications that are focused on our interests and aims, and not only on the amount or importance of shared properties, brings us to the question of whether we ought to consider natural kinds to be groupings that exist independently of mind and can be discovered by us, or whether which kinds we consider natural is always related to us, the investigators. The first thesis is associated with a realist understanding of natural kinds, and the second one with an antirealist understanding, but one should be very careful in formulating what exactly it means to be a realist or antirealist about natural kinds. The next section further examines this issue and provides a taxonomy of the various realist and antirealist positions. The section also problematizes the reason the three main accounts of natural kinds presented in this section are commonly taken as realist views and discusses how to differentiate different antirealist views according to the way they demarcate which interests are taken as relevant for establishing what constitutes a natural kind.

3. Metaphysics of Natural Kinds

a. What Does It Mean that a Kind is Real?

Realism about some entity or domain P states that P exists, and that it exists independently of us, the cognizers, that is, independently of our classificatory practices, conceptual schemas, beliefs, values, and so on. One can be a realist about everyday objects, for instance, such as chairs, rocks, buildings, and trees—but also about intangible entities like numbers or moral value. Those who are antirealists about everyday objects, numbers or moral value generally do not claim that such things do not exist tout court. Rather, they hold that such entities depend on us and would not exist were there no creatures who can respond to them. Accordingly, natural kinds realists are committed to the view that natural kinds exist independently of mind. When we talk about entities belonging to a kind, it seems straightforward to establish what it would mean that they exist independently of mind. For instance, on one hand, mental states are necessarily mind dependent. On the other hand, most people will agree that rocks and mountains exist independently of mind, that is, they would exist even if there were no one to perceive them or think about them. When we talk about groupings of such entities into kinds, however, there are at least two possible interpretations of the claim that groupings themselves are mind independent.

In one interpretation, to say that natural kinds exist independently of mind means that they exist as separate entities. Usually, this claim is taken to imply that natural kinds are universals, a special type of repeatable entity that can be instantiated by many particular objects (see the article on Universals). Realism about natural kinds as universals has been called strong realism (Bird and Tobin 2015). There are alternative views, however. For instance, kinds might exist as particulars or as some special sui generis entities (Hawley and Bird 2011). This debate relates to the more general question regarding the metaphysics of properties and is not discussed further in this article. Here, the focus is on debates on natural kinds in the philosophy of science.

In the second interpretation, natural kinds exist independently of mind in the sense that there are divisions in nature that obtain independently of our classificatory practices. The assumption is that the world is structured in such a way that certain ways of classifying it, or carving it up, are correct solely in virtue of that structure. This view has been called weak realism about natural kinds, or naturalism (Bird and Tobin 2015). Weak realism or naturalism seems to be consistent with natural kinds nominalism. Even though weak realists hold that there is a metaphysical difference between natural and nonnatural classifications, it does not automatically follow that this difference needs to be spelled out in terms of a special ontological category of natural kinds. In what follows, the term natural kinds realism refers to weak realism or naturalism.

This approach to natural kinds has recently been called a zooming-in model (Reydon 2016) because it assumes that a careful examination of nature—“zooming in” to it—will lead to the discovery of mind-independent groupings. In this view, natural kinds are found in nature and not created by us. The next section starts by examining the difference between scientific realism and natural kinds realism. It then looks at how the three most prominent accounts of natural kinds discussed in the previous section can be interpreted as realist views of kinds. The analysis starts with essentialism as the strongest and most typical realist view. It then reviews cluster kinds and promiscuous kinds. These are commonly considered realist views, but, as this discussion shows, can potentially be interpreted in the antirealist vein. After that, the section offers a taxonomy of antirealist views, starting with strong versions and continuing with more moderate ones, where the difference between realism and antirealism is much subtler.

b. The Relationship Between Scientific Realism and Natural Kinds Realism

To say that entities that are being classified by our scientific theories exist independently of us and our classificatory practices is one formulation of the thesis of scientific realism (see the article on Scientific Realism and Antirealism).  An interesting question in the debate on natural kinds realism is how to formulate this idea and what its relation to scientific realism is. Some authors presuppose that scientific realism and natural kinds realism amount to the same thesis. Stathis Psillos, for example, states that the metaphysical thesis of scientific realism is committed to the claim that the “world has a definite and mind-independent natural-kind structure” (Psillos 1999, xvii). Bird and Tobin similarly claim that “it is a corollary of scientific realism that when all goes well the classifications and taxonomies employed by science correspond to the real kinds in nature” (Bird and Tobin 2008, introduction). Conceptually, however, it appears that scientific realism can be kept distinct from realism about natural kinds. The claims of the existence of certain entities, which are members of natural kinds—say, electrons—can be interpreted both as saying that there are mind-independent entities with certain properties, as described by the scientific theory, and as a stronger claim that there is an objective, mind-independent criterion for how to categorize those entities into the kind electron.

Scientific realism refers, at a minimum, to the idea that science investigates facts about entities, their properties, and the relations in which they stand that are objective or mind independent. Natural kinds realism can then be read as a further thesis, according to which, in addition to the existence of mind-independent entities and processes, certain structure(s) of kinds of entities and the criteria by which we group and individuate them are equally mind independent (Chakravartty 2011). That is, there are correct ways of categorizing the world that reflect this mind-independent natural kind structure.

c. Natural Kinds Realism

Essentialism is a paradigmatically realist view because it holds that the sources of similarities between members of a kind are intrinsic and independent of circumstances or our cognitive practices or interests (Ellis 1999). Even if the essentialism in question is pluralistic and allows for many crosscutting categorizations, it is nonetheless the fact that entities grouped together share an essence that makes kinds natural. On the other hand, antirealist or conventionalist views hold that we do not have access to the supposed real divisions in nature, or real essences of kinds, and, hence, we decide where to draw the boundaries between different kinds according to our interests and aims. Invoking our interests and aims as relevant for establishing a category as a natural kind is thus more akin to antirealist views. Both cluster approaches and promiscuous realism are commonly considered to be realist views, however, though they do invoke our aims and interests as relevant.

There are at least two strategies for accommodating the idea that a theory that invokes our aims and interests as relevant for determining which kinds are natural can still be considered realist. They both rely on arguments that aim to show that classifications that serve our interests and aims are exactly those that capture preexisting mind-independent divisions in nature. A cluster kind realist can invoke a version of the no-miracles argument (see the article on Scientific Realism and Antirealism) and argue that the fact that certain categories are successful in inductive inferences, predictions, and explanations gives us reason to conclude that they reflect some objective divisions in nature. The argument is that it would be a miracle that our inductive practices work if they did not latch onto some categories (natural kinds) that are objective. An objection to this no-miracles argument could be that it does not prove the mind-independence of the categories that we use in inductive inferences because the success of those inferences is similarly measured relative to how well they satisfy our interests and aims.

To this objection, a cluster kind realist can reply that some clusters of properties will be identified no matter what interests and aims one starts with, and that such clusters represent natural kinds. Matthew Slater, for instance, indicates that “[p]erhaps there are some clusters of properties such that no matter how a discipline adjusted its norms and aims… the category that cluster described would be fit to play a robust epistemic role in the discipline” (Slater 2014, 406). In this strategy, the kinds we take to be natural do, in a sense, depend on our aims and interests because, were it not for those aims and interests, we would not reach those classifications. What justifies taking such classifications to be natural kinds, however, is the fact that other people, starting with different aims and interests, would also reach similar classifications. The main problem with this kind of defense of natural kinds realism is that it might end up with a very small set of categorizations that qualify as natural kinds, since there seem to be many categorizations that would not be recognized by people starting with very different interests and aims. Our interest in accounting for certain material transformations brought us, for example, to classification by chemical elements according to atomic structure. But if we are interested in patterns of radioactive decay, we will arrive at different classifications, and if, hypothetically, we were interested only in the behavior of materials in centrifuges, we would arrive at classifications based on density (Franklin-Hall 2015).

Another, less demanding strategy is to treat natural kinds as domain dependent. P.D. Magnus (2012) has explicitly defended this view, but Boyd (1991) and Khalidi (2013) also seem to endorse it. The idea here is not that any rational inquirer with any type of interest will, or ought to, reach the same classifications, but rather that the realism in question amounts to the claim that classifications are natural relative to the domain of inquiry. That is, what is required is that inquirers with the same interests and aims arrive at the same classifications. This, from the viewpoint of many natural kind realists, ensures that our categorizations track the causal structure of the world. There is no disinterested point of view that will discover the real natural kinds. Rather, there are many different points of view, and what makes a grouping natural is that, when we fix what we are interested in, we also fix the correct ways to classify a domain of investigation according to those interests. Thus, even though our interests play an important role in identifying natural kinds, once we have fixed them, there are still correct and incorrect ways to classify the domain in question, and what determines this are the features of the entities being classified.

Promiscuous realism, as the name suggests, is a realist view because it takes as natural those classifications whose members share at least some (or one) common property, which is an objective and mind-independent fact. This view, even though it is extremely permissive and allows a vast range of classifications to be considered natural kinds, excludes at least some classifications. The fact that the excluded ones are not natural kinds is true by virtue of mind-independent facts; that is, in virtue of the fact that they do not share any common properties. This is a very weak version of realism because it merely captures the fact that we cannot group together entirely arbitrary collections of objects. It does not, however, offer any further realist criteria for privileging certain classifications over others. Further refinements of the promiscuous realist view seem to rely on invoking antirealist criteria, for example, by putting constraints on relevant interests and aims in scientific classifications.

d. Natural Kinds Antirealism

Antirealism, or conventionalism, as it has been called, encompasses a wide range of views. All of them have in common the claim that what determines which kinds are natural are not only mind-independent facts about the world, but also facts about “us,” the cognizers or researchers. It is important to emphasize that antirealism, on this reading, is not committed to the further thesis that the identity of natural kinds, or the criterion for what makes a kind natural, is fully mind dependent. That view, according to which natural kinds are fully mind dependent and the world does not constrain our classifications, would then represent the most extreme variety of antirealism, which has been called strong conventionalism (Bird and Tobin 2015).

According to strong conventionalism, not only are there no mind-independent facts about which groupings are natural, but, also, all the differences and similarities between different entities are entirely dependent on us. Thus, any common properties among members of a kind that we identify as the basis for grouping them together are products of our classificatory practices and do not exist independently of mind. This view might go hand in hand with the more general antirealist view regarding the existence of a mind-independent world. This view sees natural kinds as exclusively those categorizations that we use in our classificatory practice. Since there is nothing about the world that would sanction some groupings as opposed to others, natural kinds would depend on our explicit beliefs about what kinds exist (see Franklin-Hall 2015). In this view, therefore, kinds are entirely subjective. But this view has counterintuitive consequences: It would make categories such as witches or hysteria equally as legitimate as scientific categories that include electrons or species, depending on the circumstances in which they are used.

Another antirealist view claims that our ignorance or lack of access to the natural principles of classification, if they exist, leads us to conclude that our grouping of objects into a natural kind will depend, at least partly, on our interests, aims, and cognitive capacities. This view can take at least two possible forms. One is that we do not have access to the real essences of kinds—there are natural principles of classification, but they are inaccessible to us. Another is the argument that there are no clear divisions in nature, or no discoverable natural principles of classification. Rather, there are only continuous gradations between different kinds of things, so it is partly up to us where to draw the line. This implies that our epistemic aims, cognitive capacities, and practical interests might play a role in deciding where to draw such lines and what classifications to endorse. This type of view has been called weak conventionalism (Bird and Tobin 2015). It is characterized by the claim that both the causal structure of the world—mind-independent facts—and facts about us jointly determine which categorizations we will consider to be natural.

The main thesis of weak conventionalism is nicely illustrated by Reydon’s (2016) co-creation model of natural kinds. In this model, kinds are taken to be co-determined by both states of affairs in nature and the background assumptions and decisions of investigators in specific scientific contexts. It can encompass a broad set of views. Depending on how exactly one thinks of cognitive capacities, epistemic or practical aims, and what kinds of interests are taken as legitimate, different antirealist views can be developed.

A simple pragmatist approach to natural kinds, as defined by Laura Franklin-Hall (2015), holds that natural kinds correspond to categories that fulfill some of our epistemic and/or practical aims. This is a very broad understanding of natural kinds that allows a wide range of categories to be considered natural kinds. It does, however, exclude entirely arbitrary categories. It excludes them because they cannot serve any useful purpose. On the other hand, it allows categories such as proteins, gluten-free food, or introverts to be natural kinds, since they fulfill some of our interests. While in the practical sense this account would deem as natural most, if not all, of the same groupings as promiscuous realism, it is important to notice that the reason these accounts hold that certain groupings are natural is different. While the realist stresses that the reason the grouping is useful is in the fact that certain objective properties are shared, the antirealist does not care about that. The antirealist focuses instead on whether the grouping is useful and serves some purpose, regardless of whether it is based on some objective property. A simple pragmatist view countenances as natural all the groupings that are in some way relevant to us.

One problem with this view arises when we start thinking about the possibility of our interests being somewhat different than they are. This commits the view to a potentially awkward consequence, in which any change in our interests entails the existence of different natural kinds. To resist this consequence, a pragmatist can offer a way to refine which interests can be taken as relevant for judging whether a kind is natural. For instance, one can restrict the possible range of interests by considering what interests some idealized and fully informed agent or inquirer would have or would endorse.

Another potentially problematic consequence of the simple pragmatist view is that practical issues can outweigh factual ones when it comes to deciding which classifications to adopt. The psychiatric classification antisocial personality disorder, for example, most likely groups together a very heterogeneous class of people whose only common feature is that they engage in some sort of criminal behavior (Brazil et al. 2016). From the point of view of scientific research, we should strive to find classifications that are better grounded in commonalities that their members share. From a practical point of view, however, it might be enough to know only that people belonging to this group have committed crimes and that it is likely that they will do so again in the future (Brzović et al. 2018; Malatesti and McMillan 2014).

A more common variation on the weak conventionalist view focuses on our epistemic interests and has been called the simple epistemic view (Franklin-Hall 2015).  In this approach, natural kinds correspond to categories that fulfill some of our epistemic aims. It differs from the simple pragmatist view by excluding practical interests as relevant for circumscribing natural kinds. Cluster kinds, for instance, can have realist and antirealist readings, depending on what one focuses on. The realist would say that what makes such kinds natural is the fact that they track real clusters of properties in the world, while the simple epistemic antirealist would argue that what makes them natural is their success in fulfilling our epistemic aims, such as being predictive and explanatory. We therefore do not start by looking for clustered properties, but by looking for categories that successfully fulfill our epistemic aims. In many cases, categories based on clustered properties will accomplish this aim. In this reading, the aim of our scientific endeavors is to develop the most accurate descriptions of the world we live in, and the categories that best serve this purpose ought to be considered natural kinds. Such views have been characterized as epistemology-oriented approaches to natural kinds (Reydon 2009).

The main difficulty with these approaches is to explain how to circumscribe the set of epistemic aims that we take to be relevant in establishing which groupings correspond to natural kinds. If we take our present aims as relevant, this has the welcome consequence that our present successful scientific categories come out as natural kinds. However, we might exclude some classifications that we might reach if our interests were to change or if our knowledge expanded or got revised. To solve this type of problem, Franklin-Hall (2015) offers a more elaborate antirealist approach, the categorical bottleneck view. She identifies natural kinds with categories that fulfill the interests that we and a wider range of epistemic agents with different interests and cognitive capacities have in common. Here, however, the relevant interests are limited to what we and our “neighboring agents” would recognize as scientifically relevant classifications. That is, we do not consider any possible epistemic agents apart from those that are relatively like us. Neighboring agents are those that only somewhat differ from actual agents in their epistemic aims and interests. This restriction of possible interests is meant to ensure more objectivity for natural kind categories by eliminating those that might be contingent on some of our cognitive capacities or limitations in knowledge.

Thinking about the synchronic and diachronic ways of considering naturalness illustrates the problem that antirealist views face—that classifications based on our interests can seem to lack objectivity (Chang 2016). The synchronic aspect looks at a specific moment of scientific development, usually the one that is of immediate interest, and examines whether the scientific categories that are in use can be considered natural kinds. Examples include whether they play an important role in scientific explanations, whether they are predictive, whether they figure in scientific laws or lawlike generalizations, or even whether they fulfill certain practical purposes. If we only focus on the present moment, we might be tempted to conclude that natural kinds are those categories that fulfill our present epistemic interests and aims. If we concentrate on the diachronic aspect of naturalness and investigate what it means that a category is a natural kind throughout different periods of scientific development, then it might turn out not to be beneficial to focus on our present interests and aims, since there is always the possibility that they might change as new information comes in and new scientific theories are accepted.

Thinking about the question of what makes a kind natural across different stages of scientific development can be used to illustrate the way different antirealist positions can reach different conclusions. One reading of the simple epistemic view claims that, throughout the development of science, different categories have served our interests, and, for this reason, we can consider them to be natural kinds in the contexts in which they were used. The consequence of this view is that natural kinds are relative to the context of scientific investigation. In this view, categorizations such as phlogiston or hysteria, for example, were natural kinds in one historical period but not in others.

A different reading of the simple epistemic view argues that only our present categories correspond to natural kinds, while the ones that served our interests in previous stages of scientific development were not natural kinds if they differed from the present ones. This has the problematic consequence, however, that in the future we might develop different epistemic aims and interests, and that other categories would therefore come to be considered scientifically grounded. We thus could not consider them natural kinds, since the notion of natural kinds is tied to our present interests. One option for the simple epistemic view is to argue that our interests and aims do not change to a large degree, but rather it is our factual knowledge regarding how to fulfill our aims that changes. The claim that natural kinds are categories that best serve our epistemic aims and interests thus presupposes that they are best served when we have all the required information regarding matters of fact. Thus, while it might appear that our aims change substantively with the development of science, what actually changes is our access to information on how to fulfil them. Another option for the antirealist is to abandon the simple epistemic view and embrace something more akin to the categorical bottleneck view, which ensures more objectivity for natural kinds by introducing a wider range of epistemic agents with different interests and cognitive capacities.

Domain-dependent realism, which is closest to antirealist views in the sense that it makes natural kinds relative to different domains of inquiry, solves this problem by construing natural kinds as categories that we would adopt if starting with the same interests. The idea here is not merely that any category that is useful in certain contexts is a natural kind, but rather that, once we start with certain fixed interests, there is a correct way to classify the domain of inquiry. Thus, it is not enough that certain classifications fulfill our interests, because even the categorizations we now take to be flawed still fulfill our interests to a certain extent. Rather, we should aim at finding the correct ones within our domain of interest. Consequently, there is a possibility that our current classifications are not natural kinds, because it might turn out that there are better, or more refined ones, which will more perfectly fulfill our interests. That is, we can always assume that further scientific developments and new data will lead us to reconsider our current classifications. This is a feature of realist views in general, that we cannot be certain that our current knowledge reflects the real states of affairs or, in this case, that our classifications reflect natural kinds.

These problems nicely illustrate the main benefits and drawbacks of realist and antirealist views. We start out with the intuition that natural kinds pick out some objective features of the world and that what kinds are natural is not supposed to change across different contexts. Realism easily accounts for this objectivity by arguing that natural kinds represent the way the world is structured independently of us. Thus, kinds are out there to be discovered, and they cannot change across different scientific contexts or vary with different researchers’ interests. The problem for the realist, then, is to demonstrate how it is possible to access such natural groupings, to locate nature’s joints. The realist has to offer something like essences or very clearly delineated clusters of properties and to try to convince skeptics that these really are the natural ways to divide the entities in the world. Realist positions are characterized by their openness to the possibility that our current categories do not actually capture natural kinds. Even domain-dependent realists can always question whether—if there is a convergence on a certain classification by everyone sharing the same interest, that is, working in the same discipline—we might, with future scientific developments, discover new facts that will lead us to reexamine those classifications.

Antirealism, on the other hand, gives weight to the researchers’ contribution to scientific classification, but at the cost of sacrificing the objectivity of kinds. In this account, natural kinds can be seen as relative to the specific contexts of investigation. This has the consequence that the kinds that are deemed natural will change as scientific research advances. Thus, while hysteria, to take an example cited previously, was at one point a natural kind, that is no longer the case. To avoid this consequence, the antirealist can offer a way to sanction possible interests and aims to arrive at a more objective view of natural kinds. Such sanctioning, however, naturally leads us to postulate objective features of the world that our classifications ought to identify. These realist intuitions again lead us away from the starting ambition to encompass actual scientific classifications.

4. Conclusion

The growth of interest in natural kinds among philosophers of science stems from two sources. One relates to debates on scientific confirmation and inductive reasoning; the other has emerged from debates regarding the reference of scientific terms. With the further development of the philosophies of specific scientific disciplines such as biology, chemistry, psychology, psychiatry, and so on, theorizing about natural kinds moved more in the direction of examining successful scientific classifications and offering philosophical accounts that should capture those classifications. In this regard, we can identify two main approaches to the natural kinds debate and the corresponding roles they are supposed to play: on the one hand, a traditional, more prescriptive one; on the other, a descriptive one that aims to stay close to scientific practice.

This move is transparent in the three major approaches to natural kinds presented in this article. On one hand, essentialism, with its strict search for clearly demarcated kinds, has been criticized as being too restrictive, because it leaves out many important scientific categorizations. On the other, the cluster kind and promiscuous realism approaches have been worked out with the aim of providing a framework that will capture classifications in actual scientific practice. This tendency is effective insofar as it brings philosophical accounts closer to science. However, it risks minimizing the prescriptive role that natural kinds should play in scientific research, because philosophers using this approach tend to equate current scientific classifications with natural kinds. In the debate on the metaphysics of natural kinds, the dichotomy between these two approaches is reflected in a tension between attempts to ensure the objectivity of natural kinds and attempts to stay close to scientific practice by emphasizing that natural kinds ought to fulfill our current interests.

5. References and Further Reading

  • Bird, Alexander. Nature’s Metaphysics: Laws and Properties. Oxford University Press, 2007.
  • Bird, Alexander, and Emma Tobin. “Natural Kinds.” The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Stanford University, Spring 2015.
  • Bolton, Derek. “Classification and Causal Mechanisms: A Deflationary Approach to the Classification Problem.” In Philosophical Issues in Psychiatry II: Nosology, edited by K. S. Kendler and J. Parnas, Oxford University Press, 2012, pp. 6–12.
  • Boyd, Richard. “Homeostasis, Species, and Higher Taxa.” In Species: New Interdisciplinary Essays, edited by R. A. Wilson, MIT Press, 1999, pp. 141–185.
  • Boyd, Richard. “Realism, Anti-Foundationalism and the Enthusiasm for Natural Kinds.” Philosophical Studies, vol. 61, no. 1, 1999, pp. 127–148.
  • Brazil, Inti A., J. D. M. van Dongen, J. H. R. Maes, R. B. Mars, and Arielle R. Baskin-Sommers. “Classification and Treatment of Antisocial Individuals: From Behavior to Biocognition.” Neuroscience & Biobehavioral Reviews, 2016.
  • Brzović, Zdenka, Marko Jurjako, and Predrag Šustar. “The Kindness of Psychopaths.” International Studies in the Philosophy of Science, vol. 31, no. 2, 2017, pp. 189–211.
  • Buckholtz, Joshua W., and A. Meyer-Lindenberg. “Psychopathology and the Human Connectome: Toward a Transdiagnostic Model of Risk for Mental Illness.” Neuron, vol. 74, no. 6, 2012, pp. 990–1004.
  • Chakravartty, Anjan. A Metaphysics for Scientific Realism: Knowing the Unobservable. Cambridge University Press, 2007.
  • Chakravartty, Anjan. “Scientific Realism and Ontological Relativity.” The Monist, vol. 94, no. 2, 2011, pp. 157–180.
  • Chang, H. “The Rising of Chemical Natural Kinds through Epistemic Iteration.” Natural Kinds and Classification in Scientific Practice, edited by C. Kendig, Routledge, 2016, pp. 33–47.
  • Cooper, Rachel. “Is Psychiatric Classification a Good Thing?” Philosophical Issues in Psychiatry II: Nosology, edited by K.S. Kendler and J. Parnas, Oxford University Press, 2012, pp. 61–70.
  • Dupré, John A. “Natural Kinds and Biological Taxa.” The Philosophical Review, vol. 90, no. 1, 1981, pp. 66–90.
  • Ellis, Brian. “Essentialism and Natural Kinds.” The Routledge Companion to Philosophy of Science, edited by M. Curd and S. Psillos, Routledge, 1999, pp. 139–149.
  • Ellis, Brian. Scientific Essentialism. Cambridge University Press, 2001.
  • Ereshefsky, Marc. “Species.” The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Stanford University, 2016. https://plato.stanford.edu/archives/sum2016/entries/species.
  • Ereshefsky, Marc, and Thomas A. C. Reydon. “Scientific Kinds.” Philosophical Studies, vol. 172, no. 4, 2015, pp. 969–986.
  • Fodor, J. “Special Sciences (Or: The Disunity of Science as a Working Hypothesis).” Synthese, vol. 28, no. 2, 1974, pp. 97–115.
  • Franklin-Hall, Laura. “Natural Kinds as Categorical Bottlenecks.” Philosophical Studies, vol. 172, no. 4, 2015, pp. 925–948.
  • Ghiselin, M. “A Radical Solution to the Species Problem.” Systematic Zoology, vol. 23, no. 4, 1974, pp. 536–544.
  • Goodman, Nelson. Fact, Fiction and Forecast. Harvard University Press, 1983.
  • Hawley, K., and A. Bird. “What Are Natural Kinds?” Philosophical Perspectives, vol. 25, no. 1, 2011, pp. 205–221.
  • Hendry, Robin F. “Elements, Compounds and Other Chemical Kinds.” Philosophy of Science, vol. 73, no. 5, 2006, pp. 864–875.
  • Hendry, Robin F. “The Elements and Conceptual Change.” The Semantics and Metaphysics of Natural Kinds, edited by H. Beebee and N. Sabbarton-Leary, Routledge, 2010, pp. 137–158.
  • Hull, David L. “A Matter of Individuality.” Philosophy of Science, vol. 45, no. 3, 1978, pp. 335–360.
  • Hull, David L. “The Effect of Essentialism on Taxonomy: Two Thousand Years of Stasis.” British Journal for the Philosophy of Science, vol. 15, no. 60, 1965, pp. 314–326.
  • Kendig, Catherine, editor. Natural Kinds and Classification in Scientific Practice. 1st ed., Routledge, 2015.
  • Khalidi, Muhammad Ali. Natural Categories and Human Kinds: Classification in the Natural and Social Sciences. Cambridge University Press, 2013.
  • Kitcher, Philip. “Species.” Philosophy of Science, vol. 51, no. 2, 1984, pp. 308–333.
  • Kripke, Saul. “Naming and Necessity.” Semantics of Natural Language, edited by G. Harman and D. Davidson, Reidel, 1972, pp. 253–355.
  • LaPorte, Joseph. Natural Kinds and Conceptual Change. Cambridge University Press, 2003.
  • Magnus, P. D. Scientific Enquiry and Natural Kinds: From Planets to Mallards. Palgrave Macmillan, 2012.
  • Malatesti, Luca, and John McMillan. “Defending Psychopathy: An Argument from Values and Moral Responsibility.” Theoretical Medicine and Bioethics, vol. 35, no. 1, 2014, pp. 7–16.
  • Murphy, Dominic. “Can Psychiatry Refurnish the Mind?” Philosophical Explorations, vol. 20, no. 2, 2017, pp. 160–174.
  • Plato. Phaedrus. Cambridge University Press, 1952.
  • Psillos, Stathis. Scientific Realism: How Science Tracks Truth. Routledge, 1999.
  • Putnam, Hilary. “The Meaning of ‘Meaning.’” Minnesota Studies in the Philosophy of Science, vol. 7, 1975, pp. 215–271.
  • Quine, Willard van Orman. “Natural Kinds.” Ontological Relativity and Other Essays, edited by W. V. Quine, Columbia University Press, 2012, pp. 114–138.
  • Reydon, Thomas. “From a Zooming-In Model to a Co-creation Model: Towards a more Dynamic Account of Classification and Kinds.” Natural Kinds and Classification in Scientific Practice, edited by C. E. Kendig, Routledge, 2016, pp. 59–73.
  • Reydon, Thomas. “How to Fix Kind Membership: A Problem for HPC Theory and a Solution.” Philosophy of Science, vol. 76, no. 5, 2009, pp. 724–736.
  • Slater, Matthew H. “Macromolecular Pluralism.” Philosophy of Science, vol. 76, no. 5, 2009, pp. 851–863.
  • Slater, Matthew. “Natural Kindness.” British Journal for the Philosophy of Science, vol. 66, no. 2, 2015, pp. 375–411.
  • Sober, Elliott. “Evolution, Population Thinking and Essentialism.” Conceptual Issues in Evolutionary Biology, edited by Elliott Sober, MIT Press, 1994, pp. 161–189.
  • Wilson, R. A., M. J. Barker, and I. Brigandt. “When Traditional Essentialism Fails: Biological Natural Kinds.” Philosophical Topics, vol. 35, no. 1/2, 2007, pp. 189–215.

 

Author Information

Zdenka Brzović
Email: zbrzovic@gmail.com
University of Rijeka
Croatia

Niccolò Machiavelli (1469—1527)

Machiavelli was a 16th century Florentine philosopher known primarily for his political ideas. His two most famous philosophical books, The Prince and the Discourses on Livy, were published after his death. His philosophical legacy remains enigmatic, but that result should not be surprising for a thinker who understood the necessity to work sometimes from the shadows. There is still no settled scholarly opinion with respect to almost any facet of Machiavelli’s philosophy. Philosophers disagree concerning his overall intention, the status of his sincerity, the status of his piety, the unity of his works, and the content of his teaching.

His influence has been enormous. Arguably no philosopher since antiquity, with the possible exception of Kant, has affected his successors so deeply. Indeed, the very list of these successors reads almost as if it were the history of modern political philosophy itself. Bacon, Descartes, Spinoza, Bayle, Hobbes, Locke, Rousseau, Hume, Smith, Montesquieu, Fichte, Hegel, Marx, and Nietzsche number among those whose ideas ring with the echo of Machiavelli’s thought. Even those who apparently rejected the foundations of his philosophy, such as Montaigne, typically regarded Machiavelli as a formidable opponent and deemed it necessary to engage with the implications of that philosophy.

Table of Contents

  1. Life
    1. The Youth (1469-1498)
    2. The Official (1498-1512)
    3. The Philosopher (1513-1527)
  2. Philosophical Themes
    1. Virtue
    2. Fortune
    3. Nature
    4. History and Necessity
    5. Truth
    6. Politics: The Humors
    7. Politics: Republicanism
    8. Glory
    9. Religion
    10. Ethics
  3. Machiavelli’s Corpus
    1. The Prince
    2. Discourses on Livy
    3. Art of War
    4. Florentine Histories
    5. Other Works
  4. Possible Philosophical Influences on Machiavelli
    1. Renaissance Humanism
    2. Renaissance Platonism
    3. Renaissance Aristotelianism
    4. Xenophon
    5. Lucretius
    6. Savonarola
    7. The Bible and Its Traditions
  5. Contemporary Interpretations
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

It is customary to divide Machiavelli’s life into three periods: his youth; his work for the Florentine republic; and his later years, during which he composed his most important philosophical writings.

Most of Machiavelli’s diplomatic and philosophical career was bookended by two important political events: the French invasion of Italy in 1494 by Charles VIII; and the sack of Rome in 1527 by the army of Emperor Charles V.

In what follows, citations to The Prince refer to chapter number (e.g., “P 17”). Citations to the Discourses and to the Florentine Histories refer to book and chapter number (e.g., “D 3.1” and “FH 4.26”). Citations to the Art of War refer to book and sentence number in the Italian edition of Marchand, Farchard, and Masi and in the corresponding translation of Lynch (e.g., “AW 1.64”).

a. The Youth (1469-1498)

Machiavelli was born on May 3, 1469, to a somewhat distinguished family. He grew up in the Santo Spirito district of Florence. He had three siblings: Primavera, Margherita, and Totto. His mother was Bartolomea di Stefano Nelli. His father was Bernardo, a doctor of law who spent a considerable part of his meager income on books and who seems to have been especially enamored of Cicero. So, at a young age, Machiavelli was exposed to many classical authors who influenced him profoundly; as he says in the Discourses, the things that shape a boy of “tender years” will ever afterward regulate his conduct (D 3.46). We do not know whether Machiavelli read Greek, but he certainly read Greek authors in translation, such as Thucydides, Plato, Xenophon, Aristotle, Polybius, Plutarch, and Ptolemy. He was studying Latin already by age seven and translating vernacular works into Latin by age twelve. Among the Latin authors that he read were Plautus, Terence, Caesar, Cicero, Sallust, Virgil, Lucretius, Tibullus, Ovid, Seneca, Tacitus, Priscian, Macrobius, and Livy. Among Machiavelli’s favorite Italian authors were Dante and Petrarch.

When he was twelve, Machiavelli began to study under the priest Paolo da Ronciglione, a famous teacher who instructed many prominent humanists. Machiavelli may have studied later under Marcello di Virgilio Adriani, a professor at the University of Florence.

The diaries of Machiavelli’s father end in 1487. For the next ten years, there is no record of Machiavelli’s activities. In 1497, he returns to the historical record by writing two letters in a dispute with the Pazzi family.

Several important events occurred during this period. The Pazzi conspiracy against the Medici occurred in 1478. Savonarola began to preach in Florence in 1482. In 1490, after preaching elsewhere for several years, he returned to Florence and was assigned to San Marco. In 1492, Lorenzo the Magnificent died and Rodrigo Borgia ascended to the papacy as Alexander VI. In 1494, Savonarola gained authority in Florence when the Medici were expelled in the aftermath of the invasion of Charles VIII. Machiavelli’s mother passed away in 1496, the same year that Savonarola would urge the creation of the Great Council. On May 12, 1497, Savonarola was excommunicated by Alexander VI. On May 23, 1498, almost exactly a year later, he was hanged and then burned at the stake with two other friars in the Piazza della Signoria.

b. The Official (1498-1512)

Not long after Savonarola was put to death, Machiavelli was appointed to serve under Adriani as head of the Second Chancery. Machiavelli was 29 and had no prior political experience. A month after he was appointed to the Chancery, he was also appointed to serve as Secretary to the Ten, the committee on war.

In November 1498 he undertook his first diplomatic assignment, which involved a brief trip to the city of Piombino. In March 1499, he was sent to Pontedera to negotiate a pay dispute involving the mercenary captain, Jacopo d’Appiano. In July of the same year, he would visit Countess Caterina Sforza at Forli (P 3, 6, and 20; D 3.6; FH 7.22 and 8.34; AW 7.27 and 7.31).

His first major mission was to the French court, from July 1500 to January 1501. There he would meet Georges d’Amboise, the cardinal of Rouen and Louis XII’s finance minister (P 3). In 1501, he would take three trips to the city of Pistoia, which was being torn to pieces by factional disputes (P 17). Over the next decade, he would undertake many other missions, some of which kept him away from home for months (e.g., his 1507 mission to Germany).

In August 1501 he was married to Marietta di Ludovico Corsini. Machiavelli and Marietta would eventually have several children, including Bernardo, Primerana (who died young), an unnamed daughter (who also died young), Baccina, Ludovico, Piero, Guido, and Totto. Machiavelli was also romantically linked to other women, such as the courtesan La Riccia and the singer Barbera Salutati.

In 1502, Machiavelli met Cesare Borgia for the first time (e.g., P 3, 7, 8, and 17; D 2.24). In the same year, Florence underwent a major constitutional reform, which would place Piero Soderini as gonfaloniere for life (previously the term limit had been two months). Soderini (e.g., D 1.7, 1.52, 1.56, 3.3, 3.9, and 3.30) allowed Machiavelli to create a Florentine militia in 1505-1506. The militia was an idea that Machiavelli had promoted so that Florence would not have to rely upon foreign or mercenary troops (see P 12 and 13). In 1507, Machiavelli would be appointed to serve as chancellor to the newly created Nine, a committee concerning the militia.

Between 1502 and 1507, Machiavelli would collaborate with Leonardo da Vinci on various projects. The most notable was an attempt to connect the Arno River to the sea; to irrigate the Arno valley; and to cut off the water supply to Pisa.

In the summer of 1512, Machiavelli’s militia was crushed at the city of Prato. Soderini was exiled, and by September 1 Giuliano de’ Medici would march into Florence to reestablish Medici control of the city. Machiavelli’s tenure for the Florentine government would last from June 19, 1498 to November 7, 1512. He was one of the few officials from the republic to be dismissed upon the return of the Medici.

During this period, Cesare Borgia became the Duke of Valentinois in the late summer of 1498. Machiavelli’s father, Bernardo, died in 1500. Alexander VI died in August 1503 and was replaced by Pius III (who lasted less than a month). Julius II would ascend to the papacy later in November 1503.

c. The Philosopher (1513-1527)

In late 1512, Machiavelli was accused of participating in an anti-Medici conspiracy. In early 1513, he was imprisoned for twenty-two days and tortured with the strappado, a method that painfully dislocated the shoulders. He was released in March and retired to a family house (which still stands) in Sant’Andrea in Percussina.

It was a profound fall from grace, and Machiavelli felt it keenly; he complains of his “malignity of fortune” in the Dedicatory Letter to The Prince. He seems to have commenced writing almost immediately. By 10 December 1513, he wrote to his friend, Francesco Vettori, that he was hard at work on what we now know as his most famous philosophical book, The Prince. He also began to write the Discourses on Livy during this period.

During the following years, Machiavelli attended literary and philosophical discussions in the gardens of the Rucellai family, the Orti Oricellari. He wrote poetry and plays during this period, and in 1518 he likely wrote his most famous play, Mandragola.

Friends such as Francesco Guicciardini and patrons such as Lorenzo di Filippo Strozzi attempted, with varying degrees of success, to restore Machiavelli’s reputation with the Medici. Something must have worked. In 1520, Machiavelli was sent on a minor diplomatic mission to Lucca, where he would write the Life of Castruccio Castracani. Impressed, Giulio de’ Medici offered Machiavelli a position in the University of Florence as the city’s official historiographer. Giulio would also commission the Florentine Histories (which Machiavelli would finish by 1525).

In 1521, Machiavelli published the Art of War, the only major prose work he would publish during his lifetime. It was well received in both Florence and Rome. He directed the first production of Clizia in January 1525.

Machiavelli died on June 21, 1527. His body is buried in the Florentine basilica of Santa Croce.

During this period, Giovanni de’ Medici became Pope Leo X upon the death of Julius II, in 1513. He was the first Florentine ever to become pope. In October 1517, Martin Luther sent his 95 Theses to Albert of Mainz. In 1521, Luther was excommunicated by Leo X. In 1522, Piero Soderini died in Rome. In 1523, Giulio de’ Medici became Pope Clement VII. In 1527, Clement refused Henry VIII’s request for an annulment. In the same year, on May 6, Rome was sacked by the army of Emperor Charles V.

2. Philosophical Themes

If to be a philosopher means to inquire without any fear of boundaries, Machiavelli is the epitome of a philosopher. Although it is unclear exactly what “reason” means for Machiavelli, he says that it is “good to reason about everything” (bene ragionare d’ogni cosa; D 1.18). And he says: “I do not judge nor shall I ever judge it to be a defect to defend any opinion with reasons, without wishing to use either authority or force for it” (D 1.58). He claims that he will not reason about certain topics but then does so, anyway (e.g., P 2, 6, 11, and 12; compare D 1.16 and 1.58). And he suggests that a prince should be a “broad questioner” (largo domandatore) and a “patient listener to the truth” (paziente auditore del vero; P 23).

But what more precisely might Machiavelli mean by “philosophy”? It is worth noting that the word “philosophy” (filosofia) never appears in The Prince or the Discourses (but see FH 7.6). The word “philosopher(s)” (filosofo / filosofi) appears once in The Prince (P 19) and three times in the Discourses (D 1.56, 2.5, and 3.12; see also D 1.4-5 and 2.12, as well as FH 5.1 and 8.29). Machiavelli occasionally refers to other philosophical predecessors (e.g., D 3.6 and 3.26; FH 5.1; and AW 1.25).

For the sake of presentation, this article presumes that The Prince and the Discourses comprise a unified Machiavellian philosophy. Readers should note that other interpreters would not make this presumption. Regardless, what follows is a series of representative themes or vignettes that could support any number of interpretations.

a. Virtue

The most fundamental of all of Machiavelli’s ideas is virtù. This word has several valences but is reliably translated in English as “virtue” (sometimes as “skill” or “excellence”). Although difficult to characterize concisely, Machiavellian virtue concerns the capacity to shape things and is a combination of self-reliance, self-assertion, self-discipline, and self-knowledge.

With respect to self-reliance, a helpful way to think of virtue is in terms of what Machiavelli calls “one’s own arms” (arme proprie; P 1 and 13; D 1.21), a notion that he links to virtue. This phrase at times refers literally to one’s soldiers or troops. But it can also refer to a general sense of what is one’s own, that is, what does not belong to or depend upon something else. Minimally, then, virtue may mean to rely upon one’s self or one’s possessions. Maximally, it may mean to disavow reliance in every sense—such as the reliance upon nature, fortune, tradition, and so on. To be virtuous might mean, then, not only to be self-reliant but also to be independent. In this way, Machiavelli is perhaps the forerunner of various modern accounts of substance (e.g., that of Descartes) that characterize the reality of a thing in terms of its independence rather than its goodness.

With respect to self-assertion, those with virtue are dynamic and restless, even relentless. Machiavellian virtue thus seems more closely related to the Greek conception of active power (dynamis) than to the Greek conception of virtue (arete). Consequently, the idiom of idleness or leisure (ozio) is foreign to most, if not all, of the successful characters in Machiavelli’s writings, who instead constantly work toward the achievement of their aims. The Romans, ostensibly one of the model republics, always look for danger from afar; fight wars immediately if it is necessary; and do not hesitate to employ fraud (P 3; D 2.13). Cesare Borgia, ostensibly one of the model princes, labors ceaselessly to lay the proper foundations for his future (P 7). Machiavelli urges his readers to think of war always, especially in times of peace (P 14); never to fail to see the oncoming storm in the midst of calm (P 24); and to beware of Fortune, who is like “one of those raging rivers” that destroys everything in its path (P 25). He laments the idleness of modern times (D 1.pr; see also FH 5.1) and encourages potential founders to ponder the wisdom of choosing a site that would force its inhabitants to work hard in order to survive (D 1.1). Machiavelli says that a wise prince should never be idle in peaceful times but should instead use his industry (industria) to resist adversity when fortune changes (P 14).

With respect to self-discipline, virtue involves a recognition of one’s limits coupled with the discipline to work within those limits. The Prince, for instance, is occasionally seen as a manual for autocrats or tyrants. But in fact it is replete with recommendations of moderation and self-discipline. Machiavelli insists, for example, that a prince should use cruelty sparingly and appropriately (P 8); that he should not seek to oppress the people (P 9); that he should not spend his subjects’ money (P 16) or take their property or women (P 17); that he should appear to be merciful, faithful, honest, humane, and, above all, religious (P 18); that he should be reliable, not only as a “true friend” but as a “true enemy” (P 21); and so forth. And although Machiavelli rarely discusses justice in The Prince, he does say that “victories are never so clear that the winner does not have to have some respect [qualche respetto], especially for justice” (giustizia; P 21; see also 19 and 26). For Machiavelli, virtue includes a recognition of the restraints or limitations within which one must work: not only one’s own limits, but social ones, including conventional understandings of right and wrong.

Finally, with respect to self-knowledge, virtue involves knowing one’s capabilities and possessing the paradoxical ability to be firmly flexible. It is not enough to be constantly moving; additionally, one must always be ready and willing to move in another direction. Nor is it enough simply to recognize one’s limits; additionally, one must always be ready and willing to find ways to turn a disadvantage into an advantage. Success is never a permanent achievement. Time sweeps everything before it and brings the good as well as the bad (P 3); fortune varies and can ruin those who are obstinate (P 25). Virtue requires that we know how to be impetuous (impetuoso); that we know how to recognize fortune’s impetus (impeto); that we know how to move quickly in order to seize an opportunity before it evaporates. Virtue involves flexibility—but this is both a disciplined and an optimistic flexibility. Furthermore, it is a flexibility that exists within prudently ascertained parameters and for which we are responsible. What it means to be virtuous involves understanding ourselves and our place in the cosmos. In this way, Machiavelli’s conception of virtue is linked not only with his conception of fortune but also with necessity and nature. Furthermore, it raises the question of what it means to be wise (savio), an important term in Machiavelli’s thought.

It should be emphasized that Machiavellian virtue is not necessarily moral. At first glance and perhaps upon closer inspection, Machiavellian virtue is something like knowing when to choose virtue (as traditionally understood) and when to choose vice. As he puts it, we must learn how not to be good (P 15 and 19) or even how to enter into evil (P 18; compare D 1.52), since it is not possible to be altogether good (D 1.26). Machiavelli is sensitive to the role that moral judgment plays in political life; there would be no need to dissimulate if the opinions of others did not matter. But his point seems to be that we do not have to think of our own actions as being excellent or poor simply in terms of whether they are linked to conventional moral notions of right and wrong. Praise and blame are levied by observers, but not all observers see from the perspective of conventional morality.

Some scholars point to Machiavelli’s use of mitigating rhetorical techniques and to his reading of classical authors in order to argue that his notion of virtue is in fact much closer to the traditional account than it first appears. Crucial for this issue are the central chapters of The Prince (P 15-19). Some scholars highlight similarities between Machiavelli’s treatment of liberality and mercy in particular and the treatments of Cicero (De officiis) and Seneca (De beneficiis and De clementia). They argue that Machiavelli’s understanding of these virtues is not in principle different from the classical understanding and that Machiavelli’s concern is more with the manner in which these virtues are perceived or “held” (tenuto). Other scholars argue that these chapters of The Prince completely overturn the classical and Christian understanding of these virtues and that Machiavelli intends a new account that is actually “useful” in the world (utile; P 15). The scholarly disagreement over the status of the virtues in the central chapters of The Prince, in other words, reflects the broader disagreement concerning Machiavelli’s understanding of virtue as such.

Lastly, it is worth noting that virtù comes from the Latin virtus, which itself comes from vir or “man.” It is no accident that those without virtue are often called weak, pusillanimous, and even effeminate (effeminato)—such as the Medes, who are characterized as effeminate as the result of a long peace (P 6). Neither is it an accident that fortune, with which virtue is regularly paired and contrasted, is female (e.g., P 20 and 25).

b. Fortune

Fortuna stands alongside virtù as a core Machiavellian concept. It is reliably translated as “fortune” but it can also mean “storms at sea” in both Latin and Italian.

Machiavelli often situates virtue and fortune in tension, if not opposition. At times, he suggests that virtue can resist or even control fortune (e.g., P 25). But he also suggests that fortune cannot be opposed (e.g., D 2.30) and that it can hold down the greatest of men with its “malignity” (malignità; P Ded.Let and 7, as well as D 2.pr). Fortune accompanies good with evil and evil with good (FH 2.30). Thus, one of the most important questions to ask of Machiavelli concerns this relationship between virtue and fortune.

One way of engaging this question is to think of fortune in terms of what Machiavelli calls the “arms of others” (arme d’altri; P 1 and 12-13; D 1.43). This phrase at times refers literally to soldiers who are owned by someone else (auxiliaries) and soldiers who change masters for pay (mercenaries). But it can also refer to a general sense of what is not one’s own, that is, what belongs to or depends upon something else. Minimally, then, fortune means to rely upon outside influences, such as chance or God, rather than one’s self. Maximally, it may mean to rely completely upon outside influences and, in the end, to jettison completely the idea of personal responsibility. Few scholars would argue that Machiavelli upholds the maximal position, but it remains unclear how and to what extent Machiavelli believes that we should rely upon fortune in the minimal sense.

A second way of engaging this question is to examine the ways in which Machiavelli portrays fortune. In one passage, he likens fortune to “one of those violent rivers” (uno di questi fiumi rovinosi) which, when enraged, will flood plains and uproot everything in its path (P 25). This image uses language similar to the description of successful princes in the very same chapter (as well as elsewhere, such as P 19 and 20). Three times in the Prince 25 river image, fortune is said to have “impetus” (impeto); at least eight times throughout Prince 25, successful princes are said to need “impetuosity” (impeto) or to need to be impetuous (impetuoso). This linguistic proximity might mean various things: that virtue and fortune are not as opposed as they first appear; that a virtuous prince might share (or imitate) some of fortune’s qualities; or that a virtuous prince, in controlling fortune, takes over its role.

Even more famous than the likeness to a river is Machiavelli’s identification of fortune with femininity. This characterization has important Renaissance precedents—for instance, in the work of Leon Battista Alberti, Giovanni Pontano, and Enea Silvio Piccolomini. But Machiavelli’s own version is nuanced and has long resisted easy interpretation. In The Prince, fortune is identified as female (P 20) and is later said to be a woman or perhaps a lady (una donna; P 25). This image is echoed in one of Machiavelli’s poetic works, Dell’Occasione. There he is more specific: fortune is a woman who moves quickly with her foot on a wheel and who is largely bald-headed, except for a shock of hair that covers her face and prevents her from being recognized. Finally, in his tercets on fortune in I Capitoli, Machiavelli characterizes her as a two-faced goddess who is harsh, violent, cruel, and fickle.

It is worth looking more closely at The Prince’s image of una donna, which is the most famous of the feminine images. Machiavelli makes at least two provocative claims. Firstly, he says that it is necessary to beat and strike fortune down if one wants to hold her down. This hypothetical claim is often read as if it is a misogynistic imperative or at least a recommendation. But it is worth noting that Machiavelli does not claim that it is possible to hold fortune down at all; he instead simply remarks upon what would be necessary if one had the desire to do so. Secondly, Machiavelli says that fortune allows herself to be won more by the impetuous than by those who proceed in a cold or cautious manner. Thus, she is a friend of the young, “like a woman” (come donna; now a likeness rather than an identification). Here, too, it is worth noting that the emphasis concerns the agency of fortune. She is not conquered. Rather, she relents; she allows herself to be won. It is far from clear that the young men who come to her manage to subdue her in any meaningful way, with the implication being that it is not possible to do so without her consent.

On this point, it is also worth noting that recent work has increasingly explored Machiavelli’s portrayal of women. Although Machiavelli in at least one place discusses how a state is “ruined” because of women (D 3.26), he also seems to allow for the possibility of a female prince. The most notable ancient example is Dido, the founder and first queen of Carthage (P 20 and D 2.8). The most notable modern example is Caterina Sforza, who is called “Countess” six times (P 20; D 3.6; FH 8.34 [2x, but compare FH 7.22]; and AW 7.27 and 7.31) and “Madonna” twice (P 3 and D 3.6). Other possibilities include women who operate more indirectly, such as Epicharis and Marcia, the respective mistresses of Nero and Commodus (D 3.6). In other words, Machiavelli seems to allow for the possibility of women who act virtuously, that is, who adopt manly characteristics. It may be that a problem with certain male, would-be princes is that they do not know how to adopt feminine characteristics, such as the fickleness or impetuosity of Fortune (e.g., P 25).

A third way of engaging the question of fortune’s role in Machiavelli’s philosophy is to look at what fortune does. One of fortune’s most important roles is supplying opportunity (e.g., P 6 and 20, as well as D 1.10 and D 2.pr). Even the most excellent and virtuous men appear to require the opportunity to display themselves. Figures as great as Moses, Romulus, Cyrus, and Theseus are no exception (P 6), nor is the quasi-mythical redeemer whom Machiavelli summons in order to save Italy (P 26). They all require the situation to be amenable: for a people to be weak or dispersed; for a province to be disunited; and so forth. However, some scholars have sought to deflate the role of fortune here by pointing to the meager basis of many opportunities (e.g., that of Romulus) and by emphasizing Machiavelli’s suggestion that one can create one’s own opportunities (P 20 and 26).

It is worth noting that Machiavelli writes on ingratitude, fortune, ambition, and opportunity in I Capitoli; notably, he omits a treatment of virtue. This pregnant silence may suggest that Machiavelli eventually came to see fortune, and not virtue, as the preeminent force in human affairs. In The Prince, he says: “I judge that it might be true” (iudico potere essere vero) that fortune governs half our actions and leaves the other half, or “close to it,” for us to govern (P 25; compare FH 7.21 and 8.36). But surely here Machiavelli is encouraging, even imploring us to ask whether it might not be true.

c. Nature

What Machiavelli means by “nature” is unclear. At times, it seems related to instability, as when he says that the nature of peoples is variable (P 6); that it is possible to change one’s nature with the times (P 25; D 1.40, 1.41, 1.58, 2.3, and 3.39); that worldly things by nature are variable and always in motion (P 10 and FH 5.1; compare P 25); that human things are always in motion (D 1.6 and 2.pr); and that all things are of finite duration (D 3.1). Elsewhere, it seems related to stability, as when he says that human nature is the same over time (e.g., D 1.pr, 1.11, and 3.43). At least once Machiavelli speaks of “natural things” (cose della natura; P 7); at least twice he associates nature with God (via spokesmen; see FH 3.13 and 4.16). In the only chapter in either The Prince or the Discourses which has the word “nature” (natura; D 3.43) in the title, the word surprisingly seems to mean something like “custom” or “education.” And the “natural prince” (principe naturale; P 2) seems to be a hereditary prince rather than someone who has a princely nature.

The question of nature is particularly important for an understanding of Machiavelli’s political philosophy, as he says that all human actions imitate nature (D 2.3 and 3.9). The following remarks about human nature will thus be serviceable signposts. For if human actions imitate nature, then it is reasonable to believe that Machiavelli’s account of human nature would gesture toward his account of the cosmos.

One of the key features of Machiavelli’s understanding of human beings is that they are fundamentally acquisitive and appetitive. The root human desire is the “very natural and ordinary” desire to acquire (P 3), which, like all desires, can never be fully satisfied (D 1.37 and 2.pr; FH 4.14 and 7.14). Human beings enjoy novelty; they especially desire new things (D 3.21) or things that they do not have (D 1.5). It is worth noting that, while these formulations are in principle compatible with the acquisition of intellectual or spiritual things, most of Machiavelli’s examples suggest that human beings are typically preoccupied with material things. For example, he says that human beings forget a father’s death more easily than the loss of patrimony (P 17). In other words, they love property more than honor.

Human beings are generally susceptible to deception. They are generally ungrateful and fickle liars (P 17) who judge by what they see (P 18). They tend to believe in appearances (P 18) and also tend to be deceived by generalities (D 1.47, 3.10, and 3.34). It is easy to persuade them of something but difficult to keep them in that persuasion (P 6).

This susceptibility extends to self-deception. Human beings deceive themselves in pleasure (P 23). They are taken more by present things than by past ones (P 24), since they do not correctly judge either the present or the past (D 2.pr). They have little prudence (D 2.11) but great ambition (D 2.20). They always hope (D 2.30; FH 4.18) but do not place limits on their hope (D 2.28), such that they will willingly change lords in the mistaken belief that things will improve (P 3). They share a common defect of overlooking the storm during the calm (P 24), for they are “blind” in judging good and bad counsel (D 3.35). They often act like “lesser birds of prey,” driven by nature to pursue their prey while a larger predator fatally circles above them (D 1.40).

Machiavelli’s remarks upon human nature extend into the moral realm. He says that human beings are envious (D 1.pr) and often controllable through fear (P 17). Consequently, they hate things due to their envy and their fear (D 2.pr). They do not know how to be either altogether bad or altogether good (D 1.30); are more prone to evil than to good (D 1.9); and will always turn out to be bad unless made good by necessity (P 23). In something of a secularized echo of Augustinian original sin, Machiavelli even goes so far at times as to say that human beings are wicked (P 17 and 18) and that they furthermore corrupt others by wicked means (D 3.8). Unlike Augustine, however, he rarely (if ever) upbraids such behavior, and he furthermore does not seem to believe that any redemption of wickedness occurs in the next world.

For Machiavelli, human beings are generally imitative. In other words, they almost always walk on previously beaten paths (P 6). Especially in The Prince, imitation plays an important role. Machiavelli regularly encourages (or at least appears to encourage) his readers to imitate figures such as Cesare Borgia (P 7 and P 13) or Caesar (P 14), as well as certain models (e.g., D 3.33) and the virtue of the past in general (D 2.pr). However, it should be noted that recent work has called into question whether these recommendations are sincere. Machiavelli for instance decries the imitation of bad models in “these corrupt centuries of ours” (D 2.19); and some scholars believe that his recommendations regarding Cesare Borgia and Caesar in particular are attenuated and even completely subverted in the final analysis.

Finally, it is worth noting that some scholars believe that Machiavelli goes so far as to subvert the classical account of a hierarchy or chain of being—either by blurring the boundaries between traditional distinctions (such as principality / republics; good / evil; and even man / woman) or, more radically, by demolishing the account as such. On such a reading, Machiavelli might believe that substances are not determined by their natures or even that there are no natures (and thus no substances).

d. History and Necessity

History (istoria / storia) and necessity (necessità) are two important terms for Machiavelli that remain particularly obscure.

Machiavelli is among the handful of great philosophers who are also great historians. Although he was interested in the study of nature, his primary interest seemed to be the study of human affairs. He urges the study of history many times in his writings (e.g., P 14, as well as D 1.pr and 2.pr), especially with judicious attention (sensatamente; D 1.23; compare D 3.30). He implies that the Bible is a history (D 2.5) and praises Xenophon’s “life of Cyrus” as a history (P 14; D 2.13, 3.20, 3.22, and 3.39). The Discourses is presented as a philosophical commentary on Livy’s History. And Machiavelli wrote several historical works himself, including the verse Florentine history, I Decennali; the fictionalized biography of Castruccio Castracani; and the Medici-commissioned Florentine Histories. There is no question that he was keenly interested in the historian’s craft, especially the recovery of lost knowledge (e.g., D 1.pr and 2.5).

But what exactly does the historian study? What is history? It is not clear in Machiavelli’s writings whether he believes that time is linear or cyclical. Both accounts are compatible with his suggestions that human nature does not change (e.g., D 1.pr, 1.11, and 3.43) and that imitating the ancients is possible (e.g., D 1.pr). In some places in his writings, he gestures toward a progressive, even eschatological sense of time. His call for a legendary redeemer to unite Italy is a notable example (P 26). In other places, he gestures toward the cyclical account, such as his approximation of the Polybian cycle of regimes (D 1.2) or his suggestion that human events repeat themselves (FH 5.1; compare D 2.5). Scholars thus remain divided on this question. History for Machiavelli might be a process that has its own purposes and to which we must submit. Alternatively, it might be a process that we can master and turn toward our own ends.

In his major works, Machiavelli affords modern historians scant attention. He suggests in the first preface to the Discourses that the readers of his time lack a “true knowledge of histories” (D 1.pr). In the preface to the Florentine Histories, he calls Leonardo Bruni and Poggio Bracciolini “two very excellent historians” but goes on to point out their deficiencies (FH Pref). Machiavelli was friends with the historian Francesco Guicciardini, who commented upon the Discourses. Their philosophical engagement occurred primarily through correspondence, however, and in the major works Machiavelli does not substantively take up Guicciardini’s thought.

Machiavelli speaks more amply with respect to ancient historians. Recent work has pointed to provocative connections between Machiavelli’s thought and that of Greek historians, such as Herodotus (quoted at D 3.67), Thucydides (D 3.16 and AW 3.214), Polybius (D 3.40), Diodorus Siculus (D 2.5), Plutarch (D 1.21, 2.1, 2.24 [quoted], 3.12, 3.35, and 3.40), and Xenophon (P 14; D 2.2, 2.13, 3.20, 3.22 [2x], and 3.39 [2x]). Among the Latin historians that Machiavelli studied were Herodian (D 3.6), Justin (quoted at D 1.26 and 3.6), Procopius (quoted at D 2.8), Pliny (FH 2.2), Sallust (D 1.46, 2.8, and 3.6), Tacitus (D 1.29, 2.26, 3.6, and 3.19 [2x]; FH 2.2), and of course Livy.

In 1476, when Machiavelli was seven years old, his father obtained a complete copy of Livy and prepared an index of towns and places for the printer Donnus Nicolaus Germanus. It is therefore fitting that one of Machiavelli’s two most widely known books is ostensibly a commentary on Livy’s History. Machiavelli mentions and quotes Livy many times in his major works. With only a few exceptions (AW 2.13 and 2.24), his treatment of Livy takes place in the Discourses. However, Machiavelli regularly alters or omits Livy’s words (e.g., D 1.12) and on occasion disagrees with Livy outright (e.g., D 1.58). There is even a suggestion that working with Livy’s account is akin to working with marble that has been badly blocked out (D 1.11). Only three chapters begin with epigraphic quotations from Livy’s text (D 2.3, 2.23, and 3.10), and in all three cases Livy’s words are modified in some manner. It remains an open question to what extent Machiavelli’s thought is a modification of Livy’s.

As with “history,” the word “necessity” has no univocal meaning in Machiavelli’s writings. Recent work has attempted to explore Machiavelli’s use of this term, with respect not only to his metaphysics but also to his thoughts on moral responsibility. Machiavelli frequently returns to the way that necessity binds, or at least frames, human action. Sometimes, Machiavelli seems to mean that an action is unavoidable, such as the “natural and ordinary necessity” (necessità naturale e ordinaria; P 3) of a new prince offending his newly obtained subjects. He suggests that there are certain rules of counsel that “never fail” (e.g., P 22). He speaks of the necessity that constrains writers (FH 7.6; compare D Ded. Let and D 1.10). And at least twice he mentions an “ultimate necessity” (ultima necessità; D 2.8 and FH 5.11). Sometimes, however, Machiavelli seems to mean that an action is a matter of prudence—meaning a matter of choosing the lesser evil (P 21)—such as using cruelty only “out of the necessity” (per la necessità; P 8) to secure one’s self and to maintain one’s acquisitions. And he suggests that there are rules which “never, or rarely, fail” (e.g., P 3)—that is, rules which admit the possibility of failure and which are thus not strictly necessary.

Machiavelli speaks of the necessities to be alone (D 1.9), to deceive (D 2.13), and to kill others (D 3.30). A Lucchese citizen in the Florentine Histories argues that “things done out of necessity neither should nor can merit praise or blame” (FH 5.11). And in one of the most famous passages concerning necessity, Machiavelli uses the word twice and, according to some scholars, with two different meanings: “Hence it is necessary [necessario] to a prince, if he wants to maintain himself, to learn to be able not to be good, and to use this and not use it according to necessity” (la necessità; P 15).

Necessity might be a condition to which we must submit ourselves. Alternatively, it might be a condition that we can alter, implying that we can alter the meaning of necessity itself. If what is necessary today might not be necessary tomorrow, then necessity becomes a weaker notion. At the very least, necessity would not be directly opposed to contingency; instead, as some scholars maintain, necessity itself would be contingent in some way and therefore shapeable by human agency.

The beginning of Prince 25 merits close attention on this point. There Machiavelli reports a view that he says is widely held in his day: the belief that our lives are fated or determined to such an extent that it does not matter what we choose to do. Though he admits that he has sometimes been inclined to this position, he ponders a different possibility “so that our free will not be eliminated” (perché il nostro libero arbitrio non sia spento). On this question, some scholars highlight Renaissance versions of the Stoic notion of fate, which contemporaries such as Pietro Pomponazzi seem to have held. Other scholars highlight Machiavelli’s concerns, especially in his correspondence, with astrological determinism (a version of which his friend, Vettori, seems to have held). Two years before he wrote his famous 13-21 September 1506 letter to Giovan Battista Soderini, the so-called Ghiribizzi al Soderini (Musings to Soderini), Machiavelli wrote a now lost letter to Bartolomeo Vespucci, a Florentine teacher of astrology at the University of Padua. In his response to Machiavelli, Vespucci suggests that a wise man can affect the influence of the stars not by altering the stars (which is impossible) but by altering himself.

Still other scholars propose a connection with the so-called Master Argument (kurieuon logos) of the ancient Megarian philosopher, Diodorus Cronus. Diodorus denies the possibility of future contingencies, that is, the possibility that future events do not already have a determined truth value. Aristotle famously argues against this view in De Interpretatione; Cicero and Boethius also discuss the issue in their respective treatments of divine providence. Some scholars have suggested that the beginning of Prince 25 not only problematizes Machiavelli’s notion of necessity but also engages with this ancient controversy.

e. Truth

Machiavelli makes a remark concerning military matters that he says is “truer than any other truth” (D 1.21). However, he is most famous for his claim in chapter 15 of The Prince that he is offering the reader what he calls the “effectual truth” (verità effettuale), a phrase he uses there for the only time in all of his writings. Although the effectual truth may pertain to military matters (e.g., P 14 and P 17), it is comprehensive in that it treats all the things of the world and not just military things (P 18). Surprisingly, there is still relatively little work on this fundamental Machiavellian concept. What exactly is the effectual truth?

One way to address this question is to begin with Chapter 15 of The Prince, where Machiavelli introduces the term. Given his stated intention there to “write something useful for whoever understands it,” Machiavelli claims that it is more conveniente to go after the effectual truth than the imagination of things that have never been seen or known “to be in truth” (vero essere; compare FH 8.29). Conveniente is variously rendered by translators as “fitting,” “convenient,” “suitable,” “appropriate,” “proper,” and the like (compare Romulus’ opportunity in P 6). Two things seem to characterize the effectual truth in Chapter 15. Firstly, it is distinguished from what is imagined, particularly imagined republics and principalities (incidentally, this passage is the last explicit mention of a “republic” in the book). Though Machiavelli often appeals to the reader’s imagination with images (e.g., fortune as a woman), the effectual truth seems to appeal to the reader in some other manner or through some other faculty. Whatever it is, the effectual truth does not seem to begin with images of things. Secondly, the effectual truth is more fitting for Machiavelli’s intention of writing something useful for the comprehending reader. The implication seems to be that other (more utopian?) intentions might find the imagination of things a more appropriate rhetorical strategy.

Another way to address this question is to begin with the Dedicatory Letter to The Prince. Machiavelli suggests that those who want to “know well” the natures of princes and peoples are like those who “sketch” (disegnano) landscapes. These sketchers place themselves at high and low vantage points or perspectives in order to see as princes and peoples do, respectively. Scholars have highlighted at least two implications of Machiavelli’s use of this image: that observers see the world from different perspectives; and that it is difficult, if not impossible, to see oneself from one’s own perspective. Machiavelli’s politics, meaning the wider world of human affairs, is always the realm of the partial perspective because politics is always about what is seen. “Everyone sees how you appear,” he says, meaning that even grandmasters of duplicity, such as Pope Alexander VI and the Roman emperor Septimius Severus, must still reveal themselves in some sense to the public eye. The truth begins in ordinary apprehension (e.g., D 1.3, 1.8, 1.12, 2.2, 2.21, 2.27, and 3.34). No one can engage in politics without submitting themselves to what Machiavelli calls “this aspect of the world” (P 18), which is to say that no one can act in the world at all without displaying themselves in the very action (if not the result). But precisely because perspective is partial, it is subject to error and indeed manipulation (e.g., D 1.56, 2.pr, and 2.19).

Another way to put this point is to say that the “effect” (effetto) of the effectual truth is always the effect on some observer. Milan is not a wholly new principality as such but instead is new only to Francesco Sforza (P 1). Hannibal’s inhuman cruelty generates respect in the “sight” of his soldiers; by contrast, it generates condemnation in the sight of writers and historians (P 17). Unlike Machiavelli himself, those who damn the tumults of Rome do not see that these disorders actually lead to Roman liberty (D 1.4). It is worth noting that perspectives do not always differ. Sometimes multiple perspectives align, as when Severus is seen as “admirable” both by his soldiers and by the people (P 19; compare AW 1.257). Although the cause in each case differs—the people are “astonished” and “stupefied” (presumably through fear), whereas the soldiers are “reverent” and “satisfied” (presumably through love)—the same effect occurs. Or does it? Some scholars believe that differing causes cannot help but modify effects; in this case, admiration itself would be stained and colored by either love or fear and would be experienced differently as a result.

Machiavelli’s concern with appearance not only pertains to the interpretation of historical events but extends to practical advice, as well. Machiavelli says that a prince should desire to be held merciful and not cruel (though he immediately insists that a prince should take care not to “use this mercy badly”; P 17). And Machiavelli says that what makes a prince contemptible is to be held variable, light, effeminate, pusillanimous, or irresolute (P 19). What matters in politics is how we appear to others—how we are held (tenuto) by others. But how we appear depends upon what we do and where we place ourselves in order to do it. A wise prince for Machiavelli is not someone who is content to investigate causes—including superior causes (P 11), first causes (P 14 and D 1.4), hidden causes (D 1.3), and heavenly causes (D 2.5). Rather, it is someone who produces effects. And there are no effects considered abstractly. Some commentators believe that effects are only effects if they are seen or displayed. They thus see the effectual truth as proto-phenomenological. Others take a stronger line of interpretation and believe that effects are only effects if they produce actual changes in the world of human affairs. Touching rather than seeing might then be the better metaphor for the effectual truth (see P 18).

f. Politics: The Humors

Machiavelli is most famous as a political philosopher. Although he studied classical texts deeply, Machiavelli appears to depart somewhat from the tradition of political philosophy, a departure that in many ways captures the essence of his political position. At least at first glance, it appears that Machiavelli does not believe that the polity is caused by an imposition of form onto matter.

Given that Machiavelli talks of both form and matter (e.g., P 6 and D 1.18), this point deserves unpacking. Aristotle’s position is a useful contrast. For Aristotle, politics is similar to metaphysics in that form makes the city what it is. The difference between a monarchy and a republic is a difference in form. This is not simply a question of institutional arrangement; it is also a question of self-interpretation. Aristotelian political form is something like a lens through which the people understand themselves.

This understanding of form has two implications. Firstly, it matters whether monarchs or republicans rule, as the citizens of such polities will almost certainly understand themselves differently in light of who rules them. A monarchical “soul” is different from a republican “soul.” Secondly, the factions of the city believe they deserve to rule on the basis of a (partial) claim of justice. Justice is thus the underlying basis of all claims to rule, meaning that, at least in principle, differing views can be brought into proximity to each other. Concord, or at least the potential for it, is both the basis and the aim of the city.

With respect to the first implication, Machiavelli occasionally refers to the six Aristotelian political forms (e.g., D 1.2). He even raises the possibility of a mixed regime (P 3; D 2.6 and 3.1; FH 5.8). But usually he speaks only of two forms, the principality and the republic (P 1). The lines between these two forms are heavily blurred; the Roman republic is a model for wise princes (P 3), and the people can be considered a prince (D 1.58). Machiavelli even at times refers to a prince of a republic (D 2.2). Finally, he says that virtuous princes can introduce any form that they like, with the implication being that form does not constitute the fundamental reality of the polity (P 6).

One explanation is that the reality that underlies all form is what Machiavelli nebulously calls “the state” (lo stato). On this account, political form for Machiavelli is not fundamentally causal; it is at best epiphenomenal and perhaps even nominal. Instead, Machiavelli assigns causality to the elements of the state called “humors” (umori) or “appetites” (appetiti). Some scholars focus on possible origins of this idea (e.g., medieval medicine or cosmology), whereas others focus on the fact that the humors are rooted in desire. Still others focus on the fact that the humors arise only in cities and thus do not seem to exist simply by nature.

Machiavelli says that the city or state is always minimally composed of the humors of the people and the great (P 9 and 19; D 1.4; FH 2.12 and 3.1, but contrast FH 8.19); in some polities, for reasons not entirely clear, the soldiers count as a humor (P 19). The polity is constituted, then, not by a top-down imposition of form but by a bottom-up clash of the humors. And as the humors clash, they generate various political effects (P 9)—these are sometimes good (e.g., “liberty”; D 1.4) and sometimes bad (e.g., “license”; P 17 and D 1.7, 1.37, 3.4 and 3.27; FH 4.1). It is worth noting that a third possibility is “principality,” which according to some scholars looks suspiciously like the imposition of form onto matter (e.g., P 6 and 26; see also FH Pref. and 3.1; compare the “wicked form” of D 3.8). Furthermore, Machiavelli does attribute certain qualities to those who live in republics—greater hatred, greater desire for revenge, and restlessness born from the memory of their previous liberty—which might be absent in those who live in principalities (P 4-5; D 1.16-19 and 2.2; FH 4.1). Such passages appear to bring him in closer proximity to the Aristotelian account than first glance might indicate.

The humors are also related to the second implication mentioned above. Machiavelli distinguishes the humors not by wealth or population size but rather by desire. These desires are inimical to each other in that they cannot be simultaneously satisfied: the great desire to oppress the people, and the people desire not to be oppressed (compare P 9, D 1.16, and FH 3.1). Discord, rather than concord, is thus the basis for the state. Consequently, Machiavelli says that a prince must choose to found himself on one or the other of these humors. Most interpreters have taken him to prefer the humor of the people for any number of reasons, not the least of which may be Machiavelli’s work for the Florentine republic. It is worth noting, though, that Machiavelli’s preference may be pragmatic rather than moral. Government means controlling one’s subjects (D 2.23), and “good government” might mean nothing more than a scorched-earth, Tacitean wasteland which one simply calls peace (P 7).

Although many aspects of Machiavelli’s account of the humors are well understood, some remain mysterious. Firstly, it is unclear what desire characterizes the humor of the soldiers, a third humor that occurs, if not always, at least in certain circumstances. Secondly, in the preface to the Florentine Histories Machiavelli suggests that Florence’s disintegration into multiple “divisions” (divisioni) is unique in the history of republics, but it is unclear how or why the typical humors of the people and the great subdivide further in Florence (though FH 2 and 3 may offer important clues). Thirdly, it is unclear whether a “faction” (fazione; e.g., D 1.54) and a “sect” (setta; e.g., D 2.5), each of which plays an important role in Machiavelli’s politics, ultimately reduce to one of the fundamental humors or whether they are instead oriented around something other than desire. Finally, it should be noted that recent work has questioned whether the humors are as distinct as previously believed; whether an individual or group can move between them; and whether they exist on something like a spectrum or continuum. For example, it may be the case that a materially secure people would cease to worry about being oppressed (and might even begin to desire to oppress others in the manner of the great); or that an armed people would effectively act as soldiers (such that a prince would have to worry about their contempt rather than their hatred).

g. Politics: Republicanism

Some scholars claim that Machiavelli is the last ancient political philosopher because he understands the merciless exposure of political life. By contrast, others claim that Machiavelli is the first modern political philosopher because he understands the need to found one’s self on the people. Either position is compatible with a republican reading of Machiavelli. The status of Machiavelli’s republicanism has been the focus of much recent work.

Many scholars focus on Machiavelli’s teaching as it is set forth in the Discourses (though many of the same lessons are found in The Prince). As in The Prince, Machiavelli attributes qualities to republican peoples that might be absent in peoples accustomed to living under a prince (P 4-5; D 1.16-19 and 2.2; FH 4.1). He also distinguishes between the humors of the great and the people (D 1.4-5; P 9). However, in the Discourses he explores more carefully the possibility that the clash between them can be favorable (e.g., D 1.4). He associates both war and expansion with republics and with republican unity; conversely, he associates peace and idleness with republican disunity (D 2.25). He notes the flexibility of republics (D 3.9), especially when they are ordered well (D 1.2) and regularly drawn back to their beginnings (D 3.1; compare D 1.6). He ponders the political utility of public executions and, as recent work has emphasized, courts or public trials (D 3.1; compare the parlements of P 3 and P 19 and Cesare’s court of P 7). He even considers the possibility of a perpetual republic (compare D 3.17 with D 1.20, 1.34, 2.30, 3.1, and 3.22). Like many other authors in the republican tradition, he frequently ponders the problem of corruption (e.g., D 1.17, 1.18, 1.55, 2.pr, 2.19, 2.22, 3.1, 3.16, and 3.33).

However, it remains unclear exactly what Machiavelli means by terms such as “corruption,” “freedom,” “law,” and even “republic.” It is therefore not surprising that the content of his republicanism remains unclear, as well. In order to provide a point of entry into this problem, it would be helpful to offer a brief examination of three rival and contemporary positions concerning Machiavelli’s republicanism. Although what follows are stylized and compressed glosses of complicated interpretations, they may serve as profitable beginning points for a reader interested in pursuing the issue further.

One interpretation might be summed up by the Machiavellian phrase “good laws” (e.g., P 12). It holds that Machiavelli is something of a neo-Roman republican. What matters the most, politically speaking, are robust institutions and deliberative participation in public life (e.g., D 1.55). Freedom is the effect of good institutions. Corruption is a moral failing and more specifically a failing of reason. This interpretation focuses upon the stability of public life. A strength of this interpretation is the emphasis that it places upon the rule of law as well as Machiavelli’s understanding of virtue. A possible weakness of this view is that it seems to overlook Machiavelli’s insistence that freedom is a cause of good institutions, not an effect of them (e.g., D 1.4); and that it seems to conflate the Machiavellian humor of the people with a more generic and traditional understanding of “people,” that is, all those who are under the law.

A second interpretation might be summed up by the Machiavellian term “tumults” (e.g., D 1.4). It holds that Machiavelli is something of a radical or revolutionary democrat whose ideas, if comparable to anything classical, are more akin to Greek thought than to Roman. What matters the most, politically speaking, is non-domination. Freedom is a cause of good institutions; freedom is not obedience to any rule but rather the continuous practice of resistance to oppression that undergirds all rules. Corruption is associated with the desire to dominate others. This interpretation focuses upon the instability, and even the deliberate destabilization, of political life. A strength of this interpretation is the emphasis that it places upon tumults, motion, and the more “decent end” of the people (P 9; see also D 1.58). A possible weakness is that it seems to understand law in a denuded sense, that is, as merely a device to prevent the great from harming the people; and that it seems to overlook the chaos that might result from factional strife (e.g., P 17) or mob justice (e.g., FH 2.37 and 3.16-17).

A third interpretation, which is something of a middle position between the previous two, might be summed up by the Machiavellian phrase “wise prince” (e.g., P 3). It holds that Machiavelli advocates for something like a constitutional monarchy. What matters the most, politically speaking, is stability of public life and especially acquisitions, coupled with the recognition that such a life is always under assault from those who are dissatisfied. Freedom is both a cause and effect of good institutions. Corruption is associated with a decline (though not a moral decline) in previously civilized human beings. This interpretation focuses both on the stability and instability of political life (e.g., D 1.16). A strength of this interpretation is its emphasis upon understated features—such as courts, public trials, and even elections—in Machiavelli’s thought, and upon Machiavelli’s remarks concerning the infirmity of bodies which lack a “head” (e.g., P 26; D 1.44 and 1.57). A possible weakness is that it seems to downplay Machiavelli’s remarks on nature and consequently places outsized importance upon processes such as training (esercitato), education (educazione), and art (arte).

h. Glory

Glory is one of the key motivations for the various actors in Machiavelli’s corpus. Some scholars go so far as to claim that it is the highest good for Machiavelli. Others deflate its importance and believe that Machiavelli’s ultimate aim is to wean his readers from their desire for glory.

Machiavelli’s understanding of glory (gloria) is substantially beholden to that of the Romans, who were “great lovers of glory” (D 1.37; see also D 1.58 and 2.9). Ancient Romans attained prominence through the acquisition of dignitas, which can be translated as “dignity” but which also included the notion of honors or trophies awarded as recognition of one’s accomplishments. Possessions, titles, family achievements, and land could all contribute to dignitas. But what was most important was gloria, one’s glory and reputation (or lack thereof) for greatness. Plebeians, who did not possess as much wealth or family heritage as patricians, could still attain prominence in the Roman Republic by acquiring glory in speeches (e.g., Cicero) or through deeds, especially in wartime (e.g., Gaius Marius). Typically, this quest for glory occurred “within the system.” A Roman would begin his political career with a lower office (quaestor or aedile) and would attempt to rise to higher positions (tribune, praetor, or consul) by pitting his ambition and excellence in ferocious competition against his fellow citizens.

The destabilization of the Roman Republic was in part due to individuals who short-circuited this system, that is, who achieved glory outside the conventional political pathway. A notable example is Scipio Africanus. At the beginning of his ascendancy, Scipio had never held any political positions and was not even eligible for them. However, by his mid-twenties he had conducted major military reforms. By his mid-thirties, he had defeated no less a general than Hannibal, the most dangerous enemy the Romans ever faced and the “master [or teacher] of war” (maestro di guerra; D 3.10). This unprecedented achievement gained Scipio much glory—at least in the Senate, as Machiavelli notes (though not with Fabius Maximus; P 17 and D 3.19-21). Indeed, Scipio gained so much glory that he catapulted past his peers in terms of renown, regardless of his lack of political accomplishments. Consequently, his imitation was incentivized, which partly led to the rise of the warlords—such as Pompey and Julius Caesar—and the eventual end of the Republic.

Machiavelli’s understanding of glory is beholden to this Roman understanding in at least three ways: the dependence of glory upon public opinion; the possibility of an exceptional individual rising to prominence through nontraditional means; and the proximity of glory to military operations. One useful example of the concatenation of all three characteristics is Agathocles the Sicilian. Agathocles became king of Syracuse after rising from “a mean and abject fortune” (P 8). If one considers the “virtue of Agathocles,” Machiavelli says, one does not see why he should be judged inferior to “any most excellent captain.” Agathocles rose to supremacy with “virtue of body and spirit” and had no aid but that of the military. Indeed, there is little, if anything, that can be attributed to fortune in his ascent. It seems clear for all of these reasons that Agathocles is virtuous on the Machiavellian account. But Machiavelli goes on to say that “one cannot call it virtue” to do what Agathocles did. One cannot call it virtue to keep to a life of crime constantly; to slaughter the senators and the rich; to betray one’s friends; to be without faith, without mercy, without religion. Although such acts are compatible with Machiavellian virtue (and might even comprise it), they cannot be called virtuous according to the standards of conventional morality. Agathocles’ savage cruelty, inhumanity, and infinite crimes do not “permit him to be celebrated” among the most excellent human beings (compare P 6). In general, force and strength easily acquire reputation rather than the other way around (D 1.34). But Machiavelli concludes that Agathocles paid so little heed to public opinion that his virtue was not enough. In the end, Agathocles’ modes enabled him to acquire “empire but not glory” (P 8).

Glory for Machiavelli thus depends upon how you are seen and upon what people say about you. Many of the successful and presumably imitable figures in both The Prince and the Discourses share the quality of being cruel, for example. But even “cruelties well-used” (P 8) are insufficient to maintain your reputation in the long run. This is at least partly why explorations of deceit and dissimulation take on increasing prominence as both works progress (e.g., P 6, 19, and especially 26; D 3.6). One must learn to imitate not only the force of the lion but also the fraud of the fox (P 7, 18, and 19; D 2.13 and 3.40). Doing so might allow one to avoid a “double shame” and instead achieve a “double glory”: beginning a new regime and adorning it with good laws, arms, and examples (P 24).

Whether veneration (venerazione) and reverence (riverenzia) are ultimately higher concepts than glory remains an important question, and recent work has taken it up. Those interested in this question may find it helpful to begin with the following passages: P 6, 7, 11, 17, 19, 23, and 26; D 1.10-12, 1.36, 1.53-54, 2.20, 3.6 and 3.22; FH 1.9, 3.8, 3.10, 5.13, 7.5, and 7.34; and AW 6.163, 7.215, 7.216, and 7.223.

i. Religion

The place of religion in Machiavelli’s thought remains one of the most contentious questions in the scholarship. His brother Totto was a priest. His father appeared to be a devout believer and belonged to a flagellant confraternity called the Company of Piety. When Machiavelli was eleven, he joined the youth branch of this company, and he moved into the adult branch in 1493. From 1500 to 1513, Machiavelli and Totto paid money to the friars of Santa Croce in order to commemorate the death of their father and to fulfill a bequest from their great-uncle. Machiavelli’s actual beliefs, however, remain mysterious. He did write an Exhortation to Penitence (though scholars disagree as to his sincerity; compare P 26). And he did accept the last rites upon his deathbed in the company of his wife and some friends. But evidence in his correspondence—for instance, in letters from close friends such as Francesco Vettori and Francesco Guicciardini—suggests that Machiavelli did not take pains to appear publicly religious.

As with many other philosophers of the modern period, interpretations of Machiavelli’s religious beliefs can gravitate to the extremes: some scholars claim that Machiavelli was a pious Christian, while others claim that he was a militant and unapologetic atheist. Still others claim that he was religious but not in the Christian sense. It remains unclear what faith (fide) and piety (or mercy, pietà) mean for Machiavelli. Perhaps the easiest point of entry is to examine how Machiavelli uses the word “religion” (religione) in his writings.

Machiavelli variously speaks of “the present religion” (la presente religione; e.g., D 1.pr), “this religion” (questa religione; e.g., D 1.55), “the Christian religion” (la cristiana religione; e.g., FH 1.5), and “our religion” (nostra religione; e.g., D 2.2). Machiavelli says that “our religion [has shown] the truth and the true way” (D 2.22; cf. D 3.1 and 1.12), though he is careful not to say that it is the true way. “Our religion” is also contrasted to the curiously singular “ancient religion” (religione antica; D 2.2). Recent work has suggested that Machiavelli’s notion of the ancient religion may be analogous to, or even associated with, the prisca theologia / philosophia perennis which was investigated by Ficino, Pico, and others.

Machiavelli speaks of religious “sects” (sette; e.g., D 2.5), a type of group that seems to have a lifespan between 1,666 and 3,000 years. Species of sects tend to be distinguished by their adversarial character, such as Catholic versus heretical (FH 1.5); Christian versus Gentile (D 2.2); and Guelf versus Ghibelline (P 20). They also generally, if not exclusively, seem to concern matters of theological controversy. It is not clear whether and to what extent a religion differs from a sect for Machiavelli.

Machiavelli suggests that reliance upon certain interpretations—“false interpretations” (false interpretazioni)—of the Christian God has led in large part to Italy’s servitude. Such interpretations implore human beings to think more of enduring their beatings than of avenging them (D 2.2 and 3.27). He seems to allow for the possibility that not all interpretations are false; for example, he says that Francis and Dominic rescue Christianity from elimination, presumably because they return it to an interpretation that focuses upon poverty and the life of Christ (D 3.1). And one of the things that Machiavelli may have admired in Savonarola is how to interpret Christianity in a way that is muscular and manly rather than weak and effeminate (compare P 6 and 12; D 1.pr, 2.2 and 3.27; FH 1.5 and 1.9; and AW 2.305-7).

Some scholars have emphasized the various places where Machiavelli associates Christianity with the use of dissimulation (e.g., P 18) and fear (e.g., D 3.1) as a form of social control. Other scholars believe that Machiavelli adheres to an Averroeist (which is to say Farabian) understanding of the public utility of religion. On such an understanding, religion is necessary and salutary for public morality. The philosopher should therefore take care not to disclose his own lack of belief or at least should attack only impoverished interpretations of religion rather than religion as such.

Finally, recent work has emphasized the extent to which Machiavelli’s concerns appear eminently terrestrial; he never refers in either The Prince or the Discourses to the next world or to another world.

j. Ethics

Machiavelli’s very name has become a byword for treachery and relentless self-interest. His ethical viewpoint is usually described as something like “the end justifies the means” (see for instance D 1.9). Is this a fair characterization?

The easiest point of entry into Machiavelli’s notion of ethics is the concept of cruelty. At least since Montaigne (and more recently with philosophers such as Judith Shklar and Richard Rorty), this vice has held a special philosophical status. Indeed, contemporary moral issues such as animal ethics, bullying, shaming, and so forth are such contentious issues largely because liberal societies have come to condemn cruelty so severely. It is all the more striking to readers today, then, when they confront Machiavelli’s seeming recommendations of cruelty. Such recommendations are common throughout his works. In the Discourses, Machiavelli appears to recommend a cruel way which is an enemy to every “Christian,” and indeed “human,” way of life (D 1.26); furthermore, he appears to indirectly attribute this way of life to God (via David). In The Prince, he speaks of “cruelties well-used” (P 8) and explicitly identifies almost every imitable character as cruel (e.g., P 7, 8, 19, and 21). He even speaks of “mercy badly used” (P 17).

The fact that seeming vices can be used well and that seeming virtues can be used poorly suggests that there is an instrumentality to Machiavellian ethics that goes beyond the traditional account of the virtues. One could find many places in his writings that support this point (e.g., D 1.pr and 2.6), although the most notable is when he says that he offers something “useful” to whoever understands it (P 15). But what exactly is this instrumentality?

Partly, it seems to come from human nature. We have a “natural and ordinary desire” to acquire (P 3) which can never in principle be satisfied (D 1.37 and 2.pr; FH 4.14 and 7.14). Human life is thus restless motion (D 1.6 and 2.pr), resulting in clashes in the struggle to satisfy one’s desires. It is thus useful as a regulative ideal, and is perhaps even true, that we should see others as bad (D 1.3 and 1.9) and even wicked beings (P 17 and 18) who corrupt others by wicked means (D 3.8). In order to survive in such a world, goodness is not enough (D 3.30). Instead, we must learn how not to be good (P 15 and 19) or even how to enter into evil (P 18; compare D 1.52), since it is not possible to be altogether good (D 1.26). Even “the good” itself is variable (P 25). Thus, virtues and vices serve something outside themselves; they are not purely good or bad. Recognizing this limitation of both virtue and vice is eminently useful.

Another way to put this point is in terms of imitation. While we should often imitate those greater than us (P 6), we should also learn how to imitate those lesser than us. For example, we should imitate animals in order to fight as they do, since human modes of combat, such as law, are often not enough—especially when dealing with those who do not respect laws (P 18). More specifically, we should imitate the lion and the fox. The lion symbolizes force, perhaps to the point of cruelty; the fox symbolizes fraud, perhaps to the point of lying about the deepest things, such as religion (P 18). Everything, even one’s faith (D 1.15) and one’s offspring (P 11), can be used instrumentally.

The mention of the fox brings us to a second profitable point of entry into Machiavellian ethics, namely deception. Machiavelli’s moral exemplars are often cruel, but they are also often dissimulators. One of the clearest examples is Pope Alexander VI, a particularly adroit liar (P 18). Throughout his writings, Machiavelli regularly advocates lying (e.g., D 1.59 and 3.42; FH 6.17), especially for those who attempt to rise from humble beginnings (e.g., D 2.13). He even at one point suggests that it is useful to simulate craziness (D 3.2).

Because cruelty and deception play such important roles in his ethics, it is not unusual for related issues—such as murder and betrayal—to rear their heads with regularity. If Machiavelli possessed a sense of moral squeamishness, it is not something that one easily detects in his works. However, it should be noted that recent work has suggested that many, if not all, of Machiavelli’s shocking moral claims are ironic. If this hypothesis is true, then his moral position would be much more complicated than it appears to be. Does Machiavelli ultimately ask us to rise above considerations of utility? Does he, of all people, ask us to rise above what we have come to see as Machiavellianism?

3. Machiavelli’s Corpus

In what follows, Machiavelli’s four major works are discussed and then his other writings are briefly characterized.

a. The Prince

The Prince is Machiavelli’s most famous philosophical book. It was begun in 1513 and probably completed by 1515. We possess no surviving manuscript copy of it in Machiavelli’s own handwriting. We first hear of it in Machiavelli’s 10 December 1513 letter to his friend, Francesco Vettori, wherein Machiavelli divulges that he has been composing “a little work” entitled De Principatibus. Machiavelli also says that Filippo Casavecchia, a longtime friend, has already seen a rough draft of the text.

Evidence suggests that other manuscript copies were circulating among Machiavelli’s friends, and perhaps beyond, by 1516-17. These manuscripts, some of which we do possess, do not bear the title of The Prince. That title did not appear until roughly five years after Machiavelli’s death, when the first edition of the book was published with papal privilege in 1532.

Which title did Machiavelli intend: the Latin title of De Principatibus (“Of Principalities”); or the Italian title of Il Principe (“The Prince”)? That the book has two purported titles—and that they do not translate exactly into one another—remains an enduring and intriguing puzzle. The structure of The Prince does not settle the issue, as the book begins with chapters that explicitly treat principalities, but eventually proceeds to chapters that explicitly treat princes. Nor does the content settle the issue; the chapter titles are in Latin but the body of each chapter is in Italian, and the words “prince” and “principality” occur frequently throughout the entire book. Lastly, the Discourses offer no easy resolution; Machiavelli there refers to The Prince both as “our treatise of principalities” (nostro trattato de’ principati; D 2.1) and “our treatise of the Prince” (nostro trattato de Principe; D 3.42).

The Prince is composed of twenty-six chapters which are preceded by a Dedicatory Letter to Lorenzo de’ Medici (1492-1519), the grandson of Lorenzo the Magnificent (1449-92). As we learn from the aforementioned letter to Vettori, Machiavelli had originally intended to dedicate The Prince to Lorenzo the Magnificent’s son, Giuliano. At some point, for reasons not entirely clear, Machiavelli changed his mind and dedicated the volume to Lorenzo. We do not know whether Giuliano or Lorenzo ever read the work. There is an old story, perhaps apocryphal, that Lorenzo preferred a pack of hunting dogs to the gift of The Prince and that Machiavelli consequently swore revenge against the Medici. At any rate, the question of the precise audience of The Prince remains a key one. Some interpreters have even suggested that Machiavelli writes to more than one audience simultaneously.

The question of authorial voice is also important. Machiavelli himself appears as a character in The Prince twice (P 3 and 7) and sometimes speaks in the first person (e.g., P 2 and P 13). However, it is not obvious how to interpret these instances, with some recent scholars going so far as to say that Machiavelli operates with the least sincerity precisely when speaking in his own voice. This issue is exacerbated by the Dedicatory Letter, in which Machiavelli sets forth perhaps the foundational image of the book. He compares “those who sketch [disegnano]” landscapes from high and low vantage points to princes and peoples, respectively. And he suggests that “to know well” the nature of peoples one needs to be a prince, and vice versa. The suggestion seems to be that Machiavelli throughout the text variously speaks to one or the other of these vantage points and perhaps even variously speaks from one or the other of these vantage points. At the very least, the image implies that we should be wary of taking his claims in a straightforward manner. The sketcher image becomes even more complicated later in the text, when Machiavelli introduces the perspectives of two additional “humors” of the city, that is, the great (i grandi; P 9) and the soldiers (i soldati; P 19).

An additional interpretative difficulty concerns the book’s structure. In the first chapter, Machiavelli appears to give an outline of the subject matter of The Prince. But this subject matter appears to be exhausted as early as Chapter 7. What, then, to make of the rest of the book? One possibility is that The Prince is not a polished work; some scholars have suggested that it was composed in haste and that consequently it might not be completely coherent. An alternative hypothesis is that Machiavelli has some literary or philosophical reason to break from the structure of the outline, in keeping with his general trajectory of departing from what is customary. A third hypothesis is that the rest of the book is somehow captured by the initial outline and that what Machiavelli calls “threads” (orditi; P 2) or “orders” (ordini; P 10) flow outward, if only implicitly, from the first chapter.

Whatever interpretation one holds to, the subject matter of the book seems to be arranged into roughly four parts: Chapters 1-11 treat principalities (with the possible exception of Chapter 5); Chapters 12-14 treat the art of war; Chapters 15-19 treat princes; and Chapters 20-26 treat what we may call the art of princes. The first three sections, at least, are suggested by Machiavelli’s own comments in the text. In Chapter 12, Machiavelli says that he has previously treated the acquisition and maintenance of principalities and says that the remaining task is to discourse generally on offensive and defensive matters. Similarly, in Chapter 15, Machiavelli says that what remains is to see how a prince should act with respect to subjects and friends, implying minimally that what has come previously is a treatment of enemies.

Almost from its composition, The Prince has been notorious for its seeming recommendations of cruelty; its seeming prioritization of autocracy (or at least centralized power) over more republican or democratic forms; its seeming lionization of figures such as Cesare Borgia and Septimius Severus; its seeming endorsements of deception and faith-breaking; and so forth. Indeed, it remains perhaps the most notorious work in the history of political philosophy. One should be wary, however, of resting with what seems to be the case in The Prince, especially given Machiavelli’s repeated insistence that appearances can be manipulated. But the meaning of these manipulations, and indeed of these appearances, remains a scholarly question. Interpreters of the caliber of Rousseau and Spinoza have believed The Prince to bear a republican teaching at its core. Some scholars have gone so far as to see it as an utterly satirical or ironic work. Others have insisted that the book is even more dangerous than it first appears. At any rate, how The Prince fits together with the Discourses (if at all) remains one of the enduring puzzles of Machiavelli’s legacy.

b. Discourses on Livy

There is reason to suspect that Machiavelli had begun writing the Discourses as early as 1513; for instance, there seems to be a reference in The Prince to another, lengthier work on republics (P 2). And since the Discourses references events from as late as 1517, it seems to have still been a work in progress by that point and perhaps even later.

Evidence suggests that manuscript copies were circulating by 1530 and perhaps earlier. We do not possess any of these manuscripts; in fact, we possess no manuscript of the Discourses in Machiavelli’s handwriting except for what is now known as the preface to the first book. It bears no heading and begins with a paragraph that our other manuscripts do not have. There is still debate over whether this paragraph should be excised (since it is not found in the other manuscripts) or whether it should be retained (since it is found in the only polished writing we have of the Discourses in Machiavelli’s hand). It is typically retained in English translations.

Roughly four years after Machiavelli’s death, the first edition of the Discourses was published with papal privilege in 1531. As with The Prince, there is a bit of mystery surrounding the title of the Discourses. The book appeared first in Rome and then a few weeks later in Florence, with the two publishers (Blado and Giunta, respectively) seemingly working with independent manuscripts. Both the Blado and Giunta texts give the title of Discorsi sopra la prima deca di Tito Livio. The reference is to Livy’s History of Rome (Ab Urbe Condita) and more specifically to its first ten books. Machiavelli refers simply to Discorsi in the Dedicatory Letter to the work, however, and it is not clear whether he intended the title to specifically pick out the first ten books by name. Additionally, some of Machiavelli’s contemporaries, such as Guicciardini, do not name the book by the full printed title. Today, the title is usually given as the Discourses on Livy (or the Discourses for short).

The number of chapters in the Discourses is 142, which is the same as the number of books in Livy’s History. This is a curious coincidence and one that is presumably intentional. But what is the intent? Scholars are divided on this issue. A second, related curiosity is that the manuscript as we now have it divides the chapters into three parts or books. However, the third part does not have a preface as the first two do.

As with the dedicatory letter to The Prince, there is also a bit of mystery surrounding the dedicatory letter to the Discourses. The work is dedicated to Zanobi Buondelmonti and Cosimo Rucellai, two of Machiavelli’s friends, of whom Machiavelli says in the letter that they deserve to be princes even though they are not. It is noteworthy that the Discourses is the only one of the major prose works dedicated to friends; by contrast, The Prince, the Art of War, and the Florentine Histories are all dedicated to potential or actual patrons.

Machiavelli makes his presence known from the very beginning of the Discourses; the first word of the work is the first person pronoun, “Io.” And indeed the impression that one gets from the book overall is that Machiavelli takes fewer pains to recede into the background here than in The Prince. The Discourses is, by Machiavelli’s admission, ostensibly a commentary on Livy’s history. In the preface to the first book, Machiavelli laments the fact that there is no longer a “true knowledge of histories” (vera cognizione delle storie) and judges it necessary to write upon the books of Livy that have not been intercepted by “the malignity of the times” (la malignità de’ tempi). He claims that those who read his writings can “more easily draw from them that utility [utilità] for which one should seek knowledge of histories” (D 1.pr). However, it is a strange kind of commentary: one in which Machiavelli regularly alters or omits Livy’s words (e.g., D 1.12) and in which he disagrees with Livy outright (e.g., D 1.58).

Clues as to the structure of the Discourses may be gleaned from Machiavelli’s remarks in the text. At the end of the first chapter (D 1.1), Machiavelli distinguishes between things done inside and outside the city of Rome. He further distinguishes between things done by private and public counsel. Finally, he claims that the first part or book will treat things done inside the city by public counsel. The first part, then, primarily treats domestic political affairs. Machiavelli says that the second book concerns how Rome became an empire, that is, it concerns foreign political affairs (D 2.pr). If Machiavelli did in fact intend there to be a third part, the suggestion seems to be that it concerns affairs conducted by private counsel in some manner. It is noteworthy that fraud and conspiracy (D 2.13, 2.41, and 3.6), among other things, become increasingly important topics as the book progresses.

At first glance, it is not clear whether the teaching of the Discourses complements that of The Prince or whether it militates against it. Scholars remain divided on this issue. Some insist upon the coherence of the books, either in terms of a more nefarious teaching typically associated with The Prince; or in terms of a more consent-based, republican teaching typically associated with the Discourses. Others see the Discourses as a later, more mature work and take its teaching to be truer to Machiavelli’s ultimate position, especially given his own work for the Florentine republic. At any rate, how the books fit together remains perhaps the preeminent puzzle concerning Machiavelli’s philosophy. The Discourses nevertheless remains one of the most important works in modern republican theory. It had an enormous effect on republican thinkers such as Rousseau, Montesquieu, Hume, and the American Founders. (See “Politics: Republicanism” above.)

c. Art of War

The Art of War is the only significant prose work published by Machiavelli during his lifetime and his only attempt at writing a dialogue in the humanist tradition. It was probably written in 1519. The first edition was published in 1521 in Florence under the title Libro della arte della Guerra di Niccolò Machiavegli cittadino et segretario fiorentino. It takes the literary form of a dialogue divided into seven books and preceded by a preface. Like The Prince, the work is dedicated to a Lorenzo—in this case, Lorenzo di Filippo Strozzi, “Florentine Patrician.” Strozzi was either a friend (as has been customarily held) or a patron (as recent work suggests). It is worth noting in passing that we possess autograph copies of two of Strozzi’s works in Machiavelli’s hand (Commedia and Pistola).

The action of the Art of War takes place after dinner and in the deepest and most secret shade (AW 1.13) of the Orti Oricellari, the gardens of the Rucellai family. These gardens were cultivated by Bernardo Rucellai, a wealthy Florentine who was a disciple of Ficino and who was also the uncle of two Medici popes, Leo X and Clement VII (via his marriage to Nannina, the eldest sister of Lorenzo the Magnificent). Bernardo filled the gardens with plants mentioned in classical texts (AW 1.13-15) and intended the place to be a center of humanist discussion. Ancient philosophy, literature, and history were regularly discussed there, in addition to contemporary works on occasion (for example, some of Machiavelli’s Discourses on Livy). Visitors included Machiavelli, Guicciardini, and members of Ficino’s so-called Platonic Academy. Notably, the gardens were the site of at least two conspiracies: an aristocratic one while Florence was a republic under the rule of Soderini (1498-1512); and a republican one, headed up by Cosimo Rucellai, after the Medici regained control in 1512. Conspiracy is one of the most extensively examined themes in Machiavelli’s corpus: it is the subject of both the longest chapter of The Prince (P 19) and the longest chapter of the Discourses (D 3.6; see also FH 2.32, 7.33, and 8.1).

One of the interlocutors of the Art of War is Bernardo’s grandson, Cosimo Rucellai, who is also one of the dedicatees of the Discourses. The other dedicatee of the Discourses, Zanobi Buondelmonti, is also one of the interlocutors of the Art of War. Two of the other young men present are Luigi Alamanni (to whom Machiavelli dedicated the Life of Castruccio Castracani along with Zanobi) and Battista della Palla. But perhaps the most important and striking speaker is Fabrizio Colonna. Colonna was a mercenary captain, which is notable enough given Machiavelli’s insistent warnings against mercenary arms (e.g., P 12-13 and D 1.43). However, Colonna was also the leader of the Spanish forces that compelled the capitulation of Soderini and that enabled the Medici to regain control of Florence.

In the preface to the work, Machiavelli notes the vital importance of the military: he compares it to a palace’s roof, which protects the contents (compare FH 6.34). And he laments the corruption of modern military orders as well as the modern separation of military and civilian life (AW Pref., 3-4). Roughly speaking, books 1 and 2 concern issues regarding the treatment of soldiers, such as payment and discipline. Books 3 and 4 concern issues regarding battle, such as tactics and formation. Book 5 concerns issues regarding logistics, such as supply lines and the use of intelligence. Book 6 concerns issues regarding the camp, including a comparison to the way that the Romans organized their camps. Book 7 concerns issues regarding armament, such as fortifications and artillery. Like The Prince, the Art of War ends with an indictment of Italian princes with respect to Italy’s weak and fragmented situation.

Many Machiavellian themes from The Prince and the Discourses recur in the Art of War. Some examples are: the importance of one’s own arms (AW 1.180; P 6-9 and 12-14; D 2.20); modern misinterpretations of the past (AW 1.17; D 1.pr and 2.pr); the way that good soldiers arise from training rather than from nature (AW 1.125 and 2.167; D 1.21 and 3.30-9); the need to divide an army into three sections (AW 3.12ff; D 2.16); the willingness to adapt to enemy orders (AW 4.9ff; P 14; D 3.39); the importance of inspiring one’s troops (AW 4.115-40; D 3.33); the importance of generating obstinacy and resilience in one’s troops (AW 4.134-48 and 5.83; D 1.15); and the relationship between good arms and good laws (AW 1.98 and 7.225; P 12).

Strong statements throughout his corpus hint at the immensely important role of war in Machiavelli’s philosophy. In The Prince, Machiavelli says that a prince should focus all of his attention upon becoming a “professional” in the art of war (professo; compare the “professions” of AW Pref. and P 15), for “that is the only art which is of concern to one who commands” (P 14). In the Discourses, he says that it is “truer than any other truth” that it is always a prince’s defect (rather than a defect of a site or nature) when human beings cannot be made into soldiers (D 1.21). And his only discussion of science in The Prince or the Discourses comes in the context of hunting as an image of war (D 3.39). Such statements, along with Machiavelli’s dream of a Florentine militia, point to the key role of the Art of War in Machiavelli’s corpus. But the technical nature of its content, if nothing else, has proved to be a resilient obstacle for scholars who attempt to master it, and the book remains the least studied of his major works.

d. Florentine Histories

This is the last of Machiavelli’s major works. It was not his first attempt at penning a history; Machiavelli had already written a two-part verse history of Italy, I Decennali, which covers the years 1492-1509. But the Florentine Histories is a greater effort. It is written in prose and covers the period of time from the decline of the Roman Empire until the death of Lorenzo the Magnificent in 1492.

The Florentine Histories was commissioned in 1520 by Pope Leo X, on behalf of the Officers of Study of Florence. The intervention of Cardinal Giulio de’ Medici was key; the Histories would be dedicated to him and presented to him in 1525, by which time he had ascended to the papacy as Clement VII. Machiavelli presented eight books to Clement and did not write any additional ones. They were not published until 1532.

Although Giulio had made Machiavelli the official historiographer of Florence, it is far from clear that the Florentine Histories are a straightforward historiographical account. Machiavelli says in the Dedicatory Letter that he is writing of “those times which, through the death of the Magnificent Lorenzo de’ Medici, brought a change of form [forma] in Italy.” He says that he has striven to “satisfy everyone” while “not staining the truth.” In the Preface, Machiavelli says that his intent is to write down “the things done inside and outside [the city] by the Florentine people” (le cose fatte dentro e fuora dal popolo fiorentino) and that he changed his original intention in order that “this history may be better understood in all times.”

Though Book 1 is ostensibly a narrative concerning the time from the decline of the Roman Empire, in Book 2 he calls Book 1 “our universal treatise” (FH 2.2), thus implying that it is more than a simple narrative. Books 2, 3, and 4 concern the history of Florence itself from its origins to 1434. Books 5, 6, 7, and 8 concern Florence’s history against the background of Italian history.

In Book 1, Machiavelli explores how Italy has become disunited, in no small part due to causes such as Christianity (FH 1.5) and barbarian invasions (FH 1.9). The rise of Charlemagne is also a crucial factor (FH 1.11). Machiavelli notes that Christian towns have been left to the protection of lesser princes (FH 1.39) and even no prince at all in many cases (FH 1.30), such that they “wither at the first wind” (FH 1.23).

In Book 2, Machiavelli famously calls Florence “[t]ruly a great and wretched city” (Grande veramente e misera città; FH 2.25). Scholars have long focused upon how Machiavelli thought Florence was wretched, especially when compared to ancient Rome. But recent work has begun to examine the ways in which Machiavelli thought that Florence was great, as well as the overlap between the Histories and the Discourse on Florentine Affairs (which was also commissioned by the Medici around 1520). Book 2 also examines the ways in which the nobility disintegrates into battles between families (e.g., FH 2.9) and into various splinter factions of Guelfs (supporters of the Pope) and Ghibellines (supporters of the Emperor). The rise of Castruccio Castracani, alluded to in Book 1 (e.g., FH 1.26), is further explored (FH 2.26-31), as well as various political reforms (FH 2.28 and 2.39).

Books 3 and 4 are especially notable for Machiavelli’s analysis of the class conflicts that exist in every polity (e.g., FH 3.1), and some scholars believe that his treatment here is more developed and nuanced than his accounts in either The Prince or the Discourses. Machiavelli also narrates the rise of several prominent statesmen: Salvestro de’ Medici (FH 3.9); Michele di Lando (FH 3.16-22; compare FH 3.13); Niccolò da Uzzano (FH 4.2-3); and Giovanni di Bicci de’ Medici (FH 4.3 and 4.10-16), whose family is in the ascendancy at the end of Book 4.

Books 5 and 6 ostensibly concern the rise of the Medici, and indeed one might view Cosimo’s ascent as something of the central event of the Histories (see for instance FH 5.4 and 5.14). Yet in fact Machiavelli devotes the majority of Books 5 and 6 not to the Medici but rather to the rise of mercenary armies in Italy (compare P 12 and D 2.20). Among the topics that Machiavelli discusses are the famous battle of Anghiari (FH 5.33-34); the fearlessness of mercenary captains to break their word (FH 6.17); the exploits of Francesco Sforza (e.g., FH 6.2-18; compare P 1, 7, 12, 14, and 20 as well as D 2.24); and the propensity of mercenaries to generate wars so that they can profit (FH 6.33; see also AW 1.51-62).

Books 7 and 8 principally concern the rise of the Medici, in particular Cosimo; his son, Piero the Gouty; and his son in turn, Lorenzo the Magnificent. Cosimo (though “unarmed”) dies with great glory and is famous largely for his liberality (FH 7.5) and his attention to city politics: he prudently and persistently married his sons into wealthy Florentine families rather than foreign ones (FH 7.6). Cosimo also loved classical learning to such an extent that he brought John Argyropoulos and Marsilio Ficino to Florence. Additionally, Cosimo left a strong foundation for his descendants (FH 7.6). Piero is highlighted mainly for lacking the foresight and prudence of his father; for fomenting popular resentment; and for being unable to resist the ambition of the great. Nonetheless, Machiavelli notes Piero’s “virtue and goodness” (FH 7.23). Lorenzo is noted for his youth (FH 7.23); his military prowess (FH 7.12); his desire for renown (FH 8.3); his eventual bodyguard of armed men due to the Pazzi assassination attempt (FH 8.10); and his many amorous endeavors (FH 8.36). The Histories end with the death of Lorenzo.

The Histories has received renewed attention in recent years, and scholars have increasingly seen it as not merely historical but also philosophical—in other words, as complementary to The Prince and the Discourses.

e. Other Works

Machiavelli’s other writings are briefly described here. Not every work is listed; instead, emphasis has been placed upon those that seem to have philosophical resonance.

Some of Machiavelli’s writings treat historical or political topics. In the early 1500s, he wrote several reports and speeches. They are notable for their topics and for the way in which they contain precursors to important claims in later works, such as The Prince. Among other things, Machiavelli wrote on how Duke Valentino killed Vitellozzo Vitelli (compare P 7); on how Florence tried to suppress the factions in Pistoia (compare P 17); and on how to deal with the rebels of Valdichiana.

In 1520, Machiavelli wrote a fictionalized biography, The Life of Castruccio Castracani. Many important details of Castruccio’s life are changed and stylized by Machiavelli, perhaps in the manner of Xenophon’s treatment of Cyrus. The most obvious changes are found in the final part, where Machiavelli attributes to Castruccio many sayings that are in fact almost exclusively drawn from the Lives of Diogenes Laertius. Some scholars believe that Machiavelli’s account is also beholden to the various Renaissance lives of Tamerlane—for instance, those by Poggio Bracciolini and especially Enea Silvio Piccolomini, who would become Pope Pius II and whose account became something of a genre model.

Also around 1520, Machiavelli wrote the Discourse on Florentine Affairs. Recent work has suggested the proximity in content between this work and the Florentine Histories. Also of interest is On the Natures of Florentine Men, which is an autograph manuscript which Machiavelli may have intended as a ninth book of the Florentine Histories.

Toward the end of his tenure in the Florentine government, Machiavelli wrote two poems in terza rima called I Decennali. The first seems to date from 1504-1508 and concerns the history of Italy from 1492 to 1503. It is the only work that Machiavelli published while in office. The second seems to date from around 1512 and concerns the history of Italy from 1504 to 1509. Among other things, they are precursors to concerns found in the Florentine Histories.

In general, between 1515 and 1527, Machiavelli turned more consciously toward art. He wrote a play called Le Maschere (The Masks) which was inspired by Aristophanes’ Clouds but which has not survived. Three of Machiavelli’s comedies have survived, however. L’Andria (The Girl from Andros) is a translation of Terence and was probably written between 1517 and 1520. Mandragola was probably written between 1512 and 1520; was first published in 1524; and was first performed in 1526. While original, it hearkens to the ancient world especially in how its characters are named (e.g., Lucrezia, Nicomaco). It is by far the most famous of the three and indeed is one of the most famous plays of the Renaissance. It contains many typical Machiavellian themes, the most notable of which are conspiracy and the use of religion as a mask for immoral purposes. The last of Machiavelli’s plays, Clizia, is an adaptation of Plautus. It was probably written in the early 1520s. In recent years, scholars have increasingly treated all three of these plays with seriousness and indeed as philosophical works in their own right.

In addition to I Decennali, Machiavelli wrote other poems. I Capitoli contains tercets which are dedicated to friends and which treat the topics of ingratitude, fortune, ambition, and opportunity (with virtue being notably absent). The Ideal Ruler is in the form of a pastoral. L’Asino (The Golden Ass) is unfinished and in terza rima; it has been called an “anti-comedy” and was probably penned around 1517. Between 1510 and 1515, Machiavelli wrote several sonnets and at least one serenade.

There are some other miscellaneous writings with philosophical import, most of which survive in autograph copies and which have undetermined dates of composition. Machiavelli wrote a Dialogue on Language in which he discourses with Dante on various linguistic concerns, including style and philology. Articles for a Pleasure Company is a satire on high society and especially religious confraternities. Belfagor is a short story that portrays, among other things, Satan as a wise and just prince. An Exhortation to Penitence unsurprisingly concerns the topic of penitence; the sincerity of this exhortation, however, remains a scholarly question.

Lastly, Machiavelli’s correspondence is worth noting. Some of his letters are diplomatic dispatches (the so-called “Legations”); others are personal. The Legations date from the period that Machiavelli worked for the Florentine government (1498-1512). The personal letters date from 1497 to 1527. Machiavelli’s nephew, Giuliano de’ Ricci, is responsible for assembling the copies of letters that Machiavelli had made. Particularly notable among the personal letters are the 13-21 September 1506 letter to Giovanbattista Soderini, the so-called Ghiribizzi al Soderini (Musings to Soderini); and the 10 December 1513 letter to Francesco Vettori, wherein Machiavelli first mentions The Prince.

4. Possible Philosophical Influences on Machiavelli

Machiavelli insists upon the novelty of his enterprise in several places (e.g., P 15 and D 1.pr). It is true that Machiavelli is particularly innovative and that he often appears to operate “without any respect” (sanza alcuno rispetto), as he puts it, toward his predecessors. As a result, some interpreters have gone so far as to call him the inaugurator of modern philosophy. But all philosophers are to some degree in conversation with their predecessors, even (or perhaps especially) those who seek to disagree fundamentally with what has been thought before. Thus, even with a figure as purportedly novel as Machiavelli, it is worth pondering historical and philosophical influences.

a. Renaissance Humanism

Although Machiavelli studied ancient humanists, he does not often cite them as authorities. In his own day, the most widely cited discussion of the classical virtues was Book 1 of Cicero’s De officiis. But Cicero is never named in The Prince (although Machiavelli does allude to him via the images of the fox and the lion in P 18-19) and is named only three times in the Discourses (D 1.4, 1.33, and 1.52; see also D 1.28, 1.56, and 1.59). Other classical thinkers in the humanist tradition receive similar treatment. Juvenal is quoted three times (D 2.19, 2.24, and 3.6). Virgil is quoted once in The Prince (P 17) and three times in the Discourses (D 1.23, 1.54, and 2.24). This trend tends to hold true for later thinkers, as well. Petrarch, whom Machiavelli particularly admired, is never mentioned in the Discourses, although Machiavelli does end The Prince with four lines from Petrarch’s Italia mia (93-96). One may see this relative paucity of references as suggestive that Machiavelli did not have humanist concerns. But it is possible to understand his thought as having a generally humanist tenor.

It is worth remembering that the humanists of Machiavelli’s day were almost exclusively professional rhetoricians. Though they did treat problems in philosophy, they were primarily concerned with eloquence. The revival of Greek learning in the Italian Renaissance did not change this concern and in fact even amplified it. New translations were made of ancient works, including Greek poetry and oratory, and rigorous (and in some ways newfound) philological concerns were infused with a sense of grace and nuance not always to be found in translations conducted upon the model of medieval calques. A notable example is Coluccio Salutati, who otherwise bore a resemblance to medieval rhetoricians such as Petrus de Vineis but who believed, unlike the medievals, that the best way to achieve eloquence was to imitate ancient style as concertedly as possible.

Machiavelli’s writings bear the imprint of his age in this regard. But what exactly is this imprint? What exactly is Machiavellian eloquence? Fellow philosophers have differed in their opinions. Adam Smith considered Machiavelli’s tone to be markedly cool and detached, even in discussions of the egregious exploits of Cesare Borgia. By contrast, Nietzsche understood Machiavelli’s Italian to be vibrant, almost galloping; and he thought that The Prince in particular imaginatively transported the reader to Machiavelli’s Florence and conveyed dangerous philosophical ideas in a boisterous “allegrissimo.” It is not unusual for interpreters to take one or the other of these stances today: to see Machiavelli’s works as dry and technical; or to see them as energetic and vivacious.

Recent work has examined not only Machiavelli’s eloquence but also his images, metaphors, and turns of phrase. “At a stroke” (ad un tratto) and “without any respect” (sanza alcuno rispetto) are two characteristic examples that Machiavelli frequently deploys. There has also been recent work on the many binaries to be found in Machiavelli’s works, such as virtue / fortune; ordinary / extraordinary; high / low; manly / effeminate; principality / republic; and secure / ruin. Machiavelli’s wit and his use of humor more generally have also been the subjects of recent work. Finally, increasing attention has been paid to other rhetorical devices, such as when Machiavelli speaks in his own voice; when he uses paradox, irony, and hyperbole; when he modifies historical examples for his own purposes; when he appears as a character in his narrative; and so forth. And some scholars have gone so far as to say that The Prince is not a treatise (compare D 2.1) but rather an oration, which follows the rules of classical rhetoric from beginning to end (and not just in Chapter 26). In short, it is increasingly a scholarly trend to claim that one must pay attention not only to what Machiavelli says but also to how he says it.

b. Renaissance Platonism

There is still a remarkable gap in the scholarship concerning Machiavelli’s possible indebtedness to Plato. One reason for this lacuna might be that Plato is never mentioned in The Prince and is mentioned only once in the Discourses (D 3.6). But there was certainly a widespread and effervescent revival of Platonism in Florence before and during Machiavelli’s lifetime.

What exactly is meant here, however? “Platonism” itself is a decidedly amorphous term in the history of philosophy. There are few, if any, doctrines that all Platonists have held, as Plato himself did not insist upon the dogmatic character of either his writings or his oral teaching. To which specific variety of Platonism was Machiavelli exposed? The two most instrumental figures with respect to transmitting Platonic ideas to Machiavelli’s Florence were George Gemistos Plethon and Marsilio Ficino.

Plethon visited Florence in 1438 and 1439 due to the Council of Florence, the seventeenth ecumenical council of the Catholic Church (Plethon himself opposed the unification of the Greek and Latin Churches). Cosimo de’ Medici was also enormously inspired by Plethon (as was John Argyropoulos; see FH 7.6); Ficino says in a preface to ten dialogues of Plato, written for Cosimo, that Plato’s spirit had flown from Byzantium to Florence. And he says in a preface to his version of Plotinus that Cosimo had been so deeply impressed with Plethon that the meeting between them had led directly to the foundation of Ficino’s so-called Platonic Academy.

The son of Cosimo de’ Medici’s physician, Ficino was a physician himself who also tutored Lorenzo the Magnificent. Ficino became a priest in 1473, and Lorenzo later made him canon of the Duomo so that he would be free to focus upon his true love: philosophy. Like Plethon, Ficino believed that Plato was part of an ancient tradition of wisdom and interpreted Plato through Neoplatonic successors, especially Proclus, Dionysius the Areopagite, and St. Augustine. Ficino died in 1499 after translating into Latin an enormous amount of ancient philosophy, including commentaries; and after writing his own great work, the Platonic Theology, a work of great renown that probably played no small role in the 1513 Fifth Lateran Council’s promulgation of the dogma of the immortality of the soul.

In the proem to the Platonic Theology, Ficino calls Plato “the father of philosophers” (pater philosophorum). In the Florentine Histories and in the only instance of the word “philosophy” (filosofia) in the major works, Machiavelli calls Ficino himself the “second father of Platonic philosophy” (secondo padre della platonica filosofia [FH 7.6]; compare FH 6.29, where Stefano Porcari of Rome hoped to be called its “new founder and second father” [nuovo fondatore e secondo padre]). And Machiavelli calls the syncretic Platonist Pico della Mirandola “a man almost divine [uomo quasi che divino]” (FH 8.36). Some scholars believe that Machiavelli critiques both Plato and Renaissance Platonism in such passages. Others, especially those who have problematized the sincerity of Machiavelli’s shocking moral claims, believe that these passages suggest a proximity between Machiavellian and Platonic themes.

Finally, Machiavelli’s father, Bernardo, is the principal interlocutor in Bartolomeo Scala’s Dialogue on the Laws and appears there as an ardent admirer of Plato.

c. Renaissance Aristotelianism

Aristotle is never mentioned in The Prince and is mentioned only once in the Discourses in the context of a discussion of tyranny (D 3.26). This has led some scholars to claim that Machiavelli makes a clean and deliberate break with Aristotelian philosophy. Other scholars, particularly those who see Machiavelli as a civic humanist, believe that Aristotle’s notions of republicanism and citizenship inform Machiavelli’s own republican idiom.

As with the question concerning Plato, the question of whether Aristotle influenced Machiavelli would seem to depend at least in part on the Aristotelianism to which he was exposed. Scholars once viewed the Renaissance as the rise of humanism and the rediscovery of Platonism, on the one hand; and the decline of the prevailing Aristotelianism of the medieval period, on the other. But, if anything, the reputation of Aristotle was only strengthened in Machiavelli’s time.

Italian scholastic philosophy was its own animal. Italy was exposed to more Byzantine influences than any other Western country. Furthermore, unlike a country such as France, Italy also had its own tradition of culture and inquiry that reached back to classical Rome. It is simply not the case that Italian Aristotelianism was displaced by humanism or Platonism. Indeed, perhaps from the late 13th century, and certainly by the late 14th, there was a healthy tradition of Italian Aristotelianism that stretched far into the 17th century. The main difference between the Aristotelian scholastics and their humanist rivals was one of subject matter. Whereas the humanists were rhetoricians who focused primarily on grammar, rhetoric, and poetry, the scholastics were philosophers who focused upon logic and natural philosophy. In Machiavelli’s day, university chairs in logic and natural philosophy were regularly held by Aristotelian philosophers, and lecturers in moral philosophy regularly based their material on Aristotle’s Nicomachean Ethics and Politics. And the Eudemian Ethics was translated for the first time.

Assessing to what extent Machiavelli was influenced by Aristotle, then, is not as easy as simply seeing whether he accepts or rejects Aristotelian ideas, because some ideas—or at least the interpretations of those ideas—are much more compatible with Machiavelli’s philosophy than others. It seems likely that Machiavelli did not agree fully with the Aristotelian position on political philosophy. But Alexander of Aphrodisias’ interpretation that the soul was mortal might be much more in line with Machiavelli’s position, and this view was widely known in Machiavelli’s day. Another candidate might be Pietro Pomponazzi’s prioritization of the active, temporal life over the contemplative life. A third candidate might be any of the various and so-called Averroist ideas, many of which underwent a revival in Machiavelli’s day (especially in places like Padua). Recent work has explored this final candidate in particular.

d. Xenophon

Xenophon is mentioned only once in The Prince (P 14). However, he is mentioned seven times in the Discourses (D 2.2, 2.13, 3.20, 3.22 [2x], and 3.39 [2x]), which is more than any other historian except for Livy. Machiavelli refers the reader explicitly to two works of Xenophon: the Cyropaedia, which he calls “the life of Cyrus” (la vita di Ciro; P 14; see also D 2.13); and the Hiero, which he calls by the alternate title, Of Tyranny (De tyrannide; D 2.2; see also the end of P 21).

In The Prince, Machiavelli lists Cyrus (along with Moses, Romulus, and Theseus) as one of the four “most excellent men” (P 6). He also names Cyrus, or at least Xenophon’s version of Cyrus (D 3.22), as the exemplar that Scipio Africanus imitates (P 14). Machiavelli says that whoever reads “the life of Cyrus” will see in the “life of Scipio” how much glory Scipio obtained as a result of imitating Cyrus. And he says that Scipio’s imitation consisted in the chastity, affability, humanity, and liberality outlined by Xenophon.

This kind and gentle vision of Cyrus was not shared universally by Renaissance Italians. Dante, Petrarch, and Boccaccio all characterize Cyrus as a monstrous ruler who was defeated and killed by Queen Tomyris (one of the stories of Cyrus’ demise which is related by Herodotus). Although Machiavelli at times offers information about Cyrus that is compatible with Herodotus’ account (P 6 and 26; AW 6.218), he appears to have a notable preference for Xenophon’s fictionalized version (as in P 14 above).

Machiavelli’s preference is presumably because of Xenophon’s teaching on appearances. Xenophon’s Cyrus is chaste, affable, humane, and liberal (P 14). At least two of these virtues are mentioned in later chapters of The Prince. Liberality is characterized as a virtue that consumes itself and thus cannot be maintained unless one spends what belongs to others, as did “Cyrus, Caesar, and Alexander” (P 17). Similarly, humanity (umanità) is named as a trait that one may have to disavow in times of necessity (P 18). For example, Agathocles is characterized by inhumanity (inumanità; P 8), and Hannibal was “inhumanely cruel” (inumana crudeltà; P 17; see also D 3.21-22). Nonetheless, humanity is also one of the five qualities that Machiavelli explicitly highlights as a useful thing to appear to have (P 18; see also FH 2.36). Machiavelli makes it clear that Xenophon’s Cyrus understood the need to deceive (D 2.13). Thus, Machiavelli may have learned from Xenophon that it is important for rulers (and especially founders) to appear to be something that they are not. This might hold true whether they are actual rulers (e.g., “a certain prince of present times” who says one thing and does another; P 18) or whether they are historical examples (e.g., Machiavelli’s altered story of David; P 13).

But it is worth wondering whether Machiavelli does in fact ultimately uphold Xenophon’s account. Immediately after praising Xenophon’s account of Cyrus at the end of Prince 14, Machiavelli in Prince 15 lambasts those who have presented imaginary objects of imitation. He says that he will leave out what is imagined and will instead discuss what is true. Could it be that Machiavelli puts Xenophon’s Cyrus forward as an example that is not to be followed? It is worth noting that Scipio, who imitates Cyrus, is criticized for excessive mercy (or piety; P 17). This example is especially remarkable since Machiavelli highlights Scipio as someone who was very rare (rarissimo) not only for his own times but “in the entire memory of things known” (in tutta la memoria delle cose che si fanno; P 17; compare FH 8.29). It also raises the question as to whether Machiavelli writes in a manner similar to Xenophon (D 3.22).

Lastly, it is worth noting that Xenophon was a likely influence on Machiavelli’s own fictionalized and stylized biography, The Life of Castruccio Castracani.

e. Lucretius

Ninth-century manuscripts of De rerum natura, Lucretius’ poetic account of Epicurean philosophy, are extant. However, the text was not widely read in the Middle Ages and did not obtain prominence until centuries later, when it was rediscovered in 1417 by Poggio Bracciolini. It seems to have entered broader circulation in the 1430s or 1440s, and it was first printed in 1473. De rerum natura was one of the two texts which led to a revival of Epicurean philosophy in Machiavelli’s day, the other being the life of Epicurus from Book 10 of Diogenes Laertius’ Lives (translated into Latin in 1433). These two works, along with other snippets of Epicurean philosophy already known from Seneca and Cicero, inspired many thinkers, such as Ficino and Alberti, to ponder the return of these ideas.

With respect to Machiavelli, Lucretius was an important influence on Bartolomeo Scala, a lawyer who was a friend of Machiavelli’s father. Additionally, Lucretius was an important influence on Marcello di Virgilio Adriani, who was a professor at the University of Florence; Scala’s successor in the chancery; and the man under whom Machiavelli was appointed to work in 1498. Adriani deployed Lucretius in his Florentine lectures on poetry and rhetoric between 1494 and 1515. Machiavelli may have received a substantial part of his classical education from Adriani and was likely familiar with Adriani’s lectures, at least.

Lucretius also seems to have been a direct influence on Machiavelli himself. Although Machiavelli never mentions Lucretius by name, he did hand-copy the entirety of De rerum natura (drawing largely from the 1495 print edition). Machiavelli’s transcription was likely completed around 1497 and certainly before 1512. He omits the descriptive capitula—not original to Lucretius but common in many manuscripts—that subdivide the six books of the text into smaller sections. He also adds approximately twenty marginal annotations of his own, almost all of which are concentrated in Book 2. Machiavelli’s annotations focus on the passages in De rerum natura which concern Epicurean physics—that is, the way that the cosmos would function in terms of atomic motion, atomic swerve, free will, and a lack of providential intervention. Recent work has noted that it is precisely this section of the text that received the least attention from other Renaissance annotators, many of whom focused instead upon Epicurean views on love, virtue, and vice.

Recent work has also highlighted stylistic resonances between Machiavelli’s works and De rerum natura, either directly or indirectly. To give only one example, Machiavelli says in the Discourses that he desires to “take a path as yet untrodden by anyone” (non essendo suta ancora da alcuno trita) in order to find “new modes and orders” (modi ed ordini nuovi; D 1.pr). Lucretius says that he will walk paths not yet trodden (trita) by any foot in order to gather “new flowers” (novos flores; 4.1-5). Among other possible connections are P 25 and 26; and D 1.2, 2.pr, and 3.2.

Machiavelli does not seem to have agreed with the classical Epicurean position that one should withdraw from public life (e.g., D 1.26 and 3.2). But what might Machiavelli have learned from Lucretius? One possible answer concerns the soul. Machiavelli never treats the topic of the soul substantively, and he never uses the word at all in either The Prince or the Discourses (he apparently even went so far as to delete anima from a draft of the first preface to the Discourses). For Lucretius, the soul is material, perishable, and made up of two parts: animus, which is located in the chest, and anima, which is spread throughout the body. But each part, like all things in the cosmos, is composed only of atoms, invisibly small particles of matter that are constantly in motion. From time to time, these atoms conglomerate into macroscopic masses. Human beings are such entities. But when they perish, there is no longer any power to hold the atoms of the soul together, so those atoms disperse like all others eventually do.

A second possible aspect of Lucretian influence concerns the eternity of the cosmos, on the one hand, and the constant motion of the world, on the other. Lucretius seems to have believed that the cosmos was eternal but that the world was not, whereas some thinkers in Machiavelli’s day believed that both the cosmos and the world were eternal. Machiavelli ponders the question of the eternity of the world (D 2.5). He at times claims that the world has always remained the same (D 1.pr and 2.pr; see also 1.59). He also at times claims that worldly things are in motion (P 10 and FH 5.1; compare P 25) and that human things in particular are “always in motion” (D 1.6 and 2.pr).

As recent work has shown, reading Lucretius in the Renaissance was a dangerous game. By Machiavelli’s time, Petrarch had already described Epicurus as a philosopher who was held in popular disrepute; and Dante had already suggested that those who deny the afterlife belong with “Epicurus and all his followers” (Inferno 10.13-15). In 1513, the Fifth Lateran Council condemned those who believed that the soul was mortal; those who believed in the unity of the intellect; and those who believed in the eternity of the world. It also made belief in the afterlife mandatory. Lucretius was last printed in the Italian Renaissance in 1515 and was prohibited from being read in schools by the Florentine synod in late 1516 / early 1517.

f. Savonarola

There is no comprehensive monograph on Machiavelli and Savonarola. While there has been some interesting recent work, particularly with respect to Florentine institutions, the connection between the two thinkers remains a profitable area of research.

Girolamo Savonarola was a Dominican friar who came to Florence in 1491 and who effectively ruled the city from 1494 to 1498 from the pulpits of San Marco and Santa Reparata. He was renowned for his oratorical ability, his endorsement of austerity, and his concomitant condemnation of excess and luxury. The effectiveness of his message can be seen in the stark difference between Botticelli’s Primavera and his later, post-Savonarolan Calumny of Apelles; or in the fact that Michelangelo felt compelled to toss his own easel paintings onto the so-called bonfires of the vanities. Savonarola’s influence in Florentine politics grew to immensity, and Pope Alexander VI would eventually excommunicate Savonarola after a lengthy dispute. As a result, Florence would hang and then burn Savonarola (with two others) at the stake, going so far as to toss his ashes in the Arno afterward so that no relics of him could be kept.

Machiavelli attended several of Savonarola’s sermons, which may be significant since he did not seem inclined otherwise to attend services regularly. There are interesting possible points of contact in terms of the content of these sermons, such as Savonarola’s understanding of Moses; Savonarola’s prediction of Charles VIII as a new Cyrus; and Savonarola’s use of the Biblical story of the flood.

In The Prince, Machiavelli discusses Savonarola by name only a single time, saying that he is an “unarmed prophet” who has been ruined because he does not have a way either to make believers remain firm or to make unbelievers believe (P 6). Machiavelli later acknowledges that Savonarola spoke the truth when he claimed that “our sins” were the cause of Charles VIII’s invasion of Italy, although he does not name him and in fact disagrees with Savonarola as to which sins are relevant (P 12; compare D 2.18). In the Discourses, Machiavelli is more expansive and explicit in his treatment of the friar. Savonarola convinces the Florentines, no naïve people, that he talks with God (D 1.11); helps to reorder Florence but loses reputation after he fails to uphold a law that he fiercely supported (D 1.45); foretells the coming of Charles VIII into Florence (D 1.56); and understands what Moses understands, which is that one must kill envious men who oppose one’s plans (D 3.30). Machiavelli conspicuously omits any explicit mention of Savonarola in the Florentine Histories.

It is also worth noting two other important references in Machiavelli’s corpus. The lengthiest discussion of Savonarola is Machiavelli’s 9 March 1498 letter to Ricciardo Becchi. Many commentators have read this letter as a straightforward condemnation of Savonarola’s hypocrisy, but some recent work has stressed the letter’s rhetorical nuances. To give only one example, Machiavelli discusses how Savonarola colors his “lies” (bugie). While it is true that Machiavelli does use bugie only in a negative context in the Discourses (D 1.14 and 3.6), it is difficult to maintain that Machiavelli is opposed to lying in any principled way.

Secondly, in his 17 May 1521 letter to Francesco Guicciardini, Machiavelli has been interpreted as inveighing against Savonarola’s hypocrisy. But, again, nuances and context may be important. Machiavelli does indeed implicate two other friars: Ponzo for insanity and Alberto for hypocrisy. But he simply calls Savonarola versuto, which means something like “crafty” or “versatile” and which is a quality that he never denounces elsewhere in his corpus.

g. The Bible and Its Traditions

To what extent the Bible influenced Machiavelli remains an important question. He laments that histories are no longer properly read or understood (D 1.pr); speaks of reading histories with judicious attention (sensatamente; D 1.23); and implies that the Bible is a history (D 2.5). Furthermore, he explicitly speaks of reading the Bible in this careful manner (again sensatamente; D 3.30)—the only time in The Prince or the Discourses that he mentions “the Bible” (la Bibbia). Recent work has explored what it might have meant for Machiavelli to read the Bible in this way. Additionally, recent work has explored the extent to which Machiavelli engaged with the Jewish, Christian, and Islamic traditions.

Machiavelli quotes from the Bible only once in his major works, referring to someone “. . . who filled the hungry with good things and sent the rich away empty” (D 1.26; Luke 1:53; compare I Samuel 2:5-7). The passage is from Mary’s Magnificat and refers to God. Machiavelli, however, uses the passage to refer to David.

David is one of two major Biblical figures in Machiavelli’s works. Elsewhere in the Discourses, Machiavelli attributes virtue to David and says that he was undoubtedly a man very excellent in arms, learning, and judgment (D 1.19). In a digression in The Prince, Machiavelli refers to David as “a figure of the Old Testament” (una figura del Testamento vecchio; P 13). Machiavelli offers a gloss of the story of David and Goliath which differs in numerous and substantive ways from the Biblical account (see I Samuel 17:32-40, 50-51).

Moses is the other major Biblical figure in Machiavelli’s works. He is mentioned at least five times in The Prince (P 6 [4x] and 26) and at least five times in the Discourses (D 1.1, 1.9, 2.8 [2x], and 3.30). Moses is the only one of the four most excellent men of Chapter 6 who is said to have a “teacher” (precettore; compare Achilles in P 18). In the Discourses, Moses is a lawgiver who is compelled to kill “infinite men” due to their envy and in order to push his laws and orders forward (D 3.30; see also Exodus 32:25-28).

Machiavelli sparsely treats the “ecclesiastical principality” (P 11) and the “Christian pontificate” (P 11 and 19). He calls Ferdinand of Aragon “the first king among the Christians” (P 21) and says that Cosimo Medici’s death is mourned by “all citizens and all the Christian princes” (FH 7.6).

Chapter 6 of The Prince is famous for its distinction between armed and unarmed prophets. In Chapter 26, Machiavelli refers to extraordinary occurrences “without example” (sanza essemplo): the opening of the sea, the escort by the cloud, the water from the stone, and the manna from heaven. It has long been noted that Machiavelli’s ordering of these events does not follow the order given in Exodus (14:21, 13:21, 17:6, and 16:4, respectively). However, recent work has noted that it does in fact follow exactly the order of Psalms 78:13-24.

Lastly, scholars have recently begun to examine Machiavelli’s connections to Islam. For example, some scholars believe that Machiavelli’s notion of a sect (setta) is imported from the Averroist vocabulary. Machiavelli speaks at least twice of the prophet Mohammed (FH 1.9 and 1.19), though conspicuously not when he discusses armed prophets (P 6). He discusses various Muslim princes, most importantly Saladin (FH 1.17), who is said to have virtue. Machiavelli compares the Pope with the Ottoman “Turk” and the Egyptian “Sultan” (P 19; compare P 11). He also compares “the Christian pontificate” with the Janissary and Mameluk regimes predominant under Sunni Islam (P 19; see also P 11). On occasion he refers to the Turks as “infidels” (infideli; e.g., P 13 and FH 1.17).

5. Contemporary Interpretations

The main aim of this article is to help readers find a foothold in the primary literature. A second, related aim is to help readers do so in the secondary literature.

In the spirit of bringing “common benefit to everyone” (D 1.pr), what follows is a rough outline of the scholarly landscape. It follows the practice of many recent Machiavelli scholars, for whom it is not uncommon, especially in English, to say that the views on Machiavelli can be divided into a handful of camps. Many of the differences between these camps appear to reduce to the question of how to fit The Prince and the Discourses together. Five camps are outlined below, although some scholars would of course put that number either higher or lower. Readers who are interested in understanding the warp and woof of the scholarship in greater detail are encouraged to consult the recent and more fine-grained accounts of Catherine Zuckert (2017), John T. Scott (2016), and Erica Benner (2013).

The first camp takes The Prince to be a satirical or ironic work. The 16th-century Italian jurist Alberico Gentili was one of the first interpreters to take up the position that The Prince is a satire on ruling. Rousseau and Spinoza in their own respective ways also seemed to hold this interpretation. Members of this camp typically argue that Machiavelli is a republican of one sort or another and place special emphasis upon his rhetoric. The most notable recent member of this camp is Erica Benner (2017a, 2017b, 2013, and 2009), who argues that The Prince is thoroughly ironic and that Machiavelli presents a shocking moral teaching in order to subvert it.

The second camp also places emphasis upon Machiavelli’s republicanism and thus sits in proximity to the first camp. However, members of this camp do not typically argue that The Prince is satirical or ironic. They do typically argue that The Prince presents a different teaching than does the Discourses; and that, as an earlier work, The Prince is not as comprehensive or mature a work as the Discourses. This camp also places special emphasis upon Machiavelli’s historical context. The most notable member of this camp is Quentin Skinner (2017, 2010, and 1978). J. G. A. Pocock (2010 and 1975), Hans Baron (1988 and 1966), and David Wootton (2016) could be reasonably placed in this camp. Maurizio Viroli (2016, 2014, 2010, 2000, and 1998) could also be reasonably placed here, though he puts additional emphasis on The Prince.

The third camp argues for the unity of Machiavelli’s teaching and furthermore argues that The Prince and the Discourses approach the truth from different directions. In other words, members of this camp typically claim that Machiavelli presents the same teaching or vision in each book but from different starting points. The most notable members of this camp are Isaiah Berlin (1981 [1958]), Sheldon Wolin (1960), and Benedetto Croce (1925).

The fourth camp also argues for the unity of Machiavelli’s teaching and thus sits in proximity to the third camp. However, members of this camp do not typically argue that The Prince and Discourses begin from different starting points. And while they typically argue for the overall coherence of Machiavelli’s corpus, they do not appear to hold a consensus regarding the status of Machiavelli’s republicanism. The most notable member of this camp is Leo Strauss (1958). Harvey C. Mansfield (2017, 2016, 1998, and 1979), Catherine Zuckert (2017 and 2016), John T. Scott (2016, 2011, and 1994), Vickie Sullivan (2006, 1996, and 1994), Nathan Tarcov (2015, 2014, 2013a, 2013b, 2007, 2006, 2003, 2000, and 1982), and Clifford Orwin (2016 and 1978) could be reasonably placed here.

The fifth camp is hermeneutically beholden to Hegel, which seems at first glance to be an anachronistic approach. But Hegel’s notion of dialectic was itself substantially beholden to Proclus’ commentary on the Parmenides—a work which was readily available to Machiavelli through Ficino’s translation and which was enormously influential on Renaissance Platonism in general. The most notable member of this camp is Claude Lefort (2012 [1972]). Miguel Vatter (2017, 2013, and 2000) could be reasonably placed here and additionally deserves mention for his familiarity with the secondary literature in Spanish (an unusual achievement for Machiavelli scholars who write in English). Additionally, interpreters who are indirectly beholden to Hegel’s dialectic, via Marx, could also be reasonably placed here. Miguel Abensour (2011 [2004]), Louis Althusser (1995), and Antonio Gramsci (1949) are examples.

6. References and Further Reading

Below are listed some of the more well-known works in the scholarship, as well as some that the author has found profitable but which are perhaps not as well-known. They are arranged as much as possible in accordance with the outline of this article. Given the article’s aim, the focus is almost exclusively upon works that are available in English. It goes without saying that there are many important books that are not mentioned.

Regarding Machiavelli’s life, there are many interesting and recent biographies. Some examples include Benner (2017a), Celenza (2015), Black (2013 and 2010), Atkinson (2010), Skinner (2010), Viroli (2010, 2000, and 1998), de Grazia (1989), and Ridolfi (1964). Vivanti (2013) offers an intellectual biography. Pesman (2010) captures Machiavelli’s work for the Florentine republic. Butters (2010), Cesati (1999), and Najemy (1982) discuss Machiavelli’s relationship with the Medici. Landon (2013) examines Machiavelli’s relationship with Lorenzo di Filippo Strozzi. Masters (1999 and 1998) examines Machiavelli’s relationship with Leonardo da Vinci.

For an understanding of Machiavelli’s overall position, Zuckert (2017) is the most recent and comprehensive account of Machiavelli’s corpus, especially with respect to his politics. Other good places to begin are Nederman (2009), Viroli (1998), Mansfield (2017, 2016, and 1998), Skinner (2017 and 1978), Prezzolini (1967), Voegelin (1951), and Foster (1941). Johnston, Urbinati, and Vergara (2017) and Fuller (2016) are recent, excellent collections. Lefort (2012) and Strauss (1958) are daunting and difficult but also well worth the attempt.

Skinner (2017), Benner (2009), and Mansfield (1998) discuss virtue. Spackman (2010) and Pitkin (1984) discuss fortune, particularly with respect to the image of fortune as a woman. Saxonhouse (2016), Tolman Clarke (2005), and Falco (2004) discuss Machiavelli’s understanding of women. Benner (2017b and 2009) and Cox (2010) treat Machiavelli’s ethics.

On religion, see Parsons (2016), Tarcov (2014), Palmer (2010a and 2010b), Lynch (2010), and Lukes (1984). Biasiori and Marcocci (2018) is a recent collection concerning Machiavelli and Islam. Nederman (1999) examines free will. Blanchard (1996) discusses sight and touch.

Rahe (2017) and Parel (1992) discuss Machiavelli’s understanding of humors. Regarding various other political themes, including republicanism, see McCormick (2011), Slade (2010), Barthas (2010), Rahe (2017, 2008, and 2005), Patapan (2006), Sullivan (2006 and 1996), Forde (1995 and 1992), Bock (1990), Hulliung (1983), Skinner (1978), and Pocock (1975).

Recent works concerning The Prince include Benner (2017b and 2013), Scott (2016), Parsons (2016), Viroli (2014), Vatter (2013), Rebhorn (2010 and 1998), M. Palmer (2001), and de Alvarez (1999). Tarcov’s essays (2015, 2014, 2013a, 2013b, 2007, 2006, 2003, 2000, and 1982) are especially fine-grained analyses. Connell (2013) discusses The Prince’s composition. On deception, see Dietz (1984) and Langton and Dietz (1987). On Cesare Borgia, see Orwin (2016) and Scott and Sullivan (1994).

Recent works concerning the Discourses include Duff (2011), Najemy (2010), Pocock (2010), Hörnqvist (2004), Vatter (2000), Coby (1999), and Sullivan (1996). Mansfield (1979) and Walker (1950) are the two notable commentaries.

Regarding the Art of War, see Hörnqvist (2010), Lynch (2010 and 2003), Lukes (2004), and Colish (1998).

Regarding the Florentine Histories, see McCormick (2017), Jurdjevic (2014), Lynch (2012), Cabrini (2010), and Mansfield (1998).

Regarding Machiavelli’s poetry and plays, see Ascoli and Capodivacca (2010), Martinez (2010), Kahn (2010 and 1994), Atkinson and Sices (2007 [1985]), Patapan (2003), Sullivan (2000), and Ascoli and Kahn (1993).

Anyone who wants to learn more about the intellectual context of the Italian Renaissance should begin with the many writings of Kristeller (e.g., 1979, 1965, and 1961), whose work is a model of scholarship. See also Hankins (2000), Cassirer (2010 [1963]), and Burke (1998).

Regarding humanist educational treatises, see Kallendorf (2008). Regarding Ficino, see the I Tatti series edited by James Hankins (especially 2015, 2012, 2008, and 2001). Hankins’ examination of the “myth” of the Platonic Academy in Florence is also worth mentioning (1991). Regarding Xenophon, see Nadon (2001) and Newell (1988). Regarding Lucretius, see A. Palmer (2014), Brown (2010a and 2010b), and Rahe (2008). Norbrook, Harrison, and Hardie (2016) is a recent collection concerning Lucretius’ influence upon early modernity. The most comprehensive recent treatment of Savonarola can be found in Jurdjevic (2014).

Much of Machiavelli’s important personal correspondence has been collected in Atkinson and Sices (1996). Najemy has examined Machiavelli’s correspondence with Vettori (1993).

Those interested in the Italian scholarship should begin with the seminal work of Sasso (1993, 1987, and 1967). Careful studies of Machiavelli’s word choice can be found in Chiappelli (1974, 1969, and 1952).

Lastly, Ruffo-Fiore (1990) has compiled an annotated bibliography of Machiavelli scholarship from 1935 to 1988.

a. Primary Sources

  • Machiavelli, Niccolò. The Art of War, ed. and trans. Christopher Lynch. Chicago: University of Chicago Press, 2003.
  • Machiavelli, Niccolò. L’Arte della guerra; scritti politici minori, ed. Jean-Jacques Marchand, Denis Fachard, and Giorgio Masi. Rome: Salerno Editrice, 2001.
  • Machiavelli, Niccolò. The Chief Works and Others. Three volumes, trans. Allan Gilbert. Durham: Duke University Press, 1999 [1958].
  • Machiavelli, Niccolò. Clizia, trans. Daniel T. Gallagher. Long Grove: Waveland Press, 1996.
  • Machiavelli, Niccolò. The Comedies of Machiavelli, ed. and trans. David Sices and James B. Atkinson. Indianapolis: Hackett, 2007 [1985].
  • Machiavelli, Niccolò. Discourses on Livy, trans. Harvey C. Mansfield and Nathan Tarcov. Chicago: University of Chicago Press, 1998 [1996].
  • Machiavelli, Niccolò. Discorsi sopra la prima deca di Tito Livio, ed. Giorgio Inglese. Milano: Bur Rizzoli, 1984. Digitized 2011.
  • Machiavelli, Niccolò. Florentine Histories, trans. Laura F. Banfield and Harvey C. Mansfield. Princeton: Princeton University Press, 1988.
  • Machiavelli, Niccolò. Machiavelli and Friends: Their Personal Correspondence, ed. and trans. James B. Atkinson and David Sices. DeKalb: Northern Illinois University Press, 1996.
  • Machiavelli, Niccolò. Mandragola, trans. Mera J. Flaumenhaft. Long Grove: Waveland Press, 1981.
  • Machiavelli, Niccolò. The Prince with Related Documents, trans. and ed. William J. Connell. Boston: Bedford / St. Martin’s Press, 2005.
  • Machiavelli, Niccolò. The Prince, second edition, trans. Harvey C. Mansfield. Chicago: University of Chicago Press, 1998.
  • Machiavelli, Niccolò. Il Principe, ed. Giorgio Inglese. Torino: Giulio Einaudi, 2013.
  • Machiavelli, Niccolò. Tutte le opere. Florence: Sansoni, 1971.

b. Secondary Sources

  • Abensour, Miguel. Democracy Against the State: Marx and the Machiavellian Moment. Cambridge: Polity Press, 2011 [2004].
  • Alberti, Leon Battista. On Painting. New Haven: Yale University Press, 1966 [1956].
  • Althusser, Louis. “Machiavel et nous.” In Écrits philosophiques et politiques, 42-168. Paris: Stock / IMEC, 1995.
  • Arendt, Hannah. The Human Condition, second edition. Chicago: University of Chicago Press, 1998 [1958].
  • Ascoli, Albert Russell, and Angela Matilde Capodivacca. “Machiavelli and Poetry.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 190-205. Cambridge: Cambridge University Press, 2010.
  • Ascoli, Albert Russell, and Victoria Kahn, eds. Machiavelli and the Discourse of Literature. Ithaca: Cornell University Press, 1993.
  • Atkinson, James B. “Niccolò Machiavelli: A Portrait.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 14-30. Cambridge: Cambridge University Press, 2010.
  • Baron, Hans. In Search of Florentine Civic Humanism. Princeton: Princeton University Press, 1988.
  • Baron, Hans. The Crisis of the Early Italian Renaissance. Princeton: Princeton University Press, 1966.
  • Barthas, Jérémie. “Machiavelli in political thought from the age of revolutions to the present.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 256-273. Cambridge: Cambridge University Press, 2010.
  • Benner, Erica. Be Like the Fox: Machiavelli’s Lifelong Quest for Freedom. New York: W.W. Norton & Company, 2017a.
  • Benner, Erica. “The Necessity to Be Not-Good: Machiavelli’s Two Realisms.” In Machiavelli on Liberty & Conflict, ed. David Johnston, Nadia Urbinati, and Camila Vergara, 164-185. Chicago: University of Chicago Press, 2017b.
  • Benner, Erica. Machiavelli’s Prince: A New Reading. Oxford: Oxford University Press, 2013.
  • Benner, Erica. Machiavelli’s Ethics. Princeton: Princeton University Press, 2009.
  • Berlin, Isaiah. “The Originality of Machiavelli.” In Against the Current: Essays in the History of Ideas, 25-79. Oxford: Oxford University Press, 1981 [1958].
  • Biasiori, Lucio, and Giuseppe Marcocci, eds. Machiavelli, Islam and the East: Reorienting the Foundations of Modern Political Thought. London: Palgrave Macmillan, 2018.
  • Blanchard, Kenneth C. “Being, Seeing, and Touching: Machiavelli’s Modification of Platonic Epistemology.” The Review of Metaphysics 49, no. 3 (1996): 577-607.
  • Black, Robert. Machiavelli. London: Routledge, 2013.
  • Black, Robert. “Machiavelli in the Chancery.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 31-47. Cambridge: Cambridge University Press, 2010.
  • Bock, Gisela, Quentin Skinner, and Maurizio Viroli, eds. Machiavelli and Republicanism. Cambridge: Cambridge University Press, 1990.
  • Brown, Alison. “Philosophy and Religion in Machiavelli.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 157-172. Cambridge: Cambridge University Press, 2010a.
  • Brown, Alison. The Return of Lucretius to Renaissance Florence. Cambridge: Harvard University Press, 2010b.
  • Burke, Peter. The European Renaissance: Centres and Peripheries. Oxford: Blackwell, 1998.
  • Butters, Humfrey. “Machiavelli and the Medici.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 64-79. Cambridge: Cambridge University Press, 2010.
  • Cabrini, Anna Maria. “Machiavelli’s Florentine Histories.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 128-143. Cambridge: Cambridge University Press, 2010.
  • Cassirer, Ernst. The Individual and the Cosmos in Renaissance Philosophy. Chicago: University of Chicago Press, 2010 [1963].
  • Celenza, Christopher S. Machiavelli: A Portrait. Cambridge: Harvard University Press, 2015.
  • Cesati, Franco. The Medici. Florence: Mandragora, 1999.
  • Chabod, Federico. Machiavelli and the Renaissance, trans. David Moore. London: Bowes and Bowes, 1960.
  • Chiappelli, Fredi. Machiavelli e La ‘Lingua Fiorentina.’ Bologna: Massimiliano Boni, 1974.
  • Chiappelli, Fredi. Nuovi Studi sul Linguaggio del Machiavelli. Florence: Le Monnier, 1969.
  • Chiappelli, Fredi. Studi sul Linguaggio del Machiavelli. Florence: Le Monnier, 1952.
  • Clarke, Michelle Tolman. “On the Woman Question in Machiavelli.” The Review of Politics 67, no. 2 (2005): 229-256.
  • Coby, Patrick. Machiavelli’s Romans. Lanham: Lexington Books, 1999.
  • Colish, Marcia L. “Machiavelli’s Art of War: A Reconsideration.” Renaissance Quarterly 51, no. 4 (1998): 1151-1168.
  • Connell, William J. “Dating The Prince: Beginnings and Endings.” The Review of Politics 75, no. 4 (2013): 497-514.
  • Cox, Virginia. “Rhetoric and Ethics in Machiavelli.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 173-189. Cambridge: Cambridge University Press, 2010.
  • Croce, Benedetto. Elementi di Politica. Bari: Laterza, 1925.
  • De Alvarez, Leo Paul. The Machiavellian Enterprise: A Commentary on The Prince. DeKalb: Northern Illinois University Press, 2008 [1999].
  • De Grazia, Sebastian. Machiavelli in Hell. Princeton: Princeton University Press, 1989.
  • Dietz, Mary. “Trapping the Prince: Machiavelli and the Politics of Deception.” The American Political Science Review 80, no. 3 (1986): 777-799.
  • Duff, Alexander S. “Republicanism and the Problem of Ambition: The Critique of Cicero in Machiavelli’s Discourses.” The Journal of Politics 73, no. 4 (2011): 980-992.
  • Falco, Maria J., ed. Feminist Interpretations of Machiavelli. University Park: Penn State University Press, 2004.
  • Ficino, Marsilio. On Dionysius the Areopagite Volume 1, ed. and trans. Michael J.B. Allen. Cambridge: Harvard University Press, 2015.
  • Ficino, Marsilio. Commentaries on Plato, Volume 2, Part 1, ed. and trans. Maude Vanhaelen. Cambridge: Harvard University Press, 2012.
  • Ficino, Marsilio. Commentaries on Plato, Volume 1, ed. and trans. Michael J. B. Allen. Cambridge: Harvard University Press, 2008.
  • Ficino, Marsilio. Platonic Theology, Volume 1, ed. James Hankins and William Bowen and trans. Michael J. B. Allen. Cambridge: Harvard University Press, 2001.
  • Forde, Steven. “International Realism and the Science of Politics: Thucydides, Machiavelli, and Neorealism.” International Studies Quarterly 39, no. 2 (1995): 141-160.
  • Forde, Steven. “Varieties of Realism: Thucydides and Machiavelli.” The Journal of Politics 54, no. 2 (1992): 372-393.
  • Foster, Michael. Masters of Political Thought, Volume 1: Plato to Machiavelli. Boston: Houghton Mifflin Company, 1941.
  • Fuller, Timothy, ed. Machiavelli’s Legacy: The Prince After Five Hundred Years. Philadelphia: University of Pennsylvania Press, 2015.
  • Gilbert, Allan H. Machiavelli’s Prince and Its Forerunners. Durham: Duke University Press, 1938.
  • Gilbert, Felix. Machiavelli and Guicciardini: Politics and History in Sixteenth-Century Florence. New York: W.W. Norton & Company, 1984.
  • Gilbert, Felix. History, Choice, and Commitment. Cambridge: The Belknap Press, 1977.
  • Gramsci, Antonio. Note sul Machiavelli, sulla politica e sullo stato moderno. Torino: Einaudi, 1949.
  • Hankins, James, ed. Renaissance Civic Humanism: Reappraisals and Reflections. Cambridge: Cambridge University Press, 2000.
  • Hankins, James. “The Myth of the Platonic Academy of Florence.” Renaissance Quarterly 44, no. 3 (1991): 429-475.
  • Hörnqvist, Mikael. “Machiavelli’s Military Project and the Art of War.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 112-127. Cambridge: Cambridge University Press, 2010.
  • Hörnqvist, Mikael. Machiavelli and Empire. Cambridge: Cambridge University Press, 2004.
  • Hulliung, Mark. Citizen Machiavelli. Princeton: Princeton University Press, 1983.
  • Jurdjevic, Mark. A Great and Wretched City: Promise and Failure in Machiavelli’s Florentine Political Thought. Cambridge: Harvard University Press, 2014.
  • Kahn, Victoria. “Machiavelli’s Afterlife and Reputation to the Eighteenth Century.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 239-255. Cambridge: Cambridge University Press, 2010.
  • Kahn, Victoria. Machiavellian Rhetoric: From the Counter-Reformation to Milton. Princeton: Princeton University Press, 1994.
  • Kallendorf, Craig W., ed. and trans. Humanist Educational Treatises. Cambridge: Harvard University Press, 2008 [2002].
  • Kristeller, Paul Oskar. Renaissance Thought and Its Sources, ed. Michael Mooney. New York: Columbia University Press, 1979.
  • Kristeller, Paul Oskar. Renaissance Thought II: Papers on Humanism and the Arts. New York: Harper and Row, 1965.
  • Kristeller, Paul Oskar. Renaissance Thought: The Classic, Scholastic, and Humanist Strains. New York: Harper and Row, 1961.
  • Landon, William J. Lorenzo di Filippo Strozzi and Niccolò Machiavelli. Toronto: University of Toronto Press, 2013.
  • Langton, John, and Mary Dietz. “Machiavelli’s Paradox: Trapping or Teaching the Prince.” The American Political Science Review 81, no. 4 (1987): 1277-1288.
  • Lukes, Timothy J. “Martialing Machiavelli: Reassessing the Military Reflections.” Journal of Politics 66, no. 4 (2004): 1089-1108.
  • Lukes, Timothy J. “Lionizing Machiavelli.” The American Political Science Review 95, no. 3 (2001): 561-575.
  • Lukes, Timothy J. “To Bamboozle With Goodness: The Political Advantages of Christianity in the Thought of Machiavelli.” Renaissance and Reformation 8, no. 4 (1984): 266-277.
  • Lynch, Christopher. “War and Foreign Affairs in Machiavelli’s Florentine Histories.” The Review of Politics 74, no. 1 (2012): 1-26.
  • Lynch, Christopher. “The Ordine Nuovo of Machiavelli’s Arte della Guerra: Reforming Ancient Matter.” History of Political Thought 31, no. 3 (2010): 407-425.
  • Lynch, Christopher. “Machiavelli on Reading the Bible Judiciously.” Hebraic Political Studies 1, no. 2 (2006): 162-185.
  • Lefort, Claude. Machiavelli in the Making, trans. Michael B. Smith. Evanston: Northwestern University Press, 2012.
  • Major, Rafael. “A New Argument for Morality: Machiavelli and the Ancients.” Political Research Quarterly 60, no. 2 (2007): 171-179.
  • Mansfield, Harvey C. “Machiavelli on Necessity.” In Machiavelli on Liberty & Conflict, ed. David Johnston, Nadia Urbinati, and Camila Vergara, 39-57. Chicago: University of Chicago Press, 2017.
  • Mansfield, Harvey C. “Machiavelli’s Enterprise.” In Machiavelli’s Legacy, ed. Timothy Fuller, 11-33. Philadelphia: University of Pennsylvania Press, 2016.
  • Mansfield, Harvey C. Machiavelli’s Virtue. Chicago: University of Chicago Press, 1998 [1996].
  • Mansfield, Harvey C. Machiavelli’s New Modes and Orders: A Study of the Discourses on Livy. Chicago: University of Chicago Press, 1979.
  • Martinez, Ronald L. “Comedian, Tragedian: Machiavelli and Traditions of Renaissance Theater.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 206-222. Cambridge: Cambridge University Press, 2010.
  • Masters, Roger D. Fortune is a River: Leonardo da Vinci and Niccolò Machiavelli’s Magnificent Dream to Change the Course of Florentine History. New York: Free Press, 1999.
  • Masters, Roger D. Machiavelli, Leonardo, and the Science of Power. Notre Dame: University of Notre Dame Press, 1998.
  • McCormick, John P. “On the Myth of a Conservative Turn in Machiavelli’s Florentine Histories.” In Machiavelli on Liberty & Conflict, ed. David Johnston, Nadia Urbinati, and Camila Vergara, 330-351. Chicago: University of Chicago Press, 2017.
  • McCormick, John P. Machiavellian Democracy. Cambridge: Cambridge University Press, 2011.
  • Nadon, Christopher. Xenophon’s Prince: Republic and Empire in the Cyropaedia. Berkeley: University of California Press, 2001.
  • Najemy, John M. “Society, Class, and State in Machiavelli’s Discourses on Livy.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 96-111. Cambridge: Cambridge University Press, 2010.
  • Najemy, John M. Between Friends: Discourses of Power and Desire in the Machiavelli-Vettori Letters of 1513-1515. Princeton: Princeton University Press, 1993.
  • Nederman, Cary J. Machiavelli: A Beginner’s Guide. London: Oneworld, 2009.
  • Nederman, Cary J. “Amazing Grace: Fortune, God, and Free Will in Machiavelli’s Thought.” Journal of the History of Ideas 60, no. 4 (1999): 617-638.
  • Newell, Waller R. Tyranny: A New Interpretation. Cambridge: Cambridge University Press, 2013.
  • Newell, Waller R. “Machiavelli and Xenophon on Princely Rule: A Double-Edged Encounter.” The Journal of Politics 50, no. 1 (1988): 108-130.
  • Norbrook, David, Stephen Harrison, and Philip Hardie, eds. Lucretius and the Early Modern. Oxford: Oxford University Press, 2016.
  • Orwin, Clifford. “The Riddle of Cesare Borgia and the Legacy of Machiavelli’s Prince.” In Machiavelli’s Legacy, ed. Timothy Fuller, 156-170. Philadelphia: University of Pennsylvania Press, 2016.
  • Orwin, Clifford. “Machiavelli’s Unchristian Charity.” The American Political Science Review 72, no. 4 (1978): 1217-1228.
  • Palmer, Ada. Reading Lucretius in the Renaissance. Cambridge: Harvard University Press, 2014.
  • Palmer, Michael. Masters and Slaves: Revisioned Essays in Political Philosophy. Lanham: Lexington Books, 2001.
  • Parel, Anthony J. The Machiavellian Cosmos. New Haven: Yale University Press, 1992.
  • Parsons, William B. Machiavelli’s Gospel: The Critique of Christianity in The Prince. Rochester: University of Rochester Press, 2016.
  • Patapan, Haig. Machiavelli in Love: The Modern Politics of Love and Fear. Lanham: Lexington Books, 2007.
  • Patapan, Haig. “I Capitoli: Machiavelli’s New Theogony.” The Review of Politics 65, no. 2 (2003): 185-207.
  • Pesman, Roslyn. “Machiavelli, Piero Soderini, and the Republic of 1494-1512.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 48-63. Cambridge: Cambridge University Press, 2010.
  • Pitkin, Hanna Fenichel. Fortune is a Woman: Gender and Politics in the Thought of Niccolò Machiavelli. Berkeley: University of California Press, 1984.
  • Pocock, J. G. A. “Machiavelli and Rome: The Republic as Ideal and as History.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 144-156. Cambridge: Cambridge University Press, 2010.
  • Pocock, J. G. A. The Machiavellian Moment: Florentine Political Thought and the Atlantic Republican Tradition. Princeton: Princeton University Press, 1975.
  • Prezzolini, Giuseppe. Machiavelli. New York: Farrar, Straus and Giroux, 1967.
  • Ruffo-Fiore, Silvia. Niccolò Machiavelli: An Annotated Bibliography of Modern Criticism and Scholarship. New York: Greenwood Press, 1990.
  • Rahe, Paul A. “Machiavelli and the Modern Tyrant.” In Machiavelli on Liberty & Conflict, ed. David Johnston, Nadia Urbinati, and Camila Vergara, 207-233. Chicago: University of Chicago Press, 2017.
  • Rahe, Paul A. Against Throne and Altar: Machiavelli and Political Theory under the English Republic. Cambridge: Cambridge University Press, 2008.
  • Rahe, Paul A., ed. Machiavelli’s Liberal Republican Legacy. Cambridge: Cambridge University Press, 2005.
  • Rebhorn, Wayne A. “Machiavelli’s Prince in the Epic Tradition.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 80-95. Cambridge: Cambridge University Press, 2010.
  • Rebhorn, Wayne A. Foxes and Lions: Machiavelli’s Confidence Men. Ithaca: Cornell University Press, 1988.
  • Ridolfi, Roberto. The Life of Niccolò Machiavelli, trans. Cecil Grayson. Chicago: University of Chicago Press, 1964.
  • Sasso, Gennaro. Niccolò Machiavelli. Bologna: Il Mulino, 1993.
  • Sasso, Gennaro. Machiavelli e gli antichi e altri saggi. Milan: Ricciardi, 1987.
  • Sasso, Gennaro. Studi su Machiavelli. Naples: Morano, 1967.
  • Savonarola, Girolamo. Apologetic Writings, ed. and trans. M. Michèle Mulchahey. Cambridge: Harvard University Press, 2015.
  • Savonarola, Girolamo. Trattato sul Governo di Firenze. Florence: Franco Cesati Editore, 2006.
  • Savonarola, Girolamo. Selected Writings of Girolamo Savonarola: Religion and Politics, 1490-1498, ed. and trans. Anne Borelli and Maria Pastore Passaro. New Haven: Yale University Press, 2006.
  • Savonarola, Girolamo. Prison Meditations on Psalms 51 and 31, ed. and trans. John Patrick Donnelly. Milwaukee: Marquette Press, 2011 [1994].
  • Savonarola, Girolamo. The Triumph of the Cross. London: Sands and Co., 1901.
  • Saxonhouse, Arlene W. “Machiavelli’s Women.” In Machiavelli’s Legacy, ed. Timothy Fuller, 70-86. Philadelphia: University of Pennsylvania Press, 2016.
  • Scott, John T. The Routledge Guidebook to Machiavelli’s The Prince. London: Routledge, 2016.
  • Scott, John T., and Vickie B. Sullivan. “Patricide and the Plot of The Prince: Cesare Borgia and Machiavelli’s Italy.” The American Political Science Review 88, no. 4 (1994): 887-900.
  • Skinner, Quentin. “Machiavelli and the Misunderstanding of Princely Virtù.” In Machiavelli on Liberty & Conflict, ed. David Johnston, Nadia Urbinati, and Camila Vergara, 139-163. Chicago: University of Chicago Press, 2017.
  • Skinner, Quentin. Machiavelli. New York: Sterling Publishing, 2010 [1981].
  • Skinner, Quentin. The Renaissance, vol. 1 of The Foundations of Modern Political Thought. Cambridge: Cambridge University Press, 1978.
  • Slade, Francis. “Two Versions of Political Philosophy: Teleology and the Conceptual Genesis of the Modern State.” In Natural Moral Law in Contemporary Society, ed. Holger Zaborowski, 235-263. Washington, D.C.: The Catholic University of America Press, 2010.
  • Spackman, Barbara. “Machiavelli and Gender.” In The Cambridge Companion to Machiavelli, ed. John M. Najemy, 223-238. Cambridge: Cambridge University Press, 2010.
  • Strauss, Leo. Thoughts on Machiavelli. Chicago: University of Chicago Press, 1978 [1958].
  • Sullivan, Vickie B. Machiavelli, Hobbes, and the Formation of a Liberal Republicanism in England. Cambridge: Cambridge University Press, 2006.
  • Sullivan, Vickie B., ed. The Comedy and Tragedy of Machiavelli. New Haven: Yale University Press, 2000.
  • Sullivan, Vickie B. Machiavelli’s Three Romes. DeKalb: Northern Illinois University Press, 1996.
  • Tarcov, Nathan. “Machiavelli’s Humanity.” In In Search of Humanity: Essays in Honor of Clifford Orwin, ed. Andrea Radasanu, 177-186. Lanham: Lexington Books, 2015.
  • Tarcov, Nathan. “Machiavelli’s Critique of Religion.” Social Research 81, no. 1 (2014): 193-216.
  • Tarcov, Nathan. “Machiavelli in The Prince: His Way of Life in Question.” In Political Philosophy Cross-Examined: Perennial Challenges to the Philosophic Life. Essays in Honor of Heinrich Meier, ed. Thomas L. Pangle and J. Harvey Lomax, 101-118. New York: Palgrave Macmillan, 2013a.
  • Tarcov, Nathan. “Belief and Opinion in Machiavelli’s Prince.” The Review of Politics 75, no. 4 (2013b): 573-586.
  • Tarcov, Nathan. “Freedom, Republics, and Peoples in Machiavelli’s Prince.” In Freedom and the Human Person, ed. Richard Velkley, 122-142. Washington, D.C.: Catholic University of America Press, 2007.
  • Tarcov, Nathan. “Law and Innovation in Machiavelli’s Prince.” In Enlightening Revolutions: Essays in Honor of Ralph Lerner, ed. Svetozar Minkov, 77-90. Lanham: Lexington Books, 2006.
  • Tarcov, Nathan. “Arms and Politics in Machiavelli’s Prince.” In Entre Kant et Kosovo: Études offertes … Pierre Hassner, ed. Anne-Marie Le Gloannec et Aleksander Smolar, 109-121. Paris: Presses de la Fondation Nationale des Sciences Politiques, 2003.
  • Tarcov, Nathan. “Machiavelli and the Foundations of Modernity: A Reading of Chapter 3 of The Prince.” In Educating the Prince: Essays in Honor of Harvey Mansfield, ed. Mark Blitz and William Kristol, 30-44. Lanham: Rowman and Littlefield, 2000.
  • Tarcov, Nathan. “Quentin Skinner’s Method and Machiavelli’s Prince.” Ethics 92, no. 4 (1982): 692-709.
  • Vatter, Miguel. “Machiavelli, Ancient Theology, and the Problem of Civil Religion.” In Machiavelli on Liberty & Conflict, ed. David Johnston, Nadia Urbinati, and Camila Vergara, 113-137. Chicago: University of Chicago Press, 2017.
  • Vatter, Miguel. Machiavelli’s The Prince. London: Bloomsbury, 2013.
  • Vatter, Miguel. Between Form and Event: Machiavelli’s Theory of Political Freedom. New York: Fordham University Press, 2014 [2000].
  • Viroli, Maurizio. “The Redeeming Prince.” In Machiavelli’s Legacy, ed. Timothy Fuller, 34-53. Philadelphia: University of Pennsylvania Press, 2016.
  • Viroli, Maurizio. Redeeming The Prince: The Meaning of Machiavelli’s Masterpiece. Princeton: Princeton University Press, 2014.
  • Viroli, Maurizio. Machiavelli’s God. Princeton: Princeton University Press, 2010.
  • Viroli, Maurizio. Niccolò’s Smile: A Biography of Machiavelli. New York: Farrar, Straus and Giroux, 2000.
  • Viroli, Maurizio. Machiavelli. New York: Oxford University Press, 1998.
  • Vivanti, Corrado. Machiavelli: An Intellectual Biography, trans. Simon MacMichael. Princeton: Princeton University Press, 2013.
  • Voegelin, Eric. “Machiavelli’s Prince: Background and Formation.” The Review of Politics 13, no. 2 (1951): 142-168.
  • Walker, Leslie J. The Discourses of Niccolò Machiavelli, two volumes. London, 1975 [1950].
  • Warner, John M., and John T. Scott. “Sin City: Augustine and Machiavelli’s Reordering of Rome.” The Journal of Politics 73, no. 3 (August 2011): 857-871.
  • Wolin, Sheldon. Politics and Vision. Princeton: Princeton University Press, 2004 [1960].
  • Wootton, David. “Machiavelli and the Business of Politics.” In Machiavelli’s Legacy, ed. Timothy Fuller, 87-104. Philadelphia: University of Pennsylvania Press, 2016.
  • Zuckert, Catherine. Machiavelli’s Politics. Chicago: University of Chicago Press, 2017.
  • Zuckert, Catherine. “Machiavelli’s Revolution in Thought.” In Machiavelli’s Legacy, ed. Timothy Fuller, 54-69. Philadelphia: University of Pennsylvania Press, 2016.

 

Author Information

Kevin Honeycutt
Email: honeycutt_ks@mercer.edu
Mercer University
U. S. A.

Catharine Trotter Cockburn (1679?—1749)

Catharine Trotter Cockburn was an active contributor to early modern philosophical discourse in England, especially regarding morality. Her philosophical production was primarily in defense of John Locke and Samuel Clarke. Nevertheless, her thinking was original and independent in many respects.

Cockburn’s moral philosophy combines elements of Locke’s epistemology with Clarke’s fitness theory, and its central axiom is that the true ground of morality consists in human nature. She argued that since all human beings are naturally provided with reason, moral obligation rests on the conformity of God’s command to our own reason. According to her anti-voluntarist moral view, the will of God does not lay the foundations of morality, but it gives morality the force of a law. Furthermore, Cockburn maintained that Man is naturally inclined towards sociability and is consequently morally obliged to contribute to the good and preservation of society. This is one of the most distinctive of Cockburn’s ideas, which departs from a strictly Lockean moral view.

Cockburn entertained a universal and anti-dogmatic idea of the Christian religion, founded on the essentials of human nature, namely reason and sociability. In her view, since there is no absolutely perfect communion, everyone can choose the one she or he judges to be the best. Churches should not waste time presuming to be infallible; rather, they should aim at satisfying their adherents by teaching those truths necessary for salvation. Thus, she converted to the Church of England from Catholicism.

Although mainly focused on morality, Cockburn also dealt with some metaphysical issues that often connect to it, particularly the nature of the soul and the reality of space. Regarding the former, she inquired whether the soul is material or spiritual, concluding that although it is probably immaterial, there is no evidence against either its immateriality or the possibility of its being thinking matter. Moreover, while she defended Locke’s position that only consciousness makes personal identity, Cockburn also gave an original mode-based interpretation of Locke’s view on personhood. As regards the reality of space, she rejected Edmund Law’s position against Clarke that space is only an abstract idea. On the contrary, she argued that space is a real being that can fill up the abyss between body and spirit since it partakes of the nature of both.

Table of Contents

  1. Life
  2. Moral Philosophy
    1. The True Grounds of Morality
    2. Moral Obligation
  3. Religion
  4. Metaphysical Issues
    1. The Nature of the Soul
    2. The Reality of Space
  5. Originality
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Catharine Trotter was born in London probably on August 16, 1679. This is the date provided by Thomas Birch (1705-1766), her official biographer and the editor of the collection of her posthumous Works (1751). However, her birthdate has been recently questioned by Anne Kelley, who found an entry of baptism for “Katherine Trotters, daughter of David Trotters, gentleman, and his wife Sarah” for August 29, 1674, in the Register of St Andrew, Holborn (Kelley 2002, 1). Catharine was the younger daughter of Captain David Trotter, commodore in the Royal Navy, and Mrs. Sarah Ballenden. According to the inscription on Catharine’s gravestone in the cemetery of Longhorsley, she died on May 11, 1749, “in the 70th year of her age.” This seems to confirm the date proposed by Birch as her most probable birthdate.

After her father’s death in 1683, King Charles II granted her family a pension that was barely sufficient for survival. Little is known of Catharine Trotter’s life until 1701. Birch reports that as a child, she taught herself French and received help in learning Latin, grammar, and logic. At the age of sixteen, she started her career as a playwright. From 1695 to 1706, she wrote and published five plays: Agnes de Castro (1695), The Fatal Friendship (1698), Love at a Loss (1700), The Unhappy Penitent (1701), and The Revolution of Sweden (1706), which were well received and repeatedly performed.

Between 1701 and 1703, her family moved to Salisbury, where Catharine found the favor of Elizabeth Burnet, the wife of the bishop Gilbert Burnet. There, she devoted herself to studying John Locke’s philosophy. In 1702 she anonymously published her first philosophical work, the Defence of Mr. Locke’s Essay of Human Understanding. It was written in response to the three anonymous Remarks upon an Essay Concerning Humane Understanding, published in London between 1697 and 1699, which had challenged John Locke’s epistemology and moral philosophy. The worth of her Defence was recognized by prominent philosophers of the time, including John Toland, Gottfried W. Leibniz, and John Locke himself. Locke was so impressed by “the strength and clearness” of Cockburn’s reasoning that once he came to know the authorship of the Defence, he sent her a sincere letter of appreciation and a number of books as gifts.

In 1707 she converted to Anglicanism from the Church of Rome, and on that occasion she published A Discourse Concerning a Guide in Controversies, in Two Letters: Written to One of the Church of Rome, by a Person Lately Converted from that Communion, explaining the main reasons for her choice. One year later Catharine married Patrick Cockburn, a clergyman, and they had three daughters, Mary, Catherine, and Grissel, and a son, John. From 1714 to 1726, their family experienced serious financial difficulties because of Patrick’s refusal to take the oath of abjuration against the pretender James Stuart. In this period Catharine devoted herself to her family and was totally diverted from her studies in philosophy. After Patrick eventually took the oath in 1726, he was appointed to the episcopal congregation of Aberdeen and their family’s condition rapidly improved. She then had the opportunity to pursue her intellectual interests, and later that same year, she wrote A Vindication of Mr. Locke’s Christian Principles, from the Injurious Imputations of Dr. Holdsworth. However, this essay remained unpublished until its inclusion in the 1751 edition of her Works.

In 1737, the Cockburns moved to Longhorsley, the last destination of Patrick’s career, and here they spent the final part of their lives. This was the most intense and prolific period of Catharine Cockburn’s life as a philosopher.

In 1739 she wrote her Remarks upon Some Writers in the Controversy Concerning the Foundation of Moral Duty and Moral Obligation, which was published in The History of the Works of the Learned in 1743. In 1747 she also published The Principles and Reasonings of Dr. Rutherforth’s Essay on the Nature and Obligation of Virtue, criticizing Thomas Rutherforth’s moral philosophy. Both works were written in defence of Samuel Clarke.

Catharine Cockburn also discussed her philosophical and religious positions in her correspondence with several people, especially Thomas Burnet of Kemnay; her son, John; her niece, Anne Arbuthnot; Thomas Sharp; and Edmund Law.

She was aware of the bias against women’s intellectual skills, and she lucidly resolved to publish all her philosophical writings anonymously for the sake of truth only. As she explained to Thomas Burnet of Kemnay, “a woman’s name would give a prejudice against a work of this nature; and truth and reason have less force, when the person, who defends them, is prejudged against” (Cockburn 1751, II: 155). Interestingly, towards the end of her life, Cockburn entrusted Thomas Birch with publishing a two-volume collection of her works. By that time she probably felt ready to stop hiding. She died in May 1749, only a few weeks after her husband’s death, and her Works was published posthumously in 1751.

2. Moral Philosophy

a. The True Grounds of Morality

Due to the style and structure of her philosophical writings, Catharine Cockburn’s thought is not presented systematically. In fact, since all her works were written in defence of someone else (either John Locke or Samuel Clarke), she was compelled to follow the reasoning of her adversaries. She addressed a number of philosophical issues, such as the nature and the immortality of the soul, thinking matter, the nature of substances, and the origin of evil, and made original and sophisticated contributions on such subjects. Moral philosophy nevertheless remained her primary concern.

Cockburn’s views on morality, which take form throughout her works, are a combination of Locke’s principles of knowledge and Clarke’s moral fitness theory, and also include elements from Cambridge Platonism and moral sense theory. She entertains an anthropocentric view of morality, defending the idea that human beings are naturally rational and social creatures. Accordingly, she argues that the true ground of morality is to be found neither in eternal moral truths nor in God’s command but in human nature itself.

In the first of her philosophical works, the Defence of Mr. Locke’s Essay (1702), Cockburn defends John Locke against the Remarks upon an Essay Concerning Humane Understanding, probably written by Thomas Burnet (1635-1715) between 1697 and 1699, although Burnet’s authorship has recently been questioned (Walmsley, Craig, and Burrows 2016). She summarizes the Remarker’s objections in three main points: the doctrine of natural conscience, which he opposes to Locke’s anti-innatism; his accusation of voluntarism against Locke; and his worries about the possibility of a material soul and thinking matter.

Adhering to Locke’s epistemology, Cockburn argues that we cannot have any idea not derived from sensation and reflection, and as a consequence we can find the true grounds of morality by following Locke’s principles of knowledge. As such, she believes that good and evil are not absolute principles imprinted in our minds by God from the beginning; instead, they are ideas formed in us by pleasure and pain. Contrary to this idea, the Remarker denies that Locke’s epistemology could provide “a sure foundation for morality” (Burnet (?) 1697a, 4) and instead holds that human beings are endowed with a “natural conscience.” This is a “natural sagacity” or an “instinct” which operates within us as “a principle of action” and directs our behaviour prior to reason (Burnet (?) 1699, 7-8). Rebuffing the Remarker’s objections, Catharine Cockburn maintains that no morality is possible independently of ratiocination since moral virtues such as justice, fidelity, and gratitude would be empty notions if taken with no relation to human beings. She points out that although Locke rejected metaphysical or moral truths originally imprinted in the mind, he never denied the existence of a power in the soul of perceiving and distinguishing between good and evil. She argues that even if this power is so immediate that it seems to precede any ratiocination, it is actually an effect of ratiocination itself. This power in the soul is what Cockburn calls “conscience,” which does not consist in an inward moral sense as argued by the author of the Remarks, but instead comes from sensation and reflection and is set to work through man’s first persuasions and confirmed by his habits. Conscience can be very useful in morality when one is rightly educated, but it can also be misleading when corrupted by vicious customs. Thus, Cockburn concludes that conscience, far from proving innate moral principles, must be taken neither for a moral law nor for the true foundation of morality.

Furthermore, in Cockburn’s moral philosophy, the grounds of morality do not rest in the original and absolute moral principles in God’s mind. More precisely, she does not deny the reality of such principles but instead argues that the perfect being and its moral attributes of goodness and justice are infinitely beyond our narrow capacities. Human beings can have an idea of God and his attributes only by reflecting upon themselves: “for whatever is the original standard of good and evil, it is plain, we have no notion of them but by their conformity, or repugnancy to our reason, and with relation to our nature” (I: 57-58). Interestingly, according to Cockburn we first have a notion of good, and then we know that God himself is good. Therefore, the nature of God neither provides sure foundation for morality, nor can be the rule of good and evil.

Instead, Cockburn adopts an anthropocentric view according to which the nature of man and the good of society are to us the reason and rule of moral good and evil. Rejecting the Remarker’s accusation of moral relativism against Locke’s anti-innatism, she particularly emphasizes that since reason and sociability are essential to human nature, they are the true and immutable grounds of morality. In fact, God has fitted everything to its proper end (which is happiness for Mankind), and accordingly he requires those things of us to which he has suited our nature. On this point, Cockburn explicitly refers to Grotius’ view that the law of nature is the product of human nature itself and hence she draws the conclusion that “it must subsist as long as human nature” (I: 58). In other words, as long as human beings are human beings, they can infallibly know the difference between good and evil by the light of reason and accordingly they can act suitably to their sociable nature. It is worth noting that Cockburn deliberately refrains from engaging in a metaphysical controversy with the Remarker on morality; indeed, this is not the main concern of her Defence. She rather aims at finding the epistemological and ontological foundation of morality from a human perspective. As we will see below, her later works show a stronger commitment to metaphysical and theological aspects of morality.

The anonymous Remarker also charges Locke with voluntarism. Obviously he does not use this term, which was coined only in the nineteenth century to define a moral theory according to which will takes priority over intellect, and as applied to divine action, holds that morality originates from the will of God. Historically, this view was attributed to Augustine of Hippo, Duns Scotus, William of Ockham, and in the early modern age, to Thomas Hobbes and Robert Boyle. The opposite approach is usually called “intellectualism,” which states that intellect takes precedence over will, and moral standards eternally exist in God’s intellect, determining his will and command. This view is usually ascribed to Thomas Aquinas in the Middle Ages and Cambridge Platonists in the seventeenth century.

The Remarker accuses Locke of grounding morality in the arbitrary will of God, enforcing it by a system of rewards and punishments. From the Remarker’s point of view, this supposition has dangerous consequences for morality: he points out that if the will of God were the original rule of good and evil, without any rule determining his will, there would not be any rule of sin to God either, and God himself would be “the author of sin” (Burnet (?) 1697b, 22).

Rejecting this accusation, Cockburn explains that Locke’s notions of “will of God” and “punishments and rewards” could give morality the force of law but were not meant to be its true foundation. This is a central point in her moral philosophy, which clarifies the role of God’s will and command and at the same time reaffirms the importance of human reason in morality. She maintains that, as with the case of good and evil, “we can only know the will of God by its conformity to our nature” (I: 62), and therefore his command would not have any effectiveness if it were not “knowable to us by the light of nature” (I: 61). However, in her first philosophical work, Cockburn does not provide further details concerning moral obligation, limiting herself to the claim that God’s command is not the source of obligation and that human beings are obliged to do what he commands by their own reason. She further develops her notion of moral obligation in her mature works, especially in the Remarks upon Some Writers (1743) and in the Remarks upon the Principles and Reasonings of Dr. Rutherforth’s Essay (1747).

b. Moral Obligation

Like her Defence of Locke, Cockburn’s later philosophical works pursue an apologetic purpose: to defend Samuel Clarke from the attacks of a number of critics. Nevertheless, these writings show her evident intellectual autonomy. She was particularly inspired by Clarke’s doctrine of moral fitness, according to which an agreement or disagreement of some things with others necessarily arises from different relations among different things. Clarke argued that there is an eternal universal fitness of things that precedes and determines both the will of God and the will of his creatures. In fact, since God is self-existent, absolutely independent, and all-powerful, he always does what he knows to be fittest to be done, and he therefore acts always according to the strictest rules of infinite goodness, justice, truth, and all other moral perfections. Thus, according to Clarke, virtue consists in the conformity of actions to the fitness of things.

Although Cockburn advocated Clarke’s view on fitness, there are strong indications that she was not directly influenced by it. In fact, Cockburn had already developed her own view by the time she wrote her Defence (1701/02), before Samuel Clarke introduced his fitness doctrine in the Boyle Lectures he gave in 1705: despite the difference in terminology, what is “suitable to human nature” for Cockburn seems to correspond to what is “fit” for Clarke (Bolton 1993, 575-586; Sheridan 2007, 147-148).

Cockburn’s Remarks upon Some Writers was mainly influenced by Edmund Law’s 1731 English translation of De Origine Mali by William King (1702). In commenting on Leibniz’s theory of the best of all possible worlds, Cockburn explains that God is perfectly free to choose which world to bring into actual existence, but although the creation of a particular system proceeds solely from a determination of the will of God, the relations and fitness of things in it are necessary, and his will must itself conform to that fitness. She entertains a partially intellectualist moral view, according to which God’s intellect seems to have a priority over his will, insofar as he perceives the eternity and inalterability of the necessary relations of all possible things. This is the reason why God could never want pain as suitable and pleasure as unsuitable for sensible beings, for it would be contrary to the system of relations of this world in which every living being aims at happiness.

In response to an accusation of inconsistency between this doctrine and the Lockean epistemological foundation of morality Cockburn presented in her Defence, a lengthy footnote was added to the 1751 edition of her Defence. The critic is unidentified, and it is still unclear whether the responding note was written by Cockburn herself or by Birch, especially because it refers to the author in the third person. However, it has been convincingly shown that it is quite faithful to Cockburn’s view (Bolton 1993, 570). In this footnote she explains that although the grounds of moral obligation have not been discussed in the Defence, she nonetheless explicitly rejects “the notion of founding morality on arbitrary will” and implicitly supposes “the nature of God, or the divine understanding, and the nature of man […] to be the true grounds of it” (I: 61). Interestingly, Cockburn here distinguishes between “real laws,” which “imply authority and sanctions,” and “the law of nature,” which “obliges us, not as dependent, but as reasonable beings” (I: 61). God himself, the Supreme Rational Being, “who is subject to no laws, and accountable to none,” is obliged to do always what is right and fit (I: 61-62). She reaffirms that God’s command and will, and rewards and punishments, are necessary to morality as they “only give it the force of a law,” but they are not the source of obligation (I: 61-62).

Cockburn’s view on obligation has been recently seen as a mark of her independence and originality. In fact, it seems to be something different from Locke’s view that moral obligation is grounded in a superior decree (Sheridan 2007, 145-46). However, it is worth noting that Locke mainly expressed this position in his Essays on the Laws of Nature, which remained unpublished until 1954, and it is unlikely that Cockburn read it. Nevertheless, her position undoubtedly differs from Locke’s.

In her Remarks upon Some Writers, Cockburn claims that all human beings have a moral sense, which operates in them before any sort of revelation. However, she explains that this moral sense, contrary to the thought of Scottish Enlightenment thinker Francis Hutcheson (1694-1746), is not an innate, blind instinct, but “a consciousness consequent upon the perceptions of the rational mind” (I: 407), and it can be cultivated and improved by the right use of our abilities. Although she allows that the faculty that distinguishes between right and wrong is probably innate “since it operates in some measure on all mankind” (I: 407), its exercise depends on custom and education. Such a moral sense also acknowledges that virtue consists in the law of human nature, and it accordingly approves virtuous actions and disapproves the contrary. Thus, the obligation that human moral sense perceives as a duty arises from the eternal fitness of things and does not depend on the will of God and the sanctions of his laws, but can only be enforced by them. In fact, Cockburn argues that since mankind is a system of creatures that continually need one another’s assistance, it is necessary that everyone contributes to the good and preservation of society according to her/his capacity. To this purpose, human beings are so far pushed towards virtue by their moral sense that all of them naturally feel the moral “obligation of living suitably to a rational and social nature” (I: 413). For Cockburn, it is plain that as a rational being should act suitably to reason and the nature of things, so a social being should promote the good of others: these ends are suitable to the nature of rational and social beings, and the contrary would be as absurd as preferring pain to pleasure.

Cockburn further explains this point in her Remarks upon the Principles and Reasonings of Dr. Rutherforth: as human beings, we are naturally inclined towards happiness by our self-love, which is “increased by our practice of moral good” and in turn “naturally incline us to continue in that practice” (II: 20). Thomas Rutherforth (1712-1771) notes that if the desire of our happiness (that is, our own interest) expands in proportion to our practice of virtue, it follows that the more we are virtuous, the more we grow selfish, and paradoxically, the practice of virtue will be “fatal to itself, by strengthening that self-love” (Rutherforth 1744, 65). Cockburn objects to Rutherforth that although a vicious misapplication of self-love is actually dangerous to virtue—for instance if it is applied solely to private interest and self alone—true self-love is not the same as selfishness. Instead, it is a disinterested benevolence which involves the happiness of others. Thus our virtue, by strengthening our self-love, is in return strengthened by it. An “undeniable instance” of such a disinterested benevolence is provided by the “natural affection of parents for their children” (II: 20).

However, human beings are imperfect creatures, and when exposed to irregular passions, they can deviate from the rule of their duty. Thus God, who foresees everything, decided to link their natural duty to his own will by declaring that he would eternally reward obedience or punish disobedience. But in doing so, he gave Men only a new motive to the performance of their duty but no new foundation for it. To summarize, although Cockburn allows that eternal moral truths, Revelation, God’s command, and his will all play an important role in morality, she argues that they do not provide a sure and true foundation, since it is only in their conformity to rational and sociable human nature that they are moral motives for human beings. Cockburn adopts a strongly anthropocentric view of morality, which combines Locke’s principles of knowledge and Clarke’s metaphysical instances.

3. Religion

According to Thomas Birch, her official biographer, Catharine Cockburn was born into a Protestant family and was therefore educated in the Anglican religion. Nevertheless, while she was very young, her intimacy with several unidentified Catholic families pushed her toward the Church of Rome, and she embraced that communion until 1707 when she converted back to the Church of England. Her conversion was probably inspired by her long acquaintance with Gilbert and Elizabeth Burnet during her stay in Salisbury. Nonetheless, it seems to be a coherent consequence of her intellectual and philosophical trajectory.

Cockburn’s view on religion was neither rigid nor enthusiastic: she was not a fierce follower of her communion, and at the same time, she was allergic to any blind faith in dogmas. On the contrary, she believed that the best religion was “the knowledge and practice of our duty” in agreement with God’s revelation (II: 157). She explains that since happiness is for human beings the primary and necessary motive of all their actions, and it consists in living suitably to their rational and sociable nature, it follows that “our duty” consists exactly in living suitably to our nature. Now, a true religion must necessarily aim at guiding men in the correct practice of their duty, and it must therefore be both reasonable and committed to politics. As regards reasonableness, Cockburn rejects the Remarker’s position that religion would be “better established on the nature of God” (I: 59), arguing that the nature and will of God can be seen as a strong foundation of religion only insofar as they conform to human reason. As regards politics, Cockburn maintains that since men have a natural inclination toward other human beings and their happiness, a true religion must take care of the good of government and society. As a matter of fact, she concludes that if a religion should be unpolitic and destructive to society, it would necessarily be false, since “nothing can be a law to nature, which of direct consequence would destroy nature” (I: 59).

It is worth noting that in Cockburn’s antidogmatic view on religion, reasonableness and sociability are assumed as indispensable criteria of truth, and consequently there is not only one possible true religion, but any communion that satisfies those criteria would be true. She believed that Christianity should be grounded on a single necessary article of faith, namely the divine nature of Jesus Christ. Thus, all distinctions among churches did not concern necessary elements to salvation but depended only on formal aspects of the worship: simply, she explains that in reading dark passages in the Holy Scriptures, men give different interpretations of unessential articles of faith. However, these interpretations have too often been defended with excessive zeal, which had made “the terms of communion straighter than God has made the terms of salvation” (I: 14), therefore causing massacres and persecutions among Christians. Cockburn ironically notes that “those who are most bigoted to a sect, or most rigid and precise in their forms and outward discipline, are most negligent of the moral duties, which certainly are the main end of religion” (II: 177). On the contrary, she believed that during the Reformation, there was “rather a separation of than from the church” (II: 135), and none of the resulting communions had the absolute authority to direct our faith. Otherwise, the Scripture would have given us incontestable directions to find the true faith. Accordingly, she argues that since there is no church in the world that is infallible and absolutely perfect in all points, everyone should follow that church that she/he is satisfied with, even if some of the unessential points in it seem unconvincing, unless they are proven to be dangerous for salvation. 
Moreover, she believed that such a choice was not irreversible, for all human beings have the right to read the Holy Scriptures directly, and they should also “have the liberty of judging for themselves” (I: 24) whether or not their church acts in agreement with the words of God. Similarly, she found the Church of Rome’s pretence of infallibility unacceptable, because it was not confirmed by textual evidence. Thus, she eventually decided to go back to the Church of England.

4. Metaphysical Issues

In her philosophical writings, Catharine Cockburn also deals with a variety of metaphysical issues, some of which are closely connected with her account of morality. Among others she was particularly concerned with the following two: (a) the nature of the soul and the related themes of its immateriality, immortality, and the possibility of thinking matter; and (b) the ontological reality of space.

a. The Nature of the Soul

In his Essay Concerning Human Understanding, John Locke argues against Descartes that the cogitative activity of the soul is not continuous, and “that the soul always thinks” is not a self-evident proposition and needs proof and the support of experience. However, experience itself clearly shows that the soul is sometimes absolutely without thought, for example, in a deep and dreamless sleep. According to Locke, thought is to the soul what motion is to the body, that is, one of its operations—maybe the most peculiar—but not its essence. Thus, as the body does not always move, so the soul does not necessarily always think. Locke also emphasizes that “our faculties cannot arrive at demonstrative certainty” of the immateriality of the soul, although it is highly probable. However, the highest degree of probability does not exclude the possibility of thinking matter, for “God may, if he pleases, give, or have given to some systems of matter a power to conceive and think” (Locke 1824, IV.3: §6).

The author of the Remarks upon an Essay Concerning Humane Understanding expresses serious worries over Locke’s view, insinuating that it could endanger the immortality of the soul, fostering materialism and atheism. Catharine Cockburn carefully examines the Remarker’s objections and replies point by point.

Firstly, she claims that since the supposition that “the soul always thinks” does not prove that it is immortal, the contrary supposition does not take away any proof of its immortality. Her reasoning proceeds as follows:

(1)   “The soul always thinks” (a) is not a necessary truth, as Locke had shown;

(2)   If (1), then the contrary proposition, that is, “the soul does not always think” (b), is at least possible;

(3)   Even if thinking were necessary for a soul to exist now—this is far from being demonstrated—this would neither prove that that soul has always existed, nor that it will always exist;

(4)   From (3), it follows that (a) does not provide sufficient evidence for the immortality of the soul;

(5)   From (2) and (4), it follows that (b) cannot be necessarily taken as an argument against the immortality of the soul.

Therefore, she concludes—rebuffing the objections by the Remarker—that Locke’s hypothesis that men do not think in sound sleep does not weaken the Christian doctrine of immortality.

Secondly, the Remarker is particularly afraid that if all our thoughts be extinct in sound sleep, the soul itself would be extinct as well, and we would have a new soul every morning, or in other words, we would be new men every day. According to the Remarker, this is extremely dangerous for the doctrine of Resurrection; for how could we be the same persons on Judgement Day if we are different men every day?

Cockburn insightfully tackles her adversary’s difficulty by stressing that as a body continues with its existence when any motion ceases, and it is always the same body when a new motion is produced, so the soul exists even during an unthinking sleep, and it is the same soul when it wakes up. She imputes to the Remarker a loose use of language, especially when he takes soul, man, and person to signify the same thing, ignoring that for Locke these terms have different meanings: man is understood as the union of soul and body, and person as self-consciousness. According to Locke, consciousness only makes the same person: in fact, despite all changes that a man’s body can suffer throughout his life, he continues to recognize himself as himself, inasmuch as he has consciousness of his past actions and thoughts. As his consciousness extends backwards, so his personal identity reaches. Cockburn was sure that this was sufficient to prove that Locke’s view on identity was consistent with Christian Revelation and that it did not imply any sort of Deism as the Remarker had insinuated.

Interestingly, in defending Locke’s position, she points out that personal identity consists “in the same consciousness, and not in the same substance: in fact, whatever substance there is, without consciousness there is no person,” and “wherever there are two distinct incommunicable consciousnesses, there are two distinct persons, though in the same substance” (I: 73). It is not clear whether Cockburn’s interpretation was entirely faithful to Locke’s view on identity, and in fact, commentators still disagree as to whether Locke entertained a substance-based or a mode-based theory of person. This is a long debate which can be traced back to Edmund Law (1703-1787), who was the first proponent of a mode reading of Locke’s view in his Defence of Mr. Locke’s Opinion Concerning Personal Identity (1769). However, although Cockburn could not enter into such a controversy since it was not yet in place when she died, it is evident that she gave a mode interpretation of Locke’s theory of personal identity over sixty years before Law (Gordon-Roth 2015, 71-72).

Thirdly, regarding the Remarker’s concerns with thinking matter, Cockburn notes that we have only an idea of the nature of the soul formed upon its operations, but we do not know whether the soul has essential properties distinct from matter, whereby it alone has the power of thinking. By echoing Locke’s agnosticism regarding substantial dualism, Cockburn emphasizes that we do not know whether there is an ontological and substantial difference between thinking and unthinking beings, and consequently, we cannot tell whether the substratum which supports thought is material or immaterial. Furthermore, she shows that the Remarker’s strategy of considering the immateriality of the soul as the main proof of its immortality has dangerous consequences for morality. In fact, she observes that human beings generally lack either leisure or capacity for metaphysical speculations, and if they believe that the soul is immortal, it is irrelevant whether they consider it immaterial or not. However, if we rest the proof of the immortality of the soul on its immateriality, it would be sufficient to weaken the proofs of the soul’s immateriality in order to debunk our belief in its immortality (Gordon-Roth 2015, 67-69).

Despite her trenchant criticism against the Remarker, it is evident that Cockburn did not necessarily disagree with him on the immateriality of the soul—and indeed, she never argued that the soul is corporeal. She adopted an astute three-point strategy: first, she emphasized that any claim about the nature of the soul must be demonstrated, since it is beyond the limit of our understanding; second, she showed that the hypothesis of the soul’s immateriality can have dangerous implications for morality; and third, she pushed the burden of proof to her adversary, who had to prove why his view was preferable to Locke’s (Thomas 2015, 258; Gordon-Roth 2015, 70).

b. The Reality of Space

In her Remarks upon Some Writers (1743), Catharine Cockburn examines, among other things, some of Edmund Law’s objections against Samuel Clarke concerning the nature of space.

Clarke argued that space necessarily exists because it is an entity that contains things and matter. Although it is not sensible, it cannot be nothing, since space has properties, while “nothing” does not: space has quantity and dimension; it is infinite, immutable, continuous, uncreated, and eternal. Nevertheless, Clarke had also defined space and time as divine properties or modes, which are not independent beings but depend on the only self-subsisting being, namely God.

Law objected that space is only an abstract idea in our mind that is formed by perceiving extended substances and abstracting from them the idea of space. He also rejected Clarke’s position that space is a real being because it has the property of containing bodies, since this makes no more sense than saying that darkness has qualities because it has the property of receiving light. According to Law, space is nothing or just an absence of extended bodies, and for this reason it does not have any properties.

Against Law’s view, Cockburn affirms the real existence of space. She argues that while “extension” is an abstract idea that can be predicated of both space and matter, “space” is not. Actually, space, matter, and extension are strictly connected notions, but space consists neither in matter nor in extension. Rather, she believed that the idea of space is “early obtruded upon the mind by senses, and unavoidable perceived by it” (I: 389), and accordingly, it precedes the idea of extension and does not depend upon our capability of abstracting. Moreover, we could neither conceive the real existence of bodies nor their motion without the idea of space, where they exist and move. Thus, we should admit or reject them altogether.

In the same writing, Cockburn also considers Isaac Watts’ view on space, according to which we do not know what class of beings space should be placed into. Watts had argued that space cannot be a mode of being (because its idea subsists independently of the existence of other beings), but it is not a substance either (because it is neither material nor spiritual), and it cannot be God (because while space is measurable, God is not at all). Thus, Watts concludes that space “must be nothing” (Watts 1733, 19). Cockburn objects that Cartesian substantial dualism, manifestly adopted by Watts, does not provide a necessarily adequate division of being, and she conversely observes “that there may be other substances than either spirits or bodies” (I: 390). Explicitly embracing the doctrine of the “great chain of beings” (I: 391), Cockburn holds that there is a gradual progression in the ontological structure of nature, by which the most imperfect beings are connected to those that are close to perfection. Since this hierarchical organization of beings must be full and continuous, “there should be in nature some being to fill up the vast chasm betwixt body and spirit, otherwise the gradation would fail, and the chain would seem to be broken. […] And why may not space be such a being” (I: 391)? Thus, she concludes that we cannot define space as nothing just because we do not know what it is. Otherwise we should come to the same conclusion about unextended substances whose reality we have no idea of.

Finally, assuming space to be a real being, Cockburn was inclined not to ascribe infinity to space. She considers two kinds of infinity: a positive infinity and a negative one. Positive infinity, as described by Clarke, is understood as a metaphysical infinity, namely an absolute perfection, to which nothing can be added. In this sense, infinite space can be identified with an attribute of God. Negative infinity—as Locke had explained—is something to which more can be endlessly added. Cockburn notes that such a notion of infinity can only be applied to general abstract ideas such as number, duration, and extension, but “it should not be ascribed to space by those who allow space to be a real particular being” (I: 401).

Cockburn’s substantival account of space has been recently seen as a further proof of her intellectual autonomy and philosophical originality: it offers a credible third way between Descartes’ view that space is a substance but with no divine properties, and Newton’s position that space has many properties, including all those usually attributed to God, but it is not a substance (Thomas 2013).

5. Originality

In the 21st century, Catharine Cockburn’s acumen has been recognized by commentators, and a growing body of literature shows that she held original philosophical positions, although some scholars do not consider them fully new (Nuovo 2011, 248-249).

As noted above, all her philosophical works were written in defense of either Locke or Clarke, and she was consequently forced in turn to follow the line of reasoning of their critics. However, her writings and private correspondence show that her intention was not merely to vindicate those eminent philosophers but rather to enter into the most lively controversies of that time and contribute to them.

In Cockburn’s philosophy, we have found at least four marks of originality and intellectual autonomy:

First, we have seen that she was selective about which ideas of Locke’s and Clarke’s to defend. Particularly, although her idea of fitness aligns with Clarke’s, there are strong reasons to believe that she had developed her own view independently (Bolton 1993).

Second, we have considered that her view of moral obligation was grounded in human reason and sociability, showing that it clearly differs from Locke’s view that obligation is constituted in superior decree (Sheridan 2007).

Third, Cockburn anticipated a strong debate concerning the interpretation of Locke’s theory of personal identity, proposing a mode reading over sixty years before Edmund Law, who has been generally seen as the first proponent of this interpretation (Gordon-Roth 2015).

Fourth, we have examined her hypothesis of the ontological reality of space, according to which space is a substance and has divine properties. In this doctrine, some commentators have seen an original alternative both to Descartes’ dualistic view and Newton’s non-substantial theory of space (Thomas 2013).

6. References and Further Reading

a. Primary Sources

  • Burnet, Thomas (?). 1697a. Remarks upon an Essay Concerning Humane Understanding: In a Letter Addres’d to the Author. London: Wotton.
  • Burnet, Thomas (?). 1697b. Second Remarks upon an Essay Concerning Humane Understanding: In a Letter Addres’d to the Author. London: Wotton.
  • Burnet, Thomas (?). 1699. Third Remarks upon an Essay Concerning Humane Understanding: In a Letter Addres’d to the Author. London: Wotton.
  • Clarke, Samuel. 1998. A Demonstration of the Being and Attributes of God. Edited by Ezio Vailati. Cambridge: Cambridge University Press.
  • Cockburn, Catharine (née Trotter). 1751. The Works of Mrs. Cockburn, Theological, Moral, Dramatic, and Poetical, Several of Them Now First Printed, Revised and Published with an Account of the Life of the Author by Thomas Birch. 2 vols., London: J. and P. Knapton.
  • King, William. 1731. An Essay on the Origin of Evil. Edited and Translated by Edmund Law. London: Thurlbourn.
  • Locke, John. 1824. “An Essay Concerning Human Understanding.” 4th edition. In The Works of John Locke in Nine Volumes, edited by John Locke. London: Rivington.
  • Rutherforth, Thomas. 1744. An Essay on the Nature and Obligations of Virtue. London: Thurlbourn.
  • Watts, Isaac. 1733. Philosophical Essays on Various Subjects, 2nd edition. London: R. Ford.

b. Secondary Sources

  • Bolton, Martha Brandt. 1993. “Some Aspects of the Philosophy of Catharine Trotter.” Journal of the History of Philosophy 31, no. 4: 565-88.
    • A classic in Cockburn scholarship, this is one of the first papers that consider Catharine Trotter Cockburn as an original and independent philosopher.
  • Broad, Jacqueline. 2002. Women Philosophers of the Seventeenth Century. Cambridge: Cambridge University Press.
    • A detailed analysis of the contribution of women to philosophy in early modern England. Broad explores the philosophical writings of five figures, including Margaret Cavendish, Anne Conway, Mary Astell, and Catharine Trotter Cockburn.
  • Connor, Margaret. 1995. “Catharine Trotter: An Unknown Child.” American Notes and Queries. Quarterly Journal of Short Articles 8, no. 4: 11-14.
    • This brief paper unconvincingly questions Cockburn’s birthdate.
  • De Tommaso, Emilio M. 2017a. “Il razionalismo etico di Catharine Trotter Cockburn.” Intersezioni XXXVII-1: 19-38.
    • A study of Cockburn’s moral philosophy presented as a sort of ethical rationalism.
  • De Tommaso, Emilio M. 2017b. “‘Some Reflections upon the True Grounds of Morality’—Catharine Trotter in Defence of John Locke.” Philosophy Study 7, no. 6: 326-339.
    • An analysis of Cockburn’s main arguments in favor of the compatibility between morality and Locke’s epistemology.
  • Duran, Jane. 2013. “Early English Empiricism and the Work of Catharine Trotter Cockburn.” Metaphilosophy 44: 485-94.
    • An examination of the empiricist legacy of Cockburn’s philosophy.
  • Gordon-Roth, Jessica. 2015. “Catharine Trotter Cockburn’s Defence of Locke.” The Monist 98: 64-76.
    • An excellent examination of some metaphysical issues in Cockburn’s Defence of Locke, including the immateriality and immortality of the soul.
  • Hutton, Sarah. 1998. “Cockburn, Catharine (1679-1749).” In Routledge Encyclopedia of Philosophy, edited by Edward Craig. London: Routledge. doi: 10.4324/9780415249126-DA017-1
  • Kelley, Anne. 2001. “‘In Search of Truths Sublime’: Reason and the Body in the Writings of Catharine Trotter.” Women’s Writing 8, no. 2: 235-50.
    • Argues that Cockburn’s project was to challenge the convention of women’s intellectual and moral inferiority, demanding their right to a public voice.
  • Kelley, Anne. 2002. Catharine Trotter: An Early Modern Writer in the Vanguard of Feminism. Aldershot: Ashgate.
    • A milestone in Cockburn scholarship covering both her literary and philosophical works.
  • Linker, Laura. 2010. “Catharine Trotter and the Humane Libertine.” Studies in English Literature 1500-1900 50, no. 3: 583-99.
    • An insightful examination of some libertine resonances in Cockburn’s comedy Love at Loss. The paper also focuses on women’s lack of power, especially after marriage.
  • Myers, Joanne E. 2012. “Catharine Trotter and the Claims of Conscience.” Tulsa Studies in Women’s Literature 31, no. 1/2: 53-75.
    • A comprehensive analysis of the role of religious themes in Cockburn’s writing.
  • Nuovo, Victor. 2011. Christianity, Antiquity, and Enlightenment: Interpretations of Locke. New York: Springer.
    • In this detailed collection of essays on the Christian philosophy of John Locke, Victor Nuovo devotes a chapter to Catharine Cockburn’s enlightenment.
  • O’Neill, Eileen. 2005. “Early Modern Women Philosophers and the History of Philosophy.” Hypatia 20, no. 3: 185-97.
    • This is an in-depth analysis of the reasons why early modern women philosophers disappeared from the history of philosophy by the twentieth century.
  • Ready, Kathryn J. 2002. “Damaris Cudworth Masham, Catharine Trotter Cockburn, and the Feminist Legacy of Locke’s Theory of Personal Identity.” Eighteenth-Century Studies 35, no. 4: 563-76.
    • This paper emphasizes the feminist implications of both Masham’s and Cockburn’s interpretations of Locke’s view on personhood.
  • Sheridan, Patricia. 2007. “Reflection, Nature, and Moral Law: The Extent of Catharine Cockburn’s Lockeanism in her Defence of Mr. Locke’s Essay.” Hypatia 22, no. 3: 133-51.
    • A thorough examination of Cockburn’s Defence. The author provides convincing proof of Cockburn’s originality and independence from Locke.
  • Sheridan, Patricia. 2011. “Catharine Trotter Cockburn.” In the Stanford Encyclopedia of Philosophy. http://plato.stanford.edu.
  • Sund, Elizabeth. 2013. “The Right to Resist: Women’s Citizenship in Catharine Trotter Cockburn’s The Revolution of Sweden.” In Political Ideas of Enlightenment Women: Virtue and Citizenship, edited by L. Curtis-Wendlandt, P. Gibbard, K. Green, 141-156. New York: Ashgate.
    • This essay focuses on Cockburn’s view on citizenship, exploring political and feminist concerns in her final play, The Revolution of Sweden.
  • Thomas, Emily. 2013. “Catharine Cockburn on Substantival Space.” History of Philosophy Quarterly 30, no. 3: 195-214.
    • Provides evidence that Cockburn’s account of substantival space was new and original.
  • Thomas, Emily. 2015. “Catharine Cockburn on Unthinking Immaterial Substance: Souls, Space, and Related Matters.” Philosophy Compass 10, no. 4: 255-63.
    • A careful examination of metaphysical themes in Cockburn’s philosophical works.
  • Waithe, Mary Ellen. 1991. “Catharine Trotter Cockburn.” In A History of Women Philosophers. Vol. III: Modern Women Philosophers, 1600-1900, edited by M.E. Waithe, 101-125. Dordrecht: Kluwer Academic Publishers.
    • One of the first works that includes Cockburn in the history of philosophy. It explores her moral philosophy and some metaphysical issues as presented in her Defence of Mr. Locke’s Essay.
  • Walmsley, J.C., Hugh Craig, and John Burrows. 2016. “The Authorship of the Remarks upon an Essay Concerning Humane Understanding.” Eighteenth-Century Thought 6: 205-43.
    • Argues that attribution to Richard Willis is more probable than the traditional attribution to Thomas Burnet.
  • Williams, Jane. 1861. “Catharine Cockburn.” In The Literary Women of England, edited by J. Williams, 170-188. London: Saunders, Oatley and Co.
    • In her broad study of the literary women of England, Jane Williams devotes some pages to Catharine Cockburn.

 

Author Information

Emilio Maria De Tommaso
Email: emdetommaso@unical.it
University of Calabria
Italy

Properties

A stone, a bag of sugar and a guinea pig all weigh one kilogram. A lily, a cloud and a sample of copper sulphate are white. A statue, a dance and a mathematical equation are beautiful. The fact that distinct particular things can be the same as each other and yet different has been the source of a great deal of philosophical discussion, and in contemporary philosophy we would usually say that what makes distinct particulars qualitatively the same as each other is that they have properties in common. The stone, the sugar and the guinea pig all instantiate the property of weighing one kilogram, while the lily, the cloud and the copper sulphate all instantiate the property of being white. The distribution of properties determines qualitative sameness and difference.

At this point, the consensus ends and a variety of philosophical questions arise about the nature of properties and their relationship to other entities and each other. Is the category of properties a fundamental one, or is the existence of properties determined by the existence of something else? Are some properties more fundamental than others? What is the relationship between properties and causation, and causal laws? What is the relationship between properties and meaning? Do properties determine what could and what could not happen? Do they determine which natural kinds there are? Do properties exist independent of the mind?

Table of Contents

  1. What Are Properties? Ontological Questions
    1. The Ontological Basis of Properties
    2. Nominalism versus Realism
  2. The Identity and Individuation of Properties
    1. Extensional Criteria
    2. A Revised Extensional Criterion: The Modal Criterion
    3. Hyperintensional Criteria
    4. Dualism about Properties and Concepts
    5. The Causal Criterion
    6. Quiddities
  3. Which Properties Are There?
    1. Families of Properties
    2. Maximalism versus Minimalism
  4. Problems with Instantiation
    1. The Instantiation Regress
    2. The Paradox of Self-Instantiation
  5. Categorical and Dispositional Properties
    1. Do Dispositional Properties Depend upon Categorical Ones?
    2. Dispositional Properties from Categorical Ones
    3. Dispositional versus Categorical Properties
    4. Explanatory Uses for Dispositional Properties in Metaphysics: Laws and Modality
    5. Problems with Pan-Dispositionalism
  6. Properties and Natural Kinds
  7. Different Types of Properties
    1. Intrinsic and Extrinsic Properties
    2. Accidental and Essential Properties
    3. Monadic and Polyadic Properties
    4. Determinable and Determinate Properties
    5. Qualitative and Non-Qualitative Properties
    6. Technical Terms for Property Types
  8. Realism about Properties: Do Properties Exist?
  9. Properties in the History of Philosophy
    1. Ancient Theories of Properties
    2. Medieval Theories of Properties
    3. Properties and Enlightenment Science
  10. References and Further Reading

1. What Are Properties? Ontological Questions

a. The Ontological Basis of Properties

Properties are also known as ‘attributes’, ‘characteristics’, ‘features’, ‘types’ and ‘qualities’. The question of whether properties are a fundamental category of entities or whether qualitative similarity and difference is determined by the existence of something else has been a feature of philosophical debates since ancient times. (See Section 9.)

In contemporary philosophy, there are four main accounts of the ontological basis of such entities: universals, tropes, natural classes and resemblance classes. The alternative to any of these accounts is to treat properties as ungrounded entities which require neither further explanation nor ontological grounding. To see how these accounts differ, let us consider three instances of being white: the lily, the cloud and the sample of copper sulphate. The universals theorist maintains that each of these instances of white is an instance of universal whiteness, an entity which is either transcendent, in that it exists whether or not it is ever instantiated, or immanent, in that it is wholly present in each of its instances. In the latter case, universals exist as part of the spatio-temporal world, whereas in the former they are abstract.

The trope theorist regards each instance of whiteness as an individual quality, not simply in the case of different types of white particulars such as the lily, the cloud and the copper sulphate, but also across particulars of the same type: the whiteness of each sample of copper sulphate is a distinct trope. Tropes are particular, unrepeatable entities, but this ontology of individual qualities must also have the resources to ground resemblance between tropes. The trope theorist wants to be able to say, for example, that the individual white tropes in a bunch of lilies resemble each other, but the nature of this resemblance is a matter of contention. Some theorists hold that trope similarity is primitive, a matter of unanalysable fact (Maurin 2002), while others maintain that tropes fall into resemblance classes or natural classes (Ehring 2011). Whatever the details of the formulation, it is crucial for a viable theory of properties that some such similarity between tropes obtains, because without it the ontology of tropes is one of bare particulars. In the latter case, the individual white tropes possessed by each lily would be no more similar nor different to each other than the red of the stoplight, the taste of the chocolate bar or the texture of the lizard, and that fails the very first demand of what we want a property theory to do. Similarity or resemblance between tropes is required alongside the mere existence of individual qualities themselves.

In the third and fourth accounts of qualitative similarity and difference, particulars are of the type they are by virtue of being members of sets of particulars: the lily, the cloud and the copper sulphate are all members of the set of white things, and it is in virtue of this that these particulars are white. If set membership is all that is required to be a property, then this view yields a super-abundant, over-populated ontology of properties: anything is a member of infinitely many sets with other things, but not all of these collections mark objective similarities. In order to deal with this over-population problem, the set-theoretic account of properties might add that some of this infinite collection of sets are more natural than others, making the account of properties one of natural classes of particulars (Lewis 1983a, 1986). The resemblance class theorist postulates a less abundant range of properties by maintaining that particulars belong to the classes they do because of primitive resemblance relations between them (Rodriguez-Pereyra 2002). Strictly speaking, however, although the natural and resemblance class theories give an account of qualitative similarity and difference, they may not all count as property theories; whether they do or not depends upon whether one opts to identify the classes of particulars with properties or not.
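The over-population worry can be made concrete with a small counting sketch. The Python fragment below is only an illustration under stated assumptions: the three-item domain and the function name `all_candidate_properties` are hypothetical, and treating every set of particulars as a property is exactly the unrestricted set-theoretic assumption under discussion, not a view the fragment endorses.

```python
from itertools import combinations

def all_candidate_properties(domain):
    """Return every non-empty subset of the domain as a frozenset.

    On the unrestricted set-theoretic view, each such subset would
    count as a property shared by its members.
    """
    items = sorted(domain)
    return [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]

# Hypothetical toy domain of three particulars.
domain = {"lily", "cloud", "copper sulphate"}
candidates = all_candidate_properties(domain)

# Three particulars already yield 2**3 - 1 = 7 candidate "properties".
print(len(candidates))
```

With a finite domain the count is finite, but it doubles with every added particular, which is why the natural class and resemblance class theorists look for a principled way of privileging only some of these sets.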

b. Nominalism versus Realism

A key factor which influences the decision about which ontological account of properties to accept is the question of whether general, repeatable or universal entities exist, or whether the entities which exist in the world are all particulars.

This debate is usually described as one between nominalism and realism, although care is needed here because these terms have other philosophical meanings as well. Within the discussion of properties, nominalism is taken to mean denying the existence of general or repeatable entities such as universals, in favour of an ontology of particulars; however, it is also used to mean ‘denying the existence of abstract objects’. These positions are independent of each other and, in the case of property theories, it is possible to be a nominalist in the sense of denying the existence of abstract objects while accepting the existence of universals (and, conversely, to deny the existence of universals while accepting abstract objects, as some resemblance nominalists do). For instance, David Armstrong’s account of properties as immanent universals is consistent with denying the existence of abstract objects while accepting the existence of repeatable, universal entities (Armstrong 1978a, 1978b). From now on, ‘nominalism’ is reserved for the denial that general, repeatable or universal entities exist.

Similarly, the term ‘realism’ is also ambiguous, this time within the study of properties: one might be a realist in the sense of being a realist about universals or repeatable entities; or, more broadly, one might be a realist about the existence of properties. This section considers realism in the former sense and postpones discussion about the existence of properties until Section 8.

In the context of theories of properties, we can distinguish realism, which accepts the existence of universals (either immanent or abstract) or which treats properties as a fundamental category of entities, from two versions of nominalism. The first, moderate nominalism, accepts that individual qualities or properties exist in the form of tropes, while the view which is sometimes described as extreme nominalism denies the existence of any fine-grained qualities or property-like entities at all. The appearance of objective similarity and difference in nature must, for the extreme nominalist, be accounted for in terms of sets of concrete particulars (where set membership is not, on pain of circularity, determined by the properties which the particulars have) or in virtue of the particulars falling under a certain concept or a certain predicate applying to them. The former is known as set or class nominalism if no further account is given of why particulars belong to the classes which they do, although some sets may be considered to be more natural than others (see 3b); however, some proponents of this set-theoretic version of extreme nominalism maintain that particulars belong to the classes which they do in virtue of the particulars resembling each other (Rodriguez-Pereyra 2002).

Alternative versions of extreme nominalism refuse to give any reductive account of why distinct particulars are qualitatively similar to each other, dismissing this phenomenon (which gives rise to the debate between nominalists and realists in the first place) as not needing explanation. In this view, which is associated with Quine (1948), the One Over Many Problem is not a genuine philosophical problem: we can give an account of why ‘b is F’ and ‘c is F’ are true in terms of the particulars b and c existing and the predicate F applying to them. We do not require anything more than this semantic theory of predication, according to this version of extreme nominalism; and so not only do we not need to postulate universals, we do not need to postulate an alternative ontological category of particulars such as tropes, nor to give a reductive account of properties in terms of predicates or concepts of the kind which other extreme nominalists might support. This denial of the problem is disparagingly called ‘Ostrich Nominalism’ by Armstrong (1978a, 16) because of the ostrich’s habit of putting its head in the sand in the face of danger, but Quine’s view is defended from this charge by Devitt (1980). (See also Armstrong’s response to Devitt, 1980.)

The extreme nominalist position is usually motivated by suspicion about the ontological nature of universals since these must either be abstract objects, with the particulars which have them participating in or instantiating these abstract entities, or immanent universals which are wholly present at each instantiation. In both cases, one might be concerned that we do not have an account of the relationship between particulars and the universals which they instantiate: that is, what instantiation is. Moreover, if instantiation is itself a relation, its existence may lead to an infinite regress (see Section 4a). One might also be concerned about whether we can understand how immanent universals can be wholly present at many locations at once. In the apparent absence of strict criteria of identity or individuation for universals, which might shed light upon what being a universal amounts to, the extreme nominalist suggests that we should avoid ontological commitment to such entities on the grounds that they are ontologically mysterious (Devitt 1980).

On the other hand, the realist about universals complains that the extreme nominalist’s view is unexplanatory or that she has the direction of explanation the wrong way around. For instance, the extreme nominalist who accounts for qualitative similarity in terms of predicates (sometimes called a ‘predicate nominalist’) explains that distinct particulars are red because the predicate ‘is red’ applies to them; but, the realist urges, the more coherent explanation is that the predicate ‘is red’ applies to the particulars because each of the particulars has the property of being red. In short, it is more coherent to explain why predicates apply to particulars in terms of the properties which they have, rather than the other way around. The same criticism would apply to other forms of extreme nominalism which characterise qualitative similarity between particulars as being a matter of their belonging to the same set or their being subsumed under the same concept. According to Armstrong, the extreme nominalist is either ‘failing to answer a compulsory question in the examination paper’ (1978a, 17) by rejecting the One Over Many Problem, or is getting the answer to that question wrong.

The moderate nominalists, who attempt to occupy the middle position between the realists and extreme nominalists, accept that there is a fine-grained ontological category of qualitative entities, but they insist that these are particular qualities rather than general, repeatable or universal entities. The initial complaint from the realist about these moderate forms of nominalism, such as trope theory, is that if tropes are individual qualities with no relations of similarity or difference between them, then they are each as unlike each other as they are alike and so they fail to satisfy the primary desideratum of a theory of properties because we still have no account of what qualitative similarity is. However, it is crucial to note that this criticism is only effective against naïve accounts of trope theory. As was noted above, more sophisticated forms of trope theory remedy this difficulty by giving an account of similarity between tropes, either by postulating primitive resemblance relations between tropes or by postulating versions of class or resemblance nominalism where tropes are the members of natural or resemblance classes, rather than particulars. Thus, such trope theorists cannot be charged with failing to provide a coherent ontological basis for qualitative similarity.

Despite this, however, the dispute between realists and moderate nominalists lingers on, with the former claiming to have the simpler ontology in comparison with trope theory, and accusing the versions of trope theory which treat resemblance between tropes as primitive of accepting too much as unanalysable brute fact. The trope theorists counter by repeating their complaints about the mysteriousness of universals, and as yet there is no clear winner in this debate. Even Armstrong (1992), who was committed to grounding similarity in immanent universals, admits that trope theory has comparable explanatory power to his favoured universals theory.

It would be easy to spend the remainder of this article evaluating these alternative accounts of the ontological basis of properties and the respective benefits of realism or nominalism. However, since each of the theories covered by both realism and moderate nominalism provides a workable property theory which gives an account of qualitative similarity and difference, this project would be superfluous to current requirements. Moreover, although each of these views has its committed proponents, some philosophers have suggested that a principled decision between the options is one which cannot be made in isolation from other, broader philosophical commitments such as those concerning the nature of modality or the existence of abstract objects (Allen 2016), or, if not, then it is a choice which is not of great philosophical significance (Hirsch 1993). With these additional difficulties in mind, the question of whether nominalism or realism is preferable, and the more specific matter concerning which nominalist or realist theory is the best, will not be pursued further.

2. The Identity and Individuation of Properties

It is at least useful—or, some philosophers would argue, imperative (Frege 1884, Quine 1948)—for there to be an account of identity and individuation for each category of entities. If we do not have an account of what determines whether an entity E is exactly the same entity as a member F of the same ontological category as E, or what makes E and F distinct from each other, we do not have a clear conception of what kinds of entities E and F are. To put the point simply: what determines that E = F, or what individuates E from F? The identity and individuation criteria required are constitutive, rather than epistemic, so we need not know (nor even be able to know) whether one property is the same as another in every particular case; it is the question of what makes it the case that one property is the same as another which is at issue.

This requirement for identity and individuation criteria for each category is a general one in metaphysics—applying equally to other categories such as sets, objects and persons—but it is one which has proved problematic in the case of properties because it is a difficult requirement for the property theorist to satisfy. Thus, those who treat the provision of identity criteria as mandatory for a category of entities to be legitimate go as far as rejecting the objective existence of properties, qualities, attributes and such in favour of versions of nominalism which rely on predicates or sets of concrete individuals instead (see Section 1b).

a. Extensional Criteria

The initial problem is that properties cannot be identified by their spatio-temporal location alone (as we might do with particular objects) because many distinct properties can be co-located. Nor do properties satisfy extensional identity criteria like sets do; that is, a property cannot be identified by the set of individuals which instantiates it, at least if we just take actual individuals into account. Purely by accident, all individuals with a property P might also have property Q and so the set of all P individuals will be identical with the set of all Q individuals. If we accept a set-theoretic extensional account of property identity, then P = Q. For example, we can imagine a world in which everything which has the mass of exactly one gram is also a sphere, and that nothing else in that world is a sphere. In such a world, being a sphere = having mass 1g because the set of individuals which instantiates being a sphere is the same set as that which instantiates having mass 1g, since sets are identified by the elements they contain. But it is utterly counterintuitive to identify these properties: it seems possible that something which is not a sphere could have a mass of 1g, or that a sphere could have a mass other than 1g. This is known as the problem of accidental coextension.
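The problem of accidental coextension can be illustrated with a toy model. The following Python sketch rests on assumptions made purely for illustration: the three-object world, the field names and the helper `extension` are hypothetical, and identifying a property with its extension is the set-theoretic criterion being tested rather than an accepted account.

```python
# Hypothetical toy world: every object with mass 1 g happens to be a sphere,
# and nothing else in this world is a sphere.
world = [
    {"name": "a", "shape": "sphere", "mass_g": 1},
    {"name": "b", "shape": "sphere", "mass_g": 1},
    {"name": "c", "shape": "cube", "mass_g": 5},
]

def extension(objects, predicate):
    """Return the set of names of the objects that satisfy the predicate."""
    return frozenset(o["name"] for o in objects if predicate(o))

def is_sphere(o):
    return o["shape"] == "sphere"

def has_mass_1g(o):
    return o["mass_g"] == 1

spheres = extension(world, is_sphere)
one_gram = extension(world, has_mass_1g)

# The two extensions are the very same set, so an extensional criterion
# restricted to actual individuals would identify the two properties.
print(spheres == one_gram)  # True
```

The predicates `is_sphere` and `has_mass_1g` test different features of an object, yet the criterion cannot tell them apart in this world, which is the counterintuitive result described above.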

With the obvious candidates rejected, the search for identity criteria for properties must look elsewhere. Part of the difficulty with how to proceed at this point arises because we need at least a rough picture of how many properties there are in order to ascertain whether a proposed criterion matches our intuitions about properties or not. The question of the number of properties which there are might, in turn, be affected by what one thinks that properties do: are properties causal entities, such as causes and effects, or entities which determine natural laws or regularities in nature? Are they semantic values; that is, do they determine what the predicates of our language mean? Or, are they something else besides?

Some of these options will be discussed below, but for now it is enough to note that the interconnections between these issues make it difficult to give a unique and plausible account of property identity in the abstract. Nevertheless, there are some viable candidates for such a criterion.

b. A Revised Extensional Criterion: The Modal Criterion

First, one could take seriously the intuition that the set-theoretic account of property identity, which was rejected above on the grounds of accidental coextension, might be acceptable if we considered all the possible individuals which instantiate a property, rather than just all the actual individuals which instantiate it. The problem with accidental coextension is that the same set of individuals happen to instantiate apparently distinct properties P and Q, although it seems plausible to think that an individual could exist which instantiated P without instantiating Q. But that problem will be alleviated if we include such possible individuals in the set in the first place. However, in order to do this, possible individuals must exist in the same sense as actual ones and so, following David Lewis, we must accept that modal realism is true (Lewis 1986). If we do, there is a constitutive, modal criterion of property identity based on the necessary coextension of identical properties; equivalently, for the modal realist, properties are identical if they are instantiated by the same set of possible and actual individuals.
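A rough sketch of how the modal criterion separates accidentally coextensive properties is given below. It is a simplification under stated assumptions: the two-world model, the world labels and the helper names are hypothetical, and individuals are represented as world-bound (world, name) pairs merely as a convenient way of mimicking the modal realist picture.

```python
# Hypothetical model: each world label maps to the individuals existing there.
worlds = {
    "actual": [
        {"name": "a", "shape": "sphere", "mass_g": 1},
        {"name": "c", "shape": "cube", "mass_g": 5},
    ],
    "w1": [
        {"name": "d", "shape": "cube", "mass_g": 1},  # a 1 g non-sphere
    ],
}

def is_sphere(o):
    return o["shape"] == "sphere"

def has_mass_1g(o):
    return o["mass_g"] == 1

def extension_in(world_name, predicate):
    """Extension of the predicate within a single world."""
    return frozenset(o["name"] for o in worlds[world_name] if predicate(o))

def modal_extension(predicate):
    """Extension across all worlds, as world-bound (world, name) pairs."""
    return frozenset(
        (w, o["name"])
        for w, objects in worlds.items()
        for o in objects
        if predicate(o)
    )

coextensive_actually = extension_in("actual", is_sphere) == extension_in("actual", has_mass_1g)
coextensive_necessarily = modal_extension(is_sphere) == modal_extension(has_mass_1g)

print(coextensive_actually)     # True
print(coextensive_necessarily)  # False
```

Once the merely possible 1 g cube is admitted into the domain, the two sets differ, so the modal criterion no longer identifies being a sphere with having mass 1g.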

One might object that Lewis’s modal criterion does not individuate properties finely enough, however. For instance, some distinct properties appear to be necessarily coextensive in his view: being a triangle and being a closed three-sided shape are instantiated by all the same actual and possible individuals but, one might argue, they are not the same property and so we do not want to identify them as Lewis’s criterion would do. At this point, the supporter of the modal criterion has a choice of two responses. First, he might deny the objector’s intuition that being a triangle and being a closed three-sided shape are distinct properties. Second, he might question the example in another way by arguing that such properties are not coextensive anyway, either because they are instantiated by distinct individuals or else because they are relations between different parts of the same individuals. Being a triangle and being a closed three-sided shape involve angles and sides respectively, regardless of whether broadly speaking they are instantiated by the same individual things (Rodriguez-Pereyra 2002, 100). However, a consequence of this move is that we cannot rely upon our intuitions about whether a property is monadic or polyadic (see 7c for more on this distinction).

Alternatively, if one decides to identify necessarily coextensive properties to preserve the modal criterion, there are also difficulties. First, it seems plausible that someone might have contradictory beliefs about a property: Sam believes that he has drawn a triangle, but Sam does not believe that he has drawn a closed three-sided shape. If we want properties to ground the distinction between these beliefs, or between propositional attitudes in general, then there will have to be a finer-grained distinction between properties. This matter is particularly pressing if one hopes for a property theory which helps to account for meaning or representation.

Secondly, the modal criterion identifies all indiscriminately necessary properties (properties which trivially apply to everything; see 7f), since these too are necessarily coextensive. Properties such as being such that the number thirty-seven exists, being such that 2 + 2 = 4, and being dancing or not dancing apply to every possible individual and so all turn out to be identical with each other. One might regard this as an advantage on the basis that indiscriminately necessary properties are a dubious family of properties, although there do seem to be cases in which we are intuitively prone to distinguish them, such as when Sam believes that he is such that 2 + 2 = 4, but Sam does not believe that he is such that Fermat’s last theorem is true. If properties directly determine mental content, Sam cannot have both a true and a false belief about the same property.
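The collapse of indiscriminately necessary properties can be seen in a minimal sketch. It assumes that a property is identified with its extension over a small hypothetical domain of actual and possible individuals; the domain members, the helper `extension`, and the Python stand-ins for the two predicates are all illustrative assumptions.

```python
# Toy model (assumption): a property is identified with its extension over a
# small, hypothetical domain of actual and possible individuals.

DOMAIN = frozenset({"sam", "fluffy", "possible_dragon"})


def extension(predicate) -> frozenset:
    """Collect every individual in DOMAIN that satisfies the predicate."""
    return frozenset(x for x in DOMAIN if predicate(x))


# Two predicates that hold trivially of every individual. The second is only
# a crude stand-in for "being such that the number thirty-seven exists".
such_that_two_plus_two_is_four = extension(lambda x: 2 + 2 == 4)
such_that_37_exists = extension(lambda x: isinstance(37, int))

# On the modal criterion both just are the whole domain, hence one property.
print(such_that_two_plus_two_is_four == such_that_37_exists)  # True
```

Because each trivial predicate picks out the whole domain, a purely extensional comparison has no resources left to tell them apart, which is the worry raised in the text.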

c. Hyperintensional Criteria

In order to deal with these problems, we seem to require a finer-grained, hyperintensional criterion of property identity that can distinguish between properties which are necessarily coextensive. There is not much consensus about what the basis of such a criterion would be: one might think that properties are individuated linguistically or formally, so the property of being triangular and red would be distinct from being red and triangular. Perhaps this individuates properties too finely, at least for many of the roles we have presumed that properties play. Alternative hyperintensional accounts identify properties with objectively existing concepts (Bealer 1982) or with abstract objects (Zalta 1983, 1988). Alternatively, one might turn to the quiddistic criterion of property identity discussed below.

d. Dualism about Properties and Concepts

The main problems for the modal criterion seem to arise when we are trying to employ properties to give an account of mental representation, or to capture differences between someone’s psychological states. If this is the case, one might argue that we could supplement the ontology of properties (identified and individuated according to the possible and actual individuals which instantiate them) with a finer-grained ontology of concepts or linguistic entities. Properties could be coarser grained, perhaps identified and individuated according to the modal criterion, while predicates or concepts could be employed in the explanation of psychological states. (Bealer 1982. See Nolan 2014 for criticism of this strategy.)

e. The Causal Criterion

An alternative, and potentially much more coarse-grained, account of property identity is proposed by Shoemaker (1980) who suggests that properties can be identified and individuated in virtue of their causal roles. Thus, property P is identical with property Q if and only if P and Q have all the same causes and effects. Such a criterion exploits the fact that properties are causally related to each other and, furthermore, many properties appear to enter into these causal relations essentially: having mass of 1 kg is having whatever it is that requires 1 N of force to accelerate at 1 m/s² in a frictionless environment, and which will create 9 × 10¹⁶ joules of energy when the 1 kg mass is destroyed. Because the causal relations in question are usually general causal relations, versions of this criterion are sometimes characterised as identifying and individuating properties in terms of their nomological or nomic role: that is, the role which the respective properties play in laws of nature, whether causal or structural (Swoyer 1982; Kistler 2002). The causal and nomological role criteria are sometimes grouped together as structuralist accounts of property identity and individuation, since what is essential to a property is its relations to other properties (and perhaps also to other entities).
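The figures in the mass example can be checked with a short calculation. The sketch below assumes the standard formulas F = ma and E = mc² and the defined SI value of the speed of light; the variable names are illustrative.

```python
# Worked check of the figures in the mass example (SI units).
SPEED_OF_LIGHT = 299_792_458.0  # metres per second (exact by definition)

mass = 1.0          # kilograms
acceleration = 1.0  # metres per second squared

force = mass * acceleration               # Newton's second law, F = m * a
rest_energy = mass * SPEED_OF_LIGHT ** 2  # mass-energy equivalence, E = m * c^2

print(f"force = {force} N")
print(f"rest energy = {rest_energy:.3e} J")
```

The energy comes out at just under 9 × 10¹⁶ joules, so the figure quoted in the text is a rounded value.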

The utility of the causal criterion might be restricted, however: if any properties do not enter into causal relations—that is, if they are uncaused and also causally inert—the causal criterion will not apply to them. Also, properties which are epiphenomenal (if any exist) will also be omitted, unless these can be identified and individuated on the basis of their causes alone. Spatio-temporal properties and properties of abstract objects (if there are any) are particularly problematic in this regard. Given these problems, one might maintain that the ontology of properties is mixed, with some which are essentially causal properties and others which are not. If so, however, the causal criterion is not a general criterion of what makes properties the same as each other or different, and thus it does not illuminate what in general a property is. Nevertheless, as the causal conception of properties has become more popular, more research has been done to explain how properties which do not appear to be essentially causal are essentially causal after all (Mumford 2004; Bird 2017; Williams 2017).

At this point, it is worth noting a metaphysical distinction between two closely related views which are consistent with property structuralism: one can take the causal relations which a property enters into as its constitutive identity criteria, or one can take properties to have an essentially causal nature which then determines the respective relations which each property enters into. In the former view, the nature of a property is determined by the relations in which it stands, whereas in the latter, the nature of a property determines the relations in which it stands. If one cares about there being strict identity criteria for each category of entities (Quine 1948), then the former provides non-circular identity criteria for properties (on the assumption that the nature of the relations into which a property enters is not determined by the nature of the property), whereas the latter view does not. Rather, the latter view asserts that each property has or consists of an intrinsic causal (or nomological) nature which serves to identify and individuate it. Although this move will not satisfy those who require strict identity criteria, it is argued that assuming that properties have intrinsic, essentially causal natures can facilitate a rich and fruitful theory of causation, laws, modality and perhaps more, and thus that it is worth abandoning methodological scruples for metaphysical benefits. These theories are discussed in Section 5.

If either of these structuralist conceptions of properties is correct, then a property could not have different causes and effects from those it has, because the causal relations which it enters into are constitutive of its nature (or else its nature determines which causal relations it enters into). Each property has its causal or nomological role necessarily. (A property might have different causes and effects in different background conditions, or in conjunction with different properties, but that is different.) One argument given in favour of this conception of properties is how well it fits with our understanding of fundamental properties via the physical sciences: in keeping with the example at the beginning of this section, we can empirically determine what properties can do whereas it is not obvious that we have the same epistemic access to what their qualitative nature is (for exceptions, see the next section). It would be parsimonious, as well as convenient, to think that there is nothing more to being a property than its contribution to causal or nomological processes.

f. Quiddities

Against the structuralist conceptions of properties discussed in the previous section, one might be concerned that there is more to a property than its causal or nomological role; or, going further, that the nature of a property is only contingently related to the role it plays in causation or laws. If this is the case, the nomological role R played by a property P in the actual world could be played by Q in another possible situation; and furthermore, P (which has actual role R) could have nomological role S in another possible situation. Moreover, one might worry that the causal or nomological criteria try to characterise properties in terms of their relations to other things, rather than as they themselves are internally. For instance, Armstrong notes that ‘properties are self-contained things, keeping themselves to themselves, not pointing beyond themselves to further effects brought about in virtue of such properties’ (Armstrong 1997, 80). If one takes this view, then what are properties and how are they identified? One might suggest that each property has a unique intrinsic qualitative nature known as a quiddity.

Some philosophers have complained that quiddities are obscure entities, distinguished by brute, unanalysable qualitative differences between them. Moreover, they imply a primitive account of transworld identity for properties; that is to say that what makes an entity the same property in different situations is nothing to do with the nomological, causal or other theoretical role that it plays, but simply to do with it having or being the same quiddity (Black 2000). A property Q which makes things appear blue to the human eye in normal light in the actual world could make things taste of chocolate in another. What makes property Q be Q in that counterfactual situation is that it has the same quiddity. The primitive qualitative ‘this-ness’ which quiddities impart to properties makes them analogous to haecceities, whatever it is which makes a particular the particular which it is (over and above the properties it instantiates). (See Schaffer 2005 for some disanalogies between quidditism and haecceitism.)

The postulation of quiddities presents epistemic challenges which Lewis (2009) notes, since it is not clear how we are able to acquire knowledge about quiddities if any effect that they could have upon us is associated with a specific quiddity only contingently. Furthermore, one might recall the parsimony argument of the previous section, presented in favour of forms of property structuralism: science does not appear to require the postulation of quiddities and can deal with properties entirely in terms of their causal or nomological role. If we do not need to postulate quiddities, why bother?

The supporter of quiddities has at least three responses available here as well as another way of side-stepping the worst of the criticism without reconciling with the structuralist. The first response is the most direct, arguing that we do have epistemic access to the qualitative nature of properties in our conscious experience (Heil 2003, who does not support a quiddistic conception of properties but one in which properties are both essentially causal and qualitative). The main difficulties for this response are to maintain the analogy between qualia and quiddities, and to argue that our conscious experience is broad enough to support a general argument for the existence of quiddities of properties which do not appear to us in conscious experience.

Secondly, one might argue that although quiddities are obscure when considered to be distinct, or partially distinct, entities from the properties which they individuate, they are not so obscure when regarded as being the properties themselves (Locke 2012). This latter conception of properties does not treat them as having internal qualitative natures in virtue of which they are individuated but as being those natures; in this view, properties are individuated in a primitive way simply by being numerically either the same property or a different one. Although this alternative conception gets rid of quiddities, and so placates the proponent of the parsimony argument, it does not advance our understanding of the individuation of properties beyond there being primitive qualitative differences between them.

The third response could take the form of a tu quoque argument against the supporters of a structuralist conception of properties, since there are epistemic challenges for them too; even if we identify and individuate properties in virtue of their causal roles, it is not obvious that empirical investigation will permit us to determine which properties exist (Allen 2002). Finally, one could argue that we do not need to accept quidditism in order to treat the causal roles of properties as being contingent, since there could be counterparts of actual, world-bound properties which play a different nomological or causal role. (See Black 2000; Hawthorne 2001; and Schaffer 2005 (who does not recommend this position).)

3. Which Properties Are There?

a. Families of Properties

There are not only many different properties, but many different families of properties: moral properties, such as good and bad; mathematical ones, such as being prime or being a convergent series; aesthetic ones, such as being beautiful; psychological ones, such as believing in poltergeists or wanting a drink; properties from the social sciences; and properties from the physical sciences. Every subject area about which we can think or speak has properties associated with it; and there are perhaps many more besides. This leads to questions about whether all these families of properties exist in the same sense as each other, and whether one family is dependent upon or determined by another. We might also consider how different properties within a family of properties are related. (For a selection of metaphysical distinctions between properties, see Sections 6 and 7.)

Some varieties of properties may be mind- or theory-independent—that is, they would exist whether or not humans (or other conscious beings) had ever existed to discover them—while others might be mind- or theory-dependent. The latter are classifications which depend for their existence at least partially upon the existence of conscious subjects to be the classifiers. One might, for example, consider physical or natural properties to exist mind-independently, and aesthetic properties to be mind-dependent. Another distinction between families of properties might come about due to differences in the entities which instantiate them. For instance, some properties such as mathematical ones might be instantiated by abstract objects, while others are possessed by spatio-temporal entities.

Despite the prima facie differences, one might think that these families of properties are related to one another. Perhaps one family of properties is entirely determined by the existence of another family. For instance, psychological, moral or ethical properties might be entirely determined by (broadly speaking) physical ones by a relation such as supervenience, realisation or grounding. Furthermore, while some accounts of supervenience relate facts rather than properties, properties still play a crucial role as constituents in facts or states of affairs. Mathematical properties might be thought to be determined by logical properties, but in that case the relation of determination is one of logical entailment rather than ontological priority. (See Frege and Russell.)

The question of which families of properties exist mind-independently and which do not, and whether interesting relations exist between families of properties, can be clarified only by examining specific features of the different subject areas associated with them, a much larger task than can be accomplished here. Furthermore, although it makes intuitive sense to divide properties into families such as the physical, the psychological and so on, further philosophical consideration reveals difficulties in clarifying such distinctions and making them philosophically rigorous while retaining an interesting account of the relationship between them. There is, for instance, not much philosophical substance to a distinction between physical properties and mental ones if these families can be defined only in opposition to each other.

Finally, one might be interested in whether some properties within a family are dependent upon others of the same family, making some individual properties more fundamental than others. For example, one might think that all ethical properties are determined by one or two fundamental ones—being good or being just, for instance—or one might maintain that mathematical properties are entirely determined by the properties of natural numbers. Again, it is the task of the different areas of philosophy concerned, such as Moral Philosophy or the Philosophy of Mathematics in these cases, to work out whether these dependencies are viable.

b. Maximalism versus Minimalism

The question of whether some properties are more fundamental than others, in the sense of their determining the existence of other properties, is also of more general metaphysical interest when we overlook the boundaries between different families of properties, since it is related to the question of how many properties there are. Does every possible property exist? Does every predicate pick out a property? Or are a few properties the ‘real’ or genuine ones, with the others which we appear to refer to either being ontologically determined by the genuine ones or being linguistic or conceptual entities?

The answers to these questions lie somewhere on a continuum between minimalism on the one hand, which maintains that a very sparse population of properties exists, and maximalism on the other, which asserts the existence of every possible property (and perhaps even some impossible ones). This contrast between the minimalist and maximalist ends of the continuum is also captured by two conceptions of properties as being sparse and abundant (Lewis 1983a). How we decide which point on this continuum is the most plausible depends in part upon the role we think that properties play in the world and also upon the identity conditions which we think properties have: that is, upon what makes one property the same as or different from another. Furthermore, it may turn out that there are different conceptions of properties in play, intended to fulfil different metaphysical roles, which may be able to coexist alongside each other. Thus, a dualist account of properties is also a possibility, or else one might find some way in which the sparse properties and the abundant ones are connected.

The minimalist maintains that the properties which exist are sparse or few in number, a set of properties which (may) determine the behaviour of the rest. From a physicalist standpoint, the properties of fundamental physics are the most promising candidates for being members of the minimal set of sparse properties: properties of quarks, such as charge and spin, as opposed to properties such as being made of angora, liking chocolate or being green. Some sparse properties may exist which we have yet to discover, and which we may never discover; their existence is in no way tied to our language use or what we have the ability to pick out. Although there are few sparse properties, this is a comparative claim: there may still be infinitely many of them if we consider determinate properties such as specific masses—such as having mass of 1.4 grams—to be more fundamental than the determinable property mass.

The maximalist, on the other hand, obeys a principle of plenitude with respect to which properties exist. At the extreme, every property which could exist does exist, although the range of properties which this principle permits depends upon how the ‘could’ in ‘could exist’ is understood. Perhaps one of the most abundant populations of properties is postulated by Lewis (and quickly rejected for not being metaphysically useful), who takes qualitative similarity and difference to be determined by membership in sets of actual and possible individuals. In the least discriminating understanding of this account of properties, any set of actual or possible individuals counts as a property, making the collection of properties into a super-abundant transfinite collection which far outruns our ability to name them. But, as Lewis quickly notes, there are simply too many of these properties to be useful: ‘If it’s distinctions we want, too much structure is no better than none’ (1983a, 346). He therefore abandons this extreme maximalism in favour of an account of properties which is discussed below.
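The scale of this super-abundance is easy to see even in a finite toy case. The sketch below assumes a hypothetical domain of just four individuals and treats every subset as a candidate property; the helper name `all_subsets` is illustrative.

```python
from itertools import chain, combinations

# Hypothetical toy domain of actual and possible individuals.
individuals = ["a", "b", "c", "d"]


def all_subsets(items):
    """Return every subset of items, from the empty set to the full set."""
    return [
        frozenset(combo)
        for combo in chain.from_iterable(
            combinations(items, size) for size in range(len(items) + 1)
        )
    ]


properties = all_subsets(individuals)
print(len(properties))  # 2 ** 4 == 16 candidate properties
```

With n individuals there are 2ⁿ subsets, so the number of candidate properties doubles with each added individual; for an infinite domain of possibilia the collection is transfinite, as the text notes.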

One could also retain a broad range of possible properties in a different way to Lewis’s sets of possible and actual individuals, perhaps by accepting the existence of transcendent universals, including universals which exist even though they are never instantiated by any actual individual. Such entities might even range beyond the possible to include universals which can never be instantiated, or which could be instantiated only if the laws of logic were non-classical, such as universals corresponding to the properties of being a round square or being a true contradiction.

A prima facie less abundant form of maximalism considers properties to be the semantic values of predicates, thus entities which either determine the meaning of any actual predicate in a human language or determine any meaning which there is or could be. (Whether this second maximal account of properties is only prima facie less abundant than the previous suggestion or is genuinely less abundant depends upon the relationship between possibility and range of meanings, a question which will not be considered here. If the range of possible meanings turns out to be coextensive with the range of possibilities, there may be no difference between these options.)

Even if we restrict ourselves to actual languages, there are many predicates, and so if there are properties which correspond with each of them, we will have a very abundantly populated ontology. How finely grained such a maximalist ontology is depends upon how we distinguish one property from another (or, relatedly, one predicate from another). In this view, there are uncontroversially properties for being red and being not red. But one might wonder whether there is a distinction between being red and not being not red which can be determined only when we have a principle for individuating properties or predicates. If the criterion is syntactic, then the properties being red and not being not red are distinct, but if the criterion is semantic, ‘being red’ and ‘not being not red’ are intuitively predicates picking out the same entity.
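The difference between the two criteria can be sketched as follows. The example assumes a hypothetical three-object domain, represents each predicate as a pair of a label and a test, and simplifies the semantic criterion to agreement over that domain, which is cruder than any serious account of meaning.

```python
# Hypothetical domain: objects paired with a flag saying whether they are red.
DOMAIN = {"tomato": True, "sky": False, "postbox": True}

# Each predicate is a (label, test) pair; the labels are illustrative only.
being_red = ("being red", lambda obj: DOMAIN[obj])
not_being_not_red = ("not being not red", lambda obj: not (not DOMAIN[obj]))


def same_syntactically(p, q) -> bool:
    """Syntactic criterion: the predicate expressions must match exactly."""
    return p[0] == q[0]


def same_semantically(p, q) -> bool:
    """Semantic criterion (simplified): the predicates apply to the same things."""
    return all(p[1](obj) == q[1](obj) for obj in DOMAIN)


print(same_syntactically(being_red, not_being_not_red))  # False
print(same_semantically(being_red, not_being_not_red))   # True
```

The syntactic test counts two properties where the semantic test counts one, so the choice of criterion directly affects how densely populated the maximalist ontology turns out to be.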

One might attempt to hold an intermediate position between maximalism and minimalism. For example, one might argue that the properties which exist are those which have explanatory utility, giving us a more abundant population of properties than the minimalist physicalist accepts and a more restricted one than that accepted by the view on which there is a property to determine the meaning of every predicate. But on reflection it is not clear how different this view will turn out to be from the maximalist accounts based upon the semantic values of predicates; after all, predicates exist because we use them in explanatory sentences. One might need a more restrictive account of legitimate explanations in order to whittle the range of properties down.

One advantage of a liberal, maximalist account of properties is epistemic: if properties are based upon predicates of our language, or on the types which we employ in our explanations, then properties are easy to find. Being an aardvark, or being igneous rock, or having influenza, or being a chair are all properties to which we refer and there is no need to go looking for some more fundamental, ‘genuine’ or ‘real’ set of properties to ground the types into which we classify things in our everyday and scientific explanations. However, this epistemic advantage over minimalism may not persist once we move away from the properties we encounter in the natural and human world and consider how we know about the myriad uninstantiated properties which most maximalists endorse, or once we consider the properties which are not instantiated by spatio-temporal objects but by abstract ones. These cases are particularly problematic because, if a version of the causal theory of knowledge is true, it is not clear how we could know about the properties of abstract objects or about properties which are not instantiated in the actual world at all. At this point, maximalism loses the epistemic advantage, although it still promises a useful account of meaning based upon which properties exist.

Second, the maximalist’s ontology of properties has a pragmatic advantage: the maximalist has a greater range of properties at her disposal, whereas the minimalist may discover that a property or a family of properties for which we have predicates does not exist.

Third, the maximalist can explain predicate meaning directly: the properties which exist determine what our predicates mean.

But for the minimalist, these advantages do not mitigate what he regards as the vastly uneconomical, overpopulated ontology of properties which the maximalist endorses. The maximalist accepts properties such as being threatened by a dragon on a Sunday and being fourth placed in the Mushroom Cup on MarioKart in the guise of a gorilla. The former is a property which has never been instantiated, while the latter is one which is only instantiated in a world of computer games, motor races and gorillas. Are we to say that these properties have always existed? If we are not, then they must have come into existence at some point in the history of the universe, in virtue of a more minimal set of properties which forms the basis for all the rest. If we treat these original properties as fundamental, the minimalist argues, then parsimony will be restored.

In addition to rejecting higher-level properties which appear to be superfluous to the causal workings of the universe, such as being within two miles of a burning barn or being fourth placed in the Mushroom Cup on MarioKart in the guise of a gorilla, some minimalists also adhere to a Principle of Instantiation and reject all alien properties which are never instantiated in the actual spatio-temporal world. Alien properties, such as being a perfect circle or being threatened by a dragon on a Sunday, are rejected in favour of treating them as conceptual or ideal entities which are mind-dependent.

Minimalists disagree about how minimal the set of sparse properties should be, with some physicalist minimalists accepting only the properties of fundamental physics (whatever they turn out to be). However, if we restrict properties to this extent, we are left with the question of what a great many things which we thought were properties actually are. If being water or being square, being green or being a mouse are not properties, then they must be something else, since they occupy such a central position in our worldview that eliminating them entirely from the ontology is out of the question. It does not seem plausible to treat them in the same way that Armstrong does with alien properties and to maintain that they are mind-dependent or ideal.

At this point, it seems that a compromise is needed. Both minimalism and maximalism are viable in their own right, but as far as explanation goes, they lack precisely what the other can provide. The minimalist’s properties can account for the fundamental nature of reality and perhaps also the causal processes which occur in it, while the maximalist can explain higher level predication and give an account of explanation and predicate meaning. Ideally, the property theorists would like the best of both worlds.

There are two ways in which this compromise can be achieved: first, by a form of dualism about properties which treats sparse and abundant conceptions of properties as different categories of entities (Bealer 1982). There is a sparse population of properties (or ‘qualities’ as Bealer calls them) and an abundant one of concepts, which are not mind-dependent entities in the way in which we often think about concepts, but rather objectively existing entities.

Second, one could accept Lewis’s strategy and give an account of how the sparse properties determine the existence of the abundant ones. According to Lewis (1983a, 1986), there is a fundamental set of sparse, perfectly natural properties which determine the existence of all the other properties by set-theoretic, Boolean combinations. All other properties lie along a continuum, placed according to how simply they are related to the perfectly natural ones. Those which are closely related count as natural properties, with naturalness being a matter of degree which is determined by closeness to perfectly natural properties. If we suppose that the sparse properties are physical ones, then properties such as being green or being a mouse are both natural to some degree or other, as is (to a lesser extent) being fourth placed in the Mushroom Cup on MarioKart in the guise of a gorilla, but eventually naturalness trails off. Being green is more natural than being grue (where ‘grue’ is defined as being green if observed before 2085, otherwise blue) while being grue* is less natural still. (Being grue* is defined as being green if observed before 2030 or blue if observed between 2030-40 or red if observed between 2040-50 or pink if observed between 2050-60 or . . . and so on for 30 disjuncts (Elgin 1995).) The abundant properties exist in virtue of being determined by the sparse natural properties.
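The definition of ‘grue’ given above can be written out as a small function. The sketch assumes that colours are plain strings and that each object comes with a year of observation; the function and constant names are hypothetical, and the comparison with `is_green` is only an informal gesture at relative naturalness, not Lewis's own measure.

```python
CUTOFF_YEAR = 2085  # the cutoff used in the definition of 'grue' above


def is_green(colour: str) -> bool:
    """Green needs only the colour itself."""
    return colour == "green"


def is_grue(colour: str, year_observed: int) -> bool:
    """Grue: green if observed before the cutoff year, otherwise blue."""
    if year_observed < CUTOFF_YEAR:
        return colour == "green"
    return colour == "blue"


print(is_grue("green", 2024))  # True: observed before the cutoff and green
print(is_grue("green", 2090))  # False: observed after the cutoff, so must be blue
print(is_grue("blue", 2090))   # True
```

Stated in terms of green and blue, the grue test needs an extra parameter and a case split, and a test for grue* would need one branch per disjunct. That growing definitional complexity is one rough way of picturing why such properties sit further from the perfectly natural ones.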

The ontological distinction which Lewis marks can also be characterized in other ways. For instance, Armstrong maintains that some universals are genuine ones, with the existence of other universals being determined by them. Such a distinction between perfectly natural sparse properties and the rest is a primitive one, however, and is thus not open to further analysis. If one considers parsimony to be an objective fact about the universe, then it is plausible to accept that some such minimal set of properties exists, but its existence has to be assumed rather than being argued for (McGowan 2002).

4. Problems with Instantiation

A particular is said to instantiate a property P, or to exemplify, bear, have or possess P. In the case of Platonic forms, the particular participates in the form of P-ness which corresponds to or is identified with the property P. One might wonder whether instantiation can be analysed further in order to give us some insight into the relationship between a particular and the properties which it instantiates, but it turns out that this is very difficult to do. In fact, instantiation runs into two major problems: the instantiation regress and problems about whether self-instantiation is possible.

a. The Instantiation Regress

The first problem arises if instantiation is treated as a relation. Presuming that relations are analogous to properties, or are a species of property, then the instantiation relation will behave in a similar way to a property. Let us say that particular b is P. If a relation of instantiation connects b with P, then b instantiates P. But then something must connect b, P and the instantiation relation (let us call it I1), and so there must be another instantiation relation I2 which does this job. However, now the question arises of what connects b, P and I1 with I2, and the answer must be that there is another instantiation relation I3 to do that; and then there must be another relation I4 to connect b, P, I1 and I2 with I3. For each instance of instantiation, we require another relation to bind it to the entities which we already have and so there will never be enough instantiation relations to bind a property P to the particular which has it. It appears that treating instantiation as a relation leads to an infinite regress, and so the instantiation relation is not coherent after all. (The instantiation regress is often associated with a regress suggested by F. H. Bradley (1893) and is thus sometimes known as ‘Bradley’s Regress’.)
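The structure of the regress can be laid out step by step. The sketch below is only an illustration: it assumes that each new instantiation relation must bind everything posited so far, and the function name `instantiation_regress` is hypothetical.

```python
def instantiation_regress(particular: str, prop: str, steps: int):
    """Yield the first `steps` instantiation relations and what each must bind."""
    relata = [particular, prop]
    for n in range(1, steps + 1):
        relation = f"I{n}"
        # I_n is introduced to bind everything posited so far...
        yield relation, tuple(relata)
        # ...but then I_n itself becomes one more item that needs binding.
        relata.append(relation)


for relation, bound in instantiation_regress("b", "P", 4):
    print(relation, "binds", bound)
```

The loop is cut off at four steps only because a limit was supplied; nothing in the structure itself provides a stopping point, which is the force of the regress objection.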

There are several ways in which the property theorist might try to avoid this regress. First, she might appeal to the notion of an internal relation: that is, a relation which exists if the entities it relates exist. (Examples of internal relations include x being taller than y or x resembling y. All that is needed for such relations to hold is the existence of the things which they relate, Mount Everest and the Eiger for the former, for instance, or two black kittens for the latter.) However, one cannot say that instantiation is itself an internal relation because the existence of a particular b and a property P is not sufficient to determine that b is P. For example, the existence of a particular cat, Fluffy, and of the property of being white does not on its own guarantee that Fluffy is white; something more is required, in this case that Fluffy instantiates the property of being white. (Even if Fluffy is white, the problem here is that the relation between Fluffy and being white is a contingent one; Fluffy could exist and be black or tabby and so the mere existence of Fluffy and whiteness does not determine the existence of the instantiation relation. But see Broad 1933, 85.)

David Armstrong argues that, while we cannot do without the first-order instantiation relation between particular and property, we can then treat whatever is required to bind particular, property and instantiation as being an internal matter. In terms of the example of the regress above, the additional instantiation relations, I2, I3 and so on, exist if particular b, property P and I1 exist such that b instantiates1 P. Nothing more is required, and the supposed regress is a cheap logical trick, rather than implying ontological infinitude. Armstrong claims that instantiation is a fundamental universal-like tie which is not open to further analysis.

Armstrong’s response depends strongly upon whether his account of internal relations is a plausible one. Do they provide, as he claims, an ontological free lunch (1989, 56; MacBride 2011, 162–6)? In addition, one might also question whether his solution works for every account of the ontology of properties. Armstrong’s account of instantiation is formulated for immanent universals—entities which are wholly present in each of their instantiations—but it is more difficult to think of instantiation as a fundamental, non-relational tie if it relates a particular to an abstract, transcendent universal, or to a resemblance class of which the particular is a member. If we are to treat instantiation as fundamental, then different accounts of the ontological nature of properties might require their own accounts of instantiation.

Alternatively, the property theorist might challenge the claim that the instantiation regress is vicious (Orilia 2006). If we further analyse the regress outlined above, we either require an infinite number of states of affairs to bind a particular to the property it instantiates, or each state of affairs (each particular’s instantiating a property) requires infinitely many constituents in order to exist (the particular, the property and infinitely many instantiation relations). Orilia distinguishes these as an external and an internal regress respectively, since in the former case the infinitude of additional entities is external to the original state of affairs of b’s being P, while the latter asserts that any state of affairs, such as b is P, does not simply contain b and P but infinitely many instantiation relations besides. Although this may not be what we intuitively expect of the relationship between particulars and the properties they have, one might argue that there is nothing ontologically wrong with such infinitude unless one has already presupposed that the world is finite. After all, we are happy to accept that the real numbers are infinite, such that there are infinitely many numbers between any two real numbers, and so it is not clear why such infinitude cannot occur in the natural world. There is, for instance, debate in the physical sciences about the existence of ‘real’ infinities (see Infinity, Section 4). If one allows that the world is infinitely complex, then the instantiation regress is not vicious, although its consequences for the way the world must be are quite counterintuitive (Allen 2016, 29–31).

b. The Paradox of Self-Instantiation

It seems plausible to maintain that any property instantiates being a property, and furthermore (if one thinks that properties are abstract objects such as transcendent universals) that the property of being abstract instantiates the property of being abstract. It seems, in such cases, that it is possible for some properties to instantiate themselves and thus that there is such a property as being self-instantiating or a property’s instantiating itself. Moreover, the situation with the Instantiation Regress would be simplified if it were possible for instantiation to instantiate itself. That way, one might argue that the apparently infinite multitude of instantiation relations were in fact instances of the same relation, instantiated over and over again, with different numbers of relata each time on some versions of the regress. However, there is a logical problem with self-instantiation which has led some philosophers to suggest that self-instantiation should not be allowed.

Let us suppose that, for every property of being Q, there is also a negative property of being not Q. If this is the case, then there is a property of being non-self-instantiating or something’s not instantiating itself. But such a property appears to be logically impossible once we consider whether it instantiates itself: if the property of not instantiating itself does not instantiate itself, then it does instantiate not instantiating itself and so it instantiates itself. But if it does instantiate itself, then it is self-instantiating and so it does not instantiate itself. We have a paradox.
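The reasoning can be set out schematically. In the sketch below, Inst(X, Y) abbreviates ‘X instantiates Y’ and N names the supposed property of not instantiating itself; both symbols are introduced here purely for illustration.

```latex
% Assumed notation: Inst(X, Y) for "X instantiates Y";
% N for the putative property of being non-self-instantiating.
\begin{align*}
  &\text{(1)}\quad \forall X\,\bigl(\mathrm{Inst}(X, N) \leftrightarrow \neg\,\mathrm{Inst}(X, X)\bigr)
      && \text{definition of } N \\
  &\text{(2)}\quad \mathrm{Inst}(N, N) \leftrightarrow \neg\,\mathrm{Inst}(N, N)
      && \text{from (1), taking } X = N \\
  &\text{(3)}\quad \bot
      && \text{from (2)}
\end{align*}
```

Step (1) is available only if every property Q has a negative counterpart and properties may fall within the range of their own instantiation; the responses considered next each give up one of these assumptions.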

Faced with this paradox, one could take the rather extreme measure of banning self-instantiation entirely, which would leave us in an implausible situation with respect to ‘properties’ such as being a property, which would not (strictly speaking) be a property. One might mitigate this consequence by introducing a theory of types for properties in addition to banning self-instantiation. Thus, we would have first-order properties which are instantiated by particulars, second-order properties which are instantiated by first-order properties, third-order properties which are instantiated by second-order properties and so on; properties of the nth order can only be instantiated by entities of the (n-1)th order. Being a property would then be a shorthand for being a second-order property (a property instantiated by first-order properties), or being a third-order property (a property instantiated by properties of first-order properties) and so on, and these properties do not self-instantiate. However, this hierarchy is perhaps too strict for daily use and conflicts with our intuitive judgments. For example, if a table instantiates the property of being crimson, it also instantiates the property of being red and being a colour; but the property of being crimson also intuitively instantiates being red and being a colour. But if the theory of types is correct, we have to distinguish the first-order property of the table’s being red from the second-order property of crimson’s being red; different properties are involved in each case if we introduce a hierarchy.
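The typing rule and its cost can be shown in a toy sketch. The orders assigned below (particulars at order 0, and two separate entries for ‘red’) are assumptions made for the example; the point is only that the rule blocks self-instantiation and forces intuitively single properties to be duplicated across levels.

```python
def may_instantiate(bearer_order, property_order):
    """Typed rule: an nth-order property is instantiated only by
    entities of order n-1 (particulars are treated as order 0)."""
    return property_order == bearer_order + 1


# Hypothetical assignment of orders for the table/crimson example.
ORDER = {
    "table": 0,
    "crimson": 1,
    "red (of things)": 1,
    "red (of colours)": 2,
}


def check(bearer, prop):
    """Report whether `bearer` is allowed to instantiate `prop`."""
    return may_instantiate(ORDER[bearer], ORDER[prop])


print(check("table", "crimson"))             # a particular bears a first-order property
print(check("crimson", "red (of colours)"))  # a first-order property bears a second-order one
print(check("crimson", "red (of things)"))   # blocked: both are first-order
```

Because no entity is of an order one higher than itself, no property passes the check when paired with itself, which is how the hierarchy rules out the paradoxical case.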

Alternatively, one might solve the problem of self-instantiation by limiting which entities count as genuine properties and accepting a more minimalist position. This response rejects the premise that, corresponding to every property Q, there is a property of being not Q which is instantiated just when Q is not. Thus, something which does not instantiate the property of being red does not thereby instantiate a property of being not red, and we need not think that the property of not self-instantiating accompanies the property of self-instantiating. The paradox associated with there being a property of non-self-instantiation therefore need not arise.

5. Categorical and Dispositional Properties

While Plato regarded participation in a form as making something the kind of thing it is, Aristotle also treated such kinds as giving a particular the causal power to do something, the potential to have certain effects. This contrast between understanding properties as qualitative, categorising entities and as dispositional or causally powerful ones survives in contemporary philosophy as the distinction between categorical and dispositional properties. We can conceive of a property such as mass in two contrasting ways: on the one hand, mass is a measure of how much matter a particular is made of; on the other, the mass of a particular determines how much force is required to move it, how much momentum it will have when moving and thus what will happen if it hits something else, and how much energy will be produced if the mass were to be destroyed.

Some philosophers argue that all dispositional properties are dependent upon categorical ones (Armstrong 1999; Lewis 1979, 1986; Schaffer 2005); others argue that all properties are dispositional and have their causal power necessarily or essentially (Cartwright 1989; Mumford 1998, 2004; Bird 2007; Marmodoro 2010a); some accept that a mixture of categorical and dispositional properties exists (Ellis 2000, 2001; Molnar 2003); and still others contend that all properties have a dispositional and a categorical aspect (Schroer 2013) or are both categorical and dispositional (Heil 2003, 2012). Dispositional properties, properties which have their causal roles essentially, are also known as dispositions, powers, causal powers and potentialities; however, it is important to note that these terms are not always used interchangeably.

a. Do Dispositional Properties Depend upon Categorical Ones?

There are three primary motivations for the view that all dispositional properties must depend somehow upon categorical ones: first, dispositional properties are regarded as epistemologically suspect, since we cannot experience a dispositional property as such. Second, dispositional properties are considered to be ontologically suspect. Third, it is thought that we do not need to think of dispositions or dispositional properties as being an ontologically independent category of entities because statements about the dispositional properties an individual instantiates can be analysed as conditional statements about the categorical properties which that individual instantiates, or else we can give an ontological account of how dispositional properties depend upon categorical ones. These issues are considered in turn.

The first motivation is more common within the empiricist tradition, but not exclusive to it. To say that a particular has a disposition or a causal power to do something does not entail that the causal power is actually manifested or that the effect is produced, since the particular may not be in the appropriate conditions for the effect to occur. For instance, although a particular sugar cube is soluble, such a disposition may never be manifested if the sugar cube is never near water; its being soluble ensures that it could dissolve, that it would were the circumstances to be right, and perhaps also that it must do so (although dispositionalists disagree about whether a causal power manifests itself as a matter of necessity in the appropriate circumstances). Thus, accepting the existence of irreducible dispositional properties involves accepting the existence of irreducible modality in nature, perhaps amounting to natural necessity, which makes each property produce its respective effects. As Hume pointed out, such natural necessity cannot be detected by experience, since we can only experience what is actually the case, and so strict empiricists have rejected irreducible dispositional properties on this basis. Some of those who think that at least some dispositional properties are irreducible to categorical ones accept this view about our experience and argue that we have other reasons to accept natural necessity, while others argue that we can experience irreducible modality in nature after all, perhaps through our own intentions being dispositional (Mumford and Anjum, 2011).

The second, ontological objection to irreducible dispositional properties is raised by Armstrong (1997, 79), who argues that accepting dispositional properties commits one to Meinongianism. As noted above, any particular instantiation of a property which is the power to M may never manifest M; however, such entities are still construed as being powers to do M and are often individuated in virtue of their manifestations. For example, solubility is the power to dissolve, combustibility is the power to burn, and so on. In committing ourselves to the existence of unmanifested dispositions, the objector argues, we are also committing ourselves to the being (in some sense or other) of their manifestations, a range of entities which do not exist. In most cases, dispositional properties are constituted by relations between instantiated powers and a non-actual manifestation, which Armstrong argues is both ontologically uneconomical and absurd, reminiscent of the ontological commitment attributed to Alexius Meinong by Bertrand Russell (1905). On this basis, Armstrong concludes, essentially dispositional properties should be rejected. (See Mumford 2004, 192–5; Handfield 2005, 452–461; and Bird 2007, 105–111 for responses.)

The third objection against irreducible dispositions is that we do not need to talk about dispositions and dispositional properties in the first place because we can translate disposition ascriptions into non-dispositional language. To that end, the conditional analysis of dispositions was first suggested by Carnap (1928, 1936–7), whose own account failed due to the fact that he insisted on analysing dispositions as truth-functional material conditionals. In Carnap’s proposal, we could analyse the dispositional predicate ‘is combustible’ as follows:

(C)  For any object o, if o is lit or otherwise ignited, o is combustible if and only if o burns.

The disadvantage of this account is that it provides a criterion to apply the predicate ‘is combustible’ only for objects which are ignited and says nothing about those objects which are not near any source of ignition. However, we intuitively want to say that the piece of paper on my desk is combustible and the water in the glass is not, whether or not these items are ever ignited. Carnap’s simple analysis leaves out the crucial aspect of dispositions and dispositional properties: the disposition or causal power to have a certain effect is present even when the disposition is not active and has no chance of being triggered because the requisite conditions do not obtain.
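A truth table makes the gap visible. The sketch below reads (C) as a material conditional, as Carnap did, with three Boolean inputs per object; the function and variable names are illustrative.

```python
from itertools import product


def carnap_c(ignited, combustible, burns):
    """Material-conditional reading of (C):
    ignited -> (combustible <-> burns)."""
    return (not ignited) or (combustible == burns)


# Full truth table for one object.
for ignited, combustible, burns in product([True, False], repeat=3):
    verdict = carnap_c(ignited, combustible, burns)
    print(f"ignited={ignited!s:5} combustible={combustible!s:5} "
          f"burns={burns!s:5} -> (C) satisfied: {verdict}")

# For an object that is never ignited, (C) is satisfied whatever we say
# about its combustibility, so (C) places no constraint on the predicate.
unconstrained = all(
    carnap_c(False, combustible, burns)
    for combustible, burns in product([True, False], repeat=2)
)
print("(C) silent about unignited objects:", unconstrained)
```

Every row with `ignited=False` satisfies (C), so the paper on the desk and the water in the glass are treated alike as long as neither is ever lit.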

The failure of Carnap’s attempt to eliminate dispositional language led to more sophisticated accounts which attempt to analyse an object’s possession of a disposition in terms of subjunctive or counterfactual conditionals: that is, by capturing what the object would do were certain conditions to obtain (whether or not they do actually obtain). The most famous of these is the Simple Conditional Analysis which analyses disposition ascriptions as follows:

(CA) An object o is disposed to manifest M in conditions C if and only if o would M if C obtained.

(Ryle 1949; Goodman 1954; Quine 1960)

While this analysis is an improvement on Carnap’s attempt, there are several well-known counterexamples to it. First, the stimulus conditions may obtain and the disposition not manifest because the effect is masked. For instance, the paper is combustible because it would light were certain stimulus conditions to obtain (were it to be in contact with a source of ignition), but the disposition will not manifest if the atmosphere around it contains no oxygen; the lack of oxygen will mask its combustibility. Second, we can imagine a situation in which the presence of the conditions required for the disposition to manifest removes the disposition somehow; in our current example, perhaps the presence of a source of ignition also causes the paper to be soaked by water, making it, while wet at least, no longer combustible. A disposition where the presence of the requisite triggering conditions results in an object’s either acquiring or losing a disposition is known as a finkish disposition, following Martin (1994). Third, we can find examples in which the effect of a disposition is mimicked when the triggering conditions occur, even though the disposition is not present. For instance, consider Lewis’s famous Hater of Styrofoam (1997), who breaks Styrofoam containers each time they are struck, giving the impression that such containers are fragile when they are not. Such examples show that (CA) can be true while intuitively the dispositional predicate ‘is fragile’ should not be ascribed to the object; the conditional can be true when the disposition is mimicked.
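The three counterexamples can be lined up against (CA) in a toy model. Everything in the sketch is an assumption made for illustration: each case records the intuitive verdict about the disposition together with flags for masking, finkishness and mimicking, and a simple function stands in for the counterfactual ‘o would M if C obtained’.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    name: str
    has_disposition: bool   # intuitive verdict before the stimulus arrives
    masked: bool = False    # e.g. no oxygen around the paper
    finked: bool = False    # the stimulus itself removes the disposition
    mimicked: bool = False  # an outside agent produces the effect


def would_manifest(case):
    """Toy stand-in for the counterfactual: what happens if C obtained?"""
    disposition_at_stimulus = case.has_disposition and not case.finked
    return (disposition_at_stimulus and not case.masked) or case.mimicked


def simple_conditional_analysis(case):
    """(CA): o is disposed to M in C iff o would M if C obtained."""
    return would_manifest(case)


cases = [
    Case("ordinary paper", has_disposition=True),
    Case("masked paper", has_disposition=True, masked=True),
    Case("finkish paper", has_disposition=True, finked=True),
    Case("mimicked Styrofoam", has_disposition=False, mimicked=True),
    Case("water", has_disposition=False),
]

mismatches = [
    c.name for c in cases
    if simple_conditional_analysis(c) != c.has_disposition
]
print(mismatches)
```

In this model (CA) agrees with intuition for the ordinary paper and the water but diverges in exactly the masked, finkish and mimicked cases: the first two are dispositions that (CA) fails to ascribe, the third is a disposition it wrongly ascribes.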

Difficulties with the Simple Conditional Analysis have led to refinements in this approach (Prior 1985; Lewis 1997; Manley and Wasserman 2008), although the Simple Conditional Analysis still has defenders who challenge the counterexamples of finks, masking and mimicking (Choi 2008). However, the complexities of eliminating dispositional ascriptions by analysing them as conditionals have encouraged many contemporary philosophers to take another look at the plausibility of treating dispositional properties more realistically, either as entities which depend for their existence on categorical properties and other entities, or as an independent ontological category.

b. Dispositional Properties from Categorical Ones

Armstrong takes a minimally realist attitude to dispositions: the dispositions which an individual has to act in this way or that are entirely determined by the categorical properties they instantiate and the laws of nature which govern them. Although such dispositions are real, they are a derived category of entities, not a fundamental one, since they are ontologically dependent upon categorical properties and laws. For Armstrong (1983), laws of nature are necessary connections holding between universals (which, as was noted above, Armstrong considers to be the ontological basis of properties) but these necessary connections can vary across different possible situations. Although in the actual world it is true that the instantiation of an F necessitates the instantiation of G, this necessary connection need not hold in counterfactual situations; in another possible situation, F may necessitate the instantiation of H instead of G. Thus, what a property does is determined by which laws obtain in the world in which it is instantiated, not by that property’s intrinsic nature. In Armstrong’s view, categorical properties and laws of nature are more fundamental than the dispositions they confer, and the causal disposition a property has is contingent upon what the laws of nature are in the world in which it is instantiated. Thus, what a property has the power to do can vary in different possible situations. (See Contessa 2015 for a criticism of this view.)

c. Dispositional versus Categorical Properties

Central to arguments about whether we should conceive of properties as categorical or dispositional are clashing intuitions about whether it is plausible for a property P with the causal power to do C1 in the actual world to have the power to do C2 in another possible world w. If so, and if this indicates a genuine possibility, then property P does not have its causal power as a matter of necessity; if this is not possible, then properties do have their causal roles necessarily (or because of their essential nature, if this is different) and are thus dispositional. For instance, in the actual world, particulars with like charges (such as two electrons instantiating negative charge) repel each other. But is it possible that like-charged particulars could attract each other? The supporter of categorical properties says ‘yes’ whereas someone who favours dispositional properties says this is not possible. The supporter of dispositional properties maintains that if there were a property which could make electrons attract, it would not be charge but a distinct property, schmarge (say). Since schmarge does not exist in the actual world it is an alien dispositional property, and rather than accept the existence of alien properties, some dispositionalists prefer to deny the possibility of electrons attracting.

The empiricist’s suspicion of the natural necessity inherent in dispositional properties is largely based upon an epistemic argument: how can we justify believing that such natural necessity exists, especially since we cannot find out about it through experience? However, the dispositionalist employs a converse epistemic argument which notes that the supporter of categorical properties also postulates entities which lie outside our epistemic grasp: if a property P can have different causal powers C1 and C2 in different possible situations, then the property itself must have a purely qualitative nature or quiddity which is only contingently associated with anything which P can do. Moreover, one and the same causal power C1 can be associated with distinct categorical properties P and Q, and so it is not clear how we determine that one property is being instantiated rather than another. It is plausible to think that we have experiential access to properties only via the effects which they have on us, but this makes the nature of quiddities as mysterious as natural necessity (especially from an empiricist perspective).

d. Explanatory Uses for Dispositional Properties in Metaphysics: Laws and Modality

These arguments are taken to establish the position that at least some properties are dispositional rather than categorical. This position, it is argued, has significant explanatory advantages for metaphysics considered more broadly. First, if properties essentially or necessarily involve having a specific causal role, then the causal relations between properties remain stable and the properties of an object bring about certain effects as a matter of necessity. These fixed relations between properties permit an account of causal laws as derived entities, which hold in virtue of dispositional properties and which hold as a matter of necessity (Mumford 2004). This, it is claimed, is respectively more coherent or more parsimonious than the accounts of laws available with an ontology of categorical properties which treat laws either as simply being contingent regularities holding in virtue of the distribution of properties in a world (Lewis 1973, 1994) or else require the postulation of second-order relations holding between properties or universals to act as laws of nature which govern what those properties do (Armstrong 1983).

Second, some supporters of a dispositional conception of properties argue that the essential, natural modality which such entities involve can be used to give a naturalistic account of possibility and necessity (Jacobs 2010; Borghini and Williams 2008; Vetter 2015). The dispositional properties which an individual instantiates determine what that object could do, and also what it must do in certain circumstances, thereby providing truthmakers for modal statements about that individual. Thus, statements such as ‘This coal could burn’ or ‘Hillary Clinton could be a physicist’ are made true by the dispositional properties which these individuals instantiate, or by properties which actually instantiated dispositional properties have the power to instantiate. This dispositionalist account of modality has, according to its supporters, the resources to provide an account of modality without recourse to abstract objects or to possible worlds. Furthermore, since some dispositionalists restrict what is possible to what is possible given the dispositional properties which exist, have existed and will exist in the actual world, this account of modality is an actualist one; it does not require ontological commitment to the existence of merely possible entities.

Although the formulation of these dispositionalist accounts of modality is still in the early stages, they already face some significant challenges. The primary difficulty concerns whether an ontology of actually instantiated dispositional properties can provide a broad enough modal range to match our common-sense intuitions about what is possible. For instance, logical and mathematical truths appear to be necessarily true, but we do not readily think of them as being made true by actual dispositional properties or causal powers. ‘2 + 2 = 4’ is always true, and intuitively could not be false, but it is not obvious what in the world makes it that way, nor whether it is coherent to say that everything has the disposition to make such statements true. The dispositionalist must either give an account of logical and mathematical necessities in terms of dispositional properties or permit an alternative account of them. (See Vetter 2015.)

Furthermore, claims such as ‘Dinosaurs could have developed digital technology’ or ‘If Coulomb’s Law is false, these two proximate negative charges would not repel’ present difficulties: the first because it is an unactualised possibility which seems very unlikely given the dispositional properties instantiated now or in the past, and the second because it is a counterlegal possibility, a possibility which concerns a situation which could only occur were the laws of nature in the actual world to be false. The dispositionalist can deal with the former type of example by allowing that possibilities are not only grounded by which dispositional properties are actually instantiated, but also by the dispositional properties which these actually instantiated properties could produce, and the ones which these latter, uninstantiated properties could produce, and so on. Thus, it does not matter that no dinosaur actually had the power to invent digital technology, nor that nothing actually has the power to cure cancer, because the possibility rests on something existing (or having existed) which has the power to produce the power to do so.
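The iterated grounding described above has the shape of a reachability computation: start from the powers which are actually instantiated and keep adding whatever powers those could produce. The sketch below assumes this reading; the power labels and the `can_produce` map are hypothetical placeholders, not claims about what any actual property can do.

```python
from collections import deque


def grounded_possibilities(actual_powers, can_produce):
    """Return every power reachable from the actually instantiated ones
    by following 'has the power to produce' links any number of times."""
    reachable = set(actual_powers)
    queue = deque(actual_powers)
    while queue:
        power = queue.popleft()
        for produced in can_produce.get(power, ()):
            if produced not in reachable:
                reachable.add(produced)
                queue.append(produced)
    return reachable


# Hypothetical chain for the dinosaur example.
can_produce = {
    "power to evolve": ["power to use tools"],
    "power to use tools": ["power to invent digital technology"],
}
actual = {"power to evolve"}

possible = grounded_possibilities(actual, can_produce)
print("power to invent digital technology" in possible)
```

On this picture a possibility is grounded if its power lies somewhere along such a chain, even if no actual individual ever instantiated it directly; a power with no chain leading to it from actual powers remains ungrounded.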

On the other hand, examples of counterlegal possibilities have proved a more intransigent problem for dispositionalist modality. If, as was noted above, the dispositionalist thinks of natural laws as being entirely determined by the dispositional properties or causal powers which the world instantiates, the actual dispositional properties instantiated in the world cannot also determine possibilities which run counter to those laws. It makes no sense to imagine that the world could have been exactly like the actual one and yet the laws of nature be different. If the dispositionalist wants truthmakers for counterlegal possibilities, then she must be committed to the existence of alien causal powers, ones such as schmarge, which are uninstantiated in the actual world. However, if the dispositionalist makes this move, then her theory has lost the advantage that it claimed over other theories of modality, since it is now committed to the existence of possibilia or abstract objects in order to ground modality. Given this, most dispositionalists restrict what is possible to what is possible given the causal powers which exist, have existed or will exist in the actual world, thus denying possibilities which could occur only if the actual laws of nature were false. In doing so, they accept that some intuitively plausible possibilities, such as ‘It is possible that this one kilogram of gold will not fall towards the Earth when it is unsupported’, are not genuine possibilities at all; the gold might not fall were the universal law of gravitation not to hold, but in this version of actualist dispositionalism, this law holds necessarily; situations in which there is no gravity are not genuinely possible. (Although see Borghini and Williams 2008 and Vetter 2015, who suggest that actual powers or potentialities might be able to determine possibilities which go beyond those permitted by the current laws of nature.)

Not all dispositionalists concur with the use of their ontology to ground necessity and possibility in this way. Mumford and Anjum (2011) have suggested an alternative account which argues that dispositions act with a sui generis modality—dispositional modality—which is weaker than necessity and yet stronger than contingency.

e. Problems with Pan-Dispositionalism

Pan-dispositionalism—the view that all properties are dispositional ones—faces several challenges to its coherence. First, there is the complaint that even among the natural properties, some properties are obviously not causal powers: properties such as being a cube or being red are not obviously ones which are essentially causal. The pan-dispositionalist’s answer is usually that such properties are dispositional after all: colours are properties with the power to cause certain wavelengths of light to be reflected, or to cause a specific reaction in ourselves and other animals, and being a cube is associated with various effects such as not being able to roll, being stackable, making a certain imprint in soft clay, and so on. The dispositionalist might add that such properties are continuously manifesting (Hüttemann 2013), which gives the appearance of there being a distinct set of categorical properties.

Second, the pan-dispositionalist ontology is vulnerable to the ‘always packing and never travelling’ objections: dispositional properties are potentialities to have certain effects, but if their manifestations consist in the production of more dispositional properties, the manifestation of the potential of a power consists in the production of more potentialities. (See Molnar 2003, 11.2 for variants of this problem.) This is an ontology of potentialities which ‘never passes from potency to act’ (Armstrong 2004). The critic of pan-dispositionalism argues that such powers must be supplemented by categorical properties to give the world actuality or being, or in order that actual events occur, rather than just the passing of potencies around. For instance, Heil argues that the world cannot be one in which properties are nothing more than contributions to what their bearers have the power to do because such bearers would be indistinguishable from empty space; there would be doing but no being, and this, Heil urges, does not make sense because there would be nothing to do anything at all. According to Heil, a purely dispositionalist ontology would be equivalent to an empty universe.

This objection could be met by accepting a theory in which properties are both qualitative and dispositional (Heil 2003, 2012; Schroer 2013), by permitting continuously manifesting dispositional properties which are analogous to categorical ones, or else by denying the need for a fundamental level (Schaffer 2003). However, Mumford (2004, 174–5) implies that these responses are not required, since the objection is based upon a misunderstanding of what being an essentially dispositional property or power involves, treating these entities as actual only in virtue of their producing actual manifestations. As Mumford argues, being potent (as these entities are) is a way of being and so it is wrong to think of pure powers as being mere potentialities in the first place.

Despite these difficulties in the formulation of a pan-dispositionalist ontology, it is thought by its supporters to have significant explanatory advantages over its rival which treats properties as categorical. The primary reasons for this are that dispositionalists can invoke the irreducible modality in nature in order to explain the necessity of causation and natural laws (Mumford 2004), or to ground an actualist account of modality which permits us to explain what is necessary and what is possible in terms of actually existing properties (Jacobs 2010; Borghini and Williams 2008; Vetter 2015).

6. Properties and Natural Kinds

The world appears to contain kinds of stuff as a matter of natural fact: water, elephants, gold, carbon dioxide, humans, red dwarf stars and so on. We can class these as ‘natural kinds’ and they are especially useful for making inductive inferences to be used for prediction and explanation. What exactly is the relationship between these kinds and properties? Some philosophers, with an exceptionally relaxed view of kinds (or a minimalist view of properties), argue that kinds and properties coincide: that is, that something’s being of a certain kind K simply involves the instantiation of a property and vice versa. However, although it is intuitively plausible to associate kinds with properties in some way, there seem to be more properties than there are kinds. Carbon, elephants, or stars each behave in a variety of ways in virtue of belonging to their respective kinds, while red things, or those which have a mass of 1.1 grams, display a much more restricted range of causal behaviour. Nevertheless, one might still think that this difference is a difference of degree (Bird 2014, 2).

Furthermore, if we do not restrict ourselves to what might be considered natural properties, the mismatch between properties and kinds is magnified. If we are trying to characterize what makes something a natural kind, there are plenty of properties—especially in an abundant conception of properties—which do not seem to be very natural. If it is contentious to consider green things as forming a kind, it seems even more so to include grue ones, or those which instantiate properties such as being on the eighth page of the first novel I read this year, being married to an ice-hockey fan, or being next to a marmoset. In view of this problem, one can either declare that the sharing of such properties does not mark out individuals as a kind or that there are some kinds which are non-natural ones. If one chooses the latter option, there may be further questions about how individuals of such non-natural kinds relate to the properties which they instantiate.

The simplest explication of a natural kind is that the individuals which belong to it share a property or a collection of properties (with some properties being excluded, as noted above). A subset of natural properties, or comparatively more natural properties if one prefers Lewis’s account of property naturalness, determines which natural kinds there are. On this view, natural kinds would be a derivative category and one might choose to dispense with them entirely in favour of the properties or collections of properties which are essential to each individual of the kind. For example, the kind water is coextensive with having the property of being H2O, and we might call the latter the essence of water.

However, this essentialist view is difficult to sustain in the case of many paradigmatic examples of natural kinds, such as species. It is impossible to characterize exactly which properties determine that an individual tiger is a member of the kind tiger, in the sense of giving the properties which are necessary and sufficient for membership of the kind. Furthermore, because species evolve over time, there is not a good reason for thinking that the failure to find a set of properties which are necessary and sufficient for kind membership is an epistemological problem rather than an ontological one. The essentialist account of kinds does not easily account for kinds which appear to be able to change their natures.

Richard Boyd has suggested a characterisation of kinds which might be able to account for such changes in terms of the properties which exist (Boyd 1991, 1999; Millikan 1999). He argues that an entity is a natural kind in virtue of its being a cluster of properties which are commonly instantiated in the same individual, where such clusters are formed and maintained by a homeostatic mechanism. Such mechanisms are either intrinsic to the property cluster because some collections of properties are internally more stable than others, or they are extrinsic and the property cluster is maintained in a fairly stable state by the environment or some other causal mechanism. No property of the cluster need be necessary to the kind, nor need there be any property which is sufficient for kind membership, which allows for the existence of kinds which lack essences. Kinds can change because their individual members lose or gain a property, or because the extension of the kind changes such that novel individuals are included within it. Nevertheless, Boyd argues, the clustering occurs because such changes from a stable cluster have a lower chance of persisting. Thus, we can explain why the members of a species maintain the properties which they do while their environment remains stable and why they evolve as the environment changes when mutations may have a greater chance of survival.

7. Different Types of Properties

There are several useful distinctions between different types of properties. Often these are made to mark a metaphysical distinction between them, to draw attention to the fact that these different types of properties behave in significantly different ways in the same circumstances, or in order to treat them theoretically in different ways. The distinction between categorical and dispositional properties is one such distinction, which has been discussed at length above. Others are considered much more briefly in this section. In addition, the table at the end of this section includes definitions and examples of other types of properties.

a. Intrinsic and Extrinsic Properties

There is a kiwi fruit in my fruit bowl which has a huge variety of properties. It is (roughly) ellipsoid, brown, slightly hairy, bright green and white inside, it has black seeds, it is sweet, soft, contains about 10g sugar and 1g protein, weighs 63 grams and is 5cm in diameter. It is lying next to an over-ripe pear, was grown in New Zealand, is partially obscured by the electricity bill, has travelled farther than I have in the last year, is not Hillary Clinton, it has no beliefs about classical logic, and is being used in a philosophical example.

Intuitively, the properties listed in the former sentence are more important than those in the latter: the difference between the kiwi fruit and the pear is not marked by the fact that one was grown in New Zealand and the other was not (although that happens to be true), and because neither of them is Hillary Clinton and both are partially obscured by the electricity bill, those properties cannot be what mark the difference either. It would make no real difference to the kiwi fruit or its continued existence if the bill were moved from on top of it, but it will change if I get a knife and slice it in half. Not only do the properties in the former set seem to be what determine the real difference between the kiwi fruit and other things in the world, those properties are more likely to be causally efficacious: the kiwi fruit is nutritious because of them, will roll when put on a slope, and can be used to knock over small objects if your aim is good.

It would be philosophically useful to draw a distinction between the properties which (roughly speaking) a particular has in virtue of itself, its own nature, and those which it has due to its relations with other things: that is, those which are intrinsic properties and the extrinsic ones. But can we draw a principled distinction between them? Several bases for such a distinction have been suggested: some attempt to be purely logical and to avoid any commitment to a particular metaphysical position, whereas others can be classed as metaphysical criteria because their plausibility requires that one make certain assumptions about the way the world is.

It is worth noting that some properties can be intrinsic when instantiated by some individuals and extrinsic when instantiated by others. These properties are locally intrinsic or extrinsic. For instance, consider the properties being such that a dog exists or becoming nervous when encountering a dog. In either case, these properties will be extrinsic when instantiated by anything which is not a dog, but intrinsic when instantiated by a dog; thus, they are locally intrinsic properties. In what follows, the use of ‘intrinsic’ is confined to properties which are intrinsic when instantiated by any individual.

Lewis suggests that his ontologically elite perfectly natural properties are good candidates to determine intrinsicality. These properties, as we saw above (3b), are the most fundamental ones and ground the existence of other properties which are natural as a matter of degree. Perfectly natural properties determine the objective similarity and difference in the world, and thereby determine whether particulars are duplicates of each other or not. Intrinsic properties are just those properties which duplicates must share. Particulars can be duplicates of each other and differ in extrinsic properties.

However, accepting this criterion depends upon accepting Lewis’s claim that there is a set of such fundamental properties and, secondly, that those properties are intrinsic ones. Neither of these claims is without its detractors. The first claim is vulnerable to criticism from both maximalists about properties and those who deny the existence of a fundamental level to reality. Lewis’s second claim that all fundamental properties are intrinsic has been challenged on the grounds that some seemingly fundamental physical properties such as gravitational mass or spin might require the existence of other particulars to be instantiated. (See Bauer 2011; Allen 2018.) Moreover, even if one accepts Lewis’s minimalist metaphysical account of what the world contains (or something fairly close to it, such as Armstrong’s genuine universals), one might worry that ‘intrinsicality’ has been very closely inter-defined with ‘duplicate’ in this case: duplicates share all their intrinsic properties, while intrinsic properties are those shared between duplicates. Even if this criterion is correct, it does not go a long way towards explaining what an intrinsic property is.

Jaegwon Kim (1982) suggests that we can characterize the distinction in terms of loneliness: intrinsic properties are the properties a particular would have even if nothing else existed in the world. (This criterion requires only that no other contingently existing objects exist and does not exclude necessarily existing particulars, if there are any, such as numbers.) However, although an object’s being lonely is intuitively an extrinsic property, since being lonely depends for its instantiation on the absence of contingently existing objects, it turns out to be an intrinsic property on Kim’s criterion (Lewis 1983b, 198–9). Langton and Lewis (1998) suggest amending Kim’s criterion: an intrinsic property is one whose instantiation is independent of loneliness and accompaniment; that is, it is a property which can be possessed or lacked by a particular regardless of whether or not any distinct, contingently existing objects exist. However, this criterion is still not adequate, since some properties such as being spherical and lonely or non-spherical and accompanied turn out to be independent of loneliness and accompaniment, and thereby would count as being intrinsic. Langton and Lewis rule these disjunctive properties out by fiat, by characterising disjunctive properties as those which have disjuncts which are more natural than they are. (Recall Lewis’s account of naturalness in 3b above.) Accordingly, an intrinsic property is one which is independent of loneliness and accompaniment, and also is neither a disjunctive property nor the negation of a disjunctive property. As with Lewis’s original criterion based on duplication (which he does not reject in favour of the new criterion), Langton and Lewis’s criterion is a metaphysical one because it requires commitment to some kind of property hierarchy.
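The independence condition can be set out schematically. What follows is only an illustrative sketch in notation supplied here, not Langton and Lewis’s own: L(x) abbreviates ‘x is lonely’ (no distinct, contingently existing object coexists with x), A(x) abbreviates ‘x is accompanied’, and the diamond expresses metaphysical possibility. A property P is independent of loneliness and accompaniment when all four of the following cases are possible:

```latex
% Sketch only; requires amsmath and amssymb. L, A and P are illustrative symbols.
\begin{align*}
\text{(i)}\quad   & \Diamond\,\exists x\,\bigl(L(x) \land P(x)\bigr) \\
\text{(ii)}\quad  & \Diamond\,\exists x\,\bigl(L(x) \land \lnot P(x)\bigr) \\
\text{(iii)}\quad & \Diamond\,\exists x\,\bigl(A(x) \land P(x)\bigr) \\
\text{(iv)}\quad  & \Diamond\,\exists x\,\bigl(A(x) \land \lnot P(x)\bigr)
\end{align*}
```

On this rendering, being lonely fails (ii) and (iii), since nothing can be lonely without being lonely, or accompanied while lonely, so it is excluded as intended. The property of being spherical and lonely or non-spherical and accompanied, by contrast, satisfies all four cases (a lonely sphere, a lonely non-sphere, an accompanied non-sphere and an accompanied sphere, respectively), which is why the further restriction on disjunctive properties is needed.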

One might also be concerned about the scope of Langton and Lewis’s criterion since they specifically state that their criterion omits properties which involve particular entities, which they call impure properties, such as being Nelson Mandela or being more than fifty kilometres from Juba. In addition, the criterion makes all indiscriminately necessary properties, such as being such that 2 + 3 = 5, intrinsic as long as they are not disjunctive. (Lewis’s original duplication account, on the other hand, treats all indiscriminately necessary properties as intrinsic.) If this is the case, each particular has infinitely many more intrinsic properties than we would usually be inclined to attribute to it. One could exclude indiscriminately necessary properties from the criterion as well as impure properties, but the consequence of that would be an even less general criterion than before. In response, some philosophers have called for a more general criterion to distinguish between intrinsic and extrinsic properties which is able to take all properties into account.

One attempt to distinguish intrinsic and extrinsic properties on purely logical grounds is by defining extrinsicality. The instantiation of an extrinsic property by an individual consists in its bearing certain relations to at least one distinct individual, while properties which do not do this are intrinsic. We can call the former d-relational properties and maintain that properties which are not d-relational are intrinsic (Francescotti 1999, Harris 2010, 467). There are drawbacks to this account as well, however. First, it is not obvious that one can determine what counts as a ‘distinct individual’ without recourse to intrinsic and extrinsic properties, or else by introducing a metaphysical element into the criterion. If one individual’s being distinct from another requires their not having intrinsic properties in common, then we have made no progress. Second, one might be concerned about how we should deal with d-relations to abstract objects. If an individual can be d-related to abstract objects, then some properties turn out to be extrinsic which seem intuitively to be intrinsic: for instance, the sugar’s weighing 1 kilogram is extrinsic if 1 is an abstract object; in fact, all measurement properties would turn out to be extrinsic properties. On the other hand, if we accept that an individual’s relations to abstract objects cannot make the properties it instantiates d-relational, then indiscriminately necessary properties such as being such that 37 exists all turn out to be intrinsic, and this is another outcome we might hope to avoid.

As these and other suggested criteria have all turned out to be unsatisfactory, some philosophers have suggested that our intuitions about intrinsic and extrinsic properties are unstable and involve more than one division between properties. In this vein, Marshall (2016) suggests that intrinsicality covers three related types of properties: interior properties associated with an individual’s internal nature; properties preserved in duplication; and local properties which are necessarily ascribed to an individual on the basis of how it and its parts are. These, it is argued, play different roles in metaphysical explanation.

b. Accidental and Essential Properties

An individual can survive the loss of some properties and still retain its identity, while other properties are essential to it; were it to lose one of these latter properties it would no longer be the type of particular that it is. We can call the former properties accidental properties and the latter essential ones. For example, a dog is usually larger than a rabbit, has four legs, is domesticated and can swim; it also has a DNA profile similar to that of other dogs and has parents who are also dogs. A particular dog could lose a limb or be unable to swim, and it would still count as being a dog. But were an animal not to have dogs for parents, we would be unlikely to consider it to be a dog. (This example is employed for simplicity, but as noted above in Section 6, species are not really good examples of this distinction, since it is not obvious that there are properties which are essential to being a certain species.) Similarly, it is essential to a piece of gold that it has atomic number 79, but accidental that it is liquid or that it weighs two grams. The former essential property is shared by everything which counts as gold, whereas the latter properties are instantiated by the particular qua gold as a matter of contingent fact.

What is being given here is a modal characterisation of the distinction between accidental and essential properties: the former are those which a particular could lack while still being of the broader type that it is, while if something lacked its essential properties it would cease to exist (at least as the type of thing which it is). To put the point another way, a particular cannot lack its essential properties. Essentialism is the view that at least some particulars have essential properties.

At first glance, the modal characterisation of the distinction between accidental and essential properties fits well with our common-sense intuitions; the properties without which an individual could not exist seem intuitively to capture the essence of that individual. But this characterisation has been challenged because on closer inspection it turns out to classify a range of properties as essential which do not contribute to making a particular the kind of thing that it is. For instance, in this characterisation of the distinction, essential properties will turn out to include all of what we call indiscriminately necessary properties. These are properties which everything has, such as being such that 37 is a prime number or being such that the ratio of the circumference to the diameter of a circle is π. Since these properties are instantiated by everything, they do not intuitively contribute to making each individual what it is; they are not intuitively part of its essence. Furthermore, as Kit Fine (1994) pointed out, each individual has more specific properties necessarily which do not appear to determine that individual’s essential nature. For example, Socrates has the property of being the sole element of the singleton set containing Socrates (that is, being the sole member of {Socrates}), but that property is not, one would think, an essential property of Socrates the man. Fine argues that these examples are enough for us to abandon the modal characterisation of the distinction for an alternative.
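The modal characterisation, and the reason it over-generates, can be made explicit in the notation of quantified modal logic. The following is a sketch under stated assumptions: the symbols are supplied here for illustration, with E(x) for ‘x exists’, the box for metaphysical necessity and the diamond for possibility. It gives the simplest, existence-conditioned variant; a kind-relative reading would replace E(x) with membership of the relevant kind.

```latex
% Sketch only; requires amsmath and amssymb. E and P are illustrative symbols.
\begin{align*}
P \text{ is essential to } x  &\iff \Box\bigl(E(x) \rightarrow P(x)\bigr) \\
P \text{ is accidental to } x &\iff P(x) \land \Diamond\bigl(E(x) \land \lnot P(x)\bigr) \\[4pt]
\text{If } \Box\,\forall y\,P(y), \text{ then } & \Box\bigl(E(x) \rightarrow P(x)\bigr) \text{ for every } x
\end{align*}
```

The third line records the difficulty: any property which everything has necessarily satisfies the first condition trivially, whatever x is. Fine’s singleton example has the same shape, since necessarily, if Socrates exists, he is the sole member of {Socrates}.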

In view of this problem, amended accounts have been sought, including Fine’s own suggestion which is that essential properties contribute to the definition of an object, or amended modal criteria which attempt to rule out the problematic properties on the grounds that they are not intrinsic to the individuals in question (Denby 2014), are not locally necessary to the individuals (Correia 2007), or are not sparse properties (Wildman 2013, Cowling 2013). (See also Zalta 2006 for an alternative approach.) As with the attempts to distinguish intrinsic from extrinsic properties, there is a danger of close inter-definition here, and consequently one of circularity: it may not be possible to characterise the intrinsic-extrinsic distinction (say) without a grasp upon the essential-accidental distinction or the distinction between sparse and abundant properties, and vice versa, making the resulting explanations quite impoverished. From an ontological point of view, however, such inter-definition is acceptable, and one might feel justified in following Lewis and simply assuming that the characteristics of intrinsicality and sparseness go together, alongside being an essential property when such properties are present.

c. Monadic and Polyadic Properties

Thus far, this article has been primarily concerned with properties which, on each instantiation, are instantiated by one individual: properties such as being blue, being a cube, being an electron, or being a dog. These are monadic properties. However, many properties appear to require more than one individual to be instantiated: Edgar is friends with Julia, the cat is inside the box, Amir is in between Julia and Edgar, Julia is in the same class as Amir and Marie, and 2 is a common factor of 8, 10 and 12. These properties are more commonly known as relations, since they determine how one thing (or more) stands to others. But because they usually require more than one individual to be instantiated (or else, they relate one individual to itself), they are also known as polyadic properties, with their adicity capturing how many individuals are required to instantiate the property: Edgar is friends with Julia is the instantiation of a dyadic property, while being in between is a triadic property instantiated by Amir, Julia and Edgar, and so on.

The predicates of our natural languages allow for many cases in which the number of argument places of a predicate (its degree) is variable: ‘is friends with’ is two-place in the example above, but as ‘are friends with each other’ it could be three-place, four-place, five-place or more; similarly, ‘being in the same class as’ or ‘being a common factor of’ can vary in degree. In most formal logic, the degree of a predicate is fixed (for an exception, see Orilia 2000), but if we use natural, rather than formal, language as a guide to ontology, we might be tempted to think that the properties which correspond to these predicates can vary in their adicity. These are variably polyadic or multigrade properties which admit of a different number of participants in different circumstances.

We can distinguish internal relations from external ones (although philosophers disagree about what exactly they mean by ‘internal relation’). Briefly put, an internal relation is a relation which exists if its relata do. For instance, Ben Nevis is taller than Snowdon, but nothing more is needed for the is taller than relation holding between them than the existence of the two mountains at the heights which they actually are. On the other hand, being friends with each other is an external relation: the mere existence of Edgar and Julia is not sufficient to ensure that they are friends as they might never meet or may not get on; the relation of their being friends with each other exists in addition to the existence of its relata.

Internal relations (and hence the distinction between internal and external relations) are characterised in slightly different ways. For instance, Armstrong maintains that a relation is internal if its existence is necessitated by the intrinsic natures of its relata (1997, 87–9). In the case of Ben Nevis and Snowdon, their intrinsic properties of being the height that they are necessitate the existence of the relation of Ben Nevis being taller than Snowdon. On the other hand, Lewis claims that an internal relation is one which supervenes upon the internal nature of its relata. An earlier version of the distinction, proposed by G. E. Moore, is that a relation R between entities b and c is internal if the existence of b necessitates that b bears the relation R to c (1919, 47). Thus, in Moore’s case, only the existence of b is required for the relation between b and c to hold. Moore’s kind of internal relation has sometimes been distinguished as ‘super-internal’ where the existence of R is necessitated only in virtue of b’s intrinsic properties, or as simply a ‘one-sided’ relation when extrinsic features of b might also be relevant to necessitate the existence of relation R between b and c (see Bennett 2017, 192–4). Because internal relations exist if their relata do, their addition to the ontology (and employment in metaphysical theories) requires no additional ontological commitment over and above the entities they relate (and a general commitment to the existence of such relations). Thus, they have been described by Armstrong as ‘an ontological free lunch’ (1989, 56).

From a historical perspective, relations were not considered to be real entities, with the underlying motivation for this being the conviction that they could be reduced to or supervene upon monadic properties. However, such a reduction has never been fully explained. Furthermore, relations are regarded as being philosophically problematic for at least two reasons. The first is that even when external relations are instantiated, it is not clear where they are: Bangalore is south of New Delhi, but the relation being south of is not one of the properties which these two cities instantiate individually, so it is not located entirely where either of the cities is, and so one might wonder where the relation is. Perhaps its location is somehow divided between its relata, but it must be divided in such a way that the relation can be considered as one unified entity. Furthermore, Heil complains that relations do not fit neatly into our ontological categories of substance or attributes, that they are ‘neither fish nor fowl’ (2012, 141). But neither of these complaints counts decisively against the existence of irreducible relations: if they exist, they simply have to exist (and to have their location) in a way different than either substances or monadic attributes. Like Armstrong’s immanent universals which are wholly present in each of their instantiations, relations are not bound to behave in the same way as the objects and properties of ordinary middle-sized objects.

Another objection threatens the existence of external relations, a version of which was discussed in 4a. This is known as Bradley’s Regress (1893, 32–3). If relation R genuinely relates objects b and c, then R must be something to b and c. However, if R is something to b and c, then there must be a relation R’ which captures the relation between R and b and c. However, if R’ genuinely relates R, b and c, then there must be another relation R’’ which relates R’ to R, b and c; which in turn requires the existence of another relation R’’’, and so on. There is a regress of relations and thus, argues Bradley, the existence of external relations is impossible.
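The structure of the regress can be displayed schematically. This is only an illustrative rendering of the steps just described, not Bradley’s own notation; each line states the relational fact which the line above is said to require.

```latex
% Sketch only; requires amsmath. Each new relation takes the previous ones as relata.
\begin{align*}
& R(b, c) \\
& R'(R, b, c) \\
& R''(R', R, b, c) \\
& R'''(R'', R', R, b, c) \\
& \qquad \vdots
\end{align*}
```

At no stage is there a relational fact which does not call for a further one, and the adicity of the required relation grows by one at each step.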

There have been some attempts to solve Bradley’s Regress using relational tropes (Maurin 2010, 321–3) or facts (Armstrong 1989, 109–10); but, as MacBride has argued, these strategies rely upon assuming the coherence of relations in the first place (2011). Russell, on the other hand, adopts the alternative strategy of highlighting the indispensability of relations, such as spatio-temporal relations, to science (1924, 339). It is more likely, he argues, that there is something wrong with Bradley’s regress argument than that we are wrong to take so much of our fundamental science at face value.

A challenge for any philosophical account of relations, assuming now that they can be construed realistically, is how we should understand how non-symmetric relations make a contribution to different states of affairs. The same constituents (for instance, Edgar, Julia and the relation of seeing) can form two distinct states of affairs: Edgar sees Julia and Julia sees Edgar, which differ in relational order or differential application. Russell (1903, 218) became interested in giving an account of this relational order, a question which has been taken up in contemporary metaphysics (Hochberg 1987; Fine 2000; Orilia 2011). One might think of the difference between the two states of affairs as being explained by the relation having a direction, of the relation being directed from one relatum to another; or one might think that the positions or argument places of the relation are occupied in different ways. In this case, the argument place occupied by the one being seen is different from the one doing the seeing. Fine criticises these two accounts and suggests his own, non-local account of how we can explain differential application in terms of the other states of affairs into which a particular relation enters. Alternatively, MacBride has suggested that we should accept relational order as primitive, in the same way that most philosophers who accept real external relations avoid Bradley’s Regress by simply assuming that the fact that R relates b and c does not require further explanation (2014).

d. Determinable and Determinate Properties

Being vermilion or being crimson are specific cases of being red, which is itself a specific case of being coloured. Similarly, being triangular is a case of being shaped, and having a mass of 1.06 kilograms is a specific instance of having mass. This relationship between properties such as being coloured and being red, and then between being red and being crimson, is known as the determinable-determinate relation, where colour is the determinable and crimson is the determinate instance of it. Given that a property, such as being red, can be both determinable and determinate, a property’s status as determinable or determinate is usually regarded as a relative matter. The different determinates of a particular determinable often exclude one another (if something is red, it cannot be blue or green), and this was thought to be a defining feature of a determinable and its determinates, although this is not always the case, since one can argue that different determinate odours or tastes are compatible with each other (Armstrong 1978b, 113). Nevertheless, even in cases where determinates do exclude each other, the determinable does not appear to be simply the conjunction of all the determinates but something over and above that.

One philosophical question which arises as a result of this distinction is what the relationship between determinables and determinates is. One can be a realist about both determinates and determinables, at which point the further question arises about whether determinates are more ontologically fundamental than determinables; one can be a reductionist about determinables; or one can be an anti-realist about determinables.

One might wonder whether there are any ontologically irreducible determinable properties on epistemic grounds: perhaps we only have to refer to determinable entities such as colour and shape because of our perceptual or cognitive limitations. It is too complicated to think about the world in maximally specific terms, or we do not have the perceptual apparatus to be able to detect such maximal specificity; however, in the absence of these limitations, we would not require determinables. For example, see Heil (2003). However, for this argument to be plausible, and for the reduction or elimination of determinables to be possible, the world must be absolutely determinate and without metaphysical vagueness, and this too is a matter of philosophical debate. Nevertheless, the ontological conviction that the world is maximally determinate is an important motivation for reductive or anti-realist views.

On the other hand, the reality of irreducible determinables is problematic since it is not obvious that we can perceive determinables as such: we perceive shape in virtue of perceiving specific shapes, or colours in virtue of perceiving determinate colours. We do not seem to be aware of determinables as objects of our perceptions.

However, Prior (1949) suggests that determinables must be more than their determinates because determinates are similar with respect to those determinables: red, blue and orange are similar with respect to their colour as are being triangular and being oval with respect to their shape. For this respect to exist, one might argue, determinables must be ontologically independent of determinates and must be real. Furthermore, this ontological point is exploited by Fales to improve the epistemological situation with respect to determinables. He notes that we can perceive the specific similarity between determinates, and in doing so we must be indirectly aware of determinables (1990, 172).

A second argument for the existence of determinables comes from their role in laws of nature and the fact that they are postulated in scientific explanations. For instance, we think of Newton’s second law as holding between the determinables mass, force and acceleration, rather than there being infinitely many laws holding between determinate instances of these determinables. Furthermore, in chemical laws, the relevant relationship holds between determinables (between acids and alkalis, to give a simple example), and one might argue that the specific molecular features of the determinate substances are not important (Batterman 1998).

Realists about determinables have presented a variety of accounts, including an essentialist account (Yablo 1992) which treats determinables as having essences which are contained within the essences of their determinates; accounts based on the causal relations of the determinables being a subset of those of the determinates (Fales 1990); and a causal powers-based account in which causal powers of a determinable are a subset of those of any and all of its determinates (Wilson 1999).

The main version of reductionism about determinables treats them as disjunctions of all their determinates: being coloured is equivalent to being red or being blue or being green or . . . . One objection which is raised against this view is that it does not match the way we think about determinables. Moreover, it seems that someone might fully understand a determinable such as colour while having no conception of all the disjuncts of the disjunction (all the different colours) which make that determinable. In such cases it is not obvious how the reductionist can maintain that such a person understands the determinable in question. Furthermore, the assumption that the world is maximally determinate is questioned on the basis that it is thought to violate the principle of plenitude with respect to the possible ways the world might be. See also Bigelow and Pargetter (1990) for an alternative version of reductionism.

e. Qualitative and Non-Qualitative Properties

Prima facie, it appears that properties such as being blue, having a mass of 1 kilogram, or being an electron are different in kind to being Barack Obama, being such that 4 is an even number, and being the same weight as William Shakespeare, in the sense that the first set of properties apply to the individuals which instantiate them in virtue of the qualities those individuals have (and also, if they are extrinsic properties, in virtue of the qualities which other individuals have and the relations between them), while the latter do not. The latter class of properties includes haecceistic properties, impure properties and identity properties (and disjunctions and negations of these), as well as arguably including modal and temporal properties (being possible, being actual, being now) and mathematical properties. (See 7f for some examples of these and further definitions.)

Can we draw a distinction between qualitative and non-qualitative properties, and is there a criterion according to which we can do so? A principled distinction would be philosophically useful, since the distinction is already employed in its intuitive formulation: it is qualitative properties, not non-qualitative ones, which are shared by duplicates. Langton and Lewis’s distinction between intrinsic and extrinsic properties also applies only to qualitative properties (1998, and see 7a); laws of nature are taken to connect qualitative properties rather than non-qualitative ones, and furthermore, inductive inferences are considered illegitimate if the terms within them refer to non-qualitative properties (Hempel and Oppenheim 1948). In addition, claims about the truth of physicalism are usually restricted to claims about the ultimately physical nature of qualitative properties.

There has been some contemporary philosophical consideration of this distinction (Diekemper 2009; Cowling 2015). Reductive analyses of non-qualitative properties have attempted to account for them in terms of the linguistic attributes of the predicates which apply to them (that they always include proper names, for example), or have attempted to characterise non-qualitative properties as being those whose existence necessarily requires the existence of specific individuals (Rosenkrantz 1979). While this latter account is plausible for many positive non-qualitative properties—for instance, being Barack Obama requires the existence of Barack Obama—it does not work as well for negative non-qualitative properties such as being distinct from Barack Obama, since such a property might exist in the absence of Barack Obama himself. Alternatively, one might suggest that qualitative properties are specifically those which can be defined in an appropriate way from perfectly natural properties, or are those which supervene on them (Bricker 1996). Cowling (2015) finds all these alternatives problematic and advocates a primitivist approach to the distinction.

f. Technical Terms for Property Types

Since there are several specialised technical terms for different types of properties, it will be useful to list them here.

| Property Name | Description | Examples |
|---|---|---|
| Qualitative Properties, Pure Properties | General qualities | being green, having mass, being an armadillo, being near an iceberg |
| Existential Properties | Property that requires the existence of something or other (usually of a certain type) | being such that a cat exists, being such that a triangle exists |
| Haecceistic Properties, Identity Properties, Impure Properties | Property which involves a particular entity | being Marie Curie, being 300km from Bamako, not being written by David Lewis |
| Identity Properties | A subset of haecceistic properties involving being a particular thing | being Obama, being Marie Curie |
| Indiscriminately Necessary Properties | Property instantiated by every particular | being such that 6 + 6 = 12, being self-identical, being such that a triangle exists |

8. Realism about Properties: Do Properties Exist?

So far, this article has presupposed that properties exist mind-independently, or that at least some of them do. But this claim has been challenged for two main reasons. First, there are the concerns about there being constitutive identity and individuation criteria for properties which were raised in Section 2. Second, there are several interconnected epistemic worries about whether and how we are able to discover or to refer to the properties which exist mind-independently (Putnam 1981; Elgin 1995; Allen 2002). While these do not challenge the existence of properties directly, they remove some of the motivation for postulating that the world has objective qualitative joints of the kind which properties mark, since this motivation has traditionally been based upon the explanatory power which an ontology containing properties has. If we are not justified in our beliefs about which properties exist, it is hard to see how they can have any explanatory power.

Since such epistemic worries do not directly challenge the existence of properties unless one has a fairly strict requirement that the entities of our ontology be epistemically accessible to us, it remains open to the property theorist to advocate a kind of ‘Kantian humility’ about whether the properties which we think exist are the ones which there really are (Lewis 2009). If this attitude is acceptable, then properties can be employed in metaphysics whatever their epistemic relationship to us.

9. Properties in the History of Philosophy

Concern about how we should understand qualitative similarity was a prominent issue during several periods of philosophical history. Since the historical discussions of properties are varied and detailed, as well as sometimes being enmeshed with specific philosophical concerns of the time, it will be impossible to do justice to them here. Bearing this problem in mind, this article is restricted to considering the very first known theories of properties and then summarising other notable points at which discussion about properties became prominent.

a. Ancient Theories of Properties

In the philosophical traditions of both ancient Greece and ancient India, the phenomenon of similarity and difference between distinct things prompted a certain amount of consternation which became bound up with the desire to explain the even more troubling phenomena of persistence and change. Early philosophers could see—on the basis of their everyday experience—that there were different things around them which were nevertheless the same: entities could be equal and yet unequal, a phenomenon which was in danger of being contradictory. Some philosophers postulated the existence of different elements or substances to account for these similarities and differences, which led to pre-Socratic accounts of the world in which one element is more important or more fundamental than the others; there is an archê or material principle in virtue of which the other substance types come into existence. For Thales, the archê is water; for Heraclitus (in some interpretations) fire; while others preferred pluralistic accounts of the elements, such as Empedocles’ four: earth, air, fire and water.

However, these accounts of different elemental substances stop short of being property theories because they do not have a conception of entities which can be co-located with each other—that is, that can be instantiated in the same spatio-temporal region as each other—and which also perhaps inhere in a more fundamental substance. Thus, in such theories, it is particularly difficult to explain the phenomenon of change. If one has only substances and no properties, the causation of one thing B by another A appears to be a case of substance A being destroyed and substance B being created: if one melts sand and salt together and gets glass, it appears that the sand and salt have been destroyed and the glass created. Each case of change or causation is a radical transformation, conceptually equivalent to the creation of one substance simultaneously with the destruction of another. Furthermore, it appears that the glass has been created from something which is not glass; it was not clear how to explain the coming-into-existence of such things from what they are not, or even how change is possible at all. The explanatory situation is arguably even more serious since it does not just affect cases of substantial change, such as salt and sand turning into glass, but also seemingly insignificant changes such as a hot cup of coffee getting cooler or a solid ice cube becoming liquid as it warms. (See Parmenides’ On Nature, specifically The Way of Truth, which denies the existence of both change and differences of type.) Such problems with change gave rise to fruitful metaphysical discussions, only fragments of which survive today, and generated what became the first theories of properties. How good an account of properties and change any of the pre-Socratics managed to give is therefore a matter of controversy, although Marmodoro (2015) argues that Anaxagoras treated kinds of substances as powers, and several commentators have ascribed a sophisticated account to Heraclitus (Finkelberg 2017).

Perhaps the most famous account of properties from Ancient Greece can be attributed to Plato, who formulated the theory of forms, the first known version of a theory of universals. Plato presented what became known as ‘the One Over Many argument’ in which he argued that many particular F-things could also be one if they are regarded as instantiating or participating in a universal F-ness (Republic, 596a). This accounts for how distinct particulars can be qualitatively the same by grounding their qualitative similarity in the universal which they all instantiate, and thus avoids the contradictory claim that such particulars are both the same and different, or that they are equal and unequal at the same time. For instance, different cats are the same because they instantiate the universal cat and are different because they are distinct individuals. Further differences can be grounded by universals which some of the cats instantiate and others do not, such as being tabby, being fat, or being feral. In addition, Plato argued that the forms must transcend the instances of them: first, because exact (qualitative) equality between different particulars cannot be experienced in nature and thus cannot be due to relations between the particular objects themselves; and second because there are some forms of which no perfect instances exist, such as the perfect circle, although examples of imperfect circles abound.

Following Plato, Aristotle accepted that objective similarity and difference is grounded by forms or universals, but he denied that such entities are transcendent. In his view, universals are immanent, wholly present in each of their instances, rather than being abstract entities which exist independently of them. Furthermore, Aristotle made a distinction between properties or attributes and the substance in which they inhere, or the particular which instantiates them. In this view, some of the philosophical mystery concerning change is dissipated since an entity can persist while the properties which it instantiates change. Water instantiates solidity and cold when it is frozen and liquidity and (comparative) warmth as it heats up, but the water continues to exist. Such an ontology maps conveniently onto the different grammatical elements of our ordinary language (at least if we speak a language with subjects and predicates and adjectives and nouns) with the substances being picked out as the subject or the object, and adjectives or predicates referring to the properties. Substance types such as cat, human, or water are further determined by particulars instantiating immanent universals, and we can understand substantial change—the creation of water, for instance, in a chemical reaction—by a change in the properties instantiated by matter.

Another contrast between Aristotle’s view and the earlier one of Plato is in the nature of the properties or universals they postulated: for Plato, universals can enter into causal relations (despite being abstract objects) but they are predominantly required to determine which category or type of thing a particular is; whereas, for Aristotle, universals have essential causal powers to bring about certain effects in the appropriate circumstances. For Aristotle, a particular’s instantiating a universal gives it the potentiality to have an effect, an effect which will be actualised if the particular is in the appropriate conditions. An ice cube has the potentiality to melt in appropriately warm conditions even if the particular ice cube is never in an environment warmer than zero degrees Celsius. Aristotelian properties are essentially causal, which makes Aristotle’s view similar to that of the dispositionalists discussed in Section 5.

Early Indian philosophers encountered similar obstacles to the Greeks in attempting to understand the phenomena of persistence and change, which some early metaphysicians sought to alleviate by distinguishing quality from substance. For instance, Kaṇāda, founder of the Vaiśeṣika school, distinguishes three categories of existents: substance, quality and action, which together can provide an account of the constitution of the cosmos and the change within it (Kaṇāda, Vaiśeṣika Sūtra 8.14). Vaiśeṣika metaphysics, in conjunction with the broadly metaphysical realist Nyāya epistemological system founded by Akṣapāda Gautama, provides a sophisticated account of real and existent particulars and real universals according to which particular substances, qualities and actions fall into categories. The Vaiśeṣikas consider what is existent to be a subset of the real: universals are real but not existent because they are objective, mind-independent entities rather than unreal or imaginary ones, but they do not exist in the same sense as individual objects or qualities. Particular qualities are thus more fundamental than universals are for the Vaiśeṣika—the former exist and are real, whereas the latter are merely real—making Vaiśeṣika perhaps the earliest form of trope theory (Matilal 1990, ch. 4; Halbfass 1992, 122–7).

Universals are apprehended directly via perception and are eternal, unitary and located in a plurality of things; that is, like Aristotle’s account of them, they are immanent in that a universal is wholly present in every particular which instantiates it. Particular cows, or particular colours, or particular academic institutions, fall into the categories which they do because of the universals which they instantiate. Moreover, such universals can be further distinguished according to whether they determine natural or conventional classifications: cows and colours would be categorised as natural universals (jāti) while being an academic institution is an imposed classification (upādhi), determined as a matter of convention.

In common with objections to other, much later accounts of immanent universals (Armstrong 1978b), the early Buddhist philosopher Diṅnāga raised an objection to the Nyāya-Vaiśeṣika conception of a universal on the basis that a unitary entity’s being wholly present in multiple locations is incoherent. In the tenth century, Udayana attempted to provide a strict distinction between natural and imposed universals, and also placed restrictions upon the natural universals so that they could not fall foul of the problems associated with instantiation and self-instantiation noted in Section 5 (Udayana, Kiraṇāvalī). The development of this metaphysics of properties then continued in the school of Navya-Nyāya (or New Nyāya). See, for instance, Annambhaṭṭa’s The Manual of Reason.

b. Medieval Theories of Properties

The subject of properties came to the fore once again in 12th Century Western European philosophy, and questions about what grounds qualitative similarity became important. Peter Abelard and Guillaume de Champeaux debated the nature of universals, with the former developing either a form of nominalism, the view that universals are not objectively existing entities but are names, or a form of irrealism which did not seek to determine the ontological status of universals at all. Abelard argued that the realism about universals inherited from Boethius is incoherent since the instantiation of a universal by otherwise very different particulars would lead to contradictions. Both a frog and Aristotle instantiate the universal animal, but that makes the universal both irrational and rational, which is a contradiction.

The rediscovery of the works of Aristotle in Western Europe from the middle of the 12th Century onwards also encouraged the ongoing debate. William of Ockham formulated a version of nominalism which is sometimes regarded as an early trope theory, and Aquinas adopted aspects of Aristotle’s theory of universals and incorporated into them Aristotle’s notion of causal powers in order to explain qualitative similarity, the nature of change and natural necessity.

c. Properties and Enlightenment Science

The European Enlightenment changed the focus of discussions about properties away from ontological worries about what properties are towards concerns about how properties fit in with our scientific worldview. One result of this change of focus was the development of a distinction between properties which has become known as ‘the primary and secondary quality distinction’. Most famously espoused in the work of John Locke, the distinction was inherited by Locke from Galileo, Malebranche and Boyle, and was widely held in some form by scientists of the time who began to distinguish those properties which are perceived exactly as they exist in objects from those which are mediated by the senses (or in some versions of the distinction are entirely subjective). A tomato has its near-spherical shape objectively, but it does not have its red colour independently of being perceived by a conscious observer. Primary qualities, according to Locke, include Shape, Size, Motion, Number, Texture, and Solidity, while secondary qualities are Colour, Taste, Sound, Felt Texture and Smell. If there were no perceivers, the latter qualities would not exist, but that is not usually taken to imply that these qualities are entirely subjective and do not in any sense exist in the objects which appear to instantiate them. Rather, as Locke maintains, there is a causal relationship between the objects and our sensory system such that secondary qualities are caused by the primary qualities of objects with the effects being mediated by the senses; secondary qualities ‘are powers to produce various sensations in us’ (Locke, 1689, II, VIII, §10).

A second feature of early modern property theories involved growing empiricist distrust of the Aristotelian conception of properties as being causal powers, entities which make effects occur (in the appropriate circumstances) and thereby ground natural necessity. Most famously, David Hume found nothing in sensory experience—no corresponding sensory impression—which indicated the existence of necessary connexions in nature of the variety which causal powers might ground. For the strict empiricist, there is no reason to believe in the existence of unactualized possibilities or potentialities—potentialities which have not manifested their effects—when all that can be observed are the actual effects when they occur. I can never experience the potential of a sugar cube to dissolve in water; I can only observe its dissolving when it actually does so. For the strict empiricist, powers or potentialities are mysterious features of objects, beyond our possible experience, and so we should not postulate their existence.

10. References and Further Reading

  • Allen, S. R. 2002. Deepening the controversy over metaphysical realism. Philosophy 77: 519–541.
  • Allen, S. R. 2016. A Critical Introduction to Properties. London: Bloomsbury.
  • Annambhaṭṭa. Edited and translated by G. Bhattacharya. 1983. The Manual of Reason. Calcutta: Progressive Publishers.
  • Armstrong, D. M. 1978a. Universals and Scientific Realism. Volume 1. Cambridge: Cambridge University Press.
  • Armstrong, D. M. 1978b. Universals and Scientific Realism. Volume 2. Cambridge: Cambridge University Press.
  • Armstrong, D. M. 1980. Against ‘Ostrich Nominalism’. Pacific Philosophical Quarterly 61: 440–9.
  • Armstrong, D. M. 1983. What is a law of nature? Cambridge: Cambridge University Press.
  • Armstrong, D. M. 1989. Universals: An Opinionated Introduction. Boulder, CO: Westview Press. (pp 75–112 reprinted as ‘Universals as attributes’ in Loux (ed.), 2001: 65–91.)
  • Armstrong, D. M. 1992. Properties. In Mulligan (ed.), 1992: 14–27.
  • Armstrong, D. M. 1997. A World of States of Affairs. Cambridge: Cambridge University Press.
  • Armstrong, D. M. 1999. The causal theory of properties: properties according to Shoemaker, Ellis, and others. Philosophical Topics 26: 25–37.
  • Armstrong, D. M. 2004. Four Disputes about Properties. Synthese 144: 309–20.
  • Batterman, R. 1998. Why Equilibrium Statistical Mechanics Works: Universality and the Renormalization Group. Philosophy of Science 65: 183–208.
  • Bauer, William A. 2011. An argument for the extrinsic grounding of mass. Erkenntnis 74: 81–99.
  • Bealer, George. 1982. Quality and Concept. Oxford: Oxford University Press.
  • Bennett, Karen. 2017. Making things up. Oxford: Oxford University Press.
  • Bigelow, John, and Pargetter, R. 1990. Science and Necessity. Cambridge: Cambridge University Press.
  • Bird, A. 2007. Nature’s Metaphysics. Oxford: Oxford University Press.
  • Bird, A. 2014. Human Kinds, Interactive Kinds and Realism about Kinds. Unpublished Manuscript.
  • Bird, A. 2017. Manifesting Time and Space. In Jacobs (ed.), 2017: 127–138.
  • Black, R. 2000. Against quidditism. Australasian Journal of Philosophy 78: 87–104.
  • Borghini, A. and Williams, N. E. 2008. A dispositional theory of possibility. Dialectica 62: 21–41.
  • Boyd, R. 1991. Realism, Anti-Foundationalism and the Enthusiasm for Natural Kinds. Philosophical Studies 61: 127–148.
  • Boyd, R. 1999. Homeostasis, Species, and Higher Taxa. In Wilson (ed.), 1999: 141–186.
  • Braddon-Mitchell, D. and Nolan, R. (eds.). 2009. Conceptual Analysis and Philosophical Naturalism. Boston, MA: MIT Press.
  • Bradley, F. H. 1893. Appearance and Reality. London: Swan Sonnenschein.
  • Bricker, P. 1996. Isolation and unification: The realist analysis of possible worlds. Philosophical Studies 84: 225–238.
  • Broad, C. D. 1933. Examination of McTaggart’s Philosophy: Vol. 1. Cambridge: Cambridge University Press.
  • Carnap, R. 1928. The Logical Structure of the World. Berkeley: University of California Press.
  • Carnap, R. 1936–7. Testability and Meaning. Philosophy of Science 3: 419–471 and 4: 1–40.
  • Cartwright, N. 1989. Nature’s Capacities and their Measurement. Oxford: Oxford University Press.
  • Choi, S. 2008. Dispositional Properties and Counterfactual Conditionals. Mind 117: 795–841.
  • Contessa, G. 2015. Only powers can confer dispositions. Philosophical Quarterly 65: 160–76.
  • Correia, F. 2007. (Finean) Essence and (Priorean) Modality. Dialectica 61: 63–84.
  • Cowling, S. 2013. The Modal View of Essence. Canadian Journal of Philosophy 43: 248–266.
  • Cowling, S. 2015. Non-Qualitative Properties. Erkenntnis 80: 275–301.
  • Denby, D. 2014. Essence and Intrinsicality. In R. Francescotti (ed.), 2014: 87–109.
  • Devitt, Michael. 1980. ‘Ostrich Nominalism’ or ‘Mirage Realism’. Pacific Philosophical Quarterly 61: 433–9.
  • Diekemper, J. 2009. Thisness and events. Journal of Philosophy 106: 255–276.
  • Ehring, Douglas. 2011. Tropes: Properties, Objects and Mental Causation. Oxford: Oxford University Press.
  • Elgin, Catherine Z. 1995. Unnatural science. Journal of Philosophy 92: 289–302.
  • Ellis, B. 2001. Scientific Essentialism. Cambridge: Cambridge University Press.
  • Fales, Evan. 1990. Causation and Universals. London: Routledge.
  • Fine, K. 1994. Essence and Modality. Philosophical Perspectives 8: 1–16.
  • Fine, K. 2000. Neutral Relations. Philosophical Review 109: 1–33.
  • Finkelberg, A. 2017. Heraclitus and Thales’ Conceptual Scheme. Leiden: Koninklijke Brill.
  • Francescotti, Robert. 1999. How to define intrinsic properties. Noûs 33: 590–609.
  • Francescotti, Robert. (ed.) 2014. Companion to Intrinsic Properties. Berlin: De Gruyter.
  • Frege, Gottlob. 1884. Die Grundlagen der Arithmetik. Translated by J. L Austin (1950, second edition 1968) as The Foundations of Arithmetic. Evanston, IL: Northwestern University Press.
  • Gautama, Akṣapāda. Nyāya Sūtra.
  • Goodman, N. 1954. Fact, Fiction and Forecast. Cambridge, MA: Harvard University Press.
  • Halbfass, W. 1992. On being and what there is: Classical Vaiśeṣika and the History of Indian Ontology. Albany: State University of New York Press.
  • Handfield, T. 2005. Armstrong and the Modal Inversion of Dispositions. Philosophical Quarterly 55: 452–61.
  • Harris, R. 2010. How to define extrinsic properties. Axiomathes 20: 461–478.
  • Hawthorne, J. 2001. Intrinsic properties and natural relations. Philosophy and Phenomenological Research 63: 399–403.
  • Heil, John. 2003. From an Ontological Point of View. Oxford: Oxford University Press.
  • Heil, John. 2012. The Universe As We Find It. Oxford: Oxford University Press.
  • Hempel, C. and Oppenheim, P. 1948. Studies in the logic of explanation. Philosophy of Science 15: 135–175.
  • Hirsch, E. 1993. Dividing Reality. Oxford: Oxford University Press.
  • Hochberg, H. 1987. Russell’s Analysis of Relational Predication and the Asymmetry of the Predication Relation. Philosophia 17: 439–59.
  • Hume, David. 1777. (Third Edition: 1975.) An Enquiry Concerning Human Understanding. Oxford: Clarendon Press.
  • Jacobs, Jonathan D. A powers theory of modality. Or, how I learned to stop worrying and reject possible worlds. Philosophical Studies 151: 227–48.
  • Jacobs, Jonathan D. (ed.). 2017. Causal Powers. Oxford: Oxford University Press.
  • Kaṇāda. Vaiśeṣika Sūtra.
  • Kim, Jaegwon. 1982. Psychophysical supervenience. Philosophical Studies 41: 51–70. Reprinted in Kim, 1993: 175–193.
  • Kim, Jaegwon. 1993. Supervenience and Mind. Cambridge: Cambridge University Press.
  • Kistler, M. 2002. The causal criterion of reality and the necessity of laws of nature. Metaphysica 3: 57–86.
  • Langton, Rae and Lewis, D. 1998. Defining ‘intrinsic’. Philosophy and Phenomenological Research 58: 333–345.
  • Lewis, David. 1973. Counterfactuals. Cambridge: Harvard University Press.
  • Lewis, David. 1983a. New work for a theory of universals. Australasian Journal of Philosophy 61: 343–77. Reprinted in Mellor and Oliver (eds.), 1997: 190–227.
  • Lewis, David. 1983b. Extrinsic properties. Philosophical Studies 44: 197–200.
  • Lewis, David. 1986. On the Plurality of Worlds. Oxford: Blackwell.
  • Lewis, David. 1994. Humean Supervenience Debugged. Mind 103: 473–490.
  • Lewis, David. 1997. Finkish Dispositions. The Philosophical Quarterly 47: 143–158.
  • Lewis, David. 2009. Ramseyan humility. In Braddon-Mitchell and Nolan (eds.), 2009: 203–222.
  • Locke, D. 2012. Quidditism without Quiddities. Philosophical Studies 160: 345–363.
  • Locke, John. 1689. An Essay Concerning Human Understanding.
  • Loux, Michael J. (ed.). 2001. Metaphysics: Contemporary Readings. London: Routledge.
  • MacBride, Fraser. 2011. Relations and Truth-Making. Proceedings of the Aristotelian Society CXI: 159–76.
  • Manley, D. and Wasserman, R. 2008. On Linking Dispositions and Conditionals. Mind 117: 59–84.
  • Marmodoro, Anna. 2010a. Do powers need powers to make them powerful? In Marmodoro (ed.), 2010b: 337–352.
  • Marmodoro, Anna. (ed.). 2010b. The Metaphysics of Powers: their grounding and their manifestation. London: Routledge.
  • Marmodoro, Anna. 2015. Everything in Everything. Oxford: Oxford University Press.
  • Marshall, D. 2016. The Varieties of Intrinsicality. Philosophy and Phenomenological Research 92: 237–263.
  • Martin, C. B. 1994. Dispositions and Conditionals. Philosophical Quarterly 44: 1–8.
  • Matilal, Bimal Krishna. 1990. Logic, Language and Reality. New Delhi: Motilal Banarsidass Publishing.
  • Maurin, Anna-Sofia. 2002. If Tropes. Dordrecht, The Netherlands: Kluwer Academic Publishers.
  • Maurin, Anna-Sofia. 2010. Trope theory and the Bradley regress. Synthese 175: 311–326.
  • McGowan, Mary-Kate. 2002. The Neglected Controversy over Metaphysical Realism. Philosophy 77: 5–21.
  • Mellor, D H. and Oliver, A. (eds.). 1997. Properties. Oxford: Oxford University Press.
  • Millikan, R G. 1999. Historical Kinds and the Special Sciences. Philosophical Studies 95: 45–65.
  • Molnar, G. 2003. Powers: a study in metaphysics. Oxford: Oxford University Press.
  • Moore, G E. 1919. External and internal relations. Proceedings of the Aristotelian Society 20: 40–62.
  • Mulligan, K. (ed.). 1992. Language, Truth and Ontology. Dordrecht, The Netherlands: Kluwer Academic Publishers.
  • Mumford, S. 1998. Dispositions. Oxford: Oxford University Press.
  • Mumford, S. 2004. Laws in Nature. London: Routledge.
  • Mumford, S. and Anjum, R. L. 2011. Getting Causes From Powers. Oxford: Oxford University Press.
  • Nolan, Daniel. 2014. Hyperintensional metaphysics. Philosophical Studies 171: 149–160.
  • Orilia, Francesco. 2000. Argument Deletion, Thematic Roles, and Leibniz’s Logico-grammatical Analysis of Relations. History and Philosophy of Logic 21: 147–162.
  • Orilia, Francesco. 2006. States of affairs. Bradley vs. Meinong. In Raspa (ed.), 2006: 213–238.
  • Orilia, Francesco. 2011. Relational Order and Onto-Thematic Roles. Metaphysica 12: 1–18.
  • Parmenides. On Nature.
  • Plato. Parmenides.
  • Plato. Phaedo.
  • Plato. The Republic.
  • Prior, Arthur N. 1949. Determinables, Determinates, and Determinants (I, II). Mind 58 (229): 1–20, 58 (230): 178–94.
  • Prior, E. 1985. Dispositions. Aberdeen: Aberdeen University Press.
  • Putnam, H. 1981. Reason, Truth and History. Cambridge: Cambridge University Press.
  • Quine, W. V. 1948. On what there is. The Review of Metaphysics. Reprinted in Quine, 1953: 1–19.
  • Quine, W. V. 1953 (Second Edition 1960). From a Logical Point of View. Cambridge, MA: Harvard University Press.
  • Quine, W. V. 1960. Word and Object. Cambridge, MA: MIT Press.
  • Raspa, V. (ed.). 2006.  Meinongian Issues in Contemporary Italian Philosophy. Frankfurt: Ontos.
  • Rodriguez-Pereyra, G. 2002. Resemblance Nominalism. Oxford: Oxford University Press.
  • Rosenkrantz, G. 1979. The pure and the impure. Logique et Analyse 88: 515–523.
  • Russell, B. 1903. The Principles of Mathematics. London: George Allen & Unwin.
  • Russell, B. 1905. On denoting. In Russell, 1994: 415–27.
  • Russell, B. 1924. Logical Atomism. Reprinted in his Logic and Knowledge: Essays 1901–1950, R C Marsh (ed.), London: George Allen & Unwin Ltd: 323–43.
  • Russell, B. 1994. The collected Papers of Bertrand Russell 4. London: Routledge.
  • Ryle, G. 1949. The Concept of Mind. London: Penguin.
  • Schaffer, J. 2003. Is there a fundamental level? Noûs 37: 498–517.
  • Schaffer, J. 2005. Quiddistic knowledge. Philosophical Studies 123: 1–32.
  • Schroer, Robert. 2013. Can a single property be both dispositional and categorical? The “Partial Consideration Strategy” partially considered. Metaphysica 14: 63–77.
  • Shoemaker, S. 1980. Causality and Properties. Reprinted in Mellor and Oliver (eds.), 1997: 228–254.
  • Sider, Theodore. 1993. Intrinsic properties. Philosophical Studies 83: 1–27.
  • Swoyer, Chris. 1982. The nature of natural laws. Australasian Journal of Philosophy 60: 203–223.
  • Udayana. Kiraṇāvalī.
  • Vetter, B. 2015. Potentiality: From Dispositions to Modality. Oxford: Oxford University Press.
  • Wildman, N. 2013. Modality, Sparsity, and Essence. Philosophical Quarterly 63: 760–782.
  • Williams, Neil E. 2017. Powerful Perdurance: Linking Parts with Powers. In Jacobs (ed.), 2017: 139–164.
  • Wilson, Jessica M. 1999. How Superduper Does a Physicalist Supervenience Need to Be? The Philosophical Quarterly 49: 33–52.
  • Wilson, R (ed.). 1999. Species: New Interdisciplinary Essays. Cambridge, MA: MIT Press.
  • Yablo, Stephen. 1992. Mental Causation. The Philosophical Review 101: 245–280.
  • Zalta, Edward N. 1983. Abstract Objects: An Introduction to Axiomatic Metaphysics. Dordrecht: D. Reidel.
  • Zalta, Edward N. 1988. Intensional Logic and the Metaphysics of Intentionality. Cambridge, MA: MIT Press.
  • Zalta, Edward N. 2006. Essence and Modality. Mind 115: 659–693.

 

Author Information

Sophie Allen
Email: s.r.allen@keele.ac.uk
University of Keele
United Kingdom

Paulo Freire (1921–1997)

Paulo Freire was one of the most influential philosophers of education of the twentieth century. He worked wholeheartedly to help people both through his philosophy and his practice of critical pedagogy. A native of Brazil, Freire aimed to eradicate illiteracy among people from previously colonized countries and continents. His insights were rooted in the social and political realities of the children and grandchildren of former slaves. His ideas, life, and work served to ameliorate the living conditions of oppressed people.

This article examines key events in Freire’s life, as well as his ideas regarding pedagogy and political philosophy. In particular, it examines conscientização, critical pedagogy, Freire’s criticism of the banking model of education, and the process of internalization of one’s oppressors. As a humanist, Freire defended the theses that: (a) it is every person’s ontological vocation to become more human; (b) both the oppressor and the oppressed are diminished in their humanity when their relationship is characterized by oppressive dynamics; (c) through the process of conscientização, the oppressors and oppressed can come to understand their own power; and (d) ultimately the oppressed will be able to authentically change their circumstances only if their intentions and actions are consistent with their goal.

Table of Contents

  1. Colonized Brazil
  2. Early Years
  3. Influences on Freire
  4. Literacy Campaign
  5. Philosophical Contributions
    a. Critical Pedagogy Versus the Banking Model of Education
    b. Internalization
    c. Conscientização
    d. Freedom
  6. Pedagogy of the Oppressed
    a. Chapter 1
    b. Chapter 2
    c. Chapter 3
    d. Chapter 4
  7. Exile Years
  8. Return to Brazil
  9. Working Assumptions
  10. Criticisms
  11. Legacy
  12. References and Further Reading

1. Colonized Brazil

In order to better understand Paulo Freire’s ideas and his work, it is important to consider the context in which Freire developed his philosophy. Freire’s context was the northeastern region of Brazil from the 1930s through the 1960s. Brazil was a Portuguese colony from 1500 to 1822. As was the case with other American colonies, most of the Indigenous people of Brazil perished due to the harsh, forced labor conditions and because they did not have any immunity to European diseases. Some of the natives who survived were enslaved in engenhos (sugar mills). Since most of the Indigenous population died, the owners of the engenhos bought African people as slaves to work and to increase the production of sugar, which was one of the main Brazilian exports during the years Brazil was a Portuguese colony.

Most of the Brazilian population during the years of Portuguese colonization was of Indigenous and African descent. There was very little movement of Portuguese immigrants into Brazil. To the Portuguese, Brazil was primarily a commercial enterprise that allowed them to exploit the Brazilian resources in order to rival England and Holland economically. Newspapers were not published in Brazil until 1808, and literacy among the vast majority of Brazilians was simply nonexistent.

Freire’s life and work continue to ameliorate the aftermath of 400 years of colonization and slavery on the American continent. Slavery was officially abolished in Brazil in 1888, during a period of economic growth that followed Brazil’s independence from Portugal in 1822. However, even during the mid-20th century, the economic conditions for many Brazilians were so dire and the hunger they experienced so unbearable that many farmers sold themselves or members of their families into slavery in order to avoid starving.

2. Early Years

Paulo Reglus Neves Freire was born in Recife in 1921. Freire experienced firsthand the political instability as well as the economic hardships of the 1930s. Freire’s father died during the economic depression of the thirties, and as a young child, Freire came to know the crippling and dehumanizing effects of hunger. Young Freire found himself forced by circumstances to steal food for his family, and he ultimately dropped out of elementary school to work and help his family financially. It was through these hardships that Freire developed his unyielding sense of solidarity with the poor. From childhood on, Freire made a conscious commitment to work to improve the conditions of marginalized people.

Freire managed to finish elementary school between Recife and Jaboatão and later attended the secondary school, Oswaldo Cruz, in Recife. Aluízio Pessoa de Araújo, the principal of Oswaldo Cruz secondary school, agreed to allow Freire to study at a reduced tuition because Freire’s family could not afford to pay the full tuition. To reciprocate the favor, Freire began to teach Portuguese classes at Oswaldo Cruz in 1942. Freire then went on to study law at Recife’s School of Law from 1943 to 1947.

3. Influences on Freire

Paulo Freire’s thought and work were primarily influenced by his historical context, the history of Brazil, and his own experiences. Some of the early and lasting influences on Freire were his parents, his preschool teacher, and Aluízio Pessoa de Araújo, the principal of Oswaldo Cruz secondary school. The ideas that contributed to the development of Freire’s philosophy and work are existentialism, phenomenology, humanism, Marxism, and Christianity. The ideas of G. W. F. Hegel, Karl Marx, Anísio Teixeira, John Dewey, Albert Memmi, Erich Fromm, Frantz Fanon, and Antonio Gramsci were Freire’s major influences.

Freire learned tolerance and love from his parents. Freire’s father died in 1934 due to complications from arteriosclerosis. Freire was 13 years old. Freire’s mother assumed the responsibility of providing for her four children. Even though Freire’s childhood was not an easy one due to the death of his father and the economic conditions of the 1930s, Freire’s parents had created an environment of tolerance and understanding in his home.

Eunice Vasconcelos was Freire’s preschool teacher, and she greatly influenced his understanding of school and learning. Because of this experience, Freire came to love learning, and he came to see school as a place where one is encouraged to explore one’s curiosity. Another important influence on Freire was Aluízio Pessoa de Araújo. Freire’s mother, Edeltrudes, approached Araújo to ask if young Freire could study at his school. The only problem was that Edeltrudes was not able to pay for Freire’s tuition. Araújo accepted Freire into the school anyway because he was committed to teaching for the sake of helping people, and this proved to be a lasting influence on Freire.

Freire’s thought was deeply influenced by a number of G. W. F. Hegel’s ideas. Most notable are Hegel’s process metaphysics, social ethics, phenomenology, and the tension of the master versus slave dialectic. Throughout his writings, Freire makes the claim that the ontological vocation of all human beings is to become more human. While many of Freire’s readers and critics speculate that Freire assumes a substance metaphysics that reifies some types of human nature, other interpretations assume a Hegelian process metaphysics. If we assume the validity of this latter interpretation, then just as the unfolding of history culminates in Absolute Spirit for Hegel, similarly with Freire, it is the process of becoming that is important. Freire was also influenced by Hegel’s communitarianism and worked with individual students always with the aim of benefiting the community as a whole. Freire understood the importance of empowering individuals (positive rights) and protecting them (negative rights), which follows from his understanding of the role and importance of the community and his commitment to its betterment. Freire also adopted phenomenology as his preferred method not only for making sense of his context, but also for figuring out a way to help his students learn about their own contexts. Freire used phenomenology’s emphasis on subjectivity to help his students understand their own realities through their learning of language, or as Freire called it, “the word,” and to learn together how to speak their word. Hegel’s tension of the master versus slave dialectic became for Freire the tension between the oppressor and the oppressed.

Karl Marx’s ideas were among the foremost influences on Freire’s own philosophy. Among the ideas from Marx that influenced Freire are Marx’s class consciousness, his concept of labor, and false consciousness. For Marx, when a person gains class consciousness, they become cognizant of their economic place in their society and thus of their class interests. Freire’s concept of conscientização points to the process of becoming aware not only of one’s class, but also more broadly of the roles one’s race, gender, physical ability, and so forth play in our society. Freire, like Marx, believed that it is through our work that humans can change the world. Whether Freire’s students were construction workers, janitors, factory workers, or shoemakers, Freire used their work and the words for their tools both to teach them how to read and write and to share with them how each of them transformed the world and made their world through their work. Just as Marx pointed to the spiritual loss that workers experienced from alienated labor, Freire aimed to prevent this loss and restore human dignity to the work of his students by sharing with them the transformative power of their work. What Freire refers to as the internalization of a master has its basis in Marx’s concept of false consciousness. For Marx, false consciousness takes place whenever a member of the proletariat mistakenly believes that they are not being exploited, or that by working harder, they will some day gain economic stability and freedom. For Freire, Marx’s false consciousness takes place when the oppressed internalize the ideology of the oppressor.

Freire was also influenced by Anísio Teixeira’s work and philosophy. Teixeira’s work called for the democratization of Brazilian society through education. Teixeira opposed the education of his time, which was exclusive to the upper classes and thus promoted a social elitism that left the majority of Brazilians without access to education. Teixeira worked toward establishing a free, public, secular education that would be accessible to everyone. Freire was moved by Teixeira’s questioning of why the average Brazilian did not embrace a democratic spirit, and both Teixeira and Freire agreed this was due to the traditionally hierarchical and authoritarian ways in which people had related to each other during the time that Brazil had been a Portuguese colony, and afterward while slavery continued to be an institution in Brazil. Freire, like Teixeira, believed in and worked toward the possibility of developing a democratic sensibility through education.

John Dewey’s philosophy of education was another influence on Freire’s philosophy and work, particularly with regard to classroom dynamics and the dynamic between the teacher and the students. Teixeira had been a student of Dewey, and the importance of fostering a democratic sensibility through education became central to Freire. Freire believed the classroom was a place where social change could take place. Freire, like Dewey, believed that students should play an active role in their own learning, instead of being passive recipients of knowledge. Consequently, Dewey and Freire both agreed that the ideal teacher would be open-minded and confident: confident in their competence while also open to sharing with and learning from their students. Both Dewey and Freire were critical of teachers whose dispositions were undemocratic, who transmitted information from the expert to the student, and who lacked the curiosity and confidence to continue learning from their students.

Existentialism was another significant influence on Freire’s philosophy. Freire believed that human beings are free to choose and thus responsible for their choices. While Freire did very much take into account the historical context created by the legacy of slavery in Brazil, he never believed the historical conditions determined the future for him, his students, or Brazilian society. On the contrary, Freire espoused the existential belief that humans need not be determined by the past. When Freire taught literacy classes, he did not only teach his students how to read and write. He also shared conscientização and, with this, the awareness that his students were free to choose the life they created for themselves.

Erich Fromm’s ideas also helped Freire discern how to bring about human liberation vis-à-vis the dominant ideology of Brazil at the time. Before Critical Theory, human reason was interpreted to be our source of rational, autonomous choices and enlightened dialogue. Marx problematized this assumption, however, when he pointed to false consciousness as one of the ways through which the dominant ideology becomes an instrument of domination that controls human choices and promotes alienation. Freire relied on Fromm’s understanding of human freedom and Fromm’s discussion of control to come to his own understanding of the dynamic between the oppressors and the oppressed. Like the existentialists before him, Fromm advocated the creation of human values instead of following pre-established and unquestioned norms. Freire was influenced by Fromm’s understanding of freedom to develop the liberatory praxis of critical pedagogy whereby the people in the classroom contribute to each other’s conscientização and thus embrace and claim their own freedom. In order to explain the difference between humanism and humanitarianism, Freire used the biophilic and necrophilic concepts from Fromm. In his book The Heart of Man (1967), Fromm distinguishes between two types of approaches to helping others. One approach is to feel the need to control the situation and the people who are being helped. The other approach is to allow the situation and the people to be what they potentially may be. Fromm characterizes the people who feel the need to control as necrophilic because in their need to control other people and the events in life itself, they deny people and life their own possibilities. According to Fromm, those who are able to allow other people and events to unfold into what they may become are characterized as being biophilic because they respect the freedom and creativity of human beings and trust in the unfolding of life’s events.

The ideas of Albert Memmi and Frantz Fanon helped Freire to make sense first of the Brazilian and then the Latin American, African, and Asian colonized experience. Although Freire was deeply influenced by Marx’s analysis of economic classes, the Brazilian and Latin American histories could not be understood by class analysis alone due to the history of colonization and slavery. Freire agreed with Memmi that the primary reason for colonization was economic. Freire believed there were two reasons why the literacy rate was so low in northeastern Brazil. The first was because the Portuguese were primarily concerned with the economic exploitation of Brazil and its people. As was the case in other Latin American countries, Catholic priests did educate some of the people and advanced to some degree the interests of the natives; however, according to Freire’s understanding, and influenced by Memmi, the colonization of Brazil was first and foremost an economic endeavor. The exploitation of the land’s resources and the people’s labor through the institution of slavery and the aftermath of slavery was the second reason the literacy rate was extremely low. In agreement with Teixeira, Freire believed the lack of democratic sensibility and education in Brazil was precisely due to the history of colonization in Brazil.

Besides Memmi, Fanon was deeply influential in Freire’s understanding of the colonized experience. Perhaps the most salient influence of Fanon on Freire was Fanon’s idea that the oppressed must be actively engaged at every step of gaining their own freedom. In other words, the oppressed cannot and should not be liberated by anyone other than themselves. Fanon’s discussion of language, in his case the difference between “proper French” and his Creole French, also influenced Freire’s understanding and teaching of Portuguese in such a way that Freire always acknowledged the legitimacy of his students’ way of speaking the Portuguese language.

Freire’s philosophical development was also influenced by several of Antonio Gramsci’s ideas. Gramsci’s idea of the organic intellectual influenced Freire to believe in the importance of educating and fostering the development of his working-class students. Influenced by Fanon and Gramsci, Freire was committed to the idea and practice of legitimizing the experiences and knowledge of his students so that organic intellectuals would emerge. These organic intellectuals would in turn be in the best position to contribute to the solutions of the community’s problems since they would know their community, the intricacies of their context, and their problems and solutions better than any expert who had studied the problem merely academically.

Equally important to the theoretical influences here mentioned was the spiritual influence that Christianity had on Freire’s philosophy. Freire was particularly influenced by liberation theology as it developed in Latin America. Liberation theology prioritized fighting poverty, political activism, practice, and social justice. Freire’s philosophy was very much in line with the grassroots, bottom-up organization of liberation theology, which emphasized the importance of practicing the teachings of Jesus Christ instead of obediently following the established orthodox church hierarchy.

4. Literacy Campaign

Paulo Freire began to work with illiterate peasants and workers in the northeastern region of Brazil in 1947, and by the beginning of the 1960s, he had organized a popular movement to eradicate illiteracy. Due to the Portuguese colonization of Brazil, as well as the institution of slavery, the literacy level of most Brazilians was extremely low. The population of the northeastern region of Brazil in 1962 was 25 million, and of these, approximately 15 million were illiterate.

In 1947, when Freire was 26 years old and while he was still teaching language classes at Oswaldo Cruz secondary school, he began to work at the government agency called the Serviço Social da Indústria (SESI). He was appointed to work as an assistant in the Division of Public Relations, Education and Culture. The goal of this agency was to provide social services in the areas of health, housing, education, and leisure for the Brazilian working class.

Freire worked at SESI for 10 years, and during this time, he learned a great deal about the Brazilian working class and the Brazilian school system that informed how he would later develop as a teacher and political thinker. Freire worked closely with the schools, examining how policy was made and how it affected the quality of education for the students. It was during this time that Freire noticed how some of the Brazilian working-class parents were raising their children. Although Freire had been brought up in a tolerant environment, this was not the case in most other homes. Freire came to SESI with a democratic sensibility; however, he was met with what seemed to be a type of conditioned authoritarianism that affected how parents related to their children and how teachers approached their teaching. Physical punishment of children was often used by both parents and teachers. Freire noticed that the harsh physical punishment the children were subjected to did not serve the intended purpose; instead, children were alienated from their parents and teachers, and an environment of harsh authoritarianism was more firmly established. Consequently, Freire began training teachers and parents in more tolerant ways of teaching and disciplining their children.

During the 10 years that Freire worked for SESI, he gathered many experiences that would later help him shape his doctoral studies and dissertation at the University of Recife. After his work for SESI, Freire accepted a position as a consultant for the Division of Research and Planning. It was during this time that Freire began to establish himself as a progressive educator. He conducted studies in adult education and marginal populations and presented these at national adult education conferences. His early ideas concerned cooperative decision-making, social participation, and political responsibility. Freire did not see education as merely a way to master academic standards or skills that would help a person professionally. Instead, he cared that learners understood their social problems and that they discovered themselves as creative agents. In 1959, Freire completed his doctoral dissertation titled Educação e Atualidade Brasileira (Present-day Education in Brazil).

In 1961, the mayor of Recife, Miguel Arraes, asked Freire to help develop literacy programs for the city. The goal of these programs was primarily to encourage literacy among the working class, to foster a democratic climate, and to preserve their Indigenous traditions, beliefs, and culture. It was during this time that Freire began to work with his cultural circles and found out just how damaging and pervasive the institution of slavery continued to be, even decades after slavery had been abolished.

Freire decided to use the name “cultural circles” instead of literacy classes. He had several reasons for this choice of words, and one reason was the negative connotation of the word “illiterate.” Although most of his students were, as a matter of fact, illiterate, no one wanted to describe or think of themselves as such. Another reason was that Freire’s project did not focus solely on teaching people how to read and write. At the time, literacy was one of the requirements for voting in presidential elections, and Freire meant to create a sense of political awareness by the methods he used to teach as well as the content he shared with his students.

The teachers of the cultural circles were deliberately not called teachers, but rather coordinators, and the students were instead called participants. Instead of traditional lectures, dialogue was encouraged. Freire chose not to use the traditional language primers because their content was often irrelevant to the cultural context of the peasants and the workers he taught. Instead, Freire began with the existential conditions of the learners. Of the coordinators, Freire required that they be driven by love, be guided by humility, and have great faith in the human potential. Freire asked that the coordinators consider education as a vehicle for liberation instead of domestication.

Also in 1961, João Goulart assumed the presidency of Brazil. Goulart was a populist leader, so when he was elected, many student groups, unions, and peasant leagues began to emerge. At the same time, a communist presence was more clearly felt in Brazil. It was partly because of these events that Freire transferred the cultural circles from the city of Recife to the Cultural Extension Service (SEC) at the University of Recife. From June of 1963 to March of 1964, Freire and his team trained college students and others who were interested in how to work with adult literacy learners. Freire planned to reach as much of Brazil as he could by establishing more than 20,000 cultural circles around the country. Freire’s plan was to teach five million adult learners how to read and write within a two-year period.

On April 1, 1964, a military coup that was supported by the CIA overthrew the Goulart administration. The mayor of Recife, Pelópidas Silveira, was arrested, Freire was discharged from his position, and all of Freire’s teaching materials were confiscated. Freire was subjected to a series of interrogations and accused of being a communist. He spent 75 days in jail, where he began to write his first book, Educação como Prática da Liberdade (Education as the Practice of Freedom). The new military regime deemed Freire’s literacy project subversive and stopped the funding for the project. Freire and his family were exiled from Brazil from 1964 to 1980. They first lived in Bolivia, then in Chile, where Freire continued his literacy project with Chilean farmers.

In the process of working with both Brazilian and Chilean peasants, Freire realized that even though people were no longer enslaved and had learned how to read and write, and in some cases were the owners of their own land, they did not consider themselves as being free. With this insight, one of Freire’s lifelong goals became to create the circumstances for his students to discover themselves as human beings, with their own agency as subjects and not objects, as members of a community, and as the creators of culture.

5. Philosophical Contributions

a. Critical Pedagogy Versus the Banking Model of Education

Paulo Freire’s philosophical views grew from his experiences as a teacher and the interactions he had with his students. Rather than continuing with the established cultural patterns of relating to people through a hierarchy of power, Freire’s starting point in the classroom aims to undermine the power dynamics that hold some people above others. Freire emphasizes that a democratic relationship between the teacher and her students is necessary in order for the conscientização process to take place.

Freire’s critical pedagogy, or problem-posing education, uses a democratic approach in order to reach the democratic ideal, and, in this sense, the goal and the process are consistent. He explains how the teacher who intends to hold herself at some higher level of power than that of her students, and who does not admit to her own fallible nature and ignorance, places herself in rigid and deadlocked positions. She pretends to be the one who knows while the students are the ones who do not know. The rigidity of holding this type of power dynamic negates education as a process of inquiry and of knowledge gained.

Freire is very critical of teachers who see themselves as the sole possessors of knowledge while they see their students as empty receptacles into which teachers must deposit their knowledge. He calls this pedagogical approach the “banking method” of education. This pedagogical approach is similar to the process of colonization, given that the colonizing culture thinks of itself as the correct and valuable culture, while the colonized culture is deemed inferior and in need of the colonizing culture for its own betterment. The banking method is a violent way to treat students because students are human beings with their own inclinations and legitimate ways of thinking. The banking method treats students as though they were things instead of human beings.

Instead of the banking method, Freire proposes a reciprocal relationship between the teacher and the students in a democratic environment that allows everyone to learn from each other. The banking method of education is characterized as a vertical relationship:

teacher
↓
student

The relationship developed through the banking method between the teacher and the students is characterized by insecurity, suspicion of one another, the teacher’s need to maintain control, and power dynamics within a hierarchy that are oppressive. The critical pedagogy that Freire proposes allows for a horizontal type of relationship:

teacher ↔ student

This relationship is democratic insofar as both the teacher and the student are willing and open to the possibility of learning from each other. With this type of relationship, no one is above anyone, and there is mutual respect. Both the teacher and the student acknowledge that they each have different experiences and expertise to offer to each other so that both can benefit from the other to learn and grow as human beings.

Instead of tacitly promoting oppressive relationships through the banking method of education, Freire chooses the process of critical pedagogy as his pedagogical model. This is because critical pedagogy utilizes dialogue among human beings who are equals rather than oppressive imposition.

Another negative consequence of the banking method is that students are not encouraged to think critically, and thus do not learn how to do so or how to feel confident about thinking for themselves. The relationship between a student and a teacher who uses the banking method is similar to that of a farmer who obeys the orders of his or her boss. As was the case with the peasants with whom Freire worked, when a person’s day-to-day experience is dominated by another person or group of people, most of the dominated people are not capable of developing the ability to think, to question, or to analyze situations for themselves. Instead, their consciousness develops primarily to obey the orders imposed on them.

To promote democratic interactions between people, Freire suggests that teachers problematize the issue being discussed. When issues or questions are problematized by teachers who work through critical pedagogy, ready-made answers are not available. Students realize that although some questions do have clear-cut answers, many of our deeper questions do not. When students learn that teachers are human beings just like everyone else, and that teachers do not know everything but are also learners, students feel more confident in their own search for answers and more comfortable raising critical questions of their own. The banking method denies the need for dialogue because it assumes that the teacher is the one who possesses all the answers and that the students are ignorant and in need of the teacher’s knowledge. In order to problematize a subject, the teacher assumes a humble and open attitude. Given the teacher’s personal example, the students also become open to considering the different positions being discussed. This promotes a dynamic of tolerance and democratic awareness because critical pedagogy undermines relationships where some people have power or knowledge and others do not, and where some people give orders and others obey without questioning. Problematizing promotes dialogue and a sense of critical analysis that allows students to develop the disposition for dialogue not only in the classroom but also outside of it. This is of utmost importance because the disposition for dialogue, and the value placed on it, spills over in a positive way to the students’ other relationships at home, in the workplace, and in the community.

b. Internalization

Paulo Freire worked with people who came from a context of pervasive historical oppression. Most of his students came from families who had been previously enslaved, and Freire came to understand that abolishing slavery did not automatically mean that people were free. He also realized that teaching people how to read and write so they could vote in Brazilian elections, that is, enabling people through positive rights, was still not enough for people to realize their own freedom and end their oppression. Freire recognized that the oppression of a human being runs much deeper than political institutions and legal guarantees. He discovered that while we may actively seek our freedom, besides the institutional obstacles like colonization and dictatorships, there are also internal obstacles that prevent us from being free. The concept of internalization treated in this section is psychologically deep and rich in meaning.

In order to explain what internalization means, Freire writes about an incident in a Latin American latifundio (plantation) where a group of armed peasants took over the plantation. For tactical reasons they wanted to keep the landowner boss as a hostage. However, not a single peasant was able to keep guard over the boss because his very presence frightened them. Freire speculates that the very act of fighting against their boss may have made the peasants feel guilty. Freire concludes that, in fact, the boss was “inside” them. These peasants had internalized their master. Although the boss was, as a matter of fact, overpowered by the peasants who outnumbered him, and was thus not in a position to give them orders or punish them if they disobeyed, the peasants’ behavior was still driven by fear of their boss. The freedom of the peasants was not merely contingent upon physically removing their boss from the plantation, as they had initially believed. These peasants had been thoroughly conditioned to obey orders, to behave in a submissive way, to know and keep their “place,” which they did even when the boss was no longer in power.

Whenever we internalize our oppressors, we behave in the way the oppressor would have us behave even when the oppressor is not present. The example that Freire provides is a very telling one, and other common examples would be those of internalized racism or internalized patriarchy. To internalize racism, for instance, means that a racist person need not be present to oppress another: the person who has internalized racism behaves in a way that promotes the power of the oppressor and reifies the oppressive structure. An example of internalized racism in the 21st century would be dark-skinned people promoting whiteness, for instance by using whitening creams. An example of internalized patriarchy may be when a man feels like crying but does not because he does not want to seem weak. All of these are different ways in which people internalize an oppressive structure and then seek freedom and power within that structure. There are many other ways in which we internalize oppressive structures besides racism and patriarchy, such as those tied to nationality, age, patterns of speech, weight, sexuality, or being able-bodied or disabled.

c. Conscientização

As previously mentioned, Paulo Freire worked with people who had been socialized within institutions shaped by the oppression of colonization. It bears repeating that although slavery was formally abolished in 1888, people continued to sell themselves into slavery during Freire’s time. Freire worked with the sons, daughters, and grandchildren of former slaves, and he noticed that the power dynamics of the institution of slavery continued to affect how people saw themselves and how they related to the people around them.

Conscientização is often described as the process of becoming aware of social and political contradictions and then acting against the oppressive elements of our sociopolitical conditions. This entails developing a critical attitude to help us understand and analyze the human relationships through which we discover ourselves. Conscientização usually begins with the individual person becoming aware of her own social context, political context, economic context, gender, social class, sexuality, and race and how these play an important role in the shaping of her reality. The process of conscientização also entails becoming aware of our agency to choose and create our reality.

Harriet Tubman, the African-American abolitionist, is often quoted as saying that she would have freed more slaves, but the problem was that not all of them knew they were slaves. This observation captures the heart of conscientização. When a person or group of people has been socialized within an oppressive system such as slavery or patriarchy, it is often the case that the oppressed internalize the oppression and do not know that they are oppressed. To illustrate, before becoming politically aware, a woman, let us call her Jane, might behave by and within the norms of patriarchy all of her life. If, for instance, Jane applies for a promotion at work and the promotion is denied to her but is instead given to a less qualified and younger man, Jane’s conscientização regarding sexism and ageism may begin.

Because of their history, socio-political, and economic contexts, the workers and peasants that Freire worked with were often not aware of the extent of their own oppression. Since they had been socialized to obey orders, to perform specific functions, and to not question authority figures, they were discouraged from following their own interests and from thinking for themselves. Freire noticed that his students would often think of themselves as objects instead of subjects and agents with the ability to choose their own destiny.

There are several steps in the process of conscientização. Freire worked with his students in his cultural circles and chose a curriculum that allowed him to help his students become aware of their socio-political realities. Freire began the process by creating the conditions through which his students could realize their own agency. He describes this first step as being able to identify the difference between what it means to be an object (a thing) and a subject (a human being). Once the first step of the process had been taken, namely the recognition of their agency, Freire emphasized to his students how the consequences of their choices did in fact shape their personal history and contribute to the creation of human culture. Equally important, Freire also highlighted the fact that every single human being has the ability to change the world for the better through their work. This was very important because it allowed common men and women to see their own self-worth. Given that their dialect, race, work, and culture were constantly demeaned by a system of oppression, Freire affirmed the worth of every person and that person’s work. Freire’s students came to see themselves as the makers of their own destinies, as confident shoemakers and weavers who created art, and whose culture and dialects were important and valuable.

d. Freedom

Paulo Freire writes about an instance when he asked his students what the difference was between animals and humans. The answers given to him are troubling and insightful. Before the peasants began the process of conscientização, they of course had the capacity to become aware of their own agency, but because that process had not yet begun, they did not think of themselves as being free. When the students were asked about the difference between animals and humans, one of the peasants in the cultural circles in Chile responded that there was no difference between men and animals, and if there was a difference, animals were better off because animals were freer. According to this peasant, an animal enjoys a greater degree of freedom than a human being.

The peasant’s honest answer is indicative of how he saw himself and the context in which Freire worked. Although they were not legally enslaved, these peasants did not think of themselves as being free agents, as subjects with the option to choose and create their own lives and history. Instead, they saw themselves as objects upon whom orders were imposed, so the animals that were not required to follow orders were freer than them. In other words, for these peasants there was no real difference between them and the beasts of burden used to toil in the fields, unless the animal, a fox or bird for instance, was not used for farm labor. In this case, the animal had a higher degree of freedom than a human being.

These responses are indicative of the fact that the “freedom” of the peasants must be qualified. It is true that technically and politically they were no longer slaves. However, they did not think of themselves as being free human beings with their own agency and the ability to decide for themselves. Through working with the South American peasants in Brazil and Chile, Freire came to see that these peasants were not merely a marginalized group of people, but, worse than this, they saw themselves as existing solely for the benefit of their bosses, not as existing for themselves and for their own sake. Their social context had conditioned them into believing that the purpose of their being was only to benefit their bosses. Their economic and political contexts conditioned them to not see themselves as human beings (subjects), but rather as things or objects that exist merely to serve the bosses’ orders. The problem was not simply that they were illiterate but that they were completely alienated from their own agency. When Freire understood the extent of his students’ oppression, he chose to not only teach them how to read and write but also to create the conditions necessary in the classroom for the students to realize their own agency and come to see themselves as human beings. The process of conscientização is much more than learning a set of habits or skills. It is becoming aware of one’s own agency as a human being.

The concept of “freedom” has many connotations. Freedom may mean being able to move about freely or it may mean not being enslaved, for instance. Freire believed that “freedom” is the right of every human being to become more human. Freire noticed that “freedom” meant something different for the peasants with whom he worked. Freire explained that the peasants he worked with wanted land reform—not to be free, but rather to be able to own their own land and thus become landowners, or more specifically, the bosses of new employees.

Freire wrote that the peasants’ goal was in fact to be free human beings, but within the contradictory context in which they had been socialized, and which they had clearly not overcome, to be a free human being meant to be an oppressor. Freire writes that the oppressed find in the oppressor their model of “manhood” or their model of humanity, of what it means to be a free person. The peasants had come to equate freedom with the ability to oppress others. This is because the context within which they lived dichotomized the boss as “free,” given that the boss was the one who was in charge and who commanded the peasants to follow his or her orders. The peasants were in turn dichotomized as not being free because they had no choice but to carry out the boss’s orders. Given this historical context, the only example the peasants had of what it meant to be a free person was the example of an abusive boss. Thus, the peasants came to believe their freedom could only be found by oppressing others.

Having the right to vote, to own property, to free speech, or to an education—though undeniably important—does not mean that a person is free. There are different ways in which people may be free, and freedom is a matter of degree. Contrary to the mainstream Western liberal belief, the fact that we are not enslaved physically does not mean that we are free, and it does not mean that we are not behaving the way our internalized oppressors would have us behave.

Freire adamantly opposed authoritarian relationships, which only cause further oppression. This is not merely for the sake of the oppressed, but also for the sake of the oppressors who become oppressed themselves through the dynamics of oppressive relationships. Freire writes how the fear of freedom is embodied by the oppressors but in a different way than by the oppressed. For the oppressed, the fear of freedom is the fear to assume or own up to their own freedom. For the oppressors, the fear is fear of losing the “freedom” to oppress.

6. Pedagogy of the Oppressed

Pedagogy of the Oppressed is Paulo Freire’s best-known work. He wrote it during his first years of exile from Brazil and published it in 1968. The book was translated into English in 1970. It has been banned and blacklisted numerous times by different governments who find the book to be subversive and dangerous. Among these governments was the South African government during Apartheid. In the United States of America in the 21st century, the book was banned from being taught in public schools in the state of Arizona under House Bill 2281.

Pedagogy of the Oppressed is divided into four chapters, and several important themes are developed throughout the book. Among these themes are how the oppressed and the oppressors are affected by the act of oppression, liberation as a mutual process, the banking model of education, the incompleteness of human beings, generative themes, and the use of cooperation, unity, and organization to liberate the oppressed.

a. Chapter 1

There are several important ideas elaborated in the first chapter of Pedagogy of the Oppressed. All of these ideas are developed throughout the book, and Paulo Freire comes back to these ideas throughout his later books and writings. The first thesis is that the dehumanizing situation under which many people live is not a given destiny but rather the result of unjust systematic oppression that fosters violence in the oppressors and dehumanizes the oppressed. Here, Freire makes one of his central theses, namely, that in their struggle to regain their humanity, the oppressed must not become the oppressors of their oppressors. Freire claims that it is only the oppressed who will be able to liberate both themselves and their oppressors by restoring the humanity of both groups.

Freire warns the oppressed against becoming oppressors on two counts: (1) when the oppressed gain power and use this power to oppress their previous oppressors; and (2) when the oppressed gain power over other oppressed people and become their oppressors as they seek their own individual liberation. The danger of a previously oppressed person becoming an oppressor is due to their ambiguous duality. Freire points out that the oppressed are at one and the same time both themselves (the oppressed) and the oppressor, whose consciousness they have internalized. Due to this ambiguous duality and the internalization of their oppressors, the oppressed seek to become like the oppressors and share in their way of life.

In this chapter, Freire also begins his criticism of charity versus social justice. Throughout Pedagogy of the Oppressed as well as throughout the rest of his life, Freire makes a distinction between charity and social justice. If social justice were in fact the existing state of affairs in society, Freire argues, there would be no need for charity. In this first chapter, Freire begins to discuss what he calls a false charity or a false generosity that is displayed by the oppressors toward the oppressed in the form of social programs and aid. However, Freire points out, the dispensers of this false generosity often feel threatened by those they claim they wish to help (the oppressed). This is a theme Freire maintains throughout his writings. Freire explains how the oppressors must perpetuate injustice in order for them to be able to express their false generosity. Freire develops this idea further in chapter four of Pedagogy of the Oppressed and comes back to it in his Education for Critical Consciousness.

Freire also puts forth the thesis that freedom is acquired by conquest, that a person must claim their own freedom because freedom is not something that can be gifted to a person by another. This is a thesis that Freire continues to develop throughout his life. In this chapter, Freire begins by telling us that, oftentimes, members of the oppressor class have a change of heart and seek to cease being exploiters of the oppressed. However, Freire warns us that the heirs of exploitation, due to their origin, almost always bring with them their prejudices. Because of their background, even when they seek to help the oppressed, they mistrust the people’s ability to transform their own circumstances and instead believe that they must be in control of the change that takes place. In other words, they still behave paternalistically and believe that they know better than the people they falsely claim to respect.

Freire closes the first chapter of Pedagogy of the Oppressed by emphasizing how the oppressed must be intimately involved in each stage of their liberation. This is because, as he emphasizes, freedom is something each of us must claim for ourselves; freedom is not a gift to be given by some people to others.

b. Chapter 2

The most important idea that Paulo Freire develops in chapter two of Pedagogy of the Oppressed is the distinction between the banking model of education and critical pedagogy. Please see section 5a for a detailed explanation of this central Freirean concept and practice.

A central element of Freire’s pedagogy is dialogue, and he emphasizes its importance in this chapter. Freire prefers dialogue to imposition. He writes that it is love and respect that allow us to engage people in dialogue and to discover ourselves in the process. By its nature, dialogue is not something that can be imposed. Instead, genuine dialogue is characterized by respect of the parties involved toward one another. We develop a tolerant sensibility during the dialogue process, and it is only when we come to tolerate the points of view and ways of being of others that we might be able to learn from them and about ourselves in the process.

Freire believes that it is necessary for us to develop our tolerance of others so that all may learn from each other. However, tolerating others does not mean that one has to stop being who one is as one tolerates others’ behavior and ways of thinking. Dialogue and imposition are diametrically opposed approaches to relating to one another. According to Freire, imposition of our views upon others comes from a lack of confidence in our own beliefs. The person who either imposes or attempts to impose her views on others behaves in a life-denying manner insofar as she seeks to control others and insofar as she thinks in absolute terms with predetermined conclusions. Dialogue, on the other hand, comes from a place of tolerance. Dialogue can take place when we are comfortable with and confident in our beliefs and ourselves so that even if others disagree with us, we do not interpret their disagreement to mean that we are wrong. Dialogue is life-affirming and allows people and situations to be what they may become; it understands life and people as developing in an open-ended creative process. Instead of believing that “The Answers” or “The Truth” have already been determined, a person who engages others in dialogue believes that the answers and the truth will emerge as we listen and speak to one another. The control of the process comes through the development of the dialogue itself. Those who impose their views on others are afraid of losing their false sense of control. Dialogue, on the other hand, comes from a place of love, respect, trust, humility, and curiosity, and it assumes remaining open to change, to the tensions caused by uncertainty and the precarious, as well as to the further developments that unfold.

c. Chapter 3

In chapter three of Pedagogy of the Oppressed, Freire continues to develop his thesis on helping. He elaborates on the idea that those who educate, facilitate, or help in any way (be it social workers, research teams from universities, and so forth) must first learn to listen to and work with those whom they are helping. Freire is critical of professionals who have internalized the patterns of institutional domination in which they were socialized so that they come to believe that being in a position of power or having some form of institutional authority allows them to help the oppressed with top-down strategies and means. Freire’s criticism is that these “helpers” have come to believe that they have the right type of knowledge, the expertise, and the answers to what the people they are “helping” need, so that their approach to helping runs from those who can and who know to those who have not been able to or who do not know:

Political leader/teacher/researcher/social worker
↓↓
students/community members being helped

The problem with this approach is that those who offer their help and expertise, those who are confident in their good intentions and qualifications, do not always trust that the ones who are the most knowledgeable of the problem and the solutions needed are the same people who need the help.

Relatedly, Freire makes a distinction between humanitarianism and humanism. Although both concepts mean well for the whole of humanity, they are not the same, nor do they achieve the same results. Freire was critical of social movements that pretend to give humanitarian aid. This was because he noticed that what oftentimes happens is that in the process of “helping,” the helpers rob the people being helped of their own agency to improve their own condition. There are ways to help people that promote the autonomy of the person or the group of people being helped and other ways of “helping” that impose our assistance on those who ask for our help. This is an important distinction because a humanitarian approach does not lend itself to dialogue insofar as the person in the helping position claims to know what the person in need of help needs and imposes the help. The humanist respects the person in need of help and offers help in such a way as to enable the person being helped to help herself.

Besides developing his thesis on helping, Freire also elaborates on what he terms “limit situations.” In his cultural circles, Freire began his literacy classes by making use of generative themes and words. These would be words such as tijolo (brick). The word would be broken down into its syllables (ti-jo-lo), then the students would practice enunciating the consonants coupled with vowels (ta, te, ti, to, tu; ja, je, ji, jo, ju; la, le, li, lo, lu) and then combine the syllables to generate new words. Sometimes the generative words would be “land,” “economy,” and “culture,” for instance. The facilitator and the students would not only break down the generative words into syllables, but they would also discuss their meanings. There would be times when, in the process of discussing certain generative words and themes, the class would come to a “limit situation.” These limit situations described a shared problem that the participants of the class and the facilitator, by working together, could overcome, for instance, putting up stop signs at intersections where they were needed.

d. Chapter 4

Paulo Freire is very critical of all liberation and populist movements that deny the oppressed the right to participate in their own liberation. Leaders of revolutionary movements cannot bestow freedom upon the oppressed, nor can they temporarily use oppressive means with the aim of liberating the oppressed once the revolutionary movement comes to an end. Leaders are responsible for coordinating and facilitating dialogue among citizens, but, as Freire points out, leaders who deny the participation of the people they are trying to help effectively undermine their very goal of helping.

Besides insisting that the solutions we seek come from problems rooted in our experience, Freire motions us toward adopting a pluralistic sensibility that respects the “other,” given that there is more than one way of being. A pluralistic sensibility is manifested through the tolerance we exercise during any dialogue. Democratic interactions are based on a type of faith in humanity, in the belief that all are able to discuss their problems, that is, the problems of their country, continent, world, work, and of democracy itself. In order to engage and be engaged by others in dialogue, it is necessary that we cultivate a sensibility of confidence, humility, and willingness to risk loving others and that we allow others to be who they are. Genuine dialogue is not possible without these values. Freire did not pretend to have any solutions other than to suggest that an open-ended dialogue could lead us to have a more just and humane world.

7. Exile Years

Paulo Freire lived in exile from 1964 to 1980 in Bolivia, Chile, the U.S.A., and Switzerland. Bolivia was the first country where Freire lived in exile from Brazil, but he stayed there only briefly. Given that Freire had lived his whole life at sea level, the high altitude of the Andes did not sit well with him, and he had a very difficult time adjusting to the altitude of La Paz. Shortly after his arrival in Bolivia, a coup overthrew the administration of Víctor Paz Estenssoro. Due to the political climate and the high altitude, Freire sought political asylum in Chile, where he lived from 1964 to 1969.

The five years that Freire lived in Chile proved to be very fruitful in terms of his writing and research. Freire was also able to continue and make advances with his work on literacy. Freire worked for the Instituto de Desarrollo Agropecuario (Institute for the Development of Agriculture) and with the University of Chile with the Department of Special Planning for the Education of Adults. Freire’s literacy model was successfully adopted, and this led Freire to participate in the Chilean agrarian reform effort. At this time, the United Nations Educational, Scientific and Cultural Organization (UNESCO) approached Freire to become a consultant, and Freire continued to assist the organization of cultural circles throughout Chile.

The five years that Freire lived in Chile were very good years for him and his family. The Chilean people came to love Freire and made him feel welcome. Working with the Chilean peasants was also very helpful to Freire insofar as his experiences with them allowed him to notice differences between the illiterate peasants in Brazil and Chile. Although their histories were similar, they were not the same people, and so Freire came to understand the experience of the oppressed more fully by also working with the Chilean peasants.

It was during his time in Chile that Freire was able to complete the manuscript of his first book, Educação como Prática da Liberdade (Education as the Practice of Freedom), which was published in 1967 in Rio de Janeiro. Freire was also able to write the manuscript of Pedagogy of the Oppressed based on his experiences in Brazil and Chile. Pedagogy of the Oppressed was first published in Spanish in 1968, and because of the political climate in Brazil, the book had to wait until 1975 to be published in Portuguese. By this time, Pedagogy of the Oppressed had already been translated into English, Italian, French, and German.

In 1968, Freire received invitations from Harvard University and the World Council of Churches (WCC) in Geneva, Switzerland. He agreed to go to Harvard first and then to Geneva, departing from Chile in 1969 to live in Cambridge, Massachusetts, from April 1969 to February 1970. He taught at Harvard’s Center for the Study of Change and Social Development. During Freire’s time at Harvard, he worked as a visiting professor and gave lectures and conferences. He also published “The Adult Literacy Process as Cultural Action for Freedom” and “Cultural Action and Conscientization” in the Harvard Educational Review. These were later published as the monograph titled Cultural Action for Freedom in 1972.

Freire’s time in the U.S.A. allowed him to experience racism and discrimination first-hand as he saw the way people had to make do in the low-income housing and ghettos of New York City. These experiences, like the ones he had with the Chilean peasants, added to his Brazilian experiences and broadened his vision regarding the struggles of the oppressed. He understood that the third world and first world categories were not so clear cut, but rather that poverty and oppression could be found in developed countries as well.

After his time in the U.S.A., Freire lived in Switzerland, from 1970 until his return to Brazil in 1980. Freire worked for the World Council of Churches (WCC) as a consultant for the Office of Education and popular educational reform. In 1971, Freire, in collaboration with other Brazilian exiles, formed the Institute of Cultural Action (IDAC) in Geneva. The goal of IDAC was to bring about a pedagogical practice that brought awareness to the political dimensions of pedagogy. Through his involvement with the WCC and IDAC, Freire traveled to and worked in South and Central America, Africa, Australia, the Middle East, Asia, Europe, and North America.

Because of Freire’s deep interest in and empathy toward colonized countries, he followed closely the liberation struggles of African countries, specifically Mozambique, Angola, Cape Verde, São Tomé and Príncipe, and Guinea-Bissau. In 1975, the newly formed government of Guinea-Bissau invited Freire to help organize a literacy campaign. Guinea-Bissau had been colonized by the Portuguese since 1440, and by 1975 the country had a 90 percent adult illiteracy rate.

8. Return to Brazil

Paulo Freire lived in exile for close to 16 years, from 1964 to 1980. Upon his return to Brazil, he continued his work as an educator until his death in 1997. From 1980 to 1990, he worked at the Universidade Estadual de Campinas (UNICAMP) and as a professor in the Postgraduate Education program at the Pontifícia Universidade Católica de São Paulo (PUC-SP). In 1987, he was reinstated as Senior Professor at the Federal University of Pernambuco; however, Freire immediately retired from this position in order to make space for the younger generation of professors. At that time, Freire became Professor Emeritus at the Federal University of Pernambuco.

In 1980, Freire was intimately involved in founding the Partido dos Trabalhadores (PT) (Worker’s Party). This political party challenged the military rule and promoted democracy in Brazil. In 1989, Freire accepted an invitation to become the Secretary of Education for the city of São Paulo. During this time, São Paulo had 12 million people, with 720,000 students in 654 schools K-8. He served as Secretary of Education for two years, until May 1991. During this time, Freire began working toward improving the structural conditions of the buildings where the schools were housed. Besides the physical structures of the schools, he also worked to reform the schools’ curriculum in order to move toward engendering a school environment where students would be happy to learn and teachers would be encouraged to value the students’ backgrounds, cultures, values, interests, and languages. Freire was very sensitive to language discrimination, and he worked toward creating an environment where children would not be alienated due to their non-standard Portuguese dialects, ways of speaking, and syntax. After his retirement as Secretary of Education, Freire continued with his writing projects and went back to teaching in the Supervision and Curriculum graduate program at the Pontifícia Universidade Católica de São Paulo.

In October 1986, Elza, Freire’s wife and companion of 42 years, passed away due to cardiac failure. Freire was deeply affected by the loss of his wife and struggled with depression and grief. The following year, Freire began to slowly reengage himself with his work. He began to work as a consultant for UNICEF and resumed his teaching duties at the Pontifícia Universidade Católica de São Paulo. He also attended a symposium in Los Angeles to commemorate Elza’s life. There he met the educator and social activist Myles Horton, with whom Freire would collaborate to write the book We Make the Road by Walking: Conversations on Education and Social Change (1990). Collaborating on this book with Horton allowed Freire to reengage himself with his writing and eased the pain of losing his wife.

Two years after Elza’s death, Freire married Ana Maria (Nita) Araújo Hasche. Nita’s father was Dr. Aluízio Araújo, the principal of Oswaldo Cruz secondary school, where Freire had been allowed to study at a reduced tuition when he was a young man. Nita and Freire had known each other since then, and years later Freire served as one of Nita’s doctoral dissertation advisors at the Pontifícia Universidade Católica de São Paulo. An accomplished scholar in her own right, Nita contributed significantly to Freire’s later work and has continued to carry Freire’s vision forward, publishing several of his writings posthumously. Nita and Freire lived, loved, and worked happily until Freire passed away due to heart failure on May 2, 1997. He was 75 years old.

9. Working Assumptions

Besides the main philosophical contributions that were explored in section 5, Paulo Freire also thought about and developed other important ideas. These ideas are the working assumptions without which Freire’s work could not have been developed. Although they are just as important as his main philosophical contributions, they are not usually given as much attention by Freire scholars. This section will briefly explain Freire’s working assumptions, namely, his view of human nature, authenticity, dialogue, and love.

Freire believed, as he often wrote, that the ontological vocation of every human being is to become more human. He believed that every person is always a work in progress, unfinished and open to further growth. This idea plays a central role vis-à-vis his other ideas because Freire worked from the assumption that people could change, learn, and grow to become better, more humane human beings. Freire’s idea of human nature allowed him to articulate his ideas regarding hope, which he believed was grounded in the incompleteness of human beings, who are unfinished and always in the process of becoming.

Another idea that played a central role in Freire’s philosophy was that of authenticity. Freire understood that the oppression the people he worked with had experienced had stunted their ability to live authentic lives and relate to the people around them in authentic ways. Especially at the beginning of his work, Freire noticed how many of the peasants he worked with had a deterministic view of history and their socioeconomic and political situations. Part of Freire’s goal was to help his students realize that their reality was not determined, but rather that history is made by one’s choices.

As mentioned, Freire observed that when a person internalizes an oppressor, it is difficult for her to be authentic. This is because when we internalize or host an oppressor, our intentions are split between our desire for freedom and the oppressive tendencies we have internalized, which means that we may feel the need to compete or oppress others in order for us to get ahead. Alienated from ourselves, our work, and other people, and due to the dehumanizing social structures that promote non-democratic relationships, living an inauthentic life may lead us to feel anxiety and potential meaninglessness.

Dialogue is another central working assumption for Freire, who encouraged people to be open, tolerant, and willing to learn from each other. For Freire, dialogue meant the presence of equality, mutual recognition, affirmation of people, a sense of solidarity with people, and remaining open to questions. Freire wrote at length about dialogue and dialogic relationships, which he characterized as loving, humble, hopeful, and exhibiting faith in humanity. Dialogue is the basis for critical and problem-posing pedagogy, as opposed to banking education, where there is no discussion and only the imposition of the teacher’s ideas on the students.

Love is perhaps the most central working assumption that Freire develops and continues to come back to throughout his many years of work. In a video documentary, Freire says of himself, “I’m an intellectual who is not afraid of being loving. I love people and I love the world, and it is because I love people and I love the world that I fight so that social justice is implemented before charity.” Freire wrote about the role that love plays in the commitment to a liberating education early on in Pedagogy of the Oppressed, where he wrote a section on Che Guevara and the feelings of love toward the Latin American peasants Guevara sought to liberate. Freire continued coming back to the role of love in education throughout his many writings until the end of his life. In one of Freire’s last books, Pedagogy of the Heart, he further explores the role of emotions in the process of conscientização. He believed that education was an act of love, and it thus required courage to be politically committed to work toward the empowerment of our students and belief in their potential.

10. Criticisms

There are several criticisms that have been made of Paulo Freire’s work and theories. The most common criticism that is made of Freire is due to his style of writing. Freire’s critics find his writing style to be verbose, cumbersome, and difficult to understand. Relatedly, Freire came under attack by feminists because in his earlier books Freire consistently used male pronouns and male examples. Unlike English, Portuguese is a gendered language, and although Freire was sympathetic to feminism, Freire’s writing was, like most of the writing at the time, dominated by male-centered examples and pronouns. Once Freire was made aware of this shortcoming in his writing, he revised the language of his earlier books in later editions and adopted a more gender-neutral style for the writing of his later books.

Another criticism that has been made of Freire’s work is that his pedagogical model and many of his theories regarding pedagogy are not transferable from the Brazilian third-world context where they were formulated. Although teachers in the U. S. A. have tried to work with Freire’s pedagogical model, the U. S. A. context is too different, his critics argue, from the one where Freire developed his ideas.

Additionally, Freire has been criticized for not fully espousing Marxism, feminism, Catholicism, or a militaristic approach to revolutionary change. Although Freire was sympathetic to certain elements of each of these approaches and sets of beliefs, his insistence on the importance of dialogue frustrated many of his critics, who have attacked him for not having a concrete and practical method for helping people that could be used in different contexts. Freire has been criticized by leftists for his antireductionist approach and his insistence on dialogue, which in their opinion only slows down the change they want to bring about. Organizers of training events for teachers and social leaders would often invite Freire to help with the planning. Often these organizers became frustrated with Freire’s refusal to provide them with rules or a set of ready-made solutions to their problems.

11. Legacy

Numerous if not countless scholars, activists, politicians, and leaders have been influenced by Paulo Freire’s life and ideas. Among these are bell hooks, Cornel West, Angela Valenzuela, James H. Cone, Peter McLaren, Henry Giroux, Donaldo Macedo, Joe L. Kincheloe, Carlos Alberto Torres, Ira Shor, Shirley R. Steinberg, Michael W. Apple, Stanley Aronowitz, Leonardo Boff, and Jonathan Kozol.

Freire’s Pedagogy of The Oppressed has been influential the world over, and it has been translated into 17 languages. In the 21st century, it is considered to be too subversive for reading; it is one of the banned books in the state of Arizona (U. S. A.). Freire’s emancipatory model of teaching has been widely adopted in previously colonized countries and continents such as Latin America, Africa, Asia, the Philippines, India, and Papua New Guinea. Having been established to generate dialogue and support research into pedagogical approaches and theories, the Paulo Freire Institute is active in 18 countries. The World Bank funded the Southern Highlands Rural Development Program’s Literacy Campaign, which is based on a Freirean model of pedagogy.

Freire was presented with numerous medals, honorary degrees, and recognitions both during his lifetime and posthumously. Among these honors are the 1980 King Baudouin International Development Prize and the 1986 UNESCO Prize for Education for Peace. In 2008, Freire was inducted into the International Adult and Continuing Education Hall of Fame.

More important than all of the recognitions Freire received and the scholars he influenced, Freire’s life was his most significant legacy. His life’s example continues to inspire. He created the conditions by which thousands of people, the children and grandchildren of former slaves, could learn to read and write, learn about their agency and freedom, and learn to love.

12. References and Further Reading

  • Collins, Denis. Paulo Freire: His Life, Works & Thought. New York: Paulist Press, 1977.
    • Excellent short introduction to Paulo Freire’s life and philosophy.
  • Bakewell, Peter. A History of Latin America: Empires and Sequels 1450-1930. Malden, MA: Blackwell Publishers, 1997.
    • Latin American history, from colonization through independence.
  • Finn, Patrick J. Literacy With An Attitude: Educating Working-Class Children in Their Own Self-Interest. New York: SUNY Press, 2009.
    • Example of the banking model of education in the U. S. A.
  • Fonseca, Sérgio C. “Repercussões das ideias de Anísio Teixeira na obra de Paulo Freire.” Travessias, 2 (2008) 3-15.
    • Examination of Anísio Teixeira’s influence on Paulo Freire’s philosophy.
  • Freire, Ana Maria Araujo and Donaldo Macedo. The Paulo Freire Reader. New York: Continuum, 2000.
    • Presents Paulo Freire’s main ideas with an introduction written by Nita Freire.
  • Freire, Paulo. Education for Critical Consciousness. New York: Seabury Press, 1973.
    • Re-publication (in English) of Paulo Freire’s first book Education, the Practice of Freedom together with Extension or Communication.
  • Freire, Paulo. Education, the Practice of Freedom. London: Writers and Readers Publishing Cooperative, 1976.
    • Paulo Freire’s first book, where he develops the banking model of education versus critical pedagogy.
  • Freire, Paulo. Extensión o Comunicación. Colombia: Editorial América Latina, 1974.
    • Paulo Freire discusses the better and worse methods to communicate between agronomic engineers and farmers.
  • Freire, Paulo. Letters to Cristina: Reflections on My Life and Work. New York: Routledge, 1996.
    • Series of autobiographical letters written to his niece discussing the events in his life and his philosophy.
  • Freire, Paulo. Pedagogy of Freedom: Ethics, Democracy and Civic Courage. Lanham: Rowman & Littlefield Publishers, 1998.
    • One of the last books that Paulo Freire authored, offering his most mature and insightful reflections. Also contains an informative and incisive foreword written by Donaldo Macedo.
  • Freire, Paulo. Pedagogy of the Heart. New York: Continuum, 2007.
    • Written toward the end of Paulo Freire’s life. Here he takes a look back at his work while still developing nuances to his concept of conscientização.
  • Freire, Paulo. Pedagogy of Hope: Reliving Pedagogy of the Oppressed. London: Bloomsbury Academic, 2014.
    • This book is the “sequel” to Pedagogy of the Oppressed, where Paulo Freire explains the context and further elucidates the concepts he developed in Pedagogy of the Oppressed.
  • Freire, Paulo. Pedagogy of the Oppressed. New York: Continuum, 1970.
    • Paulo Freire’s most read book, where he develops the concepts of banking versus critical education.
  • Freire, Paulo. Teachers as cultural workers: letters to those who dare teach. Boulder: Westview Press, 1998.
    • Paulo Freire addresses teachers and encourages us to commit ourselves to continue being caring and open-minded.
  • Freire, Paulo. “The Adult Literacy Process as Cultural Action for Freedom.” Harvard Educational Review. 40:2 (1970) 205-225.
    • Paulo Freire’s article published in the U. S. A. during the time he taught at Harvard. Here he articulates the essence of Pedagogy of the Oppressed to the American academic audience.
  • Fromm, Erich. The Heart of Man: Its potential for good and for evil. Mexico: Fund, University Press, 1967.
    • Erich Fromm develops the biophilic and necrophilic concepts that influenced Paulo Freire.
  • Hooks, Bell. Teaching Community: A Pedagogy of Hope. New York: Routledge, 2003.
    • Critical pedagogy in the current U. S. American context.
  • Kirylo, James D. Paulo Freire the Man from Recife. New York: Peter Lang Publishing, Inc., 2011.
    • Excellent and thorough biography of Paulo Freire.
  • Martínez, Eusebio Nájera. “Paulo Freire – Fragmentos testimoniales de una praxis 3,” Online video clip. YouTube, 20 February 2010. Web. 28 August 2015.
    • Three-part documentary on Paulo Freire and his work.
  • Valenzuela, Angela. Subtractive Schooling: U.S. – Mexican Youth and the Politics of Caring, Albany: State University of New York, 1999.
    • An example of the banking model of education in the U. S. A.

 

Author Information

Kim Díaz
Email: kdiaz60@epcc.edu
El Paso Community College
U. S. A.

Maurice Blanchot (1907–2003)

Though Maurice Blanchot’s status as a major figure in 20th century French thought is indisputable, it is debatable how best to classify his thought and writings. To trace the itinerary of Blanchot’s development as a thinker and writer is to traverse the span of 20th century French intellectual history, as Blanchot lived through, and engaged with, in some capacity, virtually every single major intellectual movement of the age. Spanning several generations of French philosophy (from the phenomenology of the interwar years, to the structuralism of the 1950s and early 1960s, to the post-structuralism of the 1960s and 1970s), Blanchot’s thought remains strictly irreducible to any of these categories, insofar as it resists enclosure, and responds ceaselessly to the demand of bearing witness to that which is timeless, nameless, and radically other.

Thus far, Blanchot’s greatest influence has arguably been felt in the fields of literature and literary theory. His fictional texts, Thomas the Obscure (1941), Death Sentence (1948), and The Madness of the Day (1949) are among the most unique and challenging texts in 20th century French literature. His critical essays on Kafka, Rilke, Sade, Mallarmé, and Hölderlin, and his interpretation of the myth of Orpheus, are considered canonical texts in the field of literary studies. His relationship to philosophy, though equally significant, is more nuanced and complex.

While references to philosophical concepts and themes are certainly pervasive throughout his writings, Blanchot eschews formal argumentation and proposes no systematic philosophical theory of his own. Throughout his myriad references to Levinas, Hegel, Nietzsche, Heidegger, and countless others, Blanchot seeks to read philosophers on their own terms, engaging with their respective terminologies, and operating from inside their respective philosophical systems in order to highlight the ways in which these systems inevitably open onto an outside. This tension, between the functioning of a system—be it philosophical, political, textual—and its anarchic, unrepresentable, outside, is a commonly recurring trope within Blanchot’s writings. He is generally less concerned with taking sides on philosophical questions, than in showing how any system or theory that aspires to exhaustive totality undermines itself by assuming a starting point that precedes, or exceeds, the system itself.

Throughout his writings, several other recurrent themes can likewise be discerned. These include an exploration of the paradoxes associated with death, repetition, and time, as well as the various aporias related to origins and ends. Blanchot’s writings show him to be a thinker broadly committed to privileging anonymity and difference over identity and sameness. Though his thinking, particularly with regard to politics, undergoes a series of significant shifts over the course of his life, there is a certain consistency to Blanchot’s overall approach. His central concern is to draw philosophy, literature, and theory-at-large into relation with an otherness, a proverbial outside, beyond its limits, to which it must constantly respond.

Table of Contents

  1. Biography and Intellectual Itinerary
    1. Early Life and Journalism
    2. Bataille, the War, and the Èze Years
    3. A Return to Politics
    4. Responding to the Other
    5. Writing the Disaster
  2. Engagement with Major Philosophers
    1. Levinas
    2. Hegel
    3. Nietzsche
    4. Heidegger
  3. Key Concepts and Themes
    1. Two Kinds of Death
    2. The il y a
    3. The Neuter
    4. Community and the Political
  4. References and Further Reading
    1. Major Works
    2. English Translations
    3. Secondary Bibliography

1. Biography and Intellectual Itinerary

a. Early Life and Journalism

Blanchot was born in Quain, a town in Saône-et-Loire, in 1907. His family was conservative and Catholic; his father encouraged Blanchot and his siblings to practice Latin at the kitchen-table. Blanchot studied Philosophy and German at the University of Strasbourg, which at the time boasted one of the most extensive libraries in France. It was here, around 1925 or 1926, that Blanchot first met Emmanuel Levinas, and the two became life-long friends. By 1929, Blanchot had relocated to Paris, and briefly pursued, during the early 1930s, the study of medicine at Saint Anne’s Hospital. It was around this time that Blanchot began his first collaborations with the journals of the French far-right. Espousing a vehemently anti-Hitlerian tone, Blanchot’s articles bemoaned the perceived complacency of the French government in addressing the growing threat of German expansionism. Blanchot’s writings from this period have come under considerable scrutiny, in recent years, for their alleged filiation with anti-Semitic currents on the French far-right. An exhaustive examination of all articles signed by Blanchot during the 1930s, however, reveals no instances of racially-exclusionary language or overt anti-Semitism. In his later writings, Blanchot addresses his dubious political commitments of the 1930s, seeking to disambiguate his own youthful involvement in reactionary politics from the anti-Semitism of his one-time associates.

With the outbreak of war in Europe, Blanchot momentarily withdrew from political writing and concentrated his efforts on the writing of fictional texts and literary criticism. His first novel, Thomas the Obscure, was published in 1941, meeting initially with poor reviews in the Parisian press. A second novel, Aminadab, was published a year later, in 1942. During this time, Blanchot was already beginning to develop a distinctive literary critical voice. His first collection of literary critical essays, Faux pas, appeared in December 1943, featuring texts on a diverse range of writers, including Mallarmé, Proust, Kierkegaard, Rimbaud, and Melville.

b. Bataille, the War, and the Èze Years

The early 1940s were a particularly formative time in Blanchot’s life. Towards the end of 1940, Blanchot was introduced, by Pierre Prévost, to Georges Bataille. An incredibly close bond would be formed between the two men, lasting until Bataille’s death in 1962. At the time they first met, Bataille was hard at work on his Nietzsche book, and Bataille’s interpretation of the German philosopher as a radically non-teleological thinker and natural adversary of Hegel would prove immensely influential not only upon Blanchot, but upon an entire generation of French intellectuals. At Bataille’s invitation, Blanchot became a regular participant in the bi-monthly philosophical discussions at 3 rue de Lille, where Blanchot met Denise Rollin, with whom he would later enter into a close relationship. Along with Bataille, Blanchot helped formulate, in late 1942, the abortive project of the “Collège socratique.” In March 1944, Blanchot was present at the famous “Discussion on Sin” organized by Bataille, and attended by Camus, Merleau-Ponty, Sartre, and Klossowski, among others.

Over the decades that followed, Blanchot would frequently engage with Bataille’s highly influential writings, both directly and indirectly. Of particular importance in chronicling the influence of Bataille are Blanchot’s 1962 essay, “The Limit-Experience,” later republished in The Infinite Conversation (1969), the 1971 text, Friendship, and the first part of Blanchot’s 1983 text, The Unavowable Community.

The spring of 1944 was a difficult time for both men. Bataille became quite ill and temporarily left Paris for Samois, while Blanchot himself departed for his family home in Quain. It was here, in June 1944, that Blanchot was put against the wall by a firing squad and “mock-executed.” These remarkable, undoubtedly traumatic, circumstances would be recounted by Blanchot some fifty years later in his text, “The Instant of My Death” (1994). With the surrender of the German army in Paris, on August 25, the war effectively came to an end for Blanchot, who was on the move between Paris and various locales in the south of France throughout 1945 and 1946. It was during this period that Blanchot penned important essays on Kafka, René Char, Nietzsche, and Hölderlin, while assisting Bataille in bringing to publication the first edition of the journal Critique.

The winter of 1946 saw the beginning of a new phase in Blanchot’s life as a writer. He moved for several weeks to a small house in Èze, near Nice, where he lived without electricity and worked on his récits at night. It was over the course of the following years that Blanchot’s reputation as a writer would largely be won. He completed his remarkable, cryptic récit, Death Sentence, in 1947 and saw it published in June 1948. His third (and final) novel, The Most-High, featuring a more political bent, was also published in 1948, followed by the fictional text, The Madness of the Day (1949), and another volume of critical essays, The Work of Fire (1949), which contained the seminal text, “Literature and the Right to Death” (first published in 1948). Rounding out a decade of incredible productivity, Blanchot’s Lautréamont and Sade was published in 1949.

By this point, Blanchot was producing a new critical essay for publication virtually every couple of weeks. During this period of prolific writing, he continued to move frequently, staying with his brother, René, whenever he found himself in Paris. In September 1949, Blanchot returned to the small house in Èze, which he would make his primary residence until 1957. Here, amidst the “essential solitude” of this medieval village overlooking the Mediterranean coast, Blanchot would write some of the most influential critical essays of his career, including the theoretical writings contained within The Space of Literature (1955).

Indeed, it is perhaps for the writings found within The Space of Literature that Blanchot is most widely known. Here we find his frequently cited accounts of the gaze of Orpheus, the two kinds of death, and (in the text’s appendix) the two versions of the imaginary. At the heart of Blanchot’s writings here, which engage in turn with Kafka, Rilke, Mallarmé, and Hölderlin, is a thesis about the radical non-essentiality of literature and the exigency of worklessness (désoeuvrement) which is literature’s aim and concern. If the story of Orpheus and Eurydice is important in this context, it is because Orpheus shows us, in turning back to gaze at Eurydice, a concern for the origin of the work, its absence and inspiration, which overrides any interest in its status as a completed and consummated work. In turning to view Eurydice, Orpheus ruins the work of bringing her out of the darkness, and yet this ruination, Blanchot insists, in his June 1953 essay, reveals what is most essential to literature, namely, its concern for the impossibility and palpable absence that reside at its origin. Literature is less about the completion of great “works” than it is about maintaining a paradoxical relation with the “worklessness” and impossibility that unravels every work.

This “non-teleological” emphasis, which is also evident in Blanchot’s portrayal of Kafka interminably wandering outside Canaan, is likewise seen in the essay on “The Two Versions of the Imaginary,” first published in 1951, and included as an appendix within The Space of Literature. Here, Blanchot provocatively juxtaposes two versions of the literary image. One version, clearly associated with Hegel and Mallarmé, views the image as the life-giving negation of the thing. It places the thing in question at a distance from us in order to help us understand it in its ideality, thus facilitating productive knowledge. The productive recuperability of this type of image is then contrasted, in Blanchot’s account, with the “other imaginary,” the one which resides outside of the world and its possibilities for knowledge and understanding. Here, the seductive gleam of the image refers us not to the absence of the thing, but to the distance (and difference) that always separates each thing from itself, precluding any possibility for a neat, teleological recuperation. In refusing to subordinate difference to identity and distance to presence, Blanchot is already anticipating the ascendancy of the simulacral that will play such a prominent role in the post-structuralist theories of the decades to come.

Beyond his influential literary critical essays, the 1950s also saw the publication of three more, increasingly spare and challenging, récits: When the Time Comes (1951), The One Who Did Not Accompany Me (1953), and The Last Man (1957), in which plot-development and characterization are pared-down to an absolute minimum, as if to highlight the dislocation of presence and the disruption of time to which these texts each bear witness.

c. A Return to Politics

Blanchot’s mother died in 1957. By all accounts, her passing affected the family greatly. After spending the winter with his brother and sister-in-law in Paris, Blanchot moved into his own flat, on rue Madame, in late summer 1958, beginning a new phase in his intellectual and personal itinerary. The return to Paris, in 1957, was significant in a number of respects. First, it marked a renewed engagement with national politics. Second, it coincided with an increasing focus on questions of an explicitly philosophical nature which called forth a new, ever more rigorous and demanding style of writing.

The Algiers crisis of 1958, the collapse of the French Fourth Republic, and the rise to power of de Gaulle ushered in a frightening new era in politics. Blanchot, who had not participated in national politics since the 1930s, threw himself into the very middle of the resistance against de Gaulle’s Fifth Republic. Throughout the late summer of 1958, he frequently met up with Marguerite Duras and Dionys Mascolo (a major influence on Blanchot’s political thinking during this time), and became involved with Mascolo’s anti-Gaullist paper (co-founded with Jean Schuster), Le 14 Juillet. The marked change in Blanchot’s political thinking was clearly evident in his manifesto, “Refusal,” published in October 1958, and in another anti-Gaullist piece, “The Essential Perversion.”

When Francis Jeanson and twenty-three other dissidents were put on trial, in September 1960, for opposing French colonial rule and supporting the Algerian struggle for independence, Blanchot, Mascolo, and a group of other intellectuals determined to pen a declaration of solidarity with the defendants. The resulting piece, commonly known as the “Manifeste des 121,” was written by Blanchot, and declared support for Algerian independence, as well as for those conscripts who refused to be drafted into the conflict. On the heels of this intervention in national politics, Blanchot and Mascolo (along with others) attempted, in 1960, to start an experimental new publication to be called “The International Review.” The publication aspired to solicit short texts on a variety of topics, in three languages (French, German, and Italian), to be written in fragmentary form. Though the ambitious project never fully materialized, it marked an important, early moment in Blanchot’s attempt at rethinking the notion of community beyond borders and fixed identity.

d. Responding to the Other

The late 1950s and early 1960s also saw, in addition to a renewed focus on the political, an emergence of significant stylistic and theoretical innovations in Blanchot’s writing. In October 1958, Blanchot used, for the first time in his published writing, the notion of “le neutre” as a lexical placeholder for the trace of what remains outside of being and non-being. The notion of the neuter would grow in prominence in Blanchot’s writings over the decades to come, comprising one of the most important tropes within his later writings. Around the same time, in 1958, Blanchot published new, important work on Nietzsche, confronting “head-on” the attempted Fascist appropriation of the thinker’s legacy, and seeking to rehabilitate Nietzsche as a thinker intrinsically resistant to all totalizing (Fascist) thought. A year earlier, in 1957, Blanchot had begun work on the text initially entitled “Waiting,” which would eventually reappear within the 1962 text, Awaiting Oblivion. “Waiting” is significant for a couple of reasons. It is a text comprised solely of fragments, conjoined loosely by a shared emphasis on the themes of forgetting, waiting, and temporality bereft of presence. In May 1959, these fragments were offered by Blanchot to a Festschrift produced in honor of Heidegger’s 70th birthday. The publication of Awaiting Oblivion involved a radical subversion of the categories of genre. Readers are left to ponder: Is it a work of experimental fiction? Is it a philosophical text? Or is it something altogether other? With Awaiting Oblivion, Blanchot puts into play a form of fragmentary writing that refuses enclosure within any fixed genre, serving as testimony, rather, to that which escapes all categorization, all thematization, and all definition. It is a text dedicated to radical alterity, and thus, to the neuter itself.

These developments in Blanchot’s thought would soon be supplemented by the writings of an old friend. In 1961, Levinas published his groundbreaking book, Totality and Infinity. Its account of an ethical metaphysics based upon man’s impossible burden of responsibility for the Other would prove influential upon Blanchot, whose writings, from 1961 onward, shift decisively into the domain of the ethico-political. Key to these developments in Blanchot’s thinking is the increasing prominence of the neuter. It is the neuter that Blanchot conceives as a notion that displaces the primacy of ontology and holds open the space of an ethico-political relationship always yet-to-come, always irreducible to fusion, identity, or Oneness. The neuter thus serves as a provocative rejoinder both to Heidegger and Hegel, whose philosophies (though quite different) similarly prioritize, in Blanchot’s view, the totalizing embrace of Being. It is during this period that Blanchot also continues to draw influence from Nietzsche’s texts, in which the exigency of fragmentary writing is given its supreme voice.

Published in 1969, The Infinite Conversation contains critical essays on a host of literary topics (Char, Duras, German Romanticism, Kafka, Flaubert, Roussel), as well as essays dealing with philosophical and theoretical considerations (Levinas, Simone Weil, Nietzsche, Heidegger, Freud, Bataille, Foucault), which are in turn punctuated and disrupted by instances of fragmentation and dialogue between unnamed interlocutors. The Infinite Conversation is undoubtedly Blanchot’s most stylistically diverse text, combining fragmentary texts and more standard literary critical writings, like those found in The Space of Literature or The Book to Come (1959). It is a text without a center-point, without a single unifying theme—unless this theme is the movement of dispersion and dislocation that has always already destabilized all pretense of unity, and exposed all interiority to that which is radically outside it.

During the events of May 1968, Blanchot found himself at the heart of the anti-authoritarian movement as a member of the Comité d’action étudiants-écrivains. Penning numerous, unsigned pieces for the group’s magazine, Comité, Blanchot espoused a radical politics based upon a rejection of all forms of hitherto existing political order: a communism without communism. By mid-1969, however, Blanchot had distanced himself from the group, citing as a reason (in a letter to Levinas) its position in support of Palestine and opposition to the state of Israel.

As the 1970s began, a veritable changing of the guard was underway. As post-structuralism entered its zenith, with Derrida and Deleuze producing many of their seminal writings, Blanchot’s health began to decline precipitously and death seemed all around him. Jean Paulhan passed away in 1969, then Paul Celan drowned himself in the Seine in April 1970. Blanchot himself endured hospitalization in the early 1970s, and in early 1972 wrote letters to his closest friends thanking them, as though retrospectively, for a life in which he was privileged to meet them. His 1973 text, The Step Not Beyond, which was written entirely in the fragmentary form, resembles at times a meditation on death—though less as a statement of its impending reality, than as a testimony to its interminable impossibility. Consigning us to a time without present, the act of writing, Blanchot insists, makes the process of dying endless and the instant of death unattainable.

e. Writing the Disaster

Many of these themes reemerge in his 1980 text, The Writing of the Disaster, only supplemented by a somewhat broader panoply of accompanying themes and emphases. Present here are explicit references to Levinas, a staple in Blanchot’s texts since the early 1960s, and Hegel—but these perennial sources of inspiration and provocation are accompanied now by a host of unexpected, other voices. Blanchot devotes space to an engagement with the psychoanalysts Serge Leclaire and D. W. Winnicott on the topics of narcissism and the primal scene; he references Melville’s “Bartleby,” and offers fragments on Derrida, Deleuze and Guattari, Heidegger’s obsession with etymology, and Nietzsche’s views on the Jews, among many other topics. At the heart of Blanchot’s text is the notion of the disaster itself. Not merely synonymous with the Holocaust, the dispersive force of the neuter, or the “impossible necessary death” (The Writing of the Disaster, p. 67) that has always already preceded (and ruined) every installation of subjective, egoic mastery—the polysemy of the disaster includes traces of each of these meanings, without being in any way reducible to a single fixed meaning or concept.

Crucially, the disaster remains outside of all presence, beyond representation, and divorced from possibility and truth. It is wholly “otherwise” than being or non-being. Yet, despite the disaster’s exteriority (and anteriority) with respect to each of these classic, philosophical notions, it is the disaster that accords each of these notions its respective meaning, on the condition that this meaning never coincide fully with itself. The disaster has always already touched, inhabited, compromised, and ruined every worldly edifice predicated upon stability, totality, unity, and Oneness before it can even be founded. The disaster is a “name” for that which turns every subject, every text, every historical narrative, and every political system ceaselessly outside itself, toward the radical alterity that escapes its enclosure, and serves as its condition of both possibility and impossibility.

Blanchot followed this, arguably his most challenging text, with another important text, The Unavowable Community, in late 1983. Here Blanchot lays out, with reference to Bataille and the novelist Marguerite Duras, among others, a rethinking of the notion of community as irreducible to the notions of self-identity and presence. A small book, entitled A Voice From Elsewhere, appeared in 1992, and a final, striking piece of short fiction, “The Instant of My Death,” was published in 1994. Blanchot passed away on February 20, 2003.

2. Engagement with Major Philosophers

a. Levinas

Blanchot and Levinas first met in Strasbourg in 1925 while studying philosophy. They soon developed a deep friendship that would last until Levinas’s death in 1995. Various anecdotes from their friendship are well-known. We know, for instance, that it was Levinas who first introduced Blanchot to Heidegger’s Being and Time in the late 1920s. A little over a decade later, it was Blanchot who helped secure a safe-haven for Levinas’s wife and daughter in a monastery during the war. Yet anecdotes like these can only offer a superficial sense of the profound bond that came to be formed between these two men, so different in their respective backgrounds, beliefs, and interests. The mutual debt of influence shared between them would alter each of their intellectual paths irrevocably, and serve as a catalyst for some of the most important developments in Blanchot’s own thinking.

When Blanchot, throughout his writings, engages with the ideas of Levinas (whom he considered, along with Bataille, his closest friend), it is never Blanchot’s strategy merely to repeat, uncritically, Levinas’s philosophical doctrines, much less to appropriate them as his own. Rather, Blanchot pays tribute to Levinas most devoutly at the precise moments in his texts when he accentuates the difference, and distance, between himself and Levinas. Fidelity to one’s friend, Blanchot suggests, requires a measure of compulsory infidelity. It is by bearing witness to the differences between himself and Levinas that Blanchot most eloquently testifies unto the profundity of their relationship.

Levinas’s name appears for the first time in Blanchot’s published work in a footnote to “Literature and the Right to Death,” published in 1947. And though explicit references to Levinas are rare within Blanchot’s voluminous critical output of the 1940s and 1950s, the presence of certain Levinasian tropes is nevertheless unmistakable during this period. Chief among these is the il y a, which both Levinas and Blanchot attempt to construe as a challenge to the fundamental ontology of Heidegger, as well as to Hegel’s philosophy of death. It is not until 1961, with the publication of Levinas’s Totality and Infinity, that Blanchot undertakes an explicit engagement with Levinas’s philosophy. This engagement initially takes the form of three chapters devoted to Levinasian philosophy in The Infinite Conversation (1969), and is then followed by numerous fragments in the pages of The Writing of the Disaster (1980), and a retrospective of their friendship (“Our Clandestine Companion”).

Whereas the scope of Blanchot’s earlier (1940s-1950s) allusions to Levinasian thought is limited primarily to a consideration of the il y a and its bearing upon the ontological status of the work of art (or literature), at the heart of these post-1961 engagements one finds a noticeable shift of attention and emphasis towards the ethico-political sphere—a move carried out in direct response to the provocative new directions introduced within Levinas’s own thought. Of major importance within Levinas’s philosophy during this period is the figure of the “Other” (Autrui). Standing in sharp contrast to the notion of otherness that had predominated throughout the Western philosophical tradition, Levinas’s Other does not admit of thematization, mediation, or reciprocity. This Other is radically irreducible to any notion of the Same or the Self. To stand in relation to the Other is to exist in “infinite relation” with that which shatters all forms of totality, abiding beyond both being and non-being. Moreover, as Levinas maintains, the Other burdens the subject with an ethical responsibility that is both impossible to decline and impossible to fulfill satisfactorily. This ethical relation is not chosen, but imposed upon the subject. It demands that the subject put the Other before all else.

It is in the context of these philosophical developments that Blanchot, without accepting any of this uncritically or without reservation, enters into explicit dialogue with Levinas’s texts, from 1961 onward. In the broadest of terms, what Blanchot aims to do, across these various engagements, is to explore ways in which the relation with absolute alterity described by Levinas might allow us to rethink the nature of human relations and community. In this sense, Blanchot is neither adopting Levinasian philosophy as his own, nor contradicting it, but rather pushing it toward its limit, to the point where the Levinasian philosophy of Transcendence, whose religious overtones loom large, opens onto a new form of secular humanism grounded in a concrete emphasis on ethico-political responsibility.

To move in this direction, Blanchot accords the Levinasian philosophy a privileged position within his texts, all the while refusing to spare it critique, interrogation, or transposition. Throughout his writings of the 1960s and beyond, Blanchot does not cease to pose probing questions towards Levinas’s texts. Who exactly is this Other to whom Levinas refers? Is it possible to name the Other as such without compromising his radical alterity? What is the meaning of the “ethics” to which Levinas refers? Is such an ethical comportment exclusive to believers of the Jewish faith? Is it dependent upon a belief in the Jewish God?

Difficult questions such as these are neither avoided by Blanchot, nor accorded facile resolution. Rather, they are explored in all their complexity and allowed to ramify and redouble themselves throughout the pages of Blanchot’s writings. Though he remains thoroughly committed to a rigorous atheologism, Blanchot acknowledges, in light of Levinas’s writings of the early 1960s, the profound philosophical importance of Judaism. What makes Judaism so distinctive, so philosophically important, according to Blanchot, is both its “nomadic” essence and the privilege it accords to man’s sacred responsibility for the Other. Unlike Heidegger’s “pagan” philosophy, for example, which situates truth in rooted dwelling and permanence, Blanchot highlights the impressive manner in which the “truth” of Judaism develops amidst exile, dispersion, and up-rootedness. Moreover, it is Judaism which accords an unparalleled importance to mankind’s relation (of non-relation) with the infinite.

Yet while Levinas understands this relationship to the (transcendent) Other primarily in terms of the paradigmatic “asymmetry” of man’s rapport with God, Blanchot seeks to reconfigure this relationship in terms of the “double dissymmetry” of a relation between two or more human beings. Where the Levinasian account stresses hierarchy and places emphasis upon the verticality of man’s relationship with the Most-High, Blanchot proposes a non-hierarchical relationship between human beings that is irreducible to unity or duality. Dissymmetry, in the Blanchotian account, means that the relation (of non-relation) between the Self and the Other is redoubled by the Other’s relation (of non-relation) with respect to the Self. Importantly, this redoubling does not lead, in Blanchot’s account, to any dialectic of reciprocity or recognition. It is not the presence of the divine, as in Levinas’s ethical metaphysics, that saddles the Self with infinite responsibility; rather, it is the presence of one’s own neighbor, one’s fellow man, that introduces a burden of responsibility that can neither be satisfied nor ignored.

While Blanchot remains skeptical of Levinas’s heavy reliance upon a conceptual lexicon (God, the Other, ethics, and so forth) which seems to betray the very alterity it seeks to evoke, he senses in Levinas’s project (and in Judaic philosophy, more broadly) a provocative antidote to the philosophies of totality. Blanchot senses, moreover, within Levinasian philosophy, a precedent for rethinking the meaning of social responsibility and community outside of the economy of being. The influence of Levinas’s philosophy upon Blanchot’s thinking, particularly from 1961 onward, is thus far-reaching and profound.

b. Hegel

Much as thinkers of the medieval period would have referred to Aristotle simply as the philosopher, for Blanchot, it is Hegel who most embodies the discourse of philosophy construed as a systematic whole. As Blanchot writes in The Infinite Conversation, Hegel is the thinker “in whom philosophy comes together and accomplishes itself” (The Infinite Conversation, p. 4). Hegelian philosophy thus becomes the backdrop for much of what Blanchot has to say, not only about philosophy proper, but also about history and literature. What Hegel represents is the false promise of totality in all its various forms (epistemological, ontological, political, historical, and textual). The “Hegelian system” becomes an emblem for every system, that is, for every attempt at achieving exhaustive, irrefutable self-enclosure—whether this be construed as a system of Absolute Knowledge or even something like Mallarmé’s “Absolute Book.”

Confronted by a discourse that seeks authority and mastery over “the All,” Blanchot’s strategy, in reading Hegel, is to position himself obliquely, along the margins of Hegel’s text, neither opposing Hegel directly, nor endorsing him. A Blanchotian reading will typically follow the author of the Phenomenology up to the point where the text begins to unravel on the basis of its own logic and its philosophy gives way to aporia. Two early examples of this can be found in Blanchot’s essays from 1947 and 1948, entitled respectively, “The Spiritual Animal Kingdom” and “Literature and the Right to Death.” Here we find Blanchot, under the influence of a Kojèveian reading of Hegel, coming to highlight the paradoxes implied by the notions of death and negativity in Hegel’s text.

Not unlike Bataille, Blanchot senses an air of fraudulence surrounding the Hegelian system’s pretense of enclosure. Beginning with Kojève’s thesis that Hegel’s philosophy is a philosophy of death, Blanchot wonders what happens to this negativity, which serves as the driving force for all history, once history has arrived at its end-point. Moreover, if negativity is precisely what provokes the dialectic of history into motion in the first place, then does this not assign to negativity a position simultaneously “before” and “beyond” the very system in question? Such excess, or non-recuperable exteriority, is precisely what Hegel’s system seems to presuppose and yet simultaneously reject. This means that the coherence of the Hegelian system depends upon the very thing that it excludes. Pointing out this dependency of the inside upon the outside is a frequently recurring Blanchotian trope, and it is used with great effect here, with respect to Hegel. As Blanchot himself writes in The Writing of the Disaster, “What exceeds the system is the impossibility of its failure, and likewise the impossibility of its success” (The Writing of the Disaster, p. 47).

By the late 1960s and early 1970s, Hegel’s name is increasingly juxtaposed, in Blanchot’s texts, with the name Nietzsche. If Hegel is seen by Blanchot as the great totalizer, then Nietzsche, on the other hand, is the thinker without enclosure, without a Hauptwerk, without a system, and without any doctrine that would not, simultaneously, suspend itself. If Hegel, moreover, is the great thinker of possibility (the basis of which, according to the Kojèveian interpretation, is death), then it is Nietzsche who comes to emblematize, for Blanchot, the vertigo of eternal return which contests every origin and every end, suspending the work of death, and consigning us to the impossibility of dying.

c. Nietzsche

Blanchot’s Nietzsche is a complex figure positioned both within metaphysics and “always already” outside it. He is, as Blanchot asserts in 1958, “the last philosopher” (The Infinite Conversation, p. 141), a thinker whose texts comprise the culminating event in Western metaphysics. At the same time, Blanchot insists, Nietzsche is outside metaphysics, gesturing us toward the dispersive, the fragmentary, and the incommunicable. During his decades-long engagement with Nietzsche’s thought, Blanchot offers incisive commentary on a wide variety of topics: nihilism, the Last Man, the Will to Power, Dionysus, the philosophy of time, the future, the Death of God, perspectivism, and ecstatic experience. Moreover, Blanchot’s texts from the late 1950s onward demonstrate an acute sensitivity to the political efficacy and political baggage of Nietzsche’s thought. Acknowledging Nietzsche’s horrific appropriation by fascist ideologues during the 1930s and 1940s, Blanchot nevertheless seeks to portray Nietzsche as a paradigmatically non-systematic thinker, whose thought (if followed rigorously and without compromise) resists all attempts at appropriation and mastery. To the extent that one reads Nietzsche attentively, one sees him to be a thinker at odds with all forms of totality, totalitarianism, and anti-Semitism.

Blanchot’s first substantive engagement with Nietzsche’s philosophy appears in late 1945. Here, in an essay entitled “On Nietzsche’s Side,” Blanchot reinscribes Karl Jaspers’ seminal thesis on Nietzsche, namely, that the “essential impulse” of Nietzsche’s thought resides in the tendency toward self-contradiction. In a reading that shows, once more, the influence of Kojève, Blanchot maintains that these incessant contradictions do not “get to rest in some higher synthesis, but hold themselves together by an increasing tension” (The Work of Fire, p. 290). This tendency of Nietzsche’s thought to contradict itself without resolution points toward the broader role which Nietzsche will play within Blanchot’s texts as spokesperson par excellence for non-teleological thinking.

Central to this non-teleological capacity of Nietzsche’s thought is the notion of the eternal return. Following immediately upon the heels of Klossowski’s “Forgetting and Anamnesis” paper (1964), Blanchot begins to develop, in the mid-1960s, a distinctive and radical reading of the eternal return which views Nietzsche’s “thought of thoughts” less as a doctrine than as a simulacrum of a doctrine. In the pages of The Infinite Conversation, Blanchot proposes a novel thesis concerning the reason for Zarathustra’s (and Nietzsche’s own) regimen of postponement and deferral with respect to the proclamation of the message of eternal recurrence. This postponement, Blanchot argues, should not be attributed to some contingent incapacity on the part of the speaker to articulate the thought faithfully or exhaustively, but rather, to the thought’s radical aversion to all presence. The eternal return is continuously deferred from all thought, according to Blanchot, because deferral of all presence is the very meaning of the thought itself. What “returns”—if anything—is an event that has never been present; or rather, an event that hollows out presence itself.

By the time of The Step Not Beyond (1973), Blanchot’s writing on Nietzsche becomes increasingly oblique. In a set of remarks penned in direct response to Klossowski’s Nietzsche and the Vicious Circle, but also suggesting engagement with Deleuze’s late-1960s work on the eternal return, Blanchot suggests that “dissymmetry is at work in repetition itself” (The Step Not Beyond, p. 42), meaning that the past does not repeat the future in the same way that the future repeats the past. Further reinscription of the thought of eternal return occurs in The Writing of the Disaster, where Blanchot repeatedly evokes a modality of temporal repetition that has always already dislodged presence, suspended the present, and withdrawn from the Self any basis upon which to construct a coherent notion of self-identity or subjectivity.

d. Heidegger

The provocation posed by Heidegger to French theory during the mid-20th century is well-documented. From Sartre and Lacan, to Levinas and Derrida, the imposing demand of Heidegger’s philosophy weighed heavily upon countless thinkers. In this respect, Blanchot was no exception. Blanchot’s good comprehension of German, and his early exposure to Heidegger’s work (he was introduced to it via Levinas, in the late 1920s), made a significant engagement with the author of Being and Time perhaps inevitable.

Over the span of Blanchot’s published writings, we find countless instances of substantive engagement with Heidegger’s thought, initially on issues pertaining to the work of art, poetics, and Hölderlin. Later, these engagements would come to include a deeper questioning of the status of Being, the problem of nihilism, and the notion of futurity, among other topics. Throughout his post-war writings, Blanchot displays acute sensitivity and great nuance in dealing with Heidegger’s legacy as a thinker once ensnared by the allure of National-Socialism. On one hand, Blanchot is quick to acknowledge that, in committing his philosophical lexicon to the cause of the Nazi party in a public endorsement of Hitler in 1933, Heidegger had cast boundless suspicion over his own discourse and forever tarnished it; on the other hand, Blanchot sees Heidegger’s philosophy as worthy of commentary and to a certain extent inescapable as a point of reference, insofar as it presents (like Hegel, but in a different register) an account of the totalizing embrace of Being. Heidegger’s challenge to philosophy is a challenge that is impossible to ignore.

As early as his review of Sartre’s Nausea, in 1938, Blanchot can already be seen insisting upon the importance of Heidegger’s account of the crisis faced by modern art. And though explicit references to Heidegger during the wartime years are rare, it is clear that Blanchot had already assimilated, by this time, much of Heidegger’s thinking. Nowhere is this more evident than in Blanchot’s early writings on Hölderlin, which strongly reflect a Heideggerian bent. For Heidegger, Dichtung (which means “poetry” in common parlance, but also refers etymologically to the notion of “invention”) comes to be privileged as the most essential type of artwork because it serves as the basis for Dasein’s historical being, as well as serving as the origin of language itself. According to Heidegger, all genuine work of artistic creativity has Dichtung at its origin. Blanchot, in the early 1940s, follows Heidegger by insisting upon the privileged role of poetic language as foundational with respect to the world. It is poetic language that inaugurates a world and discloses the human subject.

Only around 1946, with the publication of his “The ‘Sacred’ Word of Hölderlin,” does Blanchot begin to take a noticeable distance from Heidegger. In this essay, Blanchot finally rejects the Heideggerian reconciliation between Dichtung and Being, and offers an account of Hölderlin’s poetic work that views it less as an act of ontological foundation, than as a site of irresolvable tension wherein the poem ceaselessly confronts its own impossibility and groundlessness.

By the time of Blanchot’s 1952 essay “Literature and the Original Experience,” the similarities and differences between his and Heidegger’s views on art and poetry are even more starkly defined. What the two thinkers share, in a general sense, is a refusal of any aesthetic philosophy based upon the distinction between form and content, subject and object. Moreover, each thinker builds his account from an initial confrontation with Hegel’s Aesthetics, and its famous injunction that “art today is a thing of the past.” But whereas Heidegger insists upon the work’s privileged relation to truth (as “unconcealment”), and hence to the world, Blanchot develops an account of art and literature that stresses their radical exteriority with respect to the world, work, and truth.

Nor is Blanchot’s engagement with Heidegger by any means limited to aesthetics. In the midst of his return to national politics, and his in-depth immersion into the philosophy of Nietzsche, Blanchot offers an important commentary, in 1958, on Heidegger’s exchange with the philosopher Ernst Jünger on the aporias of nihilism. Shortly thereafter, Blanchot is invited to contribute a piece of writing for inclusion within Heidegger’s 70th birthday Festschrift. This piece, entitled “Waiting,” is comprised of a series of fragments, marking Blanchot’s first published foray into a textual form that would assert itself with increasing prominence in his writings over the decades that followed. In this piece, which was later republished with substantial revisions and additions as part of the 1962 text, Awaiting Oblivion, Blanchot describes a type of waiting devoid of transitivity, in which time is no longer measured as a succession of present “now-moments,” but left free from all appropriation and calculation. Waiting here does not refer to an anticipation for something or someone which could ever come to occupy a moment of fixed-presence. Rather, it signifies a waiting for a moment that dislodges chronological temporality: a waiting for nothing other than waiting itself.

Blanchot pays tribute here to the influence of Heidegger in this account of non-representational, post-metaphysical temporality, and yet, as Blanchot’s essay on Heraclitus, first published in 1960, makes clear, a profound divergence in their respective approaches has occurred. While Heidegger’s Heraclitus famously offers us an insight into the unconcealment of Being, Blanchot proposes to read the Heraclitean fragments as an instance of language construed not as a shelter for Being, but as a response to the radical alterity of that which remains outside of Being altogether. Here, as in so many of Blanchot’s writings of the 1960s and 1970s, language assumes a double function as that which names the possible—but also bears witness to that which infinitely precedes and exceeds all ontology. Moving somewhat away from the notion of the il y a, which was still an ontological construct (albeit a subversive one), Blanchot increasingly deploys the notion of the neuter, a pseudo-concept intended to displace all ontological primacy. Having nothing to do with either being or non-being, the neuter serves as a condition of both possibility and impossibility for Heidegger’s ontological framework, implicitly turning aside the question of the meaning of Being, and upstaging it with the more urgent question of the other.

3. Key Concepts and Themes

a. Two Kinds of Death

Blanchot’s account of the so-called “two kinds of death” is a well-known component of his literary criticism of the 1950s and a recurrent point of emphasis in his ongoing dialogue with the philosophies of Hegel and Heidegger.

For Hegel, as Blanchot notes, death is what produces all possibility of meaning in the world by serving as the catalyst for the dialectic itself. Death is constantly put to work and subsequently recuperated, in Hegel’s system, leading history toward its point of inevitable culmination. For Heidegger, death is likewise related to the notion of possibility. More specifically, it is construed, in Being and Time, as one’s own possibility, a possibility which is non-transferable and not to be outstripped. It comprises the very basis for Dasein’s authentic existence.

In his writings, Blanchot does not directly oppose these accounts of death. What Blanchot suggests, however, is that there is also another side to death which these philosophies marginalize or exclude. It is a side in which the power and possibility of death are suspended. It is this phenomenon to which Rainer Maria Rilke, in a letter from 1910, seeks to bear witness with the words: “Nothing is possible for me anymore, not even dying.” Here, all desire for a masterful, self-actualizing, proper death is forestalled by the realization that death, in fact, is never accessible for the self.

In a manner somewhat reminiscent of Epicurus, Blanchot argues that this second kind of death is incommensurable with any subjective, or personal, experience. Commenting on death in The Space of Literature, Blanchot writes, “I have no relationship with it, it is that toward which I cannot go, for in it I do not die, I have fallen from the power to die. In it they die; they do not cease, and they do not finish dying” (The Space of Literature, p. 155). Thus death, which for Hegel and Heidegger is associated with possibility, comes to be contrasted with the anguish of anonymous death, which is impossible for the Self, and can be neither willed, nor mastered, nor even undergone by any personal subject—in the present.

By articulating this doubleness associated with the notion of death, Blanchot is able to challenge both Hegel and Heidegger on fundamental points of their respective philosophies. If death contains within itself this trace of impersonality that expels all attempts at mastery, propriety, and power, then the consequences of this are significant. The so-called work of the concept in Hegel, which is powered by death, must now be understood as silently accompanied by worklessness and impossibility. Likewise, in the context of Heidegger’s thought, if death is double, then no death can ever be wholly proper or authentic. In contrast to all philosophies, going back to Plato, in which death is equated with Truth, presence, or consummation, Blanchot thus seeks to emphasize the error, absence, and interminability associated with dying. In short, Blanchot is showing that death remains a radically indeterminable, or volatile, concept whose inclusion within any system poses a challenge not only to systemic coherency, but also to definitive closure.

b. The il y a

The notion of the il y a, which means “there is,” appears prominently throughout Blanchot’s writings and comprises one of the most direct links between his texts and those of early Levinas. Though the trope of the il y a appears in Blanchot’s literary writings dating back to the mid-1930s (“The Last Word”), its most significant early deployment occurs in Blanchot’s novel Thomas the Obscure, which Levinas then explicitly references in his 1947 text, Existence and Existents. Blanchot and Levinas thus develop the notion of the il y a somewhat in tandem during the period in question, influencing one another, while gradually coming to propose subtly different points of emphasis in their respective usages of the phrase.

The il y a features two seemingly contradictory traits. First, it involves the presence of the absence of being. Second, it points toward the inescapability of being or its radical resistance to negation. In coming to formulate these difficult notions, Blanchot and Levinas are engaging critically with the account of fundamental ontology offered in Heidegger’s Being and Time. In contrast to Heidegger’s insistence upon the primacy of being-in-the-world, Blanchot and Levinas seek to articulate a more “primal” ontological state, namely, one which involves the notion of being unmoored from all objects. It is a state characterized by the sheer absence of a world. In the midst of the il y a, the world and its possibilities vanish, leaving as a palpable residue the preconceptual singularity of being itself. Gone is the original generosity of the Heideggerian “gift of Being.” In its place, Blanchot and Levinas assert the vertiginous horror of objectless being, sheer anonymity, and insomniac wakefulness. The il y a thus signifies something even more archaic than ontological difference; it involves a state which serves subversively as a condition of both possibility and impossibility for the Heideggerian distinction between being (Sein) and beings (das Seiende).

This status of serving as both a condition of possibility and impossibility is indeed crucial to the notion of the il y a. As a foundational point, the il y a shows itself to undermine the very things it conditions. Because it serves as a condition for all propositions, whether affirmative or negative, the il y a necessarily remains impervious to the force of negation. When everything else has been negated (at the end of history), the murmur of the indestructible il y a remains. Moreover, to the extent that it serves as a condition for the world of objects, the il y a poses an inevitable threat to the sense and meaning of the world itself, by confronting the world with an objectless void that precedes and exceeds it. The il y a is what interrupts both Heidegger’s being-in-the-world and Hegel’s dialectic by exposing them to something that is unassimilable and foreign, yet necessarily intimate and immersive.

Blanchot begins to move subtly beyond the limits of the Levinasian account of the il y a when he turns this discussion back toward the question of literature. According to Blanchot, what literature seeks as its aim is nothing other than this very state of preconceptual singularity. Literature, Blanchot insists, seeks to bear witness neither to worldly meaning, conceptual truth, nor subjective experience, but rather to a state which precedes all meaning, truth, and subjective experience. Literature, therefore, is both conditioned by the il y a and aims to return toward it. This is what grants the literary work its unique status in Blanchot’s thought: in order to exist, the literary work must necessarily harbor within itself the murmur of the il y a (the work’s origin) which is synonymous with the absence of the work. The work thus contains within itself the trace of its own dissolution, since what makes it possible also puts it in touch with its own impossibility. This emphasis on circularity is a key aspect of Blanchot’s unique interpretation of the il y a, and it is a circularity which, in spite of the vertigo it involves, demands to be radically affirmed.

Periodic references to the il y a continue to appear well into Blanchot’s later writings. Its importance as a trope, however, is largely displaced, from the early 1960s onward, by an even more provocative pseudo-concept that Blanchot calls “the neuter.” Whereas the il y a remains situated, at least nominally, within the economy of being and non-being (even as it challenges this economy), the neuter suspends the question of being or non-being altogether and remains radically irreducible to any ontology whatsoever. While the il y a, in the writings of Blanchot and Levinas, evokes the groundless ground of all being, or as Blanchot puts it, the impossibility of not-being, the neuter gestures us even further, toward the very limit of philosophy as such.

c. The Neuter

The neuter is one of the most difficult concepts in Blanchot’s critical apparatus. We might casually think of the neuter as a kind of third gender opposed to the strictly male or female genders. But this is an approach that Blanchot rejects. The neuter is not a gender or a genre of any kind, he insists. It is not a class of beings. Indeed, for Blanchot, the neuter is set apart from everything visible and invisible, everything present and absent. It is commensurable no less with a subject than with an object. The neuter is not of this world, or any world, for that matter. And yet, it is by no means transcendent either. The neuter stands outside of all totality, all unity, all Oneness. It withdraws itself, or effaces itself, the very moment it is uttered or inscribed. The neuter is precisely a (nameless) name for the movement of thought that draws every word and every concept ceaselessly towards its outside, its other.

In practical terms, the neuter evokes a word’s ability to suspend and remark itself in such a way that it ceases to signify what it signifies, and it begins to drift into the indeterminacy of multiple meanings. The neuter is a kind of principle of “original” difference and differentiation that both conditions and threatens the installation of all forms of self-identity, meaning, and truth. It thus bears striking similarities to Derrida’s “différance” to the extent that the neuter establishes the non-coincidence of language with itself. If language is understood in terms of the differential relations between signs, then it is the neuter which has always already brought this difference into play. The neuter is what exposes every word to an infinity of meanings, making language possible on the condition that it is constantly traversed by a radical alterity that both precedes and exceeds it.

Blanchot’s “discovery” of the neuter (in the early 1960s, in this sense, though the term had been used previously in his writings) had a considerable impact on the development of his thought as a whole. It is widely known that nearly all of the chapters in Blanchot’s 1969 text, The Infinite Conversation, had been previously published as stand-alone articles in journals such as the Nouvelle Revue française. Significantly, many of these original articles were substantially modified by Blanchot in the years that elapsed between their initial publication (some date all the way back to the mid-1950s) and their ultimate inclusion within the pages of The Infinite Conversation. These revisions reflect a shift in Blanchot’s work that began to take place in the 1960s, and that impacted his views on being, language, and philosophy rather dramatically. Integral to this shift is the emergence of the neuter in Blanchot’s theory and writings. His revisions leading up to the publication of The Infinite Conversation reflect this increasing awareness of the neuter’s capacity for displacing, suspending, and ungrounding the very language of philosophy that it conditions.

Thus, in the 1969 republished versions of his earlier articles, Blanchot places scare-quotes around the words “being” and “presence,” replaces the word “logos” with “difference,” and replaces the terms “impersonal” and “anonymous” with “neuter.” These changes are anything but cosmetic. Rather, they reflect a concerted effort on Blanchot’s part to assert the trace of otherness and difference at the heart of philosophy and language. By no means merely an exercise in semantics, the emergence of the neuter, in Blanchot’s writings of the 1960s, goes hand-in-hand with the increasing emphasis on the ethico-political that comes to the fore in his work around the same time, largely in response to developments in Levinas’s thought. “Every encounter,” Blanchot writes, “where the Other suddenly looms up and obliges thought to leave itself, just as it obliges the Self to come up against the lapse that constitutes it and from which it protects itself—is already marked, already fringed by the neutral” (The Infinite Conversation, p. 306).

In The Writing of the Disaster, Blanchot calls the neuter, alongside the notions of the outside, the disaster, and return, the “four winds of the spirit’s absence…the names of thought, when it lets itself come undone and, by writing, fragment” (The Writing of the Disaster, p. 57). The neuter, like the disaster, refers to a movement of thought beyond meaning, that makes meaning possible (on the condition that it never be identical to itself). Together, these notions comprise Blanchot’s most rigorously elaborated tropes for thinking (the non-thought of) the absolute alterity of the outside.

d. Community and the Political

There are several phases to Blanchot’s engagement with politics, making any all-encompassing encapsulation of his views on the topic nearly impossible. Just as his thinking about major issues of philosophical and literary importance undergoes alteration in the sixty-plus years of his career as a mature writer—so, too, do his views on politics evolve greatly.

One can readily identify a very early phase, spanning roughly the decade of the 1930s, during which Blanchot contributed numerous articles and essays to the journals of the French far-right. These articles espoused a virulently anti-Hitlerian rhetoric and took a dim view of any attempts at appeasement or compromise with regard to the growing German menace. Problematically, though, the immediate target of Blanchot’s derision in these pieces was often the parliamentary French government, with its perceived weakness in the face of the Nazi threat. Blanchot’s advocacy, during this period, of terrorism against the liberal state as a means of “national salvation” showed a clear, antidemocratic bent to his early thinking. Much later in his writings, Blanchot came to address this period through a self-critical lens, claiming that despite his youthful participation in French nationalist circles, he consciously refused association with anti-Semitic elements on the far-right.

Between the start of the Second World War and 1957, Blanchot assumed a largely apolitical stance, spending much time in the south of France, and producing an extraordinarily prolific outpouring of literary texts and critical essays. With his return to Paris in 1957, Blanchot reentered the sphere of national politics. He soon developed a close friendship with Dionys Mascolo, who would launch, in July 1958, the paper Le 14 Juillet, alongside Jean Schuster. The paper was oriented around resistance to General de Gaulle’s return to power, and Blanchot elected to write two important articles for publication. Central to Blanchot’s opposition to the regime was his staunch refusal of de Gaulle’s claim to embody the French national destiny. Blanchot saw de Gaulle’s recourse to the rhetoric of national salvation and a quasi-religious politics as a perversion of the highest order. Such perversion, according to Blanchot, demanded vigorous opposition and categorical refusal. By 1958, therefore, one can see Blanchot explicitly and forcefully rejecting precisely the kind of politics (based on military might, patriotism, and national salvation) that he had advocated as a young journalist in the 1930s.

If anything, Blanchot’s politics in the years that followed his initial collaboration with Mascolo’s paper only grew more progressive, more radical. In September 1960, when twenty-four French and Algerian dissidents were put on trial for subverting the French colonial efforts in Algeria, Blanchot was a driving force behind the production of the so-called “Manifeste des 121,” a text which endorsed the right of Frenchmen to refuse to be drafted into the Algerian conflict, and voiced support for Algerian independence. Along with Mascolo, and several others, Blanchot then sought to channel his efforts on the “Manifeste” into an even more ambitious project: the creation of an international journal of “total criticism,” which would meld together political, literary, and scientific discussions. This “International Review” would be published in three languages (French, German, and Italian), in a format composed primarily of fragments. By early 1964, however, the project for this experimental journal was abandoned.

Blanchot’s participation in left-wing politics, however, would not wane. During the événements of May 1968, Blanchot became a member of the Comité d’action étudiants-écrivains, a group of revolutionary students and writers who agitated against the government and passionately rejected all forms of representational politics predicated upon the pursuit of power. As a member of this radical group, Blanchot anonymously penned numerous texts for its semi-secret magazine, Comité. By March 1969, though, the group had begun to break apart, and Blanchot himself disavowed any further participation in it, due to the group’s position (which was then common in extreme leftist circles) against Israel and in favor of Palestine.

Blanchot thus carried with him, into the 1970s, 1980s, and beyond, a rather unique political style combining aspects of left-wing radicalism, social justice advocacy (as seen, for example, in his writings against apartheid, in support of Salman Rushdie, and later, in favor of gay rights), and unwavering support for the state of Israel. Indeed, the impossible memory of the camps, and the burden of responsibility associated with it, factors heavily into the fragments that came to comprise Blanchot’s texts, The Step Not Beyond and The Writing of the Disaster. The Holocaust looms particularly large, here, as a catastrophic provocation in relation to which all forms of politics whatsoever must be judged, calling forth a political response which rejects all forms of totality or totalitarianism, demanding an infinite attentiveness to the other.

Blanchot refers, at times, to such a politics as “communism.” A Blanchotian form of communism, however, would exclude all forms of preexisting community. Such a communism would have no historical or theoretical precedent. It would be, strictly speaking, a communism solely of the future, one that would reject all forms of previously established communal order. As the etymology here suggests, rethinking communism involves nothing less than rethinking the meaning of community itself. This is a project that Blanchot, inspired by the work of Philippe Lacoue-Labarthe and Jean-Luc Nancy, embarked upon in his 1983 text, The Unavowable Community, and which comprises one of the major focuses of his later writings.

The politics evoked within these writings reject any notion of community based on the notion of fusion, communion, or nationalism. They emphasize, instead, a double demand. First, to affirm the necessity of a break, a rupture, in the dialectical development of political history. This involves proactive political engagement, agitation, and advocacy. Secondly, though, beyond this demand to create an interruption in the politics of possibility through concrete, worldly intervention, there is the requirement of bearing witness to an infinite demand for justice which exceeds all calculation, all possibility, and all work. This second demand is what specifically requires the community to look beyond all forms of self-identity or self-presence in order to assume an impossible responsibility for the nameless other, without identification, and without resources, who is always yet to come. Such is the challenge, as daunting as it is urgent, that is inseparable from Blanchot’s later thought.

4. References and Further Reading

a. Major Works

  • Thomas l’Obscur, Paris, Gallimard, 2005.
  • Aminadab, Paris, Gallimard, 1942.
  • Faux Pas, Paris, Gallimard, 1943.
  • Le Très-Haut, Paris, Gallimard, 1948.
  • L’Arrêt de mort, Paris, Gallimard, 1948.
  • La Part du feu, Paris, Gallimard, 1949.
  • Lautréamont et Sade, Paris, Minuit, 1949.
  • Au moment voulu, Paris, Gallimard, 1951.
  • Celui qui ne m’accompagnait pas, Paris, Gallimard, 1953.
  • L’Espace littéraire, Paris, Gallimard, 1955.
  • Le Dernier Homme, Paris, Gallimard, 1957.
  • Le Livre à venir, Paris, Gallimard, 1959.
  • L’Attente L’Oubli, Paris, Gallimard, 1962.
  • L’Entretien infini, Paris, Gallimard, 1969.
  • L’Amitié, Paris, Gallimard, 1971.
  • La Folie du jour, Montpellier, Fata Morgana, 1973.
  • Le Pas au-delà, Paris, Gallimard, 1973.
  • L’Écriture du désastre, Paris, Gallimard, 1980.
  • La Communauté inavouable, Paris, Minuit, 1983.
  • Michel Foucault tel que je l’imagine, Montpellier, Fata Morgana, 1986.
  • Une voix venue d’ailleurs : Sur les poèmes de Louis-René des Forêts, Plombières-les-Dijon, Ulysse, Fin de Siècle, 1992.
  • L’Instant de ma mort, Montpellier, Fata Morgana, 1994.
  • Écrits politiques 1958-1993, Paris, Lignes-éditions Léo Scheer, 2003.

b. English Translations

  • Death Sentence (1978). New York: Station Hill Press.
  • The Gaze of Orpheus and Other Literary Essays (1981). New York: Station Hill Press.
  • The Madness of the Day (1981). New York: Station Hill Press.
  • The Sirens’ Song (1982). Brighton: Harvester.
  • The Space of Literature (1982). Lincoln: University of Nebraska Press.
  • Vicious Circles, followed by ‘After the Fact’ (1985). New York: Station Hill Press.
  • When the Time Comes (1985). New York: Station Hill Press.
  • The Writing of the Disaster (1986). Lincoln: University of Nebraska Press.
  • The Last Man (1987). New York: Columbia University Press.
  • Michel Foucault as I Imagine Him in Foucault/Blanchot (1987). New York: Zone Books.
  • Thomas the Obscure (1988). New York: Station Hill Press.
  • The Unavowable Community (1988). New York: Station Hill Press.
  • The One Who Was Standing Apart From Me (1992). New York: Station Hill Press.
  • The Step Not Beyond (1992). Albany: State University of New York Press.
  • The Infinite Conversation (1993). Minneapolis: University of Minnesota Press.
  • The Blanchot Reader (1995). Oxford: Blackwell.
  • The Most High (1995). Lincoln: University of Nebraska Press.
  • The Work of Fire (1995). Stanford: Stanford University Press.
  • Awaiting Oblivion (1997). Lincoln: University of Nebraska Press.
  • Friendship (1997). Stanford: Stanford University Press.
  • The Station Hill Blanchot Reader (1998). New York: Station Hill Press.
  • ‘The Instant of My Death’ in Maurice Blanchot and Jacques Derrida, The Instant of My Death / Demeure: Fiction and Testimony (2000). Stanford: Stanford University Press.
  • Faux Pas (2001). Stanford: Stanford University Press.
  • Aminadab (2002). Lincoln: University of Nebraska Press.
  • The Book to Come (2003). Stanford: Stanford University Press.
  • Lautréamont and Sade (2004). Stanford: Stanford University Press.
  • A Voice from Elsewhere (2007). Albany: State University of New York Press.
  • Political Writings, 1953-1993 (2010). New York: Fordham University Press.

c. Secondary Bibliography

  • Bident C., Maurice Blanchot, partenaire invisible, Paris, Champ Vallon, 1998.
  • Bruns G., Maurice Blanchot: The Refusal of Philosophy, Baltimore, Johns Hopkins Press, 1997.
  • Collin F., Maurice Blanchot et la question de l’écriture, Paris, Gallimard, 1971.
  • Derrida J., Parages, Paris, Galilée, 1986.
  • Fort J., The Imperative to Write: Destitutions of the Sublime in Kafka, Blanchot and Beckett, New York, Fordham University Press, 2014.
  • Fynsk C., Last Step: Maurice Blanchot’s Exilic Writings, New York, Fordham University Press, 2013.
  • Hart K., The Dark Gaze: Maurice Blanchot and the Sacred, Chicago, University of Chicago Press, 2004.
  • Hewson M., Blanchot and Literary Criticism, London, Bloomsbury Academic, 2011.
  • Hill L., Blanchot: Extreme Contemporary, London, Routledge, 1997.
  • Hill L., Maurice Blanchot and Fragmentary Writing: A Change of Epoch, London, Continuum, 2012.
  • Holland M., Avant dire: essais sur Blanchot, Paris, Hermann, 2015.
  • Iyer L., Blanchot’s Communism, London, Palgrave Macmillan, 2004.
  • Kuzma J., The Eroticization of Distance: Nietzsche, Blanchot, and the Legacy of Courtly Love, Lanham, Lexington, 2016.
  • Lacoue-Labarthe, P., Agonie terminée, agonie interminable: Sur Maurice Blanchot, Paris, Galilée, 2011.
  • Nancy J., The Disavowed Community, New York, Fordham University Press, 2016.
  • Nancy J., Maurice Blanchot, passion politique, Paris, Galilée, 2011.

Author Information

Joseph Kuzma
Email: jkuzma@uccs.edu
University of Colorado, Colorado Springs
U. S. A.

Plato: The Timaeus

There is nothing easy about the Timaeus. Its length, limited dramatic discourse, and arid subject-matter make for a dense and menacing work. But make no mistake, it is a menacing work of great subtlety and depth. Cosmology has traditionally received the bulk of scholarly attention. No less important, however, are the dialogue’s narrative elements, beginning with its characters. Socrates needs no introduction, yet who are Timaeus, Critias, and Hermocrates, and why does Plato give them starring roles? Also worth considering is the dialogue’s narrative structure. What begins as a snappy exchange between Socrates and Timaeus soon gives way to a pair of protracted speeches. It is not like Socrates to sit idly by while others pontificate, yet he does. Is this a sign that Socrates endorses these speeches or is there another reason for his silence? And where is Plato? He is absent from the drama, but to what extent is he philosophically present? When reading a dramatic work, one ordinarily assumes a critical distance between the author and his characters. Is the Timaeus any different? Does Plato endorse any of the ideas presented? If he does, is there any way of telling which ideas, given that he never speaks for himself?

Like any Platonic dialogue, the Timaeus is dynamic and multifarious—a complex interplay between muthos and logos, art and argument, theatrics and theory. The purpose of this entry is not to render a definitive interpretation of the dialogue, but rather to reveal the possibilities afforded by a close reading of the text.

Table of Contents

  1. Authorship
  2. Date of Composition
  3. Dramatis Personæ
    1. Timaeus
    2. Socrates
    3. Critias
    4. Hermocrates
  4. Dramatic Date and Setting
  5. Narrative Structure
  6. Outline & Analysis of the Dialogue
    1. Prologue, Part 1: An Invitation to Storytelling (17a–20c)
    2. Prologue, Part 2: The Story of Atlantis (20c–27b)
    3. Monologue, Part 1: The Creation Story of Intellect (27c–47e)
    4. Monologue, Part 2: The Creation Story of Necessity (47e–69a)
    5. Monologue, Part 3: The Creation Story of Man (69a–92c)
  7. Concluding Remarks
  8. References and Further Reading
    1. Standard Greek Text
    2. English Translations
    3. Classic Studies
    4. Classical Studies
    5. Other Studies of Related Interest

1. Authorship

Most scholars agree that Plato wrote somewhere between 30 and 40 dialogues. The precise number, however, is an open question owing to disputes over authorship. A case in point is First Alcibiades. Some scholars (such as Denyer) believe that it is authentic; others (such as Schleiermacher) do not. More commonly included among the Platonic dubia are the Cleitophon, Epinomis, Eryxias, Lovers, Minos, Second Alcibiades, and Theages (but see Altman’s “Reading Order and Authenticity: The Place of Theages and Cleitophon in Platonic Pedagogy”). While doubts surround the authorship of some dialogues, this is not so with the Timaeus, and for good reason. As evidence, we have, for starters, ancient testimony. We find, for instance, in Aristotle more references to the Timaeus than to any other dialogue. It seems unlikely that Aristotle—given his familiarity with Plato’s works, having spent nearly 20 years in the Academy—would have repeatedly attributed this work to Plato if its authorship were doubtful. The Timaeus retained its place in the Platonic corpus throughout late antiquity and the Middle Ages. It was translated into Latin by Cicero (whose translation ends at 47b) and Calcidius, and it inspired commentaries by Plutarch and Proclus. At no point in its transmission was its authorship seriously contested. Confidence in the dialogue’s authenticity remains steady today. Thus, although we may not have the autograph—the original, handwritten text by the author—we have excellent reason to include the Timaeus among, and no good reason to exclude it from, those works issuing from Plato’s hand.

2. Date of Composition

No one knows exactly when Plato wrote his dialogues or their precise order of composition. Nevertheless, it has become commonplace to group them into three compositional periods: early, middle, and late. The Protagoras, for instance, is usually included among Plato’s early dialogues and, if the doxographical tradition is to be trusted, the Laws is his last. As for the Timaeus, scholars are divided. Some (for example Cherniss) include it among the late dialogues—along with the Critias, Sophist, Statesman, Philebus, and Laws—given their stylistic affinity. Others (including Owen) call attention to the Timaeus’s philosophical content—particularly its use of paradeigmata to explain predication—as evidence of an earlier dating. It is debatable, however, whether style can reliably establish a dialogue’s age. Consider, for example, the statue Laocoön and His Sons. One may ask: Is it a Greek original, a Roman original, or a Roman copy of a Greek original? Regrettably, there is little, if any, way of telling if style is the sole measure. A feel for style, after all, is what led Winckelmann to propose erroneously a fourth-century BCE date for the statue. Or consider another example: a mature artist who completes an unfinished work from his youth. Where does it fit in his oeuvre? Because the piece was finished late, classifying it as “early” would not be entirely fitting; but because it was started early, calling it “late” also seems mistaken. This is not to say that dating artwork is impossible, but it does suggest that knowing a work’s time of production requires knowing its method of production, which, concerning the dialogues, we know precious little about.

Ordering the dialogues based on content is also problematic. Suppose that Plato writes three dialogues: one advancing an idea, a second expanding on that idea, and a third rejecting the more fully developed version of the idea. If we set these works side-by-side, there seems to be a progression of thought from an idea’s birth and development to its repudiation, which suggests that the dialogues were written in chronological order. Behind this thinking, however, lies an assumption: that the dialogues record Plato’s personal teachings as they emerged and evolved over time. This, admittedly, has some plausibility to it. Philosophers have been known to write dialogues for didactic purposes: Galileo’s Dialogue Concerning the Two Chief World Systems and Berkeley’s Three Dialogues between Hylas and Philonous come to mind. But the difference lies in the fact that Galileo and Berkeley, apart from dialogues, wrote treatises wherein they expressed their own thoughts in their own voices. With the exception of letters, several of which are likely spurious, Plato wrote dialogues exclusively, never once stepping out from behind his dramatis personæ. This makes attributing ideas to Plato considerably more difficult. If his purpose were to proselytize, then why did he not utilize a more direct form of discourse? It was, after all, not uncommon practice for ancient philosophers in their writings to speak for themselves. Heraclitus (Diogenes Laertius, IX. 5–6), Zeno of Elea (Plato, Parm. 127c–d) and Anaxagoras (Diogenes Laertius, II. 6; cf. Plato, Phd. 97b) all wrote books and, if the testimonia are any indication, wrote them in such a way that their voices were unmistakable. That Plato does not speak for himself suggests that his interests may have been less with dictation than with participation. Given their lively dramatic character, the dialogues act as powerful magnets, drawing us in, inviting us to listen to the conversation, to participate in the exchange, and to live the only kind of life that Socrates considered worth living. If this is what the dialogues are about—if they are more about the questions raised than about the answers given—then the Timaeus’s content will tell us little, if anything, about when it was written.

By no means do these considerations extinguish the heated debate over the Timaeus’s date of composition, but they do at least provide one with compelling reasons for thinking that rendering a decisive compositional chronology is a problem that will not soon be resolved.

There are many fine studies that may be consulted on the problems and possibilities of dating Plato’s dialogues. Excellent points of departure include Brandwood’s “Stylometry and Chronology” and Howland’s “Re-Reading Plato: The Problem of Platonic Chronology.”

3. Dramatis Personæ

a. Timaeus

Opinion has shifted over Timaeus’s historicity. The ancients (such as Cicero and Iamblichus) took him to be a real person, whereas scholars today tend to regard him as a literary invention. Be that as it may, his character, thanks to Plato’s creative genius, radiates an abundant life. Socrates introduces Timaeus as a foreigner from Locri, a town in southern Italy, praising him as a poet and sophist and as one who has managed high-ranking political offices and reached the summit in all areas of philosophy (20a). Timaeus’s philosophical credentials are reaffirmed by Critias, who honors him as an astronomer who investigates the nature of all things (27a). Philosopher, diplomat, and lyricist—Timaeus has the markings of a true polymath. These wide-ranging talents bring to mind the Republic’s philosopher-king who composes an autochthonous myth to explain human origins (414c–415c). The philosopher-statesman Timaeus will in similar fashion captivate his audience with a likely story about cosmic origins. But the philosopher-king in the Republic promulgates his story knowing it to be untrue. Does Timaeus do the same?

b. Socrates

In practically every Platonic dialogue, from beginning to end, Socrates is the guiding narrative force. Often, he lures others into discussion, leading them through dialectic to see where their thinking has gone astray. Whether he is asking questions or providing answers, Socrates is usually engaged in conversation. In the Timaeus, however, he is conspicuously silent. He invites his companions to give speeches of their own, but once they begin, he listens intently, never interrupting. There are other dialogues—the Phaedrus, Symposium, and Republic, to name a few—where Socrates demonstrates patience while listening to lengthy discourses. But Timaeus’s speech easily surpasses these in duration and ambition. Why does Socrates hold his tongue? One reason seems to be that the day before he entertained his guests with a speech (17b ff.) and now they are expected to return the favor (20b). Dressed for the occasion, he vows to “keep his peace and listen” (27a1) while he dines on his “feast of speeches” (27b7–8). Another reason for Socrates’s silence lies in the nature of Timaeus’s speech: it is not an account but a story (muthos), and thus not to be scrutinized but savored. Still, Socrates is not himself—at least, not what we have come to expect. What is he up to? Or maybe one should ask: What is Plato up to in depicting Socrates as he does?

c. Critias

The identity of Critias is disputed. The name brings to mind the notorious oligarch Critias: student of Socrates, uncle of Plato, and leader of the Thirty Tyrants. But the dialogue’s dramatic date suggests that we might be dealing with someone else. By most accounts, the dialogue takes place between 429 and 408 (see Dramatic Date and Setting). At that time, Critias the oligarch (b. 460) would have been an astute man between the ages of 30 and 50. The dialogue’s Critias, however, struggles to recollect basic things discussed the day before (26b) while at the same time having little difficulty recalling a story he heard “a long time ago” (26c5–6). Short-term memory loss and a robust long-term memory are qualities typically associated not with a sharp-minded, middle-aged man, but rather with someone who is elderly. Also worth noting is Critias’s remark that when he first heard the Atlantis story from his grandfather, the poems of Solon were new (21b). Even if we assume that these poems were written late in Solon’s life (d. 560), it seems strange that someone living in the mid-fifth century would think of them as new. These considerations suggest that the dialogue’s Critias is probably not the oligarch, but more likely his grandfather or someone else with the same name. If that is the case, why does Plato include a character named Critias, knowing what that name would have connoted to his Athenian audience? For 21st century speculations on Critias’s identity, see Lampert and Planeaux, “Who’s Who in Plato’s Timaeus-Critias and Why,” Welliver, Character, Plot, and Thought in Plato’s Timaeus-Critias, and Nails, The People of Plato: A Prosopography of Plato and Other Socratics.

d. Hermocrates

There is little doubt that Hermocrates is the great Syracusan diplomat and general of the Peloponnesian War, whose intelligence, courage, and experience earned him praise from Thucydides (Thuc. 6.72). At the peace conference at Gela, Hermocrates alerted the Sicilian Greeks – and at Syracuse his fellow citizens – to the imperial threat posed by Athens. Moreover, because of his effort to unify warring factions, Athens met her defeat in the Sicilian expedition. Plato’s audience would have known Hermocrates not only for his ability to induce action through speech, but also as one of the chief adversaries of imperialistic Athens. Readers should therefore note the irony when Hermocrates cajoles Critias into telling his story about ancient Athens’ effort to liberate those nations subjugated by imperialistic Atlantis (20d).

4. Dramatic Date and Setting

There is no specific mention of locale, but since Socrates receives the speeches as “guest gifts” (20c1, 27a2), it is likely that, like Timaeus and Hermocrates, Socrates is a guest at Critias’s home in Athens. As for the dialogue’s dramatic date, based on Critias’s remark that his discourse will be both payment of debt to Socrates and a tribute to the goddess on her feast-day (21a; cf. 26e), it is reasonable to think that the Timaeus takes place during one of many Athenian festivals celebrated in honor of Athena. Opinions vary as to which festival Plato has in mind. Some (such as Cornford) argue that it is either the annual (Lesser) Panathenaea or the quadrennial (Greater) Panathenaea, both of which took place in Hecatombaeon (July/August). Others (such as Taylor), who regard the events described in the Republic and Timaeus as occurring on subsequent days, argue that it must be a different Athena-centered festival (like Plynteria), since the events in the Republic take place during the Bendideia, which was celebrated in Thargelion (May/June), a full two months before the Panathenaea. Opinions regarding the year of the drama also vary: 429 (Nails), 421 (Lampert and Planeaux), 411 (Press), and 409–408 (Zuckert).

5. Narrative Structure

Apart from the characters, dramatic date, and physical setting, narrative structure is an important feature of Plato’s dialogues. Some dialogues such as the Republic, Symposium, and Phaedo are narrated. Others such as the Euthyphro, Apology, and Crito are dramatic, consisting of direct dialogue between characters without narration. Although it may seem trivial, narrative structure can have a bearing on how one reads and interprets a dialogue. Unlike a narrated dialogue, which is presented by someone who has already digested and formed an opinion about the material that he is relating to his audience, a dramatic text is free of such bias. In a narrated dialogue, however, an author can provide extra-dialogical details such as a character’s body language and tone of voice, how characters are physically oriented to one another, and a running commentary. A narrated dialogue can also have varying degrees of depth, from a narrator reporting a conversation he himself heard to a narrator reporting a conversation he heard from someone else, who might have heard it from yet another person, and so on. This depth naturally raises questions about the narrator’s credibility. This is relevant to a dialogue such as the Symposium, which Aristodemus relates to Apollodorus, who in turn relates it to his friend Glaucon, an exchange that we ourselves witness as readers. Narrative depth also plays a role in the Republic, where we find Socrates criticizing mimetic speech while practicing it as the dialogue’s narrator. As a direct drama, the Timaeus lacks narrative depth, and it is not particularly rich in dramatic discourse either. Although there are dialogical passages such as the opening exchange between Socrates and Timaeus, uninterrupted speeches account for over 70 of the work’s 75 Stephanus pages. This raises a fair number of concerns, many of which pertain to Plato’s relation to the text. One may argue that by leaving Timaeus’s speech uninterrupted, Plato is giving it his assent. Why else would he allow Socrates to fade uncharacteristically into the background? It is worth noting, however, that the narrative does not end at Timaeus 92c, but continues with Critias 106a. In other words, the Timaeus and Critias are published as separate dialogues, but they form a narrative unit. Because Plato left the Critias incomplete, we have no idea how it was to end or whether, as some have speculated, Plato planned a third dialogue, the Hermocrates, to round out a trilogy. Socrates might have remained a passive observer throughout, or he might have remained temporarily silent, waiting politely for Critias to finish his speech before launching his Socratic assault. But maybe Plato left the Critias unfinished on purpose as an exercise for his audience to pick up where he left off, to bring the dialogue to a close, and to be not just passive spectators but active participants, like the characters in the dialogues, makers of philosophical discourse. Open questions abound.

6. Outline & Analysis of the Dialogue

a. Prologue, Part 1: An Invitation to Storytelling (17a–20c)

Socrates opens the dialogue with a question, although not the kind the reader expects. Rather than asking about piety, justice, or beauty (the usual Socratic fare), he enquires into the whereabouts of a missing guest. He counts those present, “One, two, three,” only to come up short: “But where is the fourth?” Timaeus attests that the guest’s absence is not intentional: “Some illness (astheneia) befell him” (17a). Nevertheless, one wonders whether Timaeus is telling the truth. Was the guest willing but unable to come, or might Timaeus be covering for someone unwilling to face Socrates? And who might this person be? Plato never reveals his identity. Theaetetus, Clitophon, and Plato himself were candidates put forward by ancient authorities (Procl. In Ti. 16–17). But a compelling case has been made in recent years for thinking that the missing guest is Alcibiades, whose absence may have had to do less with physical illness than with moral weakness (see Lampert and Planeaux). Timaeus does not say what kept the guest away, but only that it is astheneia, which could refer to either physical or moral infirmity. If moral sickness is the reason, one cannot accuse Timaeus of dishonesty. Then again, he is not completely forthcoming either, leaving ambiguous the real reason for the guest’s absence. This is not surprising given Timaeus’s own admission that he can give nothing more than an imprecise account of anything related to the physical world (29b), a deficiency that would presumably extend to the physical or psychological condition of an absent party guest. At any rate, Timaeus, speaking for Critias and Hermocrates, tells Socrates not to worry: those who are present will repay Socrates for the gifts he so generously bestowed the day before.

Socrates’ gifts, it turns out, were speeches about the best kind of polis (17c–19a). At Timaeus’s request, Socrates rehearses his chief points. In the best polis:

1)    everyone is given one occupation suited to his or her nature;

2)    there is no gender discrimination: occupations are open to men and women alike;

3)    the artisans occupy a class separate from the guardians, who are charged with making war on behalf of the polis;

4)    the guardians, who have a spirited and philosophic nature, live communally, may not own private property, and undergo the same rigorous training regardless of gender;

5)    all marriages and child-rearing are regulated by the polis; and

6)    offspring of good citizens are reared as guardians while offspring of bad citizens are handed over to the polis to be raised presumably as artisans.

These points naturally bring to mind the Republic’s kallipolis, but what is included on the list is just as intriguing as what is excluded; for Socrates leaves untouched, among other things:

1)    the deleterious effects of poetry on the soul (Rep. 3 and 10);

2)    the search for justice and the analogy between the soul and the state (Rep. 4);

3)    the distinction between knowledge and opinion and their respective objects (Rep. 5);

4)    the metaphor of the sun, divided line, and allegory of the cave (Rep. 6 and 7);

5)    the philosopher-king and his education in mathematics and dialectic (Rep. 7); and

6)    the eventual decline and collapse of the kallipolis into tyranny (Rep. 8 and 9).

Socrates in the Timaeus paints a picture that pales in comparison to his account in the Republic, bringing to light the polis’s political foundations but disregarding the philosophical forces animating it. How strange for Socrates to assemble a political body only to leave it soulless. Equally strange is Timaeus’s reaction. Upon finishing his summary, Socrates asks whether he has omitted anything. Timaeus responds, “Not at all” (19b). Surely, this is contrary to the reader’s expectations. If, from a dramatic point of view, the Republic and Timaeus had taken place on subsequent days, Timaeus should have pointed out these omissions. What are we to make of this? One suggestion is that Timaeus forgot parts of the discussion from the day before and answers to the best of his recollection. Or maybe Timaeus knows Socrates left bits out, but responds as he does just to move the conversation along. Another possibility, however, is that Socrates has summarized his speech in its entirety, omitting nothing. In this case, the polis of the Timaeus is not the kallipolis of the Republic, but rather its likeness or approximation. This reading fits the rest of the drama quite well, especially when considering that the showpiece of the dialogue, Timaeus’s cosmology, is itself no more than a likely story, an approximation of the truth (29d).

Plato drives the wedge even deeper between the Republic and Timaeus by having Socrates express a desire to see his polis set in motion—like beautiful animals “moving and contending in some struggle” (19c). Contrast this with Socrates’ claim in the Republic that the kallipolis, regardless of whether it could actually come to be, would still serve as “a heavenly model for anyone who wants to see and found a polis within himself” (592b). Of course, this might be an inconsistency in Socrates’ thinking, but it might also be that he is up to his old ironic tricks. In the Republic, because Glaucon and Adeimantus were receptive to dialectic, Socrates’ traditional psychoanalytic approach of asking questions was enough to set their souls in motion. Timaeus and Critias, given their social status and alleged reputation for wisdom, would predictably be a bit more stubborn and less receptive to dialectic. For this reason, Socrates, like a behavioral psychologist, sets up a test with his speeches from the day before in order to draw from their responses insight into their natures. Just before he finishes, Socrates, as he so often does, flatters his companions, singling them out as the only living men capable of satisfying his desire (20b)—a transparent display of Socratic irony if there ever was one. Thus, the apparent inconsistency in Socrates’ thinking might be nothing more than a change in methodology. Socrates, as is so often the case, is toying with his interlocutors, using them as guinea pigs in his little experiment. One wonders whether it is for this reason that the fourth member of their party—if he is indeed Socrates’ spurned lover, Alcibiades, who had drunkenly embarrassed himself in the Symposium—failed to show.

b. Prologue, Part 2: The Story of Atlantis (20c–27b)

Hermocrates, promising to repay Socrates’ guest gifts, invites Critias to recount a story that he shared the day before. Critias first heard the story as a ten-year-old from his grandfather while attending the Apaturia (21b), a three-day festival associated with hereditary groups (phratries, as they were known), which gathered to celebrate their patrilineal kinship (homopatoria) and to receive new members. A legendary border dispute between Athens and Boeotia served as the festival’s etiological backdrop, an important fact omitted by Critias. The principal figures were two combatants: the Boeotian king Xanthus (“Fair One”) and the Athenian champion Melanthus (“Dark One”). The two proved to be equals until Melanthus at one point cried out to his opponent, “You are cheating, Xanthus, for there is someone behind you!” Xanthus, distracted, whirled around and Melanthus, seizing his opportunity, dealt a fatal blow. Some versions of the story allege that Melanthus tricked Xanthus; others blame Zeus Apātenōr (“Zeus Deceiver”) or Dionysus Melanaigis (“Dionysus of the Black Goatskin”) for the ruse. All versions explain the festival’s origins with a punning etymology: the Apaturia commemorates Athens’ victory by apatē (deception). Thus, “Apaturia” would have had for Plato’s audience a dual meaning, signifying both a reception and a deception. The reader should keep this in mind when listening to Critias, for his story emulates the Apaturia: it pays respect to common origins and fraternity, yet has an element of deception insofar as it presents a distorted picture of the state that Socrates wishes to be set in motion.

Before relating his story, Critias provides some background regarding its transmission. It dates back to Solon, Critias’s distant relative, who heard it from an old priest from Sais, an Egyptian city that shared with Athens the same patron goddess (Neith/Athena, 21e). This is significant; for just as the Apaturia reunited communities with blood ties, the meeting between Solon and the Egyptian priest marks the reunion of two cultures with tutelary ties. But there is another tie worth noting, namely the intellectual kinship between Critias and the priest. From his emphasis on the story’s lineage, Critias makes known his esteem for ancient records and hearsay. The Saisian priest likewise shows his preference for ancient hearsay by criticizing Solon’s speeches about antiquities. “You are young, young in soul, all of you,” the priest tells Solon. In particular, he criticizes the Greek myth of Phaethon, describing it as a story (muthos) with no basis in empirical fact (22c–d). Socrates, of course, was critical of this type of unimaginative positivism, which in the Phaedrus he refers to as a sort of “rural wisdom” (agroika sophia, 229e). The reader may wonder whether Socrates’ companions will ever fulfill his wish. He has presented them with a lifeless state and entreated that they set it in motion, yet it seems unlikely that an appeal to the lifeless science of the Egyptians will accomplish this end. This is worth bearing in mind when listening later to Timaeus’s strange, but lively, creation story (muthos).

With the background of the story out of the way, Critias gets to the story proper, which, like the Apaturia’s etiological myth, concerns an ancient dispute. The players are Atlantis and Athens—not the Athens familiar to Solon, but rather an Athens existing some 9,000 years before. Governed by excellent laws and unsurpassed in war, Athens struggled against the imperialistic Atlantis. Although the odds were stacked against her, Athens emerged victorious, liberating nations that Atlantis had enslaved along the way (25b–c). It is worth remembering that fifth-century Athens, far from being a spirited liberator, was herself an ambitious, imperialist power. These ambitions would contribute not only to Athens’ greatness, but also to her demise. Perhaps her greatest embarrassment was the disastrous Sicilian Expedition. Shortly after deploying a massive fleet of ships and hoplites, Athens recalled the expedition’s principal proponent, Alcibiades, to stand trial for sacrilege. Rather than returning to Athens, Alcibiades fled to Sparta, where he gave his best advice on how to overcome his mother city. The Athenian war effort deteriorated rapidly. Its eventual defeat was assisted by none other than Hermocrates, who, like Melanthus, achieved victory through deception (Thuc. 7.73–4). One night, he sent associates to the Athenian camp. Pretending to be traitors, they misled the Athenians into believing that Syracusans were guarding the roads and that it would be safer to withdraw during the day. Unaware of the ruse, the Athenians lingered, giving the Syracusans time to block the escape routes and disable the Athenian ships. Days later, the expedition would end in catastrophe with thousands of retreating Athenians killed and several thousand more sold into slavery or left to die in prisons.

Permeating Critias’s speech, therefore, is the theme of deception. He not only receives the story at a festival of deception, but transmits it at a time when Athens herself would fall victim to deception. One even begins to wonder whether Critias himself should be trusted. Consider his claim that Socrates’ speech the day before “was not far off the mark from agreeing with … what Solon said” (25e). This clearly overstates the truth. Not unlike Socrates’ polis, ancient Athens was a regimented society with discrete class divisions: (1) craftsmen, (2) shepherds, hunters, and farmers, (3) warriors responsible only for executing matters of war, and (4) priests (24a–b). But here the similarities end. Socrates makes no mention of priests; and although he does separate the artisans and farmers from the warriors, he does not separate the artisans from the farmers (17c). Moreover, Critias says nothing about how ancient Athens treated its women, the moral character of its warrior class, or whether marriages and child rearing were state-regulated. Socrates’ polis and ancient Athens have much less in common than Critias suggests, which prompts the question: What is the point of Critias’s story? Why does Plato include it in the dialogue? Two reasons come to mind. One is that Plato likely intends to invoke a sense of dramatic irony. Critias ironically relates a story of Atlantis’ imperial hubris at a time in history when Athens was planning the Sicilian Expedition. To do so at the behest of Hermocrates (20d) only intensifies the irony (see Dramatis Personæ: Hermocrates). Another reason is to invite his readers to ponder how the love of glory can obstruct the pursuit of truth. Critias believes that the poleis described by Solon and Socrates are the same, but because he is so enamored with the story’s lineage, he fails to notice their differences. By inflating his reputation (doxa), Critias leaves himself vulnerable to the deceptiveness of opinion (doxa).

c. Monologue, Part 1: The Creation Story of Intellect (27c–47e)

Almost as a tease, Critias confesses that he stopped short of giving a full account of ancient Athens. If Socrates cares to hear the rest of the story, he must first listen to Timaeus, who will speak on cosmic origins and the creation of man (27a). Socrates, thrilled at his forthcoming feast of speeches, invites Timaeus to begin, requesting that he first follow custom by invoking the gods. What Timaeus does instead is peculiar. To start with, he pauses to reflect on the relevance of such an invocation, remarking that at the onset of any affair, small or great, everyone, even someone of “little prudence,” calls on some god (27c). What is to be made of this? Is it merely an incidental remark or a subtle jab at Socrates and Critias, who, having made no invocation themselves, lay bare their own imprudence? At any rate, there is an underlying irony to Timaeus’s remark: he sanctimoniously chides those who make no pious gestures, while making no pious gesture himself (27d). Instead of calling upon the gods, he calls upon ordinary men who will judge his speech (29d). Is this Plato’s way of suggesting to the reader that Timaeus is himself imprudent? Or is it that Timaeus, being a sophist, believes that there are two sets of rules: those that apply to him and those that apply to everyone else? It is difficult to say what precisely Plato intends by this, but one thing is for sure: as with Critias, the reader must guard himself against Timaeus’s potential deceptiveness.

As he begins, Timaeus lays some philosophical groundwork, discussing four issues vital to his cosmology:

1)    the metaphysical distinction between Being and Becoming;

2)    the principle that whatever comes to be does so owing to some cause;

3)    the role played by a divine craftsman or demiurge in making the cosmos; and

4)    the extent to which we can be confident that an account is true given the entities with which it is concerned.

The first of these—the Being-Becoming distinction—Timaeus introduces without hesitation or defense. There are, he says, two kinds of entities: (1) genuinely existing, un-generated and indestructible, changeless entities, which are grasped by reason together with a rational account and (2) entities that do not genuinely exist, that come to be and pass away, and that individuals form opinions about based on irrational sensation (27d–28a). Readers familiar with the Republic, Phaedo, and Symposium will recognize this as the traditional “Platonic” distinction between unchanging Forms and changing objects of everyday experience. Timaeus does not explicitly identify the first class of entities as “Forms” (ideai, eidē), but he does refer to them repeatedly as “models” or “paradigms” (paradeigmata, 28a–c, 29b, 31a, 37c, 38b–c, 39e, 48e–49a), a term that Plato uses elsewhere to denote the Forms (Euthphr. 6e, Rep. 472c, and Prm. 132c–d).

Next, Timaeus turns his attention to the cosmos, which is visible and tangible and hence something belonging to the second class of entities: those that come to be and pass away. But what caused the cosmos to be? In other dialogues, the Forms function as causes (Euthphr. 5d, 6d–e; Phd. 100b–103a, 103c–105c). In the Timaeus, the principal cause of cosmic order is the demiurge, or divine craftsman (dēmiourgos). Timaeus explains that a craftsman, when making something, bases his work on a model (paradeigma, 28a). This model, he argues, must belong to the class of unchanging things because craftsmen always intend their work to be beautiful and only an unchanging model can help craftsmen achieve this end. It is no different regarding the cosmos, which the demiurge makes by looking to “that which is grasped by reason and prudence and is in a self-same condition” (29a). As he prepares to give his account of cosmic origins, Timaeus offers a word of warning: an account concerned with changing things will always be less accurate than one concerned with unchanging things. Consider the difference between a mathematical proof and a forensic analysis: the latter, while it might be compelling, is merely probabilistic; the former, as long as one has done the calculation properly, guarantees that one has grasped the truth. Timaeus’s point is that insofar as the cosmos is itself something generated, any account pertaining to it, like a forensic analysis, can only be likely, never certain. With that in mind, he concludes, “It is fitting for us to receive the likely story (ton eikota muthon) about these things and not to search further for anything beyond it” (29d).

Socrates expresses his enthusiasm for Timaeus’s proposal, encouraging him to “perform the song (ton nomon) himself” (29d). This is an interesting choice of words. In the Republic, Socrates stresses the sovereignty of musical training, emphasizing how rhythm and harmony are able to permeate the soul and perfect it (401e). Socrates, by referring to Timaeus’s forthcoming speech as a “song,” acknowledges not only its overall significance, but also the potential impact, positive or negative, on those who are listening. Also noteworthy is Socrates’ seemingly complimentary remark, “We have received your prelude so wondrously (thaumasiōs).” The word thaumasios literally means “wonderful” or “marvelous,” which by itself seems innocuous enough, but Plato does occasionally use it ironically (Phdr. 242a, Rep. 435c). Is this the case here? Is Socrates returning Timaeus’s earlier jab (27c), a final salvo before Timaeus begins his speech? It would certainly not be out of character for Socrates to let Timaeus know in a veiled manner, “Something is not right. I am on to you.” Plato, with his word choice, might be giving readers fair warning: something about Timaeus is not sitting well with Socrates, so his speech will be received gladly as a guest-gift should, but with the right amount of caution. It would be best for us too, as readers, to follow Socrates’ circumspect lead.

Resuming his speech, Timaeus gives a general account of how the demiurge began his work. “The god” (ho theos), as Timaeus calls him, being beautiful and intelligent, intended his creation to resemble him as much as possible (29e–30a). Onto pre-existing matter, therefore, which moved unmusically and without purpose, a rational order was imposed. Knowing the intelligent to be more beautiful than the unintelligent, the demiurge imbued the cosmic body with soul (30b). As his model, the demiurge used that living being (zōion, 30b) which embraces all living beings: it “holds them within itself, just as this cosmos holds and embraces us and all the other nurslings constructed as visible” (30c–d). As for whether there is one cosmos or many, Timaeus argues against the Presocratic pluralists who posit an infinite number of kosmoi (for example, Democritus, DK 67A1). For Timaeus, because the cosmos resembles its model and its model is one in number, the cosmos too must be one in number. To create the cosmic body, the demiurge drew upon four elements (fire, air, earth, and water) in equal proportion; and basing his work on a model, he created a body whole and perfect, one in number, free of old age and disease, and spherical in shape (31b, 33a–b). In addition, because the cosmic body was to be self-sufficient, in no need of anything external to it, it was made without eyes or ears, atmosphere for breathing, or organs for digestion (33c). It also lacked hands, legs, and feet; and the motion imposed on it was not directional (moving linearly in this or that direction) but rotational. The reader may rightly consider this so-called cosmic body strange, for even though it is based on a model embracing all living beings, its appearance is wholly unlike that of any other known living thing. But without any doubt, the strangeness does not end here.

Next comes the creation of the cosmic soul (34c–36d). This passage is notoriously difficult, as it contains a detailed account of how the demiurge fashions the substance of the soul from a mixture of pairs: (1) of the Indivisible and the Divisible and (2) of the Same and the Other. How can seemingly unmixable things like the indivisible and the divisible be mixed together? This is a puzzle that Timaeus leaves for his audience to work out. Just as puzzling is the subsequent passage on the movement of this soul according to certain mathematical ratios. Is the reader to take these passages seriously? Several ancient commentators including Aristotle, Xenocrates, and Plutarch did. Perhaps we should as well. But there is good reason to think that something else is intended, even though what that is might be somewhat obscure. For one thing, one might ask how Socrates, Critias, and Hermocrates could follow Timaeus’s incredibly complicated and arguably convoluted story. It is not as though Timaeus had visual aids—at least none of which the reader is aware. Would anyone listening to Timaeus not want some clarification? Would someone not think to ask a question or two? It is unusual for Socrates, of all persons, to not hit the pause button to engage his companion. This is not to say that Timaeus is being insincere or that he has something up his sleeve, but so much of the narrative up to this point has raised questions about trustworthiness. Perhaps there is something more devious, even diabolical, to Timaeus’s speech. Or perhaps this is being too cynical. There is much ground yet to cover.

Timaeus concludes the first part of his story by describing how the demiurge makes time (37c–38c), the planets (38c–39e), and lesser gods (39e–40b). The creation of man the demiurge delegates to these lesser gods (40d–47e, see Monologue, Part 3: The Creation Story of Man). Particularly strange is how Timaeus describes the divine revolutions of the soul. These motions, he says, are bound within a spheroform body named “head,” which is attached to a body in order to prevent it from rolling along on the ground (44d). This passage conjures up the nightmarish image of Empedoclean eyeballs rolling about on the ground in search of foreheads (DK 31B57). But it also should remind readers of the spherical humanoids featured in Aristophanes’ account of the origins of love in the Symposium. There is no questioning the ambitiousness and inventiveness of Timaeus’s story, yet its content, bordering on the absurd, makes it difficult to see it as something other than a parody. The suggestion that Timaeus is manufacturing a sort of parody becomes even more plausible in the second part of his story.

d. Monologue, Part 2: The Creation Story of Necessity (47e–69a)

The cosmos, as Timaeus’s story continues, emerges not only from an act of divine reason; it also arises from brute necessity. Because reason and necessity are cosmic co-creators, Timaeus transitions from the role played by a calculating demiurge to that of unthinking necessity. For starters, he returns to the distinction drawn earlier between Forms and sensible things, basing this metaphysical distinction now on an epistemological distinction between intellect (nous) and true opinion (doxa alēthēs). Because different epistemic states require different objects, intellect and true opinion must be drawn to different objects—Forms and sensible things, respectively (51d–e). But here he argues that there must be a third item added to the list. Children are not born from a single parent, but rather from two: a father and a mother. Thus, in addition to that which comes to be (offspring ~ sensible thing) and that from which it comes (father ~ Form), there must be something in which the created thing comes to be (mother ~ space [chōras], 52a). Note, however, how Timaeus presents his argument. He does not say that any of this is demonstrably true—that there is some intellectual compulsion to believe it. Instead, he says, “Here’s how I myself cast my vote (psēphon)” (51d). Voting is how a legislature promulgates a law or how a law court arrives at a legal decision; it is not the preferred method of arriving at scientific truth. Timaeus draws a distinction between reason and true opinion—the former immovable by persuasion, the latter alterable by persuasion—yet given that he casts a vote, how can his account be anything other than mere opinion and hence something that he, or anyone else, might be persuaded against?

What follows is yet another complicated passage—a lengthy one concerning the constitution of matter (53c–55c). This is his famous discourse on triangles. As has already been established, there are four kinds of matter—fire, air, earth, and water—and each contains particles, which, we are now told, are geometrical solids (acting as molecules) composed of elementary right triangles (acting as atoms), either isosceles or scalene. In effect, all bodies can be reduced to collections of triangles. There is little need here to spell out Timaeus’s theory in detail. Other scholars have already performed that formidable task (see, for example, Cornford, Plato’s Cosmology and Kalkavage, Plato’s Timaeus, Appendix C). Suffice it to say, what Timaeus renders, if anything, is an aesthetically pleasant picture of the physical world with all of its beautiful geometrical shapes: tetrahedrons (fire), octahedrons (air), icosahedrons (water), hexahedrons (earth), and dodecahedrons (panels for the constellations). His picture is also a lyrical one; for in a brief digression on the subject of whether the cosmos is one or many, Timaeus says, “In reasoning about all these things, someone would do so musically (emmelōs)” (55c). It is as if he considers himself to be engaged in a dramatic contest of sorts. Who can render the most compelling, the most rhapsodic, the most beautiful song? Whose story (muthos) will collect the most votes from the judges? Timaeus has already let his audience know for whom his voting stone (psēphos), if he were given one, would be cast. Others, he confesses, may cast theirs for another (55d).

As one might expect from a highly refined literary work like Timaeus’s, word play abounds. Consider, for instance, the passage where Timaeus pauses to reflect on his storytelling (59c–d). He says at first that it would not be difficult to see where his story is headed—not for anyone “who pursues the look of likely stories.” Whenever such a person puts aside or buries (katathemenos) definite accounts of the unchanging and turns to likely accounts of transient things, he permits himself a pleasure (hēdonē) and time to engage in a sort of childish play (paidian). Thus, he invites his audience to “give free rein to this very play … [and] proceed to the likelihoods next in order.” The very next sentence engages in this sort of play: water mixed with fire, which Timaeus calls “fluid” (hugron), comes from “flowing over the earth” (huper gen reon, 59d). In this way, he presents the physical derivation in terms of a linguistic one. Clearly, this is Timaeus having a bit of fun, but it is consistent with his entire speech up to this point. Not long after, he claims that fire (thermon) is hot because it minces (kermon, 62a)—yet another play on words. Timaeus’s comedic storytelling continues with his account of eating, which he describes obscurely in terms of voiding (kenōseis) and refilling (plērōsis). His subsequent accounts of tastes (64a–65b), smells (66d–67a), sounds (67a–c), and colors (67c–68d) are just as impish, often stated matter-of-factly without explanation regardless of how absurd they might sound to his audience. Is the reader to take these accounts seriously, or are they in fact sly Platonic parodies of the detailed, yet cold and soulless accounts popular among scientists of his day? Timaeus’s most baffling claim perhaps—at least, among his more baffling claims—is in his discussion of color mixing. Red and black, he posits, when mixed together yield green (68d). This simply does not make sense—not from a scientific point of view, at any rate. How can this be taken seriously? What precisely is Timaeus doing?

Drawing his Story of Necessity to a close, Timaeus prepares to deliver his conclusion. In doing so, he adopts a more serious tone, but there is still something strange about what he says. He begins with a metaphor: “Now that the kinds of causes have been sifted out and lie ready to hand for us, like wood for builders, out of which we must weave together the account that remains…” (69a). A reasonable response to this would be a quizzical one. Flour and grains are sifted, not wood, and Timaeus clearly knows this given what he says earlier in his story about the movement and ordering of the elements in space: “just like the particles shaken and winnowed out by sieves and other instruments used for purifying grain: the dense and heavy are swept to one site and settle, the porous and light to another” (52e). What also could he mean by weaving wood together? Such strange metaphors he uses. Timaeus continues: “Let us go back again briefly to the beginning and make our way swiftly to the same point from which we arrived there; and let us attempt to add to our story a finish and a head that’s joined to what has gone before” (69a–b). He has already returned—more literally, retreated or withdrawn (anachōrēteon, 48b)—to the beginning once, namely when he transitioned from his Story of Intellect to his Story of Necessity. Now he is proposing a second return. What is the meaning behind these reversals? Earlier in the dialogue, Timaeus refers to the retrograde motions of outer planets (36d). Are his literary reversals a way of mimicking the reversals found in nature? Storytelling, for Plato, being an art form, is a form of mimesis, after all. Timaeus is creating a story about cosmic creation. Thus, what holds for the cosmos in its coming to be should also hold for Timaeus’s cosmo-muthos, and vice versa.

e. Monologue, Part 3: The Creation Story of Man (69a–92c)

Socrates’ feast of speeches—at least the first course, served by Timaeus—is nearing its end. With the cosmos in place, Timaeus draws his story to a close by rhapsodizing on the creation of man. Earlier in his speech, he indicated that the demiurge fashioned the cosmos only up to a point; the creation of other life forms he left to other, lesser gods. Speaking to these gods, the demiurge explains that were he to create animals and plants, they would turn out as perfect as the gods themselves (41c). In that case, he would not be able to achieve his goal of realizing all levels and kinds of perfection. At first glance, this seems strange. How can a being capable of making something as enormous as the cosmos be unable to populate it with meager animals and plants? The problem, however, is not as it seems. It is not that the demiurge lacks power. On the contrary, being perfect, the demiurge always performs optimally; his actions yield only what is best. It would therefore be contrary to his nature—and, one may say, offensive to his impeccable aesthetic tastes—to make something that did not in the closest possible way resemble him in greatness and beauty. It would be tantamount to a masterful artist lowering himself to produce an amateurish painting. Creating a masterpiece is within his power; not doing so would be a shameful underachievement, unbefitting of his greatness. For this reason, the demiurge, exercising sovereignty over all things, delegates to lesser gods, who were themselves created (41b), the responsibility of creating animals and plants. Given their relative imperfection, they could do what the demiurge could not—namely, create less perfect life forms—and not disgrace themselves by doing so. By allowing them to complete what he started, the demiurge could fully realize his plan to ensure that the all (to pan) is genuinely all (hapan), full and complete, a plenitude lacking nothing (41c).

How, then, does man come to be? What is his origin and nature? Timaeus gives not one, but two accounts. The accounts do not conflict, but they do differ in length, detail, and artistry. The first appears at the end of the Story of Intellect (40d–47e). It begins with the demiurge creating the human soul from a mixture—not the same mixture from which the cosmic soul came to be, but a different one using the same ingredients blended less purely. Having combined these ingredients, the demiurge distributes bits of the mixture to the various stars (for example, lesser gods) like a farmer sowing seeds (42d). At this point, the lesser gods take over and begin fashioning the human body. Borrowing portions of the elements, the gods “went about gluing them together … with close-packed rivets invisible for their smallness” (42e–43a). It is worth noting here the difference in approach. Whereas the demiurge sows seeds, the lesser gods insert rivets. Whereas the demiurge creates with an eye to balance and beauty, the lesser gods simply get the job done by pasting things together. The demiurge uses art and agriculture to create; the gods appear to be workers on an assembly line. Note, too, how Timaeus’s account of vision reflects this same workmanlike approach. It tells how vision comes about, but not why it does beyond that it allows us to keep our soul in tune with the movements of the heavens (45b–46a, 47b). The creation of man, as related in the Story of Intellect, is much akin to the work of the lesser gods: it is practical, but not particularly meaningful. It relates the origin of man, but not his nature. In fact, it seems to serve as little more than a bridge between the Story of Intellect and the Story of Necessity. A far more robust account of man’s creation follows the Story of Necessity.

As with the account of man’s creation in the Story of Intellect, the Story of Man begins with the soul. From the start, however, the account is decidedly more lively and detailed. Timaeus opens by telling how the gods, still entrusted with the task of making man, sculpted a body around the immortal soul and housed within that body another kind of soul governed by “terrible and necessary affections” (deina kai anagkaia pathēmata): pleasure and pain, rashness and fear, anger and hope (69d). Notice that they do not rivet or glue the soul to the body, but sculpt the body around the soul. Already these lesser gods approach their job with more grace and flair than they did before. Moreover, they have enough sense to separate the two souls so that the mortal soul will not contaminate the immortal soul. They do this by placing the mortal soul within the chest and separating the chest from the head with an isthmus (namely, the neck) within which the spirited part of the soul resides. Next to be made is the heart, which communicates to other organs when an unjust deed has been committed, and after that the lungs, which cool the heart, enabling it to be more subservient to reason. The lungs, however, are not merely placed within man; they are implanted (enephuteusan, 70c). It is remarkable how the image of the gods from one story to another has changed. The Story of Intellect underplays their artistry, emphasizing instead their efficacy. The Story of Man, by contrast, draws them closer in nature and purpose to the demiurge: they are not only sculptor-like in how they shape the body around the soul, but also farmer-like in how they plant organs within the body. Also worth noting is the image of man himself. He is no longer an inert machine assembled with glue and rivets, but a dynamic organism animated by passions and emotions.

Continuing his story, Timaeus shifts his attention to the liver—one of his seemingly favorite organs. Reason and emotion have been physically separated—reason being located in the head and the appetites in the abdomen. Because the appetites cannot understand reason—evidently because they speak different languages—there was a need for a mediator to convey messages from the higher faculties to the lower (71a). This becomes the purpose of the liver. From the intellect, the liver receives images, which it projects onto its surface to frighten and restrain the appetites (72b). This is reminiscent of the Republic’s cave allegory, where prisoners passively watch as images flicker across a cave wall. Just as those images keep the prisoners pacified, the images projected by the liver keep the appetites at bay. Apart from helping the intellect control emotion, the liver also serves as the source of divination (71e, 72b). No one in his right mind, Timaeus says, has access to divine reason; yet when asleep or overcome by some inspiration, man receives divine messages, which must be reflected upon and interpreted. At no point does Timaeus condemn divination as a form of superstition or subterfuge, treating it instead with respect and sincerity—at least for those who think deeply and reason slowly about their divinations. But why does Timaeus treat divination with respect at all? Why not, like the Saisian priest, explain divination away naturalistically as either hallucination or huckstery? Perhaps the reason is that for Timaeus—and by extension Plato—man is not simply a patchwork of physical parts: cells, tissues, organs, and systems. Man is not a god, but he does have divine origins, a divine element within himself, and the ability to be divinely inspired. Whereas the Saisian priest ridiculed the idea of myth shedding light on nature, Timaeus wants to preserve the link between the natural and supernatural. Far from setting man and the gods apart, Timaeus’s cosmo-muthos brings them closer together.

Much care and skill are brought to bear on the gods’ creative efforts. Consider how they make bone and flesh. It begins with marrow, which “gave the mortal kind its roots” (katerrixoun, 73b). Marrow, Timaeus tells his companions, comes from a universal seed-stuff (epephēmisen) composed of smooth and unwarped triangles. Planted within the marrow are various kinds of souls (73c). The gods then take some of this marrow and form it into a spherical shape. This spherical field (aroura) receives the divine seed (to theion sperma) and becomes the brain (73d). Bone likewise comes to be through an intricate, creative process (73e). First, earth is sifted by the gods to be pure and smooth. Next, it is kneaded and soaked with marrow, baked in fire, dipped in water, placed back into the fire, and dipped once again in water. This is an utterly fantastic account—one that emphasizes the intelligence and imagination behind the creation of man and of his seemingly most mundane parts. Bone is created as if by a potter and flesh as if by a wax-modeler (74c). Man is not the product of necessity or blind chance; he arises from a series of deliberate actions. The gods—whether the demiurge or lesser gods—emerge as agents who care not only to complete their work, but to introduce into their products a sense of style and value. They care about what they make. Their artistic fingerprints can be found on everything from the cosmic soul to human skin, hair, and nails. But they also know how best to manipulate materials. Owing to their different shapes, the elements behave differently. Thus, by combining an artistic sensibility with a knowledge of how to engineer things given the materials at their disposal, the gods are able to imbue the cosmos and man with beauty, structural stability, and purposefulness.

To be sure, man is well-formed and beautiful, but he is also able to grow and flourish. In fact, he appears in Timaeus’s story as plant-like. In several passages, the gods act as farmers, planting organs within the human body. Marrow literally allows man, like a tree, to take root (73b). In addition, the gods equip man with an irrigation system: “[the gods] channelled through our body itself, just as they were cutting channels in gardens, so that the body might be refreshed as though from an inflowing stream” (77c–d). These channels help the “stream of nourishment” flow so that the “irrigation may be made uniform.” The gods use their agricultural knowledge frequently when creating man, who, Timaeus says, is “not an earthly but a heavenly plant” (phuton ouk eggeion alla ouranion, 90a). This naturally prompts the question: Why is so close a connection drawn between man and plants? One may think it is a joke. Anyone can tell the difference between humans and plants. Maybe Timaeus wanted to get a rise from his companions. But if it is a joke, it is not a very good one; for as most would agree, there are few things less amusing than an oft-repeated joke. A possible clue may be found just before the discussion of man’s irrigation system, where Timaeus tells a brief story about the origins of plants (76e–77c). After making man, the gods decide to create plants, unlike animals in appearance, yet having sensations and “a nature akin to man.” Plants are indeed animals, he notes, because “everything that partakes of living may justly and most correctly be called an animal” (77a–b). Since plants are intended to be eaten (77c), this cannot be a solemn plea for vegetarianism. Even so, it is certainly possible that Timaeus’s message is ecological. Sometimes, because of our egocentric and homocentric concerns, we lose track of our place in the world. It would be an overstatement to suggest that Timaeus is advancing an environmentalist ethics; but his holistic view of creation, if taken seriously, does raise important questions (1) about the relationship between creator and created, that is, between planter and planted (87b), (2) about the kinship between man and nature, and (3) about how the good for man relates to the good for other living things. Rarely has Plato been considered a philosopher with ecological concerns, but perhaps a careful study of the Timaeus would help to change this opinion.

Although man is capable of flourishing, he is also subject to collapse and decay. For this reason, Timaeus pivots to the origin and nature of diseases. He uses a strange word to describe their origin, saying that they are “constructed” or “contrived” (sunistēmi, 81e). Is the implication that the gods have a hand in creating diseases? Was this to ensure the imperfection of man? Timaeus, unfortunately, leaves these questions unanswered. Instead, he launches into a discussion of diseases that specifically afflict the body. These arise from an excess, deficiency, or misplacement of elements (82a–b). In other words, there is a physical imbalance, causing a body to become unmusical (plēmmelēsēi) and out of harmony with the cosmos (82b). After bodily diseases, Timaeus turns to diseases of the soul, giving special attention to folly, which comes in two varieties: stupidity and madness. Stupidity, he says, is the greatest disease, arising when a body becomes too large and the intellect too weak (88a–b). When this happens, bodily motions gain mastery, causing the soul to become dull, slow, and forgetful. But does this mean that only large people are stupid? Are small folk immune to this disease? It seems an obvious question to ask, but Timaeus does not consider it. Moving along, Timaeus traces madness to overly seeded marrow (86c–d), not to wickedness, as some do: “people hold the opinion that he’s not diseased but willingly bad. But the truth is…” (86d). Timaeus’s claim here is both Socratic and un-Socratic. It is Socratic in that it affirms that no one performs wrongful acts willingly; however, it is un-Socratic in that rather than attributing wrongdoing to ignorance, as Socrates does, Timaeus roots it in bodily disease. In fact, physical causes are responsible for madness and stupidity. As for treating disease, Timaeus prescribes physical exercise and an avoidance of medicinal drugs (89a–c). Idleness and inactivity will only make matters worse; the body must mimic the cosmos and stay in motion. In addition, a person should tune the motions in his soul by applying himself to the liberal arts and all philosophy (88c). Striking the right balance and keeping one’s mind on higher things—these, according to Timaeus, are the ingredients to a good life and healthy soul. At this point, one might wonder: Would the silent Socrates be nodding in agreement or shaking in disagreement? Perhaps he would be doing both.

This would seem like a good place for Timaeus to stop. Critias promised Socrates that Timaeus’s story would explain the origins and nature of mankind. Timaeus has clearly gone above and beyond. Not only has he explained the origins and nature of man, but he has also given his companions lessons in human anatomy and physiology and human pathology. But Timaeus is not finished yet. There is one topic left to cover: the invention of sex. One might assume that man and woman have the same origin, but that is not so. The gods created man beautiful and well-ordered, but man is prone to physical and moral decay. He is also destined to die and, as it turns out, be reincarnated. Timaeus does not say what happens to courageous and just men, but as for the cowardly and unjust, they return not as men, but as women (90e). Women are therefore derivative, being born from morally deficient men. It is hard to know what to make of this, especially in light of the fact that in Socrates’ polis—the one that Timaeus is helping to animate—men and women are social equals. The subsequent account of intercourse is likewise peculiar. The channel for releasing urine, Timaeus says, also releases marrow (that is, seed) from the brain (91b). This marrow, being imbued with soul, gives the male reproductive organ a desire for emission. As a result, male genitals have become unpersuadable and autocratic, like an irrational animal (91b–c). In other words, the love to procreate is a form of madness in which a man loses his mind—or more literally, his brain. But is this the extent of man’s erotic feelings? Eros compels physical love, but can it compel philosophy? Can it compel movement within a polis? Timaeus rounds out his speech by explaining the origins of other animals (91d–92c). Like women, they derive from deficient men: birds from light-minded men who rely overabundantly on sight in their scientific demonstrations, land animals from unphilosophical men, and fish from the stupidest men. With the cosmos now made and fully populated, Timaeus delivers his closing line, which dispenses with the puns, jokes, and tomfoolery: “having been filled up, this cosmos has come to be—a visible animal embracing visible animals, a likeness of the intelligible, a sensed god; greatest and best, most beautiful and most perfect—this one heaven being alone of its kind” (92c). So ends Timaeus’s story, but not Socrates’ feast of speeches, which continues in the Critias.

7. Concluding Remarks

What is a contemporary audience to make of Plato’s Timaeus? Is it a serious attempt at natural philosophy or a farcical parody? There is no way of settling definitively Plato’s intent in writing the dialogue. In fact, that seems to be the nature of Plato’s art in general: he always wants his readers to keep wondering, to keep asking questions, to keep returning to his texts with the hope of discovering something new and exciting. Timaeus’s speech is very much a microcosm of the world around us with its multiple layers, interlocking pieces, and abrupt movements back and forth. But it also fits in well with a recurring theme in Plato’s dialogues. Time and again, we find in the dialogues a critique of science—not just any science, but the sort of science that tends to view nature in materialistic and mechanistic terms. In the Phaedrus, for instance, Socrates describes naturalized myths as displaying “rural wisdom” (229e). Scientific naturalism is also taken to task in the Phaedo, where Socrates recounts his youthful flirtations with the natural philosophy of Anaxagoras. Reading Anaxagoras, Socrates delighted in learning details about the shape and location of the earth and various facts about the sun, the moon, and the other heavenly bodies. Socrates also shares his delight when he learned from Anaxagoras that Mind (Nous) is the ordering principle of the world. But in the end, he was left disappointed: “I thought he’d go on to take me through the best for each and the good (agathon) common to all” (98b). Anaxagoras, therefore, impressed Socrates with his explanation of how things work, but he failed to explain why they are the way that they are and why they are good. In other words, Anaxagoras’s cosmos, in Socrates’ opinion, was without value: it moved, but without any rhyme or reason. Mind controlled everything, moving things about, but why did it move things in this way instead of that? Why did it move things at all?

If there is any message to be gleaned from the Timaeus, it is that material explanations alone cannot render a clear and complete picture of the world. Recall that the dialogue begins with Socrates recounting his speech, the day before, about the best polis. Not long after, he invites his companions to set this polis in motion—like beautiful animals “moving and contending in some struggle.” By the end of the dialogue, readers are left wondering: Has Timaeus really granted Socrates’ wish? Is Socrates’ polis any more alive at the end of the dialogue than it was at the beginning? Before rushing to judgment, one must remember that the Timaeus leads directly into the fragmentary Critias, which in turn may lead into a third dialogue, the Hermocrates. It would therefore be rash to answer these questions without taking the entire, unbroken narrative into account. Even so, there does seem to be an effort by Timaeus to create a living cosmos—one with more life and vitality than what the youthful Socrates found in Anaxagoras. It is true that Timaeus’s story at times radiates absurdity and silliness. At the same time, however, it is a lively artistic creation, much like the very cosmos his story depicts. There is craftsmanship behind its creation. Moreover, the cosmos is itself a living thing with a soul. The gods, as heavenly bodies, move in accord with reason, as does the entire cosmos itself. Beauty, goodness, and life permeate the universe. This might not be a direct response to Socrates’ invitation to set his polis in motion; but it is a step in the right direction and a powerful reminder from Plato that a philosophy that sacrifices the beautiful and the good—that sacrifices spirited muthos on the cold, antiseptic altar of logos—sacrifices a vital part of reality.

8. References and Further Reading

All translations of the Timaeus in this article are by Kalkavage, Plato’s Timaeus, Newburyport, MA: Focus Publishing, 2001.

a. Standard Greek Text

  • Burnet, John. “Clitopho,” “Respublica,” “Timaeus,” “Critias.” Platonis Opera, Vol. IV. Oxford: Clarendon Press, 1902.
    • Critical edition of the ancient Greek text. Essential for scholarly work.

b. English Translations

  • Bury, R. G. Plato: Timaeus, Critias, Cleitophon, Menexenus, Epistles, Cambridge, Mass.: Loeb Classical Library, 1960.
    • Interlinear Greek-English translation. Includes an introduction and notes. Essential for scholarly work.
  • Cornford, F. M. Plato’s Cosmology, London: Routledge & Kegan Paul, 1937. Reprinted, Indianapolis: Hackett Publishing Co., 1997.
    • Good translation, although perhaps a bit dated. Includes a detailed commentary and notes.
  • Jowett, Benjamin. “Timaeus,” The Collected Dialogues of Plato: Including the Letters. Eds. Edith Hamilton & Huntington Cairns, Princeton, NJ: Princeton University Press, 1961.
    • Dated translation, but still useful. Part of an anthology of Plato’s complete works. Readily available online.
  • Kalkavage, Peter. Plato’s Timaeus. Newburyport, MA: Focus Publishing, 2001.
    • Superb translation. Includes an interpretative essay, glossary, and several appendices.
  • Lee, Desmond. Timaeus and Critias. Revised by Thomas Kjeller Johansen, 2008, London: Penguin Books, 1972.
    • Good translation. Includes a lengthy introduction and notes.
  • Waterfield, Robin. Timaeus and Critias. Oxford: Oxford University Press, 2008.
    • Good translation. Includes a lengthy introduction, summary, and explanatory notes by Andrew Gregory.
  • Zeyl, Donald J. “Timaeus,” Plato: Complete Works. Ed. John M. Cooper, Indianapolis: Hackett Publishing Company, 1997.
    • Good translation. Part of an anthology of Plato’s complete works with concise introductions and notes.

c. Classic Studies

  • Cherniss, H. F. “The Relation of the Timaeus to Plato’s Later Dialogues.” The American Journal of Philology, Vol. 78, No. 3 (1957): 225–266. Reprinted in Studies in Plato’s Metaphysics. Ed. R. E. Allen, London and New York: Routledge and Kegan Paul, 1965. Also in Selected Papers. Ed. Leonardo Tarán, Leiden: Brill, 1977.
    • Makes the case for placing the Timaeus in Plato’s late compositional period.
  • Owen, G. E. L. “The Place of the Timaeus in Plato’s Dialogues.” The Classical Quarterly NS 3 (1953): 79–95. Reprinted in Studies in Plato’s Metaphysics. Ed. R. E. Allen, London and New York: Routledge and Kegan Paul, 1965. Also in Logic, Science and Dialectic. Ed. Martha Nussbaum. Ithaca: Cornell University Press, 1986.
    • Makes the case for placing the Timaeus in Plato’s middle compositional period.
  • Taylor, A. E. A Commentary on Plato’s Timaeus. Oxford: Clarendon Press, 1928. Reprinted, New York: Garland, 1967.
    • Lengthy and ambitious commentary on the dialogue. An early effort to challenge the view that the dialogue presents Plato’s own thoughts on cosmology.
  • Vlastos, Gregory. “The Disorderly Motion in the Timaeus.” Studies in Plato’s Metaphysics. Ed. R. E. Allen, London and New York: Routledge and Kegan Paul, 1965. Reprinted in Studies in Greek Philosophy, Vol. 2, ed. D. W. Graham, Princeton: Princeton University Press, 1995.
    • Considers the nature of the disorderly motion discussed at Ti. 30a, 52d–53b, 69b.
  • Vlastos, Gregory. “Creation in the Timaeus: Is It a Fiction?” Studies in Plato’s Metaphysics. Ed. R. E. Allen, London and New York: Routledge and Kegan Paul, 1965b. Reprinted in Studies in Greek Philosophy, Vol. 2, ed. D. W. Graham, Princeton: Princeton University Press, 1995.
    • Follow-up to Vlastos’s essay “The Disorderly Motion in the Timaeus” in the light of Cherniss’s work.
  • Vlastos, Gregory. Plato’s Universe. Seattle: University of Washington Press, 1975. Reprinted by Luc Brisson, Las Vegas: Parmenides Publishing, 2005.
    • Offers an interpretation of the Timaeus’s cosmology within the broader context of Presocratic natural philosophy. Attributes the cosmological ideas in the dialogue to Plato himself.

d. Classical Studies

The following ancient commentaries are mainly of historical and scholarly interest. Newcomers to the dialogue will likely want to consult more recent scholarship.

  • Calcidius. On Plato’s Timaeus. Ed. and trans. John Magee, Cambridge: Harvard University Press, 2016.
  • Plutarch. “On the Generation of the Soul in the Timaeus.” Plutarch’s Moralia, Vol. 13, Pt. 1. Ed. and trans. Harold Cherniss, Cambridge: Harvard University Press, 1976.
  • Proclus. “Commentary on Plato’s Timaeus.” Vol. 1, Book 1: Proclus on the Socratic State and Atlantis. Ed. and trans. Harold Tarrant, Cambridge: Cambridge University Press, 2007.
  • Proclus. “Commentary on Plato’s Timaeus.” Vol. 2, Book 2: Proclus on the Causes of the Cosmos and its Creation. Eds. and trans. David T. Runia & Michael Share, Cambridge: Cambridge University Press, 2009.
  • Proclus. “Commentary on Plato’s Timaeus.” Vol. 3, Book 3, Part 1, Proclus on the World Soul. Ed. and trans. Dirk Baltzly, Cambridge: Cambridge University Press, 2010.
  • Proclus. “Commentary on Plato’s Timaeus.” Vol. 4, Book 3, Part 2, Proclus on the World Soul. Ed. and trans. Dirk Baltzly, Cambridge: Cambridge University Press, 2010.
  • Proclus. “Commentary on Plato’s Timaeus.” Vol. 5, Book 4: Proclus on Time and the Stars. Ed. and trans. Dirk Baltzly, Cambridge: Cambridge University Press, 2016.
  • Proclus. “Commentary on Plato’s Timaeus.” Vol. 6, Book 5: Proclus on the Gods of Generation and the Creation of Humans. Ed. and trans. Harold Tarrant, Cambridge: Cambridge University Press, 2017.

e. Other Studies of Related Interest

  • Altman, William H. F. “Reading Order and Authenticity: The Place of Theages and Cleitophon in Platonic Pedagogy.” The Electronic Journal of the International Plato Society, n 11, 2011: 1–50.
    • Argues against the orthodoxy for the authenticity of Theages and Cleitophon. Relevant to the study of Platonic authorship.
  • Arieti, James A. Interpreting Plato: The Dialogues as Drama. Savage, MD: Rowman & Littlefield Publishers, Inc., 1991.
    • Offers a literary reading of eighteen dialogues, including the Timaeus. Argues that the dialogues ought to be approached principally as dramas, not philosophical discourses.
  • Brandwood, Leonard. “Stylometry and Chronology.” The Cambridge Companion to Plato. Ed. Richard Kraut, Cambridge: Cambridge University Press, 1992.
    • Recounts attempts made by various scholars to date Plato’s dialogues using stylometry.
  • Howland, Jacob. “Re-Reading Plato: The Problem of Platonic Chronology.” Phoenix, Vol. 45, No. 3 (Autumn 1991): 189–214.
    • Offers a powerful and compelling case against efforts by some scholars to arrange Plato’s dialogues chronologically.
  • Ives, Charles. Socrates’ Request and the Educational Narrative of the Timaeus. Lanham, MD: Lexington Books, 2017.
    • Draws attention to the connection between Timaeus’s cosmology and Socrates’ request for a speech about war.
  • Lampert, Lawrence and Planeaux, Christopher. “Who’s Who in Plato’s Timaeus-Critias and Why.” The Review of Metaphysics, Vol. 52, No. 1 (September 1998): 87–125.
    • Important examination into the characters and historical-political background of the Timaeus and Critias.
  • Lovejoy, Arthur O. The Great Chain of Being: A Study of the History of an Idea. Cambridge, MA: Harvard University Press, 1936.
    • Classic work in the history of ideas tracing the origin and evolution of three philosophical principles: plenitude, continuity, and gradation.
  • Mohr, Richard. One Book, The Whole Universe: Plato’s Timaeus Today. Las Vegas/Zurich/Athens: Parmenides Publishing, 2010.
    • Ambitious anthology covering a broad range of topics related to the Timaeus. Derived from the Timaeus Conference at University of Illinois at Urbana–Champaign, September 2007.
  • Nails, Debra. The People of Plato: A Prosopography of Plato and Other Socratics. Indianapolis: Hackett Publishing Company, 2002.
    • Meticulous study of the individuals represented in Plato’s dialogues and their relationships to one another. Essential for scholarly work.
  • Parke, Herbert W. Festivals of the Athenians. Ithaca, NY: Cornell University Press, 1977.
    • Valuable study of the religious festivals of ancient Athens. Contains a thorough discussion of the Apaturia.
  • Press, Gerald A. Plato: A Guide for the Perplexed. London & New York: Continuum, 2007.
    • Very nice introduction to Plato’s art and thought urging that the dialogues be read as both philosophical and dramatic works.
  • Rutherford, R. B. The Art of Plato. Cambridge: Harvard University Press, 1995.
    • Offers a literary interpretation of the dialogues, including the Timaeus, focusing on their formal structure, language, character development, and imagery.
  • Sallis, John. Chorology: On Beginning in Plato’s Timaeus. Bloomington: Indiana University Press, 1999.
    • Interesting study focusing on the dialogue’s strange and mysterious “space” (chōra).
  • Welliver, Warman. Character, Plot, and Thought in Plato’s Timaeus-Critias. Leiden: E. J. Brill, 1977.
    • Close and careful examination of the Timaeus’s characters and their underlying political antagonisms.
  • Westra, Laura and Robinson, Thomas M. The Greeks and the Environment. Lanham, MD: Rowman & Littlefield Publishers, 1997.
    • Anthology devoted to the examination of early Greek thinking on nature and ecology. Several chapters are devoted to Plato.
  • Zuckert, Catherine. Plato’s Philosophers: The Coherence of the Dialogues. Chicago: University of Chicago Press, 2012.
    • Bold and ambitious interpretation of Plato’s dialogues based on their dramatic order.

 

Author Information

Frank Grabowski
Email: fgrabowski@rsu.edu
Rogers State University
U. S. A.

The New Atheists

The New Atheists are authors of early twenty-first century books promoting atheism. These authors include Sam Harris, Richard Dawkins, Daniel Dennett, and Christopher Hitchens. The “New Atheist” label for these critics of religion and religious belief emerged out of journalistic commentary on the contents and impacts of their books. A standard observation is that New Atheist authors exhibit an unusually high level of confidence in their views. Reviewers have noted that these authors tend to be motivated by a sense of moral concern and even outrage about the effects of religious beliefs on the global scene. It is difficult to identify anything philosophically unprecedented in their positions and arguments, but the New Atheists have provoked considerable controversy with their body of work.

In spite of their different approaches and occupations (only Dennett is a professional philosopher), the New Atheists tend to share a general set of assumptions and viewpoints. These positions constitute the background theoretical framework that is known as the New Atheism. The framework has a metaphysical component, an epistemological component, and an ethical component. Regarding the metaphysical component, the New Atheist authors share the central belief that there is no supernatural or divine reality of any kind. The epistemological component is their common claim that religious belief is irrational. The ethical component is the assumption that there is a universal and objective secular moral standard. This ethical component sets them apart from other prominent historical atheists such as Nietzsche and Sartre, and it plays a pivotal role in their arguments because it is used to conclude that religion is bad in various ways, although Dennett is more reserved than the other three.

The New Atheists make substantial use of the natural sciences both in their criticisms of theistic belief and in their proposed explanations of its origin and evolution. They also draw on science for recommended alternatives to religion. They believe empirical science is the only (or at least the best) basis for genuine knowledge of the world, and they insist that a belief can be epistemically justified only if it is based on adequate evidence. Their conclusion is that science fails to show that there is a God and even supports the claim that such a being probably does not exist. What science will show about religious belief, they claim, is that this belief can be explained as a product of biological evolution. Moreover, they think that it is possible to live a satisfying non-religious life on the basis of secular morals and scientific discoveries.

Table of Contents

  1. Faith and Reason
  2. Arguments for and against God’s Existence
  3. Evolution and Religious Belief
  4. The Moral Evaluation of Religion
  5. Secular Morality
  6. Alleged Divine Revelations
  7. Secular Fulfillment
  8. Criticism of the New Atheists
  9. References and Further Reading
    1. Works by the New Atheists
    2. Works About the New Atheism

1. Faith and Reason

Though it is difficult to find a careful and precise definition of “faith” in the writings of the New Atheists, it is possible to glean a general characterization of this cognitive attitude from various things they say about it. In The Selfish Gene, Richard Dawkins states that faith is blind trust without evidence and even against the evidence. He follows up in The God Delusion with the claim that faith is an evil because it does not require justification and does not tolerate argument. Whereas the former categorization suggests that Dawkins thinks that faith is necessarily non-rational or even irrational, the latter description seems to imply that faith is merely contingently at odds with rationality. Harris’s articulation of the nature of faith is closer to Dawkins’ earlier view. He says that religious faith is unjustified belief in matters of ultimate concern. According to Harris, faith is the permission religious people give one another to believe things strongly without evidence. Hitchens says that religious faith is ultimately grounded in wishful thinking. For his part, Dennett implies that belief in God cannot be reasonable because the concept of God is too radically indeterminate for the sentence “God exists” to express a genuine proposition. Given this, Dennett questions whether any of the people who claim to believe in God actually do believe God exists. He thinks it more likely that they merely profess belief in God or “believe in belief” in God (they believe belief in God is or would be a good thing). According to this view, there can be no theistic belief that is also reasonable or rational.

The New Atheists appeal to the deliverances of the empirical sciences as their criterion for rational religious belief. Harris and Dawkins are quite explicit about this. Harris equates a genuinely rational approach to spiritual and ethical questions with a scientific approach to these sorts of questions. Dawkins insists that the presence or absence of a creative super-intelligence is a scientific question. The New Atheists also affirm evidentialism, the claim that a belief can be epistemically justified only if it is based on adequate evidence. The combination of their reliance on science and their adherence to evidentialism entails that a religious belief can be justified only if it is based on adequate scientific evidence. The New Atheists’ conclusion that belief in God is unjustified follows, then, from their addition of the claim that there is inadequate scientific evidence for God’s existence (and even adequate scientific evidence for God’s non-existence). Dawkins argues that the “God Hypothesis,” the claim that there exists a superhuman, supernatural intelligence who deliberately designed and created the universe, is “founded on local traditions of private revelation rather than evidence” (2006, pp. 31-32).

2. Arguments for and against God’s Existence

The New Atheists are not philosophers of religion, and none of them addresses either theistic or atheistic arguments to any great extent. Dawkins does devote a chapter to each of these tasks in the process of making a case for his claim that there almost certainly is no God. Harris, who thinks that atheism is obviously true, does not dedicate much space to a discussion of arguments for or against theism. He does sketch a brief version of the cosmological argument for God’s existence but asserts that the final conclusion does not follow because the argument does not rule out alternative possibilities for the universe’s existence. Harris also hints at reasons to deny God’s existence by pointing to unexplained evil and “unintelligent design” in the world. Hitchens includes chapters entitled “The Metaphysical Claims of Religion are False” and “Arguments from Design,” but his more journalistic treatment of the cases for and against God’s existence amounts primarily to the claim that the God hypothesis is unnecessary, since science can now explain what theism was formerly thought to be required to explain, including phenomena such as the appearance of design in the universe. After considering the standard arguments for God’s existence and rehearsing standard objections to them, Dennett argues that the concept of God is insufficiently determinate for it to be possible to know what proposition is at issue in the debate over God’s existence.

Dawkins’ argument for the probable non-existence of God is the most explicit and thorough attempt at an atheistic argument amongst the four. Dawkins labels his argument for God’s non-existence “the Ultimate Boeing 747 gambit,” because he thinks that God’s existence is at least as improbable as the chance that a hurricane, sweeping through a scrap yard, would have the luck to assemble a Boeing 747 (an image that he borrows from Fred Hoyle, who used it for a different purpose). At the heart of his argument is the claim that any God capable of designing a universe must be a supremely complex and improbable entity who needs an even bigger explanation than the one the existence of such a God is supposed to provide. Dawkins also says that the hypothesis that an intelligent designer created the universe is self-defeating. What he appears to mean by this charge is that this intelligent design hypothesis claims to provide an ultimate explanation for all existing improbable complexity and yet cannot provide an explanation of its own improbable complexity. Dawkins further states that the God hypothesis creates a vicious regress rather than terminating one. Similarly, Harris follows Dawkins in arguing that the notion of a creator God leads to an infinite regress because such a being would have to have been created.

3. Evolution and Religious Belief

The New Atheists observe that if there is no supernatural reality, then religion and religious belief must have purely natural explanations. They agree that these sociological and psychological phenomena are rooted in biology. Harris summarizes their view by saying that as a biological phenomenon, religion is the product of cognitive processes that have deep roots in our evolutionary past. Dawkins endorses the general hypothesis that religion and religious belief are byproducts (what some evolutionary biologists call “spandrels”) of something else that has survival value. His specific hypothesis is that human beings have acquired religious beliefs because there is a selective advantage to child brains that possess the rule of thumb to believe, without question, whatever familiar adults tell them. Dawkins speculates that this cognitive disposition, which tends to help inexperienced children to avoid harm, also tends to make them susceptible to acquiring their elders’ irrational and harmful religious beliefs. Dawkins is less committed to this specific hypothesis than he is to the general hypothesis, and he is open to other specific hypotheses of the same kind. Dennett discusses a number of these specific hypotheses more thoroughly in his attempt to “break the spell” he identifies as the taboo against a thorough scientific investigation of religion as one natural phenomenon among many.

At the foundation of Dennett’s “proto-theory” about the origin of religion and religious belief is his appeal to the evolution in humans (and other animals) of a “hyperactive agent detection device” (HADD), which is the disposition to attribute agency—beliefs and desires and other mental states—to anything complicated that moves. Dennett adds that when an event is sufficiently puzzling, our “weakness for certain sorts of memorable combos” cooperates with our HADD to constitute “a kind of fiction-generating contraption” that hypothesizes the existence of invisible and even supernatural agents (2006, pp. 119-120). Dennett goes on to engage in a relatively extensive speculation about how religion and religious belief evolved from these purely natural beginnings. In doing so, he employs the concept of a “culturally based replicator,” which Dawkins had previously labeled a “meme” (on analogy with “gene,” which refers to biologically based replicators). Though Hitchens mentions Dennett’s naturalistic approach to religion in his chapter on “religion’s corrupt beginnings,” he focuses primarily on the interplay between a pervasive gullibility he takes to be characteristic of human beings and the exploitation of this credulity that he attributes to the founders of religions and religious movements. The scientific investigation of religion of the sort Dennett recommends has prompted a larger interdisciplinary conversation that includes both theists and non-theists with academic specialties in science, philosophy, and theology (see Schloss and Murray 2009 for an important example of this sort of collaboration).

4. The Moral Evaluation of Religion

The New Atheists agree that, while religion may have been a byproduct of certain human qualities that proved important for survival, religion itself is not necessarily a beneficial social and cultural phenomenon on balance at present. Indeed, three of the New Atheists (Harris, Dawkins, and Hitchens) are quite explicit in their moral condemnation of religious people on the ground that religious beliefs and practices have had significant and predominately negative consequences. The examples they provide of such objectionable behaviors range from the uncontroversial (suicide bombings, the Inquisition, “religious” wars, witch hunts, homophobia, etc.) to the controversial (prohibition of “victimless crimes” such as drug use and prostitution, criminalization of abortion and euthanasia, “child abuse” due to identification of children as members of their parents’ religious communities, etc.). Harris is explicit about placing the blame for these evils on faith, defined as unfounded belief. He argues that faith in what religious believers take to be God’s will as revealed in God’s book inevitably leads to immoral behaviors of these sorts. In this way, the New Atheists link their epistemological critique of religious belief with their moral criticism of religion.

The New Atheists counter the claim that religion makes people good by listing numerous examples of the preceding sort in which religion allegedly makes people bad. They also anticipate the reply that the moral consequences of atheism are worse than those of theism. A typical case for this claim appeals to the atrocities perpetrated by people like Hitler and Stalin. The New Atheists reply that Hitler was not necessarily an atheist, since he claimed to be a Christian; that these regimes were evil because they were influenced by religion or were like a religion; and that, even if their leaders were atheists (as in the case of Stalin), their crimes against humanity were not caused by their atheism because they were not carried out in the name of atheism. The New Atheists seem to agree that theistic belief generally has worse attitudinal and behavioral moral consequences than atheistic belief. Dennett is characteristically more hesitant to draw firm conclusions along these lines until further empirical investigation is undertaken.

5. Secular Morality

These moral objections to religion presuppose a moral standard. Since the New Atheists have denied the existence of any supernatural reality, this moral standard has to have a purely natural and secular basis. Many non-theists have located the natural basis for morality in human convention, a move that leads naturally to ethical relativism. But the New Atheists either explicitly reject ethical relativism, or affirm the existence of the “transcendent value” of justice, or assert that there is a consensus about what we consider right and wrong, or simply engage in a moral critique of religion that implicitly presupposes a universal moral standard.

The New Atheists’ appeal to a universal secular moral standard raises some interesting philosophical questions. First, what is the content of morality? Harris comes closest to providing an explicit answer to that question in stating that questions of right and wrong are really questions about the happiness and suffering of sentient creatures. Second, if the content of morality is not made accessible to human beings by means of a revelation of God’s will, then how do humans know what the one moral standard is? The New Atheists seem to agree that we have foundational moral knowledge. Harris calls the source of this basic moral knowledge “moral intuition.” Since the other New Atheists don’t argue for the moral principles to which they appeal, it seems reasonable to conclude that they would agree with Harris. Third, what is the ontological ground of the universal moral standard? Given the assumption that ethical relativism is false, the question arises concerning what the objective natural ground is that makes it the case that some people are virtuous and some are not and that some behaviors are morally right and some are not. Again, Harris’s view that our ethical intuitions have their roots in biology is representative. Dawkins provides “four good Darwinian reasons” that purport to explain why some animals (including, of course, human beings) engage in moral behavior. And though Dennett’s focus is on the evolution of religion, he is likely to have a similar story about the evolution of morality. The fourth philosophical question raised by the New Atheists is one they address themselves: “Why should we be moral?” Harris’s answer is that being moral tends to contribute to one’s happiness. Dawkins replies to the critic who asks, “If there’s no God, why be good?” by questioning the necessity, desirability, and efficacy of a desire for divine approval as a motivator for moral behavior.

6. Alleged Divine Revelations

If there is no divine being, then there are no divine revelations. If there are no divine revelations, then every sacred book is a merely human book. Harris, Dawkins, and Hitchens each construct a case for the claim that no alleged written divine revelation could have a divine origin. Their arguments for this conclusion focus on what they take to be the moral deficiencies and factual errors of these books. Harris quotes passages from the part of the Old Testament traditionally labeled the “Law” that he considers barbaric (such as the command in Deuteronomy 13 to stone family members or close friends to death if they “secretly entice you” to “go worship other gods”) and then asserts (on the basis of his view that Jesus can be read to endorse the entirety of Old Testament law) that the New Testament does not improve on these injunctions. He says that any subsequent more moderate Christian migration away from these biblical legal requirements is a result of taking scripture less and less seriously. Dawkins agrees with Harris that the God of the Bible and the Qur’an is not a moderate. As a matter of fact, he says, “The God of the Old Testament is arguably the most unpleasant character in all of fiction” (Dawkins 2006, p. 31). Though he says that “Jesus is a huge improvement over the cruel ogre of the Old Testament” (Dawkins 2006, p. 25), he argues that the doctrine of atonement, “which lies at the heart of New Testament theology, is almost as morally obnoxious as the story of Abraham setting out to barbecue Isaac” (Dawkins 2006, p. 251). Hitchens adds his own similar criticisms of both testaments in two chapters: “The Nightmare of the ‘Old’ Testament” and “The ‘New’ Testament Exceeds the Evil of the ‘Old’ One.” He also devotes a chapter to the Qur’an (as does Harris) and a section to the Book of Mormon. Dennett hints at a different objection to the Bible by remarking that anybody can quote the Bible to prove anything.

This collective case against the authenticity of any alleged written divine revelation raises interesting questions in philosophical theology about what kind of book could qualify as “God’s Word.” For instance, Harris considers it astonishing that a book as “ordinary” as the Bible is nonetheless thought to be a product of omniscience. He also says that, whereas the Bible contains no formal discussion of mathematics and some obvious mathematical errors (he takes I Kings 7:23-26 to state that π, the ratio of the circumference of a circle to its diameter, is 3:1), a book written by an omniscient being could contain a chapter on mathematics that would still be the richest source of mathematical insight humanity has ever known. This sort of claim invites further discussion about the sorts of purposes God would have and strategies God would employ in communicating with human beings in different times and places.

7. Secular Fulfillment

Each of the New Atheists recommends or at least alludes to a non-religious means of personal fulfillment and even collective well-being. Harris advocates a “spirituality” that involves meditation leading to happiness through an eradication of one’s sense of self. He thinks that scientific exploration into the nature of human consciousness will provide a progressively more adequate natural and rational basis for such a practice. For inspiration in a Godless world, Dawkins looks to the power of science to open the mind and satisfy the psyche. He celebrates the liberation of human beings from ignorance due to the growing and assumedly limitless capacity of science to explain the universe and everything in it. Hitchens hints at his own source of secular satisfaction by claiming that the natural is wondrous enough for anyone. He expresses his hope for a renewed Enlightenment focused on human beings, based on unrestricted scientific inquiry, and eventually productive of a new humane civilization. Dennett believes that a purely naturalistic spirituality is possible through a selfless attitude characterized by humble curiosity about the world’s complexities resulting in a realization of the relative unimportance of one’s personal preoccupations.

8. Criticism of the New Atheists

A number of essays and books have been written in response to the New Atheists (see the “Works about the New Atheism” sub-section of the “References and Further Reading” section below for some titles). Some of these works are supportive of them and some of them are critical. Other works include both positive and negative evaluations of the New Atheism. Clearly, the range of philosophical issues raised by the New Atheists’ claims and arguments is broad. As might be expected, attention has been focused on their epistemological views, their metaphysical assumptions, and their axiological positions. Their presuppositions should also prompt more discussion in the fields of philosophical theology, philosophy of science, philosophical hermeneutics, the relation between science and religion, and historiography. Conversations about the New Atheists’ stances and rationales have also taken place in the form of debates between Harris, Dawkins, Hitchens, and Dennett and defenders of religious belief and religion such as Dinesh D’Souza, who has published his own defense of Christianity in response to the New Atheists’ arguments. These debates are accessible in a number of places on the Internet. Finally, the challenges to religion posed by the New Atheists have also prompted a number of seminars and conferences. One of these is a conference presented by the Center for Philosophy of Religion at the University of Notre Dame, entitled, “My Ways are not Your Ways: The Character of the God of the Hebrew Bible” (held September 10-12, 2009). For an introduction to the sorts of issues this conference addresses, see Copan 2008.

Criticisms have been raised about a number of the New Atheists’ claims mentioned above. With respect to epistemology, critics point out that the New Atheist assumption that religious faith is irrational is at odds with a long philosophical history in the West that often characterizes faith as rational. This Western philosophical tradition can be said to begin with Augustine and to include a number of prominent Western philosophers up to the present (including Anselm, Aquinas, Descartes, Pascal, and more recently, Alvin Plantinga and Richard Swinburne). Moreover, given the New Atheist epistemological assumptions (and their consequences for religious epistemology), some criticism of their views has included questions about whether their reliance on empirical science is scientifically justified and whether there is adequate evidence to support their thesis of evidentialism. As for metaphysics, Dawkins has been criticized for engaging in an overly cursory evaluation of theistic arguments and for ignoring the philosophical literature in natural theology. Some critics, like William Lane Craig, reply that, at best, Dawkins’ argument could show only that the God hypothesis does not explain the appearance of design in the universe but could not demonstrate that God probably doesn’t exist. Critics allege that Dawkins’ assumption that God would need an external cause flies in the face of the longstanding theological assumption that God is a perfect and so necessary being who is consequently self-existent and ontologically independent. Critics also maintain that Dawkins owes the defender of this classical conception of God further clarification of the kind of complexity he attributes to God and further arguments for the claims that God possesses this kind of complexity and that God’s being complex in this way is incompatible with God’s being self-existent. In reply to Dawkins, Craig argues that though the contents of God’s mind may be complex, God’s mind itself is simple.

Finally, as regards ethics, critics argue that a problem with the New Atheists’ biological answer to the philosophical question concerning the ontological ground of the universal moral standard is that it could only explain what causes moral behavior; it can’t also account for what makes moral principles true. And critics contend that the New Atheists’ answer to the question, “Why be moral?” could only show that belief in God is not needed to motivate people to be moral; it doesn’t explain what does (or should) motivate atheists to be moral.

9. References and Further Reading

a. Works by the New Atheists

  • Dawkins, Richard. The Selfish Gene, 2nd ed. (Oxford: Oxford University Press, 1989).
    • An explanation and defense of biological evolution by natural selection that focuses on the gene.
  • Dawkins, Richard. The God Delusion (Boston: Houghton Mifflin, 2006).
    • A case for the irrationality and immoral consequences of religious belief that draws primarily on evolutionary biology.
  • Dennett, Daniel. Breaking the Spell: Religion as a Natural Phenomenon (New York: Penguin, 2006).
    • A case for studying the history and practice of religion by means of the natural sciences.
  • Dennett, Daniel. “Afterword” in Richard Dawkins, The God Delusion, 10th anniversary edition (London: Penguin Random House, 2016), pp. 421-26.
    • Dennett’s retrospective about the impact made by the four original New Atheists following the initial publication of their books.
  • Harris, Sam. The End of Faith: Religion, Terror, and the Future of Reason (New York: Norton, 2004).
    • An intellectual and moral critique of faith-based religions that recommends their replacement by science-based spirituality.
  • Harris, Sam. Letter to a Christian Nation (New York: Vintage Books, 2008).
    • A revised edition of his 2006 response to Christian reactions to his 2004 book.
  • Hitchens, Christopher. God is Not Great: How Religion Poisons Everything (New York: Twelve, 2007).
    • A journalistic case against religion and religious belief.

b. Works About the New Atheism

  • Berlinski, David. The Devil’s Delusion: Atheism and its Scientific Pretensions (New York: Crown Forum, 2008).
    • A response to the New Atheists by a secular Jew that defends traditional religious thought.
  • Copan, Paul. “Is Yahweh a Moral Monster? The New Atheists and Old Testament Ethics,” Philosophia Christi 10:1, 2008, pp. 7-37.
    • A defense of the God and ethics of the Old Testament against the New Atheists’ criticisms of them.
  • Copan, Paul and William Lane Craig, eds. Contending with Christianity’s Critics (Nashville, Tenn.: Broadman and Holman, 2009).
    • A collection of essays by Christian apologists that addresses challenges from New Atheists and other contemporary critics of Christianity.
  • Craig, William Lane, ed. God is Great, God is Good: Why Believing in God is Reasonable and Responsible (Downers Grove, IL: InterVarsity Press, 2009).
    • A collection of essays by philosophers and theologians defending the rationality of theistic belief from the attacks of the New Atheists and others.
  • D’Souza, Dinesh. What’s So Great About Christianity (Carol Stream, IL: Tyndale House Publishers, 2007).
    • A defense of Christianity against the criticisms of the New Atheists.
  • Eagleton, Terry. Reason, Faith, and Revolution: Reflections on the God Debate (New Haven: Yale University Press, 2009).
    • A critical reply to Dawkins and Hitchens (“Ditchkins”) by a Marxist literary critic.
  • Keller, Timothy. The Reason for God: Belief in God in an Age of Skepticism (New York: Dutton, 2007).
    • A Christian minister’s reply to objections against Christianity of the sort raised by the New Atheists together with his positive case for Christianity.
  • McGrath, Alister and Joanna Collicutt McGrath. The Dawkins Delusion? Atheist Fundamentalism and the Denial of the Divine (Downers Grove, IL: InterVarsity Press, 2007).
    • A critical engagement with the arguments set out in Dawkins 2006.
  • Ruse, Michael. “Why I think the New Atheists are a Bloody Disaster,” https://biologos.org/blogs/archive/why-i-think-the-new-atheists-are-a-bloody-disaster, posted August 14th, 2009.
    • A criticism of the New Atheists by an atheist.
  • Schloss, Jeffrey and Michael Murray, eds. The Believing Primate: Scientific, Philosophical, and Theological Reflections on the Origin of Religion (New York: Oxford University Press, 2009).
    • An interdisciplinary discussion of issues raised by the sort of naturalistic account of religion promoted in Dennett 2006 and elsewhere.
  • Stenger, Victor. God: The Failed Hypothesis. How Science Shows that God does not Exist (Amherst: Prometheus Books, 2008).
    • A scientific case against the existence of God by a physicist who also taught philosophy and who is often classified as a New Atheist.
  • Stenger, Victor. The New Atheism: Taking a Stand for Science and Reason (Amherst: Prometheus Books, 2009).
    • A review of and expansion upon the principles of the New Atheism with responses to many of its critics.
  • Ward, Keith. Is Religion Dangerous? (Grand Rapids: Eerdmans, 2006).
    • A defense of religion against the New Atheists’ arguments by a philosopher-theologian.

 

Author Information

James E. Taylor
Email: taylor@westmont.edu
Westmont College
U. S. A.

Philodemus of Gadara (c.110—c.30 B.C.E.)

Philodemus of Gadara was a poet and Epicurean philosopher who, after leaving Gadara, studied in Athens under Zeno of Sidon before moving to Italy. Once in Italy, he lived in the area around the Bay of Naples, where he belonged to a circle of Epicureans that included Siro as well as the Roman poets Vergil, L. Varius Rufus, Quintilius Varus, and Plotius Tucca. His epigrams were preserved as part of the Greek Anthology, while his prose works were discovered at the Villa of the Papyri in Herculaneum, carbonized by the first pyroclastic surge of Mount Vesuvius in 79 C.E. He wrote on a wide range of topics, including epistemology, ethics, theology, aesthetics, logic and science, and the history of philosophy, but not physics. In his works, he presents himself as an entirely orthodox Epicurean. He does so by explicating the teachings of earlier Epicureans (especially those of Epicurus, Metrodorus, Hermarchus, and Polyaenus), defending the positions of his teacher Zeno of Sidon, arguing against fellow Epicureans whom he perceives to have strayed from orthodoxy, and advancing Epicurean positions against other schools like the Academics, Peripatetics, Stoics, Cynics, and Cyrenaics. Philodemus’ works fall into two distinct categories of style. The first comprises works that employ a bitter and polemical style, which he uses to denigrate other philosophers. A second, smaller group, which includes On Death and his works on the history of philosophy, employs a much gentler tone and was perhaps designed to appeal to a more general audience.

The discovery of Philodemus’ works at Herculaneum in the eighteenth century was initially met with disappointment, and his works were regarded as offering little philosophical value. The negative reception of his works started to change in the 1970s, particularly due to the efforts of Marcello Gigante. Gigante founded the Centro Internazionale per lo Studio dei Papiri Ercolanesi, where, using new scientific methods, he made sure that revised editions of texts were released. More recently, newer technologies such as multispectral imaging have led to further editions. The clearer editions have shown that Philodemus’ works are more innovative than once thought, especially in the areas of aesthetics and ethics. This in turn has led to a realization that Epicureans were far less dogmatic than previously believed and that they were willing to incorporate non-Epicurean views, so long as they supported the school’s core tenets.

Table of Contents

  1. Life
  2. Sources
    1. Epigrams
    2. Prose Works and the Material Challenges of the Scrolls
  3. The Epigrams
  4. Philodemus’ Philosophy and Prose Works
    1. Epicureanism
    2. On the Good King according to Homer
    3. History of Philosophy
    4. Logic, Science, and Epistemology
    5. Ethics
      1. List of Ethical Works
      2. General Background on Epicurean Ethics
      3. On Choices and Avoidances
      4. On Death
      5. On Household Economics and On Wealth
      6. On Anger
      7. On Frank Speech
    6. Theology
    7. Aesthetics
  5. Influence and Legacy
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Very few concrete details are known about Philodemus’ life. Strabo tells us that he was born in Gadara, a Syrian Greek city which also produced other literary, rhetorical, and philosophical figures including the following: Menippus, Meleager, Theodorus the rhetorical teacher of Tiberius, Apsines the rival of Fronto of Emesa, Oenomaus the Cynic, and Philo the mathematician. It is not known when Philodemus left Gadara or if he went directly to Athens. Once there, however, he studied Epicurean philosophy with Zeno of Sidon (head of the Epicurean school from c.100-c.75 B.C.E.), who had a great influence on Philodemus. A number of his extant works (On Frank Speech and On Anger) are notes of lectures given by Zeno, and he describes himself as a faithful student both before and after Zeno’s death (PHerc. 1005 col. XIV.6-9). Many of Philodemus’ arguments adhere to Zeno’s interpretation of Epicurean philosophy. In On Rhetoric, for example, Philodemus consistently attempts to prove the orthodoxy of his views by restating those of Zeno, who had compiled evidence from founders’ works that supported his views. Likewise, in On Signs Philodemus puts forward Zeno’s position on Epicurus’ scientific method of inference.

Philodemus most likely left Athens in the 80s or 70s B.C.E. His reasons for leaving are unknown, but he was probably a part of the large movements of people caused by either the Mithridatic Wars of the 80s or the Asiatic campaigns of the 70s. A reference in the Suda, a 10th-century Byzantine encyclopedia, suggests that he may have spent time in Himera but was expelled during a famine and a plague, when he was thought to have brought the anger of the gods. Unfortunately, it is impossible to comment on the reference’s veracity. What is more certain, however, is that Philodemus came to Italy, where he spent the majority of his time in Rome, Naples, or both. Evidence from his own work On Flattery (PHerc. 312 col. XIV) places him in the region around the Bay of Naples. Likewise, his dedication of three books of On Vices to Vergil, Quintilius Varus, Varius Rufus, and Plotius Tucca provides a further indication of his connection with the various Epicurean schools around Campania.

Once in Italy, Philodemus secured the patronage of Lucius Calpurnius Piso (c.100-43 B.C.E., consul 58 B.C.E.), a wealthy Roman senator and father-in-law of Julius Caesar. According to Cicero, Philodemus met Piso when Piso was an adulescens, a term which applies to any age between 15 and 30. There are four pieces of evidence for the relationship between Philodemus and Piso: 1) To Piso, Philodemus dedicated a treatise called On the Good King according to Homer. 2) In Epigram 27 (AP. 11.44), Philodemus invites Piso to an Epicurean celebration. 3) Cicero depicts their friendship in his speech Against Piso; in this work, Cicero does not name Philodemus, but Asconius’ commentary identifies the unnamed Greek as Philodemus (Asc. Pis. 68). 4) In Catullus 47, Catullus depicts the friendship between a philosopher Socration, who can be identified as Philodemus, and a figure Catullus dubs Priapus, probably Piso.

Nothing is known about Philodemus’ death, but it is posited that he died around 30 B.C.E.

2. Sources

a. Epigrams

The majority of Philodemus’ epigrams, or poems ascribed to Philodemus, have been preserved in the Greek Anthology, which is a composite of the Palatine Anthology (found in two manuscripts AP and P) and the Anthology of Planudes (APl). These both had a common source, Constantine Cephalus’ omnibus of earlier collections of Greek epigrams including the Garland of Philip, in which Philodemus’ epigrams were incorporated. Some additional epigrams were also found in a papyrus from Oxyrhynchus (POxy. 3724). David Sider’s The Epigrams of Philodemos collects 38 epigrams either definitely by Philodemus or thought to have been by Philodemus in either AP or P. It is unknown whether Philodemus published the epigrams in his lifetime. Likewise, the original order in which the epigrams were written or arranged is not known. As a result, Sider has renumbered and re-grouped them as follows: epigrams 1 to 8, the Xanthippe cycle (Xanthippe was the wife of Socrates); epigrams 9 to 26, which are erotic poems; epigrams 27 to 29, which offer reflections on life in Campania; epigrams 30 to 34, on miscellaneous topics; epigrams 35 to 36, which have been ascribed to Philodemus but whose authorship cannot be proved or disproved; epigrams 37 to 38, which are not by Philodemus, but which have been included by Sider in order to evaluate all arguments for Philodemean authorship.

b. Prose Works and the Material Challenges of the Scrolls

Philodemus’ prose works are preserved in a collection of badly burned scrolls found at Herculaneum in an area named the Villa of the Papyri, which was discovered in 1750 by the Swiss military engineer Karl Weber. The library was found two years later, in October of 1752. Upon its initial discovery no one was quite sure what they had found. The scrolls were burned beyond recognition and did not resemble the papyrus scrolls found in other places, particularly Egypt. Camillo Paderni, an artist put in charge, along with some workers, initially took the charred papyri for pieces of wood, throwing some aside and burning some as firewood. Eventually, Paderni and his workers noticed the relatively uniform nature of the finds; after first thinking they were rolls of fabric or fishing net, Paderni finally realized that they had found a library. He outlined this discovery in a letter to the Royal Society of London, saying that one room

appears to have been a library, adorned with presses, inlaid with different sorts of wood, disposed in rows; at the top of which were cornices, as in our times. I was buried in this spot for more than twelve days to carry off the volumes found there; many of which were so perished, that it was impossible to remove them.

As a result of the papyri’s carbonized state, Paderni employed a technique called scorzatura totale. This involved cutting the rolls in half vertically and then scooping out the middle portion. This method left intact the outside, concave layers, but caused the loss of important information about author, title, book number, and in some cases stichometric information, all of which is usually found at the end of the scroll. It also destroyed letters on each line crossing the cut.

After Paderni, a succession of techniques was used to open the scrolls, all of which caused further damage. They included the pouring of mercury onto the scrolls, the application of rose water, and lastly the application of vegetable gas, which did nothing but cause a bad smell. After these unsuccessful attempts, King Charles asked the head of the Vatican library for help, and Padre Antonio Piaggio was brought in to open the scrolls. Piaggio employed a combination of methods to open the scrolls, sometimes together and sometimes in isolation. The first way, known as scorzatura (“husking”), was to cut the papyri into two hemicylinders (or sometimes four smaller ones). Piaggio’s cuts were shallower than Paderni’s, which left the inner piece (the midollo or “marrow”) undamaged. Each semi-circular stack was called a scorza (“bark” or “husk”). A stack was read using a technique called sfogliamento, in which a drawing (disegno) was made of each layer before it was scraped off to expose the layer below. This method preserved only the lowest, outer layer together with the midollo. The process continued until no further layers could be separated. The disegni have been an important resource for later editors, as they preserve sovrapposto and sottoposto, or fragments of layers that have become stuck to the layers inside or outside of them.

The midolli could be opened by unrolling (svolgimento) them. However, they were very brittle so Piaggio devised a machine to help open them. Animal membrane was attached to the outer edge of the papyri, ribbon or string was attached to the membrane, and then the ribbon was tied to a bar set above the midollo, which by the force of its own weight was allowed slowly to unwind. A third method (sollevamento) was used when a scroll had not been cut vertically into two sections. Working inwards from the outside of the scroll, each layer of the scorze could be lifted off. This technique had the problem of sometimes lifting off more than one layer at a time. In addition, Piaggio re-numbered the scrolls that Paderni had opened without leaving a record of having done so. This led to a number of works (for example On Music and On Piety) being read back to front, an issue which has now been remedied.

After Paderni, the British Reverend John Hayter (1756-1818) was invited to Naples to supervise the work of the Officina dei Papiri. Between 1802 and 1806, he and his team opened over two hundred scrolls. As with Paderni, transcriptions were made of over half of these scrolls. Although these too were drawn by artists who did not know Greek, Hayter had them examined by people who did. After Naples had come under the kingship of Napoleon’s brother Joseph, Hayter went to Palermo, where he continued his work. Eventually he returned to England. The disegni of the scrolls Hayter opened, together with the eighteen made earlier, were taken to England by Sir William Drummond, the British Minister to Naples from 1806 to 1809, and are now called the Oxford disegni (O). Some scrolls had later drawings made in Naples (N) to replace those taken by Hayter, and others were made as new papyri were opened.

No new techniques were tried until the twentieth century, when Anton Fackelmann, a librarian from Vienna, used electromagnetism, which was successful. Once the layers were separated, Fackelmann thought to apply a coating of a natural transparent resin to strengthen them. He also added juice from fresh papyrus plants to give them added flexibility. Later, between 1999 and 2011, Brigham Young University undertook multispectral imaging (MSI) of the papyri held at the Officina dei Papiri Ercolanesi in Naples. The technique, developed by NASA scientists, takes several monochrome images of the same piece of papyrus, each with a different sensor. MSI uses filters to capture portions of the light spectrum that are not visible to the eye, particularly the infrared, in order to differentiate black ink from the blackened scrolls. By dropping out the blackness of the papyri and enhancing the black ink, which have different reflective characteristics, it is possible to read text that was formerly not visible.

Although the multispectral images (MSI) show text that cannot be seen by the human eye, it is still necessary for editors to view the originals of the papyrus scrolls. For example, the MSI appear entirely flat when in fact the papyrus fragments are highly ridged. These ridges can indicate sovrapposto and sottoposto, i.e. fragments from other columns that became stuck to other layers when the scroll was opened. Examining the papyri in person is also necessary to assess their physical condition and size, and to see other features that are discernible only on the originals. The majority of the original scrolls are still housed in Naples at the Officina dei Papiri Ercolanesi, although there are some in Oxford at the Bodleian Library, together with the disegni taken by Drummond, and in Paris. Also found at the Officina dei Papiri are disegni of scrolls opened after Hayter’s departure as well as of copies made to replace those taken to England. The Naples disegni (N) are less reliable than those taken to Oxford.

The most recent technology applied to reading the papyri from Herculaneum has been X-ray phase-contrast tomography (XPCT). The application of this technology to reading the texts from Herculaneum is in relatively early stages, and there are still some limitations associated with it. However, the use of XPCT is most promising, as it offers the major advantage of being able to read letters without needing to open the scrolls, a process which is extremely damaging.

Each work from Herculaneum will have a number, such as PHerc. 1050, which was assigned at its original opening. It will also have an English title, Latin title, and finally a Greek one. For example, PHerc. 1050 may be called by its English title, On Death, or by its Latin title, De morte, or by a Greek one, Περὶ θανάτου. Philodemean scholarship tends toward using the papyrus number and the Latin or English title. In a citation of a passage from one of Philodemus’ works, scholars will cite an abbreviated title or papyrus number, a column number, and a line number. In the case of works that have had more than one editor or with works for which different books have had different editors, then an editor’s name is included as well.
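The citation convention just described can be sketched in a few lines of Python. This is only an illustration, not a tool used by editors of the papyri: the names Citation and parse_citation are hypothetical, and the pattern assumes the simplest form seen in this article, such as PHerc. 1005 col. XIV.6-9, with a papyrus number, a column in Roman numerals, and an optional line or line range. Abbreviated titles, fragments, and editors' names are not handled.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern for the simple form "PHerc. 1005 col. XIV.6-9".
# Real citations vary more widely (titles, fragments, editors' names).
_CITATION = re.compile(
    r"PHerc\.\s*(?P<papyrus>\d+)"
    r"\s+col\.\s*(?P<column>[IVXLC]+)"
    r"(?:\.(?P<first>\d+)(?:-(?P<last>\d+))?)?"
)


@dataclass
class Citation:
    papyrus: int                      # papyrus number assigned at opening
    column: str                       # column number, kept as a Roman numeral
    first_line: Optional[int] = None  # first cited line, if any
    last_line: Optional[int] = None   # last cited line, if any


def parse_citation(text: str) -> Citation:
    """Split a simple PHerc. citation into papyrus, column, and lines."""
    match = _CITATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized citation: {text!r}")
    first = match.group("first")
    last = match.group("last")
    first_line = int(first) if first else None
    # A single cited line is treated as a range of one line.
    last_line = int(last) if last else first_line
    return Citation(int(match.group("papyrus")), match.group("column"),
                    first_line, last_line)
```

Under these assumptions, parse_citation("PHerc. 1005 col. XIV.6-9") yields papyrus 1005, column XIV, lines 6 to 9, while a citation without line numbers, such as PHerc. 312 col. XIV, leaves both line fields empty.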

3. The Epigrams

Philodemus’ epigrams reflect earlier Hellenistic conventions of using short elegiac couplets, that is, alternating lines of dactylic hexameter and pentameter. Philodemus draws on familiar epigrammatic subject matter such as erotic and sympotic topoi. Meleager, Asclepiades, Callimachus, and other authors from Meleager’s Garland all served as his poetic models. In keeping with Hellenistic tradition, his poems frequently convey the illusion that they were composed on the spot for performance at a dinner party. Even if they actually were extemporaneous to begin with, Philodemus would have polished them for publication. That they were published in his lifetime is attested by Cicero and a number of Latin poets, who were influenced by them.

Eight of Philodemus’ extant epigrams focus on the author’s relationship with Xantho (Sider 1-8, AP 5.131, 5.80, 9.570, 11.41, 5.112, 11.34, 5.4, 10.21), recounting its origin in erotic love and its move toward the poet’s desire for marriage and lifelong partnership. Eighteen poems are erotic (Sider 9-26, AP 5.13, 5.115, 12.173, 5.132, 5.24, 5.123, 5.25, 5.124, 5.121, 5.114, 11.30, 5.46, 5.308, 5.126, 5.107, 12.103, 5.306, 5.120), including a witty poem in which Philodemus uses the name Demo to pun on his name (Sider 10; AP 5.115). Three poems deal with Philodemus’ life on the Bay of Naples: two invitation poems (one to Piso, Sider 27, AP 11.44, and a second to friends, Sider 28, AP 11.35) and one that contemplates the death of a friend (Sider 29, AP 9.412).

None of the poems are strictly speaking Epicurean, although the three poems that describe life in Campania (Sider 27-29, AP 11.44, 11.35, 9.412) touch on Epicurean themes such as friendship, death, and simple food. His incorporation of Epicurean ideas is itself influenced by earlier examples, which suggests that the inclusion of Epicurean themes by Philodemus has more to do with tradition than with his Epicureanism. Asclepiades had included Epicurean tenets in his poems, Posidippus Stoic tenets, and Callimachus a variety of schools. All three writers of epigrams had employed philosophical themes in their erotic poems to depict the trials of love.

Cicero (Against Piso 70) presents Philodemus’ decision to write poems as out of keeping with Epicurean traditions, and there was a tendency in sources hostile to Epicurus and his teachings to present Epicureans as anti-intellectual and anti-poetry. In reality, Epicurus’ views on poetry were more nuanced than his opponents suggest, and he probably regarded poetry as a natural but unnecessary pleasure. Philodemus’ epigrams, which give the appearance of off-the-cuff recitations, fulfill Epicurus’ requirement that the wise man not go to great effort to compose poetry.

4. Philodemus’ Philosophy and Prose Works

a. Epicureanism

Epicurus (341—271 B.C.E.) established a school of philosophy around 305/4 B.C.E. He was an atomist who held an empiricist theory of knowledge, a moderate form of ethical hedonism, and a social theory based on contractarianism. Hostile sources tend to present Epicurus as anti-intellectual, anti-political, and as a sensual hedonist. Later Epicureans had a reputation for loyalty and orthodoxy, and they sought to clarify and defend Epicurus’ views against such polemic. Philodemus is no exception, and his expositions on the topics of Epicurean logic, science, epistemology, ethics, aesthetics, and theology are often extremely polemical in style. Aside from acting as an important source for Epicurean views, Philodemus’ works also provide important evidence about other ancient philosophical schools such as the Academics, Peripatetics, Cynics, Stoics, and Cyrenaics.

An area of Epicurean doctrine that is noticeably absent from Philodemus’ extant works is that of physics, although his discussions on epistemology and theology are informed by the school’s teachings on the subject. In particular, Philodemus’ works are informed by the Epicurean view that it is through the study of nature (physiologia) that it is possible to live happily, by which Epicureans meant to live in accordance with pleasure. Epicurus distinguishes between the greatest pleasure, which is absence of physical pain (aponia) and mental distress (ataraxia), and the things that bring pleasure; later sources differentiate these as katastematic and kinetic pleasures respectively, although Epicurus does not do so in his extant works. He argues that although pleasure is limited and is a static state, it is possible to vary it (Epicurus RS 9).

Philodemus’ lack of writing on the topic of physics may reflect his Roman context, as may his great interest in ethics, politics, and aesthetics. With regard to political involvement, which Epicureans are usually depicted as advising against, Philodemus argues that some people are constitutionally inclined toward political involvement (On Rhetoric fr. XIII.1-16 Longo Auricchio) and fame (On Flattery IV.4-12). Ultimately, however, he recommends withdrawal from the many to a close circle of friends as the best means of securing happiness. The most complete account of Epicurean physics is found in Lucretius, although fragments of Epicurus’ On Nature, of which Lucretius’ On the Nature of the Universe is an adaptation, have been discovered among the Herculaneum papyri.

b. On the Good King according to Homer

On the Good King according to Homer (PHerc. 1507) is an ethical text, in which Philodemus offers an account of good and bad leadership qualities, but it also showcases Philodemus’ view that the Epicurean sage is best positioned to correctly interpret poetry. The treatise was dedicated to Lucius Calpurnius Piso Caesoninus. Using examples from Homer, Philodemus offers advice on how to be a good leader and how to avoid being a bad one. He shows that a good person can be an effective and profitable leader if they abide by particular moral standards. He deals with themes such as leisure time, the character and behaviors of good and bad rulers, how to deal with conspirators and discord, interpersonal relationships, social harmony, as well as military matters.

Philodemus counsels against being a tyrant or despot and ruling through fear, saying that love and respect are much more effective means of governing. He recommends the avoidance of coarse behavior and jokes, licentiousness, drunkenness, overindulgence in food, boastfulness, unnecessary anger, severity, harshness, and bitterness in favor of the recitation of tasteful poetry, self-restraint in the consumption of food and drink, a stable disposition, control over excessive emotions, mildness, fairness, and gentleness. He writes that a good leader will be a lover of victory but not of unnecessary wars, battles, or civil war, and he argues that sowing dissent among one’s followers to maintain power is ineffective. He suggests that a system of punishments (rebukes and threats) and rewards (honors rather than personal gain) is effective for keeping discipline. Good rulers, according to Philodemus, are just and apply laws that are beneficial rather than simply strict. They display clemency and are dutiful. They undertake physical and intellectual training and are able to take wise counsel. The two traits Philodemus most praises in leaders are wisdom and conciliatory justice. Of all the Homeric heroes, Philodemus presents Nestor and Odysseus as displaying the greatest number of ideal traits.

Although the work is not strictly speaking a philosophical treatise, Philodemus interprets kingship theory through the lens of Epicurean philosophy, and he privileges traits such as emotional constancy, frankness, and self-restrained enjoyment of pleasures that contribute to personal security.

c. History of Philosophy

Philodemus’ historical works can be divided into two categories: the first includes dispassionate indices of past philosophers, while the second comprises works of a more polemical style in which he discusses issues surrounding the canonical texts of the early founders, orthodoxy, and doctrinal consistency. In this group of works, Philodemus defends his own views, presenting himself as a thoroughly orthodox Epicurean.

Diogenes Laertius (10.3) records that Philodemus wrote a history of philosophy, and scholars have suggested that a number of Herculaneum papyri belong to this work. These are simple indices on the Stoics (PHerc. 1018), Academics (PHerc. 164 and 201), Epicureans (PHerc. 1780), Pre-Socratics (PHerc. 327 and 1508), and Socratics (PHerc. 495 and 558). They contain the names of various philosophers together with their biographical details and the names of their students. They do not include analysis of any doctrines. Philodemus’ name does not appear on any of the extant fragments, and so it is not entirely certain that they are his works.

Philodemus’ remaining works on the history of philosophy are in his more usual polemic style, which he deploys against other schools and against Epicureans whom he considers to be failing to adhere closely enough to the teachings of the school’s early leaders. He regards the lives and teachings of Epicurus, Metrodorus, Hermarchus, and Polyaenus as the benchmark for later followers. He tends to present himself as maintaining orthodoxy while other circles of Epicureans practice a degraded version of Epicureanism. Three extant works (Memoirs, Against the …, and On Epicurus) offer examples of Philodemus’ technique of establishing the views of the early founders. In Memoirs (PHerc. 1418 and 310), Philodemus collates letters from the first generation of the school. The work’s aim is to preserve their memories and to pass along information about their daily lives to later Epicureans. In the third of the work that has been preserved, Philodemus provides excerpts from letters on the topics of friendship, financial contributions to the school, and how correctly to praise.

In Against the … (PHerc. 1005), Philodemus appears to have a similar aim of setting forth the views of the early founders, and he stresses that a good Epicurean must know the contents of their works before they are able to undertake critical interpretation. The question of canonization is thus an important aspect of this work. He cites Zeno, his teacher, as an example of an Epicurean whose exegesis of the school’s doctrines is based on careful study of the founder’s thoughts. Philodemus also defends Epicureans from the charge of doctrinal inconsistency. The full title of this work is not known and it is not precisely clear against whom Philodemus is arguing. It is more certain, however, that the work contains an attack on Epicureans, as well as on a non-Epicurean who exploited disagreements within the school to bolster his own argument. Philodemus, rather importantly, envisages two ways of being a follower of Epicurus: the first is to live a life guided by Epicurus’ teachings but not to engage in any doctrinal exegesis. It is clear that Philodemus regards this as an option for those who lack the education to delve in depth into the school’s teachings. The second follower is able to undertake interpretation of the founder’s teachings, having completed in-depth training; sages like himself and Zeno belong to this group.

A final work in which Philodemus focuses on the history of philosophy is On Epicurus (PHerc. 1231, 1232, 1289b, and perhaps 176). The work is a eulogy to Epicurus, and like Against the … and Memoirs it focuses on orthodoxy and canonization. On Epicurus gives a particularly good indication of Philodemus’ strong emphasis on ethics and his view that ethics needs to be grounded in “the study of nature” (physiologia). It also highlights Philodemus’ desire to present himself as an orthodox interpreter of Epicurus’ doctrines. Although Philodemus does not usually provide the philosophical underpinnings for his analysis or offer a defense of his own views, in On Epicurus he does, which makes this text, together with On Choices and Avoidances, unusual within Philodemus’ oeuvre.

d. Logic, Science, and Epistemology

Rather controversially, Epicurus argued that all sensations are true, and he posited that the sensations provide knowledge of the world. According to Epicurus (Letter to Herodotus 50), however, a process of judgment takes place about the information presented by the sensations. It is at this stage that it is possible to form false opinions. Epicurus was thus concerned to develop a theory of knowledge about sense perception, and he investigated the question of how the senses can tell us what is true or false in his work The Canon. “Canon” in Greek refers to a ruler or a yardstick, in this case a yardstick for assessing what is true or false.

Epicureans established three criteria to test whether an opinion is true or false: 1. the aisthēseis (“senses”); 2. the pathē (“feelings”); and 3. prolēpseis (“preconceptions”). There is also possibly a fourth criterion of truth, which is phantasikai epibolai tēs dianoias (“presentational applications of the mind”). These criteria of truth are based on the foundations of Epicurean physics, specifically its atomism, which argues that everything is made up of atoms and void. Atoms move in the void. This activity releases a stream of atoms, which are perceived by the senses. It is possible that Epicurus classed the mind together with the traditional five senses and that later Epicureans separated it out to create the fourth criterion of truth, “presentational applications of the mind.” The second criterion of truth, the pathē, plays a key role in Epicurean ethics. The pathē are the feelings of pleasure and pain, which guide all choices and avoidances. Repeated sensations, whether on the mind or the five senses, lead to prolēpseis, or preconceptions about general notions. These are used by Epicurus to solve the problem of infinite regress because they require no further proof or definition. When a concept is mentioned, a preconception is called to mind, and we conceive an imprint of the thing which has already been learnt by the senses. Through a process of analogy it is possible to form further ideas about different concepts.

On Sensations (PHerc. 19/698) touches upon Epicurean physics, and underlying the work’s theory on sensations are the following arguments: sensations are common to both the body and the soul; sensations do not have memory; the sensations are irrational; all sensations are true; and sensations can be explained by Epicurean atomic theory. However, despite the presence of Epicurean canonic claims, On Sensations is not a work of physics but one of epistemology. The initial part of the scroll was destroyed in the process of opening it, which meant that the title and author information were lost; however, based on authorial style, there is good reason to think that the work is by Philodemus. Likewise, content, style, handwriting, and papyrological features such as height suggest that PHerc. 19 and 698 belong to the same work. The work uses the difference between sight and touch to explore the Epicurean theory of sensations. It engages with the ideas of the school’s founders (Epicurus, Metrodorus, and Polyaenus), but it also introduces new formulations of traditional Epicurean arguments in the face of criticism from other schools. This is seen, for example, in the treatise’s arguments about the unity of sensation and its rejection of the Stoic idea of katalēpsis. These arguments are not known from any other source. Likewise, the treatise’s arguments about common sensitivities are attested only in this text.

It contains six major arguments. 1) Columns I to VII argue that there is only one sensible faculty, despite the variations that can be observed when something is perceived through sight and touch. 2) Columns IX to XVI focus on Epicurean arguments about apprehension (epaisthēsis) and “affection” (pāthos) in response to Stoic theories of apprehension (antilēpsis) and “grasping” (katalēpsis). The Stoic theory of katalēpsis is rejected in favor of the Epicurean one on the basis that apprehension and affection happen concomitantly. Epicurean pāthos thus refers to both the passive act of receiving and the knowledge that one is perceiving, that is to say objective reality and the affection of the perceiver. 3) Columns XVIII and XIX examine the relationship between time and sensation, showing that recollection of past events is not a trait of the senses. 4) In columns XX to XXVII, the treatise presents arguments about so-called “common sensitivities.” The argument seeks to demonstrate that the unique function of the individual senses can be maintained at the same time that there exists “common sense.” The columns contend that the different senses perceive the same form analogously and that the difference lies in the mode of perception. 5) The fifth argument (cols. XXVIII to XXIX) addresses the opposition between common sense and the individual senses. 6) The sixth part (cols. XXIX to XXXIV) critiques arguments made by other schools which attribute to the senses abilities that they do not possess, and it outlines exactly what each sense is capable of perceiving.

The Epicurean emphasis on sense perception raises questions about how it is possible to gain knowledge of objects and things that are not directly perceived by the senses, such as atoms, void, the gods, or a concept like justice. In On Signs (PHerc. 1065), Philodemus offers insight into Epicurean arguments on the topic of how to gain knowledge about imperceptibles (adēla) from evident things. The text is not complete, but the extant part can be divided into four sections. Section 1 criticizes the objections raised by an opponent (cols. Ia.1 to V.36) and provides Epicurean rebuttals to them (cols. XI.28 to XIX.4), with a further set of objections and replies between columns V and XI. Section 2 presents the arguments of an Epicurean, Bromius, a contemporary of Philodemus (cols. XIX.9 to XXVII.28). Section 3 gives the arguments of Demetrius Lacon (cols. XXVIII.13 to XXIX.16), a contemporary of Zeno’s whose arguments are another version of Zeno’s. Section 4 offers the perspective of an unnamed Epicurean (cols. XXIX.20 to XXXVIII.22).

The text focuses on the relationship between two phenomena: the sign and thing signified. It contrasts inference from signs with syllogistic reasoning (i.e. deduction). Philodemus argues that Epicurean inference from analogy or similarity is the only viable way to understand the relationship between two phenomena. In contrast with the method of starting with the consequent and using deduction to establish an a priori relationship between the consequent (the thing signified) and antecedent (the sign), the Epicurean theory of signs begins from the antecedent and posits an a posteriori relation between two phenomena that have similar essential qualities. The emphasis on an a posteriori connection is consistent with Epicurean empiricism, as is the method of validation, which is inconceivability (adianoesia). In an empiricist fashion, the starting point is always an observable phenomenon. If both the antecedent and its consequent are perceptible things, then they can be verified by a process of positive “attestation” (epimarturēsis) or proved false by “negative attestation” (ouk epimarturēsis). For example, when a person thinks that they see Plato approaching, but they are unsure because of the distance, it is attested that it is indeed Plato by observable phenomena once Plato comes closer. However, if it is not attested by observable phenomena, then the idea is proved false.

In the case of unobservable or non-perceptible phenomena, the process of verification is somewhat different. The starting point is still the perceptible object. However, because it is not possible to attest to something that is not empirically observable, the only means of verifying unobserved phenomena is “non-contestation” (ouk antimarturēsis). For example, the observable phenomenon of motion demonstrates the existence of void, because there must be space for bodies to move in. In this case, the empirically observable phenomenon of motion is the starting point of the inference from similarity about void. Moreover, the existence of motion does “not contest” the existence of void. If, on the other hand, the properties of the observable object contest (antimarturētai) those of the unobservable one, then the relationship is a false one.

On Signs also outlines a process of “critical appraisal” or “empirical reasoning” called epilogismos, a process used to infer the underlying properties of unobservable phenomena. For example, it is possible to critically appraise experiences of motion to discern certain properties of motion, which then allows the inference from analogy that void exists. The text also argues that it is possible to infer a phenomenon’s properties from similarity on the basis of the past experiences of humankind (historia) and not just direct experience.

e. Ethics

i. List of Ethical Works

The majority of works found in the library of the Villa of the Papyri are on Epicurean ethics. On Flattery (PHerc. 222, 223, 1082, 1675, and perhaps 1457), On Arrogance (PHerc. 1008), On Household Economics (PHerc. 1424), and On Greed (PHerc. 253) were written by the same scribal hand and constitute books of a multivolume work entitled On Vices and Their Opposing Virtues. On Slander (PHerc. Paris 2), On Beauty, and On Eros may also belong to this same larger work. On Frank Speech (PHerc. 1471) together with On Conversation (PHerc. 873), On Gratitude (PHerc. 1414), and perhaps On Wealth (PHerc. 163) belong to a second multivolume work On Characters and Types of Life. On Anger (PHerc. 182) is the best-preserved book of a larger work that probably dealt with the emotions (pathē). On Death (PHerc. 1050) preserves about a third of a 118-column treatise on the topic of death.

ii. General Background on Epicurean Ethics

Like the founders of other ancient schools of philosophy, Epicurus sought a definition of eudaimonia (“happiness,” “well-being”) that was unique to his own school, and he taught that pleasure is the best means of achieving happiness. However, Epicurus did not endorse sensual hedonism but “sober reasoning and searching for the grounds of every choice and avoidance and banishing the beliefs, from which the greatest tumult lay hold of the soul” (Epicurus Letter to Menoeceus 132). Thus Epicurean pleasure is not hedonistic but is the absence of pain (aponia) together with the resulting freedom from mental anxiety (ataraxia), the kind of pleasure that arises from the temporary satisfaction of a natural and necessary desire. He and his followers argued that if four basic principles were followed (that what is good is easy to get, that what is bad is easy to endure, and that neither the gods nor death should be feared), then eudaimonia could be gained.

The senses teach that pleasure is good and that pain is bad, and every decision should be referred to this. Central to Epicurean ethics is the notion of limit, and all pleasure and pain have a natural limit. It is, however, possible to vary the type of pleasure experienced through varying the things that bring pleasure. Later sources differentiate between these two ways of experiencing pleasure with the terms katastematic and kinetic.

Epicurus overtly linked desire to happiness. He divided desires into three categories: natural and necessary, natural and unnecessary, and unnatural and unnecessary. Natural desires aim at the attainment of pleasure and the avoidance of pain, while unnatural desires are based on empty beliefs about what causes pleasure and pain. Epicurus enjoins followers to assess desires on the basis of what would happen if they remain unsatisfied. If when unsatisfied they cause pain, then they are necessary. If they do not cause pain when unsatisfied, then they are unnecessary. A natural and unnecessary desire aims at some variation to pleasure, but if a desire results in an excess of pain over pleasure it becomes an unnatural and unnecessary desire.

iii. On Choices and Avoidances

The text On Choices and Avoidances (PHerc. 1251) presents many of the views just outlined. The text is incomplete, and the extant 23 columns preserve what was perhaps the peroration. Although the title and author information are no longer evident, statistical, paleographical, and stylistic reasons make it likely that Philodemus wrote this work. Further, the manner in which the author deals with topics is reminiscent of Philodemus’ other works. Philodemus himself refers to a work On Choices and Avoidances, and the subject matter of PHerc. 1251 fits with this theme. The treatise deals with the need to distinguish between different desires, pleasures, and their sources so that good choices can be made and bad ones avoided. It teaches that rational calculation is the best way to ensure a happy life, one lived in accordance with the principle that pleasure is good and pain is bad. Philodemus aims to show the utility of the tetrapharmakos (“fourfold remedy”), an easily memorized summary of four key Epicurean doctrines (do not fear the gods, do not fear death, what is good is easy to get, what is bad is easy to endure). The tetrapharmakos highlights the therapeutic role of Epicurean ethics, utilizing medical imagery to do so. Philosophy is presented as treating psychic disorders in the same way that medicines treat bodily illnesses. Philodemus uses the analogy of philosophy and medicine in other works, including On Frank Speech, while the emphasis on memorization is in keeping with Epicurus’ pedagogical strategy in his letters, in which he presents memorization as key to navigating everyday situations, stating that, regardless of a student’s level, knowledge of all Epicurean doctrines is necessary.

Philodemus demonstrates how application of the tetrapharmakos to fears of dying, superstition, the valuation of external goods, justice, illness, and the management of one’s life in general can have positive consequences. He argues (col. XIII.16) that it is necessary to draw ethical arguments from the study of nature in order for them to be complete. It is from nature that it is possible to learn that nothing is produced without cause. The treatise begins (cols. I to III) with views that do not accord with those of Epicurus, before moving on to the topic of limits (col. IV). The idea of limits is central to Epicurean ethics, which taught that both pleasure and pain are limited in duration. Philodemus summarizes those ideas here. An understanding of limits enables the easy removal of pain through the satisfaction of basic desires, which Philodemus addresses in columns V and VI. He mentions the difference between types of desires and presents the standard division of desires into three categories: natural and necessary, natural and unnecessary, and unnatural and unnecessary. However, these columns also present an innovation, perhaps in response to criticisms from outside the school: Philodemus makes natural the genus and necessary and unnecessary the species.

Having discussed the idea of limits, which applies to two elements of the tetrapharmakos (that what is good is easy to gain and what is bad is easy to endure because they are both limited), Philodemus moves on to criticizing superstitious fears (cols. VII to X) that run counter to the Epicurean view that the gods are blessed and immortal beings, unconcerned with the affairs of humans. He critiques the view of the gods as vengeful and omnipotent beings, and he examines the impact these misguided beliefs have on people’s behaviors: according to Philodemus, they make people irascible, ungrateful, hard-to-please, and ill-tempered. People who hold such beliefs bring innumerable misfortunes not only to themselves but also to their cities. In columns XI and XIII, Philodemus focuses directly on the cardinal tenets of Epicureanism as taught by nature, placing great emphasis on rational calculation based on the tetrapharmakos. He stresses the fact that it was Epicurus who correctly established the telos of philosophy. Column XII deals with civic and criminal law, which work on the basis that people are taught to fear punishment (either from the law or from the gods). This position runs counter to Epicurean contractarianism. Philodemus’ arguments against the view are no longer extant, but it is clear that it does not fit with the tetrapharmakos.

Column XIV offers a one-way entailment between virtues and pleasure, another departure from Epicurus, who held that the entailment was mutual. The column also continues with the theme of physics and its connection to ethics. The end of the column is fragmentary but concludes with a comment about desires, which leads into Philodemus’ discussion of external goods in column XV. The understanding of external goods, however, is thought to be of secondary importance to the learning of the cardinal tenets, and Philodemus dedicates only this small portion of the peroration to the topic.

Columns XVI to XX focus on the final element of the tetrapharmakos: the fear of death. Philodemus examines actions and attitudes that result from fearing death. As in the case of superstitious fears, Philodemus does not explicitly state the Epicurean argument that death should not be feared because once dead we cease to exist. He again focuses on the practical problems that arise from the fear of death, including behavioral issues (cols. XVII and XX), incompetence, especially with regard to financial administration (col. XX), interpersonal issues (col. XX), procrastination (col. XIX), and laissez-faire attitudes. He argues that it is stupid to wish to extend life but that it is equally stupid to want to give up (col. XVI). He presents the fear of death as causing people to give up philosophy (col. XVII) and as inhibiting the attainment of a better life (col. XVIII).

The extant portion of the treatise concludes (cols. XXI to XXIII) with a comprehensive image of the Epicurean sage. Sages do not amass money, but neither do they neglect their finances. Instead, they apply the tetrapharmakos to all financial decisions. They are generous and kind to others, showing gratitude when the same attitudes are shown to them. They do not fear death, and thus always cultivate new relationships and interests. Even though they do not fear death, they never seek it and always maintain their health.

iv. On Death

Philodemus’ On Death (PHerc. 1050) appears to have a much wider audience in mind. Throughout the treatise, Philodemus shows the ways that Epicurean philosophy can help combat common fears relating to death. He deals with a range of topics including the fact that the dead lack sensation (col. I) and the fact that a long amount of time gives as much pleasure as a short amount of time (col. III). This latter idea is revisited by Philodemus frequently throughout, and he stresses that a person’s conduct during their lifetime, regardless of how long or short that may be, is more important than how they die or whether they are remembered after death. For example, going unburied is not a problem except that it demonstrates a lack of friends, and having no friends while alive is unfortunate (col. XXXI). Or, a death sentence is sad if someone is guilty, because they have lived a life of pain. If someone is unjustly sentenced to death, the quality of their life is what is important, not the manner of their death (col. XXXIV). Thus, a good person can take pleasure from knowing that his death will be regretted by other good people, but he will not be concerned with whether or not enemies gloat over his death (cols. XX to XXI). To do so is irrational because one will be dead and therefore unconscious. Likewise, Philodemus has no sympathy for people who fear dying in bed rather than battle, because once again posthumous glory is irrelevant when one will no longer exist. He acknowledges that it is sad to die young, but only if it has prevented someone from attaining a certain level of philosophy (col. XVII). Other topics Philodemus addresses are the lack of good things that accompany being dead (col. II), leaving behind family members who are dependents (col. XXV), dying childless (cols. XXII to XXIV), dying away from one’s fatherland (cols. XXV to XXVI), dying in poor physical condition (col. XXIX), and death at sea (col. XXXII). In most cases, Philodemus shows that these are not legitimate fears based on the Epicurean argument that sensation is dependent on the soul’s unity with the body; once one is dead, the two both cease to exist and all sensation is lost. Yet in the case of leaving dependents in a vulnerable position, Philodemus shows great sympathy and exhorts readers to make proper arrangements to avoid this situation.

The tone of On Death is far less harsh than Philodemus’ usual style. He remonstrates with other philosophers gently and uses sympathetic language to discuss non-Epicurean fears of death. For example, in columns VII and VIII, Philodemus uses a protreptic style to persuade readers of the advantages of the Epicurean view over that of the Stoic Apollophanes. Apollophanes appears to have argued that death is accompanied by pain because atoms cannot easily separate themselves from the soul. Rather than offering a harsh or sarcastic response, Philodemus clearly and concisely explains the Epicurean position that there is no pain because atoms are very small, very smooth, and very round, which allows them to painlessly fly through the skin’s pores at death.

v. On Household Economics and On Wealth

Two of Philodemus’ treatises examine the question of finances. On Wealth (PHerc. 163) is poorly preserved, but in what remains it seems that Philodemus argued that wealth and poverty are in themselves neither good nor evil. He dismisses the Cynic view that poverty is a good, the Stoic position that only virtue is important, and the popular view that wealth is evil. He instead presents the Epicurean position that wealth is only needed in moderation, which relates to the idea that natural wealth is both easy to attain and limited.

On Household Economics (PHerc. 1424) is particularly well-preserved, and Philodemus’ arguments are likewise extremely clear. The text focuses on Epicurean money management, and Philodemus is concerned with the question of how to acquire and maintain money in a way that does not inhibit pleasure. Part of the treatise critiques the views of Xenophon (fragments II, 2, cols. A to VII) and Theophrastus (cols. VII to XII). Philodemus takes issue with the fact that Socrates in Xenophon’s work does not use everyday meanings of terms, that his arguments are ambiguous, and that he is frequently irrational. He accuses both of assigning too much importance to the role of wives (cols. II and IX) and of including irrelevant details that are not needed for managing home finances effectively. However, he does not dismiss their views out of hand, and says that it is best to borrow from others if their theories are useful (col. XXVII).

In the work’s second part (cols. XXII to XXVIII), Philodemus defends the Epicurean position on money management, and he focuses on the correct attitude toward the acquisition and maintenance of wealth. He shows that wealth is not inherently problematic but that it is the attitude of the person administering it that can give rise to problems (col. XXIV). He recognizes that it is often necessary for philosophers to work (col. XI), and against the Cynics, he argues that the sage’s attitude to wealth is that having some is better than none (cols. XII and XV). In fact, he argues that, although many things cause pain when present, they cause even more pain when absent (cols. XII to XIII). However, he stresses that sages will not be bound by excessive toils to attain it (cols. XI, XV and XVIII). Labor is problematic because it is often driven by the desire for unnatural and unnecessary wealth (col. XVI). Unlimited wealth is not worth the trouble it takes to acquire, but sages should not be so leisured that they cannot provide for themselves (col. XVI). In keeping with the central place of friendship in Epicurean circles, Philodemus cites having friends as essential to the maintenance and acquisition of wealth: he argues that they help increase wealth (cols. XIV to XXV). He recommends giving to friends in times of prosperity and need (col. XXVI). In times of adversity, he also acknowledges that it may be necessary to set aside the practice of philosophy, writing that it is still possible to enact one’s philosophical principles by putting the needs of one’s friends before one’s own.

In short, Philodemus offers advice on how to apply the hedonic calculus to financial management, advocating that all wealth be acquired and maintained in a way that does not require excessive labor or mental stress. His list of best and worst jobs in columns XXII and XXIII is based on his argument that when undertaking activities for making money and maintaining one’s existing possessions, it is necessary to (col. XXIII.39-42) “keep in mind that the principal [activity] consists in managing one’s desires and fears.” On this basis, military and political activities are the worst ways of making a living, closely followed by the art of horsemanship, which he labels ridiculous, and mining. He calls mining with one’s own hands mad and mining through the use of slaves unfortunate. He writes that farming the land oneself is miserable. These jobs all require too much labor and provide insufficient pleasure in return. He deems owning land that is farmed by slaves acceptable on the basis that it creates opportunities for philosophical discussions amongst friends. Renting out properties and owning skilled slaves are likewise acceptable, for they leave time for philosophy. However, the best way of earning a living is from the practice of philosophy. Philodemus’ recommendation to earn money from philosophy is the first appearance of this idea in Greek literature.

vi. On Anger

On Anger (PHerc. 182) provides important evidence for Epicurean emotional theory. The Epicureans held that emotions are cognitive, because they are connected to beliefs, which, together with a person’s atomic makeup and environment, shape that person’s disposition (diathesis). On the basis that emotions are in part caused by beliefs, Epicureans held that it is possible to cure someone’s negative emotions by altering their core beliefs, a view in keeping with a curative approach to ethics. In On Anger, Philodemus presents (col. XXXVII.17-32) the school’s theory of the emotions as midway between that of the Stoics and Peripatetics. Unlike the Stoics, Philodemus regards emotions as a natural part of human nature, and he says that feeling them is an inevitable part of being human. They must, however, be regulated. In contrast with the Peripatetics, who argued that emotions are good if they are controlled by reason, Philodemus does not think emotions per se are good, because the only good for Epicureans is pleasure. Moreover, Philodemus regards the disposition of the person experiencing the emotion as of utmost importance, and so an emotion can be good if the person feeling it has a good disposition, as the Epicurean sage would. If the person feeling an emotion has a bad disposition, then the emotion itself will be bad because they hold mistaken beliefs about its cause.

In On Anger, Philodemus links emotions to desires, and emotions are an evaluative response to a situation (col. XXXVII.32-39). Philodemus thinks such responses result from a person’s beliefs, in the sense that a person will respond emotionally to a situation depending on whether they believe their desires have or have not been met. In the case of anger, a person will feel angry if they perceive a desire to have been thwarted in some way. Yet, because emotions and desires are linked for Philodemus and desires are divided into natural and empty, so too are emotions (cols. XXXVII.39-XXXVIII.10). He stipulates that anger is natural and necessary only if the anger is caused by an intentional harm to a person’s natural and necessary desires, for instance their health, life, or happiness. The person who experiences natural and necessary anger will have a good disposition. This sort of anger is of limited duration. Empty anger, on the other hand, is experienced by someone with a bad disposition and is caused when someone’s unnatural and unnecessary desires are harmed. A further difference between those who experience the two types of anger relates to punishment, and Philodemus argues that a person experiencing natural and necessary anger will never enjoy punishment (col. XLIV.17-20). They will only use it as a means to prevent further instances of harm.

vii. On Frank Speech

Philodemus’ On Frank Speech (PHerc. 1471), which comprises his notes from a lecture of Zeno’s on the topic, provides insight into the key therapeutic technique of the Epicurean school. Parrēsia (“frank speech”) was used to cure students of ethical flaws, but it was also a guideline for interpersonal relationships between sages. Its value lies in its recognition that students learn in a variety of ways, which is reflected in teachers’ adapting their style of criticism to how their students respond to criticism and to their educational needs. So, for example, Philodemus distinguishes students who have strong personalities from those who are tender (fr. 7.1-5). Other personality types that Philodemus examines include irascible people (fr. 68-74). He also states that the practitioner of frank speech must take into account a number of variables, such as whether or not the person is thankful to receive good will (frs. 75-80, fr. 88, col. XXIXb); gender (cols. XXIb.12-XXIIb.9); social status (see particularly cols. XXIIb.10-XXIVa.7); and age (col. XXIVa.7-XXIVb.12). His main focus is on how to vary the style of criticism depending on the student’s disposition.

Throughout the treatise, Philodemus uses sustained medical imagery, employing the language of diseases and curing to discuss the treatment of ethical flaws. Philosophers are thus like doctors who prescribe medicine (i.e. Epicurean doctrine) to cure the soul. In this Philodemus is influenced by Epicurus, who had begun the tradition of likening the Epicurean wise man’s role as a healer of the soul to that of the doctor who heals physical ailments. A key element of Philodemus’ medical imagery is the self-diagnosis of the student, who must first recognize their character flaws before they can be successfully treated.

In addition to helping cure students, frank speech was an integral feature of Epicurean friendship. Friends in an Epicurean community could use it to overcome fears relating to death and the gods. For Philodemus, frank speech within an Epicurean community is key for generating goodwill (col. Va.3-10) and gratitude.

Two related treatises, On Conversation (PHerc. 873) and On Gratitude (PHerc. 1414), touch on similar themes. On Conversation examines the social settings of different types of speech, the usefulness of staying silent, and contemplation. On Gratitude, like On Frank Speech, argues that gratitude is an essential element of Epicurean friendship.

f. Theology

The cornerstone of Epicurean theology is the prolēpsis (“preconception”) of the gods as blessed and immortal beings, unconcerned with the affairs of humans. The school’s insistence on the gods’ lack of interference, either positive or negative, in the lives of humans led to the charge of atheism, a charge from which Philodemus vigorously defends the school in On Piety (PHerc. 1077/1098). In this work, Philodemus devotes one part to cataloguing the views of other philosophers and poets on the gods, and he attacks the Stoics’ praise of them as authorities. In part 2, he provides evidence that Epicurus and his followers believed in the gods, focusing specifically on their participation in public ritual. He also cites their avoidance of political and social persecution as further proof that they are not atheists. The main theme of the text is that incorrect views about the nature of the gods lead to a range of psychological, social, and political problems, including social unrest and violence.

The work belongs to broader ancient debates about the nature of the gods, a point acknowledged by Philodemus, who comments that although most people recognize the existence of the gods, their exact nature is not generally agreed on (col. LXVI.9-16). In addition to setting forth the traditional Epicurean view of the gods (cols. XL.9-26 and XLVI.1-11), who act as role models for Epicurean sages (col. LXXI.12-19), Philodemus also argues that participation in public ritual is an essential part of promoting social cohesion (col. XXVI.25-6) and that Epicurus and his followers took part for natural and social reasons (col. XXVI.5-12). He further argues that it helps to bring people closer to the gods (col. XXVII.12-9). Also of interest to Philodemus is the relationship between piety and justice, and he presents the two as linked (col. LXXVIII.8-12). He argues that a person who is pious in the Epicurean sense (i.e. who holds a correct prolēpsis of the gods) will abide by natural justice, which is a contract to avoid harming each other. The role of religion in human history is a further point of examination, and Philodemus argues that the belief that gods play an active role in human affairs was propagated as a means of social control. He states that early humans correctly recognized that the gods are insusceptible to harm, but that at some point people, for their own ends, ascribed to them myths that instilled fear in men (cols. VIII.23-29 and LXXV.1-24). He catalogues this development in a number of columns and, in the process, he conveys the message that traditional religion is a political tool.

In addition to the Epicurean belief that the gods do not play a role in human affairs, Epicurean atomistic views were a further cause for charges of atheism. These views held that everything is composed of atoms and that all atomic compounds are destructible except for the gods, who are indestructible for two reasons: 1) they can be topped up with atoms from external matter, and 2) they are composed of a material that allows atoms to pass through them. There has been some scholarly debate as to whether Epicureans held an idealist or a realist view of the gods. If they held an idealist view of the gods, then this meant that the gods were thought constructs, which could not be perceived by the senses. Instead, people had an innate knowledge of them. If Epicureans held a realist view of the gods, then they thought the gods were real beings that emit eidōla (“effluences” emitted by compounds of atoms).

Philodemus clearly thinks that the gods are real beings. In On the Gods III (PHerc. 157/152), he discusses the unique corporeal nature of the gods (frs. 5-13). He examines friendship among the gods (frs. 82-85, 87, 89), where the gods live (cols. VIII-X), how they move (cols. X-XI), whether or not they have furniture and instruments (col. XI), whether or not they sleep (col. XI), and the fact that they speak Greek (col. XIII). Philodemus also addresses the issue of how wrong views of the gods cause fear, including fear of the future. He reiterates the orthodox Epicurean position that the gods are not omnipotent, saying that they only have control over themselves. Likewise, he defends the Epicurean positions that any liability to pain would destroy their happiness and that the gods act as behavioral ideals.

The main theme of On the Gods I (PHerc. 26) is that a false belief about the nature of the gods, and the connected fear of death, is a major stumbling block to the ataraxia needed for Epicurean pleasure. The early columns of the text, although very poorly preserved, appear to target a group of fellow Epicureans who have wavered on the central position that the gods do not interfere in human affairs (col. I). Philodemus puts forward the orthodox Epicurean belief that the gods are eternally happy, immortal beings whose very nature stops their involvement in human affairs, because such involvement would upset their tranquility (col. II.9-15). The better-preserved portion of the treatise addresses two main questions. The first (cols. X-XV) is whether humans or animals experience worse mental disturbance (tarachē). Philodemus denies the commonly held view that animals are happier because they do not believe in the gods. Instead, says Philodemus, they are unhappier, because, unlike humans who possess reason, they can never reason their way to a happier state of being. The second question (cols. XVII-XXIV) is whether fear of the gods or fear of death is worse. To this, Philodemus suggests that both fears are equally bad, because they are closely connected: people usually fear death because they fear punishment by the gods after death. He argues against both fears on two fronts. Firstly, he says that if you eradicate the false notion that the gods will harm you after death by realizing that they cause neither pleasure nor pain, then the fear of death will also stop. Secondly, he writes that you will cease to fear death if you understand the Epicurean view that death is final and that you will feel nothing once you have died.

g. Aesthetics

Ancient critics of Epicurus were fond of depicting him as anti-intellectual. In so doing, they could point to Epicurus’ own statements that paideia, the main system of liberal arts education in the Hellenistic period, held no value for the aspiring philosopher. In reality, Epicurus’ statements on the topic were more nuanced, and Philodemus’ discussions of rhetoric, poetry, and music make this clear. Although little evidence remains for the views of Epicurus or his successors on these topics, it is almost certain that they wrote on them and that Philodemus’ own works engage with their views. Yet these extant Herculaneum treatises do not just show a later Epicurean’s ability to clarify the viewpoints of the founders; they also offer further demonstration of the school’s ability to respond to contemporary debates and discourses. In three separate works, On Rhetoric (book 1 PHerc. 1427; book 2 PHerc. 1674/1672; book 3 PHerc. 1426, first draft 1506; book 4 PHerc. 1423, 1007/1673; book 8 PHerc. 832/1015; book 9 PHerc. 1004; book 10 PHerc. 1669), On Poems (book 1 PHerc. 466, 444, 1073, 1074a, 1081a; book 2 PHerc. 1074b, 1677a, 1081b, 1676, 994; book 3 PHerc. 1087, 1403, 1113a; book 4 PHerc. 207; book 5 PHerc. 1581, 403, 407, 228, 1425, 1538), and On Music (PHerc. 1497), Philodemus presents different ancient attitudes towards these areas. Although these works are heavily polemical, it is possible to reconstruct Philodemus’ own arguments on aesthetic theory.

Epicurean epistemology and physics form the basis of Philodemus’ theory, and he holds that sensory organs cannot make judgments about rhetoric, poetry, and music because they are irrational. Likewise, the pleasure brought about by speaking, poetry, and music is irrational. A speech, a poem, or a piece of music is judged by dianoia (“thought”). Also underlying Philodemus’ discussion of aesthetics is a theory of art or technē. The technai were an integral part of paideia, and Philodemus’ theory of art engages with broader debates about what constitutes the arts or an art. For Philodemus, an art is a skill that can be taught methodically and that results in a particular atomic arrangement that affects an individual’s diathesis (“disposition”). This in turn makes the person practicing the art more effective than someone who has not had the same training. In brief, Philodemus defines a technē as the practical knowledge of a set of rules and principles. It involves training, skill, and a certain disposition. The result should be something that is not obtainable by an untrained novice. On the basis of this definition, Philodemus argues that sophistic rhetoric, but not political or forensic rhetoric, is an art.

In On Rhetoric, Philodemus argues, in keeping with his teacher Zeno’s position, that only sophistic rhetoric, which he says is the art of writing speeches and composing display pieces (II.23.33-24.33), is an art, but that political and forensic rhetoric are not. This position rests on the fact that sophistic rhetors have greater success than political or forensic orators at accomplishing their goal of giving good speeches. Sophistic rhetoric is, moreover, something that can be taught because it follows a methodology. The work begins in book 1 with a discussion of different views on the technicity of rhetoric. Philodemus cites the views of non-Epicureans as well as a group of Rhodians who held that no rhetoric could be considered an art. Philodemus presents all of these views as contrary to the school’s founders. Book 2 continues with a polemic concerning the technicity of rhetoric but also offers a defense of Zeno’s view that sophistic rhetoric is an art. He discusses the difference between exact arts (grammar, music, poetry, and painting) and conjectural arts (piloting a ship, medicine). Book 3 argues against the Stoic Diogenes of Babylon on the relationship between rhetoric, philosophy, and politics, and Philodemus says that sophistic rhetoric cannot produce politicians. Book 4 focuses on rhetorical style, and Philodemus privileges style and delivery over arrangement and invention. In contrast to Cicero, who highlights the role of the orator and privileges practical rhetoric by arguing that all other arts serve oratory (On Oratory 2.2.5 and 3.19.72), Philodemus presents a range of other disciplines as supporting oratory. Book 8 assesses and dismisses the theory of Nausiphanes that natural philosophy creates good speakers. It also attacks Aristotle for giving politics a prominent place in philosophy. Book 9 examines the utility of rhetoric, and book 10 treats other views that rhetoric is more useful than philosophy.

On Poems engages with many similar themes to On Rhetoric. In On Rhetoric, Philodemus examines the questions “what is rhetoric?” and “is it an art?” In On Poems he asks “what is a good poem?” He presents poetry as an art, specifically the art of writing a good poem. Poetry is also an art because poets follow a methodology that can be taught and learned, with the latter meaning that the learner’s atomic disposition is affected by the process. In keeping with Epicurus and the other founders’ views, Philodemus holds that poems have no educational value and that they offer neither knowledge nor ethics. Neither does poetry have any utility; this is the preserve of prose. Philodemus, however, is predominantly interested in the aesthetic question of what makes a poem good. His answer is that a good poem is a mixture of form and content, where form refers to versified words and content refers to the thoughts of the poem. The form is specific to poetry, in the sense that the poet is the only artist to write in meter. Form and content are mutually dependent: the content of a poem cannot be expressed without words, but equally words are meaningless without content, which is a poem’s subject matter. In this Philodemus adheres to the Epicurean theory of language, which holds that words, as opposed to sounds devoid of meaning, involve reasoning (epilogismos). A good poem, then, is good based on its artful composition and its content, although that content will be neither useful nor moral. Moreover, a poet whose disposition has been transformed by training in the art of poetry will more successfully compose a poem than an untrained individual, although Philodemus does not regard a poem’s genre as important: a poem of any genre can be good. A good poem will also generate further thoughts in the audience. Philodemus thus judges poetry purely on its entertainment value and a good poem rests on the poet’s ability. Only philosophy written in prose can argue a point. 
Poetry, however, is not harmful, especially to Epicureans who hold correct opinions and can thus read a poem for pleasure without being influenced by any incorrect information. Moreover, a sage can be a poet, so long as they use technē to achieve the proper goal of writing a good poem and so long as the writing of poetry is subordinate to their philosophical goals.

On Poems follows Philodemus’ usual habit of argumentation, and it is a polemical work, in which he does not put forward a positive view. Books 1 and 2 heavily criticize euphonists, who argued that sound gives poetry value. Due to the Epicurean view that the senses are irrational, Philodemus strenuously argued against euphony. Book 3 discusses the relationship between euphony and meaning, and the difference between poetic and prosaic words. Book 4 examines the question of genre, while book 5 looks at how poetry actually works and considers the evaluation of a poem’s quality.

Of the three arts, Philodemus is the most ambivalent about music, probably because its aural nature is difficult to reconcile with Epicurean views that the senses are irrational. He recognizes that music can be pleasing. However, unlike poetry, which uses words to convey thoughts, music cannot communicate. Philodemus’ main target is the Stoic Diogenes of Babylon, who argued that music can teach virtues. In contrast, Philodemus argues that the pleasure of listening to music can distract the listener from the content of any accompanying lyrics. Music, like poetry, is a natural but unnecessary pleasure.

5. Influence and Legacy

Philodemus’ philosophical influence was minimal either due to the lack of circulation of his work or due to the Epicurean school’s orthodoxy, which tended to look back to the school’s founders. It seems reasonably certain that Philodemus’ On Frank Criticism influenced Horace’s Satires and perhaps Horace’s interest in Epicureanism more broadly. On Piety may have influenced the structure of Cicero’s On the Nature of the Gods, although it is also possible that they both had a common source. The situation with Sextus Empiricus’ discussion of paideia in Against the Mathematicians 1-6 is similar, and it seems clear that either Philodemus was a source for Sextus or that the two authors shared the same source material. Cicero cites Philodemus, together with a fellow Epicurean Siro, as authorities in On Moral Ends (2.119). The only direct reference to one of Philodemus’ works is by Diogenes Laertius (10.3), who refers to his compilation on the history of philosophy. The influence of Philodemus’ epigram 23 on Catullus 13 is clear. There has been some discussion of his poetic theory’s influence on Augustan poets, especially on their interest in highly-wrought poetic styles.

6. References and Further Reading

a. Primary Sources

There is no single edition containing the full collection of Philodemus’ works. Here is a list of revised editions of the original Greek texts, accompanied by introductory discussions that outline the work’s content, the history of its papyrus, and a commentary. The list is not complete, but it does offer the majority of editions.

  • Amoroso, Filippo. “Filodemo sulla conversazione.” Cronache Ercolanesi, vol. 5, 1975, pp. 63-76.
  • Angeli, Anna. Agli amici di scuola. Bibliopolis, 1988.
  • Capasso, Mario. “L’intellettuale e il suo re (Filodemo, L’adulazione, Pherc. 1675, Col. V 21-31).” Studi di egittologia e di papirologia, vol. 2, 2004, pp. 47-52.
  • Chandler, Clive. Philodemus on Rhetoric. Books 1 and 2: Translation and Exegetical Essays. Routledge, 2006.
  • De Lacy, Phillip, and Estelle Allen De Lacy. Philodemus: On Methods of Inference. Bibliopolis, 1978.
  • Del Mastro, Gianluca. “Il Pherc. 1004: Filodemo, De rhetorica VII.” Zeitschrift für Papyrologie und Epigraphik, vol. 182, 2012, pp. 131-133.
  • Diels, Hermann. Philodemos Über die Götter. Erstes Buch. Verlag der Königl. Akademie der Wissenschaften, 1916.
  • Diels, Hermann. Philodemos Über die Götter. Drittes Buch. Verlag der Königl. Akademie der Wissenschaften, 1917.
  • Dorandi, Tiziano. “Filodemo, Gli Stoici (Pherc. 155 e 339).” Cronache Ercolanesi, vol. 12, 1982, pp. 91-133.
  • Dorandi, Tiziano. Storia dei filosofi: Platone e l’academia. Bibliopolis, 1991.
  • Dorandi, Tiziano. Filodemo, Storia dei filosofi. La Stoà da Zenone a Panezio. Brill, 1994.
  • Dorandi, Tiziano, and Emidio Spinelli. “Un libro di Filodemo sull’avarizia?” Cronache Ercolanesi, vol. 20, 1990, pp. 53-59.
  • Essler, Holger. “Un nuovo frammento di Ermarco nel PHerc. 152/157 (Filodemo, De dis, libro III).” Cronache Ercolanesi, vol. 35, 2005, pp. 53-59.
  • Essler, Holger. “Falsche Götter bei Philodem (DI III KOL. 8,5-KOL. 10,6.)” Cronache Ercolanesi, vol. 39, 2009, pp. 161-205.
  • Fish, Jeffrey. “Philodemus, De Bono Rege Secundum Homerum: A Critical Text with Commentary (Cols. 21-39).” University of Texas at Austin, 1999.
  • Fish, Jeffrey. “Philodemus’ on the Good King According to Homer, Columns 21-31.” Cronache Ercolanesi, vol. 32, 2002, pp. 187-232.
  • Fish, Jeffrey. “The Closing Columns of Philodemus’ on the Good King According to Homer, Pherc. 1507, Cols. 95-98 (= Cols. 40-43 Dorandi).” Cronache Ercolanesi, vol. 46, 2016, pp. 55-81.
  • Gargiulo, Tristano. “Pherc. 222: Filodemo sull’adulazione.” Cronache Ercolanesi, vol. 11, 1981, pp. 103-127.
  • Giuliano, Fabio Massimo. “Pherc. 495-Pherc. 558 (Filodemo, Storia Di Socrate E Della Sua Scuola?): Edizione, commento, questioni compositive e attributive.” Cronache Ercolanesi, vol. 31, 2001, pp. 37-79.
  • Guerra, Tepedino A. “Filodemo sulla gratitudine.” Cronache Ercolanesi, vol. 7, 1977, pp. 93-113.
  • Guerra, Tepedino A. “Il primo libro ‘Sulla Ricchezza’ di Filodemo.” Cronache Ercolanesi, vol. 8, 1978, pp. 52-95.
  • Guerra, Tepedino A. “Il Pherc. 1678: Filodemo Sull’invidia?” Cronache Ercolanesi, vol. 15, 1985, pp. 113-125.
  • Hammerstaedt, J. “Der Schlußteil Von Philodems Drittem Buch Über Rhetorik.” Cronache Ercolanesi, vol. 22, 1992, pp. 9-117.
  • Henry, W. Benjamin. Philodemus, On Death. Society of Biblical Literature, 2009.
  • Indelli, Giovanni. L’ira. Bibliopolis, 1988.
  • Indelli, Giovanni, and Voula Tsouna-McKirahan. [Philodemus], [On Choices and Avoidances]. Bibliopolis, 1995.
  • Janko, Richard. Philodemus, On Poems. Oxford University Press, 2000.
  • Janko, Richard. Philodemus, On Poems, Books 3-4, with the Fragments of Aristotle, on Poets. Oxford University Press, 2010.
  • Jensen, Christian Cornelius. Peri Kakion Liber Decimus. Teubner, 1911.
  • Konstan, David, et al. Philodemus, On Frank Criticism. Society of Biblical Literature, 1998.
  • Longo Auricchio, Francesca. “Frammenti inediti di un libro della ‘Retorica’ di Filodemo (Pherc. 463).” Cronache Ercolanesi, vol. 12, 1982, pp. 67-83.
  • Méndez, Acosta E., and Anna Angeli. Filodemo. Testimonianze su Socrate. Bibliopolis, 1992.
  • Militello, Cesira. Memorie Epicuree. Bibliopolis, 1997.
  • Monet, Annick. “[Philodème, Sur les sensations] Pherc. 19/698.” Cronache Ercolanesi, vol. 26, 1996, pp. 27-126.
  • Obbink, Dirk. Philodemus, On Piety Part 1. Oxford University Press, 1996.
  • Olivieri, Alessandro. Philodemi Peri Tou Kath’ Omeron Agathou Basileôs Libellus. Teubner, 1909.
  • Scott, Walter. Fragmenta Herculanensia: A Descriptive Catalogue of the Oxford Copies of the Herculaneum Rolls Together with the Texts of Several Papyri Accompanied by Facsimiles. Clarendon Press, 1885.
  • Sider, David. The Epigrams of Philodemos: Introduction, Text, and Commentary. Oxford University Press, 1997.
  • Sudhaus, Siegfried. Philodemi volumina rhetorica. Teubner, 1892-1896.
  • Tsouna, Voula. Philodemus, On Property Management. Society of Biblical Literature, 2012.

b. Secondary Sources

  • Annas, Julia. “Epicurean Emotions.” Greek, Roman, and Byzantine Studies, vol. 30, no. 2, 1989, pp. 145-164.
    • Annas shows the usefulness of Philodemus’ On Anger for reconstructing Epicurean emotional theory.
  • Armstrong, David, et al. Vergil, Philodemus, and the Augustans. University of Texas Press, 2004.
    • An edited collection that seeks connections between Philodemus’ works and Augustan poets, especially Vergil.
  • Asmis, Elizabeth. “Philodemus’s Poetic Theory and ‘On the Good King According to Homer’.” Classical Antiquity, vol. 10, no. 1, 1991, pp. 1-45.
    • Asmis argues that Philodemus presents poetry as having no utility, i.e. the art of writing poetry has no utility. Instead, any utility poetry may have comes from the wise man’s ability to interpret it.
  • Auvray-Assayas, Clara, and Daniel Delattre. Cicéron Et Philodème. La Polémique En Philosophie. Éditions Rue d’Ulm, 2001.
    • This edited collection provides discussion on Philodemus’ ethical, theological, and aesthetic treatises.
  • Erler, Michael. “Der Zorn Des Helden. Philodems ‘De Ira’ Und Vergils Konzept Des Zorns in Der ‘Aeneis’.” Grazer Beiträge, vol. 18, 1992, pp. 103-126.
    • Erler shows the connections between Philodemus’ theory of anger and Vergil’s Aeneid.
  • Fish, Jeffrey, and Kirk R. Sanders. Epicurus and the Epicurean Tradition. Cambridge University Press, 2011.
    • This edited collection covers a range of topics using historical, philosophical, and literary approaches. It is not a work principally focused on Philodemus, but he is utilized as a source in each chapter and some chapters are specifically focused on him. There are chapters on Epicurean pedagogy, theology, political theory, and emotions.
  • Fitzgerald, John T., et al. Philodemus and the New Testament World. Brill, 2004.
    • An edited collection on the themes of frank criticism, rhetoric, and economics by classicists and New Testament scholars.
  • Giannantoni, Gabriele, and Marcello Gigante. Epicureismo Greco e Romano: Atti del congresso internazionale, Napoli, 19-26 Maggio 1993. Bibliopolis, 1996.
    • This edited collection is not specifically on Philodemus, but it offers papers from scholars who have worked on the Herculaneum papyri and there are specific chapters on Philodemus.
  • Gigante, Marcello. Philodemus in Italy: The Books from Herculaneum. Translated by Dirk Obbink. The University of Michigan Press, 1995.
    • One of the rare monographs on Philodemus, Gigante reconstructs details about Philodemus’ life, provides background information about the excavations at Herculaneum and attitudes towards Philodemus, outlines the content of Philodemus’ works alongside the state of the texts, and discusses Piso and Philodemus’ relationship.
  • Monet, Annick. Le Jardin Romain: Épicurisme et Poésie à Rome. Presses de l’Université Charles-de-Gaulle, 2003.
    • This is an edited collection that interprets Philodemus’ works, along with Lucretius’ De rerum natura, within a Roman context, approaching the topic in a variety of ways ranging from finding possible direct connections between Philodemus, Lucretius, and Cicero to looking at the influences of Philodemus and Lucretius on later sources.
  • Obbink, Dirk. Philodemus and Poetry: Poetic Theory and Practice in Lucretius, Philodemus, and Horace. Oxford University Press, 1995.
    • This edited collection clarifies Philodemus’ definition of art and poetry, and it shows the importance of Philodemus’ contribution to poetic theory.
  • Tsouna, Voula. The Ethics of Philodemus. Oxford University Press, 2007.
    • Tsouna offers a philosophical discussion of Philodemus’ ethical treatises and provides useful information about the condition of his ethical works.

Author Information

Sonya Wurster
Email: swurster@unimelb.edu.au
The University of Melbourne
Australia

The Frankfurt School and Critical Theory

The Frankfurt School, known more appropriately as Critical Theory, is a philosophical and sociological movement spread across many universities around the world. It was originally located at the Institute for Social Research (Institut für Sozialforschung), an attached institute at the Goethe University in Frankfurt, Germany. The Institute was founded in 1923 thanks to a donation by Felix Weil with the aim of developing Marxist studies in Germany. After 1933, the Nazis forced its closure, and the Institute was moved to the United States where it found hospitality at Columbia University in New York City.

The academic influence of the critical method is far reaching. Some of the key issues and philosophical preoccupations of the School involve the critique of modernity and capitalist society, the definition of social emancipation, as well as the detection of the pathologies of society. Critical Theory provides a specific interpretation of Marxist philosophy with regard to some of its central economic and political notions, such as commodification, reification, fetishization, and the critique of mass culture.

Some of the most prominent figures of the first generation of Critical Theorists were Max Horkheimer (1895-1973), Theodor Adorno (1903-1969), Herbert Marcuse (1898-1979), Walter Benjamin (1892-1940), Friedrich Pollock (1894-1970), Leo Lowenthal (1900-1993), and Erich Fromm (1900-1980). In the 1970s, a second generation began with Jürgen Habermas, who, among other merits, contributed to the opening of a dialogue between the so-called continental and analytic traditions. With Habermas, the Frankfurt School turned global, influencing methodological approaches in other European academic contexts and disciplines. It was during this phase that Richard Bernstein, a philosopher and contemporary of Habermas, embraced the research agenda of Critical Theory and significantly helped its development in American universities, starting from the New School for Social Research in New York.

The third generation of critical theorists arose either from Habermas’ research students in the United States and at Frankfurt am Main and Starnberg (1971-1982), or from a spontaneous convergence of independently educated scholars. The third generation of Critical Theory scholars therefore consists of two groups. The first spans a broad period, which makes it impossible to establish any sharp boundaries. It can be said to include scholars such as Andrew Feenberg, even though he was a direct student of Marcuse, and people such as Albrecht Wellmer, who became an assistant of Habermas due to the premature death of Adorno in 1969. Klaus Offe, Josef Früchtl, Hauke Brunkhorst, Klaus Günther, Axel Honneth, Alessandro Ferrara, Cristina Lafont, and Rainer Forst, among others, are also members of this group. The second group of the third generation is instead composed mostly of American scholars who were influenced by Habermas’ philosophy during his visits to the United States.

Table of Contents

  1. Critical Theory: Historical and Philosophical Background
  2. What is Critical Theory?
    1. Traditional and Critical Theory: Ideology and Critique
    2. The Theory/Practice Problem
    3. The Idea of Rationality: Critical Theory and its Discontents
  3. Concluding Thoughts
  4. References and Further Reading

1. Critical Theory: Historical and Philosophical Background

Felix Weil’s father, Hermann, made his fortune by exporting grain from Argentina to Europe. In 1923, Felix decided to use his father’s money to found an institute specifically devoted to the study of German society in the light of a Marxist approach. The initial idea of an independently founded institute was conceived to provide for studies on the labor movement and the origins of anti-Semitism, which at the time were being ignored in German intellectual and academic life.

Not long after its inception, the Institute for Social Research was formally recognized by the Ministry of Education as an entity attached to Goethe University Frankfurt. Felix could not have imagined that in the 1960s Goethe University Frankfurt would receive the epithet of “Karl Marx University”. The first officially appointed director was Carl Grünberg (1923-9), a Marxist professor at the University of Vienna. His contribution to the Institute was the creation of a historical archive mainly oriented to the study of the labor movement (also known as the Grünberg Archiv).

In 1930, Max Horkheimer succeeded Grünberg. While continuing under a Marxist inspiration, Horkheimer interpreted the Institute’s mission to be more directed towards an interdisciplinary integration of the social sciences. Additionally, the Grünberg Archiv ceased publication, and an official organ with a much greater impact was launched in its place: the Zeitschrift für Sozialforschung. While never officially supporting any party, the Institute entertained intensive research exchanges with the Soviet Union.

It was under Horkheimer’s leadership that members of the Institute were able to address a wide variety of economic, social, political and aesthetic topics, ranging from empirical analysis to philosophical theorization. Different interpretations of Marxism and its historical applications explain some of the hardest confrontations on economic themes within the Institute, such as the case of Pollock’s criticism of Grossman’s standard view on the pauperization of capitalism. This particular confrontation led Grossman to leave the Institute. Pollock’s critical reinterpretation of Marx also received support from intellectuals who greatly contributed to later developments of the School, for instance Leo Lowenthal, Theodor Wiesengrund-Adorno and Erich Fromm. In particular, with Fromm’s development of a psychoanalytic trend at the Institute and with an influential philosophical contribution by Horkheimer, it became clear that under Horkheimer’s directorship the Institute had reached a drastic turning point which characterized all its future endeavors. The following paragraphs, therefore, briefly introduce some of the main research patterns introduced by Fromm and Horkheimer, respectively.

Since the beginning, psychoanalysis in the Frankfurt School was conceived in terms of a reinterpretation of Freud and Marx. The consideration of psychoanalysis by the Frankfurt School was certainly due to Horkheimer’s encouragement. It was Fromm, nevertheless, who achieved a significant advancement of the discipline; his central aim was to provide, through a synthesis of Marxism and psychoanalysis, “the missing link between ideological superstructure and socio-economic base” (Jay 1996, p. 92). A radical shift occurred, though, in the late 1930s, when Adorno joined the School and Fromm decided, for independent reasons, to leave. Nevertheless, the School’s interest in psychoanalysis, particularly in Freud’s instinct theory, remained unaltered. This was manifest in Adorno’s paper Social Science and Sociological Tendencies in Psychoanalysis (1946), as well as in Marcuse’s book Eros and Civilization (1955). The School’s interest in psychoanalysis coincided with a marginalization of Marxism, a growing interest in the interrelation between psychoanalysis and social change, as well as with Fromm’s insight into the psychic (or even psychotic) role of the family. This interest became crucial in empirical studies of the 1940s that led, eventually, to Adorno’s co-authored work The Authoritarian Personality (1950). The goal of this work was to define, on the basis of empirical research making use of questionnaires, a “new anthropological type”: the authoritarian personality (Adorno et al. 1950, quoted in Jay 1996, p. 239). Such a character was found to have specific traits such as compliance with conventional values, non-critical thinking, and an absence of introspectiveness.

As pointed out by Jay: “Perhaps some of the confusion about this question was a product of terminological ambiguity. As a number of commentators have pointed out, there is an important distinction that should be drawn between authoritarianism and totalitarianism [emphasis added]. Wilhelminian and Nazi Germany, for example, were fundamentally dissimilar in their patterns of obedience. What The Authoritarian Personality was really studying was the character type of a totalitarian rather than an authoritarian society. Thus, it should have been no surprise to learn that this new syndrome was fostered by a familial crisis in which traditional paternal authority was under fire” (Jay 1996, p. 247).

Horkheimer’s leadership provided a very distinct methodological direction and philosophical grounding to the research interests of the Institute. As an instance of his aversion to so-called Lebensphilosophie (philosophy of life), Horkheimer criticized the fetishism of subjectivity and the lack of consideration for the material conditions of living. Furthermore, arguing against Cartesian and Kantian philosophy, Horkheimer attempted, by use of dialectical mediation, to rejoin all dichotomies, including the divides between consciousness and being, theory and practice, and fact and value. Unlike in Hegelianism or Marxism, dialectics for Horkheimer amounted neither to a metaphysical principle nor to a historical praxis, and it was not intended as a methodological instrument. On the contrary, Horkheimer’s dialectics functioned as the battleground for overcoming overly rigid categorizations and unhelpful dichotomies and oppositions. It originated from Horkheimer’s criticism of orthodox Marxism’s dichotomy between productive structures and ideological superstructure, as well as of positivism’s naïve separation of social facts and social interpretation.

In 1933, due to the Nazi takeover, the Institute was temporarily transferred, first to Geneva and then in 1935 to Columbia University, New York. Two years later Horkheimer published the ideological manifesto of the School in his Traditional and Critical Theory ([1937] 1976), where he readdressed some of the previously introduced topics concerning the practical and critical turn of theory. In 1938, Adorno joined the Institute after spending some time as an advanced student at Merton College, Oxford. He was invited by Horkheimer to join the Princeton Radio Research Project. Gradually, Adorno assumed a prominent intellectual leadership in the School, and this led to his co-authorship, with Horkheimer, of one of the milestone works of the School, Dialectic of Enlightenment, published in 1947. During the period of Nazi rule in Germany, the Institute remained the only free voice publishing in the German language. The cost of this choice, though, was a prolonged isolation from American academic life and intellectual debate, a situation described by Adorno with the iconic expression “message in the bottle” to refer to the lack of a public American audience. According to Wiggershaus: “The Institute disorientation in the late 1930s made the balancing acts it had always had to perform, for example in relation to its academic environment, even more difficult. The seminars were virtually discussion groups for the Institute’s associates, and American students only rarely took part in them” (1995, p. 251).

Not surprisingly, one of the major topics of study was Nazism. This led to two different approaches in the School. One, marshaled by Neumann, Gurland and Kirchheimer, was oriented mainly to the analysis of legal and political issues by consideration of economic substructures; the other, guided by Horkheimer, focused on the notion of psychological irrationalism as a source of obedience and domination (see Jay 1996, p. 166).

In 1941, Horkheimer moved to Pacific Palisades, near Los Angeles. He built himself a bungalow near other German intellectuals, among whom were Bertolt Brecht and Thomas Mann, as well as other people interested in working for the film industry (Wiggershaus 1995, p. 292). Other fellows like Marcuse, Pollock and Adorno followed shortly, whereas some remained in New York. Only Benjamin refused to leave Europe and in 1940, while attempting to cross the border between France and Spain at Port Bou, committed suicide. Some months later, Arendt also crossed the same border, passing on to Adorno Benjamin’s last writing: Theses on the Philosophy of History.

The division of the School between two different premises, New York and California, was paralleled by the development of two autonomous research programs led, on the one hand, by Pollock and, on the other hand, by Horkheimer and Adorno. Pollock directed his research to the study of anti-Semitism. This research line culminated in an international conference organized in 1944 as well as a four-volume work titled Studies in Anti-Semitism. Horkheimer and Adorno, instead, developed studies on the reinterpretation of the Hegelian notion of dialectics and also engaged in the study of anti-Semitic tendencies. The most relevant publication in this respect by the two was The Authoritarian Personality or Studies in Prejudice. After this period, only a few devoted supporters remained faithful to the project of the School. These included Horkheimer himself, Pollock, Adorno, Lowenthal and Weil. In 1946, however, the Institute was officially invited to join Goethe University Frankfurt.

Upon his return to West Germany, Horkheimer presented his inaugural speech for the reopening of the Institute on 14 November 1951. One week later he inaugurated the academic year as the new Rector of the University. Yet what was once a lively intellectual community soon became a small team of very busy people. Horkheimer was involved in the administration of the university, whereas Adorno was constantly occupied with different projects and teaching duties. In addition, in order to keep his US citizenship, Adorno had to go back to California, where he earned his living by conducting qualitative research analysis. Horkheimer, instead, attempted to attract back his former assistant Marcuse when the opportunity arose for a successor to Gadamer’s chair in Frankfurt, but neither this initiative nor further occasions were successful. Marcuse remained in the United States and was offered a full position at Brandeis University. Adorno returned to Germany in August 1953 and was soon involved again in empirical research, combining quantitative and qualitative methods in the analysis of industrial relations for the Mannesmann Company. In 1955, he took over Horkheimer’s position as director of the Institute for Social Research, and on 1 July 1957 he was appointed full professor in philosophy and sociology. Even though greatly influential in philosophy, Adorno’s most innovative contribution is unanimously thought to be in the field of music theory and aesthetics. Some of his significant works in this area included Philosophy of Modern Music (1949) and later Vers une Musique Informelle. In 1956, Horkheimer retired just when several important publications were appearing, such as Marcuse’s Eros and Civilization and the essay collection Sociologica. These events marked the phase of intellectual maturity reached at that time by the Frankfurt School.

The sixties, which saw famous student protests across Europe, also saw the publication of Adorno’s fundamental work, Negative Dialectics (1966). This study, while far from either materialism or metaphysics, maintained important connections with an “open and non-systemic” notion of dialectics. It appeared only a few years after One-Dimensional Man (1964), where Marcuse introduced the notion of “educational dictatorship”, a strategy intended for the advancement of material conditions aimed at the realization of a higher notion of the good. While Marcuse quite openly supported the student upheavals, Adorno maintained a much more moderate and skeptical profile.

In 1956, Habermas joined the Institute as Adorno’s assistant. He was soon involved in an empirical study titled Students and Politics. The text, though, was rejected by Horkheimer and it did not come out, as it should have, in the series of the Frankfurt Contributions to Sociology. Only later, in 1961, did it appear in the series Sociological Texts (see Wiggershaus 1995, p. 555). Horkheimer’s aversion towards Habermas was even more evident when he refused to supervise his Habilitation. Habermas obtained his Habilitation under the supervision of Abendroth at Marburg, where he addressed the topic of the bourgeois formation of the public sphere. This study was published by Habermas in 1962 under the title of The Structural Transformation of the Public Sphere, just before he handed in his Habilitation. With the support of Gadamer he was then appointed professor at Heidelberg. Besides his achievements, both in academia and as an activist, the young Habermas contributed towards the construction of a critical self-awareness of the socialist student groups around the country (the so-called SDS, Sozialistischer Deutscher Studentenbund). It was in this context that Habermas reacted to the extremism of Rudi Dutschke, the radical leader of the students’ association who criticized him for defending a non-effective emancipatory view. It was principally against Dutschke’s positions that Habermas, during a public assembly, applied the epithet “left-wing fascism”. How representative this expression was of Habermas’ views on student protests has often been a matter of contention.

Discussions of the notion of emancipation had been at the center of the Frankfurt School political debate since the beginning. The concept of emancipation (Befreiung in German) indeed covers a wide semantic spectrum. Literally it means “liberation from”. The notion therefore ranges from a sense related to transformative action to one that also includes revolutionary action.

After his nomination in 1971 as director of the Max Planck Institute for Research into the Conditions of Life in the Scientific-Technical World at Starnberg, Habermas left Frankfurt. He returned there only in 1981 after having completed The Theory of Communicative Action. This decade was crucial for the definition of the School’s research objectives. In The Theory of Communicative Action (1984b [1981]), Habermas provided a model for social complexities and action coordination based upon an original interpretation of classical social theorists as well as of Searle’s speech act theory. Within this work, it also became evident how the large amount of empirical analysis conducted by Habermas’ research team on topics concerning pathologies of society, moral development and so on was elevated to a functionalistic model of society oriented to an emancipatory purpose. The assumption was that language itself embedded a normative force capable of realizing action coordination within society. In this respect, Habermas spoke of the “unavoidable pragmatic presuppositions of mutual understanding”. Social action whose coordination function relies on the same pragmatic presuppositions was seen as connected to a justification discourse based on the satisfaction of specific validity-claims.

Habermas described discourse theory as relying on three types of validity-claims raised by communicative action. He claimed that it was only when the conditions of truth, rightness and sincerity were raised by speech-acts that social coordination could be obtained. As noted in the opening sections, unlike the first generation of Frankfurt School intellectuals, Habermas contributed greatly to bridging the continental and analytical traditions, integrating aspects belonging to American Pragmatism, Anthropology and Semiotics with Marxism and Critical Social Theory.

Just one year before Habermas’ retirement in 1994, the directorship of the Institut für Sozialforschung was assumed by Honneth. This inaugurated a new phase of research in Critical Theory. Honneth, indeed, revisited the Hegelian notion of recognition (Anerkennung) in terms of a new prolific paradigm in social and political enquiry. Honneth began his collaboration with Habermas in 1984, when he was hired as an assistant professor. After a period of academic appointments in Berlin and Konstanz, in 1996 he took Habermas’ chair in Frankfurt.

Honneth’s central tenet, the struggle for recognition, represents a leitmotiv in his research and preeminently in one of his most important books, The Struggle for Recognition: The Moral Grammar of Social Conflicts ([1986]). This work represents a mature expansion of what was partially addressed in his dissertation, a work published under the title of Critique of Power: Stages of Reflection of a Critical Social Theory (1991 [1985]). One of the core themes addressed by Honneth consisted in the claim that, contrary to what Critical Theory initially emphasized, more attention should have been paid to the notion of conflict in society and among societal groups. Conflict represents the internal movement of historical advancement and human emancipation, falling therefore within the core theme of critical social theory. The so-called “struggle for recognition” is what best characterizes the fight for emancipation by social groups. This fight represents a subjective negative experience of domination—a form of domination attached to misrecognitions. To come to terms with negations of subjective forms of self-realization means to be able to transform social reality. Normatively, though, acts of social struggle activated by forms of misrecognition point to the role that recognition plays as a crucial criterion for grounding intersubjectivity.

Honneth inaugurated a new research phase in Critical Theory. Indeed, his communitarian turn has been paralleled by the work of some of his fellow scholars. Brunkhorst, for instance, in his Solidarity: From Civic Friendship to a Global Legal Community (2005 [2002]), canvasses a line of thought springing from the French Revolution of 1789 to contemporary times: the notion of fraternity. By the use of historical conceptual reconstruction and normative speculation, Brunkhorst presented the pathologies of the contemporary globalized world and the function that solidarity would play.

The confrontation with American debate, initiated systematically by the work of Habermas, soon became an obsolete issue in the third generation of critical theorists, not least because the group was truly international, merging European and American scholars. The work of Forst testifies, indeed, to the synthesis between analytical methodological rigor and classical themes of the Frankfurt School. Thanks to Habermas’ intellectual opening, the third generation of critical theorists engaged in dialogue with French post-modern philosophers like Derrida, Baudrillard, Lyotard and so forth, who according to Foucault are the legitimate interpreters of some central aspects of the Frankfurt School.

2. What is Critical Theory?

“What is ‘theory’?” asked Horkheimer in the opening of his essay Traditional and Critical Theory [1937]. The discussion about method has always been a constant topic for those critical theorists who have attempted since the beginning to clarify the specificity of what it means to be “critical”. A primary broad distinction that Horkheimer drew was the difference in method between social theories, scientific theories and critical social theories. While the first two categories had been treated as instances of traditional theories, the latter connoted the methodology the Frankfurt School adopted.

Traditional theory, whether deductive or analytical, has always focused on coherency and on the strict distinction between theory and praxis. Along Cartesian lines, knowledge has been treated as grounded upon self-evident propositions or, at least, upon propositions based on self-evident truths. Accordingly, traditional theory has proceeded to explain facts by application of universal laws, that is, by subsumption of a particular to a universal in order to either confirm or disconfirm this. A verificationist procedure of this kind was what positivism considered to be the best explicatory account for the notion of praxis in scientific investigation. If one were to defend the view according to which scientific truths should pass the test of empirical confirmation, then one would commit oneself to the idea of an objective world. Knowledge would be simply a mirror of reality. This view is firmly rejected by critical theorists.

In several respects, what Critical Theory wants to reject in traditional theory is precisely this “picture theory” of language and knowledge as defined by “the first” Wittgenstein in his Tractatus. According to such a view, later abandoned by “the second” Wittgenstein, the logical form of propositions consists in showing a possible fact and in saying whether this is true or false. For example, the proposition “it rains today” both shows the possibility of the fact that “it rains today” and affirms that it is the case that “it rains today.” In order to check whether something is or is not the case, one must verify empirically whether the stated fact occurs or not. This implies that the condition of truth and falsehood presupposes an objective structure of the world.

Horkheimer and his followers rejected the notion of objectivity in knowledge by pointing, among other things, to the fact that the object of knowledge is itself embedded in a historical and social process: “The facts which our senses present to us are socially preformed in two ways: through the historical character of the object perceived and through the historical character of the perceiving organ” (Horkheimer [1937] in Ingram and Simon-Ingram 1992, p. 242). Further, with a rather Marxist twist, Horkheimer also noted that phenomenological objectivity is a myth because it is dependent upon “technological conditions” and the latter are sensitive to the material conditions of production. Critical Theory thus aims to abandon naïve conceptions of knowledge-impartiality. Since intellectuals themselves are not disembodied entities observing from a God’s-eye viewpoint, knowledge can be obtained only from the socially embedded perspective of interdependent individuals.

If traditional theory is evaluated by considering its practical implications, then no practical consequences can be actually inferred. Indeed, the finality of knowledge as a mirror of reality is mainly a theoretically-oriented tool aimed at separating knowledge from action, speculation from social transformative enterprise. Critical Theory, instead, characterizes itself as a method contrary to the “fetishization” of knowledge, one which considers knowledge as something rather functional to ideology critique and social emancipation. In the light of such finalities, knowledge becomes social criticism and the latter translates itself into social action, that is, into the transformation of reality.

Critical Theory has been strongly influenced by Hegel’s notion of dialectics for the conciliation of socio-historical oppositions as well as by Marx’s theory of economy and society and the limits of Hegel’s “bourgeois philosophy”. Critical Theory, indeed, has expanded Marxian criticisms of capitalist society by formulating patterns of social emancipatory strategies. Whereas Hegel found that Rationality had finally come to terms with Reality with the birth of the modern nation state (which in his eyes was the Prussian state), Marx insisted on the necessity of reading the development of rationality through history in terms of a class struggle. The final stage of this struggle would have seen the political and economic empowerment of the proletariat. Critical theorists, in their turn, rejected both the metaphysical apparatus of Hegel and the eschatological aspects connected to Marx’s theory. On the contrary, Critical Theory analyses were oriented to the understanding of society and pointed rather to the necessity of establishing open systems based on immanent forms of social criticism. The starting point was the Marxian view on the relation between a system of production paralleled by a system of beliefs. Ideology, which according to Marx was totally explicable through an underlying system of production, for critical theorists had to be analyzed in its own respect and as a non-economically reducible form of expression of human rationality. Such a revision of Marxian categories became extremely crucial, then, in the reinterpretation of the notion of dialectics for the analysis of capitalism. Dialectics, as a method of social criticism, was interpreted as following from the contradictory nature of capitalism as a system of exploitation. Indeed, it was on the basis of such inherent contradictions that capitalism was seen to open up to a collective form of ownership of the means of production, namely, socialism.

a. Traditional and Critical Theory: Ideology and Critique

From these conceptually rich implications one can observe some of the constant topics which have characterized critical social theory, that is, the normativity of social philosophy as something distinct from classical descriptive sociology, the everlasting crux on the theory/practice relation and, finally, ideology critique. These are the primary tasks that a critical social theory must accomplish in order to be defined as “critical”. Crucial in this sense is the understanding and the criticism of the notion of “ideology”.

In defining the senses to be assigned to the notion of ideology, within its descriptive-empirical sense “one might study the biological and quasi-biological properties of the group” or, alternatively, “the cultural or socio-cultural features of the group” (Geuss 1981, p. 4 ff). Ideology, in the descriptive sense, incorporates both “discursive” and “non-discursive” elements. That is, in addition to propositional contents or performatives, it includes gestures, ceremonies and so forth (Geuss 1981, pp. 6-8); it also shows a systematic set of beliefs, a world-view, characterized by conceptual schemes. A variant of the descriptive sense is the “pejorative” version where a form of ideology is judged negatively in view of its epistemic, functional or genetic properties (Geuss 1981, p. 13). On the other hand, if one takes “ideology” according to a positive sense, then reference is not to something empirically given, but rather to a “desideratum”, a “vérité à faire” (Geuss 1981, p. 23). Critical Theory distances itself from scientific theories because, while the latter understand knowledge as an objectified product, the former serves the purpose of human emancipation through consciousness and self-reflection.

If the task of critical social theory is to evaluate the degree of rationality of any system of social domination in accordance with standards of justice, then ideological criticism has the function of unmasking wrong rationalizations of present or past injustices (that is, ideology in the factual and negative sense), such as in the case of the belief that “women are inferior to men, or blacks to whites…”. Thus ideological criticism aims at proposing alternative practicable ways for constructing social bonds. Critical Theory moves precisely in between the contingency of objectified non-critical factual reality and the normativity of utopian idealizations, that is, in between the so-called “theory/practice” problem (see Ingram 1990, p. xxiii). Marcuse, for instance, in the essay Philosophie und Kritische Theorie (1937), defends the view that Critical Theory characterizes itself as being neither philosophy tout court nor pure science, as an overly simplistic approach to Marxism claims it to be. Critical Theory has the following tasks: to clarify the sociopolitical determinants that explain the limits of analysis of a certain philosophical view as well as to transcend the use of imagination (the actual limits of imagination). From all this, two notions of rationality result: the first attached to the dominant form of power and deprived of any normative force; the second characterized, on the contrary, by a liberating force based on a yet-to-come scenario. This difference in forms of rationality is what Habermas has later presented, mutatis mutandis, in terms of the distinction between instrumental and communicative rationality. While the first form of rationality is oriented to a means-ends understanding of human and environmental relations, the second form is oriented to subordinating human action to the respect of certain normative criteria of action validity.
This latter point echoes quite distinctively Kant’s principle of morality according to which human beings must always be treated as “ends in themselves” and never as mere “means”. Critical Theory and Habermas, in particular, are no exception to these views on rationality, since they both see Ideologiekritik not just as a form of “moralizing criticism”, but as a form of knowledge, that is, as a cognitive operation for disclosing the falsity of consciousness (Geuss 1981, p. 26).

This point is strictly connected to another conceptual category playing a great role within Critical Theory, the concept of interest and in particular the distinction between “true interests” and “false interests”. As Geuss has suggested, there are two possible ways to propose such a separation: “the perfect-knowledge approach” and “the optimal conditions approach” (1981, p. 48). Were one to follow the first option, the outcome would be a lapse into uncritical utopianism. By contrast, “the optimal conditions approach” is reinterpreted, at least for Habermas, in terms of an “ideal speech situation” that, by virtually granting an all-encompassing exchange of arguments, assumes the function of providing a counterfactual normative check on actual discursive contexts. Within such a model, epistemic knowledge and social critical reflection are attached to unavoidable pragmatic-transcendental conditions that are universally the same for all.

The universality of such epistemological status differs profoundly from Adorno’s contextualism where individual epistemic principles grounding cultural criticism and self-reflection are recognized to be legitimately different across time and history. Both versions are critical in that they remain faithful to the objective of clearing false consciousness from ignorance and domination; but whereas Habermas sets a high standard of validity/non-validity for discourse theory, Adorno’s historicism remains sensitive to degrees of rationality that are context-dependent. In one of his later writings of 1969 (republished in Adorno 2003, pp. 292 ff.), Adorno provides a short but dense interpretation in eight theses on the significance and the mission of Critical Theory. The central message is that Critical Theory, while drawing from Marxism, must avoid hypostatization and closure into a single Weltanschauung on pain of losing its “critical” capacity. By interpreting rationality as a form of self-reflective activity, Critical Theory represents a particular form of rational enquiry that must remain capable of distinguishing, immanently, ideology from a Hegelian “Spirit”. The mission of Critical Theory, therefore, is not exhausted by a theoretical understanding of social reality; as a matter of fact, there is a strict interconnection between critical understanding and transformative action: theory and practice are interconnected.

b. The Theory/Practice Problem

During the entire course of its historical development, Critical Theory has always confronted one crucial methodological concern: the “theory/practice” problem. To this puzzle critical theorists have provided different answers, such that it is not possible to regroup them into a homogeneous set of views. In order to understand what the significance of the theory/practice problem is, it is useful to refer back to David Hume’s “is/ought” question. What Hume demonstrated through the separation of the “is” from the “ought” was the non-derivability of prescriptive statements from descriptive ones. This separation has been at the basis of those ethical theories that have not recognized moral statements as having a truth-property. In other words, alternative readings of the “is/ought” relation have defended either a cognitivist approach (truth-validity of moral statements) or, alternatively, a non-cognitivist approach (no truth-validity), as in the case of emotivism.

Even if characterized by several internal differences, what Critical Theory added to this debate was the consideration both of the anthropological as well as the psychological dynamics motivating masses and structuring ideologies.

As far as the anthropological determinants in closing the gap of the “theory/practice” problem are concerned, it is possible to take into consideration Habermas’ Knowledge and Human Interests ([1968] 1971). There Habermas combined a transcendental argument with an anthropological one by defending the view according to which humans have an interest in knowledge insofar as such interest is attached to the preservation of self-identity. Yet, to preserve one’s identity is to go beyond mere compliance with biological survival. As Habermas clarifies: “[…] human interests […] derive both from nature and from the cultural break with nature” (Habermas [1968], in Ingram and Simon-Ingram 1992, p. 263). On the contrary, to preserve one’s identity means to find in the emancipatory force of knowledge the fundamental interest of human beings. Indeed, the grounding of knowledge in the practical domain has quite far-reaching implications; for instance, interest and knowledge in Habermas find their unity in self-reflection, that is, in “knowledge for the sake of knowledge” (Habermas [1968], in Ingram and Simon-Ingram 1992, p. 264).

The Habermasian answer to the theory/practice problem comes from the criticism of non-cognitivist theories. If it is true, as non-cognitivists claim, that prescriptive claims are grounded on commands and do not have any cognitive content which can be justified through an exchange of public arguments, it follows that they cannot account for the difference between a “convergent behavior”, established through normative power on the basis, for example, of punishment, and the notion of “following a valid rule”. In the latter case, an extra layer of justification seems to be required, namely, a process through which a norm can be defined as valid. Such a process is for Habermas conceived in terms of a counterfactual procedure for a discursive exchange of arguments. This procedure is aimed at justifying those generalizable interests that ought to be obeyed because they pass the test of moral validity.

The Habermasian answer to the is/ought question has several important implications. One implication, perhaps the most important one, is the criticism of positivism and of the epistemic status of knowledge. On the basis of Habermasian premises, indeed, there can be no objective knowledge, as positivists claim, detached from intersubjective forms of understanding. Since knowledge is strictly embedded in serving human interests, it follows that it cannot be considered value-neutral and objectively independent.

A further line of reflection on the theory/practice problem comes from psychoanalysis where a strict separation has been maintained between the “is” and “ought” and false “oughts” have been unmasked through the clarification of the psychological mechanisms constructing desires. Accordingly, critical theorists like Fromm referred to Freud’s notions of the unconscious which contributed to defining ideologies in terms of “substitute gratifications”. Psychoanalysis represented such a strong component within the research of the Frankfurt School that even Adorno in his article Freudian Theory and the Pattern of Fascist Propaganda (1951) analyzed Fromm’s interconnection between sadomasochism and fascism. Adorno noted how a parallel can be drawn between the loss of self-confidence and estimation in hierarchical domination, on the one hand, and compensation through self-confidence which can be re-obtained in active forms of domination, on the other hand. Such mechanisms of sadomasochism, though, are not peculiar to fascism. As Adorno noted, they reappear in different guises in the modern cultural industry through the consumption of so-called “cultural commodities”.

Notwithstanding the previous discussions, the greatest philosophical role of psychoanalysis within the Frankfurt School was exemplified by Marcuse’s thought. In his case, the central problem became that of interpreting the interest in the genealogical roots of capitalist ideology. How can one provide an account of class interests after the collapse of classes? How can one formulate, on the basis of the insights provided by psychoanalysis, the criteria through which true interests can be distinguished from false ones? Marcuse’s way forward was a revisitation of Freud’s theory of instinctual needs. Unlike Freud’s tensions between nature and culture and Fromm’s total social shaping of natural instincts, Marcuse defended a third, median perspective where instincts were considered only partially shaped by social relations (Ingram 1990, p. 93 ff). Through such a solution, Marcuse overcame the strict opposition between biological and historical rationality that was preventing the resolution of the theory/practice problem. He did so by recalling the annihilation of the individual’s sexual energy lying at the basis of organized society and recalling, in its turn, the archetypical scenario of a total fulfillment of pleasure. Marcuse took imagination as a way to obtain individual reconciliation with social reality: a reconciliation, though, with an underlying unsolved tension. Marcuse conceived of overcoming such tensions through the aestheticization of basic instincts liberated by the work of imagination. The problem with Marcuse’s rationalization of basic instincts was that, by relying excessively on human biology, it made it impossible to distinguish between the truth and the falsity of socially dependent needs (see on this Ingram 1990, p. 103).

c. The Idea of Rationality: Critical Theory and its Discontents

For Critical Theory, rationality has always been a crucial theme in the analysis of modern society as well as of its pathologies. Whereas the early Frankfurt School and Habermas viewed rationality as a historical process whose unity was taken as a precondition for social criticism, later critical philosophies, influenced mainly by post-modernity, privileged a rather more fragmented notion of (ir)rationality manifested by social institutions. In the latter views, social criticism could not act as a self-reflective form of rationality, since rationality cannot be conceived as a process incorporated in history. One point shared by all critical theorists was that forms of social pathology were connected to deficits of rationality which, in their turn, manifested interconnections with the psychological status of the mind (see Honneth 2004, p. 339 ff.).

In non-pathological social aggregations, individuals were said to be capable of achieving cooperative forms of self-actualization only if freed from coercive mechanisms of domination. Accordingly, for the Frankfurt School, modern processes of bureaucratic administration exemplified what Weber considered as an all-encompassing domination of formal rationality over substantive values. In Weber, rationality was to be interpreted as purposive rationality, that is, as a form of instrumental reason. Accordingly, the use of reason did not amount to formulating prescriptive models of society but aimed at achieving goals through the selection of the best possible means of action. If in Lukács the proletariat was to represent the only dialectical way out from the total control of formal rationality, Horkheimer and Adorno saw technological domination of human action as the negation of the inspiring purposes of Enlightenment. In the already mentioned work, Dialectic of Enlightenment (1969 [1947]), Horkheimer and Adorno emphasized the role of knowledge and technology as a “means of exploitation” of labor and viewed the dialectic of reason as the archetypical movement of human self-liberation. Nevertheless, the repression by formal-instrumental rationality of natural chaos pointed to the possible resurgence of natural violence in a different guise, so that the liberation from nature through instrumental reason opened to the possibility of domination by a totalitarian state (see Ingram 1990, p. 63).

According to this view, reason had been seen essentially as a form of control over nature characterizing humanity since its inception, that is, since those attempts aimed at providing a mythological explanation of cosmic forces. The purpose served by instrumental rationality was essentially that of promoting self-preservation, even if this goal turned paradoxically into the fragmentation of bourgeois individuality that, once deprived of any substantive value, became merely formal and thus determined by external influences of mass-identity in a context of cultural industry.

Rationality, thus, began assuming a double significance: on the one hand, as traditionally recognized by German idealism, it was conceived as the primary source of human emancipation; on the other, it was conceived as the premise of totalitarianism. If, as Weber believed, modern rationalization of society came to a formal reduction of the power of rationality, it followed that hyper-bureaucratization of society led not just to a complete separation between facts and values but also to a total disinterest in the latter forms. Nevertheless, for Critical Theory it remained essential to defend the validity of social criticism on the basis of the idea that humanity is embedded in a historical learning process where clash is due to the actualization of reason re-establishing power-balances and struggles for group domination.

Given such a general framework on rationality, it can be said that Critical Theory has undergone several paradigm revolutions, both internally and externally. First of all, Habermas himself has suggested a further pre-linguistic line of enquiry by making appeal to the notion of “authenticity” and “imagination”. This suggests a radical reformulation of the same notion of “truth” and “reason” in the light of its metaphorical capacities of signification (see Habermas 1984a). Secondly, the commitment of Critical Theory to universal validity and universal pragmatics has been widely criticized by post-structuralists and post-modernists who have instead insisted respectively on the hyper-contextualism of the forms of linguistic rationality, as well as on the substitution of a criticism of ideology with genealogical criticism. While Derrida’s deconstructive method has shown how binary opposition collapses when applied to the semantic level, so that meaning can only be contextually constructed, Foucault has directed his criticisms at the supposedly emancipatory power of universal reason by showing how forms of domination permeate micro-levels of power-control such as in sanatoriums, educational and religious bodies and so on. The control of life, known as bio-power, manifests itself in the attempt to normalize and constrain individuals’ behaviors and psychic lives. For Foucault, reason is embedded in such practices which display the multiple layers of un-rationalized force. The activity of the analyst in this sense is not far from the same activity of the participant: there is no objective perspective which can be defended. Derrida, for instance, while pointing to the Habermasian idea of a pragmatics of communication, still maintained a distinct thesis of a restless deconstructive potential of any constructing activity, so that neither unavoidable pragmatic presuppositions nor idealizing conditions of communication could survive deconstruction.
On the other hand, Habermasian theory of communicative action and discourse ethics, while remaining sensitive to contexts, pretended to defend transcendental conditions of discourse which, if violated, were seen to lead to performative contradictions. Last but not least, to the Habermasian role of consensus or agreement in discursive models, Foucault objected that rather than a regulatory principle, a true critical approach would simply enact a command in case of “nonconsensuality” (see Rabinow, ed. 1984, p. 379 ff).

3. Concluding Thoughts

The debate between Foucault and Critical Theory, in particular with Habermas, is quite illuminating of the common critical-universalist orientations of the first phase of the Frankfurt School versus the diverging methodologies defended starting from the Habermasian interpretation of modernity. For Foucault it was not correct to propose a second-order theory for defining what rationality is. Rationality is not to be found in abstract forms. On the contrary, what social criticism can aim to achieve is only the unmasking of deeply enmeshed forms of irrationality deposited in contingent and historical institutional embeddings. Genealogical methods, though, do not reject the idea that (ir)-rationality is part of history; on the contrary, they claim to illuminate abstract and procedural rational models by dissecting and analyzing concrete institutional social practices through immanent criticism. To these views, Habermas has objected that any activity of rational criticism presupposes unavoidable conditions in order to justify the claim to validity of its own exercise. This rebuttal reopened the demand for transcendental conditions for immanent criticism, revealed along the same pragmatic conditions of social criticism. For Habermas, criticism is possible only if universal standards of validity are recognized and only if understanding (Verständigung) and agreement (Einverständnis) are seen as interconnected practices.

A further line of criticism against Habermas, one which also targeted Critical Theory as a whole, came from scholars like Chantal Mouffe (2005). What she noticed is that nested within the notion of consensus is a renunciation of genuine engagement in “political agonism”. If, as Mouffe claimed, the model of discursive action is bound to the achievement of consensus, then what role can be left to politics once agreement is obtained? The charge of eliminating the consideration of political action from “the political” has been extended by Mouffe also to previous critical theorists such as Horkheimer, Adorno and Marcuse. Criticism concerned the non-availability of context-specific political guidance answering the question “What is to be done?” (see Chambers 2004, p. 219 ff.). What has been noticed is that whereas Critical Theory has aimed at fostering human emancipation, it has remained incapable of specifying a political action-strategy for social change. For the opponents of the Critical Theory paradigm, a clear indication in this sense was exemplified by Marcuse’s idea of “the Great Refusal”, one predicating abstention from real political engagement and from claims to transform the capitalist economy and democratic institutions (Marcuse 1964). It was indeed to the reformulation of Critical Theory’s ambition of presenting “realistic utopias” that some of the representatives of the third generation directed their attention. Axel Honneth, for instance, starting from a revisitation of the Hegelian notion of (mis)-recognition and through a research phase addressing social pathologies, has proposed in one of his latest studies a revisited version of socialism, as in The Idea of Socialism: Towards a Renewal (2017). Nancy Fraser, instead, by focusing on the notion of redistribution has provided key elements in understanding how it is possible to overcome economic inequalities and power-imbalances in post-industrial societies where cultural affiliations are no longer significant sources of power. For his part, Alessandro Ferrara, in his recent monograph The Democratic Horizon (2014), has revived the paradigm of political liberalism by addressing the significance of democracy and then tackling the problem of hyperpluralism and multiple democracies. For Ferrara, what is inherent to democratic thinking is innovation and openness. This notion bears conceptual similarities with what Kant and Arendt understood in terms of “broad mindedness”. Seyla Benhabib, along similar lines, has sought to clarify the significance of the Habermasian dual-track model of democracy, as one based on the distinction between moral issues that are proper to the institutional level (universalism) and ethical issues characterizing, instead, informal public deliberations (pluralism). Whereas the requirement of a universal consensus pertains only to the institutional sphere, the ethical domain is instead characterized by a plurality of views confronting each other across different life-systems. Benhabib’s views, by making explicit several Habermasian assumptions, aim to countervail both post-structuralist worries and post-modern charges that Critical Theory models are ineffective for political action. Finally, Forst’s philosophical preoccupation has been that of addressing the American philosophical debate with the specific aim of constructing an alternative paradigm to that of liberalism and communitarianism. Forst’s attempt has integrated analytic and continental traditions by radicalizing along transcendental lines some core Habermasian intuitions on rights and constitutional democracy. In his collection of essays, The Right to Justification, Forst suggests a transformation of the Habermasian “co-originality thesis” into a monistic “right to justification”. This move is aimed at suggesting an alternative and hopefully more coherent route of explanation for the understanding of the liberal constitutional experience (Forst, [2007] 2014, see also Forst, [2010] 2011).

4. References and Further Reading

  • Adorno, Theodor W. et al. The Authoritarian Personality, New York: Harper and Brothers, 1950.
  • Adorno, Theodor W. Eine Bildmonographie, Frankfurt am Main: Suhrkamp, 2003.
  • Adorno, Theodor W. “Freudian Theory and the Pattern of Fascist Propaganda” (1951), in Arato, Andrew and Eike Gebhardt (eds.). The Essential Frankfurt School Reader, Continuum: New York, 1982.
  • Brunkhorst, Hauke. Solidarity: From Civic Friendship to a Global Legal Community, trans. by J. Flynn, Cambridge, Mass.: MIT Press, [2002] 2005.
  • Chambers, Simone. “The Politics of Critical Theory”, in Fred Rush (ed.). The Cambridge Companion to Critical Theory, Cambridge: Cambridge University Press, 2004.
  • Couzens, David and Thomas McCarthy. Critical Theory. Oxford: Blackwell, 1994.
  • Ferrara, Alessandro. The Democratic Horizon. Hyperpluralism and the Renewal of Political Liberalism, Cambridge: Cambridge University Press, 2014.
  • Forst, Rainer. “The Justification of Human Rights and the Basic Right to Justification. A Reflexive Approach”, Ethics 120:4 (2010), 711-40, reprinted in Claudio Corradetti (ed.). Philosophical Dimensions of Human Rights. Some Contemporary Views, Dordrecht: Springer 2011.
  • Forst, Rainer. The Right to Justification. Elements of a Constructivist Theory of Justice, New York, NY: Columbia University Press, 2014.
  • Geuss, Raymond. The Idea of a Critical Theory. Habermas & the Frankfurt School, New York: Cambridge University Press, 1981.
  • Habermas, Jürgen. Knowledge and Human Interests. Boston: Beacon Press, [1968] 1971.
  • Habermas, Jürgen. “Questions and Counter-Questions”, Praxis International 4:3 (1984a).
  • Habermas, Jürgen. The Theory of Communicative Action, vols. 1 and 2, Boston: Beacon Press, [1981] 1984b.
  • Honneth, Axel. Critique of Power: Reflective Stages in a Critical Social Theory, trans. by Kenneth Baynes. Cambridge, Mass.: MIT Press, [1985] 1991.
  • Honneth, Axel. The Struggle for Recognition: The Moral Grammar of Social Conflicts, trans. by Joel Anderson. Cambridge: Polity Press, [1986] 1995.
  • Honneth, Axel. “The Intellectual Legacy of Critical Theory”, in Fred Rush (ed.). The Cambridge Companion to Critical Theory, Cambridge: Cambridge University Press, 2004.
  • Honneth, Axel. The Idea of Socialism: Towards a Renewal, Cambridge: Polity Press, 2017.
  • Horkheimer, Max. “Traditional and Critical Theory”, in Paul Connerton (ed.). Critical Sociology: Selected Readings, Harmondsworth: Penguin, [1937] 1976.
  • Horkheimer, Max and Theodor W. Adorno. Dialectic of Enlightenment, New York: Continuum, [1947] 1969.
  • Ingram, David. Critical Theory and Philosophy, St. Paul: Paragon House, 1990.
  • Ingram, David and Julia Simon-Ingram. Critical Theory: The Essential Readings, St. Paul: Paragon House, 1992.
  • Jay, Martin. The Dialectical Imagination, Berkeley: University of California Press, 1996.
  • Lukács, Georg. History and Class Consciousness, Cambridge, Mass.: MIT Press, [1968] 1971.
  • Marcuse, Herbert. “Philosophie und Kritische Theorie”, Zeitschrift für Sozialforschung VI:3 (1937).
  • Marcuse, Herbert. One Dimensional Man: Studies in the Ideology of Advanced Industrial Society, Boston: Beacon Press, 1964.
  • Mouffe, Chantal. The Democratic Paradox, London: Verso, 2005.
  • Rabinow, Paul (ed.). “Politics and Ethics: An Interview”, in The Foucault Reader, New York: Pantheon, 1984.
  • Rush, Fred. Critical Theory, Cambridge: Cambridge University Press, 2004.
  • Wiggershaus, Rolf. The Frankfurt School, Cambridge: Polity Press, 1995.
  • Wittgenstein, Ludwig. Tractatus Logico-Philosophicus, London: Routledge, 2001 [1st English edition 1922].

 

Author Information

Claudio Corradetti
Email: Claudio.Corradetti@uniroma2.it
University of Rome Tor Vergata
Italy

Nicholas of Cusa (1401—1464)

In the 21st century, Nicholas of Cusa or Cusanus is variously appreciated as a Christian disciple of the burgeoning Italian humanism of the 15th century, one of the great mystical theologians and reforming bishops of the late Middle Ages, and a dialogical religious thinker whose philosophical and political ideas peacefully contemplate the unity of old wisdom and new, Christian and Muslim religious aspirations, and even the differences between cultures and nations. As a humanist, he praised the plainspoken delivery of the idiota or lay philosopher more than the excessive eloquence or vast erudition of the well-trained scholar. Nicholas of Cusa retrieved the idea of the limits of human knowing not just as a finite end but as a path of inquiry centered on the infinite. The seeker of wisdom who follows Cusanus’s path to wisdom needs to tread that itinerary anew every day. Cusanus combined conjectural knowing with a new synthesis of the immanence of the Absolute in the world and dared to think about the relationship between God and the world in terms of a logic of the coincidence of opposites. He creatively investigated the analogy between a God who creates the world with beauty and the artisanship of the human mind. This process of searching for analogies between the inexpressible Absolute and a world that can be represented and investigated on a map yielded new insights into art, language, and even creation itself.

Cusanus was not considered a significant thinker in the history of Western thought until the late 18th and early 19th centuries. His entrance onto the stage of modern thought is itself revealing. In the 19th century, German philosophers rediscovered Cusanus’s conjectures on infinite worlds, perspectival knowing, and the scientific method. Later Neo-Kantians such as Hermann Cohen and Ernst Cassirer popularized the idea that Cusanus was a Neoplatonic forerunner of the modern, scientific worldview, and his reputation as a modern German speculative giant began to grow. In Heidelberg in 1932, however, the erudite Jewish scholar Raymond Klibansky started a critical edition of Cusanus’s works partly to challenge the myth of a modern Cusanus by amply documenting the ancient and medieval inheritances. Klibansky’s groundbreaking initiative, which soon moved to Oxford and then Canada due to the Nazi persecution of Jews, would not be completed until the first decade of the 21st century, by which point it included the editing of all of the sermons as well. This massive and long-awaited effort has changed the face of studies of Cusanus permanently. Now scholars of multiple disciplines and philosophers of highly diverse schools of thought can read the complete works and evaluate them in their entirety.

Table of Contents

  1. Life and Works
    1. Life
    2. Works
  2. The Catholic Concordance
  3. Learned Ignorance
    1. Unknowing
    2. On Divine Names
  4. God and the World: Cosmology in Perspective
  5. Imago Dei: The Human Person and the Artistic Image
  6. Metaphor and Transcendence in the Late Works
    1. De possest (1460)
    2. Compendium (1463)
  7. References and Further Reading
    1. Edited Works
    2. English Translations
    3. Selected Secondary Works

1. Life and Works

a. Life

He came from the town of Kues on the Moselle River and was born with the name Nikolaus Cryfftz (or Krebs, in German). He received the Latin name of “Cusanus,” which means “the one from Kues.” His background was neither that of the upper class nor of the lower class since his father “was a moderately well-to-do boatman and vineyard owner who served on juries and lent money to local nobility” (Sigmund xi). He excelled in learning and was sent first to Heidelberg for one year before obtaining a doctorate in canon law in Padua in 1423. There in Padua he made contacts with humanists. Contemporary artistic developments in any case attracted his attention, including the prose works of Petrarch and Leon Battista Alberti’s treatises on art. After a few years of study (and perhaps teaching), in Cologne in the early 1430s, he began to preach with a passion for love of God that he attributed to the Italian reformer Bernardino of Siena. His ordination to the priesthood took place at some point in the 1430s and is one of the few acts in his adult life that is not well documented.

Cusanus did not have the typical formation or career of a medieval philosopher. His legal, administrative, and evangelizing work for the Church was his central occupation, but he obviously found time to write and disseminate philosophy outside of the schools. He was a Roman Cardinal and Papal Legate whose historical memory and legal acumen were highly praised, and he is said to have retained an extraordinary recall of the acts of Church Councils. He twice turned down an offer to assume a position in Canon Law at the university that was formed in Louvain in 1425. His learning was formed by contacts he made at the Council of Basel, in the Roman curia, and in his travels through Italy, Germany, and the Low Countries. Other humanists admired him, in part, because he discovered lost manuscripts, such as twelve previously unknown comedies of Plautus. His reputation grew among humanists, and this renown gave him a new point of entry for professional and ecclesiastical success. He also acquired and assiduously annotated what is thought to be the most impressive book collection of his day. For this reason, Rudolf Haubst coined the label “a doorkeeper of a new age” (1988). Cusanus gathered an impressive collection of manuscripts of ancient wisdom and “deposited” them at a point of transition between worldviews that later thinkers would come to identify as the threshold to modernity. Steeped in medieval sources, Cusanus himself could barely have witnessed the birth of a modern world, but his European commentators in the first half of the 20th century nevertheless began to write about him almost exclusively through that lens.

His first major philosophical work, On Learned Ignorance, appeared in 1440 and was based on the study of ancient humanist manuscripts as well as the Biblical ideas he had begun to expound in his preaching.


Opening page of On Learned Ignorance in Codex Cusanus 218

For Cusanus, the communication of the Word of God in the Church and the sharing of philosophical wisdom were intertwined tasks. In this sense, his works mirror his practically oriented life even though the speculative discourse in the former probably went far beyond the grasp of many of the parishioners and clergy with whom he interacted on a daily basis.

His later works contain highly creative forays into mystical theology and even more daring reflections on the utter incomprehensibility of the idea of God. He generally, but not always, avoided sharp polemics, favoring irenic treatises even as fierce battles raged around him. For example, his most important work in interreligious understanding, On the Peace of Faith (1453), was written as the Muslim Turks took possession of the imperial city of Constantinople and turned the Hagia Sophia into a mosque. Likewise, he spent the years 1457-1458 trapped in a remote castle in Andraz because his reform efforts led the nuns in Sonnenberg to convince the Archduke Sigismund to send an army to his bishopric in Brixen. In seclusion, and awaiting a papal army to free him, Cusanus managed to contemplate ultimate realities and compose at least one major philosophical treatise, On the Beryl.

b. Works

  • De docta ignorantia (On Learned Ignorance, 1440).
  • De coniecturis (On Conjectures, 1441-2).
  • De quaerendo Deum (On Seeking God, 1445).
  • De filiatione Dei (On Divine Sonship, 1445).
  • De dato patris luminum (On the Gift of the Father of Lights, 1445/6).
  • De Deo abscondito (On the Hidden God, 1444/5).
  • De genesi (On Genesis, 1447).
  • Apologia doctae ignorantiae (The Defense of Learned Ignorance, 1449), a response to charges of heresy and pantheism by the Heidelberg scholastic theologian John Wenck in a work entitled De ignota litteratura (On Unknown Learning, 1442-3).
  • Idiota de mente (The Layman on Mind, 1450).
  • Idiota de sapientia (The Layman on Wisdom, 1450).
  • Idiota de staticis experimentis (The Layman on Experiments done with Weight-Scales, 1450).
  • De visione Dei (On the Vision of God, 1453).
  • De mathematicis complementis (On Complementary Mathematical Considerations, 1453, a second book was added in 1454).
  • De theologicis complementis (Complementary Theological Considerations, 1453), in which he pursued his continuing fascination with theological applications of mathematical models.
  • Caesarea circuli quadratura (The Imperial Squaring of the Circle, 1457).
  • De beryllo (On the Beryl, 1458), a treatise using a beryl or transparent stone as the crucial symbol.
  • De aequalitate (On Equality, 1459).
  • De principio (On the Beginning, 1459).
  • De possest (On the Actual Existence of Possibility, 1460).
  • De non aliud (On the Not-Other, 1462).
  • De venatione sapientiae (On the Hunt for Wisdom, 1462).
  • De ludo globi (The Bowling-Game, 1463).
  • Compendium (Compendium, 1463).
  • Epistola ad religiosum Nicolaum, novitium Montisoliveti (Letter to the religious Nicholas, novice of the Abbey of Mount Oliveto, 1463).
  • De apice theoriae (On the Summit of Contemplation, 1464).

Except in the case of The Catholic Concordance (translated by Paul Sigmund) and where otherwise noted, the citations below of the works of Cusanus are taken from the translations of Jasper Hopkins. The paragraph numbers from the critical edition are, where possible, adopted in the English translations and are usually indicated below with the abbreviation “N.”

What style of philosophy did Cusanus adopt? His works do not follow the quaestio method of medieval Scholasticism, which favored a juxtaposition of rival arguments in a fixed classificatory scheme. He often employs a form of dialogue, albeit one that differs considerably from the more eloquent humanist dialogue of the 15th century. Some pieces are written in response to questions from friends who were seeking guidance about the practice of the contemplative life. In general, Cusanus experiments with new genres while avoiding the humanist absorption into the taxonomy or learned examination of genres. Above all, he was interested in exploring ways to communicate Christian wisdom that could be made easily accessible to a variety of listeners. In 1450, for example, he composed three books attributed to a Christian layman. These works signaled an approach to learning that engages the wisdom found outside of the standard contexts of learning.

All of his philosophical works were written between 1440 and 1464. Jacques Lefèvre d’Étaples, a Renaissance humanist, prepared a print edition of some works in 1514, but the critical edition prepared by the Heidelberg Academy of Sciences contains not only all the philosophical and theological works but the sermons as well. These works are preserved in Latin except for Sermon 24, which is a philosophical meditation on The Lord’s Prayer that was originally written in the Mosel-Franconian dialect. Nancy Hudson and Frank Tobin prepared an English translation and commentary on this text (Casarella 2006, 1-25).

The Cusanus-Portal (http://www.cusanus-portal.de/) has made available all of the texts in their edited form, several translations, and glossaries. Jasper Hopkins, Professor Emeritus of Philosophy at the University of Minnesota, has translated all of the philosophical works and a good number of the sermons into English. The American Cusanus Society (http://www.americancusanussociety.org/) also maintains valuable resources about recent publications, bibliographies, and upcoming conferences. The society’s Newsletter, which is published annually on the website, contains book reviews, recent conference presentations, and original articles about Cusanus and related topics.

2. The Catholic Concordance

In Cusanus’s time, Archbishops in places like Cusanus’s native diocese of Trier, Germany wielded temporal power. With the death of the local Archbishop in 1430, a local nobleman named Ulrich von Manderscheid made a claim on the archbishopric and in 1431 hired Nicholas of Cusa to represent him and his claim at the Council of Basel. In 1432, Cusanus became incorporated as a member of the Council, which was caught in a bitter dispute with the Pope about the relative authority of the Pope over the Church Council. Soon, Cusanus was more than just an attorney in a case about the political leadership in the Rhine and Moselle valleys. His learning and acumen gained him equal respect on questions of faith and doctrine and placed him at the center of the most pressing theological debates in the Catholic Church of his day. Accordingly, during the year 1433 Cusanus composed a treatise entitled On the Catholic Concordance (De concordantia catholica). He later wrote significant legal briefs on political and ecclesiastical matters, but this early work is his major contribution to both the idea of the Church and political theory.

This treatise reflects the raging polarizations in Basel between the majority of delegates, who favored the authority of the Council over the Pope, and the minority, who favored a more pro-Papal line. In the treatise, Cusanus clearly represents conciliarist thinking. A few years after the dissolution of the Council, he sided with the hardened position of the decidedly anti-conciliarist Pope Eugene IV. The motives for this switch are debated by scholars. The treatise offers a window onto the early development of a philosophically inclined and speculative mind grappling with key ideas about the Church and politics. This exposition will be limited to two of the more central and widely discussed ones: consent and harmony (concordantia).

Nicholas develops his theory of consent in The Catholic Concordance, Book II, 8-15. In speaking about the Church, Cusanus does not assume, as one might, that the views of the consenting bishops must be represented in the form of a vote. Voting is just one form of consent in the Church. On the other hand, the third book on the Holy Roman Empire was written in response to the visit of the Emperor Sigismund to the Council of Basel. In it, Cusanus tried to argue for a political reform of the empire that might mirror the ecclesial reform developed in the first two books. In Book III, 37, for example, Cusanus draws inspiration from the Mallorcan thinker Ramon Llull and adopts a system of preferential voting by which the prince electors can elect a Holy Roman Emperor. By comparison, the concept of consent within the Church in the second book is more abstract and sacramental. Cusanus notes that ecclesial consent can be both explicit and tacit. He places a great deal of weight on the force of custom. Yet the novelty regarding a reform of the Church based upon the doctrine of consent is striking. The Pope, he says, does not depend upon the Council; nevertheless, “even in the decision on matters of faith which belongs to him by virtue of his primacy he is under the council of the Catholic Church” (I, 15, no. 61). Furthermore, Cusanus allows for this consensual principle of Church governance to have more far-reaching consequences when he writes:

Since all are by nature free, every governance—whether it consists in a written law or living law in the person of a prince…can only come from the agreement and consent of the subjects. For if men are by nature equal in power and equally free, the true properly ordered authority of one common rule who is their equal in power can only be constituted by the election and consent of the others, and law is also established by consent (II, 14, no. 127).

In the previous medieval tradition the appeal to natural freedom as the basis for participation in governance was not widely held. In this sense, Cusanus may not be the explicit harbinger of modern constitutionalism as some have claimed, but he is certainly opening the door to a relatively unknown and democratically oriented form of thinking about political representation.

The doctrine of harmony in The Catholic Concordance is one that emanates from the doctrinal authority, institutional and sacramental presence, and legal jurisdiction of the Church. As Jovino Miroy has argued, there are still glimpses here of the explicitly philosophical notion of harmony that unfolds in Cusanus’s later works. The later works opt explicitly for a dialogical harmony that seeks to encompass seeming contradictions about God and the world in an ordered whole. The basic notion in The Catholic Concordance is that there cannot be discord in God. As such, the world is an image of a harmony that finds its supreme expression in the Triune conception of God, that is, of God as one nature in three persons (Father, Son, and Holy Spirit). Cusanus seeks to explicate this notion not only as a religious precept but also as a metaphysical truth (Miroy 90-93). Concordance as an image of divine harmony is traceable in the world as a harmony of differences. Cusanus is too experienced as a lawyer to assume naively that all visible concordances lead immediately to actual peace, but he is committed in both theory and practice to the novel idea that the faith-laden gift of eliciting harmony out of real differences also reveals a more transparent image of God in the world.

3. Learned Ignorance

a. Unknowing

In 1440, Cusanus finished his programmatic work On Learned Ignorance (De docta ignorantia). Cusanus introduces the notion of learned ignorance by reworking the ancient idea that “there is present in all things a natural desire to exist in the best manner in which the condition of each thing’s nature permits this” (I, 1, N. 2). His program for renewal resembles a largely forgotten Neo-Pythagorean tradition (ibid.). As such he compares the “natural desire to exist in the best manner” with the search for first principles in mathematics. He likens human knowing to searching for comparative relations in mathematical knowledge. When the object of our perception can be apprehended by means of a close proportional tracing to what one knows, then through our judgment we apprehend easily. Otherwise, we need to proceed from later propositions back to earlier ones until we reach the first principles. Here, he says, hard work is required in order to attain certainty (ibid.).

Nicholas states: “Since the desire in us is not in vain, assuredly we desire to know what we do not know” (I, 1, N. 4). Drawing upon an attestation in multiple religious traditions and in the early Italian humanist re-appropriation of the model of Socrates found in the dialogues of Plato, Cusanus defines learned ignorance as a state where no greater knowledge can be attained (ibid.). Haubst comments upon the distinctiveness of Cusanus’s path to learned ignorance by means of the ancient dictum that all individuals have by nature a desire to know. Haubst then claims that Cusanus, by contrast with the ancients, never delimits the natural desire of a human being to use his natural powers to recognize or attain knowledge (1991, 66). Cusanus posits an insatiable and limitless desire in the human being that shows the presence of a gift in the order of things. In De Deo abscondito (1444), the pagan interlocutor says that the desire to be in the truth is what has drawn the Christian into worship (N. 6). The Christian agrees, saying that an orientation to the God who is ineffable truth draws him to worship. For Cusanus, there is no fixed measure or perfection or maximum that the human being holds in his or her possession (von Bredow, 69). Learned ignorance opens up a dynamic path of being and knowing for the individual and, in a certain way, for the spirit. This is not, as some have claimed, a slippery slope to the modern conflation of the dynamic, human spirit with the divine one (von Balthasar, 209). For Cusanus, the absoluteness of the difference is always held in view.

At the end of his life, Nicholas returned to the theme of learned ignorance to underscore its novelty. In De venatione sapientiae (1462-63), he first exclaims: Mira res! (“How wondrous a thing!”). He then states that the natural desire of the intellect to know God’s Quiddity is not innate to it:

Rather, [what is innate is its desire] to know that its God is so great that there is no end of his Greatness. Hence, he is greater than everything conceived and knowable (De venatione sapientiae, ch. 12, N. 32).

This passage highlights the theocentric radicalism of Cusanus’s understanding of desiderium naturale. His Neoplatonism colors his Christian philosophy without overtaking it. In other words, the wondrous incomprehensibility of God, rather than the limited endowments of the human mind, is what he takes to be the much needed new starting point. A middle work confirms this trajectory. Cusanus in chapter 16 of De visione Dei (1453) states that “unless God were infinite, he would not be the end of desire.” Although his thinking about natural desire arises in the context of what he himself labels an Aristotelian commonplace, it has been suggested by Sophie Berman that his theory of desire here effectively reverses the Aristotelian supposition that a divine being is incompatible with infinity (De visione Dei, ch. 16). By virtue of this pious desire for a reversal of the philosophical tradition, Cusanus embeds a new tradition of thinking about intellectual desire within a richly dynamic theory of the movement of the intellectual spirit (motus desideriosus). This new invention is neither the standard Aristotelianism nor a standard anti-Aristotelian Augustinianism.

b. On Divine Names

In De docta ignorantia, Cusanus introduces his theory of divine names through the similitude of an infinite sphere. The complete actuality of the sphere is the maximal center that precedes all width, length, and breadth and is also the End and the Middle of these and all other lines. All beings tend toward God but not as an end, like the last stop of a bus or train. The movement of the infinite sphere is the dynamic image of God as “the End of motion, viz., the Form and Actuality of being,” and, likewise, the cessation of motion. Names befit the Maximum according to a similar process of entering into the divine motion and simultaneously being stilled within the same actuality. No name is expunged by the infinitude of divine actuality. Each contributes signification in a manner like the multivalent being of a line within the Cusan infinite sphere. Nor can any discursively posited name keep pace with the motionless motion of the divine source. Names thus hover between perspectival knowing and the Absolute.

In the treatment of affirmative theology in Bk. I, ch. 24, Cusanus symbolizes the divine actuality with the names “Oneness,” “tetragrammaton,” and “Maximum.” The true name can be only that which is signified by the absolute name. Affirmative theology is not thereby relativized. Even if affirmative names befit God only infinitesimally, they do so in relation to created things. In this way, Cusanus deliberately connects his positive theory of naming to an analogical metaphysics of created reality. He maintains that the pagans named God in various ways in relation to created things, allowing the young Cusanus another opportunity to brandish his humanist credentials (I, 25, N. 83-84).

Worship is tied to this upward movement from created things to the infinitely nameable Maximum. At the summit of this contemplative ascent is the ineffable one, spoken of more truly through removal and negation. The lesson regarding this path is re-named from “learned ignorance” to “sacred ignorance (I, 26, N. 87).” Cusanus is still maintaining with Pseudo-Dionysius that names for God cannot by themselves point to the Absolute, but he also believes that the way of negation makes the revelation of the Triune God even more perfect and worthy of silent, liturgical praise. Certain hierarchical realities in creation–the superiority of intelligence over a stone or virtue over drunkenness, for example–are made more evident through removal and negation (I, 26, N. 85). So, when Cusanus concludes his treatment of divine names by stating that, in the end, “precise truth shines within the darkness of our ignorance,” he is illuminating a path to knowing and revering the incomprehensibly good and self-diffusively good God.

In the late tetralogue De li non aliud (1462), Cusanus shows an even greater debt to ancient Neoplatonic authorities on unknowing the Absolute. Here, he is particularly indebted to the metaphysics of negation in Proclus and Pseudo-Dionysius. “Not-other” (non aliud) is introduced as more than another neologism for the ineffable God. It is a phrase, adapted from a new reading of the second and third books of the Latin text of Proclus’ Theologia Platonis, that defines definition itself. For example, in commenting on the divine names “being,” “truth,” and “goodness,” Cusanus tells his Aristotelian interlocutor that “Not-other” is not other than any of these. The “not-other” is “seen to be before these (and other) things in such a way that they are not subsequent to it but exist through it” (De li non aliud, ch. 4, N. 14). While the whole schema is noteworthy for its originality, D’Amico is among those who have investigated the heavy dependence on Proclus throughout this writing. This dialectical understanding of the world as an exemplar of the Absolute reconfigures both the “what” and the “how” of naming God. Scholastic theology had already distinguished between the mode of signifying and the thing being signified, but Cusanus is making a much bolder claim. The defining character of the new signifier is the even more fervent abjuring of any middle position between affirmation and negation. The immanence of “the not-other” comes into thought based on its appearance through the non-material “sight” of the intellect (visio intellectualis), namely, through its refraction in all other possible signifiers, including the transcendentals of being, truth, and goodness.

4. God and the World: Cosmology in Perspective

The second book of DDI develops a theological cosmology of the creative presence of the unnameable Absolute. At the center of this book, Cusanus elaborates on the world as a contraction of the Absolute Maximum and states that the world is the unfolding of that which is enfolded in the Absolute Maximum. This pairing of complicatio (enfolding) and explicatio (unfolding) is not original to Cusanus, but it is made a hallmark of his new understanding of the world. In order to foreground the newness of his conception of the world, Cusanus states at the outset of DDI II, 11: “Now that learned ignorance has shown these previously unheard of [doctrines] (ista prius inaudita) to be true, perhaps there will be amazement on the part of those who read them.” Scholars today are divided as to whether the inaudita refers backward or forward in the second book. If backward, then Cusanus is marveling at amended speculative insights drawn from Boethius, Thierry of Chartres, and Augustine on the trinity of the universe in DDI, II, 7-10. Certain key features of Cusanus’s cosmology in these sections stem from the Boethian-Chartrian heritage revived in the 15th century: the triad of unitas, equalitas, and conexio, Thierry’s four modes of being and trinity of perpetuals, and the Trinitarian attribution of “exemplar” and “form of forms” to the divine Word (Albertson 2010, 388). To complicate matters further, this section of the text is almost identical to an anonymous 15th-century treatise, Fundamentum Naturae, and may have been plagiarized from it. If inaudita refers to what comes afterwards, as Jasper Hopkins has argued, then the “previously unheard of” doctrines include the idea that the earth is not the center of the universe but “a noble star (II, 12, N. 166).” Following that clue, DDI II, 12 indicates that Cusanus hypothesizes the relativity of the earth as a strictly theological postulate: “Blessed God created all things in such way that when each thing desires to conserve its own existence as a divine work, it conserves it in communion with others (II, 12, N. 166).” Regine Kather has argued that Cusanus’s radical transformation of the Ptolemaic universe bears more in common with Einstein’s denial of any universal center than with the heliocentrism revolutionized by Copernicus and completed by Galileo. Either way, there is both genuine novelty and a conscious recovery of a distinguished literary heritage in Cusanus’s theological cosmology.

De genesi (1447) is a philosophical meditation on the act of creation as such. Biblical texts and even the figure of Moses are mentioned, but the rhetorical appeal lies in the application of Cusanus’s religious cosmology by means of an invented neologism to the question of what distinguishes the Christian apprehension of created reality. The dialogue deals with the genesis of all things, “which are so different and so opposed,” out of the Same (I, N. 143). The metaphysics of the Same does anything but erase worldly difference. Cusanus is claiming that only Absolute Sameness could produce a world of difference because the Sameness of the same is so radically other than the differences that define the world of measurement and finite proportionality. The Absolute Same exists at the point of coincidence with “the Unattainability” that begets a world in which things can be judged to be the same and different (I, N. 151). Cusanus uses a rich variety of artistic and linguistic metaphors to demonstrate how “the unattainable Same shines forth brightly in the countless multitude of all attainable things (II, N. 154).”

5. Imago Dei: The Human Person and the Artistic Image

Cusanus sees the human person as a Deus humanatus (De dato patris luminum, N. 102). Jasper Hopkins renders this as “a God manqué.” By that, he means that the human person is like God in all things except that of being God. In other words, human beings cannot be God but can image God in a human and living way. The humanists who surrounded Cusanus in Italy also played with this notion in their theology and in promoting the rise of a new appreciation of the arts. In fact, Cusanus approximates the “practice-oriented relativism” of the theorist of painting and architecture, Leon Battista Alberti, a thinker who had even called the artist “another God” (Harries 1990, 102). This historical connection seems like a reasonable hypothesis given that Cusanus and Alberti both studied in Padua in the same years although there is no recorded confirmation of their meeting in person. Cusanus also annotated Alberti’s writings on art.

In three writings from 1450, we see the full unfolding of this view of the human person and of art. The dialogue Idiota de mente (1450) is part of a trilogy that deals with the wisdom of the putatively unsophisticated lay philosopher and sheds light on the philosophical question of human creativity. The idiota, a model of lay wisdom, engages the professional philosopher on the question of the mind (mens) as a measure (mensura). The layman is portrayed in the dialogue as a handcraftsman. The philosopher encounters him while he is busy carving a spoon. A wooden spoon in the Italian context signifies homegrown sapientia. So, the dialogue commences with a discussion of “the form-of-spoonness, through which a spoon is constituted a spoon (ch. 2, N. 63).” In sum, the spoonmaker teaches the philosopher that there is a difference between the Peripatetic (Aristotelian) and Academic (Platonist) approach to attaining knowledge of spoonness. The spoonmaker opts for neither explanation but uses the image of spoon carving to suggest a third way of understanding the mind. If the Aristotelian approach represents the claim that knowledge of the exemplar is first gained through the perception of the sensory spoon, then the Platonist represents the opposite view, namely, that knowledge of the exemplar is the condition for the possibility of knowledge of the sensory perception. The layman fashions a humanistically inspired exemplarism whereby form remains infinite and ineffable, but the imposition of an arbitrary name (coclear, “spoon”) and the perception of an imperfectly carved wooden utensil both lend weight to the unified grasp of the living form (what Leon Battista Alberti called la più grassa Minerva, a “more crude, fatter” wisdom that puts a new emphasis on the visibility of form). In other words, forms are not revealed in either a static, ideal world or as already given, fully knowable empirical data; they are grasped not only in themselves but also in the act or self-manifestation of their own production. Arguably, Cusanus pushes to, and perhaps even beyond, the limits of how Platonic and Aristotelian ideas of the mind were typically understood in his day.

The treatise is primarily philosophical but introduces theological topics in equally surprising ways. In chapter 2, Cusanus turns from the question of an inherently unknowable form to the problem of naming. There is naming in both the human and divine mind. This parallel again suggests to him a likeness to the “spoon” inasmuch as both the names we give to things and the craftsman’s artifact are created seemingly at will: “The wood receives a name from the advent of a form, so that when there arises the proportion in which spoonness shines forth, the wood is called by the name ‘spoon’; and so, in this way, the name is united to the form (Ch. 2, N. 64).” This comparison suggests, somewhat paradoxically, that there is both an arbitrary and a true dimension to a fitting name. Drawing upon his own creative theology of the divine Word that creates the world, Cusanus compares the layman’s dialogical presentation of a new theory of knowledge through the creation of names:

Therefore, there is one ineffable Word, which is the Precise Name of all things insofar as these things are captured by a name through the operation of reason. In its own manner this Ineffable Name shines forth in all [imposed] names. For it is the infinite nameability of all names and is the infinite vocalizability of everything expressible by means of voice, so that in this way every [imposed] name is an image of the Precise Name. (Ch. 2, N. 68).

The creativity of the divine Word is both a power from above, connected to the second person of the Trinity, and an absolute figure for the creative mind that seeks to know the right names of things. In reasoning to find the precise name of a thing, Cusanus discovers the mind’s capacity to “enfold” within itself the humanized but infinite power to seek and name the absolute in every discrete thing. The word “spoon” is therefore both a perfectly fitting name, based upon social convention and the empirical experience of spoons, and a quasi-divine cipher for an invisible human power to express the self in the world as a knower of the world.

This analogy of the Word recurs frequently in Cusanus. In the late work De Principio (“On the Beginning,” 1459), for example, he weaves together the ancient distinction between an inner and outer Word with Trinitarian and Biblical theology of the Word. In doing so, Cusanus makes some basic Neoplatonic points about the eternal, ineffable beginning (principium) that “precedes” all that begins. So, for example, the speech of the Word made flesh displays “that the eternal-form-of-Being speaks perceptibly in the things that, through it, exist in a perceptible way (N. 16).” “The Word” (verbum) can also signify “all intellect, which is either Creator-intellect or assimilator-intellect,” the former being the form of forms and the exemplar of all assimilable things that lends figure to them (N. 21, translation by Peter Casarella). In considering the narrative of creation in Genesis, the Word is the formless receptacle in which all things are present before being made (N. 23). Likewise, if one distinguishes between the world seen before its creation and as created, then the former comes to be as Word and the latter as created by the Word (N. 37). Finally, he ends his reflection by praising the “Word of the living God, through which Word all things exist (N. 40).” In sum, there is no comparable relationship between the Word as the absolute beginning of all things and the Word as utterance, as that which has been expressed as outward speech. The metaphor of speech grounds the otherwise mysterious disclosure of a new starting point for a metaphysics of the world as God’s creation. In the Protestant Reformation in the early sixteenth century, a strong emphasis on the theology of the Word as the basis of creation would also be affirmed, but often without such a robust and straightforward affirmation of the mysterious beauty of the glory of God made visible in the created order.

6. Metaphor and Transcendence in the Late Works

In 1453 Nicholas responded to a question posed by monks in Tegernsee (Bavaria) about the role of intellect in the contemplative life with a lengthy and highly metaphorical mystical treatise on the vision of God. Starting with this work and with increased intensity in the late 1450s and until his death, symbolic language and expressions begin to occupy a central place in his philosophical and theological works. In his speculative treatise of 1458, De beryllo, he takes up the likeness between divine and human creativity and declares: Et haec est scientia aenigmatica (N. 7). Hopkins renders this as “Now, this knowledge [of the Divine Intellect] is symbolical knowledge (1998, 794).” In the late works, there is thus a repeated attempt to play with metaphors, grammar, visual images, and knowledge of the world and of God that is itself symbolical for the sake of expressing the infinite variety of the inexpressibly Absolute.

a. De possest (1460)

In three of the late works (De possest, De venatione sapientiae, and De apice theoriae), Cusanus proposes a new way to grasp the relationship between possibility and actuality. In the first two works, he accomplishes this reflection by means of the amalgam possest. This neologism brings together the infinitive of the verb “to be able” (posse) with the third person singular of the present tense of the verb “to be” (est). The signifier thus looks to the union of opposites of two significations (actual existence and possible existence) and two modalities of a verb (one a phrase that is uninflected, non-temporal, and literally “non-finite” and another a temporal act that brings the non-temporal into the present). One possible translation into English of possest is “the actual existence of possibility.” In De apice theoriae Cusanus makes the more radical claim that that which is signified by possest can also be signified by posse ipsum (“to be able to be itself”).

De possest is a dialogue in which the concept “long-pondered” by the Cardinal is introduced and discussed (De possest, N. 1). The point of departure is Romans 1:20: “The invisible things of Him, including His eternal power and divinity, are clearly seen from the creation of the world, by means of understanding created things” (DP 2, comparing DP 15). This draws attention to Cusanus’s ongoing concern to articulate that creation is not just a true doctrine that refutes error but a mode of “showing” whereby the infinite appears in the finite.

The treatise introduces the Cusan style of speculative thought well, for it joins a thorough discussion of possibility and actuality with the seamless incorporation of spiritual themes, like the gift of faith that comes through Jesus Christ (N. 31-33) and reflections on God’s triunity. Three concrete symbols of transcendence are elaborated: the infinite motion of a spinning top (DP 18-23), the word possest (N. 27), and the word “in” (N. 54-6). Cusanus takes “in” as a symbol of the Trinity because of its three lines. In contemplating “in,” Cusanus first considers the construction of the written letters solely through straight lines into a whole greater than the parts and builds his way to that which is signified by the word “in.” The word points to an entrance. All entering, he notes, involves going in. As a prefix “in” is joined to other roots. For example, intueor (“to regard, admire”) signifies that knowledge of God can be likened to “entering” into the wondrous mystery of an ineffable God. The prefix can also be a negation as in “ineffability.” The joining of “in” in the signifier is moreover symbolically the movement beyond the union of positive and negative theology (in the sense of being “enfolded” in the divine mystery). In all these senses, “in” functions like a hidden signifier that precedes all naming of God.

Cusanus likewise plays with a symbol elicited from the signifier possest. The “e” is found in posse, est, and their union. He then argues that the “triune” sharing of the letter “e” in the amalgamation of posse and est signifies a hidden vocalization of both possibility and actuality at the heart of the new divine name. Likewise, God is beyond the union of opposites of absolute possibility and absolute actuality. God, in any case, is said to be hiddenly, Trinitarianly, and immanently discoverable in the world in the same way that “e” is in possest (N. 57).

How does this new divine name articulate the relationship between actuality and possibility? Possibility and actuality are joined and surpassed in the Infinite. The cipher of possest illuminates this union of opposites. There is also a reflection of this union in the world created by the act of a divine possibilizer. On this question, the strictly Aristotelian-Thomistic tradition sees potentiality as a void and actuality as fullness. Cusanus, in certain ways, calls this prioritizing of the actual into question. He does so by introducing in N. 27-29 another pair that is surpassed in God by possest: posse fieri (to be able to be made) and posse facere (to be able to make). Both are self-evident in God’s creation. Seen apart from its material form, the posse fieri of creation would be a purely passive receptacle given that the sole source of its actuality comes from the Creator. God has the possibility of making, and creation possesses the possibility to be made. But Cusanus complicates this scenario without detracting from the centrality of the freedom of God to create something out of nothing. On the part of the Creator, God is not only the one who accomplishes and transcends the act of divine making (what he calls “the divine art of the Word” in N. 34). God is also the absolute possibility that transcends the distinction between potentiality and actuality of making. On the part of creation, one can speak by analogy of a posse fieri and posse facere in the created order, as, for example, in the author’s relationship to a book that she has made. The author possesses the posse fieri of the book at least as an abstract possibility of not writing the book (N. 29). As a triune image of the Creator, human authors also possibilize, for they see their own creations not just as abstractly actualized but also in the process of becoming (another translation of fieri). Finite actuality is therefore not simply a fullness prior to all possibility, yet neither is it posterior to it. It is, one might say by a certain extrapolation, the multivalent presence of possibility in both its coming into being and its having been actualized. God is symbolized in the very depth of the coming to be of things and not just in their completed actuality.

The dialogue brings specific philosophical doctrines to the fore in order to be illuminated by the figure of possest. The interlocutors, for example, also examine the absolute beginning of the world, the epistemological status of mathematical entities (N. 43), the nature of motion (N. 52-3), the tripartite ordering of theoretical investigations into physics, pure intellectuality, and a middle realm consisting of the union of intellectual abstraction with the faculty of the imagination (N. 63-4), the ubiquity and power of form and the nature of the intellect’s abstraction of form from matter (N. 64), and the philosophy of being and not-being (N. 65-6). There is, however, an overriding concern that emerges in each of these engagements. Creation is that being that was brought into being by God as possest. As such, there is a clear and unequivocal negation in the created order; creation is that which is not God. The not-being of creation on this level directs the mind to the triune eternity of possest out of which God creates. To the degree, however, that God creates being as the one possible not-being of creation, God becomes manifest anew as possest in the very order of creation. This is a kind of secondary not-being in the order of being, one whose recognition makes the manifestness of creation apparent. Cusanus amplifies the aesthetic dimension of this revelation by means of the Greek term cosmos:

Now, the name [“cosmos”] denies that the world is ineffable Beauty itself. But it affirms that [the world] is the image of that [Beauty] whose truth is ineffable. What, then, is the world except the manifestation of the invisible God? What is God except the invisibility of visible things? So the Apostle says in the verse set forth at the beginning of our discussion (N. 72).

This abyss between divine manifestness and absolute not-being is also likened to the difference between heaven and hell (N. 72).

In sum, what is the idea of God in De possest and how does it relate to other prevalent conceptions of God in the Christian Middle Ages of the Latin West? Cusanus draws upon the logic of divine perfections in the tradition of St. Anselm but even more heavily upon his own theories of learned ignorance, unions of opposites, and the pairing of complicatio and explicatio. Of the possible paths to God that were being articulated in his milieu, two in particular are not pursued. The first is the nominalist idea of the absolute power of God that stands in some way in opposition to God’s ordained power. William of Ockham had popularized this notion, and Cusanus is clearly interrogating such voluntarism in a critical vein. But neither does he fully endorse the notion of God as actus purus et perfectus developed by Thomas Aquinas. Possest is no compromise between nominalism and Thomism but an original creation that sheds light on both. It is important to note that terms like potentia, possibilitas, and even posse are used interchangeably in the dialogue. Thus, the divine potency in God is clearly affirmed but according to its infinite possibilizing. Cusanus is thus a radical Thomist in that he completely rejects the notion of a divine power that arbitrarily acts without any self-refraction in the rationally knowable order of actual existence. But he is also clearly affirming neither the words nor the spirit of the standard Thomistic position whereby God is pure act devoid of all possibility. In terms of act, possest, he claims, signifies precisely what is signified by the Biblically inspired “I am who am” (N. 14). The act of being of God cannot become more or other than what it already is, but an infinite perfection of being is laden with the possibility of being comprehended as an eternally present “is” that is always already able to be. The most radical consequence of this innovation regards the theory of the knowledge of things finite and infinite as an image of such posse (for example, N. 41, 63). The intellect in its learned ignorance becomes aware of a possest of knowing in its apprehension of the divine name and its images in a variety of fields of knowing. Some have likened this newly discovered subjective capacity to the creative imagination of Neo-Kantianism even though they must then abandon the metaphysical project as pursued by Cusanus.

b. Compendium (1463)

In this very late work, Cusanus incorporates the vision of God into a semiotic interpretation of reality. For Cusanus, both man and other animals use a variety of verbal signs. Cusanus notes that a hen makes different noises when she is calling chicks to eat than when she is warning them of the presence of a predator whose shadow she has sighted (N. 4, lines 3–5, p. 5). Despite their natural variety, the semiotic utterances of animals are still unformed signs (confusum signum). The power of human ars presents the possibility of bringing intellectual form to naturally given signs to communicate better a variety of desires (N. 7, lines 13–14). According to Cusanus, the art of writing adds nothing to the formation of the sign but consigns it to a realm of visible signs in which it will remain once the spoken sign is lost from memory. The formation of a species for each word therefore precedes the genesis of writing. Both the receptivity of sensible signs and the spontaneity to create abstract signs are present in speech. Writing mirrors the activity of the imagination, which retains signs created by the intellect.

Both spoken and written words are conventional signs since they do not signify naturally. The conventions by which words are assigned distinct meanings are imposed arbitrarily but are not intrinsically arbitrary. Speech and writing elevate the linguistic capacity of humans so that they too may strive to refabricate the natural knowledge that Adam possessed most completely (ch. 9, N. 26). As a creator of iconic signs, the human knower strives to represent not the thing as it is known in itself but the intention that lies behind the sign. The human knower alone seeks a purely formal sign, one that can be abstracted completely from sensible signs. The search for a purely formal sign points to the analogy between human and divine conception: the human knower “creates knowledge out of signs and words, just as God creates the world out of things (ibid.).” The creative activity of the mind produces something new in a manner analogous to God’s creation of the world. The intention by which God creates must be mirrored in every level of the rational creature’s semiotic creativity.

The image of the human sign-maker is the reclaiming of the living image of God. There is always a gap between the signs the artistic mind makes and the signs that the divine Artist places in the world. There is no proportion between the work of the human and divine sign-maker. This gap thus remains unbridgeable for the finite intellect. In any case, the meditative consideration of the gap and of the desire of the intellect to overcome that gap spurs human creativity to a life-long quest to seek an image of the invisible God precisely in the finite, semiotic world in which we now find ourselves.

7. References and Further Reading

a. Edited Works

  • Cusanus, Nicolaus. Opera Omnia. Iussu et auctoritate Academiae Litterarum Heidelbergensis ad codicum fidem edita. Leipzig/Hamburg: Felix Meiner, 1932.
    • These texts are available but without scholarly apparatuses at the Cusanus-Portal.

b. English Translations

  • Bond, H. Lawrence. Nicholas of Cusa: Selected Spiritual Writings. New York: Paulist, 1997.
  • Führer, M. L. The Layman on Wisdom and the Mind. Ottawa: Dovehouse Editions, 1989.
  • Hopkins, J. Nicholas of Cusa on Learned Ignorance: A Translation and an Appraisal of De Docta Ignorantia. Minneapolis: Banning, 2nd edition, 1985.
  • Hopkins, J. A Concise Introduction to the Philosophy of Nicholas of Cusa. Minneapolis: Banning, 3rd ed. 1986.
  • Hopkins, J. Nicholas of Cusa’s Dialectical Mysticism: Text, Translation, and Interpretive Study of De Visione Dei. Minneapolis: Banning, 2nd edition, 1988.
  • Hopkins, J. Nicholas of Cusa’s De Pace Fidei and Cribratio Alkorani: Translation and Analysis. Minneapolis: Banning, 2nd edition, 1994.
  • Hopkins, J. Nicholas of Cusa on Wisdom and Knowledge. Minneapolis: Banning, 1996.
  • Hopkins, J. Nicholas of Cusa: Metaphysical Speculations, vol. 1. Minneapolis: Banning, 1998.
  • Hopkins, J. Nicholas of Cusa on God as Not-Other, 3rd ed. Minneapolis: Banning, 1999.
  • Hopkins, J. Nicholas of Cusa: Metaphysical Speculations, vol. 2. Minneapolis: Banning, 2000.
  • Hopkins, J. Complete Philosophical and Theological Treatises of Nicholas of Cusa. Minneapolis: Banning, 2001.
  • Hopkins, J. Nicholas of Cusa’s Early Sermons: 1430–1441. Loveland, Colorado: Banning, 2003.
  • Hopkins, J. Nicholas of Cusa’s Didactic Sermons: A Selection. Loveland, Colorado: Banning, 2008.
  • Izbicki, Thomas M. Nicholas of Cusa: Writings on Church and Reform. ITRL 33. Cambridge, MA: Harvard University Press, 2008.
  • Miller, Clyde Lee. The Layman: About Mind. New York: Abaris Books, 1979.
  • Sigmund, Paul. The Catholic Concordance. Cambridge: Cambridge University Press, 1991.
  • Watts, Pauline Moffitt. Nicholas de Cusa: De Ludo Globi (The Game of Spheres). New York: Abaris Books, 1986.

c. Selected Secondary Works

  • Albertson, David. “A Learned Thief? Nicholas of Cusa and the Anonymous Fundamentum Naturae: Reassessing the Vorlage Theory,” Recherches de Théologie et Philosophie médievales 77 (2010), 351-390.
  • Albertson, David. Mathematical Theologies: Nicholas of Cusa and the Legacy of Thierry of Chartres. New York: Oxford University Press, 2014.
  • Bellitto, Christopher M., Thomas M. Izbicki, and Gerald Christianson, eds. Introducing Nicholas of Cusa: A Guide to a Renaissance Man. New York/Mahwah, NJ: Paulist, 2004.
  • Brient, Elizabeth. The Immanence of the Infinite: Hans Blumenberg and the Threshold to Modernity. Washington, D. C.: The Catholic University of America Press, 2002.
  • Carman, Charles. Leon Battista Alberti and Nicholas of Cusa: Towards an Epistemology of Vision for Renaissance Art and Culture. Surrey, Eng./Burlington, VT: Ashgate, 2014.
  • Casarella, Peter. “Nicholas of Cusa and the Power of the Possible.” American Catholic Philosophical Quarterly 64 (Winter, 1990): 7-34.
  • Casarella, Peter. “Nicholas of Cusa (1401-1464), On Learned Ignorance: Byzantine Light en route to a Distant Shore.” In The Classics of Western Philosophy, edited by Jorge J. E. Gracia, Gregory M. Reichberg, and Bernard N. Schumacher, 183-9. Oxford: Basil Blackwell, 2003.
  • Casarella, Peter, ed. Cusanus: The Legacy of Learned Ignorance. Washington, D.C.: The Catholic University of America Press, 2006.
  • Casarella, Peter. “Cusanus on Dionysius: The Turn to Speculative Theology.” Modern Theology 24 no. 4 (October 2008): 667-678.
  • Casarella, Peter. “Nicholas of Cusa and the Ends of Medieval Mysticism.” In Wiley-Blackwell Companion to Christian Mysticism, edited by Julia Lamm, 388-403. Oxford: Wiley-Blackwell, 2013.
  • Cassirer, Ernst. The Individual and the Cosmos in Renaissance Philosophy. Philadelphia: University of Pennsylvania Press, 1963.
  • Cranz, F. Edward. “The Late Works of Nicholas of Cusa,” In Nicholas of Cusa In Search of God and Wisdom: Essays in Honor of Morimichi Watanabe by the American Cusanus Society, edited by Gerald Christianson and Thomas M. Izbicki, 141-60. Brill: Leiden, 1991.
  • D’Amico, Claudia. “Proclo,” in Nicolás de Cusa, Acerca de lo no-otro o de la definición que todo define, texto crítico original (Buenos Aires: Editorial Biblos, 2008), 359-369.
  • de Certeau, Michel. “The Gaze: Nicholas of Cusa.” Diacritics 17, no. 3 (Fall 1987): 2-38.
  • Duclow, Donald. Masters of Learned Ignorance: Eriugena, Eckhart, Cusanus. Aldershot, Eng. and Burlington, VT: Ashgate Variorum, 2006.
  • Dupré, Louis. “Nature and Grace in Nicholas of Cusa’s Mystical Philosophy.” American Catholic Philosophical Quarterly 64 (Winter, 1990): 153-170.
  • Führer, M. L. Echoes of Aquinas in Cusanus’s Vision of Man. Lanham, MD: Lexington Books, 2014.
  • Führer, M. L. “Wisdom and Eloquence in Nicholas of Cusa’s Idiota de sapientia and de mente.” Vivarium 16 (1980): 169-189.
  • Führer, M. L. “The Evolution of the Quadrivial Modes of Theology in Nicholas of Cusa’s Analysis of the Soul.” American Benedictine Review 36 (1985): 338-339.
  • Harries, Karsten. “Problems of the Infinite: Cusanus and Descartes.” American Catholic Philosophical Quarterly 64 (Winter, 1990): 89-110.
  • Harries, Karsten. Infinity and Perspective. Boston: MIT Press, 2002.
  • Haubst, Rudolf. Pförtner der neuen Zeit. Trier: Paulinus, 1988.
  • Haubst, Rudolf. “Theologie in der Philosophie—Philosophie in der Theologie des Nikolaus von Kues,” in Rudolf Haubst, Streifzüge in die cusanische Theologie. Münster: Aschendorff, 1991.
  • Hoff, Johannes. The Analogical Turn: Rethinking Modernity with Nicholas of Cusa. Grand Rapids: Eerdmans, 2013.
  • Kather, Regine. “‘The Earth is a Noble Star’: The Arguments for the Relativity of Motion in the Cosmology of Nicolaus Cusanus and their Transformation in Einstein’s Theory of Relativity.” In Cusanus: The Legacy of Learned Ignorance, edited by Peter J. Casarella, 226-50. Washington, D.C.: The Catholic University of America Press, 2006.
  • Lohr, C. H. “Metaphysics.” In The Cambridge History of Renaissance Philosophy, edited by Charles B. Schmitt and Quentin Skinner, 537-638. Cambridge: Cambridge University Press, 1988.
  • McTighe, Thomas. “The Meaning of the Couple, ‘Complicatio-Explicatio’ in the Philosophy of Nicholas of Cusa.” Proceedings of the American Catholic Philosophical Association 32 (1958): 206-214.
  • Miller, Clyde Lee. Reading Cusanus: Metaphor and Dialectic in a Conjectural Universe. Washington, D.C.: The Catholic University of America Press, 2003.
  • Miroy, Jovino. Tracing Nicholas of Cusa’s Early Development. The relationship between De concordantia Catholica and De docta ignorantia. Louvain-Paris: Éditions Peeters, 2009.
  • Moore, Michael Edward. Nicholas of Cusa and the Kairos of Modernity: Cassirer, Gadamer, Blumenberg. Punctum Books: Brooklyn, N.Y., 2013.
  • O’Rourke Boyle, Marjorie. “Cusanus at Sea: The Topicality of Illuminative Discourse.” Journal of Religion 71 no. 2 (April 1991): 180-201.
  • von Balthasar, Hans Urs. “Nicholas of Cusa: The Knot,” In The Glory of the Lord, vol. 5, The Realm of Metaphysics in the Modern Age, tr. Oliver Davies, et al. (Edinburgh: T&T Clark, 1991), 205-46.
  • von Bredow, Gerda. “Der Sinn der Formel ‘melior modo quo,’” In Gerda von Bredow, Im Gespräch mit Nikolaus von Kues. Münster, Aschendorff, 1995, 61-69.
  • Watanabe, Morimichi. The Political Ideas of Nicholas of Cusa with special reference to his ‘De concordantia catholica.’ Geneva: Librairie Droz, 1963.
  • Watanabe, Morimichi. Nicholas of Cusa: A Companion to his Life and his Times, ed. Gerald Christianson and Thomas M. Izbicki. Surrey, England: Ashgate, 2011.
  • Wikström, Iris. “From Word to Action: The Notion of the Ineffable in De Coniecturis of Nicholas of Cusa.” In Intellect et Imagination dans la Philosophie Médiévale. Intellect and Imagination in Medieval Philosophy, vol. 3, edited by M. C. Pacheco and J.F. Meirinhos, 1709-22. Turnhout: Brepols, 2006.
  • Ziebart, K. M. Nicolaus Cusanus on Faith and the Intellect: A Case Study in Fifteenth Century fides-ratio Controversy. Leiden: Brill, 2014.

 

Author Information

Peter Casarella
Email: Peter.J.Casarella.2@nd.edu
University of Notre Dame
U. S. A.

Mary Astell (1666-1731)

The English writer Mary Astell is widely known today as an early feminist pioneer, but not so well known as a philosophical thinker. Her feminist reputation rests largely on her impassioned plea to establish an all-female college in England, an idea first put forward in her Serious Proposal to the Ladies (1694). She is also remembered for her harsh but witty indictment of early modern marriage in her Some Reflections upon Marriage (1700). Underlying Astell’s feminist ideas, however, are strong philosophical foundations in the form of Cartesian epistemological and metaphysical principles. These principles play an important strategic role in her writings: to raise an awareness in women of their inherent ability to bring themselves to moral and intellectual perfection—to “pull themselves up by their bootstraps,” so to speak—regardless of their external circumstances. Toward this end, Astell urges her fellow women to embrace René Descartes’ “clear and distinct ideas” as the hallmarks of truth and certainty. In accordance with Cartesian rationalism, she teaches her readers that all knowledge can be founded on reason rather than the senses, and she urges them to practice Cartesian rules for thinking in order to attain knowledge of both moral and metaphysical truths. As a dualist, she encourages women to regard their souls as thinking substances distinct from their bodies and as capable of attaining mastery over bodily sensations and passions. In all her major writings, these philosophical themes are so prevalent that Astell might be justly regarded as one of the earliest feminist philosophers of the modern age.

Astell is an unorthodox Cartesian, however, insofar as she breaks from a number of Descartes’ classic doctrines, such as his theory of innate ideas and his views about the essence of the soul. And while Astell is indebted to Descartes’ ethical theory of the passions, her moral-theological viewpoint also closely resembles the Augustinian outlook of her English contemporary John Norris and the French thinker Nicolas Malebranche. As with these men, the intensely religious aspects of her thought cannot be ignored. The same deep religiosity permeates her political writings, and is arguably the main driver behind her critiques of the Whig philosophy of John Locke.

This article covers six key areas of Astell’s philosophy: her theory of knowledge, her metaphysics of mind and body, her philosophy of religion, her moral views, her feminist ideas, and her political thought.

Table of Contents

  1. Life
  2. Theory of Knowledge
  3. Metaphysics of Mind and Body
  4. Philosophy of Religion
  5. Moral Theory
  6. Feminism
    1. Education
    2. Marriage
  7. Political Thought
  8. Legacy
  9. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Astell was born in Newcastle-upon-Tyne, England, on November 12, 1666, and died in Chelsea, London, on May 9, 1731. She was the elder of two children born to Peter Astell and Mary Errington, both of whom belonged to respected Northumberland families with strong royalist leanings. The most important influence on Astell’s early intellectual development appears to have been her uncle Ralph Astell, a clergyman-poet who was educated at the University of Cambridge in the mid-seventeenth century. Under his tuition, it is likely that Astell gained a strong familiarity with Anglican theology. The works of a number of popular Anglican theologians can be found in the remains of Astell’s library, now held in the Northamptonshire Records Office. Through her uncle’s influence, she may have also become acquainted with the ideas of the Cambridge Platonist Henry More, an early adherent of Cartesian philosophy in England. Ralph Astell attended both St John’s and Emmanuel College in the 1650s, just as More’s career at Cambridge was taking off, and Astell later cites More’s writings in her works.

In 1678, Astell’s father died and her life trajectory took an unexpected turn. As a result of her father’s untimely death, Astell’s financial and social situation grew precarious: her mother had to borrow money to keep the family afloat, and it seems that they could never have afforded Astell’s dowry, even if she had wanted to marry. Though there were rumors that Astell had once been engaged to a clergyman, she remained unmarried and childless all her life, choosing instead to lead the life of a writer.

At some point, probably in the late 1680s, Astell made the bold decision to leave her childhood home and migrate to London, seemingly without any family support. Soon after her arrival in the city, she made the acquaintance of Archbishop William Sancroft; and then in 1689 she dedicated a book of manuscript poetry to him, out of gratitude for his counsel and assistance in her time of need. A few years after completing this manuscript, Astell turned her hand to philosophy. In 1693, she embarked on a correspondence with John Norris, the author of a series of popular religio-philosophical works called the Practical Discourses. Their letters discuss Norris’s appropriation of the moral and metaphysical ideas of Nicolas Malebranche, a French philosopher best known for his doctrine of occasionalism, the theory that God is the only true causal agent in the universe. Their correspondence continued for one year and was eventually published as Letters Concerning the Love of God (1695).

In the mid-1690s, Astell’s writing career began in earnest. In 1694, she published her first Proposal. A few years later, she followed up this original work with a second part offering a method for the improvement of women’s reason, heavily indebted to the ideas of Descartes and his followers Antoine Arnauld and Pierre Nicole. Together with the Letters, the first and second Proposals made Astell something of a minor celebrity in London. She was publicly celebrated for her wit and eloquence, and openly commended by the likes of John Evelyn and Daniel Defoe. At the height of her career, Astell also had the support of several female benefactors of high social standing, including Lady Catherine Jones, Lady Elizabeth Hastings, Lady Ann Coventry, and Elizabeth Hutcheson. As a result, Astell was able to sustain her career as a writer, at least for a decade or so.

In 1700, Astell published her most popular feminist work, Some Reflections upon Marriage, a response to the scandalous marriage of Hortense Mancini, the duchess of Mazarin. Following this, her bookseller Richard Wilkin seems to have commissioned her to write several Tory political pamphlets. In 1704, she published three short tracts: Moderation Truly Stated, An Impartial Inquiry, and A Fair Way with the Dissenters. Then in 1705, Astell published her longest and most sophisticated work of moral philosophy, The Christian Religion, as Profess’d by a Daughter of the Church of England, a work that builds on the same feminist themes as her earlier treatises. In her final publication, Bart’lemy Fair (1709), Astell targets the third earl of Shaftesbury’s defense of free speech in his Letter Concerning Enthusiasm (1708).

After 1709, Astell did not publish any new works. But there is evidence that until her death she kept writing and also diligently editing her previous publications, not only her Christian Religion and Bart’lemy Fair (published in second editions in 1717 and 1720 respectively), but also the second part of her Proposal. In her later years, in keeping with her life-long interest in female education, Astell also took on the practical task of running a charity school for poor girls in her beloved neighborhood of Chelsea.

2. Theory of Knowledge

Astell’s guidelines on how to attain knowledge can be found in the second part of her Proposal (1697). In this work, Astell’s epistemological approach is distinctly rationalist insofar as she regards knowledge as founded on reason alone, and denies that sensory experience can be trusted as a reliable guide to truth. Her strict definition of knowledge is “that clear Perception which is follow’d by a firm assent to Conclusions rightly drawn from Premises of which we have clear and distinct Ideas” (SPL II 149). Like Descartes in his Principles of Philosophy (1644), Astell regards a perception as “clear” when it is accessible to the mind’s eye and the mind’s attention is firmly fixed on it. A perception is “distinct” when it is not only clear but also “particular” and distinguished from all other things. If an idea is both clear and distinct, then in Astell’s opinion we cannot withhold our assent from it (we cannot but affirm that it is true), without offending against reason.

Astell claims that we can attain knowledge by affirming only those ideas that are clear and distinct. To do so, we must learn to regulate the will, the mind’s active faculty of affirming or denying the ideas of the understanding. The will is to blame when we fall into erroneous judgements. We only really go astray because the will foolishly assents to more than it perceives; instead of carefully attending to the ideas of the understanding, it hurries on and makes rash judgments, beyond the scope of its ideas. We cannot successfully regulate the will, according to Astell, until we have learnt to moderate our passions or emotions. Certain emotions, such as pride and vanity, can prevent us from properly engaging in the search for truth. When we are faced with a truth that contradicts our mistaken idea of self-interest, for example, we shut our eyes against it and unreasonably refuse to entertain it.

Accordingly, in Astell’s view, a healthy disengagement from worldly things is an important first step toward the attainment of clarity and distinctness. Toward this end, in both her Proposals, she argues for the necessity of an academic retreat for women, so that they might withdraw from the hurry and noise of the everyday world (temporarily, at least) and focus their attention on nobler subjects. Importantly, she is not so concerned that women acquire knowledge for its own sake, but rather as a means for them to attain enduring happiness in both this life and the next. In her view, reason is the natural light that God has set up in our minds so that we might conform ourselves to his will and come to join him.

To attain both truth and happiness, a woman must follow reliable rules for thinking. Astell’s six rules bear a notable resemblance to Descartes’ own set of rules in his Discourse on the Method (1637), as well as those of his followers Arnauld and Nicole in their Logic, or the Art of Thinking (1662). She states that in any given inquiry, (i) we must acquire a distinct notion of our subject and a precise understanding of any key terms; (ii) we must avoid straying into any unnecessary or irrelevant subject matters, and conduct our thoughts in a natural, logical order; (iii) we must accordingly examine the simplest subjects first, before progressing to the study of more complex matters; (iv) we must take care to examine our subject thoroughly, according to each of its parts, and be sure not to leave any part unexamined; (v) we must keep our focus firmly fixed on the subject at hand; and finally, and most importantly, (vi) we must not judge any further than we perceive, and we must not affirm anything as true unless it is incontestably known to be so.

In her later work, The Christian Religion, Astell deviates from Descartes’ epistemology by suggesting that the perception of truth is a participation in the mind of God (§262). In this respect, Astell comes closer to her unorthodox Cartesian contemporaries Norris and Malebranche, both of whom deny Descartes’ view that our ideas are innate, born within us, in our minds. Instead, her view has more in common with Augustine’s illuminationist theory that the human mind is capable of understanding ideas only by means of the divine light.

3. Metaphysics of Mind and Body

Astell’s argument for the soul-body distinction can be found in section 228 of her Christian Religion, embedded within a larger argument against Locke’s doctrine of “thinking matter.” Astell begins her critique of Locke with an inquiry about the nature of “the thing in us that thinks”: is it immaterial? Or could it be material, as Locke appears to suggest in his Essay Concerning Human Understanding (1690)? In response, she points to the fact that the mind has entirely different properties and affections to the body, and that we can have a complete idea of mind as a thinking thing without considering it as dependent on, or related to, our idea of body as extended substance. But if we can have a complete idea of something independently of a complete idea of another thing, she says, then those two things are really distinct. The mind and body are therefore distinct. Contra Locke, she says that we can affirm that the idea of thinking being excludes extension, and the idea of extended being excludes thought.

Like Descartes, Astell maintains that the human person is composed of two substances: the soul (or mind), which is a thinking thing, and the body, which is extended substance. However, she makes few explicit statements about how the soul moves the body (soul-body causation) or how the body causes sensations (body-soul causation). Some of her statements appear to suggest that she upholds an occasionalist theory of body-soul causation. According to a Malebranchean occasionalist, neither bodies nor souls have any genuine causal efficacy; only God has the causal power to bring about modifications in the human mind. In one passage of her Christian Religion, Astell suggests that God is the true efficient cause of all sensation, and she seemingly denies that material objects have any power to produce modifications in our souls (§378). These remarks, however, must be placed in the context of Astell’s response to Damaris Cudworth Masham, a Lockean philosopher who had vehemently attacked the Malebranchean moral and metaphysical ideals of the Astell-Norris Letters. In the passage in question, Astell’s main point is that even if we were to embrace those Malebranchean ideals without criticism, it is not clear that they are as harmful to morality as Masham would suggest.

Other statements indicate that Astell holds an orthodox Cartesian interactionist position on soul-body and body-soul causation. In the Letters, she raises two objections to Norris’s view that God’s will is the only true cause of our sensations and that bodies are incapable of exerting a causal influence on souls. First, she points to the fact that if sensible objects are redundant features of God’s creation, as Norris suggests, then this offends against our idea of God as a supremely wise and perfect creator. Second, she points out that the existence of genuine secondary causes is more befitting of God’s majesty, because if such causes do exist, then he need not continually interfere in his own creation. As an alternative to occasionalism, Astell supports the view that there is a natural power, a “sensible congruity,” in bodies that enables them to cause sensations in the soul. In the second Proposal, Astell also takes an orthodox Cartesian stance by suggesting that the body is disposed to make impressions on the soul and that the soul has an active power to effect changes in the body.

Astell’s philosophical concept of the self as a thinking thing informs her feminist thought. She advises her fellow women that they must learn the value of proper self-love and self-esteem: the love and esteem of their souls and not their bodies. They must cease to live like animals or Cartesian machines, those purely material beings devoid of rationality; they must pursue what is conducive to their perfection as thinking, immaterial beings.

4. Philosophy of Religion

The Christian conception of God plays a crucial role in Astell’s wider project to bring women to the knowledge of the true source of their happiness. We can be assured, she says, that God always does what is best and most becoming of his infinite perfection; and so, we can be assured that the world and everything in it is created according to the eternal and immutable standards of rectitude. It therefore becomes women to live their lives in accordance with the law of God and reason—this is the surest route to their happiness.

Astell presents at least three different types of argument for the existence of God. In her second Proposal, she develops an ontological proof, an argument for God’s existence based on premises that can be known independently of experience. In the same work, immediately following this proof, she formulates a cosmological argument for the existence of God based upon empirical observations about the created world. In The Christian Religion, she once again takes a blended approach by presenting an ontological argument followed by a cosmological proof. Then in her final work Bart’lemy Fair, she offers yet another causal argument, this time based on the principle that a cause must have either the same or higher qualities than its effect.

In her second Proposal, Astell echoes the English translation of Descartes’ Meditations (1680) when she begins her ontological argument with an idea of God as “a being infinitely perfect.” She then asks the question: does this infinitely perfect being exist? Her answer is that, according to our intuitions, the idea of God and the idea of existence are compatible, because existence is a perfection and the necessary foundation of all other perfections (since what doesn’t exist can’t have any perfections). Moreover, if any being is infinite in all perfections, then we cannot deny that that being exists; therefore, we cannot deny that God, an infinitely perfect being, exists. In sections 7–8 of The Christian Religion, Astell strengthens this argument by asserting that an infinitely perfect being would have the perfection of self-existence, rather than ordinary everyday existence. She asserts that God could not derive his being from anyone but himself; if God had derived his existence from someone or something else, then he would not be supremely perfect. So, God must have ontological independence or self-existence; he must exist by his own nature.

A similar appeal to God’s ontological independence lies at the heart of Astell’s cosmological arguments for God. In the second Proposal, her argument begins with the idea of created or contingent beings. In her view, this idea naturally suggests to us the idea of “the power of giving being” to something. How were these contingent beings created? They cannot have had the power of giving being to themselves, because this would imply a contradiction; it would imply, that is, that they could both exist and not exist at the same time. The thing that created these contingent beings would therefore have to be self-existent. It could not be another created, contingent being because this would lead to an infinite regress of such beings. Yet an infinite regress without a last resort offends our basic intuition that something cannot come from nothing (ex nihilo nihil fit). It follows that there must be a last resort or a first cause: there must be a self-existent being who created those contingent beings—and this being is God. Astell presents a similar causal argument in her Christian Religion (§10).

In Bart’lemy Fair, Astell takes a different tack in order to explain why we must regard this self-existent being as the traditional theistic God. Here she implicitly appeals to the principle that a cause must have qualities that are similar to, or higher in perfection than, those contained in its effect. Her proof begins with the empirical observation that there is gravitation or “mutual attraction” between physical bodies in the created world. She then asks, how do we explain this phenomenon? If gravity is not an essential property of matter, then we must say that gravity proceeds from the will and power of a superior cause. But this superior cause cannot be material in nature, for that would imply that matter is superior to matter in general (a contradiction); so, the cause must be immaterial. This superior immaterial cause, moreover, must have the will and power to sustain mutual attraction between bodies. In short, this cause must be the theistic God.

5. Moral Theory

In terms of her moral approach, Astell might best be described as a Christian deontologist; in her view, all human beings have a duty to live in accordance with the law of God. Nevertheless, she is also a virtue theorist to the extent that she thinks that we ought to develop a disposition to obey the divine law, and developing this disposition requires us to cultivate virtue. These moral views can be found in all her works, but especially in her Letters, the second Proposal, and The Christian Religion.

According to Astell’s strict definition, virtue consists in the soul gaining mastery over the bodily impressions and directing its passions toward the right objects, in the right “pitch” (or intensity), according to the dictates of reason (SPL II 214). She warns that the bodily passions of love, hate, fear, desire, and joy can have a disturbing and disquieting effect on the human mind. When we are in the thrall of such passions, we can get carried away and zealously pursue the wrong objects, often to our moral and spiritual destruction. The proper regulation of the passions thus plays an important role in the attainment of virtue.

Astell thinks that the passions need not be obstacles on the path to virtue, provided that they are “hallowed” or purified in some way. As a long-term strategy toward purification, we should meditate carefully on what is truly good and truly bad, and follow only those moral judgements that proceed from knowledge. Crucial to this endeavor, we must learn to focus our attention on the right objects, including our own nature as thinking things, the true nature of material beings, and the nature of an infinitely perfect being. Moral agents often go astray, according to Astell, because they have mistaken or erroneous judgements about the nature and value of these objects.

There are a number of virtues (excellences of character) that feature prominently in Astell’s moral theory; the most significant are benevolence, generosity, and friendship. Benevolence is a wishing well toward others purely for the sake of promoting their well-being, and not for selfish motives. In her writings, the love of benevolence is often contrasted with the love of desire, a selfish egoistic kind of love for others, in which we desire to possess them. On this topic, her views have much in common with the Augustinian outlook of her correspondent John Norris. Like Norris, she maintains that a virtuous agent has properly ordered love. In their Letters, they agree that human beings ought to cultivate an exclusive love of desire for God, an infinitely perfect being, because he is the only being who is truly capable of satisfying our desire. Toward our fellow human beings, we should feel only a love of benevolence; we should cultivate a disinterested goodwill rather than a selfish desire. Unlike Norris, Astell emphasizes that an exclusive desire for God can have the added benefit of helping us to regulate our passions and cultivate a non-possessive attitude toward others.

In Astell’s view, the virtue of generosity (or having “a generous soul” and “a generous temper”) also provides a remedy for our selfish desires. Like Descartes in his Passions of the Soul (1649), she regards the virtue of generosity as a species of self-esteem, a valuing ourselves on the basis of some noble or worthy characteristic. More than this, generosity consists in recognizing that our moral worth lies in the exercise of our free will, together with a firm commitment always to do our best. Those who have the virtue of generosity eventually cease to desire the approbation of others, because they do not really care what the rest of the world thinks of their choices and actions. So long as they themselves always endeavor to do what is best in their own minds, they are impervious to censure and ridicule.

The difficulty for women, Astell says in her second Proposal, is that they have been culturally conditioned to value themselves on accidental properties such as their looks and their clothing. They have acquired a mistaken sense of self-esteem because they have not been encouraged to value themselves as rational, thinking beings with freedom of will. To cultivate justified self-esteem, according to Astell, women must be permitted to train their reason and to study philosophy and religion. She thinks that Christianity in particular facilitates the cultivation of generosity, because it teaches them that what is truly valuable does not depend on the transient things of this world.

Finally, the virtue of friendship (a species of the love of benevolence) plays an important role in Astell’s moral thought. In her view, one of the chief benefits of her female academy is that it will enable virtuous friendships to flourish among women. These friends will then watch over each other’s moral and intellectual advancement, with the aim of advising and encouraging each other toward perfection.

6. Feminism

a. Education

Astell’s first Proposal is essentially an exercise in consciousness-raising, for the purpose of bringing about the moral and intellectual reformation of early modern women. The “proposal” of Astell’s title is an all-female academic institute, where like-minded scholars of a similar age and social status might live and study together for a number of years. Although a wealthy gentlewoman expressed interest in funding Astell’s proposal, an academy never materialized in her lifetime—possibly due to the suspicion that it sounded like a Catholic nunnery.

Throughout her works, Astell appeals to different philosophical ideas to argue that women should receive a higher education, and to undermine the belief that women are naturally intellectually inferior to men. These ideas include an egalitarian conception of reason, the Cartesian concept of the thinking self, and certain teleological principles.

To challenge the idea that women are mentally inferior, Astell’s historical predecessors traditionally pointed to empirical evidence or famous instances of exemplary women. By contrast, Astell appeals only to an inward consciousness of thought. In her view, the fact that women are thinking things needs no proof or argument; a woman simply has to turn within herself and see that she is capable of exercising her mental faculties. Astell emphasizes that the search for knowledge does not require the mastery of languages, such as Greek and Latin, nor does it require an extensive library or an intimate acquaintance with ancient authorities and obscure terminology. It simply requires the capacity to discern the truth for oneself, and the freedom to affirm or deny the ideas of the mind. In terms of their capacity for rational judgement, Astell says, women are no different to men; they are on a par.

While Astell never articulates the cogito (Descartes’ famous insight that “I think therefore I am”), she does rely on a similar logic. She relies on the idea that if a woman is capable of entertaining a thought in her mind, then it is true that she thinks; it cannot be denied. To improve their reason, according to Astell, women need only familiarize themselves with their own internal “natural logic.” Can they reason about the everyday management of household affairs? Can they make informed judgments about the course of a romance or the design of a petticoat? If so, then this provides indisputable evidence of their ability to reason. If women exhibit any defect in reasoning, Astell says, this defect is acquired rather than natural, and can be corrected through proper training and meditation. They can improve their reasoning skills by following simple Cartesian rules for thinking (see the “Theory of Knowledge” section above).

It should be noted that Astell differs from Descartes in emphasizing that we can never have a distinct idea of the self as a thing whose essence consists solely in thinking (SPL II 173). She also differs from Descartes by appealing to God’s final causality in order to bolster her arguments for women’s education. In her writings, she repeatedly emphasizes that an infinitely perfect being does nothing in vain; there can be no feature of his intelligent design that is redundant or superfluous in nature. It follows that if God has bestowed rational minds upon women, then they ought to be permitted to use their minds toward the best ends. When a woman is taught that her duty is to serve a man, or to live a life devoted solely to bodily and material concerns, she is taught to disregard her sacred duty to God. A woman must therefore be educated to use her reason to raise herself toward perfection, just as her creator intended.

b. Marriage

In Some Reflections upon Marriage, Astell examines women’s disadvantages within the early modern marriage state. This work was ostensibly a response to Hortense Mancini’s much-publicized separation from her abusive and unstable husband, the duke of Meilleraye.  Although Astell regards marriage as a sacred institution ordained by God, she complains that in her day it has greatly degenerated from its original blessed state. In the Reflections, her explicit purpose is to analyze why this degeneration has occurred and to see how it might be rectified. She traces the core problem to the moral failings of human beings—but to the failings of men in particular. She highlights the fact that most men do not marry from a love of benevolence toward women but rather from base and selfish motives, such as lust and greed. Marriage would be a happy state today, she insists, if only human beings were guided by their reason and not by brutish passions. Astell warns her fellow women to be extremely wary of entering into marriage in the first place. She points to the fact that a wife is expected to offer blind submission to her husband, even when he does not deserve it. This expectation of submission might lead a woman to ignore the dictates of her reason, the law of God, and to act in terms of worldly self-interest instead. As a result, an unhappy marriage to a vicious man could lead to the destruction of a woman’s soul. As a remedy, Astell once again highlights the necessity of a good education for women, to fortify their reason and to cultivate their virtue. If Mancini had had the benefit of a higher education in philosophy and religion, Astell suggests, her husband’s abuse might not have led to her moral degradation.

Some scholars propose that Astell’s Reflections contains a hidden political sub-text. More specifically, they interpret the work in light of Astell’s conservative Anglican Tory political commitments. In their view, when Astell highlights female slavery within marriage, as when she asks her famous question, “if all Men are born free, how is it that all Women are born slaves?” (RM 18), she is really presenting an ironic challenge to Whig theorists of her time. They claim that she challenges her Whig opponents to extend the same authority to sovereigns in the state that they uncritically permit to husbands in the domestic sphere. If submission and obedience to authority are acceptable in the family home, she asks, then why not in the state? Whig theorists, such as Locke, ought to practice the same obedience to their political leaders that they exact from their domestic subjects; that is, they ought to practice passive obedience.

7. Political Thought

Astell has been widely interpreted as a critic of Locke’s political thought and as a vocal opponent of the Whig theories of liberty, toleration, and resistance. For some commentators, it is puzzling that Astell could be both a feminist and a High-Church Tory. At first glance, her support for women’s freedom of judgement seems to be incompatible with her support for a political party that opposes freedom of conscience, a tolerationist ethic, and other perceived threats to the Anglican church. To dispel these tensions, scholars have highlighted the fact that Astell’s feminism is founded on philosophical principles, not progressive political ideals, and this partly explains why Astell does not call for full political equality for women in her time.

In keeping with Anglican political theology, Astell maintains that all subjects are bound to observe the doctrine of passive obedience, the idea that subjects must actively obey political authority where they can, and quietly submit to the penalty for disobedience where they cannot (in those cases, for example, where the authority commands something sinful or irreligious). In her view, political subjects are never justified in engaging in active resistance to the crown, even if the crown wields a tyrannical, arbitrary power. These commitments lead Astell to criticize Locke’s views concerning the natural law of self-preservation and the right of resistance in his Two Treatises (1689).

In Locke’s view, every man has an equal right to freedom from arbitrary power. In the natural state, whenever another man threatens to enslave me, I have the right to resist him in order to preserve my life, liberty, and property. In civil society, a political authority is set up to ensure the preservation of my life, liberty, and property; but if that authority fails to act for the public good, and wields a tyrannical, arbitrary power instead, I can still exercise my right of resistance, as an extension of the natural law of self-preservation. I can depose that authority by force, if need be.

In response, in her Christian Religion (§274), Astell agrees with Locke that self-preservation is a fundamental right. But in her view, strictly speaking self-preservation consists in the preservation of the immaterial, immortal soul; so, according to the natural law, we are only ever permitted to act to secure our souls from damnation. From her Anglican viewpoint, soul-preservation entails passive obedience, not active resistance.

8. Legacy

In her lifetime, Astell’s writings were known to the philosophers John Locke, Gottfried Wilhelm Leibniz, and George Berkeley. But her ideas seem to have had the greatest impact on other eighteenth-century defenders of women, such as Mary Chudleigh, Elizabeth Thomas, the writer known as “Eugenia,” Mary Wortley Montagu, and Sarah Chapone. Her influence as a feminist can be discerned right up to the suffragist movement of the late nineteenth century, especially in the writings of English suffragette Harriett McIlquham. In recent history, there have been two revivals of academic interest in Astell as a feminist: the first from the 1890s to the early twentieth century; and the second from the mid-1980s to the present day, facilitated to a great extent by Ruth Perry’s authoritative biography, The Celebrated Mary Astell. Perry claims that Astell would be surprised at the history of her reception as feminist pioneer—Astell thought of herself more as a metaphysician and philosopher than a political reformer.

9. References and Further Reading

a. Primary Sources

  • Astell, Mary, Bart’lemy Fair: Or, An Enquiry after Wit; In which due Respect is had to a Letter Concerning Enthusiasm, To my LORD ***, London: Richard Wilkin, 1709.
    • Astell’s moral-theological critique of Whig political ideas in Shaftesbury’s Letter. No modern edition currently exists.
  • Astell, Mary, Astell: Political Writings, ed. Patricia Springborg, Cambridge Texts in the History of Political Thought, Cambridge: Cambridge University Press, 1996.
    • Contains the third edition of Reflections on Marriage (1706), cited in the text as RM.
  • Astell, Mary, A Serious Proposal to the Ladies, Parts I and II, ed. Patricia Springborg, Peterborough, ON: Broadview Press, 2002.
    • Standard modern edition of Astell’s best-known work. Cited in the text as SPL part, page.
  • Astell, Mary, and John Norris, Letters Concerning the Love of God, ed. E. Derek Taylor and Melvyn New, Aldershot, UK: Ashgate, 2005.
    • Modern edition of Astell’s correspondence with the Malebranchean philosopher John Norris.
  • Astell, Mary, The Christian Religion, as Professed by a Daughter of the Church of England, ed. Jacqueline Broad, The Other Voice in Early Modern Europe: Toronto Series, Toronto, ON: Centre for Reformation and Renaissance Studies and Iter Publishing, 2013.
    • Modern edition of Astell’s most mature work of moral theology, based on the 1717 second edition. Cited in the text by section number.

b. Secondary Sources

  • Boyle, Deborah, “Mary Astell and Cartesian ‘Scientia’,” in Judy Hayden, ed., The New Science and Women’s Literary Discourse: Prefiguring Frankenstein, New York: Palgrave Macmillan, 2011, 99–112.
    • Account of Astell’s theory of knowledge and her distinction between faith, science, and opinion.
  • Broad, Jacqueline, The Philosophy of Mary Astell: An Early Modern Theory of Virtue, Oxford: Oxford University Press, 2015.
    • First book-length examination of Astell’s wider philosophy. Presented from the point of view of her ethical theory.
  • Detlefsen, Karen, “Custom, Freedom and Equality: Mary Astell on Marriage and Women’s Education,” in Alice Sowaal and Penny A. Weiss, eds., Feminist Interpretations of Mary Astell, Re-reading the Canon, University Park, PA: Pennsylvania State University Press, 2016, 74–92.
    • Examines Astell’s Cartesian epistemology with a focus on dispelling tensions within her feminism.
  • Goldie, Mark, “Mary Astell and John Locke,” in William Kolbrener and Michal Michelson, eds., Mary Astell: Reason, Gender, Faith, Aldershot, UK: Ashgate, 2007, 65–85.
    • Insightful analysis of Astell’s critique of John Locke’s religious and philosophical ideas.
  • Kinnaird, Joan K., “Mary Astell and the Conservative Contribution to English Feminism,” The Journal of British Studies 19, no. 1 (1979), 53–75.
    • Analysis of connections between Astell’s feminism and her conservative religious and political commitments.
  • Lascano, Marcy P., “Mary Astell on the Existence and Nature of God,” in Alice Sowaal and Penny A. Weiss, eds., Feminist Interpretations of Mary Astell, Re-reading the Canon, University Park, PA: Pennsylvania State University Press, 2016, 168–87.
    • One of the first detailed discussions of Astell’s proofs for the existence of God.
  • Lister, Andrew, “Marriage and Misogyny: The Place of Mary Astell in the History of Political Thought,” History of Political Thought 25, no. 1 (2004), 44–72.
    • Interprets Reflections as a feminist work with the primary aim of urging women to remain single if possible.
  • Myers, Joanne E., “Enthusiastic Improvement: Mary Astell and Damaris Masham on Sociability,” Hypatia: A Journal of Feminist Philosophy 28, no. 3 (2013), 534–50.
    • Provides insight into the so-called debate between Astell and fellow feminist Masham.
  • O’Neill, Eileen, “Mary Astell on the Causation of Sensation,” in William Kolbrener and Michal Michelson, eds., Mary Astell: Reason, Gender, Faith, Aldershot, UK: Ashgate, 2007, 145–63.
    • Interprets Astell as holding a Cartesian interactionist position on mind-body causal relations.
  • Perry, Ruth, The Celebrated Mary Astell: An Early English Feminist, Chicago, IL: University of Chicago Press, 1986.
    • The most authoritative and engaging account of Astell’s life and works.
  • Sowaal, Alice, “Mary Astell’s Serious Proposal: Mind, Method, and Custom,” Philosophy Compass 2 (2007), 227–43.
    • Analysis of Astell’s educational strategy in relation to her theory of mind.
  • Springborg, Patricia, Mary Astell: Theorist of Freedom from Domination, Cambridge: Cambridge University Press, 2005.
    • Interprets Astell’s writings in light of her support for the Tory political party and her High-Church Anglicanism.
  • Squadrito, Kathleen M., “Mary Astell’s Critique of Locke’s View of Thinking Matter,” Journal of the History of Philosophy 25, no. 3 (1987), 433–9.
    • Early article on Astell’s critique of Locke’s claim that God could conceivably add the power of thinking to matter.
  • Taylor, E. Derek, “Mary Astell’s Ironic Assault on John Locke’s Theory of Matter,” Journal of the History of Ideas 62, no. 3 (2001), 505–22.
    • Examines Astell’s critique of Locke with reference to Astell’s own views about the mind-body relationship.
  • Taylor, E. Derek, “Mary Astell’s Work Towards a New Edition of a Serious Proposal to the Ladies, Part II,” Studies in Bibliography 57 (2005–6), 197–232.
    • Provides evidence that Astell may have had plans for a new edition of her second Proposal (1697).

 

Author Information

Jacqueline Broad
Email: jacqueline.broad@monash.edu
Monash University
Australia

The Port Royal Logic

Logic or the Art of Thinking, commonly known as The Port Royal Logic, was written by Antoine Arnauld and Pierre Nicole and first published in 1662. Although it was a textbook containing much worked-over material, the Logic was extremely influential, certainly the most important textbook in logic for the next two hundred years. Part of its influence was due to its accessibility: it was short for a logical treatise and the first logic textbook in a vernacular language. It was quickly translated, had numerous editions, and was popular throughout Europe and the U.S. well into the 19th century. Its technical logic, however, is unoriginal. From a modern perspective the Logic’s interest is twofold: it harmonizes Cartesian dualism with standard doctrines of late medieval logic, and for the first time it gives intentional content a central role in semantics. The two are related. Because dualism was inconsistent with the standard medieval theory of reference, it was necessary to forge a new foundation. To do so, the Logic’s authors relocated objective being, an early version of intentional content, to the center of logical theory. This article focuses on the Logic’s innovations in semantics, especially the role of intentional content, and on the place of its innovations in the history of logic, both where they came from and how they evolved.

As a result of its commitment to dualism, the Logic faced four tasks: (1) to explain anew how terms in mental language “signify” things in the world, (2) to reformulate truth-conditions in a way compatible with its new definition of signification, (3) to preserve the standard proof theory of late medieval logic, and (4) to explain how logical demonstration contributes to scientific knowledge in the context of Cartesian rationalism. These tasks correspond to the four “parts” of the Logic. Parts I to III correspond to the standard books of earlier logical treatises, which follow a division loosely modeled on Aristotle’s Organon: the logic of terms, the logic of propositions, and the logic of arguments. Part IV concerns method, a topic of special interest in the 17th century.

Table of Contents

  1. The Logic of Terms
    1. Summary
    2. Ontology
    3. Dualism, Ideas
    4. Mental Language
    5. Intentional Content
    6. Kinds of Ideas, Occasionalism
    7. Determinative and Explicative Restriction
    8. Indeterminate Restriction
    9. False Ideas
    10. Abstraction
    11. The Categories and Predicables
    12. Nominalism-Realism
    13. Comprehension as a Generalization of Essence
    14. Species and Difference
    15. Signification and Extension
    16. The Structure of Ideas
  2. The Logic of Propositions
    1. Summary
    2. Modality
    3. Distributive and Confused Supposition
    4. Truth-Conditions for Categorical Propositions
    5. The Correspondence Theory of Truth
  3. The Logic of Arguments
    1. Summary
    2. The Syllogistic
    3. Validity
  4. Method
    1. Summary
    2. Necessary and Contingent Truth
    3. Certainty, Clear and Distinct Ideas
    4. Demonstration
    5. Sensation and Knowledge of Contingent Truth
    6. Method: Analysis and Synthesis
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. The Logic of Terms

a. Summary

(The original edition of the Port Royal Logic is hereafter referred to as Logic; Arnauld 2003, hereafter KM, in which the Logic is vol. 5, 99–413. English translation: Arnauld 1996, hereafter B.) In Part I, the authors lay out the fundamental assumptions and concepts of their semantic theory. These include a substance-mode ontology with its dualistic division into matter and spirit (Logic, Part I, Chapter 2, hereafter I:2); a theory of mental language (I:1, 4); ideas and their causes, including abstraction and restriction (I:1, 5, 8); the traditional ten categories (I:3) and five predicables including genera and species (I:7); false ideas and error (I:9–11); and essential definitions (I:12–15). Most importantly they explain their theory of reference (I:6). The key concepts possess a definitional order. First, every term possesses by nature an intentional content. This content determines what the term signifies in the world. What it signifies in turn determines its inferior ideas. Inferior ideas then combine to form the term’s extension. Extension will then be the key concept in the definition of truth in Part II. At multiple points in both the introductory Discours and Part I the authors point out the intellectual and moral dangers lurking in equivocation and false ideas.

b. Ontology

In the introduction the authors decline to engage in the realism/nominalism debate, on whether, as they put it, universals exist a parte rei, because they judge the issue uninteresting and useless (Discours I, KM V 112–113, B 11–12). In Part I, nevertheless, they assume a basic substance-mode ontology that is roughly Aristotelian. They divide being into two kinds: substances, which can be conceived as existing independently, and modes (attributes, qualities), which can only be conceived of as existing instantiated in substances (I:2).

c. Dualism, Ideas

To this Aristotelian foundation the Logic adds Cartesian dualism. Substances and their modes divide into two kinds: spiritual and material. The essential property of material substances is extension and that of souls is thought. In the Logic the modes attributed to material substances are those described in Cartesian physics; for example, relative size, position, motion, and shape. Modes attributed to the soul include sensory qualities, ideas, and mental operations. These operations include the three traditionally listed in medieval logic: conception (concevoir), judgment (affirmation and denial, juger), and reason (logical deduction, raisonner), and a fourth, the methodological organization of knowledge (ordonner), which was considered important in 17th-century logic (I:Introduction, KM V 125, B 23). These four operations correspond to the four parts of the Logic. Although the authors sometimes use idea loosely to refer to any spiritual mode, in more precise contexts an idea is a mental mode that functions as a term in mental language, or what medieval logicians and Descartes call a concept.

d. Mental Language

The Logic discusses grammar piecemeal (I:1–6 and II). It does not provide an exhaustive breakdown of spoken language into basic parts of speech, nor does it attempt to formulate precise grammar rules for complex expressions like those of a modern generative grammar. As in medieval logic, the spoken language in which logic is conducted (and which the Logic discusses) turns out to be a rather stylized fragment of natural language (I:1, 4). Chomsky surprised the linguistic community in the 1960s by pointing out in Cartesian Linguistics that the Logic posits a mental language parallel to speech, and by suggesting that its distinction anticipates his own between surface and deep structure (Chomsky 1966, 31 and following). It is more accurate to say that medieval logicians had been working out the theory of mental language for centuries, a theory in which spoken words and phrases were conventional signs for a language of thought that was prior to speech and had its own grammar and semantics. The basic linguistic operations are conceptualization, judgment, and reasoning.

Conceptualization is the act of instantiating in the soul an idea that serves as a basic term in mental grammar. These ideas have semantics. An idea by its nature has an intentional content that the soul is aware of more or less clearly during the act of conceptualization. This intentional content determines what the idea signifies in the world. What it signifies in turn determines other ideas that are “inferior” to it. The set of its inferiors constitutes its “extension” in the special sense peculiar to the Logic. An idea that fails to signify anything real (that fails of reference) is called a false idea.

Judgment is the act in which the soul affirms or denies propositions, which are grammatical complexes in which ideas occur as terms. Reasoning is the act where the soul draws a conclusion from other propositions as premises. Part II explains the truth-conditions of propositions, and Part III explains which reasoning patterns are valid.

Substantives and adjectives are the two basic kinds of referring terms in mental propositions. The Logic has no single technical term for reference. Sometimes it is called expression, sometimes representation, but most frequently it is called signification, which was the standard term in earlier logic. Fundamental to the Logic’s semantics is the thesis that signification is explained by intentional content (I:2, 5–6).

e. Intentional Content

As mentioned above, one of the challenges faced by the Logic was how to reconstruct the medieval theory of reference. In earlier Aristotelian accounts, reference is explained by the transmission of a property from the world to the soul. By sensation and abstraction, the view held, an external property was causally transmitted via the sense organs to the brain and from there to the intellect. Once in the intellect it serves as a concept or term in mental language. This term was then said to signify those objects outside the mind that instantiate the transmitted property. Dualism, however, makes this mechanism impossible. If dualism is true, no property can be instantiated in both matter and the soul.

To explain reference, the Logic appeals to intentional content. Intentional content was far from a new idea. Versions had been used throughout the Middle Ages to explain various semantic phenomena (Pasnau 1997). Peter Aureol holds that what we see when we have an illusion, like the apparent movement of the trees from a passing boat, is not something that really exists outside the mind but rather a third entity that only exists “in the eye objectively” and “intentionally.” At some points in his career, Ockham calls “what we understand” when we grasp an abstract noun a “fictum” having esse objectivum and esse cognitum (William of Ockham 1978, §10). Scotus calls something’s nature an “intelligible being” distinct from the thing itself. By the 16th century, it was common for logicians to distinguish between the “formal” and “objective” being of a concept (Cronin 1966). A concept has formal being inasmuch as it is a mode of the soul and as such is part of its “form.” It exhibits objective being because it carries with it the understanding of an object: it “throws” the object “against” the mind. Suárez, for example, holds that an essential definition is true timelessly, even prior to creation, because it signifies objective being. Toletus explains “beings of reason” like a chimera, and non-referring terms like antichrist, as signifying objective being. By Descartes’ time, the distinction was commonplace in the logic books studied in schools and universities, including the schools attended by Descartes, Arnauld, and Nicole. It is prominent, for example, in treatises by Toletus, Raconis, Fonseca, and Eustache de Saint-Paul (Toletus S.J. 1596, 3, 30; Raconis 1651, De principis entis, a. 3, §1a, 827; Fonseca S.J. 1599, q. ii, §1; Eustachio-De-S.-Paulo 1648 Metaphysia, De natural entis, de conceptus formali et objectivo, 1; see also Cronin 1966).

Descartes appeals to the objective being of the idea of God in his famous argument for God’s existence in Meditation III (§§ 21–22). The Logic prefers to speak about an idea’s content, but Arnauld uses the medieval terminology objective reality or objectively being in On True and False Ideas. (See Arnauld 1813, vol. I, hereafter VFI, Ch. 5, 6; KM I, 202, 205; English translation Arnauld 1990 [1683], hereafter G, 69, 71–127.) In the Logic objective being is used to explain not only signification, but also extension, abstraction, restriction, privative negation, essential definition, ambiguity, equivocation, clear and distinct ideas, and perception.

The explanatory role of intentional content (I:6–7) begins with substantives. Grammatically, substantives are ideas that serve as the subjects or predicates of categorical propositions. Semantically, a substantive is distinguished by its intentional content, which in the case of a substantive is called its comprehension. Comprehension is explained by appeal to substance-mode ontology. A substantive’s comprehension is a series of modes. In modern terms it may be thought of as a set of modes. These form the idea’s content and provide its identity criteria. Two substantives are identical if and only if they have the same comprehension. Signification is then defined in terms of comprehension. A substantive signifies all and only those entities that satisfy all the modes in its comprehension. The theory is not unlike (indeed it is a remote ancestor of) Frege’s view that sense determines reference. A substantive that signifies many individuals is a common or abstract noun. One that signifies a single individual is a proper noun. Normally, a substantive signifies substances, but it can also signify modes, like whiteness. If a substantive signifies another idea, which is a mode of the soul, it is a term of second intention.
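The definitions in this paragraph can be illustrated with a small model. What follows is only a minimal sketch, assuming that modes can be represented as strings and that the world is a finite table of entities; the names `WORLD`, `Substantive`, and `signifies`, and the sample entities, are hypothetical and not drawn from the Logic.

```python
from dataclasses import dataclass

# Toy world (hypothetical): each entity is mapped to the modes it instantiates.
WORLD = {
    "Socrates": frozenset({"animal", "rational", "snub-nosed"}),
    "Plato": frozenset({"animal", "rational"}),
    "Bucephalus": frozenset({"animal", "four-footed"}),
}


@dataclass(frozen=True)
class Substantive:
    """An idea identified by its comprehension, modelled here as a set of modes."""
    comprehension: frozenset

    def signifies(self, world=WORLD):
        """Return all and only the entities that have every mode in the comprehension."""
        return {name for name, modes in world.items()
                if self.comprehension <= modes}


man = Substantive(frozenset({"animal", "rational"}))
rational_animal = Substantive(frozenset({"rational", "animal"}))
animal = Substantive(frozenset({"animal"}))

# Same comprehension, hence the same idea (the identity criterion).
print(man == rational_animal)
# Signification is fixed by the comprehension alone.
print(sorted(man.signifies()))
print(sorted(animal.signifies()))
```

On this modelling choice, two ideas with the same set of modes compare as equal, and adding a mode to a comprehension can only keep or shrink the set of entities signified.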

Adjectives too have intentional content, but the terminology is different. Grammatically, adjectives serve as the predicates of categorical propositions or as modifiers of substantives in longer noun phrases. Semantically, an adjective has as its intentional content a mode or sometimes multiple modes. In the case of an adjective these are called its secondary signification. This content determines the objects the adjective is true of or “signifies in the primary sense”: an adjective signifies primarily all and only the entities that instantiate all the modes in its secondary signification. Again, intentional content provides identity conditions: two adjectives are identical if and only if they have the same secondary signification. Following medieval usage, an adjective is called a connotative term (I:8; KM V 152; B 46). It directly signifies a mode and indirectly connotes the individuals in which that mode inheres. Substantives differ semantically from adjectives in that a substantive’s primary function is to signify an entity in abstraction from its modes. An adjective, however, draws attention to entities by first drawing attention to the mode or modes in its secondary signification. (The primary and secondary terminology derives from Aristotelian metaphysics, in which substances are ontologically prior to modes because a mode must exist in a substance.) Because a substantive signifies objects directly but an adjective signifies objects indirectly by first signifying a mode, a substantive is called absolute and an adjective relational.

It is clear that the Logic’s authors regarded intentional mode-sets as part of their explanation of conceptualization or of “what it is to understand an idea.” Details are fleshed out in Part IV in the discussion of clear and distinct ideas, and sensation. Like some nominalists who believed in objective being, the Logic’s authors make the point that objective being is not some kind of representative entity between, or in addition to, the soul and the external world. It is a fact of psychology, they hold, that when a perception is experienced during sensation or when an idea is clearly conceived in thought, the soul is aware of the modes that make up its content. No mode of matter experienced in the content of a perception or idea, however, can be true of the soul itself. Such modes are true rather of the material substance outside the soul that is the object of sensation or that the idea signifies.

f. Kinds of Ideas, Occasionalism

Like Descartes (Meditations III.7), the authors hold that there are three kinds of ideas that differ by how they are caused. They are adventitious, innate, and factitious.

Adventitious ideas are those caused by God on the occasion of a bodily sensation. Sensation is more fully explained in Part IV. Because material modes cannot be instantiated in the soul, the Logic is forced to reject the usual Aristotelian account of sensation and concept formation. The material transfer of modes in sensation only goes as far as the brain. The properties of a material substance travel from the object being sensed to the perceiver’s sense organs, and from there to the brain, but they stop there. Material modes cannot then be transferred “intentionally” to the soul itself to become consciously perceived. The Logic’s alternative explanation is a form of occasionalism. (On occasionalism in the Logic see I:1, KM V 132–33, B 29–30; I:9, KM V 157–78, B 49–50; I:12, KM V 168–170, B 58–60; VFI 6, KM I 204, G 71–71; VFI 27, KM I 349–50, G 208. For broader accounts in Cartesianism generally, see Nadler 2011, Nadler 1989, and Garber 1993.)

On the occasion of bodily sensation in which a material object transfers its modes to the perceiver’s brain in the form of physical motion, God simultaneously causes to be instantiated in the soul a mental mode. This mode is adventitious and is called a perception in a narrow sense. A perception, moreover, has an intentional content of which the soul is aware with varying degrees of vividness, clarity, and distinctness. Some of these modes, like motion, relative position, and shape, are material and are true of the object outside the mind causing the sensation. Other modes in the perception’s content are sensory. They are true of the soul itself, like colors, tastes, smells, textures, sounds, and feelings of pleasure and pain.

Innate ideas are ideas directly instantiated in the soul by God apart from sensation. They include the idea of infinity and of God himself.

Factitious ideas are caused by the soul itself through one of two mental operations: restriction or abstraction. Both operations were standard topics in earlier logic. The Logic’s account is novel in that it explains their mechanisms in terms of intentional content.

g. Determinative and Explicative Restriction

Grammatically, restriction is a mental operation by which the soul forms a longer substantive phrase by modifying a substantive with an adjective or relative clause. Semantically, a relative clause functions like an adjective: it has a primary and secondary signification. In restriction a new idea is formed. Its comprehension is the union of the comprehensions of the two contributing ideas. Since the new comprehension contains more modes than either the substantive or its modifier, it will frequently be true of fewer things and will then be less general. If the restricted phrase signifies fewer individuals than the substantive alone, it is said to be determinative. On the other hand, if the restriction does not signify fewer things but simply adds extraneous information, it is called explicative.

Because an explicative restriction does not reduce the significance range of the modified substantive, the proposition expressed is equivalent to a conjunction of propositions. In one of these, the extraneous modifier is deleted, and in the other, the modifier is predicated of the original substantive. For example, in “The Pope, who is the Vicar of Christ, resides in Rome,” the relative clause “who is the Vicar of Christ” does not further restrict the significance range of the subject “the Pope.” The proposition is therefore equivalent to a conjunction of two propositions: “The Pope resides in Rome” and “The Pope is the Vicar of Christ” (I:8, KM V 151–52, B 44–45). This distinction between determinative and explicative relative clauses had been made frequently in earlier logic. (See, for example, Buridan 2001, 286; Parsons 2014, 5.6.) The Logic adds its explanation in terms of content. The distinction is also made in modern grammar using the terminology restrictive and non-restrictive relative clauses.
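The contrast between determinative and explicative restriction can be sketched in the same set-theoretic terms. This is only an illustrative model under stated assumptions: modes are strings, the world is a finite table, and the restricted idea pools the modes of the substantive and its modifier, so that it contains at least as many modes as either. The names `WORLD`, `signifies`, `restrict`, and `kind_of_restriction` are hypothetical.

```python
# Toy world (hypothetical): each entity is mapped to the modes it instantiates.
WORLD = {
    "Socrates": frozenset({"human", "rational", "Greek"}),
    "Cicero": frozenset({"human", "rational", "Roman"}),
    "Bucephalus": frozenset({"horse", "Greek"}),
}


def signifies(comprehension, world=WORLD):
    """Entities that instantiate every mode in the comprehension."""
    return {name for name, modes in world.items() if comprehension <= modes}


def restrict(substantive, modifier):
    """Comprehension of the restricted idea: the pooled modes of both ideas."""
    return substantive | modifier


def kind_of_restriction(substantive, modifier, world=WORLD):
    """Classify a restriction by whether it narrows what the substantive signifies."""
    before = signifies(substantive, world)
    after = signifies(restrict(substantive, modifier), world)
    return "determinative" if len(after) < len(before) else "explicative"


human = frozenset({"human"})
greek = frozenset({"Greek"})
rational = frozenset({"rational"})

# "Greek human" picks out fewer individuals than "human".
print(sorted(signifies(restrict(human, greek))))
print(kind_of_restriction(human, greek))
# In this toy world every human is rational, so the modifier narrows nothing.
print(kind_of_restriction(human, rational))
```

Whether a given restriction counts as determinative or explicative depends, in this sketch, on the world supplied: the same modifier could narrow the signification in one world and leave it untouched in another.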

h. Indeterminate Restriction

The Logic also recognizes what some commentators treat as a special type of restriction, called indeterminate restriction (I:6, KM V 145, B 40; I:7, KM V 147–48, B 41–42; I:7, KM V 150, B 44; II:3, KM V 199, B 83). (Pariente 1985, 247–238, Auroux 1993, 74.) This is not really a second type of restriction but rather a way of referring to restriction in the metalanguage using an existential quantifier. Indeterminate restriction is important in Part II where it is used to state the truth-conditions of particular affirmative propositions. As explained there, a particular affirmative some S is P is true if there is some third term, call it Q, by which both terms S and P are restricted with the result that the restricted terms have the same extension. As in a similar analysis of Aristotle’s called ecthesis (see, for example, Prior Analytics 28a23–26, 30a9–14), the common subset shared by S and P is “exhibited” by the two restricted terms. In the precise statement of the truth-conditions, restriction occurs in its univocal sense but in such a way that there is an existential quantification in the metalanguage over the restricting term: some S is P is true if and only if there is some idea Q so that restrictions of S by Q and P by Q have the same extension.
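
The existential quantification over the restricting term Q can be sketched as a search over candidate ideas in a finite toy model. Several things here are assumptions made for the illustration and are not taken from the Logic: the invented individuals and modes, the use of significance ranges in place of extensions, and above all the requirement that the exhibited range be non-empty, which is added so that a restricting idea signifying nothing does not make every particular affirmative trivially true.

```python
from itertools import combinations

# Invented toy world: each individual is listed with the modes true of it.
WORLD = {
    "fido": {"animal", "dog", "brown"},
    "rex": {"animal", "dog", "black"},
    "tom": {"animal", "cat", "black"},
}
ALL_MODES = sorted(set().union(*WORLD.values()))

def signifies(comprehension):
    """Individuals that satisfy every mode in the comprehension."""
    return {name for name, modes in WORLD.items() if comprehension <= modes}

def candidate_ideas():
    """Every set of modes drawn from the toy world, as a candidate Q."""
    for size in range(len(ALL_MODES) + 1):
        for combo in combinations(ALL_MODES, size):
            yield set(combo)

def some_s_is_p(s, p):
    """True if some Q restricts S and P to the same non-empty range."""
    for q in candidate_ideas():
        s_by_q = signifies(s | q)
        p_by_q = signifies(p | q)
        # Assumption of this sketch: the exhibited range must be non-empty.
        if s_by_q and s_by_q == p_by_q:
            return True
    return False

print(some_s_is_p({"dog"}, {"black"}))  # rex is a black dog
print(some_s_is_p({"cat"}, {"brown"}))  # no brown cat in this toy world
```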

i. False Ideas

If the modes combined in an idea’s comprehension are not jointly true of any actual object, then the idea is said to be false.

If the objects represented by these ideas, whether they be substances or modes, are represented to us as they are in fact, one calls them true [véritables]. If they are not such, they can only be false [elles sont fausses en la maniere qu’elles les peuvent être], and this is what one calls in the School beings of reason, which usually consist of the combination that the soul makes from two ideas real in themselves, but which are not joined in truth to form a single idea. An example is the one that can be formed from a mountain of gold. It is a being of reason, because it is composed of two ideas, of mountain and of gold, which it represents as one even though they would not really be so. (I:2, KM V 136, B 32, author’s translation. See also Discours I, KM V 110, B 9–10; I:9, KM V 157–78, B 49–50; I:11, KM V, 168–170, B 58–60.)

Many of the examples of false ideas given in the Logic are not just false but impossible, either because their contents contain contrary modes or because the laws of nature prevent their joint satisfaction. In earlier logic it was common to call such a non-existing thing a being of reason. It was often said to have objective being and to have some status in reality distinct from the soul and real beings (esse reale, in re). (See, for example, William of Ockham 1978 §10, and Suárez 1995.) A standard example was an impossible being like a chimera, goat stag, or golden mountain, as well as a planned but incomplete possible being like a castle, house, or city. The authors of the Logic, however, reject the view that a being of reason possesses a reality independent of the soul, and regard objective being rather as a property of ideas. An idea has objective being to the extent that the soul is aware of the modes in the idea’s intentional content when the idea is instantiated in the soul. As egregious examples of false ideas the Logic cites those with comprehensions that combine spiritual and material modes. Examples include a red, blue and orange rainbow (of water drops); pain caused by fire; heaviness caused by gravitational attraction; happiness as caused by material wealth; courage as feats of valor; lack of physical pleasure as evil; and spatial solitude as misery. Some ideas, however, are only contingently false. The Logic remarks, for example, that Alexander, the son of Philip would be a false idea if Alexander had not been Philip’s son. The idea the bent stick in the water would be false if the stick were straight, but true if not. Peter, the denier of Christ happens to be a true idea, but since Peter was free, it might well have been a false idea.

Following Descartes, the Logic places false ideas at the center of its explanation of error, especially the errors characteristic of Aristotelian psychology and various moral failings. Aristotelian accounts of perception err because a mode true of matter cannot travel via sensation and abstraction to become instantiated in the soul. Rather, the material world, which consists of Cartesian extension modified by geometric and mechanical modes, is entirely separate from the soul, which is modified by modes of sensations, feelings, and morals. Ideas that combine the two are false. In addition, many moral failings are grounded in false ideas. When young, we mistakenly believe that moral qualities, which are true of the soul, are caused by material circumstances. We err when we combine them into a single idea, for example, when we combine virtue and worldly wealth.

False ideas are important to logic because they have implications for the theory of truth. Semantically, false ideas are nonreferring terms: they lack existential import. What are the truth-conditions of an affirmative categorical proposition with a false idea as subject term? Medieval logicians had divided on whether this failure makes the proposition false. The quotation above, and others in the Logic, strongly suggest that in Arnauld and Nicole’s view an affirmation with a false idea as subject is false. The issue recurs in Part IV’s account of necessary and contingent truth. (See Martin 2012.)

j. Abstraction

The second way in which the soul causes new ideas is by abstraction. Various accounts of abstraction had been part of logic since Aristotle, but the Logic’s version had to be made consistent with dualism. (For a standard medieval account see Aquinas, Summa Theologica I.I, Q. 85.) To do so, the authors explain its mechanism as a manipulation of intentional content. (I:5, KM V 142–43, B 37–38; I:11, KM V 168–170, B 58–59; VFI 6, KM I 207–210, G 74–76; VFI 11, KM I 234–235, G 98–100.)

Abstraction is either from a sensation or a prior idea. On the occasion of sensation, God causes the soul to experience a vivid awareness of a modal content. Some of these modes, like extension, relative position, and motion, are true of the material object that is the external correlate of the experience and that has caused the corresponding movements in the perceiver’s brain. Others of these modes, like colors, tastes, weight, sounds, and associated feelings, are true of the soul. When attending to this broad content, the perceiver may form an idea from this content. The soul does so by selecting as the idea’s comprehension a subset of modes evident in the perceptual experience.

The second kind of abstraction is from a prior idea. While attending to the comprehension of a prior idea, the perceiver may form a new idea with a new comprehension by selecting a subset of modes from the prior idea’s comprehension. Because the new content contains fewer modes, it is generally true of more things and is therefore more general.
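
Abstraction from a prior idea can be pictured in the same set-of-modes terms. The sketch below is illustrative only: the individuals, the modes, and the function name abstract are assumptions introduced here, and the example simply shows that dropping modes never shrinks the range of individuals signified.

```python
# Invented toy world: each individual is listed with the modes true of it.
WORLD = {
    "socrates": {"animal", "rational", "wise"},
    "callias": {"animal", "rational"},
    "bucephalus": {"animal"},
}

def signifies(comprehension):
    """Individuals that satisfy every mode in the comprehension."""
    return {name for name, modes in WORLD.items() if comprehension <= modes}

def abstract(prior, keep):
    """Abstraction from a prior idea: retain a chosen subset of its modes."""
    if not keep <= prior:
        raise ValueError("abstraction may only select modes of the prior idea")
    return set(keep)

wise_human = {"animal", "rational", "wise"}
human = abstract(wise_human, {"animal", "rational"})
animal = abstract(human, {"animal"})

# Each step keeps fewer modes and signifies at least as many individuals.
for idea in (wise_human, human, animal):
    print(sorted(idea), "->", sorted(signifies(idea)))
```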

The Logic describes the accumulation of abstract ideas as progressive, starting with abstractions from sensory experience and proceeding to increasingly general ideas. In earlier logic, abstraction was usually described as progressing in the reverse order (for example, in Aquinas above), starting with abstraction to the most general ideas and then progressing by steps of restriction to more concrete ideas.

k. The Categories and Predicables

Following the logical tradition, the Logic endorses Aristotle’s categories (I:3) and Porphyry’s predicables (I:7). Its list of the ten categories is standard: substance, quantity, quality, relation, activity, passivity, place, time, position, and state. The book later scarcely mentions the distinctions among the various nonsubstance categories. Like some late medieval logicians, the authors evidently regarded distinctions among mode types as unimportant. In particular, the Logic rarely speaks of relations as such, although it recognizes that some modes are internal to a substance and others external. In the medieval fashion, external modes are another name for relations, and are so-called because they hold of a substance only by reference to another substance.

The authors also endorse the five predicables, a standard topic in logic since Porphyry: genus, species, difference, property, and accident. These classify mental predicates according to their degree of necessity. These distinctions, unlike those among types of modes, are important in the Logic.

Genera and species are common nouns in mental language (I:4–6). As such they have comprehensions and signify individuals. In the normal case they signify substances, but as terms of second intention, they may also signify modes. Differences, properties, and accidents too are terms in mental language, but they are adjectives. As such they signify secondarily a mode or modes, and signify primarily the individuals that satisfy these modes. Thus, viewed ontologically, genera and species differ from difference, property, and accident as substances differ from modes. Corresponding to genera and species are the individuals they signify, which in general are substances. Corresponding to differences, properties and accidents are the modes they signify directly and the objects in which the modes inhere indirectly.

The Logic uses the predicables to articulate its account of essential definition. The details follow earlier standard accounts. Every species has an essential or real definition, as distinct from a nominal definition (I:12–14). A nominal definition lays down a convention in which a spoken sound is paired with an idea. (This relation is also called signification, but in this sense it is distinct from signification in the sense of reference, a natural relation between an idea and what it stands for outside the mind. The dual senses were common in medieval logic.) A real or essential definition is a universal affirmative proposition in which a species is the subject term and its genus restricted by a distinguishing adjective is the predicate. The adjective is called the species’ difference (differentia). An essential definition is necessarily true and describes the species’ nature. Part IV assigns a major role in scientific knowledge to essential definitions.

It follows from the account of essential definition that species fall into a structure. Every genus except the highest is itself a species and has its own essential definition. The highest genus, which has no definition, is referred to as being or substance. A species that is not a genus is an infima species. At several places, the Logic mentions the traditional doctrine of differentiation by privation. This is the case in which a genus divides into two species which are such that the difference of the second is the privative negation of the difference of the first. (I:7, KM 148, B 42; II:15, KM 242, B 124.) The species animal, for example, is said to divide into the species human with the difference rational and the species brute with the privative difference irrational. Although the authors do not draw attention to the fact, the account entails genera and species’ conforming to a finite finitely branching tree-structure, which is traditionally called the tree of Porphyry. (See Structure of Ideas below.)

A property (proprium) is an adjective that is not the difference of any species but that nevertheless signifies secondarily a mode necessarily true of a species. As an example, the Logic cites a mode that is necessarily true of a circle but not part of its definition: all lines from the center to the circumference are equal. An accident is such that any species “can be conceived without it.” More precisely, an accident is an adjective that is not a difference or a property. An accident connotes a mode that is true but not necessarily true of what it signifies. It is either not true of every member of a species, or not always true of them.

l. Nominalism-Realism

Like some medieval logicians, the Logic explains genera and species nominalistically. Ontologically genera and species are not classified as some special kind of entity in addition to matter and its modes, or souls and their modes, as some realists had suggested. They are simply ideas, which are modes of the soul. On the other hand, the Logic treats differences, properties, and accidents realistically. Strictly speaking, these too are ideas in mental language (I:7). But they are also adjectives, and as such they signify modes secondarily. Frequently the Logic muddles use-mention and uses difference, property, and accident to refer to these modes themselves. For example, the difference of the species human sometimes means the adjective rational and sometimes the mode rationality. The context makes clear which is intended. The Logic’s overall metatheory is basically realistic because it assumes a fundamental substance-mode ontology in which modes have real existence. Thus, although the Logic’s ontology is basically realistic, which is evident in what it has to say about difference, property, and accident, it adopts the nominalistic move of some earlier logicians of avoiding positing special entities for genera and species by identifying them with modes of the soul.

m. Comprehension as a Generalization of Essence

In the Logic’s technical vocabulary, the collected differences of a species’ higher genera constitute the species’ comprehension, which is its intentional content. Indeed, it is not an exaggeration to say that the Logic’s entire theory of reference via intentional content—comprehension in the case of substantives and secondary signification in the case of adjectives—is a generalization of the Aristotelian theory of essential definition. Available in that theory was the generalization that a species comprises those individuals that satisfy its difference and those of its higher genera. In the Logic, it is these modes that determine what a species term signifies. In other words, a species signifies those individuals that satisfy the modes in its intentional content. It is a small step to attributing a content to all terms and explaining their signification similarly.

n. Species and Difference

The Logic holds an odd view about species. A species and its difference, it holds, are semantically equivalent. They both signify the same individuals and therefore have the same extension (I:7, KM V 147–48, B 42). The authors are here following Aristotle, who maintained that each species has a unique difference. No two species, in other words, have the same difference. (See, for example, Parts of Animals 3, 642b20–643a20.) Moreover, if a difference, which is an adjective, is read as a substantive, the species and the difference signify the same individuals. It is perfectly normal in Latin to read an adjective as a noun when it is not used to modify another noun. The Logic’s example is the word album (white) which may be understood as either an adjective or a noun. Likewise, rationalis is an adjective in animal rationalis, but a noun signifying the same individuals as homo when it occurs alone as a noun. Thus, the signification of the noun is the primary signification of the adjective. In the Logic’s terminology, the secondary signification of the difference when construed as an adjective is identical to its comprehension when construed as a noun. It is for this reason that the Logic at times says that a species and its difference are the same thing.

o. Signification and Extension

Extension is probably the most interesting concept in the Logic’s semantics. Truth is defined in terms of extension but extension is defined in terms of ideas. The result has the appearance of a kind of idealism in which truth is defined independently of the external world. The appearance, however, is misleading. Although the authors are dualists and revolutionaries of a sort who want to define truth using only mental categories, they are also conservative in the sense that they want to maintain a correspondence theory of truth. To do so, they define extension so that it tracks what happens in the world outside the mind. A universal affirmative, to be sure, is true if the extension of its subject term is contained in the extension of the predicate. Moreover, extension here consists of ideas. But subordination among these ideas corresponds, it turns out, to subordination among things in the world.

The story is somewhat indirect due to the authors’ loose mathematical style. They fail to give what we would regard today as a clear definition of extension. The best they have to say is that a term’s extension consists of its “inferior subjects.” (I:6) They make clear that by “subjects” here they mean ideas. From these two remarks it is possible to piece together a definition: the extension of an idea consists of all its inferior ideas. The problem, however, is that they do not define “inferior.” They do give an example. Various types of triangles, they say, are inferior to the genus triangle. This reading, moreover, conforms with prior usage in medieval logic. The suggestion is that the extension of A is the set of all ideas B such that all the modes in the comprehension of A are included in the comprehension of B. Several things follow: First, the extension of B would be included in that of A if and only if all the ideas defined in terms of B are a subset of all the ideas defined in terms of A. Second, all the ideas defined in terms of B are a subset of all the ideas defined in terms of A if and only if the comprehension of A is a subset of the comprehension of B. It follows that whether a species is inferior to a genus would be a function of the essential definitions. The definition of inferiority would entail that every S is P is true if and only if the comprehension of P is a subset of that of S. A true universal affirmative would then be a matter of conceptual inclusion and, as such, necessary. A plausible example would be every animal is a living being. It would be true because the species animal is included in the genus of living being or, equivalently, animal is defined in terms of living being. Being an essential definition it is also necessary.

This reading, however, is much too narrow to fit other views within the Logic’s more general metatheory. It excludes, for example, the possibility of contingent truth. In particular, it entails the wrong truth-conditions for propositions that affirm accidental predicates. Accidents, of course, are not species. They have no “inferiors” in the proposed sense. On the other hand, contingent propositions like Peter denied Christ and every student in the classroom is asleep can be true, yet the intentional content of the predicate is not contained in that of the subject. Similarly, as a contingent matter a universal negative like no doctor is a thief can be true and no doctor is a poet false, yet in that case both can be made false as a function of ideas. The set of ideas defined in terms of doctor and poet is non-empty if the idea doctor-poet is formed by restriction. Likewise, the set of ideas defined in terms of doctor and thief is non-empty if it contains doctor-thief. (See Auroux 1993, 135, and Martin 2017.) The reading also poses problems for the Logic’s doctrine of false ideas. As explained above, the Logic attributes errors in philosophy and morality to believing propositions that have false ideas as subjects. These are ideas that fail to signify any existing thing, like pain caused by fire or virtuous rich man. Affirmative propositions with false ideas as subjects (that is, affirmatives with subject terms that have no existential import) are supposed to be false. On the other hand, the intentional content of mountain is contained in that of golden mountain, and anything defined in terms of golden mountain would be defined in terms of mountain. It would seem, then, that a trivial but empty proposition like every golden mountain is a mountain would be true despite having a false idea as subject. The issue is important in Part IV (see Martin 2012).

The broader context makes clear the correct definition of inferiority. The key is to define inferiority in terms of signification: idea A is inferior to B if and only if every individual that A signifies is also signified by B. Equivalently, A is inferior to B if and only if all the individuals that satisfy the modes in the intentional content of A also satisfy all the modes in the intentional content of B. The extension of idea A, or Ext(A), is defined as the set of ideas B that signify only individuals that A signifies. It follows that the ideas in a term’s extension, which is the set of its inferiors, signify what the term signifies but do so in finer detail. Let the significance range of idea A, or Sig(A), be the set of all individuals that A signifies. In short, Ext(A) is the set of ideas B such that Sig(B)⊆Sig(A).
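
The difference between the narrow reading of inferiority, stated in terms of comprehension, and the broader reading, stated in terms of signification, shows up as soon as a contingent truth is modeled. The following is a minimal sketch: the three individuals, their modes, and both function names are assumptions made for the illustration.

```python
# Invented toy world in which every student in the classroom happens to be asleep.
WORLD = {
    "ann": {"student", "in the classroom", "asleep"},
    "bob": {"student", "in the classroom", "asleep"},
    "cy": {"student"},
}

def sig(content):
    """Significance range: individuals satisfying every mode in the content."""
    return {name for name, modes in WORLD.items() if content <= modes}

def inferior_by_content(a, b):
    """Narrow reading: A is inferior to B when B's comprehension is part of A's."""
    return b <= a

def inferior_by_signification(a, b):
    """Broader reading: every individual that A signifies, B also signifies."""
    return sig(a) <= sig(b)

classroom_student = {"student", "in the classroom"}
asleep = {"asleep"}

# Conceptual inclusion fails, yet the contingent proposition holds in this world.
print(inferior_by_content(classroom_student, asleep))
print(inferior_by_signification(classroom_student, asleep))
```

Where the narrow reading does hold, the broader one holds as well, since an idea with more modes cannot signify more individuals.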

The mappings Ext and Sig stand in one-to-one correspondence, and as a result the definition of extension ensures that a term’s extension provides an indirect way of referring to individuals “outside the mind.” Ext(A) determines Sig(A) because Sig(A) is the set of all individuals signified by any idea in Ext(A). Conversely, Sig(A) determines Ext(A) because Ext(A) is the set of all ideas inferior to A that signify only individuals in Sig(A). Moreover, their inclusion relations mirror one another: Sig(A)⊆Sig(B) if and only if Ext(A)⊆Ext(B). A correspondence theory of truth follows. As Part II explains, the truth-conditions of the universal affirmative are stated in terms of extensional inclusion: every S is P is true if and only if Ext(S)⊆Ext(P). But this holds exactly when Sig(S)⊆Sig(P).
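
The way Ext and Sig mirror one another can be checked mechanically in a finite model. The sketch below assumes a small, invented stock of ideas and individuals, with hypothetical function names. It treats Ext(A) as a set of ideas drawn from that stock, and it does not try to model the Logic's separate treatment of false ideas: an idea that signifies nothing simply falls into every extension here.

```python
# Invented individuals and the modes true of them.
WORLD = {
    "peter": {"human", "apostle", "denier"},
    "john": {"human", "apostle"},
    "bucephalus": {"horse"},
}
# Invented stock of ideas, each with its intentional content.
IDEAS = {
    "human": frozenset({"human"}),
    "apostle": frozenset({"apostle"}),
    "denier of Christ": frozenset({"human", "denier"}),
    "horse": frozenset({"horse"}),
    "golden mountain": frozenset({"mountain", "golden"}),
}

def sig(idea):
    """Sig(A): the individuals that satisfy every mode in A's content."""
    content = IDEAS[idea]
    return frozenset(n for n, modes in WORLD.items() if content <= modes)

def ext(idea):
    """Ext(A): the ideas B in the stock such that Sig(B) is within Sig(A)."""
    return frozenset(b for b in IDEAS if sig(b) <= sig(idea))

def every(s, p):
    """Universal affirmative, stated as inclusion of extensions."""
    return ext(s) <= ext(p)

# Inclusion among extensions mirrors inclusion among significance ranges.
mirrors = all(
    (sig(a) <= sig(b)) == (ext(a) <= ext(b)) for a in IDEAS for b in IDEAS
)
print(mirrors)
print(every("apostle", "human"))           # contingently true in this world
print(every("human", "denier of Christ"))  # false: john is not a denier
```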

The reader should be warned that the definition of extension in the Logic is rather different from the usual one in modern logic. Modern usage, which follows Leibniz and Frege, identifies the extension of A with Sig(A). That is, in modern usage the extension of A is a set of individuals, not ideas. Although the Logic’s usage has fallen into desuetude, it has historical priority.

p. The Structure of Ideas

The ordering relations on ideas and extensions have suggested to some commentators that the Logic anticipates 19th-century Boolean algebra. (See Dominicy 1984, Auroux 1982, and Auroux 1993.) The suggestion is intriguing but overblown. It is true that intentional content seems to be a set of modes and sets are ordered by the subset relation. This ordering, moreover, induces a containment relation on ideas: idea A is contained in idea B (briefly A≤B) if and only if the intentional content of A is a subset of the intentional content of B. In addition, every idea determines a significance range. The mapping from ideas to significance ranges is, moreover, many-one because distinct ideas may signify the same individuals. For example, a species-difference and its proprium would have the same significance range, as do the two terms Peter and the man who denied Christ three times. The mapping, moreover, is antitonic, that is, it reverses the ordering: if A≤B, then Sig(B)⊆Sig(A). As pointed out above, there is also a one-one order-preserving mapping from significance ranges to extensions. It follows that there is a many-one antitonic mapping from ideas to extensions: if A≤B, then Ext(B)⊆Ext(A). Thus, as Leibniz later observed, the order of extensions reverses the order of ideas. These are all genuine algebraic properties in the modern sense, and they are in some sense implicit in the Logic. On the other hand, these properties were not remarked upon by the Logic’s authors themselves.

In their own pre-algebraic language about containment, signification, and extension, the authors do remark on order and correspondence. It is an exaggeration, however, to say they noticed duality or the properties of a Boolean algebra (see Martin 2016c). They do not comment on the fact that the order of extensions reverses that of ideas, a necessary condition for duality. They do not point out that ≤ and ⊆ are reflexive, transitive, or antisymmetric. Much less do they claim that abstraction and restriction satisfy the conditions for meet and join operations. Abstraction, for example, is treated as a one-place operation, and there is no suggestion that the set of ideas is “closed” under either abstraction or restriction. There is no textual evidence that they envisaged a maximal idea, which would have as its intentional content the set of all modes. It is also unclear whether being, the highest genus, should be regarded as a minimal idea. Is being in the comprehension of golden mountain or square circle? The authors avoid such issues. The few times they refer to a negation as an operation it is as the medieval notion of privative negation rather than as a complementation operation in the modern sense. (I:7, KM 148, B 42; II:15, KM 242, B 124. See Martin 2016b.) They do not even say explicitly that genera and species exhibit the structure of the tree of Porphyry. (See Auroux 1992, Auroux 1993.) All in all, the discussion of structure in the Logic is pre-algebraic, like discussions of structure found in medieval logic, of which the Logic is a continuation.

2. The Logic of Propositions

a. Summary

Part II discusses the properties of expressions in spoken language. Among other expressions it discusses nouns, pronouns, and verbs (II:1), the four categorical propositions (II:3), gappings (II:4-5), false ideas (II:7), exclusives such as only (II:10:1) and exceptives such as except (II:10:2), the alethic modalities (II:8), comparative adjectives (II:10:3), various compound sentences (II:9), and definitions (II:16). Expressions in spoken language represent propositions in mental language, essentially the categorical propositions of the syllogistic and their truth-functions. Part II concludes with the truth-conditions for categorical propositions (II:17–20) and conversion. Much of the material is unoriginal or of slight logical interest. Remarks here are limited to modality and the truth-conditions for categorical propositions.

b. Modality

The alethic modalities (possible, contingent, impossible, and necessary) are discussed briefly and characterized syntactically as verbal modifiers. There is no attempt to provide semantic analysis or truth-conditions. A point of interest is that in a series of mnemonic names setting out four squares of opposition, one for each of the four modalities, they conflate contingent with possible. That is, they identify contingency with so-called “single-sided” possibility: it is contingent that P means it is possible that P. They are probably following the commentary tradition. At some points in the De Interpretatione Aristotle explains contingency in the single-sided sense, a conflation that had been regularly remarked upon by later commentators. The Logic’s authors may in fact be copying a virtually identical discussion of the mnemonic names from the logic of Eustache de Saint-Paul in which he makes the same conflation using the same names and squares. Fonseca in his logic of roughly the same period is more revealing. He reports Aristotle’s conflations of contingency with single-sided possibility and remarks that in 17th-century logical discourse contingency had evolved to its double-sided sense. He nevertheless goes on in his text to provide a list that follows Aristotle and identifies the contingent with the possible. Regardless of the mnemonics at II:8, where the authors themselves actually use the word contingent or contingency in the Logic to state their own views, they use contingent in the double-sided sense following the general usage of the period. For example, in Part IV they describe knowledge of historical and human events, which is based on sensation, as “contingent” with the understanding that the events might have been otherwise.

c. Distributive and Confused Supposition

Part II concludes with sections laying out “axioms” for the truth-conditions for the categorical propositions of the syllogistic. These sections are some of the most interesting parts of the book. The account is not really axiomatic in the modern sense: it is rather a series of informal definitions. From the “axioms” and the explanatory remarks that accompany them, however, it is possible to abstract clear truth-conditions in the modern sense. What is interesting from a modern perspective is that truth is defined as a function of the semantic interpretations of a proposition’s parts, much as in a modern recursive definition. The particular way they do so is also of historical interest because it draws on ideas from medieval supposition theory.

In the theory of supposition, medieval logicians had distinguished various ways in which categorical terms refer. Depending on the type of proposition in which it occurs and its position, a term signifies either all the individuals in its scope, in which case it was said to have distributive supposition, or just some individuals, in which case it was said to have confused supposition. These species of supposition were explained in terms of characteristic entailments that hold between the proposition itself and specific conjunctions and disjunctions of multiple identity statements. (See Parsons 2014, Chapter 7.) There are four cases, one for each of the four types of categorical proposition. It is assumed that there are proper names for each of the individuals that a term signifies:

  1. A universal affirmative is equivalent to a long conjunction of disjunctions. For each individual in the subject’s scope there is a conjunct, and this conjunct consists of a disjunction that affirms of that individual that it is identical to one or the other of the individuals in the predicate’s scope.
  2. A particular affirmative is equivalent to a long disjunction of disjunctions. For each individual in the subject’s scope there is a disjunct, and this disjunct consists of a disjunction that affirms of that individual that it is identical to one or the other of the individuals in the predicate’s scope.
  3. A universal negative is equivalent to a long conjunction of conjunctions. For each individual in the subject’s scope there is a conjunct, and this conjunct consists of a conjunction that denies of that individual that it is identical to each of the individuals in the predicate’s scope.
  4. A particular negative is equivalent to a long disjunction of conjunctions. For each individual in the subject’s scope there is a disjunct, and this disjunct consists of a conjunction that denies of that individual that it is identical to each of the individuals in the predicate’s scope.

These equivalents can be stated briefly in the notation of sentential logic. Let us assume that the constants s1,…,sn name all the individuals in the scope of the subject S, and that p1,…,pn name all the individuals in the scope of the predicate P. The entailments for the four propositional forms are then:
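
Read off from the four verbal statements in the list above, the equivalents can be written as follows. This display is a reconstruction from that list and should not be taken as a quotation of the original formulas. Both index ranges run to n only because the two lists of constants are given that way in the text.

```latex
\begin{align*}
\text{A: every } S \text{ is } P &\iff \bigwedge_{i=1}^{n} \bigvee_{j=1}^{n} (s_i = p_j) \\
\text{I: some } S \text{ is } P &\iff \bigvee_{i=1}^{n} \bigvee_{j=1}^{n} (s_i = p_j) \\
\text{E: no } S \text{ is } P &\iff \bigwedge_{i=1}^{n} \bigwedge_{j=1}^{n} (s_i \neq p_j) \\
\text{O: some } S \text{ is not } P &\iff \bigvee_{i=1}^{n} \bigwedge_{j=1}^{n} (s_i \neq p_j)
\end{align*}
```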


By reference to these equivalents it is possible to give a semantic definition of distributive supposition. The definition depends on whether the term is a subject or predicate. A subject term is distributive if the proposition in which it occurs is equivalent to a conjunction of conjunctions or disjunctions and is non-distributive otherwise. A predicate is distributive if the proposition in which it occurs is equivalent to a conjunction or disjunction of conjunctions and is non-distributive otherwise:

Proposition   Subject            Predicate
A             Distributive       Non-Distributive
E             Distributive       Distributive
I             Non-Distributive   Non-Distributive
O             Non-Distributive   Distributive
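
The classification in the table follows directly from the connective structure of each expansion, which can be made explicit in a short sketch. The encoding of each form as a triple (outer connective, inner connective, relation) and the function names are assumptions introduced for the illustration.

```python
# Assumed encoding of each form: (outer connective, inner connective, relation).
FORMS = {
    "A": ("and", "or", "="),
    "E": ("and", "and", "!="),
    "I": ("or", "or", "="),
    "O": ("or", "and", "!="),
}

def expansion(form, subjects, predicates):
    """Spell out the expansion of a form as a string of identity statements."""
    outer, inner, rel = FORMS[form]
    rows = [
        "(" + f" {inner} ".join(f"{s} {rel} {p}" for p in predicates) + ")"
        for s in subjects
    ]
    return f" {outer} ".join(rows)

def distribution(form):
    """Subject is distributive if the outer connective is a conjunction,
    predicate if the inner connective is a conjunction."""
    outer, inner, _ = FORMS[form]
    subject = "Distributive" if outer == "and" else "Non-Distributive"
    predicate = "Distributive" if inner == "and" else "Non-Distributive"
    return subject, predicate

print(expansion("A", ["s1", "s2"], ["p1", "p2"]))
for form in FORMS:
    print(form, *distribution(form))
```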

d. Truth-Conditions for Categorical Propositions

What makes the entailments relevant to truth-conditions is that they suggest a way to characterize truth according to distributional properties. For example, a universal affirmative is true if the subject is truly distributive and the predicate non-distributive. From the perspective of modern truth-theory, however, any definition of truth in terms of the medieval notions of distribution and non-distribution would be flawed because it would be circular. Truth cannot be defined in terms of distribution because distribution is defined in terms of entailment, and entailment in terms of truth. Medieval logicians were not troubled about circularity because the distinction between distributive and non-distributive supposition was part of a system of classifying the different ways terms refer. There was no intention of incorporating supposition into a definition of truth, recursive or otherwise.

Arnauld and Nicole, however, in effect noticed that it is possible to explain distribution and non-distribution directly without reference to the entailments of “descent and ascent.” It is then possible to use the distinction to state truth-conditions in a non-circular way. More precisely, it is possible to say in the metalanguage, without referring to object language identity statements, that in a true universal affirmative each referent in the extension of the subject is identical to some referent in the extension of the predicate, and so forth for the other propositional types. (See Martin 2013 and Martin 2016a. Compare Pariente 1985, who questions the influence of supposition theory.)

To explain the authors’ metalinguistic approach, it is useful to employ the notation of restricted quantification. The notation attaches to the quantifier symbol a subscript naming its “extension.”

Here f(v) is a functor f applied to a variable v, and P(f(v)) is an open sentence saying something about f(v); [∀_A f(v)] P(f(v)) is read as “for any f(v) in A, P(f(v))”; and [∃_A f(v)] P(f(v)) is read as “for some f(v) in A, P(f(v)).”

The truth-conditions for the categorical propositions are easily stated in the metalanguage as facts about the identity or non-identity of all or some of the referents to individuals in the subject’s relevant extension to those in the predicate’s. In the notation below, the individual signified by term v, briefly Sig(v), is referred to as being in either the extension of the subject S, briefly Ext(S), or in the extension of the predicate P, briefly Ext(P), or in the intersection of the two extensions, briefly Ext(S) ∩ Ext(P):
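
A hedged reconstruction of these conditions, assembled from the prose paraphrases of the authors' axioms (II:17–20), might read as follows. The second variable w and the exact layout are assumptions made for illustration:

```latex
\begin{align*}
\text{every } S \text{ is } P &\iff
  [\forall_{\mathrm{Ext}(S)}\, \mathrm{Sig}(v)]\,
  [\exists_{\mathrm{Ext}(S)\cap\mathrm{Ext}(P)}\, \mathrm{Sig}(w)]\;
  \mathrm{Sig}(v) = \mathrm{Sig}(w) \\
\text{no } S \text{ is } P &\iff
  [\forall_{\mathrm{Ext}(S)}\, \mathrm{Sig}(v)]\,
  [\forall_{\mathrm{Ext}(P)}\, \mathrm{Sig}(w)]\;
  \mathrm{Sig}(v) \neq \mathrm{Sig}(w) \\
\text{some } S \text{ is } P &\iff
  [\exists_{\mathrm{Ext}(S)\cap\mathrm{Ext}(P)}\, \mathrm{Sig}(v)]\,
  [\exists_{\mathrm{Ext}(S)\cap\mathrm{Ext}(P)}\, \mathrm{Sig}(w)]\;
  \mathrm{Sig}(v) = \mathrm{Sig}(w) \\
\text{some } S \text{ is not } P &\iff
  [\exists_{\mathrm{Ext}(S)}\, \mathrm{Sig}(v)]\,
  [\forall_{\mathrm{Ext}(S)\cap\mathrm{Ext}(P)}\, \mathrm{Sig}(w)]\;
  \mathrm{Sig}(v) \neq \mathrm{Sig}(w)
\end{align*}
```

In the last line the inner quantifier could equally range over all of Ext(P): for an individual already in Ext(S), being distinct from everything in Ext(S) ∩ Ext(P) is the same as being distinct from everything in Ext(P).
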
In the formulas above let the right hand side be called the truth-conditions of the categorical proposition on the left. Let the outer (left-most) quantifier in the truth-conditions, which has the broader scope, be called the subject quantifier, and let the inner (right-most) quantifier, which has narrower scope, be called the predicate quantifier. It is then possible to define distributive term semantically. A term is used distributively in a proposition if the proposition is true and its quantifier in its truth-conditions is universal, and non-distributively if it is true and its quantifier is existential.

The Logic’s authors make the same distinction using different terminology. They prefer to use universal for distributive and particular for non-distributive, a common variant in earlier logic. They also noticed that the relevant extension of a term varies by its position. A term’s relevant extension is its entire extension if the term is the subject of a universal affirmative, a universal negative, or a particular negative, or if it is the predicate of a universal negative. In all other cases, a term’s relevant extension is the intersection of the extensions of the proposition’s two terms. (In modern general quantification theory, similar distinctions are made in terms of whether the proposition’s quantifier is monotonic.) Recall that in the Logic, a term’s extension is a set of ideas. In the authors’ terminology, then, a universal term is one that asserts of each idea in that term’s relevant extension that it is identical to (“is put in”) or is not identical to (“is not put in”) the ideas in the relevant extension of its collateral term. A particular term is one that asserts this identity or non-identity only of some. The truth-conditions axiomatized by the authors (II:17–20) can then be easily stated, first in their own terminology (in italics) and then in a modern paraphrase.

every S is P is true iff
Authors’ terminology: The proposition is affirmative. The subject is universal. The attribute is particular. The extension of the attribute is restricted by that of the subject. The attribute is put in the subject according to the entire extension of the subject.
Modern paraphrase: The relevant extension of the subject S is its entire extension; the relevant extension of the predicate P is the restriction of its extension by that of S. Every element of the relevant extension of S is identical to some element of the relevant extension of P.

no S is P is true iff
Authors’ terminology: The proposition is negative. The subject and the predicate are universal. The attribute of a negative proposition is always taken generally. Negative propositions separate the attribute from the subject according to the entire extension of the attribute. The attribute is denied of everything contained in the extension of the subject.
Modern paraphrase: The relevant extension of S is its entire extension. The relevant extension of P is its extension. Every element of the relevant extension of the subject is not identical to every element of the relevant extension of P.

some S is P is true iff
Authors’ terminology: The proposition is affirmative. The subject and predicate are particular. The extension of the attribute is restricted by that of the subject. The attribute is conceived only in part of the extension of the subject.
Modern paraphrase: The relevant extension of the subject S is the restriction of its extension by that of the predicate P. The relevant extension of P is the restriction of its extension by that of S. Some element of the relevant extension of S is identical to some element in the relevant extension of P.

some S is not P is true iff
Authors’ terminology: The proposition is negative. The subject is particular. The attribute is universal. The attribute is denied of everything contained in the extension of the subject. Negative propositions separate the attribute from the subject according to the entire extension of the attribute. Negative propositions separate this attribute from the subject, particularly if it is particular.
Modern paraphrase: The relevant extension of S is its entire extension. The relevant extension of P is its extension restricted by that of S. There is some element in the relevant extension of S that is not identical to any element in the extension of P.

 

e. The Correspondence Theory of Truth

Although these truth-conditions are formulated in terms of extensions, which are composed of ideas, the Logic’s broader intention is to capture a correspondence theory of truth. As explained in Part I, there is a one-one mapping between significance ranges and extensions. Accordingly, although the conditions above refer to the identity or non-identity of ideas in term-extensions, ideas in a term’s extension are proxies for individuals in the term’s significance range. It is true that the Logic’s authors say little about the expressive completeness of mental language. They make no claim that there is an idea in mental language for every individual that actually exists—that, in medieval terms, there is an individual concept for each existing thing. But to insure a genuine correspondence theory, all that the authors need assume is that the ideas in a term’s extension “cover” the individuals in the term’s significance range in the sense that for any individual signified by a term there is some idea in its extension that signifies it. In the trivial case in which an idea has no strict inferiors, the idea itself would “cover” its own significance range.

As they stand, the truth-conditions do not address the issue of existential import of the subjects of affirmative propositions. They do not provide for what happens when an affirmative proposition has a false idea as subject term. The discussion of false ideas in Part I requires that these propositions be false. The issue recurs in Part IV in the discussion of necessary and contingent truth.
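
The point can be illustrated with a minimal Python sketch. It is not anything found in the Logic: extensions are modelled as finite sets, identity of ideas as equality of set members, a non-referring subject as the empty set, and all function names are hypothetical.

```python
def every(ext_s, ext_p):
    """Universal affirmative: each S is identical to some member of Ext(S) ∩ Ext(P)."""
    relevant_p = ext_s & ext_p
    return all(any(s == p for p in relevant_p) for s in ext_s)


def no(ext_s, ext_p):
    """Universal negative: each S differs from each P."""
    return all(all(s != p for p in ext_p) for s in ext_s)


def some(ext_s, ext_p):
    """Particular affirmative: both terms are restricted to the intersection."""
    relevant = ext_s & ext_p
    return any(any(s == p for p in relevant) for s in relevant)


def some_not(ext_s, ext_p):
    """Particular negative: some S differs from every member of Ext(S) ∩ Ext(P)."""
    relevant_p = ext_s & ext_p
    return any(all(s != p for p in relevant_p) for s in ext_s)


def every_with_import(ext_s, ext_p):
    """Assumed amendment: a universal affirmative also needs a non-empty subject."""
    return bool(ext_s) and every(ext_s, ext_p)


humans = {"Socrates", "Plato"}
animals = humans | {"Bucephalus"}
chimeras = set()  # stands in for a subject term that signifies nothing

print(every(humans, animals))               # True
print(some_not(animals, humans))            # True
print(every(chimeras, animals))             # True: vacuous, the gap noted above
print(every_with_import(chimeras, animals)) # False once existential import is required
```

With an empty subject extension the unamended condition is satisfied vacuously, which is exactly the case the doctrine of false ideas requires to be false.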

3. The Logic of Arguments

a. Summary


b. The Syllogistic

Acceptable arguments, including the immediate inferences of the square of opposition and syllogisms, are described in terms of “rules.” An example is the rule for accidental conversion: universal affirmative propositions can be converted by adding a mark of particularity to the attribute which becomes the subject (II:1, KM V 250, B 132). All these rules are laid down without proof. A modern reader, on the other hand, would expect that the authors, having just stated the truth-theory for the categorical propositions in Part II, would have made some effort to argue for the validity of the rules in Part III. They seem to have thought, however, that the rules they cite are too obvious to require justification, and indeed most of the logic of Part III is trivial. They remark, for example, that “there is little value in knowing the rules of the syllogism” (IV:introduction, KM V 354, B 227). Of logical errors they say, “… it is almost impossible for a person of average intelligence who has some insight ever to fall into them” (IV:8, KM V 384-385, B 252). On the other hand, in the later sections of Part III, they attach much importance to the avoidance of fallacies, especially various kinds of equivocation.

The rules describing the square of opposition, conversion, and the valid moods are formulated using a series of technical terms: subject and predicate; affirmative and negative proposition; universal and particular term; universal and particular proposition; syllogism; major, middle, and minor term. (A singular proposition is classified as a special case of universal proposition.) These are all understood syntactically with the exception of universal and particular term, and affirmative and negative proposition, which in Part II also have semantic senses.

c. Validity

It is surprising that the authors do not attempt to prove the validity of their rules. It had been common to do so in logic since Aristotle. Nor do they attempt a syntactic account of what counts as a valid mood. In the Logic’s first edition they do discuss the traditional method of reducing the valid moods to Barbara and Celarent and describe a set of traditional mnemonic names for the reductions. (See B, xxxv and 156.) From a modern perspective, that procedure is not without interest because it is an early form of an axiom system even though the set being axiomatized (the valid syllogisms) is trivially finite. The authors, however, dismiss reductions as “useless” and omit the topic in later editions (III:8, B 156). They may do so because they reject one of the traditional reduction rules, per contradictionem (if A, ~B ⊢ ~C, then A, C ⊢ B). The rejection is perhaps due to their more general doubts about indirect proof in Part IV. (See III:9, KM 276, B 157 and IV:2, KM V 367, B 238; IV:9, KM V 388, B 255.)

Although the authors do not prove the validity of their rules or attempt a syntactic characterization of the set of valid moods in general, they do provide what amounts to a syntactic decision procedure for the set of valid moods. They do so by laying down six syntactic rules (III:3), which in various forms have been repeated in textbooks ever since.  

Rule 1. The middle term cannot be taken particularly twice, but must be taken universally once.

Rule 2. The terms of the conclusion cannot be taken more universally in the conclusion than in the premises.

Rule 3. No conclusion can be drawn from two negative propositions.

Rule 4. A negative conclusion cannot be proved from two affirmative propositions.

Rule 5. The conclusion always follows the weaker part. That is, if one of the two propositions is negative, the conclusion must be negative; if one of them is particular, it must be particular.

Rule 6. Nothing follows from two particular propositions.

The set of six rules is not new. The four rules that do not mention universal and particular terms were common in medieval logic. Rules 1 and 2, which are today known as “process rules,” are formulated in terms of universal and particular term and are found in contemporary works. The complete list of six rules is given verbatim in Eustache de Saint-Paul. (Summa philosophiae quadripartita, Logia III.2.I. 117, Eustachio-De-S.-Paulo 1648.) Leibniz later used the same rule set in his more formal version of the syllogistic, dividing Rule 5 into two. (See Lensen 1990.)

The rules are interesting if understood syntactically. The vocabulary in which they are framed is clearly syntactic except for perhaps universal and particular term, which in Part II had been defined semantically. But as any student knows who has used the rules, universal and particular term also have simple syntactic definitions. A term is universal (or distributive) if and only if it is the subject of a universal proposition or the predicate of a negative. In a number of places in Part III, the authors refer to them syntactically.

Viewed syntactically, the rule set provides a decision procedure for the set of valid moods. By reviewing syntactically each of the 256 syllogisms, it is easy to confirm of each syllogism that if it is not on the list of 24 valid moods, it violates at least one rule. Conversely, it is easy to check, again syntactically, that if a syllogism violates a rule, it is not on the list of 24 valid moods. However, the authors of the Logic are not interested in metatheory. They do not explicitly make the point that the valid moods are exactly those that do not violate a rule, much less prove it.
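
The check is mechanical enough to sketch in a few lines of Python. The encoding below is an assumption made for illustration, not the authors' own presentation: a syllogism is a figure plus three letters from A, E, I, O, a term counts as universal if it is the subject of a universal proposition or the predicate of a negative one, and the names `FIGURES`, `universal_terms`, and `passes_rules` are hypothetical.

```python
from itertools import product

UNIVERSAL = {"A", "E"}   # universal propositions
NEGATIVE = {"E", "O"}    # negative propositions

# (subject, predicate) of the major and minor premise in each figure.
# S = minor term, P = major term, M = middle term; the conclusion is always S-P.
FIGURES = {
    1: (("M", "P"), ("S", "M")),
    2: (("P", "M"), ("S", "M")),
    3: (("M", "P"), ("M", "S")),
    4: (("P", "M"), ("M", "S")),
}


def universal_terms(form, subject, predicate):
    """Terms taken universally: subject of a universal, predicate of a negative."""
    taken = set()
    if form in UNIVERSAL:
        taken.add(subject)
    if form in NEGATIVE:
        taken.add(predicate)
    return taken


def passes_rules(major, minor, conclusion, figure):
    """Return True if the syllogism violates none of the six rules."""
    (maj_s, maj_p), (min_s, min_p) = FIGURES[figure]
    in_premises = (universal_terms(major, maj_s, maj_p)
                   | universal_terms(minor, min_s, min_p))
    in_conclusion = universal_terms(conclusion, "S", "P")
    negatives = sum(form in NEGATIVE for form in (major, minor))
    particulars = sum(form not in UNIVERSAL for form in (major, minor))

    if "M" not in in_premises:                           # Rule 1
        return False
    if not in_conclusion <= in_premises:                 # Rule 2
        return False
    if negatives == 2:                                   # Rule 3
        return False
    if negatives == 0 and conclusion in NEGATIVE:        # Rule 4
        return False
    if negatives == 1 and conclusion not in NEGATIVE:    # Rule 5, negative part
        return False
    if particulars >= 1 and conclusion in UNIVERSAL:     # Rule 5, particular part
        return False
    if particulars == 2:                                 # Rule 6
        return False
    return True


valid_moods = [
    (figure, major + minor + conclusion)
    for figure in FIGURES
    for major, minor, conclusion in product("AEIO", repeat=3)
    if passes_rules(major, minor, conclusion, figure)
]
print(len(valid_moods))  # 24
```

Of the 256 candidates, the filter keeps six in each figure, including Barbara (figure 1, AAA) and Celarent (figure 1, EAE). Because S occurs only in the minor premise and P only in the major, pooling the universally taken terms of both premises is enough to test Rule 2.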

4. Method

a. Summary

Part IV is about epistemology, both scientific knowledge, which is certain and based on clear and distinct ideas, and lesser sensory knowledge, which is contingent and concerns current events, history, and the future. The goal is to spell out logic’s role in scientific discovery and justification. The introductory sections (IV:1) distinguish genuine knowledge from philosophical and mathematical speculation, which is illustrated by puzzles arising from infinite divisibility. They continue (IV:2–3) with an account of scientific “method,” which consists of reasoning from causes to effects and conversely from effects to causes. The method they describe, which is divided into analysis and synthesis, makes implicit use of syllogistic techniques. Part IV’s central sections (IV:4–12) contain an extended discussion of scientific and sensory knowledge, including “demonstration,” which is another name for logic. The final sections (IV:13–16) warn about the epistemological and moral difficulties connected with faith and contingent beliefs. Remarks here focus on the central sections concerning epistemology, demonstration, and sensory knowledge. They conclude with an explanation of the role of syllogistic logic implicit in the authors’ notion of method.

b. Necessary and Contingent Truth

Certainty in science and the method for achieving it depend on the kind of truth being sought, and in particular, on whether the goal is necessary or contingent truth. The distinction between necessity and contingency had previously been made in Parts I and II. There, the necessity of essential definitions was contrasted with the contingency of accidental predications. The distinction also played a role in the discussion of false ideas. Affirmations with non-referring subjects are false. Some of these are impossible because the subject term has a comprehension that combines modes that are contrary, contradictory, or naturally incompatible (Part II:i). Part IV expands on the distinction between necessary and contingent truth, committing itself to the view that they differ in existential import:

The first reflection is that it is necessary to draw a sharp distinction between two sorts of truths. First are truths that concern merely the nature of things and their immutable essence, independently of their existence. The others concern existing things, especially human and contingent events, which may or may not come to exist when it is a question of the past. I am referring in this context to the proximate causes of things, in abstraction from their immutable order in God’s providence, because on the one hand, God’s providence does not preclude contingency, and on the other, since we know nothing about it [that is, contingent creation], it contributes nothing to our beliefs about things.

For the other kind of truth [viz. of essential natures], since everything [of this sort] is necessary, nothing is true that is not universally true. So we ought to conclude that something is false if it is false in a single case. (IV:13, KM V 398, B 263. See also II:13:iv)

The authors are committing themselves here to one side of a long debate. Earlier logicians were generally agreed that a contingent affirmation with a non-referring subject is false, but they were divided about the case of necessary propositions like essential definitions. Was “Humans are rational animals” true before creation, when there were no humans? Is “a chiliagon has 1000 sides” true today even though there are no actual chiliagons? Logicians like Aristotle and William of Ockham were clear that all propositions with non-referring subjects are false, even empty affirmations about a species’ nature. Others, like William of Sherwood, John Buridan, and Francisco Suárez, allowed that propositions that affirm of a species its nature (for example, essential definitions) have a timeless status. If true, they are necessarily true. Descartes too held that God can make some affirmations eternally true, like the species definition “every triangle has three sides.” On the other hand, Descartes appears to be open to inconsistency because he seems to have been the inspiration of the Logic’s doctrine of false ideas. He held that affirmations with non-referring subjects like chimera are false and a major source of error (compare Meditations III.6, Martin 2011). In the passage above, the Logic’s authors commit themselves to the view that propositions that affirm of a species its nature make no existential claim and that if they are true, they are necessarily so. Other affirmations that do not predicate an essence of a species, including propositions concerning worldly matters (for example, reports of sensation or claims about people, history, or geography), are contingent and carry existential import. It follows that the truth-conditions for categorical affirmatives in Part II, which as stated there do not require existential import, must be amended. For affirmations other than those affirming a nature of a species, an additional condition is necessary. Their truth-conditions should contain the requirement that the subject term signifies at least one existing thing. Doing so would also bring the Logic’s truth-theory into agreement with the prevailing opinion in medieval and then contemporary logic (see Ashworth 1973).

The distinction between necessity and contingency is important in epistemology. Though we can know or fail to know both what is necessary and contingent, the degree of certainty attached to each is different. The most important source of certainty is clear and distinct ideas.

c. Certainty, Clear and Distinct Ideas

In the first of two epistemological axioms, the Logic’s authors endorse Descartes’ doctrine that clear and distinct ideas are a source of knowledge:

First Axiom: Everything contained in the clear and distinct ideas of a thing can be truthfully affirmed of that thing (IV:6, KM V 378, B 250).

Examples of clear and distinct ideas include: ourselves as thinking beings, thinking, judging, reasoning, doubting, willing, desiring, sensing, imagining, shape, motion, rest, extended substance, existence, duration, order, number, and God.

Part I makes clear that substantives and adjectives, which are ideas, have intentional content. Sensory perceptions also have content. It consists of the many modes that flood awareness on the occasion of a sensation. This content, according to the Cartesians, may be clear or distinct. In the logical tradition, distinctness has usually been contrasted with generality. To say that an idea is distinct is to say it is a single idea. If it is distinct, it is unambiguous, and its content is internally consistent and possible (II:12, B 112). Clarity is probably to be understood as it is in Aquinas: it is a kind of intellectual light, a gift of God that allows the soul to be aware of an idea’s modal content. (Compare Aquinas, De veritate, q. 13 a. 2 arg. 4.) Thus, if the soul instantiates a clear and distinct idea, it is aware of a consistent and coherent modal content. The First Axiom then says that if S is an idea that is conceived by the soul with clarity and distinctness and P is a mode in its content, the soul knows with certainty that “every S is P” is true. If the proposition is an essential truth, then it is necessary. Moreover, since S’s comprehension is coherent, the proposition “S exists” is possibly true, even if S has no actual instances. On the other hand, if the proposition is contingent and true, the proposition “S exists” is actually true.

Examples of clear and distinct ideas are not limited to species, nor is the knowledge they impart limited to essential definitions. An example is the cogito. Several times the Logic endorses Descartes’ argument that because the soul has a clear and distinct idea of itself as a thinking thing, it knows that it exists. The existence of a soul, however, is not necessary, nor is existence part of its nature. Most of the Logic’s examples of clear and distinct ideas, on the other hand, are from Cartesian science or metaphysics, and their contents illuminate essences. Part IV stresses that the bulk of scientific knowledge consists of the knowledge of essences imparted by clear and distinct ideas.

The necessity of essential truths is highlighted by Part IV’s second epistemological axiom, which is about possibility:

Second Axiom: At least possible existence is contained in the idea of everything we conceive clearly and distinctly (IV:6, KM V 378, B 250).

God insures that the soul never has a clear and distinct idea of an impossible being. The second axiom entails that if an essential affirmation “every S is P” is grounded in a clear and distinct idea, which is the preferred case in science, then “possibly there is an S” is true. (Arnauld makes clear elsewhere that he does not believe in the existence of possibilia as a category of being distinct from actual things. See “Arnauld to Leibniz,” May 13, 1686; KM VI, pp. 31–32; Stencil 2016.) In the Logic’s first edition the authors go so far as to single out possibility as a marker of truth for essential affirmations:

possibility is a sure mark of the truth with respect to what is recognized as possible, whenever it is a question only of the essence of things (IV:13, B 263.)

The authors are making a point familiar from modal logic. An essential truth is either necessary or impossible. Thus, if it is possible, it is necessary. In the case of an essential definition, then, if it is known to be true scientifically by means of a clear and distinct idea, even if its subject term fails of actual reference, it is true and its subject is possible.

In the same text, the authors explain why a geometrical construction is also a source of certainty. Again, the reasoning turns on possibility. The mere construction of a figure with a property shows that the figure possibly possesses that property. But since the properties of a geometrical figure are either necessary or impossible, this possibility alone insures that the property holds of figures of that type necessarily.
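
The reasoning shared by both cases can be set out in modal notation. The symbols □ (necessarily) and ◇ (possibly) are a modern convenience assumed here, not the authors' notation, and p stands for any essential affirmation or geometrical property claim:

```latex
\begin{align*}
&(1)\quad \Box p \lor \Box\neg p
  && \text{an essential claim is either necessary or impossible} \\
&(2)\quad \Diamond p
  && \text{shown by a clear and distinct idea or by a construction} \\
&(3)\quad \neg\Box\neg p
  && \text{from (2), since } \Diamond p \equiv \neg\Box\neg p \\
&(4)\quad \Box p
  && \text{from (1) and (3) by disjunctive syllogism}
\end{align*}
```

Premise (1) does all the work: it holds only for claims about essences, which is why possibility is a mark of truth there and not for contingent matters.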

While it is true that clear and distinct ideas have the premier role in scientific justification, they are not the only sources of knowledge. Less certain varieties of knowledge are based on demonstration and sensation.

d. Demonstration

Descartes seems to have explained demonstrations by appeal to clear and distinct ideas. He interpreted the propositions that make up the lines of a logical or mathematical proof as a series of independent epistemological insights, each justified by its own clear and distinct idea. Once an individual line is formulated and appreciated, the thinker is inspired to conjure up an additional clear and distinct idea, and this forms the justification for the next line of the proof, and so on for the proof’s subsequent lines. (See Gaukroger 1989.)

The Logic’s authors have a more modern idea of proof. A demonstration, as they understand it, is a series of lines, each of which is either a premise that is either previously proven or certain in itself, or a line that follows logically from earlier lines of the proof. They say:

A true demonstration requires two things: one, that the content includes only what is certain and indubitable; the other that there is nothing defective in the form of the argument. (IV:8, KM V 384, B 251)

Four types of premises are acceptable in a sound proof: propositions that affirm the content of a clear and distinct idea; nominal definitions, which are true by convention; properties of a geometric construction; and previously proven propositions (IV:8, KM V 384, B 251). All other lines in the demonstration must follow formally from earlier lines by rules of logic, presumably the rules of Part III. Perhaps oddly, the authors regard the application of logical rules as relatively trivial. Applying logical rules, they say, is “natural.” How to do so “does not need to be studied” (IV:7, KM V 397, B 252).

e. Sensation and Knowledge of Contingent Truth

A third source of knowledge is sensation. Although sensation is not certain, it is reliable. Its reliability is based on a demonstration. This is the (brief) argument that sensation is reliable because, if not, God would be a deceiver, which he is not. (VFI 28, KM I 355, G 213–214.) Sensory knowledge, moreover, is largely limited to contingent truths about past, present, or future individuals or events.

Although the account of sensation in Part IV is brief, Arnauld explains it more fully in “On True and False Ideas” (VFI 1; KM I, 190, 193, 195–196, 199; G 58, 62–63, 66). There, Arnauld debates Malebranche on whether perception is representational. Arnauld argues fiercely that the soul perceives the world directly, while Malebranche holds that the soul perceives only intermediate representations, which he identifies as ideas in God’s mind. Arnauld maintains that in sensory perception there are only two substances: the soul and the object sensed. As he puts it, the soul perceives the world “by the idea.” The process has two stages. First, the soul is aware that it is having a perception. Second, it is aware of the perception’s content, which consists of modes, some of which are true of the object in the world being perceived and some of which are true of the soul.

At one point he describes perception as a “relation” (VFI 5, KM I 198, G 66. Compare Raconis, De principiis entis a. 3, 827). According to the then-standard analysis of relations, a relational fact between two individuals breaks down into two non-relational substance-mode facts true respectively of the two relata. In other words, the fact that the relation holds breaks down into two nonrelational facts, in each of which the relatum possesses a mode characteristic of its role in the relation. Perception is such a relation. According to Arnauld, when a perception obtains, the soul instantiates a mode, namely an idea which possesses an intentional content that the soul is aware of. Simultaneously, the object sensed instantiates its modes, namely those modes that impact the body’s sense organs. That the soul and the sensed material object possess their respective modes constitutes the relational fact that the one is perceiving the other. It is God, who is not a deceiver, who insures that the material modes in the idea’s content match the modes of the object outside the mind.

Accordingly, veridical sensation consists of a vivid awareness of multiple modes all at once. Some of these are material modes, and as such they are true of the object outside the mind. These material modes consist of various geometrical and mechanical properties that hold of matter according to Cartesian physics. A perception, however, also contains modes of the soul. These are the sensory modes of color, taste, sound, etc. as well as psychological modes like feelings and states of mind. This rich group of material and spiritual modes constitutes the “content” of the perception. Despite the fact that a perception has a content, it is not an idea. Perceptions, for example, do not serve as terms in the propositions of mental language. A perception’s content would be automatically “false” of any subject because it includes a contrary mixture of material and spiritual modes. Rather, the role of perceptual experience is to provide a rich source of modes for abstraction. The modes that the soul is aware of at the time of a veridical sensation are in fact instantiated, some in matter and some in the soul. If on the occasion of a sensation the soul abstracts an idea with a purely material content, the idea is true of the objects in the world impacting the body’s sense organs; if it abstracts an idea with a purely spiritual content, it is true of the soul. Thus, although the Logic’s authors are “rationalist” Cartesians and attach premier importance to clear and distinct ideas, they also allow for empirical knowledge of the material world, albeit of a less certain sort.

f. Method: Analysis and Synthesis

Although medieval logicians had much to say about method, 16th century figures like Peter Ramus had initiated renewed interest. (See, for example, Edwards 1967.) Earlier in the Logic the authors had made brief methodological remarks on classification and its pitfalls not unlike those of Ramus (IV:2, KM V 243, B 125). Part IV begins with an extended discussion of method.

In the Logic’s account (IV:2, KM V 362-367, B 233–238), method divides into analysis and synthesis. Analysis reasons from effects to causes or from the specific to the general, and synthesis reasons inversely, from causes to effects or from the general to the specific. Both presuppose that science classifies its subject by ideas of increasing generality.

The paradigm the authors seem to have in mind is a chain of syllogisms in the mood Barbara. The chain starts with an affirmative premise that characterizes its subject in terms of a narrow species and finishes with a conclusion that predicates of it a more general idea. The same chain in reverse is a synthesis.

The authors provide an example of analysis. In it, the investigator “discovers” that a subject has St. Louis as a remote ancestor or “cause.” The pattern is a series of syllogisms in Barbara. Each syllogism in the series has two premises, one affirming of a subject that he is the descendant of his father, and a second affirming of his father that he is the descendant of his grandfather. The syllogism’s conclusion then affirms that the subject is the descendant of his grandfather. This pattern is repeated, one syllogism for each subsequent generation, until an ultimate conclusion affirms that the original subject is a descendant of St. Louis. Because increasingly earlier ancestors have increasingly more descendants, each succeeding predicate has a broader extension, and the final predicate, “is a descendant of St. Louis,” is the most general of all.

A is a descendant of B, every descendant of B is a descendant of C / ∴ A is a descendant of C

A is a descendant of C, every descendant of C is a descendant of D / ∴ A is a descendant of D

A is a descendant of D, every descendant of D is a descendant of E / ∴ A is a descendant of E

A is a descendant of E, every descendant of E is a descendant of St. Louis / ∴ A is a descendant of St. Louis

Synthesis is the series in reverse order.
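
The chain above can be sketched in Python as repeated application of a single Barbara step. This is an illustrative encoding, not the authors': a premise is a pair (subject, predicate), the singular premise is treated like a universal one, and the names `barbara` and `analysis` are hypothetical.

```python
def barbara(minor, major):
    """From 'x is M' and 'every M is P', infer 'x is P'."""
    subject, middle = minor
    major_subject, predicate = major
    if middle != major_subject:
        raise ValueError(f"middle terms differ: {middle!r} vs {major_subject!r}")
    return (subject, predicate)


def analysis(first_premise, universal_premises):
    """Chain Barbara steps, each conclusion serving as the next minor premise."""
    conclusions = []
    current = first_premise
    for major in universal_premises:
        current = barbara(current, major)
        conclusions.append(current)
    return conclusions


ancestors = ["B", "C", "D", "E", "St. Louis"]
# 'every descendant of B is a descendant of C', and so on up the line
links = [
    (f"a descendant of {near}", f"a descendant of {far}")
    for near, far in zip(ancestors, ancestors[1:])
]

steps = analysis(("A", "a descendant of B"), links)
synthesis = list(reversed(steps))  # the same series read in reverse order

for subject, predicate in steps:
    print(f"{subject} is {predicate}")
```

Each pass widens the predicate by one generation, so the four conclusions match the four syllogisms listed above and end with “A is a descendant of St. Louis.”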

Analysis is also called resolution and the method of discovery. It reasons from effect to cause, from a narrower to a broader predicate. Because an effect follows from its cause, it is said to be a posteriori.

Synthesis reasons from cause to effect, and is called the method of composition. Because causes are prior, synthesis is said to be a priori.

Although it is odd to a modern reader to regard a more general class as the cause of its subsets, it was normal in earlier philosophy. Aristotle regarded the genus as the formal cause of the species, and Neoplatonism considered higher nodes in the ontological tree as the more causally productive. The authors retain this paradigm in their understanding of the hierarchy of genera and species represented by the tree of Porphyry. When analysis is applied to the pursuit of the essential truths, it carries the investigator from knowledge of species lower in the tree to knowledge of a genus higher in the tree. As each conclusion is drawn, a new premise would be required that assigns a genus to the species mentioned in the preceding conclusion. An example of the method applied to genera and species is:

Socrates is a human, every human is an animal /∴ Socrates is an animal

Socrates is an animal, every animal is a living creature /∴ Socrates is a living creature

Socrates is a living creature, every living creature is a body /∴ Socrates is a body

Socrates is a body, every body is a substance /∴ Socrates is a substance

This chain starts by assigning to Socrates the narrow predicate human, which has the comprehension {rational, self-moving, living, corporeal, being}. It proceeds through species with comprehensions of increasingly fewer modes. It finishes by assigning to Socrates the most general genus.
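The shrinking of comprehensions along this chain can be checked with a small sketch. Only the comprehension of human is stated in the text; the comprehensions given below for animal, living creature, body, and substance are assumptions, obtained by dropping one mode at each step. The names `COMPREHENSION`, `CHAIN`, and `is_more_general` are likewise hypothetical.

```python
# Comprehension of each term as a set of modes.
# Only the entry for "human" comes from the text; the rest are assumed.
COMPREHENSION = {
    "human": {"rational", "self-moving", "living", "corporeal", "being"},
    "animal": {"self-moving", "living", "corporeal", "being"},
    "living creature": {"living", "corporeal", "being"},
    "body": {"corporeal", "being"},
    "substance": {"being"},
}

# Predicates in the order in which analysis assigns them to Socrates.
CHAIN = ["human", "animal", "living creature", "body", "substance"]


def is_more_general(genus, species):
    """A genus is more general when its comprehension is a proper subset."""
    return COMPREHENSION[genus] < COMPREHENSION[species]


sizes = [len(COMPREHENSION[term]) for term in CHAIN]
ascending = all(is_more_general(g, s) for s, g in zip(CHAIN, CHAIN[1:]))
```

On these assumed sets, each step of the analysis moves to a predicate with one mode fewer, ending with the single mode of the most general genus.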

The authors of the Logic were not alone among their contemporaries in having this understanding of cause, or of analysis and synthesis. Spinoza argues for his own version of quasi-Neoplatonic causation in which the order of cause to effect is the same in a sense as the order in logic of predicate to subject. Hobbes defends an account of analysis and synthesis that is almost identical to the Logic’s (Hobbes, De Corpore I.6.1, 66: Hobbes 1992). In various papers, Leibniz explores versions of analysis that are essentially more formal versions of the Logic’s. It was typical of Leibniz to symbolize the predicate of a universal affirmative as a series P1…Pk of concatenated terms. In his notation, the term letters are intended to stand for modes that are like those that make up species-comprehensions in the Logic. In a typical example, Leibniz lays down an initial premise S is P1…Pk. The “analysis,” then, is a deduction that proceeds by the application of a simplifying inference rule that deletes terms from the predicate, thus making the new line’s predicate more general. The deduction terminates in a line with the most general predicate of all. The inference rule would be: from S is X1…Xn, infer S is X1…Xn−1. (See, for example, De arte combinatoria in Parkinson 1966, and Swoyer 1995.)

S is P1,P2,P3,P4

S is P1,P2,P3

S is P1,P2

S is P1
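The deduction above can be reproduced mechanically. The following is a minimal sketch, assuming that the predicate is represented as a list of term letters and that the simplifying rule always deletes the last term; the function names `simplify` and `leibniz_analysis` are hypothetical and are not Leibniz’s own notation.

```python
def simplify(terms):
    """Apply the deletion rule once: drop the last term of the predicate."""
    return terms[:-1]


def leibniz_analysis(subject, predicate):
    """Return every line of the deduction, from the initial premise
    down to the line with the most general (single-term) predicate."""
    lines = []
    terms = list(predicate)
    while terms:
        lines.append(f"{subject} is {','.join(terms)}")
        terms = simplify(terms)
    return lines


derivation = leibniz_analysis("S", ["P1", "P2", "P3", "P4"])
```

A premise with k terms yields k lines, since each application of the rule removes exactly one term.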

In sum, the Logic’s notion of cause and its associated methods were a symptom of their time. What is of interest from the perspective of logic is that their details make implicit use of technical ideas from syllogistic logic. It should also be remarked, however, that it is hard to see how these methods actually would be of use in Cartesian physics, about which the Logic says very little.

5. References and Further Reading

a. Primary Sources

  • Arnauld, Antoine. 1813. Œuvres Philosophiques d’Antoine Arnauld. Paris: Adolphe Delahays. Abbreviated VFI.
  • Arnauld, Antoine. 1990. On True and False Ideas. Translated by Stephen Gaukroger. Manchester: Manchester University Press. Abbreviated G.
  • Arnauld, Antoine. 2003. Œuvres Philosophiques d’Arnauld. Edited by Elmar Kremer and Denis Moreau. Bristol: Thoemmes Press. Abbreviated KM.
  • Arnauld, Antoine and Pierre Nicole. 1996. Logic or the Art of Thinking. Translated by Jill Vance Buroker. Cambridge: Cambridge University Press. Abbreviated B.
  • Buridan, John. 2001. Summulae de Dialectica. New Haven: Yale University Press.
  • Eustachio-de-S.-Paulo. 1648. Summa philosophiae quadripartita, de rebus dialecticis, ethicis, physicis et metaphysicis. Cantabrigia [Cambridge]: Rogerus Danielis.
  • Fonseca S.J., Petrus. 1599. Commentarii in XII libros Metaphysicarum Aristotelis. Frankfurt.
  • Hobbes, Thomas. 1992. De Corpore. Edited by William Molesworth. London: Routledge-Thoemmes Press.
  • Raconis, C. F. d’Abra de. 1651. Tertia Pars Philosophiae seu Physicae, Quarta Pars Philosophiae seu Metaphysicae. Totius Philosophiae, hoc est Logicae, Moralis, Physicae et Metaphysicae, brevis et accurata, facilique et clara methodo disposita tractatio. Lugdunum [Lyon]: Irenaeus Barlet.
  • Suárez, Francisco. 1995. On Beings of Reason (De entibus rationis) Metaphysical Disputation 54. Milwaukee: Marquette University Press.
  • Toletus S.J., F. 1596. Commentaria una cum quaestionibus in universam Aristotelis logicam. Cologne: Agrippina.
  • William of Ockham. 1978. Expositio in librum Perihermenias Aristotelis. Edited by A. Gambatese and S. Brown. St Bonaventure, New York: Franciscan Institute.

b. Secondary Sources

  • Ashworth, E. J. 1973. “Existential Assumptions in Late Medieval Logic.” American Philosophical Quarterly, 10: 141–147.
  • Auroux, Sylvain. 1982. L’Illuminismo Francese e la Tradizione Logica di Port-Royal. Bologna: CLUEB.
  • Auroux, Sylvain. 1992. “Port-Royal et l’arbre de Porphyre.” Archives et documents de la Société d’histoire et d’épistémologie des sciences du langage, 6: 109–122.
  • Auroux, Sylvain. 1993. La Logique des Idées. Montréal, Paris, Bellarmin : Vrin.
  • Chomsky, Noam. 1966. Cartesian Linguistics. New York: Harper and Row.
  • Cronin, T. J. 1966. Objective Being in Descartes and Suárez. Rome: Gregorian University Press.
  • Dominicy, Marc. 1984. La Naissance de la Grammaire Moderne, Bruxelles: Pierre Mardaga.
  • Edwards, William F. 1967. “Randall on the Development of Scientific Method in the School of Padua – A Continuing Reappraisal.” In Naturalism and Historical Understanding, edited by John P. Anton, 53–69. State University of New York.
  • Garber, Daniel. 1993. “Descartes and Occasionalism.” In Causation in Early Modern Philosophy, edited by Steven M. Nadler, 9–26. University Park, Pennsylvania: Pennsylvania State University Press.
  • Gaukroger, Stephen. 1989. Cartesian Logic. Oxford: Oxford University Press.
  • Lenzen, Wolfgang. 1990. “On Leibniz’s Essay Mathesis Rationis.” Topoi, 9, 29–59.
  • Martin, John N. 2011. “Existential Import in Cartesian Semantics.” History and Philosophy of Logic, 32:2, 1–29.
  • Martin, John N. 2012. “Existential Commitment and the Cartesian Semantics of the Port Royal Logic.” In New Perspectives on the Square of Opposition, edited by Jean-Yves Beziau. Bern: Peter Lang.
  • Martin, John N. 2013. “Distributive Terms, Truth, and The Port Royal Logic.” History and Philosophy of Logic, 34:2, 133–154.
  • Martin, John N. 2016a. “A Note on ’Distributive Terms, Truth, and The Port Royal Logic’.” History and Philosophy of Logic, 37:4, 391–392.
  • Martin, John N. 2016b. “Privative Negation in The Port Royal Logic.” Review of Symbolic Logic, 9, 23.
  • Martin, John N. 2016c. “The Structure of Ideas in The Port Royal Logic.” The Journal of Applied Logic, 19, 1–19.
  • Martin, John N. 2017. “Extension in the Port Royal Logic.” South American Journal of Logic, 3:1, 1–20.
  • Nadler, Steven. 2011. Occasionalism: Causation among the Cartesians. Oxford: Oxford University Press.
  • Nadler, Steven M. 1989. Arnauld and the Cartesian Philosophy of Ideas. Manchester: Manchester University Press.
  • Pariente, Jean-Claude. 1985. L’Analyse du Langage à Port-Royal. Paris: C.N.R.S. Éditions de Minuit.
  • Parkinson, G. H. R. 1966. Leibniz, Logical Papers. Oxford: Clarendon Press.
  • Parsons, Terence. 2014. Articulating Medieval Logic. Oxford: Oxford University Press.
  • Pasnau, Robert. 1997. Theories of Cognition in the Later Middle Ages. Cambridge: Cambridge University Press.
  • Stencil, Eric. 2016. “Essence and Possibility in the Leibniz-Arnauld Correspondence.” Pacific Philosophical Quarterly, 97, 2–26.
  • Swoyer, Chris. 1995. “Leibniz on Intension and Extension.” Nous, 29, 96–114.

 

Author Information

John N. Martin
Email: john.martin@uc.edu
University of Cincinnati
U. S. A.

 

Novalis (Georg Philipp Friedrich von Hardenberg) (1772-1801)

“Novalis” was the pseudonym of Georg Philipp Friedrich Freiherr von Hardenberg, an early German Romantic philosopher, poet, and novelist. Born into a Pietistic family of minor, slightly cash-strapped, Saxon nobility in 1772, he died of tuberculosis in 1801 at the age of 28. Novalis is sometimes seen as the paradigmatic figure of German Romanticism: His early death, the illness and death of his young fiancée Sophie a few years earlier—which inspired one of his most famous works, Hymns to the Night—and the sometimes mystical style of his writing have contributed to his reputation as an otherworldly, even morbid poet. However, Novalis was also a trained philosopher working within the post-Kantian Idealist tradition, with a concern for the problems that occupied this tradition: the possibility of freedom and the nature of the human vocation, the basis of knowledge, the relationship between nature and science, the significance of religion, and the best way to promote a thriving and ethical community.

Novalis was a central figure in the Jena circle of early German Romantics, which was influenced by the work of Fichte, Herder, Goethe, and the Christian mystic Jakob Boehme, and which included Friedrich and August Wilhelm Schlegel, Ludwig Tieck, Caroline Schlegel, Dorothea Veit-Schlegel, and others. During his short life, Novalis wrote philosophical fragments (some of which were published in the Schlegel brothers’ journal Athenaeum), as well as poetry, novels (The Novices of Saïs and Henry of Ofterdingen), philosophical essays (including “Christendom or Europe” and “Faith and Love or The King and Queen”), and notes and short essays on science, medicine, religion, history, language, art, and nature, including many intended for an encyclopedia, which are available in translation as Notes for a Romantic Encyclopaedia. Most of these works were only published after Novalis’s death, with the collection of his writings by Ludwig Tieck and Friedrich Schlegel.

Table of Contents

  1. Life and Works
  2. Cosmology
  3. Novalis’s Account of History
  4. Subjectivity and the Vocation of Humankind
  5. Romanticization and Poetry
  6. The Artist as Genius
  7. Language and the Fragment
  8. The Mediator
  9. Relation to Christianity
  10. References and Further Reading
    1. Works by Novalis in German
    2. Works by Novalis in English Translation
    3. Works About Novalis and Early German Romanticism in English

1. Life and Works

Georg Philipp Friedrich Freiherr von Hardenberg, better known by his pen name Novalis, was born on May 2nd, 1772, at his family’s home at Schloss Oberwiederstedt in the Harz Mountains, about 80 kilometers from Leipzig. Friedrich was the oldest son of eleven children born to Heinrich Ulrich Erasmus Freiherr von Hardenberg (1738-1814) and Auguste Bernhardine von Hardenberg (née von Bölzig; 1749-1818). Hardenberg’s family belonged to the Saxon nobility, although of a relatively low rank, and financial worries feature in many of Hardenberg’s letters as a young man. Financial concerns also motivated the family to move from Schloss Oberwiederstedt to a smaller home in Weißenfels in 1784. Hardenberg’s father was a follower of the pietistic Herrnhuter (Moravian) church founded by Zinzendorf, and attempted to raise his family in strict adherence to his pietistic beliefs. This upbringing had a lasting effect on Hardenberg’s thought.

From 1783 to 1784, Hardenberg lived with his wealthy, aristocratic uncle, Friedrich Wilhelm Freiherr von Hardenberg, who exposed the young Hardenberg to his extensive library and interest in Enlightenment thought. Hardenberg subsequently moved with his family to their home in Weißenfels, situated between Leipzig and Jena, and until 1790 studied at the gymnasium in Eisleben, where the curriculum emphasized literature and rhetoric.

In 1790, Hardenberg began his studies in jurisprudence in Jena before moving to Leipzig and then Wittenberg. Despite beginning his time at university by devoting most of his attention to having fun and flirting with women, when he completed his studies in 1794 he obtained the highest possible grade. During this time, Hardenberg met and studied with or befriended many notable figures who were to have a profound influence on his thought, including Fichte, Schiller, Reinhold, Jean Paul, Schelling, and August and Friedrich Schlegel. Friedrich Schlegel, whom he met in Leipzig, became a particularly close friend and interlocutor, and was a central figure of the Jena circle of early German Romantics. Also during this time, in 1791, Hardenberg published his first piece, a poem dedicated to his friend and mentor Schiller, titled “Klagen eines Jünglings” (“A Youth’s Lament”), in Der Neue Teutsche Merkur.

In 1794, Hardenberg took a job as an assistant to the district official in Tennstedt, Coelestin Augustin Just, who became his friend and, after Hardenberg’s death, his biographer. It was while working for Just that Hardenberg travelled to Grüningen, where he met the twelve-year-old Sophie von Kühn at the home of her parents. According to his own account, Hardenberg was immediately captivated by Sophie, and they became engaged the following spring, when Sophie was just thirteen. However, in 1795 Sophie became ill with consumption, from which, after several painful surgeries, she died in 1797 at the age of 15. Sophie’s death came just a few weeks before that of Hardenberg’s closest brother Erasmus, and Hardenberg’s diary entries following their deaths reveal a deep depression from which he gradually emerged over the following months. During this time, he had a vision at Sophie’s grave that he recorded in his diary; with only minor modifications this became the first hymn of his famous Hymns to the Night, which were published in 1800.

From 1795, Hardenberg was employed as a salt-mine inspector for his father. Hardenberg took this apparently mundane role very seriously, and some of his writings, including the novel Henry of Ofterdingen, give rocks and mines an important place as analogies for various aspects of the universe and the self. In 1795 and 1796 Hardenberg studied Fichte intensively, and his notes from this time are published as his Fichte Studies.

In 1797, Hardenberg entered the Mining Academy of Freiberg. He also immersed himself in the study of biology, history, medicine, and the philosophy of Schelling, Kant, Spinoza, Hemsterhuis, and others. In 1798, he published one of his influential pieces, a set of fragments called “Blüthenstaub,” or “Pollen,” in the Schlegel brothers’ journal Athenaeum. The fragments were worked on to some extent by Friedrich Schlegel, reflecting the early Romantic idea of “symphilosophy,” or performing philosophy together. In the same year, Hardenberg published “Faith and Love or the King and Queen” in Yearbooks of the Prussian Monarchy. In this piece, he praises the new King and Queen of Prussia, Friedrich Wilhelm III and Luise, while using them as metaphors to outline his ideas on the ideal state. The essay was not well understood or received at the time, with the monarchs as well as Friedrich Schlegel expressing strong disapprobation. Hardenberg’s pen name “Novalis” first appears as the author of these texts. The name means “new land” and recalls the name “de Novali,” which was used by some of Hardenberg’s ancestors. Hardenberg’s notes for an encyclopedia project, available in German as Das Allgemeine Brouillon and in English translation as Notes for a Romantic Encyclopaedia, also date from this period.

In 1798, Hardenberg met Julie von Charpentier, the daughter of a mineralogy professor at Freiberg. They became engaged the following year. However, like his engagement to Sophie, Hardenberg’s betrothal to Julie was never fulfilled, as he fell ill later that year with the tuberculosis that was to kill him in 1801. The last years of Hardenberg’s life were extremely busy: He worked again as a salt-mine inspector and was promoted to director, and was also appointed a magistrate of Thuringia. In 1799 he met Ludwig Tieck, who immediately became a close friend, with whom he absorbed himself in the study of Jakob Boehme. During this time, Hardenberg wrote his essay “Christendom or Europe” as well as the short novel The Novices of Saïs, the poems collected as Geistliche Lieder (Spiritual Songs), and The Hymns to the Night, which were published in 1800, and worked on Henry of Ofterdingen, a bildungsroman that Hardenberg never completed. In late 1799 the Jena circle of early German Romantics, including Hardenberg, Tieck, the Schlegel brothers and their spouses Dorothea Veit-Schlegel and Caroline Schlegel, and others, met regularly.

Throughout the last part of 1800 and the early months of 1801, Hardenberg’s health worsened, and on March 25th, 1801, he died at Weißenfels. Friedrich Schlegel was by his bedside while Hardenberg’s brother Karl played piano for him, and his death was described as very peaceful.

After Hardenberg’s death, Ludwig Tieck and Friedrich Schlegel edited and published the first edition of his collected works, which appeared in two volumes in 1802 (a third volume was added in 1846). Tieck and Schlegel promulgated the myth of Novalis as the otherworldly arch-Romantic, which was unfortunately taken up uncritically by commentators and has shaped his reputation, and that of Romanticism, ever since. Hegel, in particular, contributed to a popular conception of Romanticism in general and Novalis in particular as morbid, overly emotional, and pathologically introspective. This conception does not do justice to Novalis’s rigorous and sophisticated engagement with the philosophical, political, scientific, religious, and literary thought of his time.

2. Cosmology

Novalis’s cosmology is pantheistic; that is, it explains the world as a manifestation of the divine. Novalis presents the universe, including human beings, as the self-development of an originally infinite, undifferentiated, unconscious unity into finite individual entities, for the purpose of self-knowledge, or self-consciousness. While the starting point for this idea was the philosophy of Fichte, Novalis was concerned that Fichte’s emphasis on the development of the subject, or “I,” through positing the object—that is, the real, physical world, or the “not-I” —stripped the physical world of freedom and selfhood: Novalis wonders whether Fichte had “stuffed too much into the I.” Novalis, by contrast, views the world outside the subject as an active interlocutor—as, effectively, another subject. The difference is often summed up by saying that Novalis turned Fichte’s “not-I” into a “you.” Underlying this attribution of selfhood to the world at large is Novalis’s claim that the universe is divine: It is the comprehensible realization of an infinite God, unfolding in space and time.

For both Novalis and Fichte, the self-differentiation of an original absolute into individual beings allows it to perceive and reflect on itself, by creating the subject-object distinction that, like Fichte, Novalis asserts is essential for cognition, and even consciousness. On this model, the reflection on each other of finite elements within the universe is also the self-knowledge of the universe. However, the nature of the universe as originally infinite, whole, and undifferentiated can never be perfectly known. This is because all knowledge and consciousness depends on the subject-object distinction, and so is necessarily mediated by the particular finite entities that make up the world: Thus, Novalis claims, “We seek the unconditioned everywhere, and find only things” (“Pollen,” Schriften II, p.412 #1). Perfect knowledge of the universe is, therefore, a regulative ideal.

While Novalis acknowledges the reality of the world of everyday experience comprised of particular entities, he claims that underneath these divisions the universe is whole, unified, and divine. Thus, Novalis situates human beings in two realities that, he maintains, are in fact two aspects of one and the same world: the everyday universe of individuated entities (including individual human beings), and a spiritual universe of undifferentiated unity. Novalis’s analysis of this model is sophisticated, identifying the limitations of understanding the world based on particular, finite, material things while recognizing its reality, necessity, and value within experience. He acknowledges that the categories and divisions of our everyday perspective on the world have value, expressing gratitude for “scientists” and “scholars” who have measured and calculated the physical world, advancing the self-knowledge of the universe even while they obscure its essential nature as a single spiritual whole; but he also points out some damaging consequences of espousing this worldview, and insists on the importance of moving beyond it. His philosophy and poetry are largely attempts to demonstrate how (and to what extent) we can, first, have epistemological access to the underlying divine unity of the universe; second, articulate this access and communicate it to others; and third, make the relationship between these worlds closer, moving towards a regulative ideal of a perfect correspondence, or unity, in which the material realm manifests its divine inner nature.

As part of this project, Novalis attempts to find ways of overcoming the divisions between individual entities as well as several dichotomies that characterize the way we tend to experience and understand the universe. These include those between subject and object, the divine and the mundane, the rational or spiritual and the emotional or sensuous, the conscious and the unconscious, activity and passivity, and freedom and determinism. Novalis maintains that the segregation of existence into these dualities is a source of unhappiness and alienation. The terms of the dichotomies are usually understood to be mutually exclusive, leading to a fragmentation of both the world at large and, specifically, human identity, and often a rejection and/or devaluation of one or other of the terms. In particular, Novalis is concerned that this dualism often results in devaluation of the physical, emotional, sensuous, unconscious, and mundane aspects of the universe. According to Novalis, this fragmenting and alienating tendency also divides human beings from important parts of themselves, which on this model are construed as external to them—these parts include the natural world, other human beings, and God. In Novalis’s cosmology, while we currently experience these things as existing outside ourselves, at a more fundamental level they and we form a single whole.

For Novalis, the divisions and categories under which we usually perceive our environment obscure the unity of the cosmos and conceal its divine nature by presenting it as purely physical, rather than as a manifestation of spirit. The spiritual seems to be separate from the physical and their relation mysterious. This applies not only to God, but also to aspects of human existence thought to transcend physical processes, such as freedom, thought, and the will—the possibility of these aspects of existence manifesting themselves in a material realm becomes hard to explain. Many of Novalis’s contemporaries, including Kant and Schelling, struggled with this difficulty, and Novalis attempts to resolve this problem by overturning the dualism that lies at its root. Much of Novalis’s writing is concerned with revealing the inherent spirituality and rationality of the frequently devalued material elements of existence, as well as the superficiality of the divisions between human beings, nature, and God.

3. Novalis’s Account of History

Novalis claims that the world that modern human beings inhabit, in which the universe is a system of separate, finite entities and in which human beings are individual subjects, does not reflect the essential nature of the universe. Rather, this state of affairs is a development that began at the start of time, with an initial self-differentiation of an originally unitary cosmos that has become more pronounced through history. According to Novalis, the universe tends to move from an original state of undifferentiated, unconscious unity towards a community of conscious individuals who are aware of their nature as emanations of the divine whole. Novalis’s account of history aims to describe this development, which he presents as taking place through a repeated dialectical process that moves from a state of relatively undifferentiated, unconscious existence, through a state of individuated but fragmented conscious existence, to a state of more unified and harmonious “organic” consciousness, eventually approaching an ideal community of conscious individuals aware of their nature as parts of the same greater whole: “Before abstraction everything is one, but one like chaos; after abstraction everything is unified again, but this unification is a free interconnection of independent, self-determined beings. From a heap, a community has emerged” (“Pollen,” Schriften II p.455 #95). Although human beings epitomize this conscious awareness, Novalis indicates that plants, animals, and other aspects of the natural world also form parts of this ideal community.

Novalis often depicts earlier states of the manifestation of spirit in the world in order to convey this process and, through extrapolation, point both forwards in time to the ideal coming age of communion and backwards to the original position of absolute unity and non-self-awareness that preceded the origination of the world. Novalis has been criticized for creating sentimental idealizations of historical periods, particularly the medieval Europe described in “Christendom or Europe”; however, it is fairer to interpret him as presenting these images not as factual accounts, but as abstracted views of history meant to exemplify the progression from unconscious unity through conscious disunity to conscious unity. He depicts periods in which, prior to the emergence of the modern worldview of the universe as material, atomistic, and causally regulated and of the human being as an individual, conscious subject, human beings, God, and nature existed in closer relation to each other but also with less rational or discursive awareness. The Hymns to the Night describe an ancient time in which human beings lived in communion with nature and saw the spiritual essence of the world in mythical form in all things. “Christendom or Europe” presents a later period in a fictionalized medieval Europe, still before the advent of an Enlightenment worldview, in which education, trade, and communication flourished, but the people still lived in harmony, united under one spiritual goal (Catholicism). Although these examples are from different works, the period described in “Christendom or Europe” can be understood to represent a state of greater development of the self-knowledge of the universe than the pagan age described in the Hymns.

While both periods described above seem idyllic because, according to Novalis’s poetic descriptions, the entities that make up the world at these times exist in greater harmony than they do now, the lower intellectual development at these times means they do not manifest consciousness, rationality, and spirit to as high a degree as modernity. However, “Christendom or Europe” suggests that greater differentiation and intellectual development can still be harmonious and unified if these occur within a community working towards a common spiritual goal. Thus, in addition to describing a past stage of the development of the universe, this piece points to the possibility of a future higher synthesis of society into a spiritual community, calling its readers to overcome the relatively fragmented and spiritless situation in which Novalis believes they currently exist.

According to Novalis’s outline of history, the beginning of the Enlightenment marked the emergence of a highly developed cognition of particular entities, but also the loss of the original community that he describes, as well as of the ability to see spiritual significance in physical objects and mundane events: “The gods disappeared with their following—nature stood forlorn and lifeless. Arid count and strict measure bound her with iron chains” (“Hymns to the Night,” in Schriften I, p.145 s.5). The result is the world as it appears to a mentality that emphasizes an intellectual and categorizing approach to experience: a mechanistic, material universe, which permits a detailed understanding of physical processes, but lacks deeper meaning and is unimbued by spirit.

The categorizing and individuating activity of reason is, for Novalis, instrumental in achieving the self-reflective unity that he views as the divine purpose of the universe. Without the fragmentation engendered by division and categorization, spirit would be unable to reflect on itself and would remain in a state of blind self-identity. However, not just greater individuation, but also greater integration with the whole is necessary for knowledge of the universe as essentially unified and divine, rather than just according to its appearance as a set of particular entities and events. Thus, Novalis sees an overemphasis on discursive reason, with its divisive and alienating tendencies, as an antithesis to a preceding state of the world that was less rational and more unified, and as also preparing the ground for a subsequent synthesis into a more complex and self-conscious harmonious whole. At each level, the universe’s consciousness of its essential nature is, at least ideally, enhanced.

4. Subjectivity and the Vocation of Humankind

Novalis is a pantheist, maintaining that what we perceive as particular entities, including individual human beings, are not, in fact, most essentially distinct objects related externally and physically to one another, but are more fundamentally parts of a divine whole, connected internally through their shared spiritual nature. The individual human being is, therefore, a manifestation of God, who is present in all particular entities: “Only pantheistically does God appear wholly—and only in pantheism is God wholly everywhere, in every individual. Thus for the great I the ordinary I and the ordinary you are only supplements” (“Allgemeine Brouillon,” Schriften III, p.314 #398). This means that, on Novalis’s model, parts of the world that seem external to the individual subject—other human beings, animals, plants, objects, and even God—are in fact essential parts of the self, that is, of the “great I.” Overidentification with ourselves as individuals, in particular with ourselves as conscious, active individuals, makes us experience ourselves as fragmented in this way, set over and against a “you” that is, more fundamentally, another part of ourselves. For Novalis, the vocation of humankind is to realize our true nature as part of the divine whole, simultaneously developing closer connections with that “you” and fostering the self-understanding of the universe as a divine absolute: “We are not at all I—but we can and shall become I. We are seeds for becoming I. We shall all transform into a you—into a second I—only thereby do we raise ourselves to the Great I—which is one and all together” (“Allgemeine Brouillon,” Schriften III, p.314 #398).

Novalis’s model of the self reflects the post-Kantian Idealist separation of the everyday, empirical or individual I from the absolute I, often identified with God and sometimes described as “the Absolute” or “spirit,” terms whose relation to each other was at issue for Fichte and Schelling, among others. Novalis’s response to this problem is to claim that, with regard not just to the individual but also to the universe as a whole, we have access to both kinds of existence, although imperfectly, and can combine them, although never completely. The task of doing so is, on Novalis’s account, the human vocation. Taking up this task not only allows human beings to integrate into their selves aspects of their greater self (or God) from which they are currently alienated, but also facilitates the original purpose of the world as the gradual development from an absolute, undifferentiated, blind unity, to a community of individuated entities conscious of their true spiritual nature.

Because the universe, as divine, is both one and infinite, Novalis maintains that the task of uniting oneself with one’s greater self can never be completed while one exists as a finite, conscious individual, and the aim is therefore to draw increasingly close to this union without ever fully attaining it. The approach is characterized by spirit’s increasingly adequate self-expression and self-knowledge in and through the physical world, in particular through human beings and their understanding of and actions in the world. Novalis thus situates the human being within the world as both a part of it and at a special place where it becomes self-aware, and where its essential freedom, rationality, and spirituality are epitomized.

The development of the self-awareness of the universe through the activities of human beings occurs through a process in which the individual both comes to understand that the world is a reflection of him- or herself, indeed at a deeper level part of him- or herself, and shapes the world so as to more closely reflect the spiritual nature that lies within both world and self. Because of their shared spiritual nature, the self and the apparently external world are, on Novalis’s account, analogues of each other. The mind reflects the world in the form of representations, and the world correspondingly manifests the mind in a physical medium, in what Novalis calls “figures” or “hieroglyphs” or “ciphers” —the shapes of objects and events, which form a secret language that we can learn to read. The better we can read this language, the more closely our representations—that is, our minds—reflect the world, and the more we work to interpret the world in this way, the more we invest it with spirit. Thus, Novalis claims, “We are on a mission: our vocation is the cultivation of the earth” (“Pollen,” Schriften II p.427 #32). Novalis maintains that this double process of interpretation begins to mend the fragmentation between minds and bodies and between the spiritual and the physical, allowing a closer mirroring of these at first apparently incommensurate elements, in the process spiritualizing the physical.

It should be noted that the mediation of the spiritual to the world is for Novalis only possible because the world is fundamentally already divine. Thus, human beings do not accomplish a union of two originally or inherently different realms, but the realization of a pre-existing spiritual inner essence of the world. By revealing this spiritual nature, human beings take up their vocation, and the world becomes readable as a symbol and manifestation of the divine.

5. Romanticization and Poetry

Novalis claims that the “cultivation of the earth” that he describes as the vocation of humankind is to be achieved through a form of “poetic” or “Romantic” creativity. The activity of interpreting the world as embodying the divine partially overcomes the separations between the physical and the spiritual and between the self, as subject, and the rest of the world, as object. Novalis refers to this spiritualization of the world as “raising,” “raising to a higher power,” or “romanticizing”: “Insofar as I give the common a high sense, the usual a secret aspect, the known the worth of the unknown, the finite an infinite appearance, I romanticize it” (“Logological Fragments [II],” Schriften II, p.545 #105). The shaping of the physical world to reflect the spiritualized vision of it created by the poet, artist, or genius is the crux of Novalis’s concept of “magical idealism.”

According to Novalis, there are two ways of interpreting and relating to the world: one that perpetuates and even exacerbates the fragmentation of self, nature, and spirit that modern human beings experience in their everyday lives, and one that undermines this fragmentation. The first reflects the excessive rationality epitomized by science, and sees the representations formed by a categorizing and divisive form of discursive thought as more or less accurate reflections of an external world. The second is a “Romantic” attitude epitomized by poetry (although also potentially present in conversation, translation, art, art criticism, and many other practices), which recognizes that one’s representations of the objects of one’s knowledge are based on intuitive connections with these objects, and acknowledges the contingency, subjectivity, and partiality of any attempt to articulate these intuitions or conceptualize the universe. This approach therefore employs emotions and intuitions to inform attempts to understand the world, and those who adopt it are motivated to continually improve on these attempts, thereby fulfilling the human vocation of developing the increasing self-knowledge of the universe.

Although Novalis, like other Romantics, often seems to stress the intuitive aspect of this process, Romantic interpretation is not supposed to be of a raw emotional or intuitive nature, but is rather articulate and rational, being informed, shaped, and mediated by consciousness. Novalis views as relatively undeveloped the “raw, discursive thinker” who interprets the world as atomistic and mechanistic and the “raw, intuitive poet,” whose interpretations have no fixed form (“Logological Fragments [I],” Schriften II pp.524–55 #13); these tendencies are united in the Romantic poet, novelist, philosopher, or artist, who can switch back and forth between these modes and give form to her or his visions of the living, dynamic world of nature. The means by which Romanticization is achieved, therefore, is the synthesis of reason and emotion (or science and imagination, or philosophy and poetry). This synthesis is reflected in the literary and allegorical forms in which Novalis (and other early German Romantics) chose to write as well as in the content of his writings.

Novalis maintains that the world is in principle epistemologically accessible, although a complete understanding of the universe as a divine whole and of oneself as a part of that whole is a regulative ideal. Insofar as individuals encounter their own nature and the rest of the world through mental representations, they experience these things as individual and particular, and the essential nature of the universe as a divine unity eludes them. However, one can partially overcome these separations and glimpse the nature of reality through intuition, imagination, and creative interpretation. It is only when one’s unconscious physical and affective nature participates in constructing interpretations of one’s environment that one can really understand that environment. Thus, for Novalis, an interpretation of the world that raises it towards the divine begins by circumventing narrowly rational categories for acquiring knowledge and allowing one’s intuitions to reveal the way things are.

It is not enough, however, merely to have these intuitions; they must be articulated, that is, as best as possible reproduced in the medium of discursive thought, in order to bring them to consciousness. For Novalis, Romantic creativity does not entail abandoning conscious representation, but rather integrating it with intuitions. A poetic representation is not simply an intellectual model of reality, aiming to adequately describe particular events and objects; nor is it raw intuition of the spiritual wholeness of the universe. Rather, emotions and intuitions let the poet read the world as divine and use language in an imaginative and symbolic way to represent this divine nature. Thus, Romantic interpretation reveals the spiritual unity that underlies all seemingly particular things.

On Novalis’s account, by revealing the spiritual nature of the world in this way, Romantic interpretations actually allow us to inhabit a more spiritual world. According to Novalis, the spiritual essence of things is not given in phenomena but is imparted to phenomena through interpretation, as he explains using the example of music: “All tones that nature brings forth are raw—and spiritless—often only to the musical soul does the sound of the forest—the piping of the wind, the song of the nightingale, the plashing of the brook seem melodious and meaningful” (“Anecdotes,” Schriften II, pp.573–74 #226). By creating order and meaning for objects and events, or in other words by perceiving naturally occurring objects and events in a spiritualized, rational form, the poet or artist invests them with spirit, allowing their inner nature and significance to shine forth.

6. The Artist as Genius

Novalis’s theory of the genius reflects how he thinks interpreting physical objects and events as spiritual actually invests these objects and events with spirit, creating a real physical world that manifests the divine. According to Novalis, the activity of the artist or genius is an exemplification and intensification of what human beings always do. Human beings do not exist in a world that is simply given, but rather project a world on the basis of their understanding or interpretation of their experiences. This, Novalis claims, is the essence of genius: “When we speak of the external world, when we portray real objects, then we act like genius. Thus genius is the ability to act towards imaginary objects like real ones, and also to treat them like these” (“Miscellaneous Remarks,” Schriften II, pp.418–20 #22). The way people interpret and understand their experiences is, therefore, the way in which they create the world that they inhabit. The artist has this capacity to a much higher degree than most people: Novalis refers to the artist as “the genius of genius” (p.420 #22). In other words, artistic activity is a raised form of the everyday human way of being.

Although Novalis describes the world-creating activity of the genius as “spontaneous,” he envisions this activity not as a generation ex nihilo or an imposition of an interpretation on an inert world, but as the way the world expresses itself in a more conscious and articulate, and therefore more spiritual, form. The world the genius creates is, due to her or his intuition and greater connections with the rest of the world, a free expression of the spirit, unity, and life of the universe, including of the genius her- or himself as the place where these characteristics are epitomized and come to expression. The actions of the genius are shaped and informed by the world of which she or he is a part, so that the spontaneous expression of the genius’s spirit that occurs in artistic creation is also a response to what is given. Novalis takes as the archetype of this activity the novelist, who “from his given crowd of accidents and situations—makes a well-ordered, lawlike series” (“Anecdotes,” Schriften II, p.580 #242). The freedom and creativity of the author are restricted by the terms given to him or her, while he or she draws objects and events together into a coherent, meaningful whole. This process can be extended to the task of understanding and acting towards the events of one’s experiences generally: Novalis claims, “All the accidents of our lives are materials out of which we can make what we want” (“Pollen,” Schriften II, pp.437–39 #66).

In other words, the genius is engaged in creative dialogue with his or her surroundings. For Novalis, nature is a language, if one that modern human beings have forgotten how to read and respond to. The beings and events that make up the world, including but not restricted to human beings and their activity, are symbols of the divine. While these symbols are now hidden to most people, the genius can both read and respond to this language of nature, like a participant in a conversation, and in doing so bring this world to a higher, more spiritual expression.

7. Language and the Fragment

According to Novalis, language, the mind, the world, and the divine have analogous structures that allow them to reflect each other. Furthermore, some uses of language bring the world and the mind closer together by allowing them to reflect each other more closely. The kinds of language that do this are not those that give the most accurate descriptions of their objects, but those that stimulate listeners to use their imaginations to intuit something about the world that cannot be captured in discursive categories.

Novalis believes that language signifies, not on the basis of semantic rules for connecting terms to objects, but through association, imagination, and creative interpretation. Like the relationship between the human being and its world, and between these and the divine, the connection between linguistic utterances and the things they signify is one of analogy between realms that at a deeper level share a common structure or essence. One of the clearest sources for Novalis’s account of language is his short essay, “Monologue,” in which he states, “If one could only make it comprehensible to people that it is with language like with mathematical formulae—they constitute a world for themselves—they play only with themselves, embody nothing but their wonderful nature, and just for that reason they are so expressive—just for that reason they reflect in themselves the same play of relations as things” (Schriften II, p.672). In other words, language does not denote particular objects and events, but creates a new world that mirrors or has the same structure, and therefore the same meaning, as these objects and events. Ultimately, both the material world and language have as their object the divine which they, like the human being itself, reflect and embody, and can therefore reveal.

For Novalis, the nature of the relationship between sign and signified entails a degree of separation between them, meaning that the objects of language always escape full articulation. The search to conceptualize and convey the divine essence of things can therefore never be finished, and progress in this search depends on openness and readiness for revision. Language can serve this purpose when it is used in forms that prompt the audience to take an active role and rework what has been said. Novalis attempts to embody this practice in “Monologue,” undermining the claims of his essay to provide an accurate description of how language works: “If I thereby think to have indicated the essence and function of poetry in the clearest way, still I know that no one can understand it” (Schriften II, p.672). By pointing to the inadequacy of his speech, taken literally, Novalis invites his audience to reach beyond the words to grasp his meaning, and provide a better representation of it.

As a result, irony and poetic techniques like metaphor, suggestion, and association emerge as the best tools for understanding the world and oneself, revealing these to others, and constructing more spiritual versions of these things through creative dialogue. Novalis’s use of the fragment for many of his writings is supported by this concept: Because fragments are incomplete, their readers must use their imaginations to complete them, and are thereby called to participate in their vocation. Fragments thus function as “seeds” for developing insights into the true nature of the universe: “Fragments of this kind are literary seeds. There may admittedly be some deaf grains among them: however, if only some bear fruit!” (“Pollen,” Schriften II, p.463 #114). Novalis claims that not just language, but everything we encounter can play this role, intimating a spiritual meaning that we are invited to explore and complete: “Everything is seed” (“Logological Fragments [II],” Schriften II, p.563 #189).

The invitation to others to participate in constructing meaning is an important aspect of Novalis’s account of Romantic interpretation. Rather than a finished or complete system of philosophy, Novalis advocates a continuous activity of “philosophizing” (which he also sometimes called “Fichtesizing”) which gradually reveals the spiritual nature of the world. In part, Novalis’s refusal to find a final form for his philosophy is motivated by his rejection of the demands of Fichte and Karl Leonhard Reinhold for a first principle as a foundation for philosophy. Novalis emphasized the importance of the activity of seeking a ground for knowledge and experience over the ground itself. Furthermore, this activity is not, for Novalis, performed in isolation by a single subject, but is carried forward within a community—this is the “symphilosophy” advocated and to an extent embodied by the Jena Romantics. Fichte’s call to his reader in the revised version of his Wissenschaftslehre to “think the I” is thus altered, in Novalis’s work, to become a call not to an individual but to a community to think the I and its world together.

On Novalis’s account, an imaginative and intuitive use of language contributes to the human vocation of creating a raised or Romanticized world. Because it has been worked on by the human mind, especially where the human mind in question is informed by intuitions of the divine nature of the world, the new world established through language is a raised, spiritualized version of the physical world that it reflects. But in addition, this process can be repeated and refined by working on the constructions created by others. Following a first speaker’s utterance, an audience is called to create a yet more spiritualized version of the world by investing the objects and events described with their own thoughts and feelings. By retracing the meaning of the first speaker’s utterance, a second participant combines three elements in a higher synthesis: the objects and events described by the speaker; the speaker’s spirit as imparted to these objects and events in his or her words; and the second participant’s own spirit in his or her interpretation of this picture. The same process of joint, mutually reflective creation characterizes the poetic interpretation of nature, which Novalis understands as like a conversation, as well as the creation of art, art criticism, translation, and potentially many other endeavors.

8. The Mediator

Interactions with other human beings and objects and events in nature are important in Novalis’s account for realizing the inner spiritual unity of the world. In addition, particular figures stand out as especially important for this goal, acting as precursors for unification with the rest of existence and indeed as means to establishing this union. Novalis describes these figures as “mediators,” claiming that “Nothing is more essential to true religiosity than a mediator, who unites us with the Godhead. Unmediated, the human being can absolutely not be in relation to the latter” (“Pollen,” Schriften II, pp.443–45 #74). In particular human beings we see the highest manifestation of spirit in the world, and when we engage imaginatively with them as symbolic figures, we can see how the divine is embodied in the world and draw closer to that divinity.

Novalis’s work includes numerous examples of this kind of relationship. For instance, the teacher in The Novices of Saïs initiates the novices into the secrets of the universe, as an exemplar and tutor in the search for the meaning of nature’s language. In Henry of Ofterdingen, Zulima, who shows Henry how to construct a meaningful narrative out of chance events and gives him a musical instrument with which to begin his life as a poet, provides an axial moment in Henry’s development. Later, the sage Klingsohr and his daughter Mathilde initiate Henry into further pieces of wisdom required to become aware of his unity with existence, and Mathilde and Henry, who marry, share a union that Novalis describes as prefiguring the unification of all things. The same prefiguration occurs in the relationship between the narrator and the beloved in Hymns, in which the bond between the narrator and his dead beloved initiates the narrator’s release from the limits of space, time, and individuation.

Novalis maintains that any object can reveal one’s union with the rest of existence and mediate the divine. What is important is not the object through which one perceives the divinity of existence, but one’s relationship or attitude to that object. However, while the whole world can reveal the divine, other human beings do so more easily, because they more clearly manifest the spiritual within the physical than do other objects. Thus, Novalis claims that as human beings become more sophisticated they tend to choose a more limited range of objects to hold religious significance and to select other human beings as those mediating objects: “The more independent the human being becomes, the more the quantity of mediators shrinks, the quality is refined, and his relationships to these become more various and cultured: fetishes, stars, animals, heroes, idols, gods, one God-man” (“Pollen,” Schriften II, p.443 #74). This seems to imply that Christianity, with its God-man Christ, is a more rational, raised form of religion than earlier or other religions or systems of thought. However, Novalis also suggests that as time goes on individual human beings tend to choose a mediator who is of personal importance to them, making Jesus Christ just one potential mediator among many.

9. Relation to Christianity

Novalis was raised in the pietistic tradition, attending a Lutheran school and having a strictly religious father who attempted to raise his children in line with the precepts of the Herrnhuter church. This background influenced the vocabulary, imagery, and some of the content of Novalis’s philosophy, in particular: his claim that the mundane can be spiritualized by attention to the divine; his emphasis on the need to improve one’s community in order to transform the earth; and his rejection of radical sin. Novalis’s work also incorporates modified versions of central ideas of Christianity more generally, particularly the narrative of Fall and salvation as alienation from and reconciliation with the divine, the role of Christ as an exemplary mediator of the divine, the idea that the world embodies the divine and can be interpreted analogically to reveal this spiritual essence, and the idea of union through love after death. Some commentators, notably William Arctander O’Brien, have argued that Novalis’s work stretches Christian doctrine too far to be considered Christian: O’Brien points to Novalis’s pantheism and his rejection of Jesus Christ’s special status as the son of God as fundamental departures from Christian tradition. However, whether we see Novalis’s philosophy as remaining within a Christian paradigm or moving outside it, several of its central themes take their starting point from Christian models.

Novalis assimilates the Christian narrative of Fall and salvation, in which union with God is lost and sought, to the Fichtean account of the self-differentiation of the absolute into finite entities in space and time in order to achieve self-knowledge. Novalis’s model also reflects the ideas of the Christian mystic Jakob Boehme, whom he studied in detail from 1800. Like Novalis, Boehme claimed that the differentiation of God into particular entities in the world is necessary for the development of self-awareness and a higher form of harmonious existence.

Novalis avoids a puritanical interpretation of the Fall as due to the temptations of the flesh, and to an extent follows an opposite stream within Christian thought, in which consciousness, reason, and knowledge on the one hand, and individuation on the other, are responsible for the alienation of the human being from its true self, the rest of existence and God. For Novalis, an original communion with nature and God was lost through the development and enhancement of consciousness and individuality, which require division and separation. However, Novalis also grants individual existence and discursive reason a positive place in his narrative as essential for realizing the imperative of the universe to know itself. Without these, the universe would remain an unconscious, blind unity.

Novalis does not take from Christianity the moral notion that alienation from the divine is a result of sin, claiming that “To true religion nothing is sin” (“Fragments and Studies,” Schriften III, p.589 #228). However, this rejection of, or reduced emphasis on, radical sin is a characteristic of pietism, which may have influenced Novalis in this respect. Although Novalis exhorts his audience to take steps to overcome the fragmentation of their existence, he describes the consequences of the approach to the divine in utilitarian, rather than moral, terms, providing a vision of the benefits that attend a closer relationship with the divine, including a deeper connection with the rest of existence, a sense of meaning for one’s life, control over one’s destiny, and the elimination of the fear of death as one realizes that one is part of a greater whole, and therefore that one’s selfhood is, more essentially than existence as an individual, the selfhood of the absolute. These benefits are direct consequences of learning to view the universe in a new way. Novalis does not distinguish between the eventual fate of sinners and that of the saved; all human beings and all of nature will return to unity with the divine when they die, and no one is fully integrated with the divine while a living individual, although some individuals may experience greater connections with the divine nature of existence while alive.

Novalis has sometimes been seen as life-denying and morbid, in part based on his writings on death, in which he famously uses erotic imagery of longing for union with dead loved ones and with the unconditioned. It is true that Novalis wrote some of these passages, at least the vision on the grave of the beloved found in the Hymns to the Night, while very depressed. However, Novalis attempts to give death a positive value as promising union with the divine and some form of eternal life. These concepts have resonance within a Christian context, and in particular his use of the imagery of marriage to prefigure union with the divine has pietistic parallels, but in Novalis’s case these concepts are also shaped by his philosophical commitments. For Novalis, individuated existence is an obstacle to realizing the divine unity of the world, and as a result, the unification with the divine that he calls us to work towards realizing can only be completed in death. Thus, he claims, “Life is the beginning of death. Life is for the sake of death” (“Pollen,” Schriften II, p.417 #14). Novalis’s emphasis on the value of the process rather than the goal of the human vocation means that this should not devalue life, with its characteristic individuation and consciousness; these are required in order for the universe to become self-conscious. Furthermore, Novalis’s pantheism means that human beings, like all other particular entities, are manifestations of the divine, with the result that death is not annihilation, but the final transformation of the individual person, or the “ordinary I,” into the “great I” of the divine absolute. Thus, Novalis claims that we will awaken after death into a new state, for which we may be prepared by our Romantic vocation.

Novalis’s attitude towards Jesus Christ provides one of the clearest places where his account modifies Christianity. Christ is a relatively important figure for Novalis, mentioned explicitly several times as an ideal mediator of the divine and spiritual to the world, and in the Hymns to the Night his teachings are described as spreading the word of the overcoming of death in mystical union. However, Novalis does not present Christ as different in kind from other human beings or the rest of the world. Although Christ exemplifies the integration of divine and mundane that Novalis claims it is the human vocation to bring about, and although as a result Christ is an ideal mediator of this spiritualized world to others, Novalis maintains that all entities can potentially play this role—what matters in this respect is the individual’s attitude to these entities, rather than who or what they are.

Novalis’s idea that things in the world can be interpreted as having a divine meaning also has parallels in Christianity. Medieval Christian scriptural exegesis could be applied not only to the Bible itself, but also to physical objects and events, in order to discover doctrinal, moral, and metaphysical and eschatological meanings. Novalis’s work reflects an interest in the fourth or “anagogical” form of interpretation, which was supposed to give knowledge of the heavenly or the spiritual and to initiate the interpreter into hidden knowledge of metaphysics and the afterlife. For Novalis, the beings and happenings of the world form “figures” or “hieroglyphs” that signify the divine and allow those who can read them to function as “prophets.”

10. References and Further Reading

a. Works by Novalis in German

  • Schriften. Zweite, nach den Handschriften ergänzte, erweiterte und verbesserte Auflage in vier Bänden, edited by Paul Kluckhohn and Richard Samuel. Stuttgart: Kohlhammer, 1960.
    • The authoritative edition of Novalis’s collected works, including notes, diary entries, and letters.

b. Works by Novalis in English Translation

  • Fichte Studies, edited by Jane Kneller. Cambridge: Cambridge University Press, 2003.
    • Novalis’s critical reception of Fichte. Includes an informative introduction by Kneller.
  • Henry of Ofterdingen, translated by Palmer Hilty. Long Grove, Illinois: Waveland Press, Inc., 1992. This translation first published in 1964.
    • An unfinished bildungsroman in which Henry, with the aid of various mediating figures, develops towards his vocation as a poet.
  • Hymns to the Night, translated by Dick Higgins. Many editions. This translation first published in 1978.
    • A bilingual edition. The Hymns use Christian, mystical, and Romantic imagery to describe longing for union with loved ones after death.
  • Notes for a Romantic Encyclopaedia: Das Allgemeine Brouillon, translated, edited, and with an Introduction by David W. Wood. Albany, NY: State University of New York Press, 2007.
    • Novalis’s writings on science, religion, art, and nature, intended for an encyclopedia.
  • The Novices of Saïs, translated by Ralph Manheim. Brooklyn, NY: Archipelago Books, 2005.
    • Describes the novices’ mystical search for an understanding of nature, under the guidance of their teacher, who leads them to discover the hidden connections between all things.
  • Philosophical Writings, translated and edited by Margaret Mahony Stoljar. Albany, NY: State University of New York Press, 1997.
    • An abridged introduction to many of Novalis’s most influential pieces, including “Pollen,” “Monologue,” “Christendom or Europe,” and “Faith and Love or The King and Queen.”

c. Works About Novalis and Early German Romanticism in English

  • Behler, Ernst. German Romantic Literary Theory. Cambridge: Cambridge University Press, 1993.
    • An influential account of the literary theory of the early German Romantics, situating Novalis’s work in the context of his study of Fichte and the work of close contemporaries such as the Schlegel brothers and Tieck.
  • Haywood, Bruce. Novalis, The Veil of Imagery: A Study of the Poetic Works of Friedrich von Hardenberg, 1772–1801. Gravenhage: Mouton, and Cambridge, MA: Harvard University Press, 1959.
    • An introduction to Novalis’s use of imagery.
  • Von Molnár, Géza. Romantic Vision, Ethical Context: Novalis and Artistic Autonomy. Minneapolis: University of Minnesota Press, 1987.
    • An influential study emphasizing a central theme of Novalis’s work: the vocation of the individual to work towards the realization of the unity of the universe.
  • O’Brien, William Arctander. Signs of Revolution. Durham: Duke University Press, 1995.
    • Investigates Novalis’s work on language and symbols in relation to his contemporary political, ethical, religious, and scientific context.
  • Seyhan, Azade. Representation and its Discontents: The Critical Legacy of German Romanticism. Berkeley: University of California Press, 1992.
    • Presents the work of Romantic writers, including Novalis, as explorations of new ways of thinking in the light of political and scientific change, and as important precursors to modern critical theory.
  • Strand, Mary. I/You: Paradoxical Constructions of Self and Other in Early German Romanticism. New York: Peter Lang, 1998.
    • On the work of Romantics, including Novalis, on otherness, particularly women and the Orient.

Author Information

Anna Ezekiel
Email: info@annaezekiel.com
McGill University
Canada

Norman Malcolm (1911–1990)

Norman Malcolm was instrumental in elaborating and defending Wittgenstein’s philosophy, which he saw as akin to a kind of “ordinary language” philosophy, in America. He also defended a novel interpretation of Moore’s “common sense philosophy” as a version of ordinary language philosophy, although Moore himself disagreed. Malcolm criticized Descartes’ account of mind by elaborating Wittgenstein’s criticisms of a private language. He produced a controversial new modal version of the Ontological Argument for the existence of God. He produced two very different kinds of arguments against the mechanistic view of human beings; the first argues that the mechanist is committed to a “pragmatic paradox,” and the second argues that such accounts may seem empirical but contain a disguised unintelligible metaphysics. He produced two very different kinds of accounts of memory, the earlier more “analytical,” and the later “more historical, systematic, and destructive.”

Malcolm was instrumental in building Cornell into one of the leading philosophy departments in America. He was President of the Eastern Division of the American Philosophical Association from 1972-73. Malcolm authored ten books and a plethora of influential articles and reviews.

Norman Malcolm studied philosophy with O. K. Bouwsma at the University of Nebraska before enrolling as a graduate student at Harvard in 1933. He received his Ph.D. from Harvard in 1940 but spent 1938-39 at Cambridge University in England, where he met G. E. Moore and Ludwig Wittgenstein, which proved decisive in his development. He was briefly an instructor at Princeton before joining the US Navy in 1941. He returned to Cambridge to study again with Moore and Wittgenstein from 1946-47. In 1947, he joined the Sage School of Philosophy at Cornell University, where he remained until his retirement in 1978.

Table of Contents

  1. Biography
  2. Wittgenstein: A Memoir
  3. Dreaming
  4. Malcolm’s Modal Version of the Ontological Argument
  5. Criticism of Descartes
  6. The Conceivability of Mechanism
  7. Philosophy of Mind
  8. Memory
  9. Nothing is Hidden
  10. Wittgenstein: From a Religious Point of View
  11. References and Further Reading
    1. Books
    2. Articles
    3. Reviews
    4. Secondary Sources

1. Biography

Norman Malcolm was born in the tiny town of Selden in northwest Kansas (pop. 250) on June 11, 1911. In his early schooling, his exceptional intellect was soon recognized, and he was sent to Omaha, Nebraska, for high school. He later attended the University of Nebraska, where he studied philosophy with O. K. Bouwsma. He began his graduate studies at Harvard in 1933 and received his Ph.D. in 1940. He spent 1938-39 at Cambridge University in England, where he met G. E. Moore and Ludwig Wittgenstein, which proved decisive in his development. He was briefly an instructor at Princeton before joining the US Navy in 1941. After the war, he returned to Cambridge from 1946-47 to study again with Moore and Wittgenstein. In 1947, he joined the Sage School of Philosophy at Cornell University, where he remained until his retirement in 1978. He was President of the Eastern Division of the American Philosophical Association from 1972-73. Wittgenstein visited Malcolm at Cornell during the summer of 1949, and their discussions during this visit inspired Wittgenstein’s last philosophical work, On Certainty, and Malcolm’s book, Knowledge and Belief. Malcolm was married twice. He had two children, a son and a daughter, by his first wife, Lee. A few years after his divorce from Lee, he met Ruth Riesenberg, an accomplished psycho-analyst and author, in Hampstead, London. Ruth was originally from Santiago, Chile. Ruth and he moved permanently to London soon after marrying.

Malcolm enjoyed athletics in his youth, an interest that remained with him for life. He swam regularly before classes at Cornell. During his years at Cornell, he enjoyed sailing on Lake Cayuga and took his role as captain of the ship very seriously. A passenger might be forgiven for conjuring images of Captain Bligh. Malcolm was of a robust constitution (Serafini, 1993, 310-11). One of his close friends on the Cornell faculty relates that when in England in his 60s, Malcolm had a back problem, perhaps sciatica, and was getting little or no relief. A friend in Hampstead told him to try the Queen’s horse doctor, who had a reputation for solving her horses’ problems. Malcolm duly went. The horse doctor showed Malcolm a large wooden mallet and how he used it on the horse. He had Malcolm lie stomach down on a table and gave him a massive whack on his back. Malcolm claimed that it cured his problem.

Malcolm’s famous review of Wittgenstein’s Philosophical Investigations in 1954 initiated decades of fruitful controversy about Wittgenstein’s views, which Malcolm understood as akin to an “ordinary language philosophy” (Parker-Ryan, § 2). Malcolm’s aim was to expose the confusions underlying much philosophy and psychology by showing how the relevant philosophical words are actually used in ordinary life. Although Malcolm’s chief philosophical influence was clearly Wittgenstein, he was also much influenced by Moore’s “common sense philosophy.” Malcolm saw Moore as being the first to refute paradoxical philosophical claims by showing that they “go against ordinary language.” Malcolm held that Moore’s common sense philosophy was essentially the same as ordinary language philosophy, although Moore himself rejected this interpretation (Carney, 1962). It is also worth pointing out that though Malcolm emphasized attention to the uses of words in ordinary language, he held that this is not sufficient to resolve philosophical problems (Serafini, 1993, 315; Uschanov, 2002). Finally, although Malcolm was powerfully influenced by Wittgenstein, it would be wrong to think that he slavishly followed his lead (Serafini, 1993, 315-317). For example, whereas Wittgenstein eschewed rational theology, holding that religion is more a matter of faith or passion, Malcolm produced and defended a novel modal version of Anselm’s ontological argument for the existence of God.

Malcolm admitted that it was hard not to pick up some of Wittgenstein’s mannerisms and practices (1970, 26). One story that circulated at Cornell was that a new graduate student turned up late at a seminar that Wittgenstein, during his visit to Cornell, was giving in Malcolm’s class and whispered to the graduate student beside him, “Who is this guy trying to imitate Malcolm?” Further, since Wittgenstein detested academic life, he often attempted to talk students out of pursuing philosophy as a career and into doing something useful with their lives, such as becoming a manual worker on a farm and being kind to people (1970, 30). Since Malcolm shared Wittgenstein’s distaste for professional philosophy (Serafini, 1993, 310), he often did the same. Malcolm calls an enthusiastic graduate student into his office. His face is grave. The student can only fear the worst and wonders if it could be the end. Malcolm, speaking with great severity, says, “Are you sure you want to pursue a philosophy career?” The student, with the zeal of Socrates, professes absolute devotion to philosophy. He seems prepared to face the hemlock. Malcolm, unmoved, tries again. He says, “Are you sure you do not want to do something useful with your life instead—perhaps medical school?” (Serafini, 1993, 311) The student reaffirms that there is nothing else he could possibly do. Malcolm, looking grim and disappointed, shrugs and turns away to rifle through his bookshelves as he says, “Well, I guess there’s nothing to be done about it then!” Despite his misgivings about academic philosophy, however, Malcolm was fascinated by philosophical issues, which he approached with great passion and intensity. He continued, like Wittgenstein, working on philosophy to the end.

Malcolm’s lectures were not typical philosophy lectures. A student sitting through a course of Malcolm’s lectures might have had the feeling that she was not learning much. Sellars has a theory of knowledge, Chisholm has a theory of knowledge, but where is Malcolm’s theory of knowledge? However, by the end of the semester, students often found that they looked at things quite differently from the way they had at the beginning of the course. This is because, following Wittgenstein, Malcolm did not aspire to teach his students philosophical theories, but to impart methods that can be used over and over again on countless different kinds of problems—as Wittgenstein said, “not a single problem” (Philosophical Investigations, § 133). “Each class was a bit like a journey and one either accompanied Malcolm on the journey or not” (Serafini, 1993, 310).

Malcolm employed several methods borrowed from Wittgenstein, including describing the circumstances in which the relevant philosophical words, “knowledge,” “consciousness,” “certainty,” and so forth, are actually used in everyday life, comparing actual uses of words with imaginary language games, imagining a fictitious natural history for the use of words, and attempting to diagnose the motivations for the temptation to use certain words in a misleading way (Richter, § 4). By these means, Malcolm attempted to show that philosophers typically fall into error because they forget, when doing philosophy, how such words are actually used in ordinary life (Serafini, 1993, 321). When confronted by some typical philosophical thesis in class (of the sort that most philosophers take uncritically as grist for the logical mill), Malcolm would appear genuinely puzzled why anyone would say such a peculiar thing while he ran his hands over the top of his head as if searching his brain for the possible meaning of this dark saying. Although Malcolm was always prepared for his classes, he preferred to let the discussion develop organically, often in response to student questions, rather than imposing his own preferred grid on the discussion. One cannot, however, take this ban on philosophical theories too far. When Malcolm taught courses involving the views of some philosophers (Descartes, Leibniz, and so forth), he sympathetically articulated and defended their theories. Thus, a student normally would learn philosophical theories in Malcolm’s courses. Malcolm’s response to these theories was not to oppose them with an alternative theory but to subject them to his understanding of Wittgenstein’s and, perhaps, Moore’s methods.

Malcolm could seem a bit gruff and bearish sometimes. A student in class, labouring to articulate his position, finally manages, with evident relief, to articulate his view. Malcolm’s voice booms out, “Completely wrong!” Serafini (1993, 309) recalls that after receiving an F on his first paper, but then finishing strongly with a series of As and a B+, he asked Malcolm for a recommendation to graduate school. Reviewing his record, Malcolm recited his grades, “A-, A, B+,” leaving that F for last, which he recited in stentorian tones, apparently with some relish (Serafini, 1993, 311). However, there was always a good dose of humour behind his gruffness. As Kretzmann, Shoemaker and Miller put it, “He could seem gruff and bearish, but those who began by fearing him soon found that he was very warm and kind. He lived his life and conducted his intellectual projects with full, guileless, and fearless commitment, earning the respect of all who knew him.” It is no exaggeration to say that many of his former students and colleagues came to love him.

In the course of his long and productive career, Malcolm exerted an enormous influence over the development of the Cornell Philosophy Department and was instrumental in building it into one of the most highly regarded philosophy departments in America. He had a fierce philosophical integrity and refused to be swayed by the metaphysical and scientistic fashions of the day. Malcolm belongs to a bygone age that has been largely forgotten in the push for more complicated, technical, and abstract philosophical theories (Serafini, 1993, 317). Malcolm spent the last thirteen years of his life living in London where he gave much admired weekly graduate seminars at King’s College, London until the year of his death. A committed Anglican, he died on August 4, 1990, and is buried in the cemetery of the Anglican Church in Hampstead near his London home.

2. Wittgenstein: A Memoir

Malcolm’s famous Memoir of Wittgenstein attempts to paint a picture of the person behind the great philosopher. It is a picture of a person who is intense, brilliant, austere, and eccentric and who suffered greatly throughout his life but who also could be playful, humorous, and compassionate. Despite the fact that he “abhorred” academic life and professional philosophy (1970, 30), Wittgenstein was fierce about attendance at his classes, saying, “My lectures are not for tourists” (1970, 28). Wittgenstein once tried to lecture from notes, but the thoughts that came out were “stale,” and the “words looked like corpses” (2001, 24). Wittgenstein could be “a frightening person” in his classes (2001, 26-27).

The Memoir also sometimes sheds light on Wittgenstein’s philosophy. For example, Malcolm reports that Wittgenstein dismissed attempts to provide a rational foundation or proof for God’s existence, believing instead in a Kierkegaardian type of view that religion is a matter of passion (1970, 59, 82). Wittgenstein referred to Kierkegaard “with something like awe in his expression” (1970, 60). Malcolm also recounts being especially struck by one remark Wittgenstein made during one of their walks that bears on his “use-conception” of meaning: “An expression has meaning only in the stream of life” (1970, 73-75).

Malcolm’s lively portrait of Wittgenstein the person should be of interest both to the philosopher and the historian alike, not only for its portrait of Wittgenstein but also for what it reveals about Malcolm. Although there are great differences between Wittgenstein and Malcolm as human beings, Malcolm’s Memoir emphasizes certain of Wittgenstein’s traits that Malcolm himself shares, such as his distaste for academic life, his impatience with anything less than a full commitment to the philosophical task, and his desire to let philosophical discussions develop naturally rather than to impose his own blueprint on them.

3. Dreaming

Malcolm argues in his paper “Dreaming and Skepticism” (1956) and in his book Dreaming (1959) that the notion of dreams, in the sense of conscious experiences that occur at a definite time and have definite duration during sleep, is “unintelligible” (1959, 52). This contradicts the views of philosophers and psychologists like Descartes, Kant, Moore, Freud, and Russell, who, he holds, assume that human beings have conscious thoughts and experiences during sleep (1959, 1-4). Descartes claimed that he had been deceived during sleep (1959, 101).

Malcolm’s first point is that ordinary language contrasts consciousness and sleep. The claim that one is conscious while one is sleepwalking is “stretching the use of the term” (1959, 27, 84). Malcolm rejects the alleged counterexamples based on sleepwalking or sleep-talking. For example, dreaming that one is climbing stairs while one is actually doing so is not a counterexample because in such cases the individual is not sound asleep after all (Springett, § 3.b.1). “If a person is in any state of consciousness it logically follows that he is not sound asleep” (1956, 21). Our concept of dreaming is based on our descriptions of dreams after we have awakened in “telling a dream” (1959, 55ff, 76, 87ff). Thus, to have dreamt that one has a thought during sleep is not to have a thought any more than to have dreamt that one has climbed a mountain is to have climbed a mountain (1959, 51-53, 57). Since one cannot have experiences during sleep, one cannot have mistaken experiences during sleep (1956), thereby undermining the sort of philosophical scepticism based on the idea that our experiences might be wrong because we might be dreaming.

Malcolm’s second point is that reports of conscious states during sleep are unverifiable (1959, 83ff; Springett, § 3.b.i). If Ginet claims that he and Shoemaker saw a bigfoot in charge of the reserve desk at Olin library, one can verify that this took place by talking to Shoemaker and gathering forensic evidence from the library. However, there is no way to verify Ginet’s claim that he dreamed that he and Shoemaker saw a bigfoot working at Olin library (1959, 38-40). Ginet’s only basis for his claim that he dreamt this is that he says so after he wakes up. How does one distinguish the case where Ginet dreamed that he saw a bigfoot working at Olin Library and the case in which he dreamed that he saw a person in a bigfoot suit working at the library but, after awakening, misremembered that person in a bigfoot suit as a bigfoot proper? If Ginet should admit that he had earlier misreported his dream and that he had actually dreamed he saw a person in a bigfoot suit at Olin library, there is no more independent verification for this new claim than there was for the original one. Thus, there is, for Malcolm, no sense to the idea of misremembering one’s dreams (Windt, 2015, 18ff). Malcolm here applies one of Wittgenstein’s ideas from his “private language argument”: “One would like to say: whatever is going to seem right to me is right. And that only means that here we can’t talk about ‘right’” (Philosophical Investigations, § 258).

For similar reasons, Malcolm challenges the idea that one can assign definite durations or times of occurrence to dreams (1959, 70-82). If Ginet claims that he ran the mile in 3.4 minutes, one could verify this in the usual ways. If, however, Ginet says he dreamt that he ran the mile in 3.4 minutes, how is one to measure the duration of his dreamt run? If he says he was wearing a stopwatch in the dream and clocked his run at 3.4 minutes, how can one know that the dreamt stopwatch is not running at half speed (so that he really dreamt that he ran the mile in 6.8 minutes)? One might say that dream reports do not carry such implications, but Malcolm would say that just admits the point. The ordinary criteria we use for determining temporal duration do not apply to dreamt events. The general problem in both these cases (dreaming one saw a bigfoot working at Olin library and dreaming that one ran the mile in 3.4 minutes) is that there is no way to verify the truth of these dreamt events—no direct way to access that dreamt inner experience, that mysterious glow of consciousness inside the mind of the person lying comatose on the couch, in order to determine the facts of the matter. This is because, for Malcolm, there are no facts of the matter apart from the dreamer’s reports of the dream upon awakening. Referring to psychological studies of his time, Malcolm claims that the empirical evidence does not enable one to decide between the view that dream experiences occur during sleep and the view that they are generated upon the moment of waking up (1956, 29). Dennett agrees with Malcolm that nothing supports the received view that dreams involve conscious experiences while one is asleep but holds that such issues might be settled empirically (Springett, § 3.d).

Malcolm also argues against the attempt to provide a physiological mark of the duration of a dream, for example, the view that the dream lasted as long as the rapid eye movements (REM). Malcolm replies that “there can only be as much precision in that common concept of dreaming as is provided by the common criterion of dreaming” (1959, 75). These scientific researchers are misled by the assumption that the provision for the duration of dreams “is already there, only somewhat obscured and in need of being made more precise” (1959, 79). However, Malcolm claims, it is not already there (in the ordinary concept of dreaming). These scientific views are making “radical conceptual changes” in the concept of dreaming, not further explaining our ordinary concept of dreaming (1959, 81). Malcolm admits, however, that it might be natural to adopt such scientific views about REM sleep as a convention (1959, 76-77). Malcolm points out, however, that if REM sleep is adopted as a criterion for the occurrence of a dream, then “people would have to be informed upon waking up that they had dreamed or not” (1959, 80).

Malcolm does not mean to deny that people have dreams in favour of the view that they only have waking dream-behaviour (Pears, 1961, 145). “Of course it is no misuse of language to speak of ‘remembering a dream’” (1959, 57-58). His point is that since our shared concept of dreaming is so closely tied to our concept of waking reports of dreams, one cannot form a coherent concept of this alleged inner (private) something that occurs with a definite duration during sleep. Malcolm rejects a certain philosophical conception of dreaming, not the ordinary concept of dreaming, which, he holds, is neither a hidden private something nor mere outward behaviour.

Malcolm’s account of dreaming has come in for considerable criticism. Chihara and Fodor (1965) argue, against Malcolm’s claim that occurrences in dreams cannot be verified by others, that claims about dreams do not require the strict criteria that Malcolm proposes but can be justified by “appeal to the simplicity, plausibility, and predictive adequacy of an explanatory system as a whole.” Dunlop (1974) argues that Malcolm’s account of the sentence “I am awake” is inconsistent. Windt (2015) offers a comprehensive program in considerable detail for an empirical scientific investigation of dreaming of the sort that Malcolm rejects. Canfield (1961), Siegler (1967), and Schröder (1997) propose various counterexamples and counterarguments against Malcolm’s account of dreaming.

4. Malcolm’s Modal Version of the Ontological Argument

In his 1960 paper “Anselm’s Ontological Arguments,” Malcolm states that Anselm gave two different ontological proofs for God’s existence. Anselm’s key premise in the first argument in Proslogion 2 is that a thing is more perfect if it exists than if it does not exist. As Kant points out, that argument is fallacious because existence is not a property of things (Himma, § 2.d). Anselm’s second argument, which Malcolm revises and defends, is a modal argument in Proslogion 3 that is similar to arguments advanced by Hartshorne and Plantinga. The key idea here is that though existence is not a perfection, the logical impossibility of nonexistence, that is, necessary existence, is a perfection (and, therefore, a property). Lacewing (2015, 190-193) summarizes Malcolm’s modal argument for God’s existence as follows:

  1. Either God exists or does not exist.
  2. God can neither come into existence nor go out of existence.
  3. If God exists, then He cannot cease to exist.
  4. Therefore, if God exists, He exists necessarily.
  5. If God does not exist, then He cannot come into existence.
  6. Therefore, if God does not exist, His existence is impossible.
  7. Therefore, God’s existence is either necessary or impossible.
  8. However, God’s existence is only impossible if the concept of God is self-contradictory.
  9. The concept of God is not self-contradictory.
  10. Therefore, God’s existence is not impossible.
  11. Therefore, from 7 and 10, God’s existence is necessary.
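
The logical skeleton of this summary can be set out in modal notation. What follows is only a sketch: the letter G (for “God exists”) and the operators □ (“it is necessary that”) and ◇ (“it is possible that”) are notation assumed here for illustration, not notation used by Malcolm or Lacewing. The numbers on the right refer to the steps of the summary above.

```latex
\begin{align*}
& G \lor \lnot G                    && \text{(1)} \\
& G \rightarrow \Box G              && \text{(4), from (2) and (3)} \\
& \lnot G \rightarrow \lnot\Diamond G && \text{(6), from (2) and (5)} \\
& \Box G \lor \lnot\Diamond G       && \text{(7), from (1), (4), (6)} \\
& \Diamond G                        && \text{(10), from (8) and (9)} \\
& \Box G                            && \text{(11), from (7) and (10)}
\end{align*}
```

On this rendering the last step is a simple disjunctive syllogism, so the weight of the argument rests on steps (4), (6), and (9) rather than on the final inference.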

One objection is that though it has been argued that the concept of God is self-contradictory (Trakakis, § 1.c; Beebe, § 1-3), Malcolm simply assumes that premise 9 is true (Himma, § 4). Another problem is that even if one grants that necessary existence is a property, Malcolm’s argument only shows that if God exists, then God exists necessarily. Finally, is it true that necessary existence is a perfection? If “x necessarily exists” means “x exists in all possible worlds,” why should God’s necessary existence across all possible worlds make God greater in the actual world (Himma, § 4)? For in this actual world, a necessarily existing God is no greater than a God that contingently exists in this world.

5. Criticism of Descartes

Malcolm’s core criticism of Descartes is in his 1975 paper “Descartes’ Proof that He is Essentially a Non-Material Thing.” He attributes the following argument to Descartes: “I think I am breathing entails I exist. I think I am breathing does not entail I have a body. Therefore, I exist does not entail I have a body.” Malcolm rejects the second premise on the grounds that it is conceptually impossible for minds to exist without ever having been united with a body or for minds to exist without there ever having been bodies because the primary use of “he thinks he is breathing” presupposes bodily behavioural criteria for its truth. Malcolm admits there are secondary uses of mental terms that refer to disembodied spirits, but these are parasitic on the primary uses. The paper expresses Malcolm’s most basic understanding of Wittgenstein’s objection to such dualistic views, namely that all such views treat a parasitic use of language as if it makes sense when severed from the primary use of mental terms that are essentially tied to bodily behaviour (Philosophical Investigations, § 571, 579-580). If the criterion for ascribing mental properties essentially involves an appeal to bodily behaviour, then Descartes’ argument for mind-body dualism collapses.

6. The Conceivability of Mechanism

In his 1968 paper “The Conceivability of Mechanism,” Malcolm argues that a completely mechanistic explanation of human behaviour is incompatible with the intentional explanation of that behaviour. He argues against the two main attempts to justify such completely mechanistic views. The first is the view that intentional concepts can be defined in terms of non-intentional dispositions to behave in a certain way. The second is the view that intentional states or events are contingently identical with neural states or events. Malcolm argues that if all human behaviour had sufficient mechanistic causes, then human beings would not have intentions or desires. This leads to a “pragmatic paradox” (Chan, 2010). A person S’s assertion that all human behaviour is mechanistically explainable is a pragmatic paradox because S’s utterance can count as meaningful only if S has certain intentions about it (Ginet, 2006, 234). However, in that case, S’s meaningful endorsement of the mechanistic view is itself a counterexample to the asserted mechanistic view. For if the mechanistic view is true, then S’s endorsement of it cannot be meaningful. Although Malcolm’s argument generated a considerable amount of useful discussion at the time, it is not seen as obvious that there is a paradox in the assertion that intentions and thoughts can be realized in the state of a machine. In his 1977 Memory and Mind, Malcolm uses entirely different sorts of arguments against a mechanistic account of human mental phenomena.

7. Philosophy of Mind

Malcolm’s positive philosophy of mind is based on two fundamental principles, both inherited from Wittgenstein. The first deals with ascription of mental properties to others. The second deals with ascription of mental properties to oneself. The first principle is that we justifiably ascribe mental properties (like being in pain) to others on the basis of observable behavioural criteria that are conceptually (non-contingently) connected to those mental properties. Thus, it is part of the concepts of mental properties that there are behavioural criteria that justify ascribing those mental properties to other persons. The second principle is that it is not on the basis of any observable behavioural criteria that we ascribe mental properties to ourselves. One does not ascribe the mental property of being in pain to oneself by observing that one is screaming. Malcolm holds that such self-ascriptions are, rather, analogous to natural expressions of mental states. A child does not need to be taught to cry when it is in pain. Rather, the child cries naturally when it is in pain and later learns to replace the natural crying with linguistic utterances like “I am in pain.”

The asymmetry between first and third person ascriptions does not, however, mean they are completely unrelated. “First person utterances and their second and third person counterparts are linked in meaning by virtue of being tied, in different ways, to the same behavioural criteria” (1971, 91). Indeed, one can only know how to apply mental terms to oneself if one can apply them to others (Thornton, § 5). The behavioural expression of my (first person) being in pain is similar to the behavioural expressions of others that justify me in ascribing that same mental state to them. Introspectionism (exemplified by Descartes) violates the first principle. Behaviourism violates the second principle. Malcolm himself is not a behaviourist, because he does not identify the mental state with its behavioural expressions. He only holds that the concept of a mental state is non-contingently connected with the natural and/or learned behavioural expression of those mental states. A key part of Malcolm’s attempt to find a third alternative to the extremes of introspectionism and behaviourism is that the mental state does not reduce to behaviour because behaviour is only an expression of a mental state.

In his 1964 paper “Scientific Materialism and the Identity Theory,” Malcolm argues against Smart’s claim that a sudden thought is contingently identical with a brain process on the grounds that brain states do have specific bodily locations but that we attach no meaning to the bodily location of a thought. Thus, if x is identical with y only if x and y occur at the same place and time and if the identity is contingent, then there is no way to establish that the same location condition is satisfied.

In his 1984 book Consciousness and Causality (David Armstrong also contributes a lengthy section to this book), Malcolm makes an analogous argument that mental states that lack genuine duration (dispositions, beliefs, intentions) cannot be identical with brain states that do have genuine duration. Appealing to the principle of identity cited in the preceding paragraph, if a brain state has a genuine duration (say, 8.1 seconds), but a disposition or intention does not possess genuine duration, then there is no way to establish that such mental and brain states are identical. It is important to acknowledge that some dispositions and intentions can be assigned a precise duration. One might not normally be able to say precisely when one lost the ability to count from 10 to 1 in Yanomami backwards, but in some cases one can do so. Question: “When did you lose the ability to count from 10 to 1 in Yanomami backwards?” Answer: “It was when my wife hit me in the head with the microwave.” However, apart from such exceptional cases, one cannot, for some kinds of mental states, normally assign them a precise temporal duration.

The problem with Malcolm’s arguments in these cases is that even though there are many kinds of mental states for which it is ordinarily impossible to establish a precise spatial location or temporal duration, one can, it seems, envisage advances in the sciences that might make it plausible to do so. For example, advanced studies of brain processes might discover precise correlations between acquiring certain brain states and acquiring certain mental dispositions, abilities, or intentions. These identities would be viewed as scientific discoveries. Malcolm would reply that this would involve considerable gerrymandering of our ordinary concepts of dispositions, intentions, and abilities. A critic of Malcolm would reply that this kind of gerrymandering of ordinary concepts is normal in the advancement of science and is not specific to changes in the concepts of mental entities. For example, human beings were traditionally divided into males and females, but more detailed scientific knowledge suggests that this traditional division fails to capture the complexity of the human gender reality. That is, one cannot rule out such discoveries simply by appealing to the fact that the concepts in ordinary language conflict on some level with the new concepts developed on the basis of greater scientific knowledge (Serafini, 1993, 321).

Another of Malcolm’s noteworthy contributions to the philosophy of mind comes out in his 1972 presidential address to the Eastern Division of the American Philosophical Association titled “Thoughtless Brutes.” Malcolm objects to Descartes’ view that since propositional representations do not occur in the lower animals, they do not have real sensations. Malcolm does not argue that lower animals do have propositional representations but that Descartes “exaggerated” the role of propositional representations in human beings (Ginet, 2006, 235-236). Since propositional representations play less of a role than most philosophers think, there is no principled reason why one cannot ascribe non-propositional thoughts to some of the higher animals. One correctly says that the dog barking up the tree, where it has just chased the squirrel, believes the squirrel is up the tree. Malcolm issues an important qualification. Though it is wrong to identify thoughts with their linguistic expression, it is also wrong to hold that creatures without language can have thoughts. We can meaningfully say of a person that they have thoughts to which they never give expression only because they participate in a language in which there is an institution of testifying to previously unexpressed thoughts (1972, 55). Since dogs do not speak a human language, how, then, can one assign such thoughts to them? Malcolm holds that some higher animals participate in human language to a sufficient degree that one can attribute some thoughts to them by analogy. There is a squirrel and a rabbit in the field. Rover is told to get the rabbit, whereupon Rover chases the rabbit and ignores the squirrel. Rover must display regular patterns of such linguistically sensitive behaviour. Dogs are not full-blown members of our linguistic community, but they participate in our linguistic practices sufficiently to justify ascriptions of thoughts, beliefs, and desires to them by analogy. They behave much as we do in response to some relatively simple human linguistic behaviour.

Davidson (2001, 97) objects that one cannot say what precisely the dog is supposed to believe. Suppose the tree in question is an oak tree. Does the dog believe the squirrel went up the oak tree? However, there is an important sense in which Davidson misrepresents Malcolm’s position. Davidson (2001, 97-98) thinks that if one allows that the dog thinks the squirrel went up the tree, then “while dropping the feature of semantic opacity, there is a question whether we are using those words [‘thinks,’ ‘believes,’ and so forth] to attribute propositional attitudes.” For it has long been recognized that semantic opacity distinguishes talk about propositional attitudes from talk of other things. However, Malcolm does not hold that it is correct to say that the dog believes the proposition that the squirrel is up the tree (let alone that the dog believes that the proposition that the squirrel is up the tree is true). Recall that Malcolm holds that Descartes overestimates the role of propositional representations in human life. Malcolm distinguishes between, “The dog believes the squirrel is up the tree” and, “The dog believes that the squirrel is up the tree” (where the presence of the “that” in the latter formulation indicates that the alleged believer possesses a great deal of “logical machinery” not required by the former). Malcolm holds that many human beliefs described by logicians as beliefs-that (that is, propositional beliefs) are really non-propositional. When a dog believes the squirrel is up the tree, its belief resembles human non-propositional beliefs (which are more common than many philosophers think). Philosophers and psychologists have, alas, tended to over-intellectualize not just animal mind and behaviour, but also human mind and behaviour. Malcolm and Davidson also both address the moral issues involved in regarding animals as “thoughtless brutes” or mere machines.

8. Memory

Malcolm’s two main works on memory are his 1963b “Three Lectures on Memory” and his 1977a book Memory and Mind. In the first 1963b lecture, “Memory and the Past,” he argues that Russell’s hypothesis that the world began five minutes ago complete with misleading records, delusory memories, and the like is logically untenable. Malcolm’s main argument is that a linguistic community can be said to have mastered past tense statements and have past tense beliefs only if not all of their past tense statements are false. Further, if our apparent memories mostly agree with each other and with the records, then they would be verified as true, and “if the apparent memories were verified, it would not be intelligible to hold that, nevertheless, the past they describe may not have existed” (1963a, 199).

In the second lecture, “Three Forms of Memory,” Malcolm distinguishes factual memory (remembering that p), personal memory (remembering something one has oneself previously experienced), and perceptual memory (personally remembering something by forming a mental image of it). While a personal or perceptual memory always entails some factual memory, there can be factual memories that do not entail any perceptual or personal memory. There could be a people who lacked perceptual memory altogether but had normal factual memories, but there could not be a creature that we would recognize as human who completely lacked factual memory. Malcolm’s point is that memory involving mental images is not nearly as basic as many philosophers and psychologists have thought.

Malcolm’s main aim in the third lecture is to show that our concept of factual memory “obviously” does not commit one to hold that there must be “a specific brain-state or neural process [mechanism] persisting between the previous and the present knowledge that p” (1963a, 237-8). He adds in the same passage “that our strong desire for a mechanism of memory arises from an abhorrence of the notion of action at a distance-in-time.” He acknowledges that there are causal elements in factual memory but argues that this does not require either the assumption of a temporally continuous chain of causation or the existence of causal laws. The view found in accounts of the memory mechanism that there must be a representation that plays a causal role in remembering is unjustified.

Malcolm begins his 1977 Memory and Mind by contrasting his earlier (1963) views on memory with those in this book. Whereas his former views were more “analytical,” his new views, influenced by his discussions with Bruce Goldberg, to whom he dedicates the book, are “more historical, systematic, and destructive” (1977, 9). Part I is about the “mental mechanisms” of memory. Part II is about the “physical mechanisms” of memory.

Malcolm begins Part I by arguing against the common view tracing to Aristotle that memory is always of the past (1977, 15). He undermines this view with a series of examples (for example, “I remember this man”). Most philosophers will admit that there are a lot of odd things one says about memory that do not fit Aristotle’s model but hold that there is a “fundamental” type of memory that does. For example, Broad says that there are many things called “memory” in ordinary language that “do not really deserve the name” (1977, 63). That is, the common view among philosophers is that the concept of memory has “a unity which can be disclosed by analysis” that weeds out the deviant cases. Malcolm now sees this as wrong and counts himself, in his earlier “Three Lectures on Memory,” among those misguided philosophers who have accepted that picture, but he has now “freed” himself from it (1977, 16 and note 9).

The core feature of this misguided view is that memory is a causal process, specifically that there is an input to the organism, that this input creates (causes) an enduring internal state of the organism (in its mind or brain), and that the proper stimulation activates this enduring internal state and causes the appropriate “output,” either a conscious state or a “behavioural memory performance” (1977, 28). The description of this process from input, to the enduring internal state of the organism, to the output elicited by the appropriate stimulus, is the description of the “memory mechanism.” The presence of the memory mechanism, of one form or another, constitutes the unitary essence common to all the genuine cases of memory. This memory causal process, in both its mental and physical forms, is analogous to the functioning of a computer. One types the initial input into the computer at time t1, for example, “The first President of the US was George Washington.” This input creates an internal state of the computer, which may lie dormant for years. However, when the appropriate stimulus occurs later at t2, for example, one types the question into the computer, “Who was the first President of the US?” the dormant internal state is activated and produces the response, in this example, the appearance of the words “George Washington was the first President of the US,” on the computer display. The computer has “remembered” the data it earlier received as input. Although the computer model is a physical model, something analogous occurs in the account of the mental memory mechanism. In the mental mechanism, each of these physical items is replaced by a corresponding mental item. Typing of data into the computer is replaced by something like a perception. The alterations in the internal physical state of the computer are replaced by alterations in the mental state of the organism. The physical output, the words on the computer display, is replaced by some kind of mental state (like thinking of the relevant fact). Although this picture, illustrated by the computer model, seems straightforward, Malcolm argues that in both its mental and its physical forms, it involves certain disguised and unintelligible metaphysical ideas (1977, 52).

Although Malcolm holds that there is a nest of interrelated unintelligible metaphysical ideas in these accounts of the mental and physical memory mechanisms, the most central is that a “genuine memory occurrence” must represent what is remembered (1977, 120, 132). In order for the representation to do its job, it must be intrinsically and unambiguously connected with what it represents (1977, 56, 124, 138-140). The account of this intrinsic connection appeals to the view that the structure of the memory must stand in a one-to-one correspondence with the structure of what is remembered (1977, 120, 125-126, 164). In the case of the mental memory mechanism, this condition is often satisfied by the view that the memory is some kind of image of what is remembered (1977, 120-121, 126-128). Since an image resembles what it represents, one can, in principle, introspect the connection between the memory-image and what is remembered. For example, since Jones’ image of the killer resembles the actual killer, it enabled Jones to pick the killer out of a line-up.

Whereas the mental representations often appeal to these conscious features of the representation, the physical memory mechanism is designed to explain how memory responses are caused (1977, 167). Even so, there is a considerable similarity between the accounts of the mental and the physical memory mechanisms. Whereas the central component of the mental memory mechanism is the memory image or picture, the central component of the physical memory mechanism is the memory “trace” (in the brain). This “trace” must also be intrinsically connected with what is remembered. The same idea found in the account of the mental memory mechanism reappears in a new form in the account of the physical memory mechanism. The physical trace must have the same structure as what is remembered (1977, 168). Malcolm traces this idea of the “physical basis of memory” to Plato’s view that the brain is like a “wax tablet” on which experience stamps impressions (1977, 169-170). Crito perceives Socrates’ snub nose at t1. This leaves an impression (trace) on Crito’s brain. Years later, someone asks Crito what Socrates looks like and he is, by virtue of this trace in his brain, causally enabled to describe Socrates’ snub nose. If the trace in Crito’s brain has degraded a bit over time, he can correctly say that Socrates has a snub nose but might describe it as a bit flatter than it actually is. If Crito’s trace has degraded a great deal, he cannot remember it at all. The fact that brain traces, like impressions in a wax tablet, degrade over time explains why some memories are more accurate than others. The underlying idea, in the theories of both the mental and the physical memory mechanisms, is the same. Both hold that the memory must be isomorphic with what is remembered. Malcolm also holds that the schema for such accounts is laid out in the picture theory in Wittgenstein’s Tractatus (1977, Chapters V and 10). Malcolm’s claim is not that the Tractatus provides an account of memory or of the memory mechanism. It does not. What it does do is provide the logical schema of a kind of account of language (representation, picturing), which is presupposed in the mental and the physical accounts of the memory mechanism.

Malcolm argues that, as Wittgenstein shows in his later works, the Tractatus account of this logical schema is wrong. The accounts of how the mental memory image or copy and the physical memory brain trace are intrinsically and unambiguously connected with what they represent require that the structures of the memory and of what is remembered stand in a one-to-one correspondence with each other. However, this can work only if one can appeal to the absolute structure of the relevant items, and the idea of the absolute structure of something makes no sense (1977, 161-162, 242-244). In order to speak of a correlation between the structure of Xs and Ys, one requires a key of interpretation that identifies the elements of Xs and Ys. The question whether Beethoven’s Quartet Opus 132 is isomorphic with Dostoevsky’s The Brothers Karamazov is meaningless unless one has a key of interpretation identifying the relevant parts of each and a principle for mapping the parts of the one onto those of the other (1977, 230-232). The fundamental question then is whether it is possible to construct a key correlating neural elements (whatever they are) with elements of experience (memories, perceptions, and so forth). Malcolm argues that it is a conceptual point that no such satisfactory key can possibly be produced (1977, 232-234). Malcolm focuses on the question whether it makes conceptual sense to identify the elements of a simple experience like wanting to catch the bus. Malcolm argues, following Wittgenstein’s method of dissolving essences by producing concrete examples (Philosophical Investigations, §§ 3, 23, 35), that there is no one thing common to all cases of wanting to catch the bus. There are “countless” things that can count as Fred’s wanting to catch the bus: his looking up the time the bus is to arrive and leisurely finishing his breakfast, his running hysterically out of the house after the bus after seeing a broken alarm clock, his shouting to his wife to run out and stop the bus for him, his calling the bus company and asking them to delay the bus, his praying to God that the bus will be late today, and so on. There is no essence to wanting to catch the bus that then might be divided into elements by some key in order to be correlated with the relevant neural items.

Why, then, do we think there is such an essence? “We predicate of a thing what lies in the method of representing it” (Philosophical Investigations, § 104). The expression “wanting to catch the bus” has a neat definiteness and is divided into discrete elements (words). One sees no difficulty correlating neural states with those elements. Why, therefore, would there be any difficulty correlating neural elements with what is meant by those words? However, the complete range of activities that could constitute wanting to catch the bus cannot be specified (1977, 237-239). Since there is no possibility of isolating the essence of that experience, there is no possibility of identifying the elements of that essence that are suitable for correlation with neural states. The key condition for providing an account of the memory mechanism is unintelligible. It is, therefore, a conceptual truth that there is no possible key for establishing such correlations.

9. Nothing is Hidden

Malcolm’s first sustained attempt to contrast the key views of the Tractatus with those of Wittgenstein’s later philosophy is presented in his 1986 book Nothing is Hidden. Malcolm identifies 15 key “interlocking” theses in the Tractatus. They are:

  1. The world has a fixed unchanging form that is independent of any facts,
  2. The fixed form of the world is constituted by absolutely simple objects,
  3. These simple objects are the substance of the world,
  4. Thoughts, composed of psychical constituents, underlie the sentences of language,
  5. A thought is intrinsically a picture of a particular state of affairs,
  6. A proposition or thought cannot have a vague sense,
  7. Whether a proposition has sense cannot depend on whether another proposition is true,
  8. To understand the sense of a proposition, it is sufficient to understand the meaning of its constituent parts (the principle of compositionality),
  9. The sense of a proposition cannot be explained but only shown,
  10. There is a general form of all propositions,
  11. Each proposition is a picture of one and only one state of affairs,
  12. When a sentence is combined with a method of projection, the resulting proposition is necessarily unambiguous,
  13. What one means by a proposition is determined by an inner process of logical analysis,
  14. The pictorial nature of our ordinary propositions is hidden, and
  15. Every sentence with a sense expresses a thought that can be compared with reality (1986, viii).

The first eight chapters of the book expound these Tractatus theses and explain Wittgenstein’s “sharp disagreement with them in his later thought” (1986, viii-ix). The ninth chapter deals with Kripke’s account of rule-following in Wittgenstein’s Philosophical Investigations. The tenth chapter considers the ideas of a psychophysical parallelism and mind-brain identity. Chapter eleven discusses Wittgenstein’s last writings on the concepts of certainty and knowledge eventually published as On Certainty.

Malcolm identifies the core thesis of the Tractatus in Chapter 1 as the view that the world has a fixed unalterable form determined by the set of indestructible simple objects. The first three chapters critique these theses with arguments familiar from Memory and Mind. It makes no sense to speak of the absolute unalterable form or essence of the world because ascriptions of structure and of simplicity presuppose a key of interpretation that determines what is to count as a form or structure or simplicity—making them relative to a key.

In Chapter 2, Malcolm argues against Winch’s view that the Tractatus is primarily a theory of language and for his own view that the Tractatus is founded on a metaphysical view of a language-independent form (essence) of the world. Whereas Winch sees the Tractatus primarily as a work in linguistic analysis, Malcolm sees its metaphysics as primary.

In Chapter 4, Malcolm goes against much conventional wisdom and argues that Tractatus thoughts are not just abstract entities but are psychical. His five main theses are:

  1. Thoughts are composed of mental elements,
  2. A thought is, by virtue of its intrinsic nature, a picture of a possible situation,
  3. A physical sentence is not intrinsically a picture but can be made into one; thus, the sense of a physical sentence is bestowed on it by a thought,
  4. A sense is bestowed on a physical sentence by establishing correlations between the elements of the propositional sign and the elements of the thought,
  5. In this way, a thought becomes “perceptible to the senses” (Tractatus, 3.1).

Malcolm concludes the chapter by identifying a Tractatus-like view of thoughts as intrinsically meaningful in John Searle’s Intentionality.

In Chapter 5, Malcolm discusses the Tractatus’ obscure view that “a proposition shows its sense” (4.022). He again goes against the conventional wisdom that what shows the sense of a proposition is its syntactical features or its use and argues instead that what primarily shows its sense are psychical thoughts. Unlike physical signs, which always admit of alternative interpretations, psychical thoughts have the unique ability to show what they mean without interpretation. A psychical thought is, in Goldberg’s (1968) terms, a “meaning terminus.”

In Chapter 6, Malcolm (1986, 103) takes his point of departure from the seemingly incompatible assertions in the Tractatus that “language disguises thought” (4.002) and that “all the sentences of our everyday language, just as they stand, are in perfect logical order” (5.5563). To reconcile these conflicting assertions, Malcolm distinguishes between the processes of analysis everyday people use, which take place mostly unconsciously when they understand a sentence, and the processes of analysis that philosophers employ when they attempt to represent perspicuously the real logical structure of a proposition (1986, 106). When Ann says that the South Sea Islands are enchanting, Ann, the ordinary person, understands immediately. However, Ann is also a philosopher, and, in that capacity, might work a lifetime without success to provide a complete perspicuous representation of the analysed sense of that one proposition. Thus, language disguises thought from the philosopher but not from the everyday person. Ordinary language is in perfect logical order for Ann, the everyday person, but as soon as she puts on her philosopher’s hat, she becomes perplexed.

In Chapter 7, Malcolm contrasts Wittgenstein’s later conception of language with Wittgenstein’s earlier view in the Tractatus. Whereas the Tractatus has a representational view of language, where the core notion of representation (logical picturing) is bound up with a whole series of “interlocking” metaphysical views about simple objects, substance, and absolute structure, Wittgenstein’s later works understand language as built on expressive behaviour (1986, 133). As Malcolm puts it, Wittgenstein eventually realized that language “does not emerge from reasoning but from natural forms of life” (1986, 153).

In Chapter 9, Malcolm argues against Kripke’s interpretation that the Philosophical Investigations presents “the most radical and original sceptical problem philosophy has seen to date” (1986, 154). Kripke bases his interpretation on Wittgenstein’s remark at §201 of the Investigations, saying, “This was our paradox: no course of action could be determined by a rule because every course of action can be made out to accord with the rule.” Malcolm points out that Kripke fails to notice that in the very next sentence, Wittgenstein states that this paradox “is a misunderstanding” because “there is a way of grasping a rule which is not an interpretation” (1986, 154-155), namely, in action. A 1,500-pound grizzly bear explodes from the bushes and heads straight for a group of elderly tourists. The tour guide yells, “Run!” Do the elderly tourists think, “I interpret her to mean that my legs should move rapidly in such and such a fashion”? No! They just run. They have grasped the intended meaning in action, not by “interpreting” it by means of another rule or sign, which, then, stands in need of interpretation by another rule or sign, and so on (1986, 180-181).

In Chapter 10, Malcolm argues against the common view that the mind is, or is realized in, the brain (roughly, the idea that thoughts are “in the head”). Malcolm finds this common view to be “extraordinary” (1986, 191). The source of the confusion is that in ordinary life, we often say that our inner thoughts are hidden from everybody else. However, this is a metaphorical use of “inner.” Contemporary philosophers of mind have interpreted this metaphorical usage, which “reflects the different logical level you and I stand with regard to what I think and feel,” to mean quite literally that “thoughts and feelings are actually in the head” (1986, 191). Ironically, this literal interpretation of the view that the mental is inner actually “abolishes this logical difference.” Malcolm sees this as “a splendid illustration of how in philosophy it is possible to saw off the branch on which one is sitting” (1986, 191). The chapter includes an illuminating discussion of Wittgenstein’s criticism of the notion of a psychophysical parallelism in Zettel (§§ 606-614).

In Chapter 11, Malcolm considers Wittgenstein’s final notebooks, which consist in rough unrevised notes “with no anticipation of publication” (1986, 201). Although many students find these notes “bewildering,” they “reward hard study” and contain “individual remarks of great beauty.” They also initiate lines of thought entirely new to Wittgenstein (1986, 201). Although this chapter is probably the sketchiest in the book, due to the sketchy nature of these notebooks, the best brief way to summarise the results of the chapter is to focus on the contrast between Descartes’ and Wittgenstein’s ways of conceiving of certainty. Whereas Descartes thinks that certainty is restricted to one’s own ideas, to certain highly abstract propositions, and to what can be deduced from these, Wittgenstein holds that one can have certainty about humdrum contingent propositions of everyday life, such as “My name is Ludwig Wittgenstein” (1986, 235). Further, whereas Descartes believes that a single human being can arrive at many certainties by themselves, Wittgenstein holds that anyone’s certainty about anything presupposes an enormous amount of knowledge and beliefs inherited from others and taken on trust (1986, 235). Once again, Descartes over-intellectualizes the phenomenon of certainty, and his solipsistic method of radical doubt is an illusion. Despite this, Malcolm admits that Wittgenstein is a sceptic in a certain sense. He stresses that though Wittgenstein holds that one can know or be certain about certain things, Wittgenstein always adds the qualifier “in so far as one can know such a thing” (1986, 234). Wittgenstein’s scepticism is “not to be confused with the familiar tradition of Philosophical Scepticism” but is rather philosophical “in the sense of being a set of general observations about the framework and boundaries of the concepts of knowledge and certainty, as these figure in the real life of human beings” (1986, 235).

10. Wittgenstein: From a Religious Point of View

Since Malcolm passed away while writing his final book, Wittgenstein: From a Religious Point of View, the final draft was edited into the published form by Peter Winch, who also contributed a lengthy critical essay to the book. The book takes its point of departure from Wittgenstein’s remark to his friend Drury that “I am not a religious man but I cannot help seeing every problem from a religious point of view” (1995a, 1). Malcolm admits, with Drury, that this remark makes him wonder whether there are dimensions to Wittgenstein’s thought that he and others have not understood (1995a, 1). The book is Malcolm’s attempt to fathom this elusive dimension of Wittgenstein’s thinking.

Malcolm identifies four respects in which there are analogies between “the grammar of a language” and “what is paramount in religious life”:

First, in both, there is an end to explanation; second, in both, there is an inclination to be amazed at the existence of something; third, into both there enters the notion of an illness; fourth, in both doing, acting, takes precedence over intellectual understanding and reasoning. (1995a, 92)

First, in philosophy, as in religion, explanations come to an end somewhere. For example, Malcolm (1995a, 56-57) argues that, whereas Chomsky holds that one requires a mechanistic explanation of linguistic behaviour, his alleged scientific theory is really metaphysical in nature and does not provide the explanation of language that he claims. Second, Chomsky’s view also illustrates the tendency of philosophers to be amazed at something. Upon observing the paucity of linguistic data available to a child, Chomsky is amazed that the child can somehow learn a full-blown natural language (Malcolm, 1995a, 56-57). Just as a theologian’s amazement at the magnificence of the cosmos leads them to posit a creator to explain its existence, Chomsky’s amazement at the child’s ability to learn a natural language from such meagre data leads him to posit hidden mechanisms to explain this amazing fact. Third, Malcolm (1995a, 89-90) holds that Wittgenstein sees both philosophy and religion as having a tendency to see certain kinds of views and ways of living not as just mistakes but as akin to an illness. The philosopher has not just misapplied some logical rule, but, rather, error occurs because the philosopher’s thinking is in a diseased state. For example, Chomsky is led to posit a kind of explanation that cannot be given and, therefore, fails to appreciate the phenomenon of language that is right before his eyes. Fourth, Malcolm holds that in both philosophy and religion, doing and acting take precedence over intellectual understanding and reasoning (1995a, 92). For example, to a genuinely religious person, what is important is not that one intellectually believes in God but that one lives accordingly.

Malcolm (1995a, 92) concludes with an admission that his suggestions “may be wide of the mark.” Winch (1995, 132) makes several criticisms of Malcolm’s reading but admits that his views are “less clear cut” than Malcolm’s and adds, pessimistically, that we should not expect a very clear-cut account of what Wittgenstein meant in that remark to Drury. Winch (1995, vii) stresses that though Malcolm was still making improvements to the book at the time of his death, he regarded it as fundamentally complete. However, it seems clear that both Malcolm and Winch are still struggling with the meaning of Wittgenstein’s remark to Drury.

11. References and Further Reading

a. Books

  • Malcolm, Norman (1958) Ludwig Wittgenstein: A Memoir (with a biographical sketch of Wittgenstein by G. H. von Wright), London: Oxford University Press.
  • Malcolm, Norman (1959) Dreaming, London: Routledge and Kegan Paul.
    • A classic work in the philosophy of mind on the philosophy of dreaming.
  • Malcolm, Norman (1963) Knowledge and Certainty, Englewood Cliffs, New Jersey: Prentice-Hall.
    • A collection of Malcolm’s essays published between 1958 and 1962, sometimes with slight corrections.
  • Malcolm, Norman (1971) Problems of Mind, New York: Harper and Row.
    • An excellent introduction to problems in the philosophy of mind.
  • Malcolm, Norman (1977) Memory and Mind, Ithaca, New York: Cornell University Press.
    • Arguably Malcolm’s best book.
  • Malcolm, Norman (1977b) Thought and Knowledge, Ithaca, New York: Cornell University Press.
    • A collection of Malcolm’s essays published elsewhere.
  • Malcolm, Norman (1984) Consciousness and Causality: A Debate on the Nature of Mind with D. M. Armstrong, Oxford: Blackwell Publishers.
    • An illuminating back and forth argument between Malcolm and David Armstrong, a prominent materialist in the philosophy of mind.
  • Malcolm, Norman (1986) Wittgenstein: Nothing is Hidden, Oxford: Blackwell Publishers.
    • Malcolm’s sustained attempt to understand the actual relationship between Wittgenstein’s early Tractatus and his later philosophy beginning with the Philosophical Investigations.
  • Malcolm, Norman (1995a) Wittgenstein: A Religious Point of View, Peter Winch (ed.) Ithaca, New York: Cornell University Press.
    • Malcolm’s attempt to understand Wittgenstein’s remark to Drury that he sees problems from a religious point of view. Contains a critical essay on Malcolm’s views by Peter Winch.
  • Malcolm, Norman (1995b) Wittgensteinian Themes: Essays 1978-1989, G. Henrik von Wright (ed.) Ithaca, New York: Cornell University Press.
    • Contains 14 of Malcolm’s essays written during the last 12 years of his life on such topics as thinking, whether “I” is a referring expression, sensations of heat, the standard meter bar, language and instinctive behaviour, idealism, the intentionality of sense impressions, subjectivity, turning to stone (as one thinks), language rules, language games, the mystery of thought, and Moore’s paradox.

b. Articles

  • Malcolm, Norman (1940) “Are Necessary Propositions Really Verbal?” Mind 49 (194): 189-203.
  • Malcolm, Norman (1940) “The Nature of Entailment,” Mind 49 (195): 333-347.
    • This essay discusses only the nature of entailment between contingent propositions.
  • Malcolm, Norman (1942) “Certainty and Empirical Statements,” Mind 51: 18-46.
  • Malcolm, Norman (1942) “Moore and Ordinary Language,” in The Philosophy of G. E. Moore, Paul Arthur Schilpp (ed.) Chicago: Northwestern University Press. Reprinted in (1970) The Linguistic Turn, Richard Rorty (ed.) Chicago: University of Chicago Press.
    • Malcolm’s controversial argument that Moore holds that any philosophical proposition that violates ordinary language is false.
  • Malcolm, Norman (1949) “Defending Common Sense,” Philosophical Review 58: 201-21.
    • Discusses Wittgenstein’s view that philosophy can deliver only a series of truisms in connection with Moore’s “Proof of an External World.”
  • Malcolm, Norman (1950) “The Verification Argument” in Philosophical Analysis, M. Black (ed.) Ithaca, New York: Cornell University Press. Reprinted with revisions and additional footnotes in Knowledge and Certainty.
  • Malcolm, Norman (1950) “Russell’s Human Knowledge,” The Philosophical Review 59 (1): 94-106.
    • Discusses Russell’s view that the data for all human knowledge are private sensations.
  • Malcolm, Norman (1951) “Philosophy for Philosophers,” Philosophical Review 60: 329-40.
    • Malcolm had originally intended the title to be “Philosophy and Ordinary Language.”
  • Malcolm, Norman (1952) “Knowledge and Belief,” Mind 61 (242): 178-189.
    • Reprinted with certain revisions and additional footnotes in Knowledge and Certainty.
  • Malcolm, Norman (1953) “Direct Perception,” Philosophical Quarterly 3 (13): 301-316.
    • Reprinted with revisions and additional footnotes in Knowledge and Certainty.
  • Malcolm, Norman (1953) “Moore’s Use of ‘Know,’” Mind 62 (246): 241-247.
  • Malcolm, Norman (1954) “On Knowledge and Belief,” Analysis 14: 94-97.
  • Malcolm, Norman (1956) “Dreaming and Skepticism,” The Philosophical Review 65: 14-37.
  • Malcolm, Norman (1957) “Dreaming and Skepticism: A Rejoinder,” Australasian Journal of Philosophy 35: 201-211.
  • Malcolm, Norman (1958) “Knowledge of Other Minds,” The Journal of Philosophy 55 (23): 969-78.
    • Reprinted in Knowledge and Certainty.
  • Malcolm, Norman (1959) “Stern’s Dreaming,” Analysis 20 (74): 47.
  • Malcolm, Norman (1960) “Anselm’s Ontological Arguments,” The Philosophical Review 69: 41-60.
    • Reprinted with new footnotes in Knowledge and Certainty.
  • Malcolm, Norman (1961) “Professor Ayer on Dreaming,” The Journal of Philosophy 58 (11): 294-97.
  • Malcolm, Norman (1962) “Three Lectures on Memory” (“Memory and the Past,” “Three Forms of Memory,” and “A Definition of Factual Memory”), The Monist 45 (1962): 247-66.
    • Reprinted in Knowledge and Certainty.
  • Malcolm, Norman (1962) “George Edward Moore,” Ajatus.
    • Finnish translation of a paper first published in English in Knowledge and Certainty.
  • Malcolm, Norman (1962) “Memory and the Past,” The Monist 42 (2): 247-266.
    • Reprinted as one of the “Three Lectures on Memory” in 1963 in Knowledge and Certainty.
  • Malcolm, Norman (1963) “Three Lectures on Memory” (“Memory and the Past,” “Three Forms of Memory,” and “A Definition of Factual Memory”) in Knowledge and Certainty.
  • Malcolm, Norman (1964) “Is it a Religious Belief that ‘God Exists’?” John Hick (ed.) Faith and the Philosophers New York: St. Martin’s Press.
  • Malcolm, Norman (1964) “Scientific Materialism and the Identity Theory,” Dialogue 3: 115-25.
    • A classic paper on the identity theory of mind and body.
  • Malcolm, Norman (1965) “Descartes’ Proof that His Essence is Thinking,” Philosophical Review 74: 315-38.
    • Reprinted in Thought and Knowledge.
  • Malcolm, Norman (1965) “Rejoinder to Mr. Sosa’s ‘Professor Malcolm on Scientific Materialism and the Identity Theory,’” Dialogue 3: 424-25.
  • Malcolm, Norman (1967) “Explaining Behaviour,” The Philosophical Review 76 (1): 97-104.
  • Malcolm, Norman (1967) “The Privacy of Experience,” Avrum Stroll (ed.) Epistemology: New Essays in the Theory of Knowledge New York: Harper and Row.
    • Reprinted in Thought and Knowledge.
  • Malcolm, Norman (1967) “Wittgenstein, Ludwig Joseph Johann,” Paul Edwards (ed.) The Encyclopedia of Philosophy, v. 5 New York: Macmillan and the Free Press: 327-340.
  • Malcolm, Norman (1968) “The Conceivability of Mechanism,” The Philosophical Review 77: 45-72.
    • Classic but controversial statement of Malcolm’s early arguments against the mechanistic view of human beings.
  • Malcolm, Norman (1970) “Memory and Representation,” Nous 4 (1): 59-71.
    • This paper begins to display the influence of Goldberg’s ideas on Malcolm’s account of memory.
  • Malcolm, Norman (1971) “The Myth of Cognitive Processes and Structures,” T. Mischel (ed.) Cognitive Development and Epistemology New York: The Free Press.
    • Reprinted in Thought and Knowledge.
  • Malcolm, Norman (1972) “Ludwig Wittgenstein: Purity and Passion,” B. Mazlish (ed.) The Horizon Book of Makers of Modern Thought New York: American Heritage.
  • Malcolm, Norman (1973) “Thoughtless Brutes,” Presidential Address, Proceedings of the American Philosophical Association 46: 5-20.
    • Argues against Descartes that some of the higher animals can be said to have thoughts and beliefs.
  • Malcolm, Norman (1974) “Behaviourism as a Philosophy of Psychology,” T.W. Wann (ed.) Behaviourism and Phenomenology: Contrasting Bases for Modern Psychology Chicago: University of Chicago Press.
  • Malcolm, Norman (1975) “Author’s Response,” part of an author-reviewer symposium on Problems of Mind: Descartes to Wittgenstein. Philosophical Forum 14: 289-306.
  • Malcolm, Norman (1975) “The Groundlessness of Belief,” Stuart Brown (ed.) Reason and Religion Ithaca: Cornell University Press.
    • Reprinted in Thought and Knowledge.
  • Malcolm, Norman (1976) “Memory as Direct Awareness of the Past,” Godfrey Vesey (ed.) Impressions of Empiricism, Royal Institute of Philosophy Lecture 1974-75 London: St Martin’s Press.
  • Malcolm, Norman (1976) “Wittgenstein and Moore on the Sense of ‘I Know,’” Jaakko Hintikka (ed.) Essays on Wittgenstein in Honour of G. H. von Wright, Acta Philosophica Fennica 28 (1-3): 216-240.
    • Reprinted with revisions in Thought and Knowledge.
  • Malcolm, Norman (1977) “Descartes’ Proof that He is Essentially a Non-Material Thing,” Philosophy Forum 14.
    • Reprinted in Thought and Knowledge.
  • Malcolm, Norman (1978) “Wittgenstein’s Conception of First Person Psychological Sentences as ‘Expressions,’” Philosophical Exchange 2 (1978): 59-72.
  • Malcolm, Norman (1980) “Functionalism in Philosophy of Psychology,” Proceedings of the Aristotelian Society, New Series 80: 211-29.
  • Malcolm, Norman (1980) “Kripke on Heat and Sensation of Heat,” Philosophical Investigations 3 (1): 12-20.
  • Malcolm, Norman (1981) “Kripke and the Standard Meter,” Philosophical Investigations 4 (1): 19-24.
  • Malcolm, Norman (1981) “Misunderstanding Wittgenstein,” Philosophical Investigations 4 (2): 67-71.
  • Malcolm, Norman (1981) “The Relation of Language to Instinctive Behaviour,” J. R. Jones Memorial Lecture, University College of Swansea.
    • Malcolm remarks here that the editor’s chosen title for Wittgenstein’s notes, Culture and Value, would make Wittgenstein “turn in his grave.”
  • Malcolm, Norman (1982) “Wittgenstein and Idealism,” Godfrey Vesey (ed.) Idealism Past and Present Royal Institute of Philosophy Series: 13, Supplement to Philosophy Cambridge: Cambridge University Press.
  • Malcolm, Norman (1987) Reply to Stephen’s Review Behaviorism 15 (2): 155-156.
  • Malcolm, Norman (2015) Notes of a Discussion between Wittgenstein and Moore on Certainty Mind 124 (493): 73-84.

c. Reviews

  • Malcolm, Norman (1954) Review of “Wittgenstein’s Philosophical Investigations,” The Philosophical Review 63 (4): 530-59.
    • Reprinted with corrections and additional notes in Knowledge and Certainty.
  • Malcolm, Norman (1967) Review of Wittgenstein’s Philosophische Bemerkungen, The Philosophical Review 76 (2): 220-229.
  • Malcolm, Norman (1981) “Wittgenstein’s Bag of Raisins” (review of Ludwig Wittgenstein’s Culture and Value), London Review of Books 3 (3): 7-8.

d. Secondary Sources

  • Alanen, Lilli (1996) “Reconsidering Descartes’ Notion of the Mind-Body Union,” Synthese 106 (1): 3-20.
  • Allen, R. E. (1961) “The Ontological Argument,” The Philosophical Review 70 (1): 56-66.
  • Arrington, Robert (1979) Review of Thought and Knowledge: Essays by Norman Malcolm. Philosophical Inquiry 1 (1): 164-166.
  • Averill, Edward (1978) Review of Norman Malcolm’s Memory and Mind in Philosophy and Phenomenological Research 39 (1): 1.
  • Baker, G. P. (1990) “Malcolm on Language and Rules,” Philosophy 65 (252): 167-179.
  • Baxi, Madhusudan (1977) “Norman Malcolm’s Analysis of Dreaming,” Indian Philosophical Quarterly 4 (4): 515-526.
  • Baylis, Charles (1951) Review of Norman Malcolm’s “The Verification Argument,” Journal of Symbolic Logic 16 (4): 300-330.
  • Beebe, James. “Logical Problem of Evil,” Internet Encyclopedia of Philosophy.
  • Bedford, Errol (1961) Review of Norman Malcolm’s Dreaming in Philosophy 36: 377.
  • Bernecker, Sven (2007) “Remembering Without Knowing,” Australasian Journal of Philosophy 85 (1): 137-156.
  • Bestor, Thomas (1976) “Dualism and Bodily Movements,” Inquiry 19 (1-4): 1-26.
  • Bouwsma, O. K. (1986) Wittgenstein: Conversations 1949-1951 Indianapolis: Hackett.
  • Britton, Karl (1959) Review of Ludwig Wittgenstein—A Memoir by Norman Malcolm Philosophy 34 (130): 277.
  • Bronstein, Daniel (1940) Review of Norman Malcolm’s “Are Necessary Propositions Really Verbal?” Journal of Symbolic Logic 5 (3): 121-122.
  • Brown, T. Patterson (1961) Professor Malcolm on “Anselm’s Ontological Arguments,” Analysis 22 (1): 12-14.
  • Bursen, Howard (1978) Dismantling the Memory Machine: A Philosophical Investigation of Machine Theories of Memory Springer.
    • Excellent application of Malcolm’s and Goldberg’s insights on memory.
  • Carney, James (1960) Review of Norman Malcolm’s Dreaming in Philosophy of Science 27 (4): 414.
  • Carney, James (1962) “Malcolm and Moore’s Rebuttals,” Mind 71 (283): 353-363.
  • Canfield, J. (1961) “Judgements in Sleep,” The Philosophical Review 70 (2): 224-230.
  • Canfield, John (1981) Review of Wittgenstein’s Lectures on the Foundations of Mathematics from the notes of R. G. Bosanquet, Norman Malcolm, Rush Rhees, and Yorick Smythies Canadian Journal of Philosophy 11 (2): 333.
  • Caldwell, Robert (1965) “Malcolm and the Criterion of Sleep,” Australasian Journal of Philosophy (December): 339-353.
  • Carruthers, P. (1987) Review of Norman Malcolm’s Nothing is Hidden in Philosophical Quarterly 37 (48): 99-100.
  • Carter, Walter (1964) Review of Norman Malcolm’s Knowledge and Certainty: Essays and Lectures in Dialogue 3 (1): 99-100.
  • Castaneda, Hector Neri (1965) “Knowledge and Certainty,” The Review of Metaphysics 18 (3): 508-547.
    • Castaneda argues that in this collection of Malcolm’s chronologically ordered essays, one can detect a drift away from Wittgensteinian “prejudices” and toward a more Chisholm-like method.
  • Cerf, Walter (1962) “Studies in Philosophical Psychology,” Philosophy and Phenomenological Research 22 (4): 537-558.
  • Chan, Timothy (2010) “Moore’s paradox is not just another pragmatic paradox,” Synthese 173: 211-229.
  • Chappell, V. C. (1963) “The Concept of Dreaming,” Philosophical Quarterly 13 (July): 193-213.
  • Chappell, V. C. (1961) “Malcolm on Moore,” Mind 70 (279): 417-425.
  • Chihara, C. S. and Fodor, J. (1965) “Operationalism and Ordinary Language: A Critique of Wittgenstein,” American Philosophical Quarterly 2: 281-295.
  • Collingwood, Francis (1987) Review of Consciousness and Causality: A Debate on the Nature of Mind by Norman Malcolm and D. M. Armstrong, Modern Schoolman 64 (3): 199-201.
  • Cook, John (1981) “Malcolm’s Misunderstandings,” Philosophical Investigations 4 (2): 72-90.
  • Cornman, James (1965) “Malcolm’s Mistaken Memory,” Analysis 25: 161-167.
  • Davidson, Donald (1982) “Rational Animals,” Dialectica 36 (4): 317-327.
  • Davies, Alex (2012) “How to Use (Ordinary) Language Offensively,” Nordic Wittgenstein Review 1 (1): 55-80.
  • Deangelis, William James (1997) “Ludwig Wittgenstein—A Religious Point of View? Thoughts on Norman Malcolm’s Last Philosophical Project,” Dialogue 36 (4): 819.
  • Dennett, Daniel (1976) “Are Dreams Experiences?” The Philosophical Review 85 (2): 151-171.
    • Dennett here argues that dreams might not, after all, be experiences that occur during sleep.
  • Dennett, Daniel (1979) “The Onus Re Experiences: A Reply to Emmett,” Philosophical Studies 35 (April): 315-318.
  • Descartes, Rene (1969) Meditations on First Philosophy in The Philosophical Works of Descartes, vol. 1. Elizabeth S. Haldane and G. R. T. Ross (trans.) Cambridge: Cambridge University Press: 131-200.
  • Deshpande, D. (1976) “Professor Malcolm on Dreaming,” Indian Philosophical Quarterly 3 (3): 259-272.
  • Dilman, Ilham (1966) “Professor Malcolm on Dreams,” Analysis 26 (March): 129-134.
  • Doppelt, Gerald (1979) “The Austin-Malcolm Argument for the Incorrigibility of Perceptual Reports,” Dialectica 32 (2): 59-75.
  • Dunlop, Charles (1974) “Performatives and Dream Skepticism,” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 25 (4): 295-297.
  • Dunlop, C. E. M. (ed.) (1977) Philosophical Essays on Dreaming Ithaca and London: Cornell University Press.
  • Engelmann, Mauro (2013) “Wittgenstein’s ‘Most Fruitful Ideas’ and Sraffa,” Philosophical Investigations 36 (2): 155-178.
  • Fitch, Frederic (1940) Review of Norman Malcolm’s “The Nature of Entailment,” Journal of Symbolic Logic 5 (4): 160-161.
  • Garver, Newton (1994) This Complicated Form of Life Chicago: Open Court.
  • Garver, Newton (2006) Wittgenstein and Approaches to Clarity Amherst: Humanity Books.
  • Hacker, Peter (1987) “Critical Notice: Norman Malcolm – Nothing is Hidden,” Philosophical Investigations 10: 142-50.
  • Hacker, Peter (co-authored with G. P. Baker) (1990) “Malcolm on Language and Rules,” Philosophy 65: 167-79.
  • Hacker, Peter (1992) “Malcolm and Searle on ‘Intentional Mental States,’” Philosophical Investigations 15: 245-75.
  • Hacker, Peter (2004) “Malcolm, Norman Adrian (1911–1990)”, Oxford: Oxford University Press.
  • Hamlyn, D. W. (1965) Review of Norman Malcolm’s Knowledge and Certainty, Philosophy 40 (152): 169.
  • Hanfling, Oswald (2003) Review of Norman Malcolm’s Nothing is Hidden, Philosophy 62: 529.
  • Hanfling, Oswald (2003) Wittgenstein and the Human Form of Life London: Routledge.
  • Hartshorne, C. (1965) Anselm’s Discovery: A Re-Examination of the Ontological Proof for God’s Existence, La Salle, Illinois: Open Court.
  • Himma, Kenneth “Anselm: Ontological Argument for God’s Existence,” Internet Encyclopedia of Philosophy
  • Hoffman, Robert (1967) “Malcolm and Smart on Brain-Mind Identity,” Philosophy 42 (160): 128-136.
  • Hyslop, Alec (1973) “Criteria and Other Minds,” Australasian Journal of Philosophy 51 (August): 105-114.
  • Ginet, Carl, and Shoemaker, Sydney (1983) Knowledge and Mind: Philosophical Essays Oxford: Oxford University Press.
    • This excellent collection, presented to Norman Malcolm in honour of his seventy-second birthday, contains articles by G. E. M. Anscombe, John Canfield, John Cook, Keith Donnellan, Peter Geach, Carl Ginet, Bruce Goldberg, Hide Ishiguro, Thomas Nagel, David Sanford, Sydney Shoemaker, and G. H. von Wright.
  • Ginet, Carl (2006) “Norman Malcolm (1911-1990),” A Companion to Analytical Philosophy A. P. Martinich and David Sosa (eds.) Oxford: Blackwell.
  • Goldberg, Bruce (1968) “The Correspondence Hypothesis,” The Philosophical Review 77 (4): 438.
  • Goldberg, Bruce (1983) “Mechanism and Meaning,” Knowledge and Mind Sydney Shoemaker and Carl Ginet (eds.) Oxford: Oxford University Press: 191-210.
  • Hacker, P. M. S. (1987) Review of Norman Malcolm’s Nothing is Hidden in Philosophical Investigations 10 (2): 142-150.
  • Heil, John (1982) “Speechless Brutes,” Philosophy and Phenomenological Research 42 (March): 400-406.
  • Ichikawa, Jonathan (2009) “Dreaming and Imagination,” Mind and Language 24 (1): 103-121.
  • Iseminger, Gary (1969) “Malcolm on Explanations and Causes,” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 20 (5): 73-77.
  • Kalish, Donald (1961) “Dreaming,” Journal of Philosophy 58 (16): 437.
  • Kattsoff, Louis (1965) Review of Norman Malcolm’s Knowledge and Certainty, Philosophy and Phenomenological Research 26 (2): 263-267.
  • Kramer, Martin (1962) “Malcolm on Dreaming,” Mind 71 (January): 81-86.
  • Kretzmann, Norman; Shoemaker, Sydney; Miller, Richard (1990) “Norman Malcolm June 11, 1911-August 4, 1990,” Cornell University Faculty Memorial Statement.
  • La Croix, Richard (1972) “Malcolm’s Proslogion III Argument,” Sophia 11 (1): 13-19.
  • Lacewing, Michael (2014) “Malcolm’s Ontological Argument,” Philosophy for AS. London: Routledge.
  • Linsky, Leonard (1965) “Malcolm and the Use of Words,” Analysis 26 (2): 59-61.
  • Locke, Don (1978) Review of Norman Malcolm’s Memory and Mind in Mind 87: 631.
  • Long, Douglas (1987) Review of David Armstrong and Norman Malcolm’s Consciousness and Causality, Teaching Philosophy 10 (1): 83-86.
  • Lurz, Robert (2011) “Belief Attribution in Animals: On How to Move Forward Conceptually and Empirically,” Review of Philosophy and Psychology 1 (1): 19-59.
  • Mannison, Donald (1975) “Dreaming an Impossible Dream,” Canadian Journal of Philosophy 4 (June): 663-75.
  • Martin, Michael (1973) “Are Cognitive Processes and Structures a Myth?” Analysis 33 (3): 83-88.
  • Martin, Michael (1971) “On the Conceivability of Mechanism,” Philosophy of Science 38 (1): 79-86.
  • Matthews, Gareth (1961) “On Conceivability in Anselm and Malcolm,” The Philosophical Review 70 (1): 110-111.
  • Mayberry, Thomas (1975) Review of Norman Malcolm’s Problems of Mind: Descartes to Wittgenstein in World Futures 14 (3): 289-295.
  • McDonough, Richard (1986) The Argument of the ‘Tractatus’ Albany: SUNY Press.
  • McDonough, Richard (1989) “Towards a Non-Mechanistic Theory of Meaning,” Mind XCVIII (389): 1-21.
  • McDonough, Richard (1993) “The Philosophical Psychologism of the Tractatus,” The Southern Journal of Philosophy XXXI (4): 425-447.
  • McDonough, Richard (1994) “Wittgenstein’s Reversal on the Language of Thought Doctrine,” Philosophical Quarterly 44 (177): 482-494.
  • McDonough, Richard (1994) “Wittgenstein’s Clarification of Hertzian Mechanistic Cognitive Science,” History of Philosophy Quarterly 11 (2): 219-235.
  • McDonough, Richard (2015) “Wittgenstein’s Augustinian Cosmology in Zettel 608,” Philosophy and Literature 39 (1): 87-106.
  • McDonough, Richard (2016) “Wittgenstein – From a Religious Point of View?” Journal for the Study of Religions and Ideologies, vol. 15 (43): 3-2.
  • McFee, Gr. (1983) Philosophical Inquiry 5 (4): 159-167.
  • Maxwell, Grover; Feigl, Herbert (1961) “Why Ordinary Language Needs Reforming,” The Journal of Philosophy 58 (18): 488-498.
  • Miller, Richard (1978) “Absolute Certainty,” Mind New Series 87 (345): 46-65.
  • Monk, Ray (1990) Ludwig Wittgenstein: The Duty of Genius New York: Penguin.
  • Moon, Andrew (2013) “Remembering Entails Knowing,” Synthese 190 (14): 2717-2729.
  • Moore, G. E. (1903) “The Refutation of Idealism,” Mind 12: 433-53.
  • Moore, G. E. (1969) “A Defence of Common Sense,” Readings in 20th Century Philosophy William Alston and George Nakhnikian (eds.) London: Macmillan.
  • Moore, G. E. (1992) “A Reply to My Critics,” The Philosophy of G. E. Moore Paul Arthur Schilpp (ed.) LaSalle: Open Court.
  • Morton, Adam (1985) Review of David Armstrong’s and Norman Malcolm’s Consciousness and Causality in British Journal for the Philosophy of Science 36 (3).
  • Mulhall, S. (1987) Review of Norman Malcolm’s Nothing is Hidden: Wittgenstein’s Critique of his Early Philosophy in Mind 96: 113.
  • Oakes, Robert (1974) “God, Electrons, and Professor Plantinga,” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 25 (2): 143-147.
  • Odegard, Douglas (1978) Review of Norman Malcolm’s Thought and Knowledge and Malcolm’s Memory and Mind in Dialogue 17 (3): 566-570.
  • Odell, S. Jack (1971) “Malcolm on Remembering That,” Mind 80 (October): 593.
  • Palmieri, L. E. (1962) “To Sleep, Perchance to Dream,” Philosophy and Phenomenological Research 22 (4): 583-586.
  • Pears, David (1961) Review of Norman Malcolm’s Dreaming in Mind 70 (April): 145-163.
  • Pears, David (1989) Review of Norman Malcolm’s Nothing is Hidden: Wittgenstein’s Criticism of His Early Thought in Philosophical Review 98 (3): 379.
  • Pintado-Casas, Pablo (1997) Review of David Stern’s Wittgenstein on Mind and Language and of Norman Malcolm’s Wittgensteinian Themes: Essays 1978-1989, Teorema: International Journal of Philosophy 16 (3): 126-129.
  • Plant, Bob (2011) “Religion, Relativism, and Wittgenstein’s Naturalism,” International Journal of Philosophical Studies 19 (2): 177-209.
  • Plantinga, Alvin (1967) God and Other Minds Ithaca: Cornell University Press.
  • Plantinga, Alvin (1974) The Nature of Necessity Oxford: Oxford University Press.
  • Preston, Aaron “George Edward Moore (1873-1958),” Internet Encyclopedia of Philosophy.
  • Preston, Aaron “Analytic Philosophy,” Internet Encyclopedia of Philosophy.
  • Putnam, Hilary (1962) “Dreaming and ‘Depth Grammar,’” Ronald Butler (ed.) Analytical Philosophy: First Series Oxford: Oxford University Press.
  • Richter, Duncan “Ludwig Wittgenstein (1889-1951),” Internet Encyclopedia of Philosophy.
  • Riesenberg-Malcolm, Ruth (1999) On Bearing Unbearable States of Mind London: Routledge.
  • Rowe, William (1971) “Neurophysiological Laws and Purposive Principles,” The Philosophical Review 80 (4): 502-508.
  • Ryan, Sally Parker (2010) “Reconsidering Ordinary Language Philosophy: Malcolm’s (Moore’s) Ordinary Language Argument,” Essays in Philosophy 11 (2): 123-149.
  • Ryan, Sally Parker “Ordinary Language Philosophy,” Internet Encyclopedia of Philosophy.
  • Sayward, Charles (2004) “Malcolm on Criteria,” Behaviour and Philosophy 32: 349-358.
  • Schaffer, Jerome (1984) “Dreaming,” American Philosophical Quarterly 21 (2): 135-146.
  • Schröder, Severin (1997) “The Concept of Dreaming: On Three Theses by Malcolm,” Philosophical Investigations 20 (1): 15-38.
  • Serafini, Anthony (1993) “Norman Malcolm: A Memoir,” Philosophy 68 (265): 309-324.
  • Scott, Frederick (1965) “Scotus, Malcolm, and Anselm,” The Monist 49 (4): 634-638.
  • Shoemaker, Sydney; Swinburne, Richard (1985) Review of Norman Malcolm’s and David Armstrong’s Consciousness and Causality in Mind 94 (374): 302-306.
  • Shope, Robert (1973) “Remembering, Knowledge and Memory Traces,” Philosophy and Phenomenological Research 33 (3): 303-322.
  • Siegler, F. A. (1967) “Remembering Dreams,” The Philosophical Quarterly, 17: 14-24.
  • Soames, Scott (2003) Philosophical Analysis in the Twentieth Century, Volume II: The Age of Meaning. Princeton: Princeton University Press.
  • Soames, Scott (2004) “Malcolm’s Paradigm Case Argument,” Philosophical Analysis in the Twentieth Century. Princeton: Princeton University Press: 157-170.
  • Springett, Ben “The Philosophy of Dreaming,” Internet Encyclopedia of Philosophy.
  • Stern, K. (1959) “Malcolm’s Dreaming,” Analysis 19 (December): 44-46.
  • Stern, David (1991) “Models of Memory: Wittgenstein and Cognitive Science,” Philosophical Psychology 4 (2): 203-218.
  • Sturgeon, Nicholas; Brown, Stuart (1991) “Norman Malcolm 1911-1990,” Proceedings and Addresses of the American Philosophical Association 64 (5): 70.
  • Swiggers, P. (1987) Review of Norman Malcolm’s Nothing is Hidden: Wittgenstein’s Criticism of his Early Thought in Tijdschrift Voor Filosofie 49: 120.
  • Tang, Hao (2015) “A Meeting of the Conceptual and the Natural: Wittgenstein on Learning a Sensation-Language,” Philosophy and Phenomenological Research 91 (1): 103-135.
  • Thornton, Stephen “Solipsism and the Problem of Other Minds,” Internet Encyclopedia of Philosophy.
  • Tomberlin, James (1972) “Malcolm on the Ontological Argument,” Religious Studies 8 (1): 65-70.
  • Trakakis, Nick “The Evidential Problem of Evil,” The Internet Encyclopedia of Philosophy.
  • Uschanov, T. P. (2002) “Ernest Gellner’s Criticisms of Wittgenstein and Ordinary Language Philosophy,” Gavin Kitching and Nigel Pleasants (eds.) Marx and Wittgenstein: Knowledge, Morality and Politics. London: Routledge.
    • A variant of this paper is titled “The Strange Death of Ordinary Language Philosophy.”
  • Winch, Peter (1995) “Discussion of Malcolm’s Essay” in Norman Malcolm’s Wittgenstein: A Religious Point of View? Peter Winch (ed.) Ithaca: Cornell University Press.
  • Windt, Jennifer (2015) Dreaming: A Conceptual Framework for Philosophy of Mind and Empirical Research. Cambridge: MIT Press.
  • Wittgenstein, Ludwig (1958) Philosophical Investigations, Elizabeth Anscombe (trans.). Oxford: Blackwell.
  • Wittgenstein, Ludwig (1961) Tractatus Logico-Philosophicus, David Pears and B. F. McGuinness (trans.) London: Routledge and Kegan Paul.
  • Wittgenstein, Ludwig (1970) Zettel, G. E. M. Anscombe (trans.) Berkeley and Los Angeles: University of California Press.
  • Wittgenstein, Ludwig; Moore, G. E.; Malcolm, Norman; Citron, Gabriel (2015) “A Discussion between Wittgenstein and Moore on Certainty: From the Notes of Norman Malcolm,” Mind 124 (493): 73-84.
  • Wolf, Fred Allen (1995) The Dreaming Universe: A Mind Expanding Journey into the Realm in which Psyche and Physics Meet New York: Simon and Schuster Inc.
  • Wright, G. H. (1992) “In Memory of Malcolm, Norman 1911-1990,” Philosophical Investigations 15 (3): 224-226.
  • Yost, Jr., R. M. (1959) “Professor Malcolm on Dreaming and Scepticism—I,” Philosophical Quarterly 9 (April): 142-151.
  • Yost, Jr., R. M. (1959) “Professor Malcolm on Dreaming and Scepticism—II,” Philosophical Quarterly 9 (36): 231-243.

Author Information

Richard McDonough
Email: rmm249@cornell.edu
Arium School of Arts and Sciences
Singapore

Aesthetic Taste

Taste is the most common trope for the intellectual judgment of an object’s aesthetic merit. Its popularity rose to an unprecedented degree in the eighteenth century, which is the main focus of this article, and taste became a major concept in aesthetics. This prominence was so pronounced that taste as an aesthetic idea might seem to have developed from nothing during this time. However, the roots of theories of taste stretch back, as many things do, to Plato and Aristotle. In discussing the human soul, for example, Aristotle emphasized the role the senses play in obtaining knowledge and making judgments. Touch, a condition of all sentient beings, is the main component of taste, since the tongue must touch what it tastes. So the idea that taste can be used to make judgments was present early on, as the embryonic form of the more robust theories of taste.

Though it is no secret that theories of taste thrived in the seventeenth and eighteenth centuries, their success might still seem surprising given the intellectual focus of the period. Science and the higher faculties of reason received greater emphasis, while Alexander Baumgarten began using the word aesthetics to refer to the lower faculties of judgment. That these lower faculties became so popular in the wake of the scientific developments and ideas of the day is unusual. But these philosophers realized that there was something in the common experience of beauty that they did not understand. Perhaps people began to believe that humans really are the measure, since they were the ones making these new intellectual advancements. The ability to judge beauty would then have become more important, as they believed their judgments were more accurate or substantial. However, they still did not agree about the specifics of these judgments. For David Hume, taste is a subjective feeling with a standard found within the beholders. For Alexander Gerard, taste is an act of the imagination. For Immanuel Kant, taste is subjective, but beautiful objects present themselves as having universal appeal. And this is just a smattering of the different ideas.

Despite this strong beginning, taste had dropped out of most theories of aesthetics by the twentieth century. Yet on a popular level, people continue to refer to good and bad taste in meaningful exchanges. Many subsequent philosophers have tried to develop a more involved theory of gustatory taste as a branch of aesthetics. Though this might have its own value, taste in the more traditional sense has not completely faded away, even though people no longer devote as much time to theories of taste.

Table of Contents

  1. Early Foundations for Taste: Ancient to Medieval Philosophers
    1. Plato and Aristotle
    2. Plotinus
    3. Augustine and Aquinas
  2. Why Taste Became the Metaphor for Aesthetic Judgment
  3. Eighteenth Century Philosophers: The Century of Taste
    1. Joseph Addison
    2. Anthony Cooper, Third Earl of Shaftesbury
    3. Francis Hutcheson
    4. Moses Mendelssohn
    5. Johann Gottfried Herder
    6. Alexander Gerard and Archibald Alison
    7. David Hume
    8. Edmund Burke
    9. Immanuel Kant
  4. Nineteenth and Twentieth Century Philosophers: The Step Away from Taste
  5. Contemporary Philosophy and Beyond
    1. Pierre Bourdieu
    2. Gustatory Taste
    3. Some Developments in Analytic Philosophy
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Early Foundations for Taste: Ancient to Medieval Philosophers

Theories of taste did not explicitly come to the forefront until the eighteenth century; however, most of the foundational ideas were in place long before. The earlier focus was on beauty and truth rather than on what the beholder felt about a given work. These ideas came to influence later thinkers as they revitalized, revised, and responded to the writings of the early Greek and medieval philosophers. Only a cursory review of these preliminary thoughts is offered here.

a. Plato and Aristotle

Plato, Aristotle, and the other ancient Greeks did not have any specific notion of taste as a means of aesthetic judgment. However, many of their ideas inspired the later development of theories of taste. Plato’s metaphysical beliefs, especially his view of the perfect Forms, had a strong influence on the later Neoplatonists, even on those who did not specifically believe in a realm of the Forms. The traditional understanding of Plato holds that there is a heavenly realm where the perfect Forms of reality exist. Whether or not Plato believed in a literal realm of Forms is open for discussion, but it seems clear that he believed in perfect versions of everything we experience on earth. These Forms are like the templates of reality, and reality is therefore less perfect than these Forms. For instance, the Form of Beauty is the standard by which all other beautiful things are measured. Necessarily, all particular things are less beautiful than this perfect Beauty. To reach this higher Beauty, one must rise up to it through a dialectical method, as Socrates learned from Diotima in the Symposium. Almost like ascending stairs, one uses the lower beauties of the world to climb up to the higher realms. That is, one starts with physical beauty, moves to intellectual beauty, and then arrives at spiritual (or perfect) beauty. More could be said about Plato’s overall view of aesthetics and beauty, but it is important to note here simply that the apprehension of the beautiful is connected with knowledge. As one obtains knowledge, one continues to learn more of beauty. So, knowledge is the key component to developing a better appreciation of beauty and, for Plato, arriving at Beauty itself.

Aristotle, like Plato, did not have a concept of taste per se. His quest was to uncover the principles of beauty. Rather than believing real beauty existed in another world, Aristotle wrote that beauty was a property of objects; it was related to their size and proportion. Though Aristotle did not develop a system of aesthetics as such, he is the first on record to have developed an extended treatment of one of the arts, namely poetry. Since poetry is the main art that Plato criticized in the Republic, one might wonder whether this treatment was Aristotle’s attempt to further distinguish his own system of philosophy.

While Plato’s view is that Beauty has the same nature but with different degrees in different objects, Aristotle seems to hold that beauty’s nature varies with the different objects (or types of art) in which it is found. Therefore, the beauty of an object might relate to that object’s purpose, though he never directly says so. More important to Aristotle’s view are the concepts of form and unity. As form and unity are necessary for knowledge in the strict sense, so they also provide a kind of knowledge in art as an object imitates something else. While Plato believed that art diminishes the knowledge of something to an almost unrecognizable degree, Aristotle holds that the imitation helps the idea become simpler and therefore more easily understood. The imitation can actually be a useful, sometimes necessary, step in obtaining knowledge. And the imitation, though not always complete, is correct. Rather than rising to some higher forms of beauty beyond this physical world, Aristotle seems to have a more experiential approach to discovering and judging beauty. Each kind of thing has its own form and therefore its own beauty. To develop what we might call aesthetic judgment, though Aristotle does not use that expression, one would have to observe enough samples of different objects of the same kind to discover the order and arrangement proper to those things. Both Aristotle and Plato locate beauty outside of human experience, so taste, for them, would have been the search for beauty in things.

b. Plotinus

Plotinus, the most recognized Neoplatonist, developed his metaphysical philosophy around three principles: The One, Intellect, and Soul. Following earlier philosophers’ attempts to derive the more complex things from the simple, Plotinus posits the One as the simplest first principle of everything else, and everything is derived (or emanates) from this first principle. The One is even more foundational for reality than Plato’s Forms, without which the Forms would have no unifying principle. The second principle, Intellect, is where the other Forms reside, so to speak. These Forms, as in Plato’s view, are what give everything else its respective properties. The third principle, Soul, is the principle of desire for those things that are external to the individual. Plotinus, like Plato, posits degrees of beauty, with the lowest being physical beauty, leading up to the Beauty present in the Intellect.

Likely influenced by the ideas in Diotima’s speech in Plato’s Symposium, Plotinus holds that we begin our ascent from physical beauties and climb up to the highest Beauty. Where he might differ from Plato is in the hierarchy of beautiful things. Presumably, Plato thought natural things were more beautiful than artifacts because they were closer to the Forms of those things. However, Plotinus claims that the manipulation of a natural object through art makes it more beautiful. To demonstrate, Plotinus uses an example of two stones: one is naturally occurring and the other has been wrought by an artist into the image of a god. Which one is more beautiful? Plotinus thinks it is clear that the one that has been imbued with the soul of a human artist has achieved a higher degree of beauty. The soul of the spectator can enjoy this object more than the natural stone because he recognizes the work of a like soul. As one ascends toward Beauty, the goal is to rely less on the senses, though they are the necessary condition for beginning the ascent. It seems for Plotinus that taste is not necessarily developed or reasoned about. Rather, it is almost like a reaction in the soul, based on the knowledge that the soul possesses.

c. Augustine and Aquinas

Medieval philosophers were concerned with metaphysical properties like beauty more than any notion of individual preference or taste. This was partly because beauty, for most of the philosophers of this time, was an objective property. There wasn’t any room for disagreement about whether something was actually beautiful, though they presumably debated about whether one’s particular knowledge about the beauty of an object was correct. It was widely believed that the true, the good, and the beautiful were linked to each other. Talking about any one of these concepts involved overlapping discussions of the other two. For instance, theories of beauty consisted of some discussion about its relation to the true and good. For present purposes, the two main representatives of the early and later Middle Ages will be discussed: Augustine and Aquinas (see also the article on Medieval Theories of Aesthetics).

Continuing the basic ideas of Plato, Augustine thought perfect beauty existed in God rather than the impersonal realm of the Forms. In fact, God is the highest beauty, and everything participates in beauty because everything is created by God. In physical objects, Augustine believed two primary attributes made those things beautiful: equality and unity. Unity is found in everything that exists, but equality (proportion or symmetry) is not necessarily found in everything, especially those things made by people. Augustine provides an example that shows we at least aim for equality in our work: If you want to put two windows on the side of a house, you do not want one to be gigantic and the other tiny. You want them to be the same size, assuming the wall is an even rectangle. For Augustine, the judgment of beauty is founded upon a person’s apprehension of the unity and equality of an object. And this involves reason, which isn’t that different from previous thinkers who thought knowledge was a necessary aspect of grasping beauty. The standard of beauty is in God’s mind, so the beholder must come to understand this standard through some divine illumination. Without God’s help, a person might see vaguely the beauty of an object, but it is God alone who can help the beholder grasp the fullness of beauty. Though Augustine does not have a theory of taste, we might say that one’s taste is perfected the closer one is aligned with God.

Unlike Augustine, Aquinas adheres more carefully to the overall philosophical views of Aristotle rather than Plato, though Plato’s influence is not absent. Finding beauty present in physical objects, Aquinas famously asserts that beauty is that which pleases when seen. It might appear that Aquinas’s definition asserts a subjective understanding of beauty; namely, whatever pleases the onlooker becomes beautiful. However, the word seen implies contemplation of the object. Once again, knowledge comes into the apprehension of the beautiful. Aquinas’s view of beauty differs from the Platonic view in that beauty is really present in the object, though similar to Augustine’s view, beauty is still God, who is the ultimate cause of all beauty.

Recall that Augustine offered two main traits of beautiful things: equality and unity. Similarly, Aquinas presents us with three conditions of beauty: proportion, wholeness, and radiance. Proportion involves symmetry but is not limited to this one aspect. It involves whether there is an overall balance achieved in the object. Wholeness (or integrity) is the condition that involves the degree to which something attains its proper form. For example, a dancer sitting down is less beautiful—as a dancer—than when he or she is actually dancing. The last condition, radiance, is the most evasive. It might just be, for Aquinas, the most important condition because objects might have proportion and be whole yet still not be radiant. Generally speaking, it is that quality of an object that makes us want to perceive it again. It involves the way an object “shines” before the beholder. For Aquinas, perceiving the beauty in an object is not passive; it is an activity of the intellect. Like judging the truth of a proposition, a judgment of beauty begins with cognition. Then, the beholder makes a judgment based on these three conditions. Taste, if Aquinas had a theory, might be the ability to recognize these three universal conditions in their specific instantiations.

2. Why Taste Became the Metaphor for Aesthetic Judgment

We have mentioned briefly the basic ideas that created the foundation for theories of taste, but we still need an explanation as to why taste became the metaphor for aesthetic judgment. The sense of taste, in ancient times, was connected with the appetite, not with rational judgments. Seeing and hearing provide the most information and thus were considered the best senses for gaining knowledge. The other three senses simply help round out that knowledge. Therefore, it would have been natural to assume that seeing and hearing are also the best senses for pronouncing a judgment of beauty or sublimity. After all, these two senses were thought to be necessary for making intelligent judgments. But it was taste that became the main faculty for making aesthetic judgments, especially for the 18th century philosophers. Of course, it was not literal tasting, but metaphorical, that was at work here. Because of this, some posited a sixth, internal sense that they referred to as taste. Still, one might wonder why taste suddenly emerged as the metaphor for making judgments about beauty. Though there is no exact reason why taste rose to this prominent role, there are a couple of ideas about it worth mentioning here.

In the Aristotelian tradition, taste is connected strongly with the sense of touch. Though he maintained five senses as we do today, Aristotle considered whether there might only be four. It is necessary for the tongue to touch food, for example, for the food to be tasted. In the Middle Ages, this became more significant as different tastes were believed to have healing and nutritive effects on the body. It was believed that different flavors held different properties for the body, and a mixture of flavors was necessary in order to maintain healthy balance. Flavor was not accidental in different foods. Thus, good taste, in the sense of diet, was necessary for one’s physical well-being.

In the later Middle Ages, taste was occasionally related to the term honest by referring to objects as, for example, an honest painting. This description might seem unusual since honesty is often connected with truth. And we consider honesty to describe only a being with a will to choose, because one has to decide to be honest in a given situation. An inanimate object cannot make such a choice. Calling an object honest, however, was a reflection on the viewer or, more specifically, an ideal type of viewer. Basically, an honest object is one that an honorable person would consider to be beautiful. It would be an object that is well suited for its purpose. This idea is connected with the belief that the good and beautiful are related, so of course, the good person is better suited to apprehend the beauty of an object. Taste was recognized as the sense associated with the ability to discriminate, namely flavors. But taste also became the metaphor for discriminating or judging the beauty of an object.

3. Eighteenth Century Philosophers: The Century of Taste

Theories of taste sprang up in the eighteenth century, which is why George Dickie refers to it as “the century of taste,” which is also the name of his book. Everyone had to contribute something to the discussion, and then it seemed to die down as quickly as it had arisen. Prior to this century, most of the discussion centered on theories about beauty, which was deemed objective, but now philosophers began to look more toward themselves to understand their reactions to and preferences for such things in both art and nature. This shift began their theorizing about taste, which turned the discussion toward the subjective. And then, a century later, this discussion transformed into theories about the aesthetic attitude.

a. Joseph Addison

Joseph Addison (1672-1719) did not present a systematic treatment of aesthetics, but he did promise and deliver original ideas spread throughout his essays for the Spectator in 1712. Specifically, Addison set out to investigate the pleasures of the imagination. The first essay in this series of eleven is devoted to taste. He writes that most languages employ the metaphor of taste to indicate the faculty of the mind that distinguishes between faults and perfections in writing. This faculty of mental taste (which involves the perception of beauty), like that of sensitive (or physical) taste, has degrees of refinement. So Addison was trying to help those in the middle class utilize their brief moments of leisure for these kinds of pleasures of imagination. The pleasures of cognition, which involve intellectual thought, might not be possible for some people who have lesser intelligence or lack access to education. But the pleasures of the imagination (eyesight furnishes the ideas here) are just as good and are more easily obtained. After all, every image, says Addison, enters our minds through sight.

Addison asserts that taste is a person’s psychological response to literature. Though his remarks are mostly framed in the context of literature, Addison’s basic ideas became the foundation for people’s thoughts about other kinds of art and nature. Even though the faculty of taste is present in people at birth, it must still be cultivated to be brought to its fullest ability to judge. This should not be surprising, because the same is true for the sensitive taste. Putting something in one’s mouth enables the sensation of taste to work quite automatically. But it can take years of experience and practice to develop a sensitive taste refined enough to detect the subtle differences between two glasses of whisky.

The pleasures of the imagination are found in two types. Following Locke, Addison maintains that no images enter one’s mind without going through the sense of sight. The primary pleasures come directly from the visual objects, which are present to the observer. The secondary pleasures arrive from those objects that are remembered or fictitious, being only in the mind (at least at the present moment). A person, through imagination, can manipulate or alter the images that are in the mind. The aesthetic pleasure arises solely through the contemplation of these ideas or those images in the mind. These pleasures of the imagination are greater than sensual pleasure and are as great as the cognitive pleasures. They have at least one advantage over the cognitive pleasures—the pleasures of the imagination are much easier to obtain because one has to simply open one’s eyes.

b. Anthony Cooper, Third Earl of Shaftesbury

Anthony Cooper (1671-1713), the Third Earl of Shaftesbury (usually just called Shaftesbury), started his thoughts on aesthetics from Neoplatonic metaphysics. Shaftesbury developed his belief that taste was inborn in human beings, an idea perhaps similar to recollection in Plato. For anyone reading Shaftesbury, it becomes clear early on that he is not interested in developing a “system” of aesthetics. His thoughts cycle through his narrative, especially in his work “The Moralists.” Woven throughout his works are many important ideas that Shaftesbury does not always fully develop but that were still highly influential on those writing after him.

Shaftesbury maintains that people grasp beauty and goodness in exactly the same way, which involves the moral sense. The beautiful is closely related to virtue in his thinking; hence, moral theory permeates most aspects of Shaftesbury’s understanding of aesthetics. One’s sensibility in the realm of morality is intertwined with one’s apprehension of beauty. Not only does he borrow from Neoplatonism, but Shaftesbury also emphasizes experience, showing elements of empiricism in the development of his ideas. He holds that forms of the beautiful and good are embedded in people’s minds, but each person has an internal (moral) sense to which he or she can appeal. These two attitudes provide tension among the characters—Theocles and Philocles—in his prose as they seek to sort out opinions concerning taste. It was likely this interplay of contrary ideas that led Shaftesbury to utilize the prose style where he told a story of a group of people having discussions about taste and beauty. This style invites the reader into the discussion, similar to Plato’s use of dialogue.

The key aesthetic property is harmony, which is found in nature as created by God. Seeing God as the ultimate artist, Shaftesbury extends aesthetic appreciation to the natural world as the ultimate aesthetic object. An important part of Shaftesbury’s belief is that the moral sense allows a person to comprehend an object’s beauty immediately, without the need to use reason. Intuition is at work here more than sensation. Obviously, the object is initially perceived by the senses, but then it is immediately judged by the internal (or moral) sense.

Though it seems possible for variation among different people’s internal senses, Shaftesbury did not think that aesthetic judgments were relative. He believed in a universal standard of judgment for beauty. Philocles claims that an inward eye immediately differentiates the fair and admirable from the deformed and foul. This ability must be natural, since it differentiates as soon as objects are perceived by the senses. If the ability to discriminate (through one’s taste) between beauty and ugliness is immediate, then taste cannot have its ultimate grounding in the process of reason, which takes time. Experience affirms the immediacy of one’s judgments concerning objects of perception. For example, being captivated by a sunset usually does not require more than a glance to draw the viewer in. Therefore Shaftesbury, through Theocles, maintains that taste cannot have its ultimate source in discursive reasoning.

Some things can block people or cloud their minds from being able to make sound judgments. Even though taste resides innately in human beings, passions and ignorance prevent one’s internal sense from successfully comprehending beauty in sensible things. Shaftesbury acknowledges that one cannot escape these obstacles; however, one can learn to control them in order to avoid being tossed around by one’s whimsical feelings. The internal sense connects goodness and beauty for Shaftesbury; therefore, one can allow beauty to affect oneself more fully by cultivating a virtuous or harmonious life. Someone deprived of virtue will be less able to perceive beauty than one who lives a virtuous life.

Theocles declares several times that beauty and good are the same thing, which the inward eye enables people to immediately perceive. Then, the beauty and goodness of these objects are compared to the innate concept of harmony. It seems that the closer they are to this notion of harmony, the more beauty they are judged to possess, remembering that this happens without reasoning about the object. The same would also apply to one’s judgment about the actions of others, whether they are noble or evil. When one builds good foundations of “order, peace, and concord,” then one is able to immediately connect with beauty. The reverse is also true: if one is unable to experience the beautiful, then it is indicative that one’s life is disharmonious. Philocles raises an interesting objection to Theocles’ schema. He wonders why there are so many different people believed to be virtuous, yet their actions are often conflicting. Theocles agrees that seemingly virtuous people differ in their opinions about heroes and whether gardens or paintings are better. These differences create tension when seeking which opinions should have authority. Shaftesbury is vague on this point: It seems he is trying to claim that happiness is the measure of a successful life. And virtue, which leads to success and happiness, is the prerequisite for developing taste to comprehend beauty. In the end, it seems that the individual is responsible for his or her own happiness and has to make decisions accordingly. If there are different, even contrary, examples of a virtuous life, then it seems difficult to know whether one is actually living the virtuous life and able to apprehend beauty to a fuller degree. But he still maintains that a developed sensibility allowing one’s innate taste to have full play is the result of guiding oneself toward a moral life with happiness as the standard of measure.

c. Francis Hutcheson

Francis Hutcheson (1694-1746) started from a Lockean view of sensation, which is divided into simple and complex ideas and primary and secondary qualities. With that foundation and his belief in a moral sense, Hutcheson also posits an innate and internal sense that is necessary for perceiving beauty. One reason for this view is exemplified by the fact that some people’s external senses are fully functioning, yet they find no enjoyment in the arts. If their five senses are working properly, then the hindrance must come from another sense. Moreover, there are things like mathematical or logical theorems that are deemed beautiful, but they are perceived by the mind and not the five senses. Finally, as further proof, Hutcheson notes that beauty is perceived immediately and does not require any knowledge; people make aesthetic judgments quite instantly. So, for Hutcheson, the ability to grasp beauty must be another, internal “sense.”

The internal sense—Hutcheson does not clearly define it—is a mental faculty that functions much like one of the five senses. However, it recognizes beauty in both sensuous and mental experiences, which makes it sufficiently distinct. Hutcheson holds a complementary place to Shaftesbury in the development of the idea of innate taste. Hutcheson also blends his aesthetic theories with his moral theories, and both contexts allow for innate elements in human beings. Like an external sense, this internal sense is natural and is not governed by one’s will. Hutcheson points out that the will does not determine whether an object causes pain or pleasure. It is a natural instinct to pull one’s hand away upon touching something hot. Experiencing some objects causes pleasure, while other objects inevitably cause pain. As an analogy, Hutcheson demonstrates that pleasure in artistic objects—architecture, painting, musical composition, and so on—is also innate and necessary. Though he finds the faculty of taste to be an internal sense, Hutcheson explains that the pleasure arises out of the harmony, order, and design of the object. But he does not think that simple ideas, like color, sound, or mode of extension, can provide the same pleasure.

Concerning taste, Hutcheson believed that beauty represents the idea, while the sense of beauty represents our ability to grasp this idea. The combination—one’s ability to perceive beauty internally—is what he refers to as taste. When perceiving beauty, we should note that Hutcheson proposes a distinction between absolute (or original) beauty and comparative (or relative) beauty. Objects have absolute beauty when they are beautiful in themselves without a comparison with any other object. Comparative beauty, on the other hand, is grounded in the comparison between the object of the perception and the object that it imitates.

Beauty, for Hutcheson, is mostly comparative, which means it would not exist without relating to the mind of a perceiver. Objects play their part by exciting in people feelings of beauty when there is “uniformity amidst variety,” which is the primary property of beauty. When uniformity is equal, beauty increases with variety. For example, an equilateral triangle has less beauty than a square, while a regular hexagon has more beauty than both of them. Conversely, when variety is equal, greater uniformity enhances the beauty. For example, a square is more beautiful than a rhombus. This uniformity with variety triggers the internal (and innate) sense of taste in human beings, causing them to apprehend the beauty of the object. External things only contribute by relating to this internal sense, causing it to activate feelings of pleasure. This activation of pleasure notifies observers that they are experiencing something that is beautiful.

d. Moses Mendelssohn

As a committed rationalist, Moses Mendelssohn (1729-1786) did not want to rely on emotional responses for aesthetic experiences. He was dedicated to the principles of Leibnizian metaphysics. Mendelssohn’s goal of understanding the world could only come from rational principles applied to reality. The rationalists advanced the notion that clear and distinct ideas are present when one understands the interconnectedness of things. Taste also falls under the rationalist scheme and is something acquired and developed rather than an internal sense that is natural. Since clear and distinct ideas are not easily realized, Leibniz suggests that most of our knowledge consists of clear and confused ideas. Clear ideas arise from an object that is distinguishable in a sense perception, but they can be confused (that is, not distinct) because their contents are not distinguishable. Clear and confused ideas usually result when one knows the whole and not the parts, that is, the interconnectedness of the parts is not known.

In “On Sentiments,” Mendelssohn presents a series of letters written by Theocles, a reaction to Shaftesbury, who had a character with the same name. Mendelssohn believed that views like Shaftesbury’s, though freethinking, lacked the rigor necessary for precision. Mendelssohn’s Theocles admits that when someone does not have the requisite experience of beauty, it is likely from a lack of preparation. Theocles claims that he prepares himself to experience beauty, and this preparation is necessary for the experience. It might be similar to a runner stretching before running a race. People ready themselves in many different contexts, so it should not seem odd to prepare for an aesthetic experience. Mendelssohn’s Theocles explains that he actually prepares to experience something pleasurable by initially striving to perceive it distinctly. Making a transfer from parts to whole, the distinct ideas fade out into the background and become confused. Since it is necessary for the whole to be present to the senses at once, the universe can only be a beautiful object for the mind of God. Hence, the finitude of mankind prevents objects too massive or too minuscule from being perceived as beautiful.

Mendelssohn describes some criteria for explaining why an object is effective at presenting a perfection or an imperfection, which aids in apprehending beauty. He describes three proportions that act on our impulses: (1) the proportion to the magnitude of the good, (2) the proportion to the magnitude of our insight, and (3) the proportion to the time required to consider this good. The first proportion relates to perfection, implying that things which possess a higher degree of perfection are more pleasing to the mind. The second one relates to knowledge: the more distinct one’s knowledge is of something, the more impact that thing has on the individual. The last proportion requires more explanation. It relates to the speed of the perception. The less time it takes to perceive a perfection, the more pleasant is the knowledge of that object. Something that can be perceived quickly might produce greater desire in the perceiver than something that is more perfect. By learning to see things clear and confused, that is, the whole but not the parts, one can learn to perceive more quickly. One learns to train the soul through habit and practice; the goal is to become so trained that an action no longer requires thought (or at least requires less thought). Practice and intuitive knowledge are the two main ways to increase the speed of one’s thoughts. Practice involves constantly reviewing things, such as inferences in practical philosophy, until they become ingrained in one’s mind. Intuitive knowledge entails continually learning to apply the practiced inferences to concrete situations. In terms of aesthetic experience, one learns through reason which things are supremely beautiful by being often exposed to beauty. Eventually, one practices and applies taste through the instrument of reason until it becomes embedded, and it will eventually function without thought.

Mixed sentiments, those combining pleasure and displeasure, are another indicator of Mendelssohn’s belief that taste is acquired. Sympathy is the primary example Mendelssohn employs to illustrate the notion of mixed sentiments. Sympathy expresses love for an object, while also being discontent at the object’s or person’s misfortune. He demonstrates this idea using examples from drama. When a tragedy is about to occur, the audience can appreciate the ability of the actors, directors, and writers to make them feel terror; however, the audience is not afraid for themselves but for the characters who are about to suffer. The interesting thing about mixed sentiments is that they penetrate more deeply and vividly into one’s mind than any type of pure pleasure. Like learning to recognize the three proportions, developing an understanding of the mixed sentiments also requires habit. One must practice utilizing mixed sentiments to discover and experience beauty and sublimity.

Mixed sentiments lead Mendelssohn into thoughts on perceptions and one’s reaction. An extremely large object that we could think about as a whole but could not comprehend in person causes a mixed sentiment of gratification and trembling if we continue to think about it. As examples, he suggests the depths of the ocean, a desert stretching out to the horizon, or the seemingly endless stream of stars in the sky. One feels euphoric, a pleasing nausea. Pure pleasure will eventually breed boredom induced by monotony, but mixed sentiments will overpower one’s senses, making one want to perceive the object again and again. Mixed sentiments and training the mind are two important facets of Mendelssohn’s understanding of how people develop or acquire taste.

e. Johann Gottfried Herder

Johann Gottfried Herder (1744-1803) shared the notion of reasoned or developed taste with Mendelssohn. He deviated from Mendelssohn by grounding everything in nature, while Mendelssohn was a staunch advocate of Leibnizian metaphysics, grounding everything in reason. It might seem that a belief in the supremacy of nature would lead one to the view of innate taste, like the view held by Shaftesbury. However, Herder does not begin with innate ideas like those in the Platonist school; he places more emphasis on discovering and developing an ability to perceive beauty. Herder adds a step to Mendelssohn’s view, rather than opting for innate taste. Mendelssohn basically believed that reason develops taste, while Herder believed that nature leads to reason, which then leads to taste. In commenting on the natural aspect of taste, Herder explicitly claims that truth and beauty are disclosed through the use of reasons. When one is induced by reasons, then one will naturally expect everyone to accept the same reasons as evidence of truth or beauty. He was well aware that not everyone would actually agree with the same type of reasoning concerning the beauty of a given object. He merely asserts that it is natural to expect (or want) others to be in agreement.

People tend to have differing views about what counts as beautiful or ugly, and it is important for Herder’s view that this occurrence be explained. Dealing with people’s differences of opinion is one of Herder’s distinguishing characteristics. He was very interested in the way that diverse people develop and come to think and act in distinct ways from other people, and he points out the fact that taste changes throughout time and from place to place. He links this change, as well as others, with culture and upbringing. Everyone, according to Herder, possesses an aesthetic nature, which is one’s capacity to apprehend beauty through the senses. This aesthetic nature is the starting point for each person, but it develops in different ways depending on one’s culture, background, and experience. For example, if one immerses oneself completely in the art of music, then one will be exceptionally trained to hear the melody of music. At the same time, such a person might be ill-equipped to perceive visual beauty because one’s eyes might not be as well trained as one’s ears. Nature has equipped everyone with similar capacities to perceive beauty, but each person is responsible for developing these capacities. On the other hand, people are restricted by how much their society and environment have contributed to developing their tastes as a whole. Beauty is not always obvious in every culture, but Herder claims that it is always present, at least in a foundational way. Utilizing one’s reason and overcoming one’s background are necessary for developing good or refined taste.

f. Alexander Gerard and Archibald Alison

Much like Hutcheson, Alexander Gerard (1728-1795) and Archibald Alison (1757-1839) built their theories of taste upon a foundation of Locke’s notion of ideas. They each developed from this foundation views of taste called associationism—a view that the mind (or imagination) relates ideas that are similar to each other or conjoined by custom or experience. Even though their theories differed in degree, there is enough overlap to list them together.

Gerard believed that taste was a kind of internal sense similar to the external senses. Like those five senses, experiences for this one were also simple and immediate. As soon as something comes into your field of vision, the sense of sight perceives it immediately. Likewise, as soon as beauty, or another aesthetic property, enters into your perception, you immediately experience it. Gerard divided up his study into seven principles of the internal sense (or powers of the imagination), rather than positing only a sense of beauty, as Hutcheson did. The seven principles are novelty, sublimity, beauty, imitation, harmony, oddity (humorousness), and virtue. This may seem like a curious list for contemporary theories of aesthetic taste, but Gerard’s association theory makes sense of these principles.

Taste, for Gerard, is a kind of critical perception, which he calls relishing. It goes beyond simply perceiving an object. Anyone who ingests food can taste it in the most primal sense of the term. But to discern differences and subtleties requires a whole other set of abilities. Pleasure is derived from the seven categories because each requires moderate effort to formulate or comprehend a new idea. Basically, the new object is associated with previous ideas in the mind of the perceiver, and this is an act of the imagination. Rather than being a mere feeling, the imagination follows rules to make these associations. Strong passions conjure up these associations, in a sense, but then the mind continues the process of associating these feelings with the appropriate concepts.

People improve their taste when judgment and imagination are combined through the following factors: sensibility, refinement, correctness, and proportion (or comparative adjustment). All but the last refer to a single property among various objects. Sensibility is basically a person’s range of feeling pleasure and pain, which, Gerard notes, differs from person to person. Refinement involves making comparisons, especially between lower and higher degrees of a particular quality. Correctness, for Gerard, means alleviating the confusion between what are merits and what are blemishes. Proportion, on the other hand, compares whole objects with each other, rather than mere properties. One’s taste improves as one develops a refined ability to utilize these four factors to unite the seven principles when apprehending an object of beauty.

Alison provides an overly detailed association theory of taste, but here only the basic ideas of his view will be presented. To begin with, beauty is found in the mind of the perceiver; he does not consider it a property of the object. He maintains this opinion because, when describing an experience of beauty, one always resorts to talking about how it made him or her feel. Imagine someone claiming that a given object is extremely beautiful, and yet it is an object of indifference. That seems impossible, which is why Alison believed feeling is necessary for beauty. And this feeling of beauty arises through what he calls a train of taste. This is similar to someone having a train of thought, where one thought is associated with or leads to another thought and so on. A train of taste begins with a simple emotion—such as cheerfulness—that arises when perceiving an object. This simple emotion becomes the starting point for a train that associates the ideas of emotions. While this is the necessary starting point for an aesthetic experience, this train must also produce emotions.

Alison’s association view claims that an affective quality of an object becomes associated with ideas of emotions as a train of taste. The material quality and the abstract or emotional quality become correlated through their constant conjunction in experience. To illustrate, thunder might produce fear in a child because the child associates the noise and lightning with the emotion of fear; on the other hand, a farmer might feel joy upon hearing thunder if this season has been particularly dry. Unlike these examples, differences in people’s tastes result from an absence of the right associations. People, for different reasons, may fail to produce the requisite trains of taste that lead to the right emotion. This can be caused by different concerns interfering with one’s ability to allow the trains of taste to develop. Worrying about paying next month’s rent, for example, could hinder one’s ability to follow the train of taste where it will naturally lead. Thus Alison, like many others, posits a notion of disinterestedness as a necessary condition: one must not be distracted by cares in order to allow one’s taste to apprehend and appreciate beauty.

g. David Hume

Even though David Hume (1711-1776) wrote little on aesthetics, his condensed essay “Of the Standard of Taste” was highly regarded by those who came after him. One cannot successfully treat the subject of taste thoroughly without some reference to this essay. Hume is generally labeled an empiricist, but in terms of taste, we could classify him as an ideal observer theorist who allows for some individual and cultural preferences. Empiricism, however, seems an apt label when considering certain elements of his essay on taste, namely that its foundation is experience. Art as a social practice is contained, for Hume, under the general theory of human action that he presents elsewhere but does not develop explicitly for his aesthetic views.

Hume draws a distinction between sentiments and determinations. Sentiments are always right because they do not reference anything beyond themselves. However, determinations are not all correct because they make reference to something beyond themselves, something that could be verified or falsified. Beauty is not a quality of objects; therefore, judgments of beauty and taste are sentiments, not determinations. If beauty were a quality of objects, then we would have a standard of beauty contained within those beautiful objects. Despite this result, Hume still wants to allow for certain kinds of opinions that seem correct from experience. While there are some objects that might be close in beauty to each other, there are others that clearly seem to be more beautiful than other objects.

As a prime example, Hume claims quite famously that no one in his or her right mind would think that Ogilby and Milton have no difference in excellence. But this difference is not something in the object itself, for beauty is not a property of objects but is in the mind only. So the objects that affect the higher sentiments of the person are the ones that we deem more beautiful. These effects are the result of cultural convention and therefore are subject to change. But within a culture, there is a standard of taste that isn’t explicit, like the law, but is based on experience (comprising practice and comparison), especially experience of the right kind of person. Hume appeals to a true judge that would be able to perfectly assess the beauty of an object because this person would possess “strong sense, united to delicate sentiment, improved by practice, perfected by comparison, and cleared of all prejudice.” The combined opinion of these very rare individuals would compose the standard of taste. The standard of taste lives within these true judges. By recognizing the better judgments certain people have displayed, the standard of taste they represent becomes public.

It is important, however, not to confuse Hume’s true judge with someone like a contemporary art critic. The judge is not applying a standard of taste to the different objects of perception. If so, then beauty would be found in the objects or in some other realm. To better understand what Hume means, we can explain it this way: Many people have experienced looking at something and not understanding what they are seeing. And then someone else comes along and shows them what to look for (or how to see it properly). All of a sudden, they are able to perceive the object properly. For another example, take seeing someone in the distance. You might think the person is your friend. But as the distance becomes smaller and the perception clearer, it is now obvious that the person is a stranger. These analogies are what Hume has in mind. The true judge does not apply a standard, but the true judge has more perfect perception. And ideal perception is the key to having good taste. It follows that becoming better at perceiving objects will make one’s taste better.

h. Edmund Burke

Like Hume and others, Edmund Burke (1729-1797) recognized that nothing seems more indeterminate than taste. Hume tried to show that since we believe there are expert opinions on matters of taste, then taste cannot be simply a personal whim. Burke even asserted it is likely that the standards of reason and taste are the same in human beings. The explanation, Burke claims, for thinking that reason and taste seem so different is that more people cultivate reason to a higher level. An error in reasoning could have far more negative consequences than an error in taste. For example, a heart surgeon’s reasoning about which kind of operation is necessary will have greater direct consequences than someone’s reasoning about whether Pablo Picasso is a better painter than Marc Chagall. The urge to cultivate taste is not present, so most people do not devote much time to it.

Though Burke recognized the ambiguities surrounding taste, he set his goal to try to uncover principles of taste. One of his starting points was the uniformity of people’s organs of perception. Many people emphasize the differences between people’s perception of the same event, which leads to the belief that people perceive things differently. Burke, however, maintains that if people’s sense organs functioned completely differently, then every kind of reasoning would be impossible. If two people were looking at a tree, for example, then there would be nothing on which to ground their separate claims that it is a tree. They might choose to describe what they see in contradictory terms, but as Burke claims, their sense organs must actually perceive the same object. Part of his method was to catalogue the different kinds of objects and how they affect the senses and which senses they affect. Specifically, Burke chose to categorize objects according to their giving pleasure or pain. Through this catalog, Burke believed he demonstrated that people have the same physical responses of pain and pleasure to various objects. This catalog further gives foundation for a more precise theory of taste by showing the similar responses people have toward different sense stimuli.

For Burke, pleasure and pain compose the two main aesthetic starting points for a judgment of taste, first going through the senses and the imagination. Since one rarely moves from pain to pleasure or the reverse, Burke introduces indifference as the neutral starting point for experience. In other words, one moves from a state of indifference to either pleasure or pain. If one is in an indifferent (or neutral) state, then music, for example, compels one to move to a state of pleasure. The power of the imagination utilizes the pleasure or pain to recognize the property of the object that led to that particular feeling. Depending on which one, the object is judged to be beautiful or ugly in accordance with the degree of pleasure or pain. So, Burke’s notion of taste consists of three things: primary pleasures of sense, secondary pleasures of the imagination, and conclusions of the reasoning faculty.

i. Immanuel Kant

Owing to his third major critique about aesthetic judgment, Immanuel Kant (1724-1804) remains an overwhelming influence in aesthetics. Much has been written about different aspects of Kant’s aesthetic theory, so this section will focus solely on his ideas surrounding taste. Though Kant fully believed that taste is subjective, he nevertheless referred to judgments of taste rather than something like feelings of taste. This choice was not a denial that feelings are relevant, since taste has to do with pleasure, but he wanted to uncover whether there were any a priori principles for taste.

Since Kant liked theoretical systems, it is no surprise that he divides judgments of taste into moments. There are four moments that correspond with the four judgments (quality, quantity, relation, and modality) found in the Critique of Pure Reason. The first moment, disinterested pleasure, corresponds with quality. It means that in order for a judgment to be one of taste, it must not involve any interest beyond itself. Disinterested is not the same as uninterested. Disinterested is closer to a kind of detachment. The object has nothing to give other than the pleasure of itself; there is not an interest beyond itself. If one found an expensive object, one might declare that it is beautiful. However, this would not, strictly speaking, be a judgment of taste, if one were also thinking about the amount of money to be gained from its sale.

The second moment, universal pleasure, corresponds with quantity. It preserves the common belief (or feeling) that judgments about beauty are not completely subjective. We often expect others to share this belief. For example, we would find it highly unusual, if not disturbing, that someone literally believed that a sunset did not possess at least some beauty. Since Kant does not assert a specific standard of beauty, he doesn’t claim everyone will actually agree about which objects are beautiful. Judgments of beauty are singular; they are about one object at a time, and each judgment presents itself as having universal appeal.

The third moment, the form of purposiveness, corresponds with relation. Specifically, he is focusing on the relation of an end or purpose, a final cause. The purpose for which an object is made governs the way it is made. A hammer has a purpose as it was made to put nails into wood; so, the idea of its purpose existed before the actual hammer. However, judgments of taste (or beauty) do not depend on concepts, so it seems that they could not have purpose. But Kant believes that a judgment of beauty cannot be solely a feeling: it must be based on formal properties. To overcome this problem, Kant employs the expression “purposiveness without a purpose.” This is subjective; we must imagine that the object has a purpose even though, for an aesthetic judgment, it does not.

The fourth and last moment, necessary pleasure, corresponds with modality. Unsurprisingly, Kant does not think people find something beautiful because they must necessarily find it so. Kant explains that this necessity implies that the beautiful object is exemplary. When we see a beautiful work of art, we want to imitate it as if there were rules to follow to produce an equally beautiful object. Artists employ techniques that can be learned, but Kant believes that it is not possible to teach someone how to make a beautiful work of art even if that person masters all the techniques of a given art. Taken together, these four moments compose the basic aspects involved in making a judgment of taste.

4. Nineteenth and Twentieth Century Philosophers: The Step Away from Taste

Theories of taste rose up in the 18th century and diminished almost as quickly. As demonstrated by sheer numbers, 19th century philosophers were less concerned with taste than 18th century thinkers. They didn’t abandon aesthetic taste; rather, they moved from talk of aesthetic taste to talk of an aesthetic attitude. On some level, this change might seem like a mere semantic difference, but though it overlaps with taste, talk of an aesthetic attitude offers certain differences. (See also the article on Aesthetic Attitude for a fuller treatment.)

Taste is very outward looking, especially as it relates to aesthetic judgment. The object possesses concrete properties that the perceiver ought to judge as beautiful or not. Failure to make the correct judgment was considered a deficiency in the beholder. For some previous philosophers, it could be a flaw in a person’s virtue that hinders the ability to perceive the beauty of the object. For others, it might be more connected to a lack of knowledge or at least the right kind of knowledge. The key idea for most traditional theories of taste was that the object has properties that the beholder must discover, though the views of people like Hume start to show a shift.

In contrast, aesthetic attitude brings the individual onlooker more to the forefront. The beholder’s state of mind becomes more important as his or her attitude helps or hinders the possibility of aesthetic experience. Whether or not the original aesthetic attitude theorists believed so, these theories allow for a wider range of objects to be considered aesthetic objects. Just by adopting an aesthetic attitude, it seems like any object could be viewed as an aesthetic object. With the taste theorists, the object, apart from the spectator, must be worthy of the aesthetic appreciation it receives. Another difference lies in the fact that the aesthetic attitude can seemingly be turned on and off. Someone could adopt the aesthetic attitude in a given instance, but ignore it in a very similar situation the next day. There seems to be some truth to this because you could walk into an art museum wanting and expecting to experience wonderful things, but you could also enter with a refusal to see anything in an aesthetic light. Taste, according to the respective theories, is not something that is turned on and off. A person either has a developed or attuned sense of taste or not. In other words, the aesthetic attitude is a point of view one adopts, while aesthetic taste seems to be more connected with one’s development and nature.

Two main versions of aesthetic attitude theories occur in the writings of Arthur Schopenhauer and Edward Bullough. Schopenhauer’s (1788-1860) thoughts on aesthetics, we might say, mark the transition from theories of aesthetic taste to aesthetic attitude. Schopenhauer often uses the term aesthetic contemplation rather than attitude. But it seems clear that the later use of attitude can be applied retroactively to his use of contemplation. In order to have an aesthetic experience, the perceiver must have a different kind of perception about the object. No longer focused on the particulars, the perceiver experiences the ideas that are embedded in the object. We might postulate that this shift from particulars to ideas occurs when the perceiver has adopted the aesthetic attitude, though Schopenhauer never clearly spells this out. This attitude and experience are only temporary; it’s an impermanent rest from the suffering of life. The attitude is very important for Schopenhauer. Most things, when viewed with the right aesthetic attitude, will become beautiful in the mind (or perception) of that specific person.

Edward Bullough (1880-1934) is not a common name in the larger history of philosophy, but he made a small but significant contribution in the field of aesthetics. Working as a psychologist, he developed a notion of psychical distance (a continuation of disinterestedness) that was to ground his idea of aesthetic attitude. He often uses the expression aesthetic consciousness instead of aesthetic attitude.

Bullough wanted to develop a notion of the experience of art without appealing to any single characteristic found in all art, since he did not believe there was such a characteristic. This belief helps to illustrate the shift that had taken place since the 18th century, when many still believed beauty was the main characteristic of art. Bullough was more concerned with focusing on the experience that the work of art causes for the beholder. Two people looking at the same object, for instance, might have very different experiences. His solution to this dilemma is what he calls psychical distance. Bullough believed that the beholder must have the correct amount of distance between herself and the work of art. Too much or too little distance will prevent the complete aesthetic experience. It might be similar to having a conversation: Imagine trying to talk to someone in a normal conversation, and he moves his face one inch away from yours. It would be much too distracting to continue. Likewise, if someone were standing one hundred feet away from you, it would not be possible to have the intimacy that a good conversation requires. While there isn’t an exact distance one must have when experiencing art (or having a conversation), there is a range of distances, and the beholder must be in that range for the aesthetic experience to be possible. For Bullough, this distance is what directly affects one’s taste in works of art. It is important for the beholder to learn to gauge the right distance, which can vary from person to person. People create the distance by removing practical interest from the object.

5. Contemporary Philosophy and Beyond

Theories of taste reached their peak in the 18th century. They diminished and then changed in the 19th century. They were left without much significance in the 20th century. Now, in the 21st century, few people really speak about a theory of taste. Are these theories merely relics of the past that we should find interesting only as historical artifacts? How can we account for the fact that people speak commonly and meaningfully about aesthetic taste while it seems to have diminished in academic discourse? It is not obvious how we should answer such questions. However, even though taste is no longer a prominent idea, there have been some notable contributions in the contemporary world.

a. Pierre Bourdieu

A French sociologist, Pierre Bourdieu (1930-2002) attempted to apply the methods of the social sciences to an understanding of aesthetics. In this way, he is unique because he did not work in the traditional philosophical framework that surrounds questions of beauty, taste, and aesthetic experience. He studied how people come to develop their tastes in various areas, but especially music. While money and time are important for developing cultural knowledge, Bourdieu claims that a crucial component comes from how someone is raised in the home and other institutions, like school. He uses the term cultural capital to refer to someone’s social assets, such as education. While money may help someone gain some social assets, the salient idea is that cultural capital helps one achieve a higher class beyond what one’s purely financial assets would allow.

To the embedded responses a particular individual has to cultural objects, Bourdieu gives the name habitus. People belong to different aesthetic spheres, and their preferences are very similar within the sphere. He concludes that there is no value that guides one’s aesthetic taste; it is developed within a person’s class. This differs from views in traditional philosophy that tend to favor notions of beauty and taste from beyond one’s vantage point in the realm of ideas, or even God, without reference to a person’s class or context. Since people approach things from a particular situation, Bourdieu maintains that people’s social context contributes significantly to their approach to aesthetic taste. In order to demonstrate this idea, Bourdieu surveyed many people belonging to different social classes. He discovered, for instance, that people from the working class believe that objects should serve a function, even aesthetic objects. However, those from the upper classes believe an object could be valuable for its own sake. One class, thought Bourdieu, would almost be disgusted by the dominant art in another class. Thus, for Bourdieu, taste is developed within one’s social context, but one could move to a different class by acquiring cultural capital.

b. Gustatory Taste

Aesthetic taste began as a convenient metaphor for the judgment of the beautiful. Some recent philosophers have begun to examine whether taste must be considered only a metaphor disconnected from its natural setting. In other words, might real gustatory taste have a substantial connection with the traditional and more metaphorical notion of taste? It is a contentious topic with very few middle ground positions.

Gustatory taste can be altered—positively and negatively—with experience or education. People have different methods of preparing foods all over the world that produce different flavors. Knowing how to blend flavors and how to properly consume certain items will refine one’s taste and enjoyment of certain foods and drinks. Scotch, for example, is a complex drink that can contain sweet, smoky, spicy, citrus, and other flavors. Knowing how to drink scotch to taste all of these flavors is not automatic. While there might not be an absolutely correct way to drink it, there are ways to drink it so that you taste all it has to offer. Similarly, in the context of art, one could learn how to appreciate certain kinds of art by learning how to appropriately perceive and experience them. This has nothing to do with whether the person will actually like certain art. The point is merely that one can alter or improve one’s taste by learning more about the object or type of object. This education and refinement will usually increase the pleasure received in both contexts.

Whether gustatory taste is on par with traditional aesthetic taste seems to hinge on the status of food as art. This is where the larger questions loom for connecting the two kinds of taste. There are some more generally agreed upon characteristics of art that could help negotiate this question. Art is generally considered a kind of expression of emotions or ideas. While someone cooking might have positive emotions about the food or those who will consume it, the food itself does not seem to express emotion. Now there might be a situation where one person claims that the cook must love her because the cook prepared her favorite meal. There is some communication here, but the question is whether the communication was through the food as art, because something similar could be communicated with store-bought chocolate or even a bottle of wine. These things might not carry meaning (or the same meaning) for anyone else. Insofar as meaning or expression is necessary for art, gustatory taste seems to fall short of the traditional (and metaphorical) theories of taste, as suggested by Elizabeth Telfer, though she believes food to be a minor art.

c. Some Developments in Analytic Philosophy

Even though the zenith for theories of taste has passed, taste has found some interest among contemporary analytic philosophers. Talk of aesthetic judgment and interpretation is more prevalent, but there are some important themes that have received attention in recent discussion. With the rise in connecting gustatory taste with aesthetic taste, some philosophers have given more weight to the personal interaction one has with an aesthetic object. Carolyn Korsmeyer and others have pointed out that taste in both the literal and metaphorical senses requires a personal experience with the object. It would be suspect for people to claim that they dislike bananas, for example, if they had never eaten one or even seen one. Similarly, people cannot reasonably claim to dislike an opera or painting that they have never observed or experienced. This lack of personal familiarity becomes even more acute were they to try making a specific claim, such as stating that the colors of a painting are not well balanced throughout the composition. This claim seems impossible without actually seeing the painting. While we might trust our friend’s negative review and decide not to see a work of art, we cannot reasonably make the stronger claim that something is wrong with the work without actually experiencing it. Furthermore, there is a difference between claims of taste and other kinds of factual claims. From second-hand testimony, we could learn that a sculpture is made of bronze, but we could not learn how beautiful it is unless we see it for ourselves. It seems reasonable that some kind of personal experience of an object itself or similar object (including audio or visual representations) is important for an evaluative judgment of taste. Even if it were possible to make a judgment of taste without direct experience, it would at least be necessary for someone to have a little knowledge of the kind of object under discussion.

Some questions arise about which objects are appropriate for one’s judgment of taste and which people’s opinions matter. Imagine a person who claimed that a toaster was the most beautiful object he had ever seen. While it seems likely that most people would not agree, is this person wrong? Frank Sibley claims that anyone can notice the non-aesthetic qualities of an object, but only some people notice its aesthetic qualities. These qualities help an observer recognize an object that is admirable. But they are not easily recognizable because of the experiences and training that each observer possesses. An issue with this view is that there can be a wide variety of legitimate opinions. One person claims an object is mildly beautiful, while another claims the same object is supremely beautiful. Both views are based on their perceptions of the same object’s aesthetic qualities. Some might say that one person’s taste is more refined. However, there are two residual questions. Which person’s taste is refined? Plus, Jerrold Levinson raises the question about what might motivate someone to cultivate her taste to be able to perceive the finer aspects of an aesthetic object. Answers to questions about the right observer and the right object never seem to lead to a concrete answer, which creates problems for theories of taste.

In the 18th century, many connected taste with a robust account of moral goodness. With that connection dismissed by many over the last century, theories of taste, along with theories of beauty and sublimity, suffered as well. The early 21st century, however, has brought a renewed interest in several related areas: ideas of beauty with people like Roger Scruton and Nick Zangwill, the sublime with Emily Brady, and aesthetic experience with Richard Shusterman. This reappearance suggests that these traditional aesthetic concepts were perhaps ignored for too long. Thus, taste might also have the possibility of new life in the 21st century.

6. References and Further Reading

a. Primary Sources

  • Bullough, Edward. “‘Psychical Distance’ as a Factor in Art and an Aesthetic Principle.” The British Journal of Psychology, vol. 5, no. 2, 1912, pp. 87–118.
    • This article presents his most famous idea: psychical distance.
  • Burke, Edmund. A Philosophical Enquiry into the Origin of Our Ideas of the Sublime and Beautiful. London, 1757.
    • The earlier version has his essay “On Taste,” which presents his main ideas concerning taste.
  • Cooper, Anthony, Third Earl of Shaftesbury. Characteristics of Men, Manners, Opinions, Times. London, 1711.
    • The section called “The Moralists” is where Shaftesbury spells out much of his view of taste.
  • Herder, Johann Gottfried. Selected Writings on Aesthetics. Edited and translated by Gregory Moore, Princeton University Press, 2006.
    • This is a compilation of Herder’s works on aesthetics, and a main discussion of taste is found in the chapters called “Critical Forests: Fourth Grove” and “The Causes of Sunken Taste.”
  • Hume, David. “Of the Standard of Taste.” In Four Dissertations, Edinburgh, 1757.
    • Hume introduces his notion of the ideal judge in this essay.
  • Hutcheson, Francis. An Inquiry into the Original of Our Ideas of Beauty and Virtue. London, 1725.
    • Section VI develops his belief that people have a universal sense of beauty.
  • Kant, Immanuel. Critique of Judgment. Berlin, 1790.
    • His discussion of the four moments is of particular importance to this topic.
  • Mendelssohn, Moses. Philosophical Writings. Berlin, 1761.
    • In the section “On Sentiments,” Mendelssohn (or his Theocles) talks about how he prepares himself to experience art and beauty.
  • Plotinus, The Enneads.
    • In the first Ennead, tractate 6, section 1, Plotinus discusses beauty, especially his belief that symmetry cannot be the only requirement of beauty.
  • Schopenhauer, Arthur. The World as Will and Idea. Leipzig, 1819.
    • His major work dealing with the major branches of philosophy, but Book 3 (Volume 1) is where he focuses on aesthetics.

b. Secondary Sources

  • Beardsley, Monroe. Aesthetics from Classical Greece to the Present: A Short History. The University of Alabama Press, 1966.
    • A very accessible history of the development of aesthetic ideas.
  • Cahn, Steven M. and Aaron Meskin. Aesthetics: A Comprehensive Anthology. Blackwell Publishing, 2008.
    • This is one of the best anthologies for the history of aesthetics, incorporating selections from most of the main philosophers throughout history.
  • Carruthers, Mary. The Experience of Beauty in the Middle Ages. Oxford University Press, 2013.
    • Chapter 4 offers an insightful analysis of how taste rose to prominence during the medieval period.
  • Dickie, George. The Century of Taste: The Philosophical Odyssey of Taste in the Eighteenth Century. Oxford University Press, 1996.
    • An excellent resource on five of the major philosophers on taste: Hutcheson, Gerard, Alison, Hume, and Kant.
  • Gaut, Berys and Dominic McIver Lopes, editors. The Routledge Companion to Aesthetics. 3rd ed., Routledge, 2013.
    • This is a great resource for an introduction to a wide array of issues in aesthetics, but Carolyn Korsmeyer’s entry on “Taste” is most relevant for this article.
  • Neill, Alex and Aaron Ridley, editors. Arguing about Art: Contemporary Philosophical Debates. 2nd ed., Routledge, 2002.
    • This book features some competing arguments on a variety of issues, but offers a helpful exchange about whether food is art in Part 1.
  • Wenzel, Christian Helmut. An Introduction to Kant’s Aesthetics: Core Concepts and Problems. Blackwell Publishing, 2005.
    • A very accessible explanation of the main ideas of Kant’s aesthetic theory.

 

Author Information

Michael R. Spicher
Email: mrspicher@massart.edu
Massachusetts College of Art and Design
U. S. A.

History of Love

What is love? We all wish to have the answer to one of the most universal, mysterious, and all-permeating phenomena on this planet. And even if we perhaps have a special feeling and intuitive insight that love “is related to everything else, but near things are more related than distant things,” as Waldo Tobler said, we still have not found and offered a full or finite definition of this multifaceted, dynamic, creative and all-encompassing phenomenon that is love. Another view, held by Spinoza, is that love elevates us up to an expansive love of all nature. For him, an act of love is an ontological event that ruptures existing being and creates new being.

However, since love is an ontological event, the creation of new being also coincides with different concepts throughout history, since each period brings a new way of being and living. Thus, each period in history offers a prevailing concept of love: in ancient, pre-Socratic times, we have Empedocles’ Love (Philotes) and Strife (Neikos); in Socratic times, Plato’s Eros and Aristotle’s Philia; in the Christian period, St. Paul’s Agape and St. Augustine’s Caritas; in the Enlightenment, Rousseau’s notion of a modern romantic pair, Emile and Sophie; in modern times, Freud’s love as transference; and finally, in postmodern times we tackle the notion of duties to children. These concepts of love are not always independent of one another, as later philosophers often incorporate earlier conceptions into their own interpretations.

Table of Contents

  1. Presocratic Period
    1. Empedocles
  2. The Classical (Socratic) Period
    1. Plato
    2. Aristotle
  3. Christian Period
    1. St. Paul
    2. St. Augustine of Hippo
  4. The Enlightenment Period
    1. Rousseau
  5. The Modern and Postmodern Periods
    1. Sigmund Freud
    2. Duties to Children
  6. References and Further Reading

1. Presocratic Period

a. Empedocles

Empedocles was a Sicilian, a high-born citizen of Acragas, and one of the pre-Socratic philosophers, a group that also included Heraclitus and Parmenides. Empedocles is the last Greek philosopher who wrote in verse, which suggests that he knew the work of Parmenides, who also wrote in verse. Empedocles’ work should be understood in relation not only to Parmenides’ but also to Pythagoras’ and that of the Sensualists, who emphasized the importance of our senses. On the other hand, Empedocles’ notion of Love and Strife as the fundamental cosmic forces on which his cosmology and ethics rest is an original thesis that no later philosopher continued (in some ways Freud was the only one who used Empedocles’ notions of Love and Strife, in his writings on Eros and Thanatos).

In Empedocles’ cosmology, Love stands as a cosmic, consistent principle due to which the world exists through mixing of the elements (earth, air, fire, and water), or as he says:

From these [Elements] come all things that were and are and will be; and trees spring up, and men and women, and beasts and birds and water-nurtured fish, and even the long-lived gods who are highest in honour. For these [Elements] alone exist, but by running through one another they become different; to such a degree does mixing change them. (Fragment 21)

For Empedocles, elements are like letters in an alphabet: they can form different types of matter in the same way that a limited number of letters can be combined into different words, or basic colours can be used to create different hues and patterns. The causes of this mixture and of these combinations are the cosmic forces of Love (Philotes), the force of attraction and combination, and Strife (Neikos), the force of repulsion and fragmentation. These two forces are engaged in an eternal dialectic and they each prevail in turn in an endless cosmic cycle:

I shall tell thee a twofold tale. At one time, it grew to be one only out of many; at another, it divided up to be many instead of one. There is a double becoming of perishable things and a double passing away. The coming together of all things brings one generation into being and destroys it; the other grows up and is scattered as things become divided. And these things never cease continually changing places, at one time all uniting in one through Love, at another each borne in different directions by the repulsion of Strife. (Fragment 17)

This cycle of Love and Strife consists of four phases in a Sphere: two full phases, one governed by Love and another by Strife, and two transitional phases: a phase from Strife to Love, and a phase from Love to Strife. In the beginning, the Sphere was filled with Love and the four elements were so close together that we could not discern them. After some time, however, Strife came into the Sphere and Love started to flow out of it. When Strife gained enough concentration in the Sphere, it resulted in the movement and fragmentation of the four elements into separate forms.

But it seems that Empedocles needed “evolution” (development) in his cosmology and the ensuing dynamic movement of the cosmos, so he introduced movement through two transitional phases: phases from Love to Strife and from Strife to Love. In this way, he got a third phase in which, as a consequence of the previous phases, Love regains power through coming into the centre of the Sphere, while Strife moves to its margin. And then, in the fourth and last phase of the cycle, Strife returns to the centre, and Love moves to the margin. This process then repeats over and over again. The idea of Love and Strife moving in and out of the Sphere may be an echo of Empedocles’ medical knowledge (he was a well-known physician), especially of the function of the heart. Thus, according to Empedocles, the world exists in continuous movement through different phases of a cycle, in which a certain type of stability exists in eternal elements. And it is precisely this continuous movement of the elements which produces a continuous state of organic evolution and from which all beings originate.

2. The Classical (Socratic) Period

a. Plato

Plato, born into an aristocratic family, was not only a philosopher but also a mathematician, a student of Socrates, and later, a teacher of Aristotle. He was the first to lay the foundations of Western philosophy and science. He also founded the first known academy, which can be considered the first institution of higher education in the Western world.

Plato’s most important ideas on love are presented in the Symposium (although later, in the Phaedrus, he revised his abstract outlook on love as universal Ideas of Truth, Beauty, and Goodness to accommodate the erotic and “subjective” aspects of ideal Love). In the Symposium, meaning a feast, he presents seven speeches about love, passing from speaker to speaker as they sit at the table. He introduces seven speakers, who represent five types of love known at that time, with Socrates offering a unique and new philosophical concept of love that he learned from Diotima, and Alcibiades, the final speaker, presenting his own love experience with Socrates.

Phaedrus, who is the “father” of the idea of talking about love, claims that Love is a God, and is one of the most ancient Gods. According to Hesiod, Love was born to Chaos and Earth. Love gives us the greatest goods and guidance. Phaedrus prefers love between an older man (erast) and a young boy (eromenos) because it encourages a sense for honour and dishonour (shame), two necessary virtues of citizenship, for love will convert the coward into an inspired hero who will, for instance, die for his beloved.

Pausanias, who was sitting next, speaks next. He says that Phaedrus should have distinguished heavenly and earthly loves. The first has a noble purpose, delights only in the spiritual nature of man, and does not act on lust. The second one is the love of the body, and is of women and boys as well as men. And when we are in the domain of earthly love, which operates on lust, we can see the powerful influence that pursuing sexual pleasure has on a person’s actions and life: we become slaves to our passions and subservient to others, a distinct threat to freedom and thus a happy life.

Aristophanes comes next, but he has the hiccups and requests that Eryximachus the physician either cure him or speak in his turn. Eryximachus does both, and after prescribing for the hiccups, speaks as follows: he agrees with Pausanias that there are two kinds of love; furthermore, he concludes that this double love extends over all things: animals and plants, as well as humans. In the human body lie both good and bad love, and medicine is the art of showing the body how to distinguish the two.

Aristophanes is the next speaker. He argues that “original” humans used to be beings with two faces and four arms and legs, but we were cut in two by Zeus because of our arrogance and disobedience of the Gods. Since then people go around the world seeking their missing half. Eros, the God of love, is here to assist us in finding this missing half, who is our spiritual kin. Aristophanes also claims there were three genders of the original human beings: male (two males), female (two females), and androgynous (male-female). Males were descended from the sun, females from the earth, and androgynes from the moon. Thus, Eros’ task is to make our race happy again through our completion and return to the original state. However, making us complete again is not as easy a task as we would expect. When Zeus cut people in half, they were at first cut in such a way that the halves could not sexually merge; they were able only to kiss and hug and were kept in this unsatisfied situation until they died. For this reason, Zeus gave them sexual organs. Sexual organs enabled the halves to merge in coitus and, at least for a little while, release the halves from the tension of their desire for each other. Martha Nussbaum, however, has observed that this option pushes people to live within a domain of repetitive needs and desires which distract them from other business in life. It is very difficult to meet such halves, and an even bigger puzzle is how we would recognize them (what are the signs of meeting the right half?) (Nussbaum, 2001).

Socrates, being aware of this problem with Aristophanes’ Eros, offers a response to Aristophanes and claims that a) “love is neither love for the half nor the whole, if one or the other has not some good, beauty and truth” (Plato, 1960, p. 94); and b) love, or Eros, is primarily a relationship between a knowledge-lover (philosopher) and ultimate knowledge/wisdom (Love which is Goodness/Beauty/Truth and part of the Heaven/Angelic domain). Thus, our love is based on the notion that the aim of love is not a person but something immaterial (the ultimate Heavenly Ideas of Goodness/Beauty/Truth), which provides an anchor within ourselves. And how can we achieve this? The next four steps up the ladder from the material towards the immaterial will show us. But before we introduce the four steps upwards into the angelic domain, we must say that the originator of the theory of Eros is not Socrates, but the Greek priestess Diotima. Socrates says that he merely repeats what he was told by her, and that is

(a) the general description of Eros or love is a desire for something that we do not have—we desire what we lack. And what do we lack? We desire beauty, goodness, and truth. But if we desire something that we do not have—does that mean Eros is ugly, bad, and foul? Eros is neither beautiful nor ugly, neither good nor bad, neither wise nor stupid, neither God nor mortal: Eros is something in between: Eros is an intermediary power, transferring prayers from men to gods and commands from gods to men. We must also distinguish Eros from a beloved one, because Eros is the loving one. And such a notion of Eros resembles the position of a philosopher: “Sophia (wisdom) is one of the most beautiful things in the world. Sophia is the love of wisdom; therefore Eros must be a philosopher, that is a lover of wisdom who stands in between the fair and the foul, the good and the bad, the ugly and the beautiful.” (Plato, 1960, p. 96).

(b) If Eros desires the beautiful, then the question arises: What does Eros desire of the beautiful? He desires possession of the beautiful which, if we substitute it with the good, means desire to possess happiness. And when something makes us happy, we wish to have the everlasting possession of the good. And how do we achieve that? By reproducing it. This is the reason men and women at a certain age desire to produce offspring, as with birth comes beauty and mortal men and women reach immortality.

(c) Eros, as desire of the good and beauty, brings forth a desire for immortality; this principle extends not only to men but also to animals. This is also why parents love their children, for the sake of their own immortality, and why men love the immortality of fame. Intellectuals and artists do not “create” children; instead, they conceive concepts of wisdom, virtue, and legislation.

(d) Thus, men who are concerned more with the physical level take care of children and love a woman, and those who are concerned about the spiritual level take an interest in justice, virtue, and philosophy (world of ideas of Goodness/Beauty/Truth per se), and love Man (as mankind). And how do we get to this Goodness/Beauty/Truth? Love starts with loving beautiful forms, and proceeds to beautiful minds. From minds, one can learn to love laws and institutions, then sciences; he sees that there is a single science uniting all of nature’s beauty. In knowing this, he can perceive beauty with the mind’s eye, not the body’s eye, and will know true wisdom and the friendship of God.

The last speech is by Alcibiades. We learn that Alcibiades (who is stunningly beautiful, an acclaimed war and strategic leader, winner of many prestigious awards, and praised and adored by many Athenians) is in love with Socrates. He fell in love because as he said: “I have heard Pericles and other great orators, and I thought that they spoke well, but I never had any similar feeling…. He is the great speaker and enchanter who ravishes the souls of men; the convincer of hearts, too” (Plato, 1960, p.104). So Alcibiades was surprised that beneath an ugly and neglected appearance there was great treasure, and he explains his love for Socrates by first comparing him to the busts of Silenus, and secondly, to Marsyas the flute-player. “For Socrates produces the same effect with the voice which Marsyas did with the flute—he uses the commonest words as the outward mask of the divinest truths with which he touches the soul” (Plato, 1960, p. 105). Then he proceeds: “Socrates is exactly like the busts of Silenus, which are set up in the statuaries and shops, holding pipes and flutes in their mouths; and they are made to open in the middle, and have images of gods inside them … and we will learn that his words hold the light of truth, and even more, that they are divine.” (Plato, 1960, p. 106).

This uniqueness of Socrates is his main attraction. According to Lacan, however, we should consider a bust as an agalma—a source (or rather an object) of a lover’s desire or desire of (his) love.

A particular agalma someone sees in the other is that something he desires in this and not in the other person. Desire as such points towards a peculiar object (of desire) because it emphasizes and chooses exactly this and not any other object and makes it incomparable and incommensurable with the others. (Lacan, 1994, p. 16)

And that desire aims strictly for a subjective and particular choice (or projection), maybe not reflecting something real in the person at all, as Socrates reveals with his “mysterious” reply to Alcibiades: “But look again sweet friend, and see whether you are not deceived in me. The mind begins to grow critical when the bodily eyes fail and it will be a long time before you get old” (Plato, 1960, p. 107). So, Socrates wanted to show Alcibiades that what he has sought and loved in him is actually in himself as well. Discovering your true self gives you the greatest self-satisfaction and, at the same time, knowledge of how to become a better person; and this treasure can be shared with others, too, becoming good, beautiful, and truthful—something Socrates did by calling his endeavour a midwifery, that is, helping others bring forth into the light what was already in themselves.

In Plato’s second work on love, Phaedrus, he discusses another notion of love. He begins this work by denying the good of any love because he connects it with irrational behaviour conditioned by lust and desire. Sometimes a lover acts against the good of the beloved because of his desire, jealousy, possessiveness, and envy, and sometimes he acts even against himself when, as a rejected lover in the worst-case scenario, he takes his own life. For these reasons, Socrates favours a friend over a lover. Socrates thinks that if a lover behaves against his own or his beloved’s good, then Eros cannot be a God. After all, a God should do men good and should uplift lovers into the realms of Heavenly bliss. Socrates, however, a little later on, changes his mind and says that he was wrong to state that Eros is not a God. In fact, Eros is connected with the true love(r). “The ‘true lover’ has a mania for the good, and this kind of mania, coming from the divine, is superior to human self-control of irrational passions … and is an expression of the desire of the immortal soul, which has experienced the supreme good/beauty of the divine and wants to reclaim it.” (A. H. Kissel).

The soul, however, has elements that are rational, harmonious, and good and elements that are disharmonious, aggressive, and bad, which are like the “good horse” (metaphorically presented as a white horse) and the “bad horse” (metaphorically presented as a black horse) that must be driven in concord; when these elements are disordered, the soul loses its wings and takes on a mortal body (Plato, 1963). “The goal of the incarnated soul is to learn how to manage the ‘bad horse’ through habitual reining-in, in order that its wings grow again; the soul must regain self-control and true knowledge” (A. H. Kissel). But many souls mistake “their own opinions for true knowledge” (Plato, 1963, 248b). Souls which have better and deeper knowledge and understanding of our heavenly origins and are in better accord with their heavenly nature are incarnated as better beings. Accordingly, the true lover of wisdom and the good, that is, the philosopher, stands at the top of all men. The same holds for an artist (the true lover of beauty). Others follow in this order: the just king, the statesman, the doctor, the prophet and priest, the representational artist (poet), the manual labourer, the sophist, and last, the tyrant. The just are reincarnated to a higher level, and the unjust to a lower level, until the wings grow back and heaven is regained. True and divine love occurs when a lover meets his lover on the same level (as lovers are like mirrors to each other), which is why Socrates states that people who attract one another do so because they are followers of a certain deity who help each other to ascend. (That is the reason why, for instance, people who love wisdom and justice follow Zeus, the ones who love royal treats follow Apollo, the ones who like to fight follow Ares, and so on.)
But most importantly, a “true love is a divine one as far as it is connected with virtue, justice, modesty, inspiration, enthusiasm and self-control, and it only occurs when lovers bring of each other their best godlike qualities” (Plato, 1963, 253b).

In the last part of the Phaedrus, Socrates states that those who know divine love also know how to discern a good speech that conveys truth, goodness, and beauty from a false one by drawing on the analogy of irrational and true love as stated above. “Writing speeches is not in itself a shameful thing. It’s not speaking or writing well that’s shameful; what’s really shameful is to engage in either of them shamefully or badly” (ibid., 258d).

b. Aristotle

Upon Plato’s death, Aristotle left for Assos in Mysia (in present-day Turkey), where he and Xenocrates (c. 396 B.C.E.-c. 314 B.C.E.) joined a small circle of Platonists who had already settled there under Hermias, the ruler of Atarneus. Later, under the protection of Antipater, Alexander’s representative in Athens, Aristotle established a philosophical school of his own, the Lyceum, also known as the Peripatetic School due to its colonnaded walk.

Aristotle speaks about love mostly in Nicomachean Ethics, books VIII and IX. He speaks about Philia (friendship-like love) as the highest form of spiritual love and having the highest spiritual value. This kind of friendship is friendship of the same and not based on any external benefits. It is led by reciprocal sympathy, support and encouragement of virtues, emotions, intellectual aspirations, and spirit. “For all friendship is for the sake of good or of pleasure-good … and is based on a certain resemblance; and to a friendship of good men all the qualities we have named belong in virtue of the nature of the friends themselves….” (VIII:3, 1156b, trans. Ross). We can’t have many such friends, however, because our time is limited.

But when Aristotle says that a person needs to abandon his Philia for a friend if he changes or becomes vicious, this does not mean that he terminates friendship due to his own interest. He means that it happens because one of the friends realizes that he can’t do anything to contribute to the goodness of the other. He describes an example when we cannot talk of a true honest friendship any longer—when friendship is based only on pleasure and benefit. In the case of friendship based on benefits, friends are used only as a means to achieve a certain purpose (some goods, whether symbolic or material) and those who are together with others only for pleasure do not love the friend for his own sake but for their own pleasure. Such friendships cannot last long because when the reasons for friendship vanish, the friendship itself disappears. Friendships formed on the basis of pleasure or benefit can be formed between two bad people or between good and bad people, but true friendship can be formed only between two good people. Good people are friends because they themselves are good. Bad people do not feel any pleasant feelings towards a friend unless he offers some kind of benefit. According to Aristotle, friendship does not show only the values and preferences of the society and the country, but also, more importantly, the moral character of a person.

Friends who love each other love in them what they themselves believe to be of value:

We love in friends that which represents a value for us—a friend is a representation of a certain value. Thus, when a good person becomes our friend he himself is of value to us. Friends receive and give the same amount of good wishes and time, and feel the same joy or happiness in each other. True friendship is equality in all aspects, as a true friend is another self. (VIII:3, 1166a–1172)

And what does Aristotle say on the relationship between man and woman, as seen in Book VIII? Friendship between men and women, in his eyes, seems to exist by nature, and humans tend to form couples more than they form cities, as the household came earlier and is just as necessary as the city. Other animals unite only for the purpose of reproduction, but human beings live together also for other purposes of life. However, Aristotle still reasoned largely within the biological domain, meaning that for him

… from the start the functions are divided, and those of man and woman are different; so they help each other by throwing their peculiar gifts into the common stock. It is for these reasons that both utility and pleasure seem to be found in this kind of friendship. But this friendship may be based also on virtue, if the parties are good; for each has its own virtue and they will delight in the fact. (VIII:12, 1162a)

And children seem to be a bond of union; for “children are a good common to both, and what is common holds them together” (VIII:12, 1162a, 14–31). Parents love their children as they love themselves, and children love their parents because their being comes from them. Siblings love each other because of their common parentage. The friendship between siblings and kinsmen is like being comrades. The friendship between parents and children is much more pleasurable than other friendships due to the long sharing of lives. However, friendship between parents and children is not equal, as they have contributed different things to the relationship and the parents hold a superior position. The same, Aristotle thinks, holds for man (husband) being superior to woman (wife). However, even Stoics a little later on thought of man and woman, husband and wife, as equal since we are all endowed with a divine mind/spirit. Being loved is desirable in itself, preferable even to being honoured.

3. Christian Period

a. St. Paul

St. Paul is the most important of the Apostles who taught the Gospel of Christ in the first century. Fourteen epistles in the New Testament have been credited to Paul. Seven are considered to be genuine (Romans, First Corinthians, Second Corinthians, Galatians, Philippians, First Thessalonians, and Philemon), three are doubtful, and the final four are believed not to have been written by him. Paul’s works contain the first written account of what it means to be a Christian and thus the first account of Christian spirituality.

St. Paul is best known for his letters to the Romans and the Corinthians. In the Letter to the Romans he says: “For with the heart, one believes resulting in righteousness; and with the mouth confession is made resulting in salvation” (Romans 10:10, World English Bible). One who speaks about faith in God makes others happy, offers consolation, and invites other people onto the path of Jesus Christ; and one who talks about God and His revelation, recognition, prophecy, and teaching is building a church of God. Through annunciation of the holy wisdom he addresses those ready to be redeemed and consecrated into eternal life through love, hope, and faith and by leaving behind their carnal body. According to St. Paul there exist two bodies: the carnal (lustful) and the heavenly (pure) within a unity called God’s temple or the Holy Spirit.

But what is spiritual and heavenly cannot be seen with the eyes nor heard with the ears. “However, we acquire a spiritual body only through the death of the carnal, sensual body. We have a carnal body which needs to die in order to allow a spiritual body to be born through Jesus Christ, crucified God” (Nygren, 1953, p. 203). But this raises a paradoxical question: how did we come to this transient world if there is no other God; are things flowing into the world from two different sources? We should approach the God who is (in) this world and more than this world differently from our perspective of death, law, desire, knowledge, and power. Instead, Paul talks of grace, faith, love, and hope. Jewish religion and tradition, for instance, maintains that God is a transcendence which cannot be attained by men; however, in Christianity man can reach God through becoming like Christ on the Cross. The resurrection of Christ is an event which broke the law of death and enabled a new life with God and in God through the grace of God.

And essential for this new life is unconditional love (Agape), which people were given as a gift by Jesus Christ. Christ sacrificed himself for all people; all we have to do is to open up to his love. And what is Agape? St. Paul in his First Letter to the Corinthians says:

Love is patient and is kind; love doesn’t envy. Love doesn’t brag, is not proud, does not behave itself inappropriately … does not rejoice in unrighteousness, but rejoices with the truth; bears all things, believes all things, hopes all things, endures all things. Love never fails. (1 Corinthians 13:4–8)

Christ is the only source of love in the world that combines words (thoughts) and actions and gifts. If we did not experience the unconditional love that was found through the crucified Christ, we would not know God’s love in the Christian sense of the word. Paul sees in Christ on the Cross an event of sacrifice, in fact God’s own sacrifice. God’s love is not one that desires but one that gives. With this Paul emphasizes the spontaneous and altruistic nature of God’s unconditional love (Agape), which was manifested in Christ’s death for the poor, weak, ill, foreigners, enemies, and atheists.

Agape, as a self-sacrificed love, is reflected in the commandments:

“You shall not commit adultery,” “You shall not murder,” “You shall not steal,” “You shall not covet,” and whatever other commandments there are, are all summed up in this saying, namely “You shall love your neighbour as yourself.” Love doesn’t harm a neighbour. Love therefore is the fulfilment of the law. (Romans 13:9–10)

This law of God’s universal love, which is mapped onto the love for your neighbour as love for yourself, Paul thus defined as undivided and undefined faith with the fewest number of laws/prohibitions possible.

Concrete implications of God’s unconditional love can be seen also in the relationship between man and woman. According to Paul, women are mysterious, dark, and penetrable, while men are open, light, and penetrating, but in the face of God all people and beings are equal: men, women, Jews, Greeks, Christians. “Let the husband give his wife the affection owed her, and likewise also the wife her husband. The wife doesn’t have authority over her own body, but the husband. Likewise, also the husband doesn’t have authority over his own body, but the wife” (1 Corinthians 7:3–7:5).

God in general prefers asceticism and celibacy. However, good Christians need to give these up if they wish to marry and have children. Thus, God allows sexual intercourse but only for having children, because reproduction serves to continue the human species and does not encourage sin and desire for pleasure of the flesh. On the other hand, Christianity produced the difference between men and women by stating that man is better than and above woman: “But I would have you know that the head of every man is Christ, and the head of the woman is man, and the head of Christ is God” (1 Corinthians 11:3). It is obvious that in this view woman and man are not equal as stated, and this led to a long road of female subjugation, injustice, and suffering.

b. St. Augustine

St. Augustine was an early Christian theologian whose writings were very influential in the development of Western Christianity and Western philosophy. He was on one hand Plato’s follower, and his critic in the light of neoplatonism, and on the other hand he was an interpreter of Christian teachings, especially those of St. Paul and other apostles. He was the first to create and establish a concept of love that included Eros and Agape in the form of Caritas.

Greatly influenced by Neoplatonist versions of the Symposium and his studies of Agape, St. Augustine in his early period described a positive paradigm of Christian life, in the sense of Agape through different stages, in works such as De Quantitate Animae and De Genesi contra Manicheos. In these works, he fights against the teachings of the Manicheans, who were inspired by Mani (3rd century C.E., in Babylonia). Later on, however, he refutes this kind of Platonic ascension and develops his own combination of Christian Agape and Platonic Eros, which is neither Eros nor Agape, but Caritas. What is the reason for Augustine’s combination of Eros and Agape? Where does he see a flaw in Eros that must be repaired by Agape? The answer lies in pride (superbia), which is connected with Eros.

He writes in Confessions: “When the soul ascends higher and higher into the spiritual realm, person starts getting a feeling of pride and self-sufficiency which makes that person stay within himself instead of reaching beyond the self towards the heavenly” (Augustine, 1960, p. 39). This is because man cannot reach heaven by himself. Although Platonic Eros presents love built on human will, power, and knowledge (which will bring us to the heavenly domain of the Ideas), to Augustine this is false, and only God himself can free and redeem us, as Augustine states in his famous work City of God: “In order to heal human pride, God’s son descended to show the way to become humble” (Augustine, 1994a, p. 273: VIII:7) and continues: “… pride is the beginning of the sin … Therefore, humbleness is highly advised in the city of God” (XIV:13). This is the reason why the Christian spirit emphasizes humbleness (humilitas), which is Jesus Christ. Augustine saw the remedy for Eros’s pride and self-sufficiency, which prevent Eros from reaching its goal, as God’s love or Caritas.

And what is God?

All people see God as the highest, most beautiful, the brightest, eternal, wise, good, true and truthful entity who ever existed at all. No one on the Earth possesses the features God has. He is life itself, pure love and the origin of everything that is: God … gives preference to that which lives before to that which is dead and he is the highest Good (Summum Bonum). (Augustine, 1994a, p. 524, note 1).

Even more, death is the biggest enemy of the heavenly kingdom, therefore Augustine concludes that: “… life will be truly happy when it is going to be eternal” (Augustine, 1994a, p. 25). Hannah Arendt correctly observes that such a concept of love was defined in two steps: “First, that which is good is an object of yearning, i.e. something useful which can be found in this world and we hope to get into everlasting possession. In the second, good is defined through fear of death and destruction” (Arendt, 1996, p. 12).

Augustine’s introduction of the human soul’s yearning for the highest good (Summum Bonum) and eternal life reveals an additional difference between Man and God. Namely, people are, contrary to God, made creatures, and live solely through him. Man, as a made creature, does not possess his own bonum but needs to find it, which is achieved through love as a yearning to acquire the good. Happiness is thus having this good and keeping it in our life. Desire and yearning are thus signs of a created being, whereas God himself is without desire and lives according to himself and through himself. Such a God is self-sufficient and autarkic. The fundamental difference between God’s creatures and their Creator lies in the metaphysical difference between eternity and time. Creatures belong to the world of transitions: created beings never fully exist (the past is gone, the future is yet to be), and they exist only in the now, which soon turns into the past. What truly exists is only the now which is not in time but in eternity, which is God.

However, this is not the whole story of love, because Augustine divides love into that which is good/proper/right and that which is bad/false, according to the object desired—the choice of the object is very important because we become what we love. Therefore, if a loving one chooses created and transient objects of this world, we have love called Cupiditas; if he chooses an eternal and non-created object (God), we have Caritas.

4. The Enlightenment Period

a. Rousseau

Jean Jacques Rousseau was a philosopher, pedagogue, composer, writer, and one of the first autobiographers in the world. His political ideas were highly influential for the French Revolution and later for socialism and even nationalism. In his early writings, Rousseau claimed again and again that human nature was corrupted by the habits and manners of society in the big cities, which made people shift from natural (moral, political, spiritual) values to artificial and immoral values, based only on looks, superficial talk, material goods, and civil and cultural conventions. Rousseau notices this corruption on social and personal levels in the relationships between men and women, thus he suggested a new way to form loving relationships.

In Julie, or the New Heloise, we follow a romantic and tragic love story between Saint-Preux and Julie. According to Rousseau, a man and a woman seal their love in marriage when they feel that they cannot change what they feel for each other: “We share the same picture of the world…. we have the same outlook on the world and why would I not believe that what we share in our hearts we also share in the level of our beliefs and judgements” (1984, book 1, p. 65). Another important component of true love is benevolence: “Man can resist almost anything but benevolence, and in order to get benevolence you give it” (ibid, p. 190). And there exists yet another feature of love: enthusiasm, which not only provides lovers and partners with enormous energy, but also drives them beyond themselves and towards the ideal of perfection and highest moral virtue. For Rousseau, love is goodness that works for and has its origin in a balanced nature of a person. Love originates in a good-natured person from a balanced combination of our instincts, heart, mind, and soul: what the heart feels, the mind confirms. Reason is also important for love, so that lovers know how to lead and handle their needs and desires properly.

However, what has not been said so far is that Saint-Preux was at first Julie’s teacher and, to his surprise and despite all they felt and discovered, she later married the older, wealthy, and educated de Wolmar, and they all lived on a property called Clarens. Even more interesting is that Rousseau wrote a love story in which, even after Julie gave birth to two children, she remained in love with Saint-Preux and later admitted her affair to de Wolmar, who was saddened upon learning this fact but continued to love her nonetheless. But why did Rousseau place an obstacle in the way of Saint-Preux’s and Julie’s love, and why did Julie agree to marry the older and wealthy de Wolmar? Jean Starobinski in his book Transparency and Obstruction provides a plausible insight:

By introducing a marriage with older, de Wolmar, and having children with him, Rousseau simply tried to include “all” into a new kind of society he envisioned, in which no one would be left out: Julie would fulfil her parent’s wishes and comply with the moral order of that time, de Wolmar would get the girl he wanted, Julie continues her pedigree and Saint-Preux and Julie remain in love: what we find again in a higher level is a new love and new society which coincide. Erotic demand and demand for order are eventually in peace with each other…. In the refreshed society benevolence and gentle sympathy rule, and this is the result of a total transparency of consciousness of the people living at Clarens. (1988, p.104)

All this sounds ideal, and we would expect that we have reached the final level of true love and community. However, we are faced with yet another surprise: Julie’s death at the end. Why would Rousseau want Julie to die? Julie dies because she had fulfilled the duty of the moral-social order but not her personal wish for a happy life together with the one she truly loved. The last words of Julie to Saint-Preux clearly reveal this: “No, I am not leaving you, I go to wait for you. The virtue that set us apart on earth will bring us back together in the eternal home” (ibid., p. 409).

But if Rousseau showed us the tragic-passionate love in Julie, he clearly set up a description of a marriage in his famous work Emile: Or On Education where he, for the first time in Western society, describes a basis for a free romantic love, sealed in marriage without the pressure of social moral order or duty.

Rousseau in the first half of Emile presents the whole physical, emotional, rational, and spiritual upbringing of a child (Emile), according to which pedagogy as a field came into existence. This article won’t go into that, but will briefly present the fifth book of Emile and Rousseau’s opinion of the love between the pubescent Sophie and Emile. At this age they are both mature enough to meet and know each other and to seal their love in marriage. It is clear from the start that Rousseau does not promote equality of men and women, but sees them as complements to one another in the eyes of nature. And from the nature argument he infers that a man is (or should be) superior and a woman inferior, as they both serve the same end, their union and reproduction, but in different ways; each with their own means, capabilities, and contributions. And it is based on this inference that Rousseau proposes the first moral difference between genders: a man is active, bright, strong, a leader, proud, and a penetrator, and a woman is passive, dark, penetrable, weak, a follower, modest, and full of grace; a man needs to have power and will (and needs to develop musculature), and a woman needs to not offer too much resistance but instead possess grace and charm with which to seduce. A man, Rousseau says, is more of the head (reason, intelligence, knowledge) and spirit, while a woman is more in tune with the heart, body, and intuition. A man is made for ruling and the public sphere, and a woman for obeying and the domestic sphere: she needs to learn how to bring up children and please her husband as this is her task and the reason for her origin (design). Her domain is the house, children, husband, and garden, as Rousseau claims, and the husband is immersed in intellectual, creative, and spiritual matters and matters of controlling, manipulating, and maintaining his “garden.” A man also needs to learn how to please his wife, however, in order to not make her bitter and angry, because a bitter and angry wife does not fulfil her marital duties and is not a good mother.

Rousseau knew that he assigned an unequal status to men and women, yet he stated that this was due to a higher unity called family, and that the new society is built on diversity and difference as seen in nature (which to a degree resembles Aristotle’s view). In this way we can read Emile: Or On Education as some sort of guide to marriage, which was highly influential in the 18th century. But it is still unclear why Rousseau, who was so liberal and open-minded in other areas, was so conservative in gender matters.

5. The Modern and Postmodern Periods

a. Sigmund Freud

Sigmund Freud was trained in medicine (neurophysiology) and later became the founding father of psychoanalysis. Freud set up a practice in neuropsychiatry with the help of Joseph Breuer. That is how he came to know Anna O., who was Joseph Breuer’s patient from 1880 through 1882. Eleven years later, Breuer and Freud wrote a book on hysteria in which they claimed that when a client becomes aware of the meanings of his or her symptoms (as can occur through hypnosis), unexpressed emotions find release and no longer exhibit themselves as symptoms. Breuer called this catharsis, from the Greek word for cleansing, and through catharsis, Anna lost many symptoms of her hysteria. Freud also noted that Breuer and Anna were falling in love with one another. (This later served as the basis for his idea of transference love.)

One of Freud’s most amazing achievements, however, was the discovery of the processes of the unconscious mind. Freud found out from his practice that the unconscious mind signals coded messages in the form of dreams and symptoms, which must be deciphered by the analyst. Freud’s way of provoking the unconscious mind was by using rememoration or associative language, which means speaking freely until the answer to the problem surfaces. At some point, however, associative language could not provide any more answers and the language was interrupted by what Freud called resistance and silence resulted. Freud found out that this silence serves as a birthplace not only for love, but also for our drives (Freud, 1995). Love is that which starts showing itself through language and moves to that which is beyond language—into drives.

And what is a drive that is not an animal instinct? In his famous work Three Essays on the Theory of Sexuality (1905), Freud tells us that drive presents itself without words, mostly through crying and meaningless shouts, as some sort of stream of energy where there are no borders between subjects and objects. These shouts reach their limit with the use of swear words. Just after swear words we come to the border, and when it is crossed language appears and the drive disappears. Subjectivity, reflection, and distance appear and the drive is transformed. The border can be crossed from the other side: when words are without power and the subject disappears, it makes a space for an uncontrolled stream of energy, which flashes away the distance and intermediary and enables a state that is solid and liquid at the same time.

Where does drive originate? Freud sees drives as a borderline between our body and psyche, composed of four components: on one side, we have the pair of tension and pressure and, on the other side, the pair of aim and goal. The first two have physical bases and the other two psychological bases. The overall source of drive, however, lies in our body, which is a combination of sexual organs, genes, and hormones that all form some sort of energetic tension inside the body, which can be released with heterosexual intercourse. But Belgian psychoanalyst Paul Verhaeghe in his work Love in a Time of Loneliness (1999) is against this notion of drives because, in his opinion, it ignores two important aspects of drives: each drive is always partial and autoerotic. Consequently, he thinks that a drive is neither heterosexual nor homosexual. When he says that a drive is partial, he means that something in particular attracts us to the other person (not necessarily of the opposite sex) and vice versa; this attraction includes different parts of the body and other activities as well, either passive or active, and does not necessarily lead to intercourse with the aim of procreation. Interestingly enough, a drive does not need the whole body, but only parts of the body, hence the different drives: oral, anal, voyeuristic, exhibitionistic, and the like. Also, all these body parts represent our contact with the external world: mouth, eyes, ears, nose, breasts, feet, genitals, and anus, which accompany activities such as smelling, watching, listening, touching, sucking, and penetration.

In the pleasure we get from our drive’s tendency to release tension, by tearing down the barriers of our ego (via sobbing, shouts, swears) and then putting them up again (via language), Freud recognizes drive’s connection to death and life. Freud named these two tendencies of each drive Thanatos and Eros, and claimed that they are intrinsically connected into a whole. The definitions of Eros and Thanatos are taken from Empedocles’s definitions of Philotes and Neikos as fundamental ontological principles. Eros carries the power of uniting different elements into a bigger unity: Eros is the union of different elements so division does not exist anymore. Thanatos is, on the contrary, a process of fragmentation, an explosion, a big bang which releases tension. According to Freud, drives aim at the pleasure of reaching the original, zero-tension, or unity of mind-spirit-body, which Lacan later calls jouissance, the energy of the highest pleasure.

Freud and, later, Lacan thought that love and successful relationships (partnership or marriage) depend on a resolution of the internal conflict between drive and desire. Freud saw this duality in the division between the pleasure of sexual drive and a desire for love. Other divisions are those between the conscious and the unconscious, between ego, id, and superego, and between the sensual, sexual, and emotional levels of our being.

Freud identifies the beginning of the duality of drive and love in the mother/child relationship, with the first activity of pleasure being a child’s sucking to drink milk. Consequently, the birth of desire, love, and yearning bears witness to these lost original first years of the child’s relationship with his mother, which serves as a matrix for all subsequent relationships, in which people try either to replicate it or to deny it and replace it with another, better one. This kind of love that we as grownups try to repeat Freud calls, as mentioned earlier, transference love. Freud came to know this through sessions with his patients who fell in love with him, although he recognized that they were not actually in love with him but had transferred onto him their original attachment to their father.

According to Freud, this first relationship with our parents (especially the mother) shows the following traits: totality and exclusivity (unity of mother and child), loss (the aforementioned totality is lost after birth, especially with the introduction of language), and power (the mother and child relationship changes and starts to include giving, receiving, rejection, forgiveness, and reparation, which are constitutive of their relationship).

In addition, in Totem and Taboo: Resemblances between the Psychic Lives of Savages and Neurotics (1913), Freud uses the story of King Oedipus to create and illustrate the so-called Oedipus Complex, in which the superego (the universal law, the law of the father), uses guilt to prevent continuation of incestuously oriented relationships between mother and child. “In Western patriarchal societies, the boy learns that a solution to the manqué of the mother lies in replacing her with the father/man and his genital organ and by promising himself that someday he, likewise, will be a big and a powerful man” (p. 48).

b. Duties to Children

At one time, it was thought that children had only duties and no rights: we used to believe that children had duties to their parents, such as to love them, obey them, and care for them when they grow old. But times change, and philosophers, sociologists, anthropologists, social workers, and others started debating the rights of children and whether parents also had duties toward their children, such as to love them. For example, philosophers such as Liao, Boylan, and Feinberg present in their articles several positions regarding duties to children related to correlative claim rights, and one of the most important of these duties is to love them. But why do they take the position that duty must correlate with claim rights, and why do they emphasize that parents need to love their children?

Children are among the most vulnerable people on the planet and are especially likely to fall into poverty and illness and to die from disease and violence. “Children are also very susceptible to violence and exploitation through child labor, land mines, war, sex trafficking, and other sorts of exploitation…. And … many children face dropout in the secondary school and even less of them go to college and university” (Boylan, 2011, p. 2). All the facts listed show that children are a vulnerable group that needs special care, love, understanding, and protection. Before we can take a justified position regarding the duties parents may have towards their children, however, we need to understand and define what love is in this regard. Matthew Liao, in his article “The right of children to be loved” (2006b), argues that children, as human beings, have the right to the essential goods, possibilities, and conditions necessary for human beings to pursue the good life, their own and others’.

Rights are powerful tools of protection and therefore having rights to the essential conditions for a good life is of primary importance to human beings. Whatever else they may want, most human beings would want to have a good life. Children being loved is one of the most essential conditions for a good life. (pp. 424–425)

Mere provision of the structural goods necessary for as many options as possible is not the best of all possible worlds. Love and doing well for the child are also necessary.

There is something odd, however, about declaring it a duty of parents to love their children. This is because love is often considered to be under the genus of emotions. Emotions are often taken to be out of one’s direct control, and “love out of inclination cannot be commanded” (Kant, 2003, p. 161). Is this completely true, and how can we reasonably argue for parents’ duty to love their children? Again Liao, in his article “The right of children to be loved” (2006b), presents a reasonable and favourable argument as to why parental love is a necessary component of parenting. One strong reason is that many children, despite “being well fed, have died or have suffered serious physical, social and cognitive harms as a result of lack of love. So, even granting that being fed is more urgent than being loved, we still should give the right of children to be loved a very high priority” (p. 25). Liao thus claims that a strong sense of warmth and affection is a crucial part of the emotional aspects of parental care and love. In this way, the claim that children need to be loved is an empirical claim.

It is also argued that children need this emotional aspect of love in order to develop certain capacities necessary to pursue a good life:

Human beings need certain basic goods, such as food, water and air in order to sustain themselves corporeally. In order to be able to pursue the good life, they also need certain basic capacities such as the capacity to think, to feel, to be motivated by facts, to know, to choose and act freely (liberty), to appreciate the worth of something, to develop interpersonal relationships and to have control of the direction of their life (autonomy). Finally, in order to exercise these capacities, they need to have some opportunities for jobs, social interaction, acquiring further knowledge, evaluating and appreciating things and determining the direction of their lives. (ibid., pp. 10–11)

6. References and Further Reading

  • Arendt, Hannah (1996). Love and St. Augustine. Chicago, IL: The University of Chicago Press.
  • Augustine (1955). Treatises on marriage and other subjects. Roy J. Deferrari (Ed.). Washington, DC: Catholic University of America Press.
  • Augustine (1960). The confessions of Saint Augustine (John K. Ryan, Trans.). New York, NY: Image Books.
  • Augustine (1994a). The city of God (Marcus Dods, trans.). Peabody, MA: Hendrickson Publishers.
  • Augustine (1994b). On Christian doctrine. In Philip Schaff (Ed.), A select library of the Nicene and post-Nicene fathers. Peabody, MA: Hendrickson Publishers.
  • Boylan, Michael (2011). Duties to children. In Michael Boylan (Ed.), The morality and global justice reader (385–405). Boulder, CO: Westview.
  • Cranston, Maurice (1991). Jean-Jacques: The early life and work of Jean-Jacques Rousseau, 1712–1754. Chicago, IL: The University of Chicago Press.
  • Feinberg, M (1980). The child’s right to an open future. In W. Aiken & H. LaFollette (Eds.), Whose child? Children’s rights, parental authority, and state power (124–153). New Jersey, NJ: Littlefield, Adams, & Co.
  • Freud, Sigmund (1913). Totem und tabu: Einige übereinstimmungen im seelenleben der wilden und der neurotiker [Totem and Taboo: Resemblances between the Psychic Lives of Savages and Neurotics]. Leipzig, Germany: Hugo Heller.
  • Freud, Sigmund (1968). Moses and monotheism. Hertfordshire, United Kingdom: The Garden City Press.
  • Freud, Sigmund (1989). Totem and taboo. New York, NY: W. W. Norton & Company, Inc.
  • Freud, Sigmund (1995). Opombe o transferni ljubezni [Comments on transference love]. Problemi, 33(1–2), 53–63.
  • Freud, Sigmund (1997). Sexuality and the psychology of love. Philip Rieff (Ed.). New York, NY: Touchstone Edition.
  • Freud, Sigmund (2000). Three essays on the theory of sexuality (James Strachey, trans.). New York, NY: Basic Books.
  • Grimsley, Ronald (1973). The philosophy of Rousseau. Oxford, United Kingdom: Oxford University Press.
  • Guthrie, W. K. C. (1956). Plato: Protagoras and Meno. London, United Kingdom: Sage.
  • Kirk, Geoffrey S., & Raven, John E. (1984). The presocratic philosophers. Cambridge, United Kingdom: Cambridge University Press.
  • Kingsley, Peter (1995). Ancient philosophy, mystery, and magic: Empedocles and Pythagorean tradition. Oxford, United Kingdom: Oxford University Press.
  • Kant, Immanuel (2003). Utemeljitev metafizike nravnosti [The metaphysics of morals]. Ljubljana, Slovenia: Založba ZRC.
  • Lacan, Jacques (1994). Sections from his work on transference. Filozofija skozi psihoanalizo [Philosophy through psychoanalysis]. Ljubljana, Slovenia: Analecta.
  • Liao, S. Matthew (2006a). The idea of a duty to love. Journal of Value Inquiry 40(1): 1–22.
  • Liao, S. Matthew (2006b). The right of children to be loved. Journal of Political Philosophy 14(4), 420–440.
  • Liao, S. Matthew (2012). Why children need to be loved. Critical Review of International Social and Political Philosophy 15(3), 347–358.
  • Martin, Alain, & Primavesi, Oliver (1998). L’Empédocle de Strasbourg (P. Strasb. gr. Inv. 1665–1666). Berlin, Germany: Walter de Gruyter.
  • Nussbaum, Martha (1986). The fragility of goodness: Luck and ethics in Greek tragedy and philosophy. Cambridge, United Kingdom: Cambridge University Press.
  • Nussbaum, Martha (2001). Upheavals of Thought: The intelligence of emotions. New York, NY: Cambridge University Press.
  • Nygren, Anders (1953). Agape and Eros. London, United Kingdom: S.P.C.K.
  • Plato (1960). Symposium (S. Groden, trans.). Amherst, MA: University of Massachusetts Press.
  • Plato (1963). Euthyphro and Phaedrus. In Edith Hamilton & Huntington Cairns (Eds.), The collected dialogues. Princeton, NJ: Princeton University Press.
  • Rousseau, Jean Jacques (1979). Emile: Or on education (Allan Bloom, trans.). London, United Kingdom: Basic Books.
  • Rousseau, Jean Jacques (1997). Julie, or the new Heloise: Letters of two lovers who live in a small town at the foot of the Alps (Philip Stewart, trans.). Lebanon, NH: University Press of New England.
  • Spinoza, Baruch (1992). The Ethics (Seymour Feldman, trans.). Indianapolis, IN: Hackett.
  • Starobinski, Jean (1988). Jean-Jacques Rousseau: Transparency and obstruction. Chicago, IL: University of Chicago Press.
  • Tobler, Waldo (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46(2), 234–240.
  • Verhaeghe, Paul (1999). Love in a time of loneliness. London, United Kingdom: Rebus.

 

Author Information

Katarina Majerhold
Email: katarina.majerhold@gmail.com
Slovenia

Understanding in Epistemology

Epistemology is often defined as the theory of knowledge, and talk of propositional knowledge (that is, “S knows that p”) has dominated the bulk of modern literature in epistemology. However, epistemologists have recently started to turn more attention to the epistemic state or states of understanding, asking questions about its nature, relationship to knowledge, connection with explanation, and potential status as a special type of cognitive achievement. There is a common and plausible intuition that understanding might be at least as epistemically valuable as knowledge (if not more so) and, relatedly, that it demands more intellectual sophistication than other closely related epistemic states. For example, while it is easy to imagine a person who knows a lot yet seems to understand very little (think of the student who merely memorizes a stack of facts from a textbook), it is considerably harder to imagine someone who understands plenty yet knows hardly anything at all.

It is controversial just which epistemological issues concerning understanding should be central or primary—given that understanding is a relative newcomer in the mainstream epistemological literature. That said, this article nonetheless attempts to outline a selection of topics that have generated the most discussion and highlights what is at issue in each case and what some of the available positions are. To this end, the first section offers an overview of the different types of understanding discussed in the literature, though their features are gradually explored in more depth throughout later sections. Section 2 explores the connection between understanding and truth, with an eye to assessing in virtue of what understanding might be defended as ‘factive’. Section 3 examines the notion of ‘grasping’ which often appears in discussions of understanding in epistemology. Furthermore, Section 3 considers whether characterizations of understanding that focus on explanation provide a better alternative to views that capitalize on the idea of manipulating representations, also giving due consideration to views that appear to stand outside this divide. Section 4 examines the relationship between understanding and types of epistemic luck that are typically thought to undermine knowledge. Section 5 considers questions about what might explain the value of understanding; for example, various epistemologists have made suggestions focusing on transparency, distinctive types of achievement and curiosity, while others have challenged the assumption that understanding is of special value. Finally, Section 6 proposes various potential avenues for future research, with an eye towards anticipating how considerations relating to understanding might shed light on a range of live debates elsewhere in epistemology and in philosophy more generally.

Table of Contents

  1. Types of Understanding
  2. Is Understanding Factive?
    1. The Factivity of Understanding-Why
    2. A Weak Factivity Constraint on Objectual Understanding
    3. Moderate Views of Objectual Understanding’s Factivity
  3. Coherence and the Grasping Condition
    1. Understanding as Representation Manipulability
    2. Understanding and Knowledge of Causes
    3. Understanding, Abilities and Know-How
    4. Understanding as Explanation
    5. Understanding as Well-Connected Knowledge
  4. Understanding and Epistemic Luck
    1. Understanding as (Partially) Compatible with Epistemic Luck
    2. Newer Defenses of Understanding’s Compatibility with Epistemic Luck
  5. Understanding and Epistemic Value
    1. Transparency
    2. Cognitive Achievement
    3. Curiosity
  6. Future Research on Understanding
  7. References and Further Reading

1. Types of Understanding

We regularly claim that people can understand everything from theories to pieces of technology, accounts of historical events and the psychology of other individuals. Consequently, engaging with the project of clarifying and exploring the epistemic state or states attributed when we attribute understanding is a complex matter. As Zagzebski (2009: 141) remarks, different uses of understanding seem to mean so many different things that it is “hard to identify the state that has been ignored” (italics added). Zagzebski notes that this easily leads to a vicious circle because “neglect leads to fragmentation of meaning, which seems to justify further neglect and further fragmentation until eventually a concept can disappear entirely.”

It will accordingly be helpful to narrow our focus to the varieties of understanding that feature most prominently in the epistemological literature. For one thing, it is prudent to note up front that there are uses of ‘understanding’ that, while important more generally in philosophy, fall outside the purview of mainstream epistemology. Most notable here is what we can call linguistic understanding—namely, the kind of understanding that is of particular interest to philosophers of language in connection with our competence with words and their meanings (see, for example, Longworth 2008). In addition, it is important to make explicit differences in terminology that can sometimes confuse discussions of some types of understanding.

An influential discussion of understanding is Kvanvig’s (2003). Firstly, Kvanvig introduces propositional understanding as what is attributed in sentences that take the form “I understand that X” (for example, John understands that he needs to meet Harold at 2pm). Some (for example, Gordon 2012) suggest that attributions of propositional understanding typically involve attributions of propositional knowledge or of a more comprehensive type of understanding—understanding-why, or objectual understanding (these types are examined more closely below).

A second variety of understanding that has generated interest amongst epistemologists is understanding-why. This type of understanding is ascribed in sentences that take the form “I understand why X” (for example, “I understand why the house burnt down”). Some of Pritchard’s earlier work on understanding (for example, 2009) uses the terminology ‘atomistic understanding’ as synonymous with ‘understanding-why’, and his more recent work shifts to using the latter term. There is debate about both (i) whether understanding-why might fairly be called explanatory understanding and (ii) how understanding-why might differ from propositional knowledge.

Thirdly, and perhaps most interestingly, objectual understanding is attributed in sentences that take the form “I understand X” where X is or can be treated as a body of information or subject matter. For example, Kvanvig describes it as obtaining “when understanding grammatically is followed by an object/subject matter, as in understanding the presidency, or the president, or politics” (2003: 191). Objectual understanding is equivalent to what Pritchard has at some points termed ‘holistic understanding’ (2009: 12). Grimm (2011) suggests that what we should regard as being understood in cases of objectual understanding—namely, the ‘object’ of the objectual attitude relation—can be helpfully thought of as akin to a “system or structure [that has] parts or elements that depend upon one another in various ways.”

With these three types of understanding in mind—propositional understanding, understanding-why and objectual understanding—the next section considers some of the key questions that arise when one attempts to think about when, and under what conditions, understanding should be ascribed to epistemic agents.

2. Is Understanding Factive?

Knowledge is almost universally taken to be factive (compare Hazlett 2010). In other words, S knows that p only if p is true. But is understanding factive? This is not so obvious, or at least not as obvious as it is in the case of knowledge. This section considers the connection between understanding-why and truth, and then engages with the more complex issue of whether objectual understanding is factive.

a. The Factivity of Understanding-Why

There is little work focusing exclusively on the prospects of a non-factive construal of understanding-why; most authors, with a few exceptions, take it that understanding-why is obviously factive in a way that is broadly analogous to propositional knowledge. For example, Hills (2009: 4) says “you cannot understand why p if p is false” (compare: S knows that p only if p). Pritchard (2008: 8) points out that—for example—if one believes that one’s house burned down because of the actions of an arsonist when it really burned down because of faulty wiring, it just seems plain that one lacks understanding of why one’s house burned down.

However, Baker (2003) has offered an account on which at least some instances of understanding-why are non-factive. Her line is that understanding-why involves (i) knowing what something is, and (ii) making reasonable sense of it. If making reasonable sense merely requires that some event or experience make sense to the epistemic agent herself, Baker’s view appears open, as Grimm (2011) has suggested, to counterexamples according to which an agent knows that something happened and yet accounts for that occurrence by way of a poorly supported theory. For example, a self-proclaimed psychic might see someone trip and believe that he caused this person’s fall. Further, suppose that the self-proclaimed psychic even has reason to believe he is right to think he is psychic, as his friends and family deem that it is safer or kinder to buy into his delusions outwardly. A view on which the psychic’s epistemic position in this case qualifies as understanding-why would be unsatisfactorily inclusive. This is perhaps partially because there is a tendency to hold a person’s potential understanding to standards of objective appropriateness as well as subjective appropriateness.

A more charitable interpretation of Baker’s position would be to read “making reasonable sense” more strongly. For example, we might require that the agent make sense of X in a way that is reasonable—few would think that the psychic above is reasonable, though it is beyond the scope of the current discussion to stray into exploring accounts of reasonableness.

b. A Weak Factivity Constraint on Objectual Understanding

It is plausible that a factivity constraint would also be an important necessary condition on objectual understanding, but there is more nuanced debate about the precise sense in which this might be the case. A useful taxonomising question is the following: how strong a link does understanding demand between the beliefs we have about a given subject matter and the propositions that are true of that subject matter? One can split views on this question into roughly three positions that advocate varying strengths of a factivity constraint on objectual understanding.

On the weakest view, one can understand a subject matter even if none of one’s beliefs about that subject matter are true. Zagzebski (2001), whose view maintains that at least not all cases of understanding require true beliefs, gestures toward something like this view. In addition, Zagzebski supports the provocative line that understanding can perhaps sometimes be more desirable when the epistemic agent does not have the relevant true beliefs. Her key thought here is that grasping the truth can actually impede the chances of one’s attaining understanding because such a grasp might come at too high a cognitive cost. Her main supporting example is of understanding the rate at which objects in a vacuum fall toward the earth (that is, 32 feet per second per second), a belief that ignores the gravitational attraction of everything except the earth and so is not strictly true. Nonetheless, Zagzebski thinks that, owing to our cognitive limitations, believing this actually allows us more understanding for most purposes than the ‘vastly more complicated’ truth.

Zagzebski’s weak approach to a factivity constraint aligns with her broadly internalist thinking about what understanding actually does involve—namely, on her view, internal consistency and what she calls ‘transparency.’ A theoretical advantage to a weak factivity constraint is that it neatly separates propositional knowledge and objectual understanding as interestingly different. Nevertheless, distinguishing between the two in this manner raises some problems for her view of objectual understanding, which should be unsurprising given the aforementioned counterexamples that can be constructed against a non-factive reading of Baker’s construal of understanding-why.

For example, and problematically for any account of objectual understanding that relaxes a factivity constraint, people frequently retract previous attributions of understanding. Consider a student saying, “I thought I understood this subject, but my recent grade suggests I don’t understand it after all”. These retractions do not seem to make sense on the weak view. In addition, the weak view leaves it open that two agents might count as understanding some subject matter equally well in spite of the fact that for every relevant belief that one has, the other agent maintains its denial. In other words, each denies all of the other’s respective beliefs about the subject, and yet the weak view in principle permits that they might nonetheless understand the subject equally well. Furthermore, weakly factive accounts welcome the possibility that internally coherent delusions (for example, those that are drug-induced) that are cognitively disconnected from real events might nonetheless yield understanding of those events. Proponents of weak factivity must address each of these potentially problematic results.

There is arguably a further principled reason that an overly weak view of the factivity of understanding will not easily be squared with pretheoretical intuitions about understanding. Specifically, a very weak view of understanding’s factivity does not fit with the plausible and often expressed intuition that understanding is something especially epistemically valuable. For example, Kvanvig (2003: 206) observes that “we have an ordinary conception that understanding is a milestone to be achieved by long and sustained efforts at knowledge acquisition” and Whitcomb (2012: 8) reflects that “understanding is widely taken to be a ‘higher’ epistemic good: a state that is like knowledge and true belief, but even better, epistemically speaking.” Yet these observations do not fit with the weak view’s commitment to, for example, the claim that understanding is achievable in cases of delusional hallucinations that are disconnected from the facts about how the world is.

Elgin (2007), like Zagzebski, is sympathetic to a weak factivity constraint on objectual understanding, where the object of understanding is construed as “a fairly comprehensive, coherent body of information” (2007: 35). According to Elgin, a factive conception of understanding “neither reflects our practices in ascribing understanding nor does justice to contemporary science”. Though her work on understanding is not limited to scientific understanding (for example, Elgin 2004), one notable argument she has made is framed to show that “a factive conception cannot do justice to the cognitive contributions of science and that a more flexible conception can” (2007: 32).

As Elgin (2007) notes, it is normal practice to attribute scientific understanding to individuals even when parts of the bodies of information that they endorse diverge somewhat from the truth. As we will see, a good number of epistemologists would agree that false beliefs are compatible with understanding. However, Elgin takes this line further and insists that—with some qualifications—false central beliefs, and not merely false peripheral beliefs, are compatible with understanding a subject matter to some degree. Consider here two cases she offers to this effect:

EVOLUTION: A second grader’s understanding of human evolution might include as a central strand the proposition that human beings descended from apes. A more sophisticated understanding has it that human beings and the other great apes descended from a common hominid ancestor (who was not, strictly speaking, an ape). The child’s opinion displays some grasp of evolution. It is clearly cognitively better than the belief that humans did not evolve. But it is not strictly true. Since it is central to her take on human evolution, factivists like Kvanvig must conclude that her take on human evolution does not qualify as understanding. (2007: 37)

COPERNICUS: A central tenet of Copernicus’s theory is the contention that the Earth travels around the sun in a circular orbit. Kepler improved on Copernicus by contending that the Earth’s orbit is not circular, but elliptical. Having abandoned the commitment to absolute space, current astronomers can no longer say that the Earth travels around the sun simpliciter, but must talk about how the Earth and the sun move relative to each other. Despite the fact that Copernicus’s central claim was strictly false, the theory it belongs to constitutes a major advance in understanding over the Ptolemaic theory it replaced. Kepler’s theory is a further advance in understanding, and the current theory is yet a further advance. The advances are clearly cognitive advances. With each step in the sequence, we understand the motion of the planets better than we did before. But no one claims that science has as yet arrived at the truth about the motion of the planets. Should we say that the use of the term ‘understanding’ that applies to such cases should be of no interest to epistemology? (2007: 37-8)

How should an account of objectual understanding incorporate these types of observations—namely, where the falsity of a central belief or central beliefs appears compatible with the retention of some degree of understanding? Pritchard (2007) has put forward some ideas that may obviate the need to adopt a weak view of understanding’s factivity while nonetheless maintaining the key thrust of Elgin’s insight. In particular, as Pritchard suggests, we might want to consider that agents working with the ideal gas law or other idealizations do not necessarily have false beliefs as a result, even if the content of the proposition expressed by the law is not strictly true. This is a point Elgin is happy to grant (see Elgin 2004 for some further discussion of the role of acceptance and belief in her account). In other words, even though there is no such gas as that referred to in the law, accepting the law need not involve believing the law to be true and thus believing there to be some gas with properties that it lacks.

The underlying idea in play here is that, in short, thinking about how things would be if an idealization were true is an efficacious way to get to further truths, an insight that has attracted endorsement in the philosophy of science (for example, Batterman 2009). Working hypotheses and idealizations need not, on this line, be viewed as representative of reality—idealizations can be taken as useful fictions, and working hypotheses are recognized as the most parsimonious theories on the table without thereby being dubbed as wholly accurate. For instance, the ideal gas law (for example, Elgin 2007) is recognized as a helpful fiction and is named and taught as such, as are naïve Copernicanism and the simple view that humans evolved from apes. It is not only unnecessary but, moreover, contentious to suppose that a credible scientist would consider the ideal gas law true. Indeed, it seems as though understanding would possibly be undermined in a case where someone relying on the ideal gas law failed to appreciate it as an idealization. That is, there would be something defective about a scientist’s would-be understanding of gas behavior were that scientist, unlike all other competent scientists, to deny that the ideal gas law is an idealization and instead embrace it as a fact. Putting this all together, a scientist who embraces the ideal gas law, as an idealization, would not necessarily have any relevant false beliefs. Therefore, the need to adopt a weak factivity constraint on objectual understanding—at least on the basis of cases that feature idealizations—looks at least initially to be unmotivated in the absence of a more sophisticated view about the relationship between factivity, belief and acceptance (however, see Elgin 2004).

Nevertheless, considering weakly factive construals of objectual understanding draws attention to an important point—that there are also interesting epistemic states in the neighborhood of understanding. These similar states share some of the features we typically think understanding requires, but are not bona fide understanding, specifically because a plausible factivity condition is not satisfied. A good example here is what Riggs (2003) calls intelligibility, a close cousin of understanding that also implies a grasp of order, pattern and connection, but does not seem to require a substantial connection to truth. Grimm (2011) calls this ‘subjective understanding.’ He describes subjective understanding as being merely a grasp of how specific propositions interlink—one that does not depend on their truth but rather on their forming a coherent picture. Since what Grimm is calling subjective understanding (that is, Riggs’s intelligibility) is by stipulation essentially not factive, the question of the factivity of subjective understanding simply does not arise. In light of this fact, though, it is not obvious that ‘understanding’ is the appropriate term for this state. Consider here an analogy: a false belief can be subjectively indistinguishable from knowledge. We could, for convenience, use the honorific term ‘subjective knowledge’ for false belief, though in doing so, we are no longer talking about knowledge in the sense that epistemologists are interested in, any more than we are when, as Allan Hazlett (2010) has pointed out, we say things like “Trapped in the forest, I knew I was going to die; I’m so lucky I was saved.” Perhaps the same should be said about alleged subjective understanding: to the extent that it is convenient to refer to non-factive states of intelligibility as states of ‘understanding’, we are no longer talking about the kind of valuable cognitive achievement of interest to epistemologists.

c. Moderate Views of Objectual Understanding’s Factivity

At the other end of the spectrum, we might consider an extremely strong view of understanding’s factivity, according to which understanding a subject matter requires that all of one’s beliefs about the subject matter in question are true. Such a constraint would preserve the intuition that understanding is a particularly desirable epistemic good and would accordingly be untroubled by the issues highlighted for the weakest view outlined at the start of the section. However, such a strong view would also make understanding nearly unobtainable and surely very rare—for example, on the extremely strong proposal under consideration, recognized experts in a field would be denied understanding if they had a single false belief about some very minor aspect of the subject matter. This is of course an unpalatable result, as we regularly attribute understanding in the presence of not just one, but often many, false beliefs. This point aligns with the datum that we often attribute understanding by degrees. That is, we often describe an individual as having a better understanding of a subject matter than some other person, perhaps when choosing whom to approach for advice or when looking for someone to teach us about a subject. While we would apply a description of ‘better understanding’ to agent A even if the major difference between her and agent B was that A had additional true beliefs, we would also describe A as having ‘better understanding’ than B if the key difference was that A had fewer false beliefs. If we sometimes attribute understanding to two people even when they differ only in terms of who has more false beliefs about a subject, this difference in degrees indicates that one can have understanding that includes some false beliefs. We can acknowledge this simply by regarding B’s understanding as, even if only marginally, relatively impoverished, rather than by claiming, implausibly, that no understanding persists in such cases. This leaves us, however, with an interesting question about the point at which there is no understanding at all, rather than merely weaker or poorer understanding.

Regarding factivity, then, it seems there is room for a view that occupies the middle ground here. We can accommodate the thought that not all beliefs relevant to an agent’s understanding must be true while nonetheless insisting that cases in which false beliefs run rampant will not count as understanding. Kvanvig (2003; 2009) offers such a view, according to which understanding of some subject matter is incompatible with false central beliefs about the subject matter. This view, while insisting that central beliefs must all be true, is flexible enough to accommodate that there are degrees of understanding—that is, that understanding varies not just according to numbers of true beliefs but also numbers of false, peripheral beliefs. It also allows attributions of understanding in the presence of peripheral false beliefs, without going so far as to grant that understanding is present in cases of internally consistent delusions—as such delusions will feature at least some false central beliefs. In this respect, then, Kvanvig’s view achieves the result of a middle ground.

However, advocates of moderate approaches to the factivity of understanding are left with some difficult questions to answer. Many of these questions have gone largely unexplored in the literature. For example:

  • In virtue of what does a belief count as ‘central’ in the relevant sense?
  • Moderate factivity implies that we should withhold attributions of understanding when an agent has a single false central belief, even in cases where the would-be understanding is of a large subject matter where all peripheral beliefs in this large subject matter are true. This consequence does not intuitively align with our practices of attributing understanding. The proponent of moderate factivity owes an explanation.
  • How should we distinguish between peripheral beliefs about a subject matter and beliefs that are not properly about the subject matter in question, while retaining a meaningful distinction between peripheral and central beliefs?

Although a moderate view of understanding’s factivity may look promising in comparison with competitor accounts, many important details remain to be spelled out.

3. Coherence and the Grasping Condition

When considering interesting features that might set understanding apart from propositional knowledge, the idea of grasping something is often mentioned. For example, Kvanvig (2003) holds that understanding is particularly valuable in part because it requires a special grasp of “explanatory and other coherence-making relationships.” Riggs (2003: 20) agrees, stating that understanding of a subject matter “requires a deep appreciation, grasp or awareness of how its parts fit together, what role each one plays in the context of the whole, and of the role it plays in the larger scheme of things” (italics added). Relatedly, Van Camp (2014) calls understanding a “higher level cognition” that involves recognizing connections between different pieces of knowledge, and Kosso (2007: 1) submits that inter-theoretic coherence is the hallmark of understanding, stating “knowledge of many facts does not amount to understanding unless one also has a sense of how the facts fit together.” While such remarks are made with objectual understanding (that is, understanding of a subject matter) in mind, there are similar comments about understanding-why (for example, Hills 2009) that suggest an overlapping need to consider connections between items of information, albeit on a smaller scale.

Such discussions, though they can be initially helpful, raise a nest of further questions. This is so for three principal reasons. Firstly, ‘grasping’ is often used in such a way that it is not clear whether it should be understood metaphorically or literally. If the former, then this is unfortunate given the theoretical work the term is supposed to be doing in characterizing understanding. If the latter (that is, if we are to understand ‘grasping’ literally), then, also unfortunately, we are rarely given concrete details of its nature. A second reason that adverting to grasping-talk in the service of characterizing understanding raises further questions is that it is often not clarified just what relationships or connections are being grasped when they are grasped in a way that is distinctive of understanding. And, thirdly, two questions about what is involved in grasping can easily be run together, but should be kept separate. Call these, for short, the ‘relation question’ and the ‘object question’.

Relation question: What is the grasping relationship? (For example, is it a kind of knowledge, another kind of propositional attitude, an ability, and so on?)

Object question: What kinds of things are grasped? (For example, propositions, systems, bodies of information, the relationships thereof, and so on?)

Take first the object question. Since Kvanvig claims that the coherence-making relationships that are traditionally construed as necessary for justification on a coherentist picture are the very relations that one grasps (that is, the objects of grasping) when one understands, the justification literature may be a promising place to begin. Put generally, according to the coherentist family of proposals about the structure of justified belief, “a belief or set of beliefs is justified, or justifiably held, just in case the belief coheres with a set of beliefs, the set forms a coherent system, or some variation on these themes” (Olsson 2012: 1). Of course, many interrelated questions then emerge regarding coherence. For the purposes of thinking about understanding, some of the most important will include: (i) what makes a system of beliefs coherent? and (ii) what qualifies a group of beliefs as a system in the sense that is at issue when it is claimed that understanding involves grasping relationships or connections within a system of beliefs? For example, we might suppose that a system of beliefs contains only beliefs about a particular subject matter, and that these beliefs will ordinarily be sufficient for a rational believer who possesses them to answer questions about that subject matter reliably. Such a theory raises questions of its own, such as precisely what answering reliably, in the relevant sense, demands.

Turn now to the relation question. What is the grasping relation? Is it a kind of knowledge, another kind of propositional attitude, an ability, and so forth? Kvanvig does not spell out what grasping might involve, in the sense now under consideration, in his discussion of coherence or in the other remarks we considered above. He either leaves grasping at the level of metaphor or uses it literally but never develops it. Given the extent to which grasping is highly associated with understanding and left substantively unspecified, it is perhaps unsurprising that the matter of how to articulate grasping-related conditions on understanding has proven to be rather divisive. Kelp (2015) makes a helpful distinction between two broad camps here. On the one hand, we have manipulationists, who think understanding involves an ability (or abilities) to manipulate certain representations or concepts. On the other hand, there are explanationists, who argue that it is knowledge or evaluation of explanations that is doing the relevant work. However, it is not entirely clear that extant views on understanding fall so squarely into these two camps. Many seem to blend manipulationism with explanationism, suggesting for example that what is required for understanding is an ability associated with mentally manipulating explanations. To complicate matters further, some of the philosophers who appear to endorse one approach over the other can elsewhere be seen considering a more mixed view (for example, Khalifa 2013b).

The next section considers some of the most prominent examples of attempts to expand on or replace a grasping condition on understanding. Some focus on understanding-why while others focus on objectual understanding.

a. Understanding as Representation Manipulability

Wilkenfeld (2013) offers the account that most clearly falls under Kelp’s characterization of manipulationist approaches to understanding. As Wilkenfeld sees it, understanding should be construed as “representational manipulability,” which is to say that understanding is, essentially, the possessing of some representation that can be manipulated in useful ways. Unlike de Regt and Dieks (2005), Wilkenfeld aims to propose an inclusive manipulation-based view that allows agents to have objectual understanding even if they do not have a theory of the phenomenon in question. His conception of mental representations defines these representations as “computational structures with content that are susceptible to mental transformations.” Wilkenfeld constructs a necessary condition on objectual understanding around this definition. His view is that understanding requires the agent to, in counterfactual situations salient to the context, be able to modify their mental representation of the subject matter. This allows the agent to produce a slightly different mental representation of the subject matter that “enables efficacious inferences” pertaining to (or manipulations of) the subject matter.

What is it to have this ability to modify some mental representation? Wilkenfeld suggests that this ability consists at least partly in being able to correct minor mistakes in one’s mental representation and use it to make assessments in similar cases. The demandingness of this ability, though, need not be held fixed across practical circumstances. The context-sensitive element of Wilkenfeld’s account of understanding allows him to attribute adequate understanding to, for example, a student in an introductory history class and yet deny understanding to that student when the context shifts to place him in a room with a panel of experts.

There are three potential worries with this general style of approach. Firstly, Wilkenfeld’s context-sensitive approach is in tension with a more plausible diagnosis of the example just considered. Rather than withholding understanding in the case where the student is surrounded by experts, why not instead, in a way that is congruous with the earlier observation that understanding comes in degrees, attribute understanding to the student surrounded by experts, but to a lesser degree (for example, Tim has some understanding of physics, while the professor has a much more complete understanding)? Carter (2014) argues that shifting to more demanding practical environments motivates attributing lower degrees of understanding rather than (as Wilkenfeld suggests) withholding understanding.

Secondly, one might wonder if Wilkenfeld’s account of understanding as representation manipulation is too inclusive, in that it rules in, as cases of bona fide understanding, representations that are based on inaccurate but internally consistent beliefs. If so, then the internally consistent delusion objection typically leveled against weakly factive views raises its head. However, this concern might be abated with the addition of a moderate factivity constraint (for example, the constraint discussed in Section 2 above) that rules out cases of mere intelligibility or subjective understanding.

Thirdly, Kelp (2015) has an objection that he thinks all who favor a manipulationist line should find worrying. Specifically, he points out that an omniscient agent who knows everything and intuitively therefore understands every phenomenon might do so while being entirely passive—not drawing inferences, making predictions or manipulating representations (in spite of knowing, for example, which propositions can be inferred from others). If Kelp’s thought experiment works, manipulation of representations cannot be a necessary condition of understanding after all. This objection is worth holding in mind when considering any further positions that incorporate representation manipulability as necessary. That said, for manipulationists who are not already inclined to accept the entailment from all-knowing to omni-understanding, the objection is defused, as the example does not get off the ground. One reason a manipulationist will be inclined to escape the result in this fashion (by denying that all-knowing entails all-understanding) is precisely that one already (qua manipulationist) is not convinced that understanding can be attained simply through knowledge of propositions. In this respect, it seems Kelp’s move against the manipulationist succeeds only if certain premises are in play which manipulationists as such would themselves be inclined to resist.

b. Understanding and Knowledge of Causes

Grimm (2011) also advocates for a fairly straightforward manipulationist approach in earlier work. He suggests that manipulating the “system” allows the understander to “see” the way in which “the manipulation influences (or fails to influence) other parts of the system” (2011: 11). He argues that we can gain some traction on the nature of grasping significant to understanding if we view it along such manipulationist lines. So, on Grimm’s (2011) view, grasping the relationships between the relevant parts of the subject matter amounts to possessing the ability to work out how changing parts of that system would or would not impact on the overall system. He considers that grasping might be a “modal sense or ability” that allows the understander, over and above registering how things are, to anticipate what would happen if things were relevantly different, that is, to make correct inferences about the ways in which relevant differences to the truth-values of the involved propositions would influence the inferences that obtain in the actual world. That said, Grimm’s more recent work (2014) expands on these earlier observations to form the basis of a view that spells out grasping in terms of a modal relationship between properties, objects or entities: a theory on which what is grasped when one has understanding-why will be how changes in one would lead (or fail to lead) to changes in the other. His central claim in his recent work is that understanding can be viewed as knowledge of causes, though appreciating how he is thinking of this takes some situating, given that the knowledge central to understanding is non-propositional.

Although a large number of epistemologists hold that understanding is not a species of knowledge (e.g. Kvanvig 2003; Zagzebski 2001; Riggs 2003; Pritchard 2010), Grimm’s view is rooted in a view that comes from the philosophy of science and traces originally to Aristotle. Essentially, this view traditionally holds that understanding why X is the case is equivalent to knowing why X is the case (which is in turn supposed to be equivalent to knowing that X is the case because of Y). In short: understanding is causal propositional knowledge. Sliwa (2015), however, defends a stronger view, according to which propositional knowledge is necessary and sufficient for understanding. Although many commentators suggest that understanding requires something further, that is, something in addition to merely knowing a proposition or propositions, Grimm thinks we can update the “knowledge of causes” view so that this intuition is accommodated and explained. In particular, he wants to propose a non-propositional view that has at its heart “seeing or grasping, of the terms of the causal relata, their modal relatedness”, which he suggests amounts to seeing or grasping “how things might have been if certain conditions had been different.” To be clear, the nuanced view Grimm suggests is that while understanding is a kind of knowledge of causes, it is not propositional knowledge of causes but rather non-propositional knowledge of causes, where the non-propositional knowledge is itself unpacked as a kind of ability or know-how.

Grimm develops this original position via parity of reasoning, taking as a starting point that the debate about a priori knowledge, for example, knowledge of necessary truths, makes use of metaphors of “grasping” and “seeing” that are akin to the ones in the understanding debate. An important observation Grimm makes is that merely assenting to necessary truths is insufficient for knowing necessary truths a priori; one must also grasp or see the necessity of the necessary truth. Grimm thinks the metaphor involves something like apprehending how things stand in modal space (that is, that there are no possible worlds in which the necessary truth is false). He argues that what is grasped or seen when one attains a priori knowledge is not a proposition but a certain modal relationship between properties, objects or identities. He suggests that the primary object of a priori knowledge is the modal reality itself that is grasped by the mind and that on this basis we go on to assent to the proposition that describes these relationships. Hence, he argues that any propositional knowledge is derivative.

In terms of parallels with the understanding debate, it is important to note that the knowledge of causes formula is not limited to the traditional propositional reading. The ambiguity between assenting to a necessary proposition and the grasping or seeing of certain properties and their necessary relatedness mirrors the ambiguity between assenting to a causal proposition and grasping or seeing of the terms of the causal relata: their modal relatedness. However, Grimm is quick to point out that defending one of these two similar views does not depend on the correctness of the other. His modal model of understanding fits with the intuition that we understand not propositions but “relations between parts to wholes” or “systems of various thoughts.”

Grimm has put his finger on an important commonality at issue in his argument from parity. However, Pritchard (2014) responds to Grimm’s latest proposal with a number of criticisms. Perhaps the strongest of these is his suggestion that while the faculty of rational insight is indispensable to the grasping account of a priori knowledge, it is actually essential to knowledge of causes that it not be grasped through rational insight. This is because we don’t learn about causes a priori. On this basis Pritchard insists that Grimm’s analogy breaks down.

This aside, can we consider extending Grimm’s conception of understanding as non-propositional knowledge of causes to the domain of objectual understanding? While his view fits well with understanding-why, it is less obvious that objectual understanding involves grasping how things came to be. For one thing, abstract objects, such as mathematical truths and other atemporal phenomena, can plausibly be understood even though our understanding of them does not seem to require an appreciation of their coming to existence. For example, I can understand the quadratic formula without knowing, or caring, about who introduced it. But more deeply, atemporal phenomena such as mathematical truths have, in one clear sense, never come to be at all, but have always been, to the extent that they are the case at all. This holds regardless of whether we are Platonists or nominalists about such entities.

Secondly, even subject matters that traffic in empirical phenomena, rather than abstract atemporal phenomena (as in, for example, pure mathematics), are not clearly such that understanding them should involve any appreciation for their coming to be, or their being caused to exist. Here is one potential example to illustrate this point: consider that it is not clear that people who desire to understand chemistry generally care about “the cause of chemistry”. A potential worry then is that the achievement one attains when one understands chemistry need not involve the subject working out the cause of the subject matter (in this case, the cause of chemistry).

Grimm anticipates this point and expresses a willingness to embrace a looser conception of dependence than causal dependence, one that includes (following Kim 1994) species of dependence such as mereological dependences (that is, dependence of a whole on its parts), evaluative dependences (that is, dependence of evaluative on non-evaluative), and so on. A restatement of Grimm’s view might accordingly be: understanding is knowledge of dependence relations. This broader interpretation seems well positioned to handle abstract object cases, for example, mathematical understanding, when the kind of understanding at issue is understanding-why. For, even if understanding why 2×2=4 does not require a grasp of any causal relation, it might nonetheless involve a grasp of some kind of more general dependence, for instance the kind of dependence picked out by the metaphysical grounding relation. However, it is less clear at least initially that retreating from causal dependence to more general dependence will be of use in the kinds of objectual understanding cases noted above. For example, when the issue is understanding mathematics, as opposed to understanding why 2×2=4, it is perhaps less obvious that dependence has a central role to play.

c. Understanding, Abilities and Know-How

Another seemingly promising line—one that engages with the relation question discussed above—views grasping as intimately connected with a certain set of abilities. Hills (2009) is an advocate of such a view of understanding-why in particular. Specifically, Hills outlines six different abilities that she takes to be involved in grasping the reasons why p—abilities which effectively constitute, on her view, six necessary conditions for understanding why p. These six abilities allow one to “be able to treat q as the reason why p, not merely believe or know that q is the reason why p.” They are as follows:

(i) an ability to follow another person’s explanation of why p,

(ii) an ability to explain p in one’s own words,

(iii) an ability to draw from the information that q the conclusion that p (or that probably p),

(iv) an ability to draw from the information q’ the conclusion that p’ (or probably p’),

(v) an ability to give q (the right explanation) when given the information that p, and

(vi) an ability to give q’ (the right explanation) when given the information p’.

On the most straightforward characterization of her proposal, one fails to possess understanding why, with respect to p, if one lacks any of the abilities outlined in (i-vi), with respect to p. Note that this is compatible with one failing to possess understanding why even if one possesses knowledge that involves, as virtue epistemologists will insist, some kinds of abilities or virtues. That said, Hills adds some qualifications. For one thing, she admits that these abilities can be possessed by degrees. Secondly, she concedes that it is possible that in some cases additional abilities must be added before the set of abilities will be jointly sufficient.

Hills thinks that mere propositional knowledge does not essentially involve any of these abilities even if (as per the point above) propositional knowledge requires other kinds of abilities. To defend the claim that possessing the kinds of abilities she draws attention to is not a matter of simply having extra items of knowledge, Hills notes that one could have the extra items of knowledge and still lack the ‘good judgment’ that allows one to form new, related true beliefs. The possession of such judgment plausibly lines up more closely with ability possession (that is, (i)-(vi)) than with propositional attitude possession.

If Hills is right about this connection between grasping and possessing abilities, it might seem as though understanding-why is, at the end of the day, very similar to knowing-how (see, however, Sullivan 2017 for resistance to this suggestion). This is a view to which Grimm (2010) is also sympathetic, remarking that the object of objectual understanding “can be profitably viewed along the lines of the object of know-how,” where Grimm has in mind here an anti-intellectualist interpretation of know-how according to which knowing how to do something is a matter of possessing abilities rather than knowing facts (compare Stanley & Williamson 2001; Stanley 2011). Grimm (2014) also notes that his modal view of understanding fits well with the idea that understanding involves a kind of ability or know-how, as one who sees or grasps how certain propositions are modally related has the ability to answer a wide variety of questions about how things could have been different. Grimm does not make the further claim that understanding is a kind of know-how; he merely says that there is similarity regarding the object, which does not guarantee that the “activity” of understanding and know-how are so closely related.

However, if understanding-why actually is a type of knowing how then this means that intellectualist arguments to the effect that knowing how is a kind of propositional knowledge might apply, mutatis mutandis, to understanding-why as well (see Carter and Pritchard 2013). Hills herself does not believe that understanding-why is some kind of propositional knowledge, but she points out that even if it is there is nonetheless good cause to think that understanding-why is very unlike ordinary propositional knowledge. Drawing from Stanley and Williamson, she makes the distinction between knowing a proposition “under a practical mode of presentation” and knowing it “under a theoretical mode of presentation.” Stanley and Williamson admit that the former is especially tough to spell out (see Glick 2014 for a recent discussion), but it must surely involve having complex dispositions, and so it is perhaps possible to know some proposition under only one of these modes of presentation (that is, by lacking the relevant dispositions, or something else). Hills thinks that moral understanding, if it were any kind of propositional knowledge at all, would be knowing a proposition under a practical mode and “not necessarily under a theoretical mode.”

d. Understanding as Explanation

The group designated “explanationists” by Kelp (2015) share a general commitment to the idea that knowledge of explanations should play a key role in a theory of understanding (for example, Hempel 1965; Salmon 1989; Khalifa 2012; 2013). For those who wonder about whether the often-discussed “grasping” associated with understanding might just amount to the possession of further beliefs (rather than, say, the possession of manipulative abilities), this type of view may seem particularly attractive (and comparatively less mysterious). On such a view, grasping talk could simply be jettisoned altogether. However, the core explanationist insight also offers the resources to supplement a grasping account. On such an interpretation, explanationism can be construed as offering a simple answer to the object question discussed above: the object of understanding-relevant grasping would, on this view, be explanations. As it turns out, not all philosophers who give explanation a central role in an account of understanding want to dispense with talk of grasping altogether, and this is especially so in the case of objectual understanding.

Khalifa’s (2013) view of understanding is a form of explanatory idealism. While Khalifa prefers earlier accounts of scientific understanding to the more recent views that have been submitted by epistemologists, he is aware of some criticisms (for example, Lipton (2009) and Pritchard (2010)) to the effect that requiring knowledge of an explanation is too strong a necessary condition on understanding-why. His alternative suggestion is to propose explanation as the ideal of understanding, a suggestion that has as a consequence that one should measure degrees of understanding according to how well one “approximate[s] the benefits provided by knowing a good and correct explanation.” Khalifa submits that this line is supported by the existence of a correct and reasonably good explanation in the background of all cases of understanding-why that do not involve knowledge of an explanation, namely a background explanation that would, if known, provide a greater degree of understanding-why.

This line merits discussion not least because the idea that understanding-why comes by degrees is often ignored in favor of discussing the more obvious point that understanding a subject matter clearly comes by degrees. One issue worth bringing into sharper focus is whether knowing a good and correct explanation is really the ideal form of understanding-why. In particular, one might be tempted to suggest that some of the objections raised to Grimm’s non-propositional knowledge-of-causes model could be recast as objections to Khalifa’s own explanation-based view. For example, we might suppose an agent has a maximally complete explanation of how Michelangelo’s David came into existence between 1501 and 1504, what methods were used to craft it, what Michelangelo’s motivating reasons were at the time, how much marble was used, and so on. But when the object of understanding-why is essentially evaluative (for example, understanding why the statue is beautiful), it seems that the quality of one’s understanding could vary dramatically even when we hold fixed that one possesses a correct and complete explanation of how the statue came to be (that is, both a physical and social description of these causes). To the extent that this is correct, there is some cause for reservation about measuring degrees of understanding according to how well they “approximate the benefits provided by knowing a good and correct explanation.” A proponent of Khalifa’s position might, however, view the preceding response as question-begging. For if the view is correct, then an explanation for why one’s understanding why the statue is beautiful is richer, when it is, will simply be in terms of one’s possession of a correct answer to the question of why it is beautiful.

It is moreover of interest to note that Khalifa (2013b) also sees a potential place for the notion of ‘grasping’ in an account of understanding, though in a qualified sense. On the view he recommends, the ability to grasp explanatory or evidential connections is an ability that is central to understanding only if the relevant grasping ability is understood as involving reliable explanatory evaluation. Khalifa’s indispensability argument, which he calls the ‘Grasping Argument’, runs as follows:

  1. Understanding entails true beliefs of the form q explains p.
  2. Understanding entails that such beliefs must be the result of exercising reliable cognitive abilities.
  3. If understanding entails true beliefs of the form q explains p, and also entails that such beliefs must be the result of exercising reliable cognitive abilities, then these abilities involve evaluating (or discriminating between) explanations.
  4. So understanding entails that beliefs of the form q explains p are the result of exercising a reliable cognitive ability to evaluate explanations. (Khalifa 2013b: 5)

Khalifa is, in this argument, stipulating that (1) is a “ground rule for discussion” (2013b: 5). One point that could potentially invite criticism is the move from (1) and (2) to (3). A worry about this move can be put abstractly: consider that if understanding entails true beliefs of form <q explains p>, and that beliefs of form <q explains p> must themselves be the result of exercising reliable cognitive abilities, it might still be that one’s reliable <q explains p>-generating abilities are exercised in a bad environment, for example, an environment where one’s abilities so easily could generate false beliefs of form <q explains p> despite issuing (luckily) true beliefs of the form <q explains p> on this occasion. Contrary to premise (3), such abilities (of the sort referenced by Khalifa in premises (2) and (3)) arguably need not involve discriminating between explanations, so long as one supposes that discriminating between explanations is something one has the reliable ability to do only if one could not very easily form a belief of the form <q explains p> when this is false.

More generally, though, it is important to note that Khalifa, via his grasping argument, is defending reliable explanatory evaluation as merely a necessary—though not sufficient—component of grasping. In so doing, he notes that the reader may be inclined to add further internalist requirements to his reliability requirement, of the sort put forward by Kvanvig (2003). As such, Khalifa is not attempting to provide an analysis of grasping.

It is worth considering how and in what way a plausible grasping condition on understanding should be held to something like a factivity or accuracy constraint. If a grasping condition is necessary for understanding, does one satisfy this condition only when one exercises a grasping ability to reflect how things are in the world? Or should we adopt a more relaxed view of what would be required to satisfy this condition, namely a view that focuses on the way the agent connects information?

Strevens (2013) focuses on scientific understanding in his discussion of grasping. He also suggests, like Khalifa, that grasping be linked with correct explanations. As such, his commentary here is particularly relevant to the question of whether grasping is factive. Riggs (2003: 21-22) asks whether an explanation has to be true to provide understanding, and Strevens thinks that it is implied that grasping is factive.

However, Strevens nonetheless offers a rough outline of a parallel, non-factive account of grasping, what he calls ‘grasping*’. He wants us to suppose that grasping has two components: one that is a purely psychological (that is, narrow) component and one that is the actual obtaining of the state of affairs that is grasped. He gives the name ‘grasping*’ to the purely psychological component that would continue to be satisfied even if, say, an evil demon made it the case at the moment of your grasping that there was only an appearance of the thing that appears to you to be the case. This would be the non-factive parallel to the standard view of grasping. Strevens, however, holds that an explanation is only correct if its constitutive propositions are true, and therefore the reformulation of grasping that he provides is not intended to be used in an actual account of understanding. The idea of grasping* is useful insofar as it makes clearer the cognitive feat involved in intelligibility, which is similar to understanding in the sense that it “implies a grasping of order, pattern and connection” between propositions (Riggs, 2004), but it does not require those propositions to be true. Just as we draw a distinction between this epistemic state (that is, intelligibility, or what Grimm calls ‘subjective understanding’) and understanding (which has a much stricter factivity requirement), it makes sense to draw a line between grasping* and grasping where one is factive and the other is not.

Likewise, just as all understanding will presumably involve achieving intelligibility even though intelligibility does not entail understanding, so too will all grasping involve grasping* even though grasping* does not entail grasping. Consider, on this point, that a conspiracy theorist might very well grasp* the connection between (false) propositions so as to achieve a coherent, intelligible, though wildly off-base, picture. The conspiracy theorist possesses something which one who grasps (rather than grasps*) a correct theory also possesses, and yet one who fails to grasp* even the conspiracy theory (for example, a would-be conspiracy theorist who has yet to form a coherent picture of how the false propositions fit together) lacks.

e. Understanding as Well-Connected Knowledge

Assuming that we need an account of degrees of understanding if we are going to give an account of outright understanding (as opposed to working the other way around, as he thinks many others are inclined to do), Kelp (2015) suggests we adopt a knowledge-based account of objectual understanding according to which “maximal understanding of a given phenomenon” is to be cashed out in terms of fully comprehensive and maximally well-connected knowledge of that phenomenon. Kelp’s account, then, explains our attributions of degrees of understanding in terms of approximations to such well-connected knowledge. He says that knowledge about a phenomenon (P) is maximally well-connected when “the basing relations that obtain between the agent’s beliefs about P reflect the agent’s knowledge about the explanatory and support relations that obtain between the members of the full account of P” (2015: 12).

This view, he notes, can make sense of the example (see §3(b)), which he utilizes against manipulationist accounts, of the omniscient, omni-understanding agent who is passive (that is, an omni-understanding agent who is not actively drawing explanatory inferences), as one would likely attribute to this agent maximally well-connected knowledge in spite of that passivity. Meanwhile, when discussing outright (as opposed to ideal) understanding, Kelp suggests that we adopt a contextualist perspective. In a given context, then, one understands some subject matter P only if one approximates fully comprehensive and maximally well-connected knowledge of P “closely enough” that one is sufficiently likely to successfully perform any task relating to P that is determined by the context, assuming that one “has the skills needed to do so and to exercise them in suitably favorable conditions”. Kelp points out that this type of view is not so restrictive as to deny understanding to, for example, novice students and young children. It should be noted that Hills (2009: 7) is also sympathetic to a similar thought, suggesting that the threshold for understanding might be contextually determined. However, Kelp admits that he wonders how his account will make sense of the link between understanding and explanation, and one might also wonder whether it is too strict to say that understanding requires knowledge as opposed to justified belief or justified true belief.

4. Understanding and Epistemic Luck

With a wide range of subtly different accounts of understanding (both objectual and understanding-why) on the table, it will be helpful to consider how understanding interfaces with certain key debates in epistemology. One natural place to start will be to examine the relationship between understanding and epistemic luck. Many epistemologists have sought to distinguish understanding from knowledge on the basis of alleged differences in the extent to which knowledge and understanding are susceptible to being undermined by certain kinds of epistemic luck.

While the matter of how to think about the incompatibility of knowledge with epistemic luck remains a contentious point, with modal accounts (for example, Pritchard 2005) at odds with lack-of-control accounts (for example, Riggs 2007), few contemporary epistemologists dissent from the comparatively less controversial claim that knowledge excludes luck in a way that true beliefs and sometimes even justified true beliefs do not (see Hetherington (2013) for a dissenting position). That said, the question of whether, and if so to what extent, understanding is compatible with epistemic luck, lacks any contemporary consensus, though this is an aspect of understanding that is receiving increased attention.

Zagzebski (2001) and Kvanvig (2003) have suggested that understanding’s immunity to being undermined by the kinds of epistemic luck which undermine knowledge is one of the most important ways in which understanding differs from knowledge. Riaz (2015), Rohwer (2014) and Morris (2012) have continued to uphold this line on understanding’s compatibility with epistemic luck and defend this line against some of the objections that are examined below. However, Pritchard’s work on epistemic luck (for example, 2005) and how it is incompatible with knowledge leads him to reason that understanding is immune to some but not all forms of malignant luck (that is, luck which is incompatible with knowledge). Finally, on the other side of the spectrum from Zagzebski and Kvanvig, and also in opposition to Pritchard, is the view that understanding’s immunity to epistemic luck is isomorphic to knowledge’s immunity to epistemic luck. This view, embraced by DePaul and Grimm (2009), implies that to the extent that understanding and knowledge come apart, it is not with respect to a difference in susceptibility to being undermined by epistemic luck.

a. Understanding as (Partially) Compatible with Epistemic Luck

Consider the view that the kinds of epistemic luck that suffice to undermine knowledge do not also undermine understanding. As Kvanvig sees it, knowing requires non-accidental links between (internal) mental states and external events in just the right way. But the chief requirement of understanding, for him, is instead that there be the right coherence-making relations in some agent’s collection of information (that is, that the agent has a grasp of how all this related information fits together). In order to illustrate this point, Kvanvig invites us to imagine a case where an individual reads a book on the Comanche tribe, and she thereby acquires a belief set about the Comanche. In such a case, Kvanvig says, this individual acquires an “historical understanding of the Comanche dominance of the Southern plains of North America from the late 17th until the late 19th century” (2003: 197). Kvanvig stipulates that there are no falsehoods in the relevant class of beliefs that this individual has acquired from the book, and also that she can correctly answer all relevant questions whilst confidently believing that she is expressing the truth. He claims that while we would generally expect her to have knowledge of her relevant beliefs, this is not essential for her understanding and as a result it would not matter if these true beliefs had been Gettierised (and were therefore merely accidentally true). In short, then, Kvanvig wants to insist that the true beliefs that one attains in acquiring one’s understanding can all be Gettiered, even though the Gettier-style luck which prevents these beliefs from qualifying as knowledge does not undermine the understanding this individual acquires. So, understanding is compatible with a kind of epistemic luck that knowledge excludes.

Pritchard, meanwhile, claims that the matter of understanding’s compatibility with epistemic luck can be appreciated only against the background of a distinction between two kinds of epistemic luck, intervening and environmental, both of which are incompatible with knowledge. Both are “veritic” types of luck on Pritchard’s view: they are present when, given how one came to have one’s true belief, it is a matter of luck that this belief is true (Pritchard 2005: 146). Intervening epistemic luck is the sort present in Gettier’s original cases (1963), which convinced most epistemologists to abandon the traditional account of knowledge as justified true belief. Cases of intervening luck take, to use a simple example, the familiar pattern of Chisholm’s “sheep in a field case”, where an agent sees a sheep-shaped rock which looks just like a sheep, and forms the belief “There is a sheep”. The agent’s belief is justified and true, thanks to the fact that there is a genuine sheep hiding behind the rock, but the belief is not knowledge, as it could easily have been false. It is just dumb luck that the genuine sheep happened to be in the field. By contrast, the paradigmatic case of environmental epistemic luck is the famous ‘barn façade’ case (for example, Ginet 1975; Goldman 1979), a case where what an agent looks at is a genuine barn which unbeknownst to the individual is surrounded by façades which are indistinguishable to the agent from the genuine barn. Here, and unlike in the case of intervening epistemic luck, nothing actually goes awry, and the fact that the belief could easily have been false is owed entirely to the agent’s being in a bad environment, one with façades nearby.

Armed with this distinction, Pritchard criticizes Kvanvig’s assessment of the Comanche case by suggesting that just how we should regard understanding as being compatible or incompatible with epistemic luck depends on how we fill out the details of Kvanvig’s case, which is potentially ambiguous between two kinds of readings. In order to make this point clear, Pritchard suggests that we first consider two versions of a case analogous with Kvanvig’s. In the first version, we are to imagine that the agent gets her beliefs from a faux-academic book filled with mere rumors that turn out to be luckily true. In this Gettier-style case, she has good reason to believe her true beliefs, but the source of these beliefs (for example, the rumor mill) is highly unreliable and this makes her beliefs only luckily true, in the sense of intervening epistemic luck. Contrast this (call it the ‘intervening reading’ of the case) with Pritchard’s corresponding environmental reading of the case, where we are to imagine that the agent is reading a reliable academic book which is the source of many true beliefs she acquires about the Comanche. But in this version of the case, suppose that, although the book is entirely authoritative, genuine and reliable, it is the only trustworthy book on the Comanche on the shelves: every book on the shelves nearby, which she easily could have grabbed rather than the genuine authoritative book, was filled with rumors and ungrounded suppositions. Pritchard’s verdict is that we should deny understanding in the intervening case and attribute it in the environmental case. Pritchard’s assessment then of whether understanding is compatible with epistemic luck that is incompatible with knowledge depends on which kind of epistemic luck incompatible with knowledge one is discussing.

While Pritchard’s point here is revealed in his diagnosis of Kvanvig’s reading of the Comanche case, he in several places prefers to illustrate the idea with reference to the case in which an agent asks a real (that is, genuine, authoritative) fire officer about the cause of a house fire and receives a correct explanation. Suppose further that the agent could have easily ended up with a made-up and incorrect explanation because (unbeknownst to the agent) everyone in the vicinity of the genuine fire officer who is consulted is dressed up as fire officers and would have given the wrong story (whilst failing to disclose that they were merely in costume). Pritchard maintains that it is intuitive that in the case just described understanding is attained—you have consulted a genuine fire officer and have received all the true beliefs required for understanding why your house burned down, and acquire this understanding in the right way. Meanwhile, he suggests that were you to ask a fake fire officer who appeared to you to be a real officer and just happened to give the correct answer, it is no longer plausible (by Pritchard’s lights) that you have understanding-why.

For a less concessionary critique of Kvanvig’s Comanche case, however, see Grimm (2006). According to Grimm, cases like Kvanvig’s admit of a more general characterisation, depending on how the details are filled in. Grimm puts the template formulation as follows: “A Comanche-style case is one in which we form true beliefs on the basis of trusting some source, and either (a) the source is unreliable, or (b) the source is reliable, but in the current environment one might easily have chosen an unreliable source.” After analysing variations of the Comanche case so conceived, Grimm argues that in neither (a)- nor (b)-style Comanche cases do knowledge and understanding come apart. If this is right, then at least one prominent case used to illustrate a luck-based difference between knowledge and understanding does not hold up to scrutiny.

b. Newer Defenses of Understanding’s Compatibility with Epistemic Luck

In contrast with Pritchard’s “partial compatibility” view of the relationship between understanding and epistemic luck, where understanding is compatible with environmental but not with intervening luck, Rohwer (2014) defends understanding’s full compatibility with veritic epistemic luck (that is, of both intervening and environmental varieties). Rohwer argues that counterexamples like Pritchard’s intervening luck cases only appear plausible because the beliefs that make up the agent’s understanding come exclusively from a bad source. For example, Pritchard’s case of the fake fire officer—which, recall, is one in which he thinks understanding (as well as knowledge) is lacking—is one in which, Rohwer points out, all of the true beliefs and grasped connections between those beliefs are from a bad source. Rohwer’s inventive move involves a contrast case featuring “unifying understanding”, that is, understanding that is furnished from multiple sources, some good and some bad. Such cases, she claims, feature intervening luck that is compatible with understanding. While Pritchard can agree with Rohwer’s conclusion that understanding (and specifically, as Rohwer is interested in, scientific understanding) is not a species of knowledge, the issue of adjudicating between Rohwer’s intuition in the case of unifying understanding and the diagnosis Pritchard will be committed to in such a case is complicated. An important question is whether there are philosophical considerations beyond simply intuition to adjudicate in a principled way why we should think about unifying understanding cases in one way rather than the other.

Morris (2012), like Rohwer, also defends lucky understanding (in particular, understanding-why, or what he calls “explanatory understanding”). He argues that intuitions that rule against lucky understanding can be explained away. For example, he attempts to explain the intuitions in Pritchard’s intervening luck spin on Kvanvig’s Comanche case by noting that some of the temptation to deny understanding here relates to the writer of the luckily-true book himself lacking the relevant understanding. Morris challenges the assumption that hearers cannot gain understanding through the testimony of those who lack understanding, and accordingly, embraces a kind of understanding transmission principle that parallels the kind of knowledge transmission principle that is presently a topic of controversy in the epistemology of testimony. Morris suggests that the writer of the Comanche book might lack understanding due to failing to endorse the relevant propositions, while the reader might have understanding because she does endorse the relevant propositions. He claims further that this description of the case undermines the intuition that the writer’s lack of understanding entails the reader’s lack of understanding. Of course, though, just as Lackey (2007) raises ‘creationist teacher’ style cases against knowledge transmission principles, one might as well raise a parallel kind of creationist teacher case against the thesis that one cannot attain understanding from a source who herself lacks it. In such a parallel case, we simply modify Lackey’s original case and suppose that Stella, a creationist teacher, who does not believe in evolution, nonetheless teaches it reliably and in accordance with the highest professional standards.
As Lackey thinks students can come to know evolutionary theory from this teacher despite the teacher not knowing the propositions she asserts (given that Stella fails the belief condition for knowledge), we might likewise think, and contra Morris, that Stella might fail to understand evolution. This is because Stella lacks beliefs on the matter, even though the students can gain understanding from her. To the extent that such a move is available, one has reason to resist Morris’s rationale for resisting Pritchard’s diagnosis of Kvanvig’s case.

5. Understanding and Epistemic Value

The topic of epistemic value has only relatively recently received sustained attention in mainstream epistemology. Even so, and especially over the past decade, there has been agreement amongst most epistemologists working on epistemic value that understanding is particularly valuable (though see Janvid 2012 for a rare dissenting voice). It is also becoming an increasingly popular position to hold that understanding is more epistemically valuable than knowledge (see Kvanvig 2003; Pritchard 2010). Although the analysis of the value of epistemic states has roots in Plato and Aristotle, this renewed and more intense interest was initially inspired by two coinciding trends in epistemology. On the one hand, there is the increasing support for virtue epistemology that began in the 1980s, and on the other there is growing dissatisfaction with the ever-complicated attempt to generate an account of knowledge that is appropriately immune to Gettier-style counterexamples (see, for example, DePaul 2009).

Unsurprisingly, the comparison between the nature of understanding as opposed to knowledge has coincided with comparisons of their respective epistemic value, particularly since Kvanvig (2003) first defended the superior epistemic value of the former over the latter. For example, in Whitcomb (2010: 8), we find the observation that “understanding is widely taken to be a ‘higher’ epistemic good: a state that is like knowledge and true belief, but even better, epistemically speaking.” Meanwhile, Pritchard (2009: 11) notes “as we might be tempted to put the point, we would surely rather understand than merely know.” A helpful clarification here comes from Grimm (2012: 105), who in surveying the literature on the value of understanding points out that the suggestion seems to be that understanding (of “a complex of some kind”) is better than the corresponding item of propositional knowledge. This type of view is a revisionist theory of epistemic value (see, for example, Pritchard 2010), which suggests that one would be warranted in turning more attention to an epistemic state other than propositional knowledge—specifically, according to Pritchard, understanding. The following sections consider why understanding might have such additional value.

a. Transparency

According to Zagzebski (2001), the epistemic value of understanding is tied not to elements of its factivity, but rather to its transparency. She claims, “it may be possible to know without knowing one knows, but it is impossible to understand without understanding one understands” (2001: 246) and suggests that this property of understanding might insulate it from skepticism. Zagzebski does not mean to say that to understand X, one must also understand one’s own understanding of X (as this threatens a psychologically implausible regress), but rather, that to understand X one must also understand that one understands X. Thus, given that understanding that p and knowing that p can in ordinary contexts be used synonymously (for example, understanding that it will rain is just to know that it will rain) we can paraphrase Zagzebski’s point with no loss as: understanding X entails knowing that one understands X. To the extent that this is right, Zagzebski is endorsing a kind of ‘KU’ principle (compare: KK).

Grimm (2006) and Pritchard (2010) counter that many of the most desirable instances of potential understanding, such as when we understand another person’s psychology or understand how the world works, are not transparent. In other words, they claim that one cannot always tell that one understands. Consider, for instance, the felicity of the question “Am I understanding this correctly?” and of the statement “I do not know if I understand my own defense mechanisms; I think I understand them, but I am not sure.” The other side of the coin is that one often can think that one understands things that one does not (for example, Trout 2007). Consider how some people think they grasp the ways in which their zodiac sign has an influence on their life path, yet their sense of understanding is at odds with the facts of the matter. More generally, as this line of criticism goes, sometimes we simply mistake mere (non-factive) intelligibility for understanding. As it were, from “the inside”, these can be indistinguishable much as, from the first-person perspective, mere true belief and knowledge can be indistinguishable. To the extent that these worries with transparency are apt, a potential obstacle emerges for the prospects of accounting for the value of understanding in terms of its transparency. Examples of the sort considered suggest that—even if understanding has some important internalist component to it—transparency of the sort Zagzebski is suggesting when putting forward the ‘KU’ claim is an accidental property of only some cases of understanding and not essential to understanding.

b. Cognitive Achievement

Pritchard’s (2010) account of the distinctive value of understanding is, in short, firstly, that understanding essentially involves a strong kind of finally valuable cognitive achievement, and secondly, that while knowledge comes apart from cognitive achievement in both directions, understanding does not. If, as robust virtue epistemologists have often insisted, cognitive achievement is finally valuable (that is, as an instance of achievements more generally), and understanding necessarily lines up with cognitive achievement but knowledge only sometimes does, then the result is a revisionary story about epistemic value. In other words, one mistakenly takes knowledge to be distinctively valuable only because knowledge often does have something—cognitive achievement—which is essential to understanding and which is finally valuable.

Firstly, achievement is often defined as success that is because of ability (see, for example, Greco 2007), where the most sensible interpretation of this claim is to see the ‘because’ as signifying a causal-explanatory relationship—this is, at least, the dominant view. The thought is that, in cases of achievement, the relevant success must be primarily creditable to the exercise of the agent’s abilities, rather than to some other factor (for example, luck). Achievements, unlike mere successes, are regarded as valuable for their own sake, mainly because of the way in which these special sorts of successes come to be.

It is helpful to consider an example. If we consider some goal—such as the successful completion of a coronary bypass—it is obvious that our attitude towards the successful coronary bypass is different when the completion is a matter of ability as opposed to luck. Assume that the surgeon is suffering from the onset of some degenerative mental disease and the first symptom is his forgetting which blood vessel he should be using to bypass the narrowed section of the coronary artery. The surgeon’s successful bypass is valued differently when one is made aware that it was by luck that he picked an appropriate blood vessel for the bypass. Given that the result is the same (that is, the patient’s heart muscle blood supply is improved) regardless of whether he successfully completes the operation by luck or by skill, the instrumental value of the action is the same. Given that the instrumental value is the same, our reaction to the two contrasting bypass cases seems to count in favor of the final value of successes because of ability—achievements. So too does the fact that one would rather have a success involving an achievement than a mere success, even when this difference has no pragmatic consequences. To borrow a case from Riggs, stealing an Olympic medal or otherwise cheating to attain it lacks the kind of value one associates with earning the medal, through one’s own skill. Achievements are thought of as being intrinsically good, though the existence of evil achievements (for example, skillfully committing genocide) and trivial achievements (for example, competently counting the blades of grass on a lawn) shows that we are thinking of successes that have distinctive value as achievements (Pritchard 2010: 30) rather than successes that have all-things-considered value.

Due to the possibility of overly simple or passive successes qualifying as cognitive achievements (for example, coming to truly believe that it is dark just by looking out of the window in normal conditions after 10pm), Pritchard cautions that we should distinguish between two classes of cognitive achievement—strong and weak:

Weak cognitive achievement: Cognitive success that is because of one’s cognitive ability.

Strong cognitive achievement: Cognitive success that is because of one’s cognitive ability where the success in question either involves the overcoming of a significant obstacle or the exercise of a significant level of cognitive ability.

On the basis of considerations Pritchard argues for in various places (2010; 2012; 2013; 2014), relating to cognitive achievement’s presence in the absence of knowledge (for example, in barn façade cases, where environmental luck is incompatible with knowledge but compatible with cognitive achievement) and the absence of cognitive achievement in the presence of knowledge (for example, in testimony cases in friendly environments, where knowledge acquisition demands very little on the part of the agent), he argues that cognitive achievement is not essentially wedded to knowledge (as robust virtue epistemologists would hold). In fact, he claims, the two come apart in both directions: yielding knowledge without strong cognitive achievement and—as in the case of understanding that lacks corresponding knowledge—strong cognitive achievement without knowledge. By contrast, Pritchard believes that understanding always involves strong cognitive achievement, that is, an achievement that necessarily involves either a significant exercise of skill or the overcoming of a significant obstacle. If Pritchard is right to claim that understanding is always a strong cognitive achievement, then understanding is always finally valuable if cognitive achievement is also always finally valuable, and moreover, valuable in a way that knowledge is not. See, however, Carter & Gordon (2014) for a recent criticism on the point of identifying understanding with strong cognitive achievement. See further Bradford (2013; 2015) for resistance to the very suggestion that there can be weak achievements in Pritchard’s sense—namely, achievements that do not necessarily involve great effort, regardless of whether they are primarily due to ability.

c. Curiosity

Taking curiosity to be of epistemic significance is not a new idea. Whitcomb (2010) notes that Goldman (1999) has considered that the significance or value of some item of knowledge might be at least in part determined by whether, and to what extent, it provides the knower with answers to questions that they are curious about. Whitcomb also cites Alston (2005) as endorsing a stronger view, according to which true belief or knowledge gets at least some of its epistemic value from its connection to, and satisfaction of, curiosity.

What is curiosity? According to Goldman (1991) curiosity is a desire for true belief; by contrast, Williamson views curiosity as a desire for knowledge. Kvanvig (2013) claims that both of these views are mistaken, and in the course of doing so, locates curiosity at the center of his account of understanding’s value. More specifically, Kvanvig aims to support the contention that objectual understanding has a special value knowledge lacks by arguing that the nature of curiosity—the “motivational element that drives cognitive machinery” (2013: 152)—underwrites a way of vindicating understanding’s final value.

The notion of curiosity that plays a role in Kvanvig’s line is a broadly inclusive one that is meant to include not just obvious problem-solving examples but also what he calls more “spontaneous” examples, such as turning around to see what caused a noise you just heard. He takes his account to be roughly in line with the layman’s concept of curiosity. His central claim is that curiosity “provides hope for a response-dependent or behaviour-centred explanation of the value of whatever curiosity involves or aims at”. This is explained in the following way: “If it is central to ordinary cognitive function that one is motivated to pursue X, then X has value in virtue of its place in this functional story.” Regarding the comparison between the value of understanding and the value of knowledge, then, he will say that if understanding is fundamental to curiosity then this provides at least a partial explanation for why its value is superior to that of knowledge. Kvanvig's own view is that the scope of curiosity is enough to support the unrestricted value of understanding, and he identifies its main opponent as a view on which knowledge is what is fundamental to curiosity. Specifically, he takes his opponent’s view to be that knowledge through direct experience is what sates curiosity, a view that traces to Aristotle. A central component of Kvanvig’s argument is negative: he regards knowledge as ill-suited to play the role of satisfying curiosity, and argues for this, in particular, by rejecting three arguments from Whitcomb to the contrary. According to his positive proposal, objectual understanding is the goal and what typically sates the “appetite” associated with curiosity. He concedes, though, that sometimes curiosity on a smaller scale can be sated by epistemic justification, and that what seems like understanding, but is actually just intelligibility, can sate the appetite when one is deceived.

Grimm (2012) has wondered whether this view might get things “explanatorily backwards”. This is because we might be tempted to say instead that we desire to make sense of things because it is good to do so rather than saying that it is good to make sense of things because we desire it. He also suggests that what epistemic agents want is not just to feel like they are making sense of things but to actually make sense of them. Owing to Kvanvig’s use of the words “perceived achievement”, Grimm thinks that the curiosity account of understanding’s value suggests that subjective understanding (or what is referred to as ‘intelligibility’ above) can satisfy the desire to make sense of the world or “really marks the legitimate end of inquiry.”

6. Future Research on Understanding

Where should an investigation of understanding in epistemology take us next? Although the work of a range of epistemologists highlighting some of the important features of understanding-why and objectual understanding has been discussed, there remain many interesting topics that warrant further research. For one thing, if understanding is both a factive and strongly internalist notion then a radical skeptical argument that threatens to show that we have no understanding is a very intimidating prospect (as Pritchard 2010: 86 points out). This skeptical argument is worth engaging with, presumably with the goal of showing that understanding does not turn out to be internally indistinguishable from mere intelligibility.

Secondly, there is plenty of scope for understanding to play a more significant role in social epistemology. For example, Carter and Gordon (2011) consider that there might be cases in which understanding, and not just knowledge, is the required epistemic credential to warrant assertion. Questions about when and what type of understanding is required for permissible assertion connect with issues related to expertise, in particular how we might define expertise and who has it. And, relatedly in social epistemology, we might wonder what, if any, testimonial transmission principles hold for understanding, and whether there are any special hearer conditions demanded by testimonial understanding acquisition that are not shared in cases of testimonial knowledge acquisition.

Thirdly, even if one accepts something like a moderate factivity requirement on objectual understanding—and thus demands of at least a certain class of beliefs one has of a subject matter that they be true—one can also ask further and more nuanced questions about the epistemic status of these true beliefs. Must they be known or can they be Gettiered true beliefs? Or—and this is a point that has received little attention—even more weakly, can the true beliefs be themselves unreliably formed or held on the basis of bad reasons? Suppose, for example, that I competently grasp the relevant coherence-making and explanatory relations between propositions about chemistry which I believe and which are true, but which I believe on an improper basis: for example, by trusting someone I should not have trusted, or even worse, by reading tea leaves which happened to afford me true beliefs about chemistry. Would this impede one’s understanding? If so, why, and if not, why not? Relatedly, if framed in terms of credence, what credence threshold must be met, with respect to propositions in some set, for the agent to understand that subject matter? One helpful way to think about this is as follows: if one takes a paradigmatic case of an individual who understands a subject matter thoroughly, and manipulates the credence the agent has toward the propositions constituting the subject matter, how low can one go before the agent no longer understands the subject matter in question?

Fourthly, a relatively fertile area for further research concerns the semantics of understanding attribution. To what extent do the advantages and disadvantages of, for example, sensitive invariantist, contextualist, insensitive invariantist and relativist approaches to knowledge attributions find parallels in the case of understanding attributions? Is it problematic to embrace, for example, a contextualist semantics for knowledge attributions while embracing, say, invariantism about understanding?

Fifthly, what ramifications might active externalist approaches in epistemology (for example, extended mind and extended cognition), which have recently been brought to bear on the theory of knowledge (see Carter et al. 2014), have for understanding? Toon (2015) has recently suggested, with reference to the hypothesis of extended cognition, that understanding can be located partly outside the head. Are the prospects of extending understanding via active externalism on a par with the prospects for extending knowledge, or is understanding essentially internal in a way that knowledge need not be?

Finally, there is fruitful work to do concerning the relationship between understanding and wisdom. For example, in Whitcomb (2011) we find the suggestion that theoretical wisdom is a form of particularly deep understanding. Whether wisdom might be a type of understanding or understanding might be a component of wisdom is a fascinating question that can draw on both work in virtue ethics and epistemology.

7. References and Further Reading

  • Alston, W. Beyond ‘Justification’: Dimensions of Epistemic Evaluation. Ithaca, N.Y.: Cornell University Press, 2005.
    • Includes Alston’s view of curiosity, according to which the epistemic value of true belief and knowledge partially comes from a link to curiosity.
  • Baker, L. R. “Third Person Understanding” in A. Sanford (ed.), The Nature and Limits of Human Understanding. London: Continuum, 2003.
    • Outlines a view on which understanding something requires making reasonable sense of it.
  • Batterman, R. W. “Idealization and modelling.” Synthese, 169(3) (2009): 427-446.
    • Endorses the idea that when we consider how things would be if something was true, we increase our access to further truths.
  • Bradford, G. “The Value of Achievements.” Pacific Philosophical Quarterly, 94(2) (2013): 204-224.
    • Resists Pritchard’s claim that there can be weak achievements, that is, ones that do not necessarily involve great effort.
  • Bradford, G. Achievement. Oxford: Oxford University Press, 2015.
    • A monograph that explores the nature and value of achievements in great depth.
  • Carter, J. A. and Gordon, E. C. “Norms of Assertion: The Quantity and Quality of Epistemic Support.” Philosophia 39(4) (2011): 615-635.
    • Argues that a type of understanding might be the norm that warrants assertion in a restricted class of cases.
  • Carter, J. A. and Gordon, E. C. “On Pritchard, Objectual Understanding and the Value Problem.” American Philosophical Quarterly 51 (2014): 1-14.
    • Criticizes the claim that understanding-why should be identified with strong cognitive achievement.
  • Carter, J. A., Kallestrup, J., Palermos, S. O. and Pritchard, D. “Varieties of Externalism.” Philosophical Issues 41(1) (2014): 63-109.
    • Considers some of the ramifications that active externalist approaches might have for epistemology.
  • Carter, J. A. and Pritchard, D. “Knowledge-How and Epistemic Luck.” Noûs (2013).
    • Discusses whether intellectualist arguments for reducing know-how to propositional knowledge might also apply to understanding-why (if it is a type of knowing how).
  • DePaul, M. “Ugly Analysis and Value” in A. Haddock, A. Millar and D. Pritchard (eds.), Epistemic Value. Oxford: Oxford University Press, 2009.
    • Looks at the increasing dissatisfaction with ever-more complicated attempts to generate a theory of knowledge immune to counterexamples.
  • Elgin, C. Z. “True Enough.” Philosophical Issues 14(1) (2004): 113-131.
    • Includes further discussion of the role of acceptance and belief in her view of understanding.
  • Elgin, C. “Understanding and the Facts.” Philosophical Studies 132 (2007): 33-42.
    • Argues against a factive conception of scientific understanding.
  • Elgin, C. “Exemplification, Idealization, and Understanding” in M. Suárez (ed.), Fictions in Science: Essays on Idealization and Modeling. London: Routledge, 2009.
    • Explores the epistemological role of exemplification and aims to illuminate the relationship between understanding and scientific idealizations construed as fictions.
  • DePaul, M. and Grimm, S. “Review Essay: Kvanvig’s The Value of Knowledge and the Pursuit of Understanding.” Philosophy and Phenomenological Research 74 (2007): 498-514.
    • Includes criticism of Kvanvig’s line on epistemic luck and understanding.
  • De Regt, H. and Dieks, D. “A Contextual Approach to Scientific Understanding.” Synthese 144 (2005): 137-170.
    • Offers an account of understanding that requires having a theory of the relevant phenomenon.
  • Gettier, E. “Is Justified True Belief Knowledge?” Analysis 23(6) (1963): 121-123.
    • Contains the famous counterexamples to the Justified True Belief account of knowledge.
  • Ginet, C. Knowledge, Perception and Memory. Dordrecht: Reidel, 1975.
    • Contains the paradigmatic case of environmental epistemic luck (that is, the fake barn case).
  • Goldman, A. “What is Justified Belief?” In G. S. Pappas (ed.), Justification and Knowledge. Dordrecht: Reidel, 1979.
    • Often-cited discussion of the fake barn counterexample to traditional accounts of knowledge that focus on justified true belief.
  • Goldman, A. “Stephen P. Stich: The Fragmentation of Reason.” Philosophy and Phenomenological Research 51(1) (1991): 189-193.
    • Discusses the connection between curiosity and true belief.
  • Goldman, A. Knowledge in a Social World. Oxford: Oxford University Press, 1999.
    • Contains exploration of whether the value of knowledge may be in part determined by the extent to which it provides answers to questions one is curious about.
  • Gordon, E. C. “Is There Propositional Understanding?” Logos & Episteme 3 (2012): 181-192.
    • Examines reasons to suppose that attributions of understanding are typically attributions of knowledge, understanding-why or objectual understanding.
  • Greco, J. “The Nature of Ability and the Purpose of Knowledge.” Philosophical Issues 17 (2007): 57-69.
    • Discusses and defines ability in the sense often appealed to in work on cognitive ability and the value of knowledge.
  • Grimm, S. “Is Understanding a Species of Knowledge?” British Journal for the Philosophy of Science 57 (2006): 515-535.
    • Analyzes Kvanvig’s Comanche case and argues that knowledge and understanding do not come apart in this example.
  • Grimm, S. “Understanding” In S. Bernecker and D. Pritchard (eds.), The Routledge Companion to Epistemology. New York: Routledge, 2011.
    • An overview of the object, psychology, and normativity of understanding.
  • Grimm, S. “The Value of Understanding.” Philosophy Compass 7(2) (2012): 103-117.
    • Gives an overview of recent arguments for revisionist theories of epistemic value that suggest understanding is more valuable than knowledge.
  • Grimm, S. “Understanding as Knowledge of Causes” in A. Fairweather (ed.), Virtue Epistemology Naturalized: Bridges Between Virtue Epistemology and Philosophy of Science. Dordrecht: Springer, 2014.
    • A novel interpretation of the traditional view according to which understanding-why can be explained in terms of knowledge of causes.
  • Hazlett, A. “The Myth of Factive Verbs.” Philosophy and Phenomenological Research 80:3 (2010): 497-522.
    • Argues that the ordinary concept of knowledge is not factive and that epistemologists should therefore not concern themselves with said ordinary concept.
  • Hempel, C. Aspects of Scientific Explanation and Other Essays in the Philosophy of Science. New York: Free Press, 1965.
    • Early defence of explanation’s key role in understanding.
  • Hetherington, S. “There Can be Lucky Knowledge” in M. Steup, J. Turri and E. Sosa (eds.), Contemporary Debates in Epistemology (2nd Edition). Oxford: Wiley-Blackwell, 2013.
    • A paper in which it is argued that (contrary to popular opinion) knowledge does not exclude luck.
  • Hills, A. “Moral Testimony and Moral Epistemology.” Ethics 120 (2009): 94-127.
    • In looking at moral understanding-why, outlines some key abilities that may be necessary to the “grasping” component of understanding.
  • Janvid, M. “Knowledge versus Understanding: The Cost of Avoiding Gettier.” Acta Analytica 27 (2012): 183-197.
    • Disputes the popular claim that understanding is more epistemically valuable than knowledge.
  • Kim, J. “Explanatory Knowledge and Metaphysical Dependence.” In his Essays in the Metaphysics of Mind. New York: Oxford University Press, 1994.
    • Contains Kim’s classic discussion of species of dependence (for example, mereological dependence).
  • Kelp, C. “Understanding Phenomena.” Synthese (2015).
    • Divides recent views of understanding according to whether they are “manipulationist” or “explanationist”; argues for a different view according to which understanding is maximally well-connected knowledge.
  • Khalifa, K. “Inaugurating Understanding or Repackaging Explanation.” Philosophy of Science 79(1) (2012): 15-37.
    • Argues that we should replace the main developed accounts of understanding with earlier accounts of scientific explanation.
  • Khalifa, K. “Is Understanding Explanatory or Objectual?” Synthese 190(6) (2013a): 1153-1171.
    • Proposes a framework for reducing objectual understanding to what he calls explanatory understanding.
  • Khalifa, K. “Understanding, Grasping and Luck.” Episteme 10 (1) (2013b): 1-17.
    • Argues against compatibility between understanding and epistemic luck.
  • Kvanvig, J. The Value of Knowledge and the Pursuit of Understanding. NY: Cambridge University Press, 2003.
    • The root of the recent resurgence of interest in understanding in epistemology. This book proposes a revisionist view of epistemic value and an outline of different types of understanding.
  • Kvanvig, J. “The Value of Understanding” In D. Pritchard, A. Haddock and A. Millar (eds.), Epistemic Value. Oxford: Oxford University Press, 2009.
    • Argues that the concerns plaguing theories of knowledge do not cause problems for a theory of understanding.
  • Kvanvig, J. “Curiosity and a Response-Dependent Account of the Value of Understanding.” In T. Henning and D. Schweikard (eds.), Knowledge, Virtue and Action. Boston: Routledge, 2013.
    • Proposes an account of understanding’s value that is related to its connection with curiosity.
  • Lackey, J. “Why We Don’t Deserve Credit for Everything We Know.” Synthese 156 (2007).
    • Contains Lackey’s counterexamples to the knowledge transmission principles.
  • Lipton, P. “Understanding Without Explanation” in H. de Regt, S. Leonelli, and K. Eigner (eds.), Scientific Understanding: Philosophical Perspectives. Pittsburgh, PA: University of Pittsburgh Press, 2009.
    • Argues that requiring knowledge of an explanation is too strong a condition on understanding-why.
  • Longworth, G. “Linguistic Understanding and Knowledge.” Nous 42 (2008): 50-79.
    • A discussion of whether linguistic understanding is a form of knowledge.
  • Morris, K. “A Defense of Lucky Understanding.” The British Journal for the Philosophy of Science 63 (2012): 357-371.
    • Attempts to explain away the intuitions suggesting that understanding is incompatible with epistemic luck.
  • Olsson, E. “Coherentist Theories of Epistemic Justification” in E. Zalta (ed.), The Stanford Encyclopedia of Philosophy.
    • An overview of coherentism that can be useful when considering how theories of coherence might be used to flesh out the grasping condition on understanding.
  • Pritchard, D. Epistemic Luck. Oxford: Oxford University Press, 2005.
    • An in-depth exploration of different types of epistemic luck.
  • Pritchard, D. “Recent Work on Epistemic Value.” American Philosophical Quarterly 44 (2007): 85-110.
    • Looks at understanding’s role in recent debates about epistemic value and contains key arguments against Elgin’s non-factive view of understanding.
  • Pritchard, D. “Knowing the Answer, Understanding and Epistemic Value.” Grazer Philosophische Studien 77 (2008): 325-39.
    • Explores understanding as the proper goal of inquiry, in addition to discussing understanding’s distinctive value.
  • Pritchard, D. “Knowledge, Understanding and Epistemic Value” In A. O’Hear (ed.), Epistemology (Royal Institute of Philosophy Lectures). Cambridge: Cambridge University Press, 2009.
    • Argues that understanding (unlike knowledge) is a type of cognitive achievement and therefore of distinctive value.
  • Pritchard, D. “The Value of Knowledge: Understanding.” In A. Haddock, A. Millar and D. Pritchard (eds.), The Nature and Value of Knowledge: Three Investigations. Oxford: Oxford University Press, 2010.
    • A longer discussion of the nature of understanding and its distinctive value (in relation to the value of knowledge) than in his related papers.
  • Pritchard, D. “Knowledge and Understanding” in A. Fairweather (ed.), Virtue Epistemology Naturalized: Bridges Between Virtue Epistemology and Philosophy of Science. Dordrecht: Springer, 2014.
    • Criticizes Grimm’s view of understanding as knowledge of causes.
  • Riaz, A. “Moral Understanding and Knowledge.” Philosophical Studies 172(2) (2015): 113-128.
    • Argues against the view that moral understanding can be immune to luck while moral knowledge is not.
  • Riggs, W. “Understanding Virtue and the Virtue of Understanding” In M. DePaul and L. Zagzebski (eds.), Intellectual Virtue: Perspectives from Ethics and Epistemology. Oxford: Oxford University Press, 2003.
    • Introduces intelligibility as an epistemic state similar to understanding but less valuable.
  • Riggs, W. “Why Epistemologists Are So Down on Their Luck.” Synthese 158 (3) (2007): 329-344.
    • Defends a lack of control account of luck.
  • Rohwer, Y. “Lucky Understanding Without Knowledge.” Synthese 191 (2014): 945-959.
    • Claims that understanding is entirely compatible with both intervening and environmental forms of veritic luck.
  • Salmon, W. “Four Decades of Scientific Explanation.” In Minnesota Studies in the Philosophy of Science, vol. 13. Eds. Philip Kitcher and Wesley Salmon. Minneapolis: University of Minnesota Press, 1989.
    • Another significant paper endorsing the claim that knowledge of explanations should play a vital role in our theories of understanding.
  • Sliwa, P. IV—Understanding and Knowing. In Proceedings of the Aristotelian Society (Hardback) (Vol. 115, No. 1pt1): pp. 57-74, 2015.
    • Defends the strong claim that propositional knowledge is necessary and sufficient for understanding.
  • Stanley, J. Know How. Oxford: Oxford University Press, 2011.
    • Outlines and evaluates the anti-intellectualist and intellectualist views of know-how.
  • Stanley, J and Williamson, T. “Knowing How.” Journal of Philosophy 98(8) (2001): 411-444.
    • An earlier paper defending the intellectualist view of know-how.
  • Strevens, M. “No Understanding Without Explanation.” Studies in History and Philosophy of Science 44 (2013): 510-515.
    • Defends views that hold explanation to be indispensable for an account of understanding and discusses what a non-factive account of grasping would look like.
  • Sullivan, E. “Understanding: Not Know-How.” Philosophical Studies (2017). https://doi.org/10.1007/s11098-017-0863-z
    • Resists the alleged similarity between understanding and knowing-how.
  • Toon, A. “Where is the Understanding?” Synthese, 2015.
    • Uses the hypothesis of extended cognition to argue that understanding can be located (at least partly) outside the head.
  • Trout, J.D. “The Psychology of Scientific Explanation.” Philosophy Compass 2(3) (2007): 564-591.
    • Contains a discussion of the fact that we often take ourselves to understand things we do not.
  • Van Camp, W. “Explaining Understanding (or Understanding Explanation).” European Journal for Philosophy of Science 4(1) (2014): 95-114.
    • Uses the concept of understanding to underwrite a theory of explanation.
  • Whitcomb, D. “Wisdom.” In S. Bernecker and D. Pritchard (eds.), The Routledge Companion to Epistemology. New York: Routledge, 2011.
    • An overview of wisdom, including its potential relationship to understanding.
  • Whitcomb, D. “Epistemic Value” In A. Cullison (ed.), The Continuum Companion to Epistemology. London: Continuum, 2012.
    • An overview of issues relating to epistemic value, including discussion of understanding as a “higher” epistemic state.
  • Wilkenfeld, D. “Understanding as Representation Manipulability.” Synthese 190 (2013): 997-1016.
    • Builds an account of understanding according to which understanding a subject matter involves possessing a representation that could be manipulated in a useful way.
  • Zagzebski, L. “Recovering Understanding” In M. Steup (ed.), Knowledge, Truth and Obligation. Oxford: Oxford University Press, 2001.
    • Includes arguments for the position that understanding need not be factive.
  • Zagzebski, L. On Epistemology. CA: Wadsworth, 2009.
    • An overview of the background, development and recent issues in epistemology, including a chapter on understanding as an epistemic good.

 

Author Information

Emma C. Gordon
Email: emma.gordon@ed.ac.uk
University of Edinburgh
Scotland, U.K.

Ayn Rand (1905—1982)

Ayn Rand was a major intellectual of the twentieth century. Born in Russia in 1905 and educated there, she immigrated to the United States after graduating from university. Upon becoming proficient in English and establishing herself as a writer of fiction, she became well-known as a passionate advocate of a philosophy she called Objectivism. This philosophy is in the Aristotelian tradition, with that tradition’s emphasis upon metaphysical naturalism, empirical reason in epistemology, and self-realization in ethics. Her political philosophy is in the classical liberal tradition, with that tradition’s emphasis upon individualism, the constitutional protection of individual rights to life, liberty, and property, and limited government. She wrote both technical and popular works of philosophy, and she presented her philosophy in both fictional and nonfictional forms. Her philosophy has influenced several generations of academics and public intellectuals, and has had widespread popular appeal.

Regarding human nature, Rand said, “Man is a being of self-made soul.” Rand believes human beings are not born in sin or with destructive desires; nor do they necessarily acquire them in the course of growing to maturity. Instead one is born morally tabula rasa (a blank slate), and through one’s choices and actions one acquires one’s character traits and habits. Having chronic desires to steal, rape, or kill others is the result of mistaken development and the acquisition of bad habits, just as are chronic laziness or the habit of eating too much junk food. And just as one is not born lazy but can by one’s choices develop oneself into a person of vigor or sloth, so also one is not born antisocial but can by one’s choices develop oneself into a person of cooperativeness or conflict.

Table of Contents

  1. Life
  2. Rand’s Ethical Theory: Rational Egoism
  3. Reason and Ethics
  4. Criticisms of Rand’s Ethics
  5. Conflicts of Interest
  6. Rand’s Influence
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Ayn Rand’s life was often as colorful as those of her heroes in her best-selling novels The Fountainhead and Atlas Shrugged. Rand first made her name as a novelist, publishing We the Living (1936), The Fountainhead (1943), and her magnum opus Atlas Shrugged (1957). These philosophical novels embodied themes she subsequently developed in nonfiction form in a series of essays and books written in the 1960s and 1970s.

Born in St. Petersburg, Russia, on February 2, 1905, Rand was raised in a middle-class family. As a child, she loved storytelling, and at age nine she decided to become a writer. In school she showed academic promise, particularly in mathematics. Her family was devastated by the communist revolution of 1917, both by the social upheavals that the revolution and the ensuing civil war brought and by her father’s pharmacy being confiscated by the Soviets. The family moved to the Crimea to recover financially and to escape the harshness of life the revolution brought to St. Petersburg. They later returned to Petrograd (the new name given to St. Petersburg by the Soviets), where Rand was to attend university.

At the University of Petrograd, Rand concentrated her studies on history, with secondary focuses on philosophy and literature. At university, she was repelled by the dominance of communist ideas and strong-arm tactics that suppressed free inquiry and discussion. As a youth, she had been repelled by the communists’ political program, and now, as an adult, she was also more fully aware of the destructive effects that the revolution had had on Russian society more broadly.

Having studied American history and politics at university, and having long been an admirer of Western plays, music, and movies, she became an admirer of American individualism, vigor, and optimism, seeing them as the opposites of Russian collectivism, decay, and gloom. Not believing, however, that she would be free under the Soviet system to write the kinds of books she wanted to write, she resolved to leave Russia and go to America.

Rand graduated from the University of Petrograd in 1924. She then enrolled at the State Institute for Cinema Arts in order to study screenwriting. In 1925, she finally received permission from the Soviet authorities to leave the country in order to visit relatives in the United States. Officially, her visit was to be brief; Rand, however, had already decided not to return to the Soviet Union.

After several stops in western European cities, Rand arrived in New York City in February 1926. From New York, she traveled on to Chicago, Illinois, where she spent the next six months living with relatives, learning English, and developing ideas for stories and movies. She had decided to become a screenwriter, and, having received an extension to her visa, she left for Hollywood, California.

On Rand’s second day in Hollywood, an event occurred that was worthy of her fiction. She was spotted by Cecil B. DeMille, one of Hollywood’s leading directors, while she was standing at the gate of his studio. She had recognized him as he was passing by in his car, and he had noticed her staring at him. He stopped to ask why she was staring, and Rand explained that she had recently arrived from Russia, that she had long been passionate about Hollywood movies, and that she dreamed of being a screenwriter. DeMille was then working on “The King of Kings,” and gave her a ride to his movie set and signed her on as an extra. During her second week at DeMille’s studio, another significant event occurred: Rand met Frank O’Connor, a young actor also working as an extra. Rand and O’Connor were married in 1929, and they remained married for fifty years until his death in 1979.

Rand worked for DeMille as a reader of scripts and struggled financially while working on her own writing. She also held a variety of non-writing jobs until in 1932 she was able to sell her first screenplay, “Red Pawn,” to Universal Studios. Also in 1932 her first stage play, “Night of January 16th,” was produced in Hollywood and later on Broadway.

Rand had been working for years on her first significant novel, We the Living, and finished it in 1933. However, for several years it was rejected by various publishers, until in 1936 it was published by Macmillan in the U.S. and Cassell in England. Rand described We the Living as the most autobiographical of her novels, its theme being the brutality of life under communist rule in Russia. We the Living did not receive a positive reaction from American reviewers and intellectuals. It was published in the 1930s, a decade sometimes called the “Red Decade,” during which American intellectuals were often pro-communist and respectful and admiring of the Soviet experiment.

Rand’s next major project was The Fountainhead, which she had begun to work on in 1935. While the theme of We the Living was political, the theme of The Fountainhead was ethical, focusing on individualist themes of independence and integrity. The novel’s hero, the architect Howard Roark, is Rand’s first embodiment of her ideal man, the man who lives on a principled and heroic scale of achievement.

As with We the Living, Rand had difficulties getting The Fountainhead published. Twelve publishers rejected it before it was published by Bobbs-Merrill in 1943. Again not well received by reviewers and intellectuals, the novel nonetheless became a best seller, primarily through word-of-mouth recommendation. The Fountainhead made Rand famous as an exponent of individualist ideas, and its continuing to sell well brought her financial security. Warner Brothers produced a movie version of the novel in 1949, starring Gary Cooper and Patricia Neal, for which Rand wrote the screenplay.

In 1946, Rand began work on her most ambitious novel, Atlas Shrugged. At the time, she was working part-time as a screenwriter for producer Hal Wallis. In 1951, she and her husband moved to New York City, where she began to work full-time on Atlas. Published by Random House in 1957, Atlas Shrugged is her most complete expression of her literary and philosophical vision. Dramatized in the form of a mystery about a man who stopped the motor of the world, the plot and characters embody the political and ethical themes first developed in We the Living and The Fountainhead and integrate them into a comprehensive philosophy including metaphysics, epistemology, economics, and the psychology of love and sex.

Atlas Shrugged was an immediate best seller and Rand’s last work of fiction. Her novels had expressed philosophical themes, although Rand considered herself primarily a novelist and only secondarily a philosopher. The creation of plots and characters and the dramatization of achievements and conflicts were her central purposes in writing fiction, rather than presenting an abstracted and didactic set of philosophical theses.

The Fountainhead and Atlas Shrugged, however, had attracted to Rand many readers who were strongly interested in the philosophical ideas the novels embodied and in pursuing them further. Among the earliest of those with whom Rand became associated and who later became prominent were psychologist Nathaniel Branden and economist Alan Greenspan, later Chairman of the Federal Reserve. Her interactions with these and several other key individuals were partly responsible for Rand’s turning from fiction to nonfiction writing in order to develop her philosophy more systematically.

From 1962 until 1976, Rand wrote and lectured on her philosophy, now named “Objectivism.” Her essays during this period were mostly published in a series of periodicals: The Objectivist Newsletter, published from 1962 to 1965; the larger periodical The Objectivist, published from 1966 to 1971; and then The Ayn Rand Letter, published from 1971 to 1976. The essays written for these periodicals form the core material for a series of nine nonfiction books published during Rand’s lifetime. These books develop Rand’s philosophy in all its major categories and apply it to cultural issues. Perhaps the most significant of these books are The Virtue of Selfishness, which develops her ethical theory, Capitalism: The Unknown Ideal, devoted to political and economic theory, Introduction to Objectivist Epistemology, a systematic presentation of her theory of concepts, and The Romantic Manifesto, a theory of aesthetics.

During the 1960s, Rand’s most significant professional relationship was with Nathaniel Branden. Branden, author of The Psychology of Self-Esteem and later known as a leader in the self-esteem movement in psychology, wrote many essays on philosophical and psychological topics that were published in Rand’s books and periodicals. He was the founder and head of the Nathaniel Branden Institute, the leading Objectivist institution of the 1960s. Based in New York City, the Nathaniel Branden Institute published with Rand’s sanction numerous periodicals and pamphlets and sponsored many lectures in New York that were then distributed on tape around the United States and the rest of the world. The rapid growth of the Nathaniel Branden Institute and the Objectivist movement came to a halt in 1968 when, for both professional and personal reasons, Rand and Branden parted ways.

Rand continued to write and lecture consistently until she stopped publishing The Ayn Rand Letter in 1976. Thereafter she wrote and lectured less as her husband’s health declined, leading to his death in 1979, and as her own health began to decline. Rand died on March 6, 1982, in her New York City apartment.

2. Rand’s Ethical Theory: Rational Egoism

The provocative title of Ayn Rand’s The Virtue of Selfishness matches an equally provocative thesis about ethics. Traditional ethics has always been suspicious of self-interest, praising acts that are selfless in intent and calling amoral or immoral acts that are motivated by self-interest. A self-interested person, on the traditional view, will not consider the interests of others and so will slight or harm those interests in the pursuit of his own.

Rand’s view is that the exact opposite is true: Self-interest, properly understood, is the standard of morality and selflessness is the deepest immorality.

Self-interest rightly understood, according to Rand, is to see oneself as an end in oneself. That is to say that one’s own life and happiness are one’s highest values, and that one does not exist as a servant or slave to the interests of others. Nor do others exist as servants or slaves to one’s own interests. Each person’s own life and happiness are their ultimate ends. Self-interest rightly understood also entails self-responsibility: One’s life is one’s own, and so is the responsibility for sustaining and enhancing it. It is up to each of us to determine what values our lives require, how best to achieve those values, and to act to achieve those values.

Rand’s ethic of self-interest is integral to her advocacy of classical liberalism. Classical liberalism, more often called “libertarianism” in the twentieth century, is the view that individuals should be free to pursue their own interests. This implies, politically, that governments should be limited to protecting each individual’s freedom to do so. In other words, the moral legitimacy of self-interest implies that individuals have rights to their lives, their liberties, their property, and the pursuit of their own happiness, and that the purpose of government is to protect those rights. Economically, leaving individuals free to pursue their own interests implies in turn that only a capitalist or free market economic system is moral: Free individuals will use their time, money, and other property as they see fit, and will interact and trade voluntarily with others to mutual advantage.

3. Reason and Ethics

Fundamentally, the means by which humans live is reason. Our capacity for reason is what enables us to survive and flourish. We are not born knowing what is good for us; that is learned. Nor are we born knowing how to achieve what is good for us; that too is learned. It is by reason that we learn what is food and what is poison, what animals are useful or dangerous to us, how to make tools, what forms of social organization are fruitful, and so on.

Thus, Rand advocates rational self-interest: One’s interests are not whatever one happens to feel like; rather it is by reason that one identifies what is in one’s interest and what is not. By the use of reason one takes into account all of the factors one can identify, projects the consequences of potential courses of action, and adopts principled policies of action.

The principled policies a person should adopt are called virtues. A virtue is an acquired character trait; it results from identifying a policy as good and committing to acting consistently in terms of that policy.

One such virtue is rationality: Having identified the use of reason as fundamentally good, the virtue of rationality is being committed to acting in accordance with reason. Another virtue is productiveness: Given that the values one needs to survive must be produced, the virtue of productiveness is being committed to producing those values. Another is honesty: Given that facts are facts and that one’s life depends on knowing and acting in accordance with the facts, the virtue of honesty is being committed to awareness of the facts.

Independence and integrity are also core virtues for Rand’s account of self-interest. Given that one must think and act by one’s own efforts, being committed to the policy of independent action is a virtue. And given that one must both identify what is in one’s interests and act to achieve it, the virtue of integrity is a policy of being committed to acting on the basis of one’s beliefs. The opposite policy of believing one thing and doing another is of course the vice of hypocrisy; hypocrisy is a policy of self-destruction, on Rand’s view.

Justice is another core self-interested virtue: Justice, on Rand’s account, means a policy of judging people, including oneself, according to their value and acting accordingly. The opposite policy of giving to people more or less than they deserve is injustice. The final virtue on Rand’s list of core virtues is pride, the policy of “moral ambitiousness,” in Rand’s words. This means a policy of being committed to making oneself be the best one can be, of shaping one’s character to the highest level possible.

The moral person, in summary, on Rand’s account, is someone who acts and is committed to acting in their best self-interest. It is by living the morality of self-interest that one survives, flourishes, and achieves happiness.

4. Criticisms of Rand’s Ethics

Every aspect of Rand’s philosophy is subject to lively criticism and debate, but her normative views are the ones most focused upon.

From the broadly defined conservative right, the main criticisms are (a) that Rand’s metaphysical naturalism involves an atheism that undercuts religious metaphysics, (b) that her strong emphasis upon empirical data and reason undercuts epistemologies based on faith and tradition, and (c) that her normative individualism undercuts the commands of duty, obligation, and selflessness that are necessary for achieving social values. From the left, again defined broadly, the main criticisms are (a) that Rand’s individualism atomistically isolates each of us from genuine society, (b) that her advocacy of free markets enables strong-versus-weak exploitation, and, in left-postmodern critique, (c) that her philosophical fundamentals commit her to an untenable foundationalism and absolutism.

Here we will focus only on the arguments over Rand’s account of self-interest, which is currently a minority position and subject to strong criticism from both the philosophical left and the philosophical right.

The contrasting view of self-interest typically pits it against morality, holding that one is moral only to the extent that one sacrifices one’s self-interest for the sake of others or, more moderately, to the extent one acts primarily with regard to the interests of others. For example, standard versions of morality will hold that one is moral to the extent one sets aside one’s own interests in order to serve God, or the weak and the poor, or society as a whole. On these accounts, the interests of God, the poor, or society as a whole are held to be of greater moral significance than one’s own, and so accordingly one’s interests should be sacrificed when necessary. These ethics of selflessness thus believe that one should see oneself fundamentally as a servant, as existing to serve the interests of others, not one’s own. “Selfless service to others” or “selfless sacrifice” are stock phrases indicating these accounts’ view of appropriate motivation and action.

One core difference between Rand’s self-interest view and the selfless view can be seen in the reason why most advocates of selflessness think self-interest is dangerous: conflicts of interest.

5. Conflicts of Interest

Most traditional ethics take conflicts of interest to be fundamental to the human condition, and take ethics to be the solution: Basic ethical principles are to tell us whose interests should be sacrificed in order to resolve the conflicts. If there is, for example, a fundamental conflict between what God wants and what humans naturally want, then religious ethics will make fundamental the principle that human wants should be sacrificed for God’s. If there is a fundamental conflict between what society needs and what individuals want, then some versions of secular ethics will make fundamental the principle that the individual’s wants should be sacrificed for society’s.

Taking conflicts of interest to be fundamental almost always stems from one of two beliefs: that human nature is fundamentally destructive or that economic resources are scarce. If human nature is fundamentally destructive, then humans are naturally in conflict with each other. Many ethical philosophies start from this premise—for example, Plato’s myth of Gyges, Jewish and Christian accounts of original sin, and Freud’s account of the id. If what individuals naturally want to do to each other is rape, steal, and kill, then in order to have society these individual desires need to be sacrificed. Consequently, a basic principle of ethics will be to urge individuals to suppress their natural desires so that society can exist. In other words, self-interest is the enemy, and must be sacrificed for others.

If economic resources are scarce, then there is not enough to go around. This scarcity then puts human beings in fundamental conflict with each other: For one individual’s need to be satisfied, another’s must be sacrificed. Many ethical philosophies begin with this premise. For example, Thomas Malthus’s theory that population growth outstrips growth in the food supply falls into this category. Karl Marx’s account of capitalist society is that brutal competition leads to the exploitation of some by others. Garrett Hardin’s famous use of the lifeboat analogy asks us to imagine that society is like a lifeboat with more people than its resources can support. And so, in order to solve the problem of destructive competition the lack of resources leads us to, a basic principle of ethics will be to urge individuals to sacrifice their interests in obtaining more, or even some, so that others may obtain more or some and society can exist peacefully. In other words, in a situation of scarcity, self-interest is the enemy and must be sacrificed for others.

Rand rejects both the scarce resources and destructive human nature premises. Human beings are not born in sin or with destructive desires; nor do they necessarily acquire them in the course of growing to maturity. Instead one is born morally tabula rasa (“blank slate”), and through one’s choices and actions one acquires one’s character traits and habits. As Rand phrased it, “Man is a being of self-made soul.” Having chronic desires to steal, rape, or kill others is the result of mistaken development and the acquisition of bad habits, just as are chronic laziness or the habit of eating too much junk food. And just as one is not born lazy but can by one’s choices develop oneself into a person of vigor or sloth, one is not born antisocial but can by one’s choices develop oneself into a person of cooperativeness or conflict.

Nor are resources scarce, according to Rand, in any fundamental way. By the use of reason, humans can discover new resources and how to use existing resources more efficiently, including recycling where appropriate and making productive processes more efficient. Humans have, for example, continually discovered and developed new energy resources, from animals to wood to coal to oil to nuclear fission to solar panels; and there is no end in sight to this process. At any given moment, the available resources are a fixed amount, but over time the stock of resources is and has been constantly expanding.

Because humans are rational they can produce an ever-expanding number of goods, and so human interests do not fundamentally conflict with each other. Instead, Rand holds that the exact opposite is true: Since humans can and should be productive, human interests are deeply in harmony with each other. For example, my producing more corn is in harmony with your producing more peas, for by our both being productive and trading with each other we are both better off. It is to your interest that I be successful in producing more corn, just as it is to my interest that you be successful in producing more peas.

Conflicts of interest do exist within a narrower scope. For example, in the immediate present available resources are more fixed, and so competition for those resources results, and competition produces winners and losers. Economic competition, however, is a broader form of cooperation, a social way to allocate resources without resorting to physical force and violence. By competition, resources are allocated efficiently and peacefully, and in the long run more resources are produced. Thus, a competitive economic system is in the self-interest of all of us.

Accordingly, Rand argues that her ethic of self-interest is the basis for personal happiness and free and prosperous societies.

6. Rand’s Influence

The impact of Rand’s ideas is difficult to measure, but it has been large. All her books were still in print as of 2017, had sold more than thirty million copies, and continued to sell approximately one million copies each year. A survey jointly conducted by the Library of Congress and the Book of the Month Club early in the 1990s asked readers to name the book that had most influenced their lives: Atlas Shrugged was second only to the Bible. Excerpts from Rand’s works are regularly reprinted in college textbooks and anthologies, and several volumes have been published posthumously containing her early writings, journals, and letters. As an outsider with iconoclastic views, Rand has had limited influence within the academic world, though university press books and scholarly articles about her work continue to be published regularly. Outside the academic world are several institutes founded by those influenced by Rand. Noteworthy among these is the Cato Institute, based in Washington, D.C., the leading libertarian think tank. Rand, along with Nobel Prize-winners Friedrich Hayek and Milton Friedman, was highly instrumental in attracting generations of individuals to the libertarian movement. Also noteworthy are the Ayn Rand Institute, founded in 1985 by philosopher Leonard Peikoff and entrepreneur Edward Snider and based in California, and The Atlas Society, founded in 1990 by philosopher David Kelley and based in Washington, D.C.

7. References and Further Reading

a. Primary Sources

  • Rand, Ayn. Atlas Shrugged. Random House, 1957.
    • Rand’s magnum opus of fiction.
  • Rand, Ayn. Capitalism: The Unknown Ideal. New American Library, 1967.
    • A collection of twenty of Rand’s essays on politics, history, and economics. Also includes two essays by psychologist Nathaniel Branden, three by economist Alan Greenspan, and one by historian Robert Hessen.
  • Rand, Ayn. The Fountainhead. Bobbs-Merrill, 1943.
    • The novel of individualism, independence, and integrity that made Rand famous.
  • Rand, Ayn. Introduction to Objectivist Epistemology. New American Library, 1979.
    • Rand’s theory of concept-formation. Includes an essay by philosopher Leonard Peikoff on the analytic/synthetic distinction.
  • Rand, Ayn. Philosophy: Who Needs It. Bobbs-Merrill, 1982.
    • A collection of Rand’s essays on the nature and significance of philosophy, including her critiques of other thinkers such as Kant, Aristotle, Rawls, and Skinner.
  • Rand, Ayn. The Romantic Manifesto. World Publishing, 1969. Paperback edition: New American Library, 1971.
    • A collection of Rand’s essays on philosophy of art and aesthetics.
  • Rand, Ayn. The Virtue of Selfishness. New American Library, 1964.
    • A collection of fourteen of Rand’s essays on ethics. Also includes five essays by psychologist Nathaniel Branden.
  • Rand, Ayn. We the Living. Macmillan, 1936.
    • Rand’s first novel, set in the Soviet Union in the years following the Russian Revolution.

b. Secondary Sources

  • Badhwar, Neera, and Long, Roderick T. “Ayn Rand,” The Stanford Encyclopedia of Philosophy, 2010/2016.
    • Two philosophers present an overview of Rand’s life and work in the major areas of philosophy, with special attention to several major disagreements among philosophers working within Objectivism.
  • Binswanger, Harry. The Biological Basis of Teleological Concepts. Los Angeles, CA: A.R.I. Press, 1990.
    • Written by a philosopher, this is a scholarly work focused on the connection between biology and the concepts at the roots of ethics.
  • Branden, Nathaniel. The Vision of Ayn Rand: The Basic Principles of Objectivism. Cobden Press, 2009.
    • A comprehensive overview of Rand’s philosophy based on the lecture series presented under Rand’s auspices in the 1960s.
  • Branden, Nathaniel, and Branden, Barbara. Who Is Ayn Rand? New York: Random House, 1962.
    • This book contains essays on Objectivism’s moral philosophy, its connection to psychological theory, and a literary study of Rand’s novel methods. It contains an additional biographical essay, tracing Rand’s life from birth up until her mid-50s.
  • Burns, Jennifer. Goddess of the Market: Ayn Rand and the American Right. Oxford University Press, 2009.
    • Written by a historian, a scholarly discussion of Rand’s ambiguous relationship with free market, libertarian, and conservative movements.
  • Gotthelf, Allan and Salmieri, Gregory. A Companion to Ayn Rand. Wiley-Blackwell, 2016.
    • The editors have compiled a series of scholarly entries on all of the major elements of Rand’s philosophy.
  • Gotthelf, Allan and Lennox, James. Concepts and Their Role in Knowledge. University of Pittsburgh Press, 2013.
    • Ten philosophers debate Rand’s epistemology, with focused articles on her theories of perception, concepts, and scientific method.
  • Gotthelf, Allan and Lennox, James. Metaethics, Egoism, and Virtue: Studies in Ayn Rand’s Normative Theory. University of Pittsburgh Press, 2010.
    • Eight philosophers debate Rand’s ethical theory.
  • Hessen, Robert. In Defense of the Corporation. Stanford, CA: Hoover Institution, 1979.
    • An economic historian, Hessen argues and defends from an Objectivist perspective the moral and legal status of the corporate form of business organizations.
  • Hicks, Stephen. “Ayn Rand and Contemporary Business Ethics.” Journal of Accounting, Ethics, and Public Policy 3:1, 2003.
    • A philosopher explores the implications of Rand’s ethics for the foundations of business ethics.
  • Hicks, Stephen. “Egoism in Nietzsche and Ayn Rand.” Journal of Ayn Rand Studies 10:2, 2009.
    • A philosopher compares and contrasts the positions that underlie Nietzsche’s and Rand’s theses on egoism and altruism.
  • Kelley, David. The Evidence of the Senses. Baton Rouge: Louisiana State University Press, 1986.
    • Written by a philosopher working within the Objectivist tradition, this scholarly work in epistemology focuses on the foundational role the senses play in human knowledge.
  • Mayhew, Robert. Ayn Rand’s Marginalia. New Milford, CT: Second Renaissance Books, 1995.
    • This volume contains Rand’s critical comments on over twenty thinkers, including Friedrich Hayek, C. S. Lewis, and Immanuel Kant. Edited by a philosopher, the volume contains facsimiles of the original texts with Rand’s comments on facing pages.
  • Peikoff, Leonard. The Ominous Parallels: The End of Freedom in America. New York: Stein & Day, 1982.
    • A scholarly work in the philosophy of history, arguing Objectivism’s theses about the role of philosophical ideas in history and applying them to explaining the rise of National Socialism.
  • Peikoff, Leonard. Objectivism: The Philosophy of Ayn Rand. New York: Dutton, 1991.
    • This is the first comprehensive overview of all aspects of Objectivist philosophy, written by the philosopher closest to Rand during her lifetime.
  • Rasmussen, Douglas and Douglas Den Uyl, editors. The Philosophic Thought of Ayn Rand. Urbana, IL: University of Illinois Press, 1984.
    • A collection of scholarly essays by philosophers, defending and criticizing various aspects of Objectivism’s metaphysics, epistemology, ethics, and politics.
  • Reisman, George. Capitalism: A Treatise on Economics. Ottawa, IL: Jameson Books, 1996.
    • A scholarly work by an economist, developing free-market capitalist economic theory, especially that coming out of the Austrian tradition, and connecting it to Objectivist philosophy.
  • Sciabarra, Chris Matthew. Ayn Rand, The Russian Radical. University Park: Pennsylvania State University Press, 1995.
    • A work in history of philosophy, this book attempts to trace the influence upon Rand’s thinking of dialectical approaches to philosophy prevalent in 19th century Europe and Russia. Also an introduction and overview of the major branches of Objectivist philosophy.
  • Smith, Tara. Ayn Rand’s Normative Ethics: The Virtuous Egoist. Cambridge University Press, 2006.
    • A scholarly work by a philosopher on Rand’s meta-ethics and its application in normative ethics.
  • Wilkinson, Will, editor. “What’s Living and Dead in Ayn Rand’s Moral and Political Thought?” Cato Unbound, 2010.
    • Four professors of philosophy—Douglas B. Rasmussen, Michael Huemer, Neera K. Badhwar, and Roderick T. Long—discuss and debate the current state of Rand scholarship.
  • Zwolinski, Matthew. “Is Ayn Rand Right about Rights?” Learn Liberty, April 2017.
    • A philosophy professor argues that Rand’s theory of individual rights is subject to three major criticisms.

 

Author Information

Stephen R. C. Hicks
Email: shicks@rockford.edu
Rockford University
U. S. A.

The African Predicament

The African predicament is a concept that explains the aggregate of plights that threaten the African people. It is also an account that combines methods from various disciplines since the robustness of the theme is not limited to the field of philosophy alone but serves as a problem for consideration in the social sciences, sciences, and the humanities; thus it interrogates the predicaments of the African from all these perspectives. This task of interrogation becomes more demanding because of the critical analytic outlook of philosophy in embracing various methods that are relevant in the African predicament. Although it places the African as being on the defensive side of reality having been bedeviled with numerous plagues, it does not exempt the African race from its own part in the problematic situations in which it finds itself. While constantly searching for scapegoats to apportion blame in order to gain psychological relief, Africans are also a threat to themselves; hence a people who have been trained to laugh at themselves bear the greater burden of ensuring liberation, not from the clutches of an alien, but from the enemy that lies within. Clarke (1991, 24) puts it succinctly when he says that a people who have been dehumanized have among them a separate group who are at odds within themselves. It is worth noting that this article aims to present what is basic and common knowledge insofar as the theme of the African predicament is concerned rather than to demonize any particular race or people.

Table of Contents

  1. Introduction
  2. Various Dimensions of the African Predicament
    1. Economic Enslavement and the Crisis of Leadership
    2. Mis-education of Africans and the Falsification of History
      1. Why is the Study of African History Necessary?
    3. Philosophy and Western Historiography
    4. Culture and Identity Crisis
    5. Religion
  3. Conclusion
  4. References and Further Readings

1. Introduction

The concept of the African predicament is a holistic one that can be viewed through various lenses depending on the approach(es) with which scholars decide to begin their debate. So no single scholar exhausts in totality this captivating yet problematic theme, but even from their relative perspectives, there is a point of consensus on what constitutes the African predicament. Stanislav Andreski succinctly captures the purpose of the theme that largely lies in exploring those obstacles that bedevil the African continent and thus hinder it on the road to wholesome prosperity, internal peace, and basic/fundamental freedom (Andreski, 1968, 11). Obi Oguejiofor attests to the relativity of such claims but goes further to opine that in the midst of divergent opinions of the African plague, there is a general consensus that much of Africa is in a precarious state, and this concern runs very deep in the mind of an African (Oguejiofor, 2001, 7). Although the predicament of the African people ranges across cultural, political, economic, religious, historical, and psychological factors, there is a single thread that binds all of these together in the collective psyche of the African. Until the African is able to mentally decolonize himself or herself, there would be a constant race-to-the-bottom approach away from all other factors that make the liberation and prosperity of the African race possible. The African predicament becomes a relevant theme in African philosophy because while the sociologist, psychologist, historian, artist, scientist, and so on create ideological superstructures in their various disciplines, philosophy, which thrives mainly on objectivity, harmonizes experiences and views from all fields of study in a critical manner without bias toward or preference for any; therefore, philosophy is that necessary stem that should hold all other branches in an objectified manner. In not laying claim to any discipline, philosophy lays claim to all disciplines.

When we make reference to the African predicament, “African” in this context is not limited to a black individual on the African soil. This is because over the years, the dispersal of the African people has been made possible by the events of colonialism and imperialism. Even before the advent of colonialism, Africans were evidently residing in parts of the Western hemisphere antedating the Columbus-discovery conspiracy theory (Imhotep, 2012, 17). Both in Africa and every other continent where the black individual is located, the plights are similar. Wilson attests to the fact that the black race is wholly bonded not necessarily because they came from the womb of the same woman but because the shared experience coupled with the long history of collective parentage brings them together (Wilson, 2014, 50).

2. Various Dimensions of the African Predicament

One of the greatest challenges Africans have is the location of affirmation of their true identity. They carry this burden into economics, culture, religion, and every facet of their being. Thus, having been washed white as snow from their sins by a white Jesus, because sin which is immaterial has been bestowed with a Lockean secondary quality of darkness/blackness, they also wait for a white-washing of their economy, education, culture, religion, and so on. Hence, the richness of their identity as Africans is not dependent on what they make of it but is dependent on what the American or European says it is. The immateriality of salvation by a white savior and the materiality of socio-political, economic, and cultural redemption from the racialisation of colour bear some semblance (Wilson 2014, 38). And since even the location of heaven and hell is determined by the white individual, they are left with the consolation of both heaven and hell as alternative possibilities so as to distract them from their obligations in the material world, all to the advantage of the colonizers who are then left to extract from the natural universe. Since they have no unique identity of their own, they become whoever anybody says they are. At that point when they forget their location in time and space for the benefit of heaven alone, social amnesia sets in (Wilson, 2014, 41, 58).

It comes to the fore that the African predicament began with a racial distinction of colours; however, it is now evident that this predicament has transcended from raciality into the mind. Hence, it is evidently possible to see black individuals in white souls carrying a burden of identity everywhere they go; this burden of identity further complicates the problem because while the average African sees himself or herself as black, he or she fails to also see beyond this Lockean secondary quality of colours. One of the plights that bedevil the African continent is the inability to rally round a race-centred consciousness that sees the black individual first before any other: “No race of people has triumphed without these vital motivational, mental and behavioral orientations, for they are the keystones in the construction of liberated and prosperous peoples” (Wilson, 2014, 375).

The African predicament is explicated within five major themes. The first theme, which is the notion of economic enslavement and the crisis of leadership, x-rays how the problem of lack of leadership on the part of Africans has been exploited to imperialize the African continent and control its resources. The second theme is the mis-education of Africans and the falsification of history and how it has further led to collective historical dementia. Because the education that is received is largely a transmission from the occidental world, the third theme explicates how philosophy and western historiography interrogate this quest to further expunge the African continent from the development of the history of philosophy. The fourth theme is the culture and identity crisis where there is a gradual effort to show how the craving for an identity, other than African, depletes the collective humanity of the African people. Finally, the fifth theme exposes how religion is used to enslave, rather than liberate, the black individual. The onus rests on no one else but the African people to disturb the equilibrium in order to regain independence in all spheres of their being.

a. Economic Enslavement and the Crisis of Leadership

Long before Oguejiofor wrote his Philosophy and the African Predicament, Andreski had written a piece centred on the theme, The African Predicament. However, it is worth noting that in no page of his work did Oguejiofor make reference to the work of Andreski, yet in all their submissions, the problems noted by Andreski in the ’60s are still the same problems, bearing the same form with different matter (matter and form are used here in the philosophical sense). Similarly in writing his book, The Destruction of Black Civilization, Chancellor Williams made reference to the fact that the subtitle of the book, Great Issues of a Race from 4500 B.C. to 2000 A.D., represents a present continuous usage since “the main obstacles which confronted us in the past and are with us today will still be with us in the year 2000 and after, but also that for the rest of this century it is very likely that the Blacks will still be meeting, listening to and applauding fiery, soul-stirring speeches, protesting and denouncing injustices or happily relying on politics as the ultimate solution of our problems” (Williams, 1987, 320). Moreover, Wilson reflects the same thinking in his work, The Falsification of Afrikan Consciousness, that the problems of the black race today in seeking recognition predate the present moment. This cry has been going on since the nineteenth century, and even in American society there had always been inclusion of the black race in the activities of the nation, but this inclusivity is neutralized with a white-dominated ideology (Wilson, 2014, 7-12).

One of the most evident precarious situations of the African people is an imperial-centered economics. It is so called because Africa is given little or no freedom to take into its own hands the destiny necessary for its economic prosperity. Even under conditions where it appears to have economic policies, most of such policies are directed towards the prosperity of other continents, particularly Europe and America. Against this backdrop, the black race is miseducated into believing that the workability of an African-centred economics using the African banking system is not feasible, even when such a system has proven to produce the best possible outcomes when properly utilized (Wilson, 2014, 738-740; Wilson, 2014, 45-51). In the 21st century, where emphasis is placed on ideological wars, every nation and continent is in a quick rush to out-compete the other. It would therefore be against Africa’s best interests to wait patiently for other continents to decide its fate instead of taking revolutionary steps.

In the era of colonization, peasant farmers did their jobs under the supervision of the colonial masters. Even on the peasant farmers’ own soil, the land on which they labored to sustain themselves and their families, the colonial supervisors decided how much the peasant farmers earned and determined the quantity of food that accrued to them on accomplishment of their daily tasks. Even in time of harvest, the colonial masters determined the price of the goods from the owners of the land (Andreski 26). This practice was only a foretaste of what was to come several years after the colonial masters left Africa. This was a process of the satellisation of Africa, where every economic activity on African soil is directed towards the needs of Europe (Oguejiofor 39). This was not unconnected with the doctrine of the divine right of the white people to exploit the black race. Hence, even a King of France named Francis sought a clause in “Adam’s Will” that prevented him from claiming his own share of the wealth in Africa (Du Bois, 1999, 27; Clarke 29).

Because it is these very developed nations that control the economic realities in the world, international market forces have consistently produced the wrong outcomes for poor nations. Problems begin largely when developed nations cajole developing nations to sign up to rules that make sense for, and are meant for the prosperity of, the already developed ones. One evident example that spans history is the patenting of products and natural resources that emanate from African soil. King Henry VI of England is alleged to have given a letter patent to a Belgian in 1449 for a twenty-year monopoly in the production of stained glass (Bolton, 2008, 208). Among the natural resources since patented are a hardy grain from Ethiopia called teff, which is a basic ingredient for the diet of the whole nation; an extract of the Aloe ferox plant from Lesotho, which helps to lighten the skin; an enzyme from Lake Nakuru in Kenya; and brazzein, a protein that is said to be 500 times sweeter than sugar and is obtained from a plant in Gabon. The patenting of the enzyme acquired from Kenya remained controversial as of 2017, however, because the Kenyan government denied having granted permission for research in Lake Nakuru, since it did not derive any benefits from this research (Bolton 209). Another evident example is the problem of patenting HIV/AIDS drugs under the guise of the Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS Agreement) (Mugyenyi 2008). Even lending agencies such as the International Monetary Fund and the World Bank are imperial constructions, established partly to exert control over developing nations within Africa. Most of the staff of these banks and lending institutions reside in Europe and America and dictate economic policies for the African continent. They fly into African capitals just occasionally on mission, after which they return to Europe and America and pronounce economic solutions for the countries in Africa in a one-size-fits-all approach. Is it possible to understand the intricacies of a country’s, or a continent’s, economy in this way? Instead the countries are left with long-range prescriptions that have little or no bearing on the practical realities of the African people: “It’s all but akin to a doctor trying to carry out a major medical checkup by phone” (Bolton 137). Africa is always carried away by the charade of grants and loans that emerge from these institutions but are at the same time tied to conditions that are beneficial primarily to the imperialists.

“These “grants” which amount to hardly two percent of the West’s Gross National Product and inflated in their importance and size are made to appear to be generous and voluntarily given from the heart. Western aid is propagandistically made to appear to be designed to support the development and growth of the social and economic infrastructure of recipient Afrikan nations” (Wilson, 2014, 381).

They are therefore lured into adjustment programmes that lead to currency devaluation, high interest rates, privatization of state enterprise, liberalization of imports, and so on (Wilson 368-369). Eventually, the monies that go out of the African continent annually far surpass the grants and loans that come into the continent (Wilson 373). What all this then portends for the African continent is the determination of the prices of its exported goods by Europe and America, with these economies also fixing the prices of their imports into the African continent. And because the profits that stem from these international transactions come from the processing/manufacturing stage, the imperialists control that aspect of production too. The poor nations are left unprotected as they rely on one or two exports for their economy, which the developed nations can refuse to buy in order to force prices down. It is not enough for these developing economies to produce; they must also have markets where the goods can be sold, so that the imperialists, who primarily control the forces of demand and supply, do not have the full power to decide to close a developing economy’s markets.

The lack of vision and creativity on the part of African leaders, coupled with the fact that their bellies have become their gods, makes it impossible to take the initiative to open local markets and engage in transnational trade among fellow African nations. The practical result of all these nonresident tribulations is the West trying to eat its cake and have it at the same time, without in the first place giving the chefs enough ingredients to bake it (Bolton 133).

All these set the foundation for various economic Structural Adjustment Programmes that created room for a fixed economy instead of a free one. It seems, therefore, that the economic structure created by the West is unchangeable because of its inherent correctness. Moreover, it is a structure that is considered sufficient and necessary for any economy, notwithstanding the peculiarity of a people: “Consequently, all economic systems must submit to only one law; the law of “adjustment” to an infallible and immutable economic structure” (Ramose, 2002, 4). In this new law of economics, Social Darwinism is imported into the law of capitalism, and in this law, it is insignificant how the poor end up because only the fittest survive. Each nation that then labours under a structural adjustment programme is a perpetual borrower and a consequent debtor: “that debt is expected to be paid to the faceless bankers of the West by peasant farmers who have neither health facilities, schools for their children, nor adequate shelter over their heads” (Awoonor 2006, 264).

The population of sub-Saharan Africa is fast rising, and it is expected that the growth of the economy should match the rising population. Unfortunately, the staggering realities in Africa’s economy show the contrary. With the Gross National Product of the African continent south of the Sahara (this excludes South Africa) estimated to be about the same as that of Belgium (a country which had a population of about 10 million in 1991), it is startling to think about the possibility of the African continent having to take care of a population that is estimated to reach 1.6 billion by the year 2030, when it was only 600 million in 1991.

It is more appalling that Africa has refused to take responsibility for its own development. Constant reliance on attractive loans and grants from international agencies and institutions such as the IMF and World Bank makes African leaders even more unresponsive to the plight of their continent. Some African elites, as well as Western imperialists, have been beneficiaries of this corporate greed since the West deliberately makes deals with corrupt African elites, who will always fall for the profit they stand to gain from such businesses. The cumulative effect would be the West singing anthems of corruption and ineptitude on the part of African leaders, even when the whole scene is a plot from both parties; both Western imperialists and African leaders have been paddlers of the same boat that sinks the African continent. This has led to agitations from Afro-centrists who demand a move from the current state of equilibrium, requesting an exit date of these international institutions away from Africa, because coupled with the crisis of irresponsibility on the part of Africans, these institutions are mere watchdogs of the progress of the African continent. The unfortunate scenario that plays out for Africans is that they provide the capital that is required to finance their oppression.

No programme can truly be considered African if it is fashioned towards the advancement of non-African cultures.

b. Mis-education of Africans and the Falsification of History

i. Why is the Study of African History Necessary?

  • The psychology of individuals and groups could partly be formed from “historical and experiential amnesia,” that is, when an individual or group is compelled by various circumstances to repress important segments of their formative history.
  • “To manipulate history is to manipulate consciousness; to manipulate consciousness is to manipulate possibilities; and to manipulate possibilities is to manipulate power” (Wilson 2014, 2).
  • History brings liberation.
  • It creates a sense of self-identity. History creates identity and in creating identity, it also distorts identity. Part of the study of Egyptology is all about taking back Africa’s history that has been distorted, or rather stolen, by the European. Through historiography Africans can engage in the study of European history without knowing that these are mere projected myths of the reality of the African people. Hence, they can be reading history about the African people while at the same time giving credence to another race without knowing that such stories have African roots. If we don’t know ourselves, we become a puzzle unto ourselves and other persons become puzzles to us as well; therefore, we carry on a wrong identity everywhere we go.
  • Socio-Political Role of History: Sometimes we ask ourselves: “Why is the study of history necessary?” “What relevance does the study of the people and cultures of Africa have for them when it cannot be translated into economic prosperity that puts bread on their tables?” It is worth noting that there is a direct, or indirect, relationship between history and money, history and economics, history and power. The question they need to ask themselves in this regard is: “If there was no direct relation between history and power or economics, then why is it that the Europeans rewrote history?” This question calls for serious reflection.
  • History and Psychology: People who have faint knowledge of history are more susceptible to manipulation than those who are knowledgeable in history. When we don’t have interest in history, all we do is merely follow orders. We don’t need to just study computer science or mathematics; we must understand the psychology of the people who run this world. Since the study of European history is not at the centre of the high school educational curriculum, many Africans are persuaded into believing that the study of their own history is not worth it. Indeed, their ignorance of their history has a different consequence that is more injurious than the West’s ignorance of its own history. For example, a church building is a historical event, and so too is a bank, and this is an indication that even if individuals from the Caucasian race do not study their history, they still live with and see such historical edifices on a daily basis. History is, therefore, not only written in books, but also becomes an unwritten piece as it is lived daily. It is the past and the future all embedded in a mobile present.
  • Language and Power: If it is indeed true that a creator made the universe and gave humankind the power to name things, then by having such power, humans also had dominion. There is a relationship between naming and dominion, between naming and reality. When a people then relinquish to others the authority to name and define, they permit them to have dominion and control over their being; therefore, there is a close relationship between dominion and history, as history, when actualized, guarantees dominion.

Africans have been presented with a misconstrued history about themselves and their cultures and have been labeled several derogatory titles that have their basis in colonialism. The history of the African peoples and their cultures is not the history of European plunder of African soil. They have a rich history and culture that predate colonialism.

No nation can afford to treat with levity the education of its citizens, since the kind of education a people receives either makes or unmakes them. This is very important because a weak educational system translates into a frail economy in the future. A people’s economy is largely a reflection of the education bequeathed to them, and the quality of their economy cannot be better than the quality of their education.

Unfortunately, while the education received in other advanced nations prepares them to meet the demands of each age, the education bequeathed to Africans kept them in a static position. Such industrial education as was necessary to meet the challenges of every season, which was and is still being received by the blacks, was merely training to master skills already relegated in progressive societies (Woodson, 2010, 15). This was a deliberate attempt to perpetually relegate the blacks to the Stone Age, and even the study of history has been distorted to remove Africans from the scene of events. European historiography presents the African people as a race that was once upon a time not a people, and has since been grafted into the siblinghood of humanity through Western benevolence. It is partly this distorted knowledge they have about Africa that constantly and consistently depletes their identity. No one expects masters to reproduce their own history, while at the same time exalting slaves. History, as long as it is written and taught by the conqueror, will always be written and taught to the conqueror’s advantage. Thus, when people hear names of towns and historical events such as the University of Djenne (University of Sankhore in Timbuktu), the story of the scramble for Babatus, the ploy of king Necho II, which dates as far back as 600 B.C.E. (Babatus was later renamed Cameroons), and Goshen, which is the alleged birthplace of the Bible character Moses (which is actually in Egypt), what they simply do is project whiteness into them, without at the same time knowing that the renaming of cities and events in Africa was a preconceived ideology by the colonial masters since it aids in the distortion of African history. Therefore, the African people could clearly be reading history about themselves, attributing greatness to those who accomplished such historical projects, without at the same time knowing that reference is being made to their ancestors.

Looking through history and viewing some alleged treaties that Africans supposedly made with the Europeans, it is important to note that there were some forms of historical forgery made at a time when African kings had no knowledge of the English language, yet these African leaders made and signed treaties in English (Jochannan, 1991, 18). The fact that Africans along the Nile valley were already in their 13th dynastic period, even when the Biblical Abraham was born, goes to show to a large extent how they have been white-washed by the falsification of African history in order to promote and maintain Eurocentric dominance: “Colonialism brings us to a kind of history written by the conqueror for the conquered to read and enjoy. When the conquered looks around and finds that even God speaks from the heart of the conqueror, the conquered then becomes suspicious of God” (Jochannan 60).

Nobody speaks of the plunder of a virgin land, for humans only struggle for domination in a world where fellow humans live. So when Hegel referred to Africa as a continent that is unhistorical, without movement or development, it is a contradiction of the scramble for Africa that started as far back as 1675 B.C.E. (Jochannan 16) and is still being scrambled for in the present era. This same continent without a history is the same continent that transported a life of contemplative devotion into Europe; a people that had a conception of resurrection and immortality long before the Christianization of those ideas in Europe (Diop 32); and a continent that pioneered foundational patriotism in an organized form (Diop 19) with an organized political, economic, and even educational system. Even in music the African continent creates a lasting influence because one of the authentic forms of art in America is jazz, which could not have been feasible without the rhythmic structure of Africa and the drums. Even the tales and folk-lore of the people of America are not original to them (Awoonor 86; Du Bois, 1994, 7).

Therefore, there is a great need for a new historiography of the African race by the African people themselves, not a history that is based on the adventures of Europe but one that is instead premised on the life of the African and the very things that characterize the essence of their being.

c. Philosophy and Western Historiography

Philosophy is literally the love of wisdom. The nature of philosophical discourse makes it possible for philosophy to thrive on criticality. Until recently, the study of philosophy in many of Africa’s colleges has been the study of the history of western thoughts. Thus, those responsible for the teaching of such disciplines in universities are always in a hurry to limit their findings to the beginning of philosophy as bequeathed to them by the Europeans, all in the bid to justify a failed deduction whose conclusion having been made, must necessarily be supported by every premise, not minding the wrongness of such foundation. Since the West therefore began its study of history with an epigram that Africa is without history, progress, and development, every study therefore must tend towards proving the said statement. It had therefore been wrongly pre-established from the onset that it was impossible for the task of philosophizing to have been done in Africa before the advent of the West. And because sometimes people misconstrue history to be the study of the past, the past for an African always begins with its plunder by Europe. Their notion of slavery takes their mind back to Africa and to the black race in general, without averting their minds to the fact that some other races outside the African race were also colonized. The history of Africa must and can only be rewritten by African scholars, as the history of a people that predates the existence of the white man on African soil. But this is possible only within a reinvented mind.

It is worth noting that students were being instructed by the Egyptian priests before the invasion of Egypt by Alexander the Great of Greece. It was within this period that the libraries in Egypt were converted into research centres by the school of Aristotle. The philosophers and scientists of this period, apart from the studies they received directly in Egypt, were also beneficiaries of the invasion as citizens of Greece. It is therefore not surprising that these philosophers were always victims of the Athenian government that always persecuted them because of ideologies that were alien to the people of Athens and were then considered unacceptable. How did they then arrive at a conclusion that ancient Greek philosophers, who were vilified for importation of thoughts that were alien to the Greek world, were actually the inventors of those ideas? The history of the life and thought of many of these prominent ancient Greek philosophers is filled with contradictions that can only be solved when we resort to faith as the arbiter of everything. Aristotle, for instance, who studied under the tutelage of Plato, became a great scientist while his tutor was a known philosopher. The fact that a philosopher could turn out a graduate in the sciences proper can only be resolved mysteriously. That Plato could keep Aristotle for a period of twenty years, tutoring him of that, which he is ignorant of himself, becomes questionable. It is therefore noteworthy that although Aristotle studied in Egypt, he made a library for himself from books that were carried from Egypt by Alexander. Plato who then studied philosophy and learnt the ten virtues in Egypt, from which he developed his own four cardinal virtues, could not have been the tutor of Aristotle (James, 2015, 1-3). This fiction continues with the fact that Theophrastus and Eudemus had studied physics, geometry, astronomy, arithmetic and theology all under Aristotle. 
The idea that different persons could at the same time specialize in different disciplines, some of which are distantly related, under a single tutor, is only acceptable in societies where myth is equated with reality. The fact that even the birth date of Thales is in contest bears semblance with modern day history in some societies where illiteracy makes it impossible for people to give a precise date of birth of their kindred, yet we are made to believe that this struggle among the Greeks is common with literate people. Even the positing of water as the foundation of everything that exists, or an indeterminate substance, including the theory surrounding fire, air, and so on as the primary stuff underlying everything, had long been held as a debate in Egypt; it is needful also to mention the famous Socratic dictum: “man know thyself,” which is actually an Egyptian dictum that has long been Europeanized (Obenga 2004). Since the foundation they build on is held tightly with litany of lies, it may therefore become impossible for their house to stand. People should not also forget that the history of the Greeks, during the time which the Greek philosophers were engaged in their so-called philosophical thoughts, was filled with wars and counter wars, which anyone would be quick to tell that it would be a misnomer to conceive of great intellectual strides in war-ridden areas of Iraq, Syria, Nigeria, and so on. However, it was normal for the Greeks of the ancient period under such violent and unstable conditions to be engaged in serious academic discourse. Nevertheless, like the origin of humankind that is written in favour of the white race: “this unfortunate position of the African continent and its peoples appears to be the result of misrepresentation upon which the structure of race prejudice has been built…that the African is backward, that its people are backward, and that their civilization is also backward” (James 5). 
The Copernican revolution used to characterize the achievement of Immanuel Kant in philosophy is another case in focus. While the name is derived from the heliocentric movement of the earth around the sun, as proposed by Nicolaus Copernicus in the 16th century, the ancestors of the Africans in Egypt had long held this heliocentric discovery in Egypt before Christ was born (B.C.) (Jackson, 1985, 24-25).

d. Culture and Identity Crisis

The very notion of the miseducation of the African people takes people to the question of identity crisis. One of the features of identity is the fact that it bears a relational character with that very concept, which a thing is identified with. To speak of an African identity is to have a character that is truly African and such character is not individuated, as it is shared among the same species. When a people have no identity that is uniquely theirs, they crave for any form of identity and because such identity comes from without and not within, it denigrates a people to a second order position. Because Africans carry an identity that is alien to them, there is an inbuilt inferiority complex in them. Hence, it does not matter whether they think, or feel that they are inferior or not, the inferiority is created by the existence of such institution that denies a people an identity that they can rightly call theirs. But this deliberate inferiority complex that is institutionalized has an aim, for it leads to dangerous brainwashing that further leads to self-erosion, shame, and self-alienation. Its negative impact is very broad since it erodes the relational character which identity seeks to create (Oguejiofor, 2007, 69). The war on race therefore creates a false biological determinism that makes people think that nature has endowed some races better than others and made them superior to some others. In terms of achievement, a Negro has no identity of his or her own, but they see themselves through the lens of another.

The colonial masters knew this very well; they had to destroy the foundation that made this relational character possible. For instance, the slaves in the United States of America had to be disconnected from family ties to make revolution impossible, thus destroying their loyalty system. Awoonor (2006) gives a vivid account of this identity crisis as it was and is still exemplified in the lives of the black individual. It all began with the history that was taught to the black individual and the various myths about the African race. The oppressed were taught that Africa was a land of barbarous instincts and primitive ties, where the consumption of human flesh featured prominently. The African therefore came to learn that civilization, modernization, and exposure are defined in terms of a total disconnect from African culture. This then went a long way to create an internal phobia for blackness, especially in the African-American and by such attitude, the African-Americans had phobia even for themselves. They were therefore programmed to be ashamed of their ancestry and should therefore count themselves lucky specimens that were dragged out of their homeland in chains. Being a lucky specimen, the best that could happen to the African individual is that he or she was merely worthy of observation. But the danger inherent in this type of specimen is that it is treacherous to that which is under observation, not minding whether the outcome is either positive or negative since it is still an instrument to be used, even if it is of a meritorious type. And because the debate on who is and who is not rational is the foundation of all racism, the colonial masters were in a hurry to justify the thesis of Hegel, that Africans, being a people without history but paradoxically have a history of darkness, it was illogical to conceive of them as a people with reason. 
They could therefore be considered fit to be slaves, but they must be slaves of a different kind because of their incapacity to reason, slaves of a subhuman kind that could have neither will nor freedom. Any attempt to bestow on them any knowledge that only humans were capable of grasping would be a contradiction of their irrationality.

What is most pathetic is that slavery has taken another form that has bestowed on the Africans the position of reformed serfdom. In colonial times, because most slaves never saw their masters do any field work, they came to equate work with slavery, while the ability to live in affluence and not work at the same time was equated with lordship. Many people of the black race therefore struggle to acquire material resources, which for them are synonymous with freedom. This is a distortion that was learnt from the colonial masters. It is worth noting that the circumstance under which the masters acquired wealth was totally different from the thinking of the slaves. Having utilized their brains to exploit the resources on the African soil, the Europeans lived in a way that the oppressed saw as ostentatious. But since the oppressor was physically present in Africa only temporarily, such an exalted lifestyle was meant to shield them and their associates from the public in order to guarantee security. The oppressed, having been physically freed, must now struggle to live in such a gigantic way, since the presence of the colonial masters on the African continent suggests that to be free, a person must be wealthy, and wealth in this sense proceeds from the materiality of resources. Since the gorgeously dressed slaves who served the slave masters were the envy of fellow slaves, the black individual now equated elegance with being a master. Thus, even after the physical chains have been let loose off their arms and legs, they must still struggle to appear like the slave masters. Eventually, this desire to always look like the slave master reminds them of the slaves that they truly are in their current thinking.
While the blacks who can actually afford such exorbitant way of life end up as celebrities who must perform to the admiration of the masters and their households, what therefore was done under duress during colonial times due to the fear of being killed is now done willingly by those, who once oppressed, now take such acts of performance as a profession (Akbar 2014). The oppressed, therefore, moves around with the illusion that they have the mind of the master. It is a clear indication that they are far from freedom. To be ultimately free, they must understand that wealth proceeds first from the intangibility of resources, which can only be acquired from decolonizing the mind.

This was the clear-cut difference between the slaves that were prevalent in African society before colonialism and those introduced by the oppressors. In pre-colonial black Africa, slaves were in hierarchy; however, even in their hierarchical status, they were slaves of a human and not a sub-human kind. Masters, therefore, never exploited their slaves should the occasion arise since they were of different social status; while slaves exploited slaves, those of a higher social status exploited each other (Diop, 1987, 6, 10-11). This is not to downplay the role of the African in this whole predicament because when greed sets in and African leaders began to sell fellow blacks to Arabs, it opened a new page in the undermining of the identity of the black race. African leaders became more enthusiastic in the lucrativeness of such trade. This template was to be followed by the Europeans when they seized the task of enslavement from the Arabs (Williams 55). In all of this, there is still an inherent irrationality in the psyche of the conqueror, which is the fact that there was a victor. Victory in this sense arises from battle or struggle and a people must be in contest over an event or issue with certain rivals. No people need to claim victory over another as long as the preys are not armed in any sense to fight the predators; what the predators ought to simply do is to invade and not merely conquer. Victory for the conqueror should only be a natural result of the irrationality of the prey. However, this was not taken into consideration in the denial of rationality thesis to the African people as argued by the Europeans. But even the scramble for Africa in the nineteenth century was not without its own resistance. A people who have been brutally handicapped by the stroke of the pen should therefore be allowed to carve their own destiny and not be compelled to engage in a race with the rest of the developed world.

e. Religion

The whole concept of religion is not a coinage of the supernatural, but a formulation of humankind to bring the human person closer to the consciousness of the supersensible. The Christian religion becomes a victim in this research because large parts of Africa were colonized by nations that opined that salvation comes only through the Christian route. Thus, one of the criteria for slavery was the fact that a people or a nation was unchristian. Religious authorities operating under the form of religious tyranny gave permission to every nation in Europe to reduce to servitude every person in Africa who never accepted the Christian religion. It is an indicator that Africans had been a spiritual people before slavery, since one of the primary reasons for Africa’s enslavement was that the people were not Christians. Not minding the form of religion being practiced by a people in Africa, its basis of reducing an entire race to the level of animals was the fact that they believed in the Supreme Being in a way that the Europeans did not believe. The justification for the enslavement of the African people was further given a biblical foundation. Was it therefore not a preconceived coincidence that the Bible, which was used as a tool for colonialism, was given to the black individual with various verses that endorsed slavery as a divine order? It would be incongruous, and a deliberate denial that has its foundation in irrationality, to conclude that the idea of religion was unknown to the African people before the emergence of the white individual. History, however, has revealed that there is virtually nothing that is found in Christianity that is new to the human person, except for the untrained mind. Since the African people had a vivid knowledge of the supersensible before colonialism, there must be reason(s) for the establishment of an organized religion as imported into various colonies.

When we go through the various mythologies surrounding the concept of death and the afterworld of the Bantu people of Africa, the Masarwas, the Ashanti, the Nandi and Wabende peoples of East Africa, and so on, we would understand that pre-colonial Africa did not only have organized religions, but it also had an elated form of spirituality. Thus, when Lightfoot and Ussher (Jackson 5) announced that the creation of the world dates back to 4004 B.C.E. in justification of the biblical foundations of the universe, it becomes clear that its main task was to justify a theocratic system that is based on European establishment. This is because a careful study of history shows that the 13th dynastic period of the African race predates the failed thesis of the existence of Adam and Eve. When even history records about twenty-five pre-Christian saviour-gods, all born of virgins (Jackson 38), the whole project of Christianity and its subsequent projection of the white race and the quick rush to vilify the black race is an indication that raciality and distortion are found in scripture. When even scripture is used by the oppressor to project whiteness, then it becomes evident that Christianity came in the first place, not primarily because there was a messiah to be advertised to the Africans that the conqueror wanted the oppressed to be beneficiaries, but because it fosters economic manipulation of the black race. Christianity, which therefore teaches that the end does not justify the means, uses its own end as a guide to all other means. This is seen in the use of the Babylonian baal to promote the Christian religion, while at the same time condemning such practices as pagan (Jackson 43). The worship of baal is therefore wrong in Christian theology, but the legend of the Babylonian Bel is right for propagating of Christian ideology. 
This use of double standards in Christian doctrine is the very standard that is used to condemn traditional religious practices in African society while at the same time using the myths of the same African religion to justify Christology. Black becomes demonic while white is used to represent everything that is honourable. It therefore leaves them amazed that while the image of God and his angelic hosts are Caucasians, the pictorial representation of the devil is black. Those who conceived this notion never averted their minds to the fact that even the devil was part of the angelic hosts before the supposed fall from grace. If everything evil has a black tag, did the devil who is the architect and bearer of evil turn black after the fall, or was he not Caucasian while he was among the angelic hosts? This Christological chauvinism has always been used to cage the African mind in a box, where slavery is justified in order to condemn everything that is African. It becomes laughable that almost everyone who encounters the divine in a vision always sees the heavenly hosts dressed in the racial colour of his or her oppressor. This gives them more reasons to justify godliness in their oppressor, while at the same time perceiving godlessness in their fellow black individuals. And since the person who controls your mind also controls your destiny, in this battle for ideological control of the civilization, the African civilization risks extinction.

Since extraction was the primary driving motive of the emergence of Christianity in Africa, one could therefore go into a church building in the guise that minds are being renewed. Renewal becomes the imperial masters’ coating of re-colonization. Where are Africans in all of these? They end up being nothing better than emancipated slaves. By this emancipation, slavery was transformed but not eliminated. An emancipated slave is one who is given the privilege of carrying the Lockean secondary quality of emancipation, while at the same time neglecting the fact that being a slave is the primary motive that makes a slave who he or she is; he or she is therefore nothing better than an exalted servant. Emancipation then becomes a profound adjective used to glorify a treacherous noun (slave). It is therefore evident from the above that Christianity and other various forms of worship, apart from being kinds of religions, are also ideologies, and like every ideology, one of its main features is to make it sellable to the masses. Real emancipation of the African would not be possible until every trace of Caucasian association with divinity is unveiled. This does not also require its substitution with black pictorial images. They must understand that ultimate liberation recognizes the supremacy of God transcending both Caucasian and black flesh, whom both races aspire to resemble in perfection and not permanently locked in a material pictorial form created by naïve minds (Akbar 68).

3. Conclusion

Freedom is not freely given, for it is always demanded with some form of revolution. In this predicament that faces the African people, the task of regaining humanity does not just lie with those whose humanity has been stolen, but also with the very persons who stole them. Dehumanization makes some persons beasts while regarding others as sub-humans. So both the beasts and the sub-humans are under a form of distortion, and they must be transformed back to a human state. This is very vital to the idea of wholesome decolonization because both the oppressed and the oppressor are afraid of freedom, for while the former fears being free, the latter fears losing the freedom to oppress (Freire, 2005, 46). Although Africans recognize the ills brought upon their colonies by their masters, the future of the African continent largely depends on the possibility of getting Africans who will drive a community-based ideology as against the egocentric lifestyle that led to its downfall in the first place and still puts the continent below other world powers. Lamentations must therefore be shifted from the West and centred on the role Africans have played over the years in the financing of its downfall. There is need for a rigorous reconstruction of the landscape of the African mind. This cannot just be done by mere bodily repatriation of Africans abroad but by the repatriation of the African mind.

4. References and Further Readings

  • Akbar, Na’im. Breaking the Chains of Psychological Slavery. Tallahassee: Mind Productions Associates Inc. 1996.
    • It gives an account of mental decolonization of the black race.
  • Andreski, Stanislav. The African Predicament. London: Michael Joseph Ltd; 1968.
    • It is a summary of the plights of the African race from numerous perspectives.
  • Appiah, Kwame. In my Father’s House. Africa in the Philosophy of Culture. Oxford: Oxford University Press, 1992.
    • A thought provoking discourse centred on race, culture and identity of the African.
  • Awoonor, Kofi. The African Predicament. Legon: Sub-Saharan Publishers, 2006.
    • Utilizing practical experiences in providing a detailed account of the plights of the African at home and abroad.
  • Bernal, Martin. Black Athena. The Afroasiatic Roots of Classical Civilization. Vol. 1. New Jersey: Rutgers University Press, 1987.
    • A detailed account of the fabrications and reordering of history.
  • Clarke, John Henrik. Christopher Columbus and the Afrikan Holocaust. New York: A&B Books Publishers, 1992.
    • A brief summary of the glorious and inglorious past of Africa.
  • Diop, Cheikh Anta. Precolonial Black Africa. New York: Lawrence Hill Books, 1987.
    • A summary of the life and practice of the African people prior to their contact with Europe.
  • Du Bois, W.E.B. The Souls of Black Folk. New York: Dover Publications Inc; 1994.
    • Largely borne out of personal experience, the book is a reflection on the plight of the black race as evident primarily in the lives of the African American.
  • Du Bois, W.E.B. Darkwater: Voices from Within the Veil. New York: Dover Publications Inc; 1999.
    • An autobiographical essay largely centred around black race predicament and consciousness.
  • Du Bois, W.E.B. The Education of Black People. New York: Monthly Review Press, 2001.
    • A critical study on how black people can acquire power from education.
  • Foreman, Christopher ed. The African American Predicament. Washington D.C: Brookings Institution Press, 1999.
    • This work gives an insight into the plights of the black American.
  • Freire, Paulo. Pedagogy of the Oppressed. New York: Continuum International Publishing Group Inc; 2005.
    • It is a detailed experiential account of oppression and how the oppressed could be truly free indeed.
  • Harrison, Hubert. When Africa Awakes. Baltimore: Black Classic Press, 1997.
    • Centred around the liberation of black people, it is an essay on race consciousness.
  • Imhotep, David. The First Americans were Africans. Bloomington: AuthorHouse, 2012.
    • It gives a justification of the occupation of the land of America by Africans before any other race.
  • Jackson, John. Christianity Before Christ. Texas: American Atheist Press, 1985.
    • In interrogating the existence of the Christian religion and how it emanates from different cultural/religious practices, it claims that nothing that Christianity projects is new.
  • James, George. Stolen Legacy. The Egyptian Origins of Western Philosophy. San Bernardino, CA: A Traffic Output Publication, 2015.
    • It is a summary of the existence of Greek philosophy from ancient Egypt and ways to decolonize the continent.
  • Jochannan, Yosef ben & Clarke John Henrik. From the Nile Valley to the New World. Science, Invention & Technology: New Dimensions in African History. New Jersey: Africa World Press Inc; 1991.
    • An historical account of the achievement of Africans in the field of science and technology.
  • Mugyenyi, Peter. Genocide by Denial. Kampala: Fountain Publishers, 2008.
    • A detailed account of how patenting of drugs has been used to exploit the African continent.
  • Obenga, Theophile. African Philosophy. The Pharaonic Period: 2780-330 BC. Per Ankh, 2004.
    • Highlighting some of the philosophical thought of ancient Egypt, it traces the origin of philosophy and presents Africa as the foundation of philosophy.
  • Oguejiofor, J. Obi. Philosophy and the African Predicament. Ibadan: Hope Publications Ltd; 2001.
    • On the use of philosophy in the liberation of the African continent.
  • Ramose, Mogobe. African Philosophy Through Ubuntu. Harare: Mond Books, 2002.
    • Using the Bantu people of Africa, the book gives an account of what African philosophy entails.
  • Sertima, Ivan Van. They Came Before Columbus. The African Presence in Ancient America. New York: Random House Trade Paperbacks, 2003.
    • Using archeological findings, it tells the history of the presence of the African in America prior to any other race.
  • Williams, Chancellor. The Destruction of Black Civilization: Great Issues of Race from 4500 B.C. to 2000 A.D. Chicago: Third World Press, 1987.
    • A thorough history of the black race, with a detailed account of the problems and prospects of the continent.
  • Wilson, Amos. Blueprint for Black Power: A Moral, Political and Economic Imperative for the Twenty-First Century. New York: Afrikan World InfoSystems, 2014.
    • An extensive book on black oppression by white supremacy and the wholesome liberation of the black race.
  • Wilson, Amos. The Falsification of Afrikan Consciousness. Eurocentric History, Psychiatry and the Politics of White Supremacy. New York: Afrikan World InfoSystems, 2014.
    • Journeying into the mind of the African, it suggests ways to liberate the African from the bondage of Eurocentrism.
  • Woodson, Carter. The Mis-Education of the Negro. New York: Seven Treasures Publications, 2010.
    • It is a concise essay on the liberation of the black man from mental and physical slavery.

 

Author Information

Isaiah Aduojo Negedu
Email: negedu.isaiah@fulafia.edu.ng
Federal University Lafia
Nigeria

Thomas Aquinas (1224/6—1274)

St. Thomas Aquinas was a Dominican priest and Scriptural theologian. He took seriously the medieval maxim that “grace perfects and builds on nature; it does not set it aside or destroy it.” Therefore, insofar as Thomas thought about philosophy as the discipline that investigates what we can know naturally about God and human beings, he thought that good Scriptural theology, since it treats those same topics, presupposes good philosophical analysis and argumentation. Although Thomas authored some works of pure philosophy, most of his philosophizing is found in the context of his doing Scriptural theology. Indeed, one finds Thomas engaging in the work of philosophy even in his Biblical commentaries and sermons.

Within his large body of work, Thomas treats most of the major sub-disciplines of philosophy, including logic, philosophy of nature, metaphysics, epistemology, philosophical psychology, philosophy of mind, philosophical theology, the philosophy of language, ethics, and political philosophy. As far as his philosophy is concerned, Thomas is perhaps most famous for his so-called five ways of attempting to demonstrate the existence of God. These five short arguments constitute only an introduction to a rigorous project in natural theology—theology that is properly philosophical and so does not make use of appeals to religious authority—that runs through thousands of tightly argued pages. Thomas also offers one of the earliest systematic discussions of the nature and kinds of law, including a famous treatment of natural law. Despite his interest in law, Thomas’ writings on ethical theory are actually virtue-centered and include extended discussions of the relevance of happiness, pleasure, the passions, habit, and the faculty of will for the moral life, as well as detailed treatments of each one of the theological, intellectual, and cardinal virtues. Arguably, Thomas’ most influential contribution to theology and philosophy, however, is his model for the correct relationship between these two disciplines, a model which has it that neither theology nor philosophy is reduced one to the other, where each of these two disciplines is allowed its own proper scope, and each discipline is allowed to perfect the other, if not in content, then at least by inspiring those who practice that discipline to reach ever new intellectual heights.

In his lifetime, Thomas’ expert opinion on theological and philosophical topics was sought by many, including at different times a king, a pope, and a countess. It is fair to say that, as a theologian, Thomas is one of the most important in the history of Western civilization, given the extent of his influence on the development of Roman Catholic theology since the 14th century. However, it also seems right to say—if only from the sheer influence of his work on countless philosophers and intellectuals in every century since the 13th, as well as on persons in countries as culturally diverse as Argentina, Canada, England, France, Germany, India, Italy, Japan, Poland, Spain, and the United States—that, globally, Thomas is one of the 10 most influential philosophers in the Western philosophical tradition.

Table of Contents

  1. Life and Works
    1. Life
    2. Works
  2. Faith and Reason
  3. Philosophy of Language: Analogy
  4. Epistemology
    1. The Nature of Knowledge and Science
    2. The Extension of Science
    3. The Four Causes
      1. The Efficient Cause
      2. The Material Cause
      3. The Formal Cause
      4. The Final Cause
    4. The Sources of Knowledge: Thomas’ Philosophical Psychology
  5. Metaphysics
    1. On Metaphysics as a Science
    2. On What There Is: Metaphysics as the Science of Being qua Being
  6. Natural Theology
    1. Some Methodological Considerations
    2. The Way of Causation: On Demonstrating the Existence of God
    3. The Way of Negation: What God is Not
      1. God is Not Composed of Parts
      2. God is Not Changeable
      3. God is Not in Time
    4. The Way of Excellence: Naming God in and of Himself
  7. Philosophical Anthropology: The Nature of Human Beings
  8. Ethics
    1. The End or Goal of Human Life: Happiness
    2. Morally Virtuous Action as the Way to Happiness
      1. Morally Virtuous Action as Pleasurable
      2. Morally Virtuous Action as Perfectly Voluntary and the Result of Deliberate Choice
      3. Morally Virtuous Action as Morally Good Action
      4. Morally Virtuous Action as Arising from Moral Virtue
    3. Human Virtues as Perfections of Characteristically Human Powers
      1. Infused Virtues
      2. Human Virtues
    4. The Logical Relations between the Human Virtues
    5. Moral Knowledge
    6. The Proximate and Ultimate Standards of Moral Truth
  9. Political Philosophy
    1. Law
      1. The Nature of Law
      2. The Different Kinds of Law
        1. The Eternal Law
        2. The Natural Law
        3. The Divine Law
        4. Human Law and its Relation to Natural Law
    2. Authority: Thomas’ Anti-Anarchism
    3. The Best Form of Government
  10. References and Further Reading
    1. Thomas’ Works
    2. Secondary Sources and Works Cited
    3. Bibliographies and Biographies

1. Life and Works

a. Life

St. Thomas Aquinas was born sometime between 1224 and 1226 in Roccasecca, Italy, near Naples. Thomas’ family was fairly well-to-do, owning a castle that had been in the Aquino family for over a century. One of nine children, Thomas was the youngest of four boys, and, given the customs of the time, his parents considered him destined for a religious vocation.

In his early years, from approximately 5 to 15 years of age, Thomas lived and served at the nearby Benedictine abbey of Monte Cassino, founded by St. Benedict of Nursia himself in the 6th century. It is here that Thomas received his early education. Thomas’ parents probably had great political plans for him, envisioning that one day he would become abbot of Monte Cassino, a position that, at the time, would have brought even greater political power to the Aquino family.

Thomas began his theological studies at the University of Naples in the fall of 1239. In the 13th century, training in theology at the medieval university started with additional study of the seven liberal arts, namely, the three subjects of the trivium (grammar, logic, and rhetoric) and the four subjects of the quadrivium (arithmetic, geometry, music, and astronomy), as well as study in philosophy. As part of his philosophical studies at Naples, Thomas was reading in translation the newly discovered writings of Aristotle, perhaps introduced to him by Peter of Ireland. Although Aristotle’s Categories and On Interpretation (with Porphyry’s Isagoge, known as the ‘old logic’) constituted a part of early medieval education, and the remaining works in Aristotle’s Organon, namely, Prior Analytics, Posterior Analytics, Topics, and Sophismata (together known as the ‘new logic’) were known in Europe as early as the middle of the 12th century, most of Aristotle’s corpus had been lost to the Latin West for nearly a millennium. By contrast, Arab philosophers such as Ibn Sina or Avicenna (c. 980-1037) and Ibn Rushd or Averroes (1126-1198) not only had access to works such as Aristotle’s De Anima, Nicomachean Ethics, Physics, and Metaphysics, they also produced sophisticated commentaries on those works. The Latin West’s increased contact with the Arabic world in the 12th and 13th centuries led to the gradual introduction of these lost Aristotelian works, as well as the writings of the Arabic commentators mentioned above, into medieval European universities such as Naples. Philosophers such as Peter of Ireland had not seen anything like these Aristotelian works before; they were capacious and methodical but never strayed far from common sense. However, there was controversy too, since Aristotle seemed to teach things that contradicted the Christian faith, most notably that God was not provident over human affairs, that the universe had always existed, and that the human soul was mortal.
Thomas would later try to show that such theses either represented misinterpretations of Aristotle’s works or else were founded on probabilistic rather than demonstrative arguments and so could be rejected in light of the surer teaching of the Catholic faith.

It was in the midst of his university studies at Naples that Thomas was stirred to join a new (and not altogether uncontroversial) religious order known as the Order of Preachers or the Dominicans, after their founder, St. Dominic de Guzman (c. 1170-1221), an order which placed an emphasis on preaching and teaching. Although Thomas received the Dominican habit in April of 1244, Thomas’ parents were none too pleased with his decision to join this new evangelical movement. In order to talk some sense into him, Thomas’ mother sent his brothers to bring him to the family castle sometime in late 1244 or early 1245. Back at the family compound, Thomas continued in his resolve to remain with the Dominicans. Having resisted his family’s wishes, he was placed under house arrest. A famous story has it that one day his family members sent a prostitute up to the room where Thomas was being held prisoner. Apparently, they were thinking that Thomas would, like any typical young man, satisfy the desires of his flesh and thereby “come back down to earth” and see to his familial duties. Instead, Thomas supposedly chased the prostitute out of the room with a hot poker, and as the door slammed shut behind her, traced a black cross on the door. Eventually, Thomas’ mother relented and he returned to the Dominicans in the fall of 1245. Despite these family troubles, Thomas remained dedicated to his family for the rest of his life, sometimes staying in family castles during his many travels and even acting late in his life as executor of his brother-in-law’s will.

Recognizing his talent early on, the Dominican authorities sent Thomas to study with St. Albert the Great at the University of Paris for three years, from 1245 to 1248. Thomas made such an impression on Albert that, when Albert was transferred to the University of Cologne, he took Thomas along with him as his personal assistant.

From 1252 to 1256, Thomas was back at the University of Paris, teaching as a Bachelor of the Sentences. We might think of Thomas’ position at Paris at this time as roughly equivalent to an advanced graduate student teaching a class of his or her own. In addition to his teaching duties, Thomas was also required, in accord with university standards of the time, to work on a commentary on Peter Lombard’s Sentences. We might think of Thomas’ commentary on the Sentences as roughly equivalent to his doctoral dissertation in theology.

At 32 years of age (1256), Thomas was teaching at the University of Paris as a Master of Theology, the medieval equivalent of a university professorship. After Thomas had taught at Paris for three years, the Dominicans moved him back to Italy, where he taught in Naples (1259-1261), Orvieto (1261-1265), and Rome (1265-1268). It was during this period, perhaps in Rome, that Thomas began work on his magisterial Summa theologiae.

Thomas was ordered by his superiors to return to the University of Paris in 1268, perhaps to defend the mendicant way of life of the Dominicans and their presence at the university. (Like the Franciscans, the Dominicans depended upon the charity of others in order to continue their work and survive. This sometimes meant they had to beg for their food. In doing so, the members of the mendicant orders consciously saw themselves as living after the pattern of Jesus Christ, who, as the Gospels depict, also depended upon the charity of others for things to eat and places to rest during his public ministry.) Thomas ended up teaching at the University of Paris again as a regent Master from 1268 to 1272. While he was at the University of Paris, Thomas also famously disputed with philosophers who contended on Aristotelian grounds (wrongly, in Thomas’ view) that all human beings shared one intellect, a doctrine that Thomas argued was incompatible with personal immortality and moral responsibility, not to mention our experience of ourselves as individual knowers.

In 1272, the Dominicans moved Thomas back to Naples, where he taught for a year. In the middle of composing his treatise on the sacraments for the Summa theologiae around December of 1273, Thomas had a particularly powerful religious experience. After the experience, despite constant urging from his confessor and assistant Reginald of Piperno, Thomas refused any longer to write. Called to be a theological consultant at the Second Council of Lyon, Thomas died in Fossanova, Italy, on March 7, 1274, while making his way to the council.

Canonized in 1323, Thomas was later proclaimed a Doctor of the Church by Pope St. Pius V in 1567. In 1879, Pope Leo XIII published the encyclical Aeterni Patris, which, among other things, holds up Thomas as the supreme model of the Christian philosopher. Through his voluminous, insightful, and tightly argued writings, Thomas continues to this day to attract numerous intellectual disciples, not only among Catholics, but among Protestants and non-Christians as well.

b. Works

Thomas is famous for being extremely productive as an author in his relatively short life. For example, he authored four encyclopedic theological works, commented on all of the major works of Aristotle, authored commentaries on all of St. Paul’s letters in the New Testament, and put together a verse by verse collection of exegetical comments by the Church Fathers on all four Gospels called the Catena aurea. Such examples constitute only the beginning of a comprehensive list of Thomas’ works. His literary output is as diverse as it is large. Thomas’ body of work can be usefully split up into nine different literary genera: (1) theological syntheses, for example, Summa theologiae and Summa contra gentiles; (2) commentaries on important philosophical works, for example, Commentary on Aristotle’s Nicomachean Ethics and Commentary on Pseudo-Dionysius’ De divinis nominibus; (3) Biblical commentaries, for example, Literal Commentary on Job and Commentary and Lectures on the Epistles of Paul the Apostle; (4) disputed questions, for example, On Evil and On Truth; (5) works of religious devotion, for example, the Liturgy of Corpus Christi and the hymn Adoro te devote; (6) academic sermons, for example, Beata gens, sermon for All Saints; (7) short philosophical treatises, for example, On Being and Essence and On the Principles of Nature; (8) polemical works, for example, On the Eternity of the World against Murmurers, and (9) letters in answer to requests for an expert opinion, for example, On Kingship. For present purposes, this article focuses on the first four of these literary genera. This should be enough to demonstrate the capaciousness of Thomas’ thought.

Thomas’ most famous works are his so-called theological syntheses. Thomas composed four of these during his lifetime: his commentary on Peter Lombard’s Sentences, Summa contra gentiles, Compendium theologiae, and Summa theologiae. Although each of these works was composed for different reasons, they are nonetheless similar insofar as each of them attempts to communicate clearly and defend the substance of the Catholic faith in a manner that can be understood by someone who has the requisite education, that is, training in the liberal arts and Aristotle’s philosophy of science. Although Thomas aims at both clarity and brevity in these works, because he also aims to speak about all the issues integral to the teaching of the Catholic faith, the works are quite long (for example, Summa theologiae, although unfinished, numbers 2,592 pages in the English translation of the Fathers of the English Dominican Province).

Thomas’ Summa contra gentiles (SCG), his second great theological synthesis, is split up into four books: book I treats God; book II treats creatures; book III treats divine providence; book IV treats matters pertaining to salvation. Whereas the last book treats subjects the truth of which cannot be demonstrated philosophically, the first three books are intended by Thomas as what we might call works of natural theology, that is, theology that from first to last does not defend its conclusions by citing religious authorities but rather contains only arguments that begin from premises that are or can be made evident to human reason apart from divine revelation and end by drawing logically valid conclusions from such premises. SCG is thus Thomas’ longest and most ambitious attempt at doing what he is probably most famous for—arguing philosophically for various theses concerning the existence of God, the nature of God, and the nature of creatures insofar as they are creatures of God. Although Thomas cites Scripture in these first three books in SCG, such citations always come on the heels of Thomas’ attempt to establish a point philosophically. In citing Scripture in the SCG, Thomas thus aims to demonstrate that faith and reason are not in conflict, that those conclusions reached by way of philosophy coincide with the teachings of Scripture.

Summa theologiae (ST) is Thomas’ most well-known work, and rightly so, for it displays all of Thomas’ intellectual virtues: the integration of a strong faith with great learning; acute organization of thought; judicious use of a wide range of sources, including pagan and other non-Christian sources; an awareness of the complexity of language; linguistic economy; and rigorous argumentation. However, ST is not a piece of scholarship as we often think of scholarship in the early 21st century, that is, a professor showing forth everything that she knows about a subject. Rather, it is the work of a gifted teacher, one intended by its author, as Thomas himself makes clear in the prologue, to aid the spiritual and intellectual formation of his students. It was once thought that Thomas meant ST to replace Lombard’s Sentences as a university textbook in theology, which, incidentally, did begin to happen as early as one hundred and fifty years after Thomas’ death. Recent scholarship has suggested that Thomas rather composed the work for Dominican students preparing for priestly ministry. This thesis is consistent with what Thomas actually does in ST, which may surprise people who have not examined the work as a whole.

What of the method and content of ST? Like Lombard’s Sentences, Thomas’ ST is organized according to the neo-Platonic schema of exit from and return to God. This is no accident. Thomas thinks it is fitting that divine science should imitate reality not only in content but in form. ST is split into three parts. Part one (often abbreviated “Ia.”) treats God and the nature of spiritual creatures, that is, angels and human beings. Part two treats the return of human beings to God by way of their exercising the virtues, knowing and acting in accord with law, and the reception of divine grace. Given the Fall of human beings, part three (often abbreviated “IIIa.”) treats the means by which human beings come to embody the virtues, know the law, and receive grace: (a) the Incarnation, life, passion, death, resurrection, and ascension of Christ, as well as (b) the manner in which Christ’s life and work is made efficacious for human beings, through the sacraments and life of the Church.

Of the three parts of ST, the second part on ethical matters is by far the longest, which is one reason recent scholarship has suggested that Thomas’ interest in composing ST is more practical than theoretical. We might think of ST as a work in Christian ethics, designed specifically to teach those Dominican priests whose primary duties were preaching and hearing confessions. In fact, part two of ST is so long that Thomas splits it into two parts, where the length of each one of these parts is approximately 600 pages in English translation. The first part of the second part is often abbreviated “IaIIae”; the second part of the second part is often abbreviated “IIaIIae.”

The fundamental unit of ST is known as the article. It is in the article that Thomas works through some particular theological or philosophical issue in considerable detail, although not in too much detail. (Recall Thomas is training priests for ministry, not scholars. For Thomas’ most detailed discussions of a topic, readers should turn to his treatment in his disputed questions, his commentary on the Sentences, SCG, and the Biblical commentaries.) Thomas treats a very specific “yes” or “no” question in each article in accord with the method of the medieval disputatio. That is to say, each article within the ST is, as it were, a mini-dialogue. Each article within ST has five parts. First, Thomas raises a very specific question, for example, “whether law needs to be promulgated.” Second, Thomas entertains some objections to the position that he himself defends on the specific question raised in the article. In other words, Thomas is here fielding objections to his own considered position. Third, Thomas cites some authority (in a section that begins, on the contrary) that gives the reader the strong impression that the position defended in the objections is, in fact, untenable. Oftentimes the authority Thomas cites is a passage from the Old or New Testament; otherwise, it is some authoritative interpreter of Scripture or science such as St. Augustine or Aristotle, respectively. It should be noted the authority cited is in no way, shape, or form Thomas’ final word on the subject at hand. Thomas is well aware that authorities need to be interpreted. Fourth, Thomas develops his own position on the specific topic addressed in the article. This part of the article is oftentimes referred to as the body or the respondeo, literally, I respond. Here, Thomas offers arguments in defense of his own considered position on the matter at issue. 
Sometimes Thomas examines various possible positions on the question at hand, showing why some are untenable whereas others are defensible. At other times, Thomas shows that much of the problem is terminological; if we appreciate the various senses of a term crucial to the science in question, we can show that authorities that seem to be in conflict are simply using an expression with different intended meanings and so do not disagree after all. Fifth, Thomas returns to the objections and answers each of them in light of the work he has done in the body of the article. It should be noted that Thomas often adds interesting details in these answers to the objections to the position he has defended in the body of the article.

In addition to his theological syntheses, Thomas composed numerous commentaries on the works of Aristotle and of neo-Platonic philosophers. For example, Thomas commented on all of Aristotle’s major works, including Metaphysics, Physics, De Anima, and Nicomachean Ethics. These are line-by-line commentaries, and contemporary Aristotle scholars have remarked on their insightfulness, despite the fact that Thomas himself did not know Greek (he was working from Latin translations of Greek editions of Aristotle’s text). The focus in Thomas’ commentaries is certainly on explaining the mind of Aristotle. That being said, given that Thomas sometimes corrects Aristotle in these works (see, for example, his commentary on Physics, book 8, chapter 1), it seems right to say that Thomas’ commentaries on Aristotle are usefully consulted to elucidate Thomas’ own views on philosophical topics as well.

Thomas is often spoken of as an Aristotelian. This is particularly so when speaking of Thomas’ philosophy of language, metaphysics of material objects, and philosophy of science. When it comes to Thomas’ metaphysics and moral philosophy, though, Thomas is equally influenced by the neo-Platonism of Church Fathers and other classical thinkers such as St. Augustine of Hippo, Pope St. Gregory the Great, Proclus, and the Pseudo-Dionysius. One way to see the importance of neo-Platonic thought for Thomas’ own thinking is by noting the fact that Thomas authored commentaries on a number of important neo-Platonic works. These include commentaries on Boethius’ On the Hebdomads, Boethius’ De trinitate, Pseudo-Dionysius’ On the Divine Names, and the anonymous Book of Causes. (The last work Thomas correctly identified as the work of an Arab philosopher who borrowed greatly from Proclus’ Elementatio Theologica and the work of Dionysius; previously it had been thought to be a work of Aristotle’s).

Although Thomas commented on a number of philosophical works, Thomas probably saw his commentaries on Scripture as his most important. (Thomas commented on Job, Isaiah, Jeremiah, Lamentations, Psalms 1-51 (this commentary was interrupted by his death), Matthew, John, Romans, 1 and 2 Corinthians, Galatians, Ephesians, Philippians, Colossians, 1 and 2 Thessalonians, 1 and 2 Timothy, Titus, Philemon, and Hebrews. Thomas also composed a running gloss on the four gospels, the Catena aurea, which consists of a collection of what various Church Fathers have to say about each verse in each of the four gospels.) Thomas understood himself to be, first and foremost, a Catholic Christian theologian. Indeed, theology professors at the University of Paris in Thomas’ time were known as Masters of the Sacred Page. In addition, Thomas was a member of the Dominican order, and the Dominicans have a special regard for teaching the meaning of Scripture.

A reader might wonder why one would mention Thomas’ commentaries on Scripture in an article focused on his contributions to the discipline of philosophy. It is important to mention Thomas’ Scripture commentaries since Thomas often does his philosophizing in the midst of doing theology, and this is no less true in his commentaries on Scripture. To give just one example of the importance of Thomas’ Scripture commentaries for understanding a philosophical topic in his thought, he has interesting things to say about the communal nature of perfect happiness in his commentaries on St. Paul’s letters to the Corinthians and to the Ephesians. A reader who focused merely on Thomas’ treatment of perfect happiness in, for example, the Summa theologiae, would get an incomplete picture of his views on human happiness.

Where talk of Thomas’ philosophy is concerned, there is a final literary genus worth mentioning, the so-called disputed question. Like ST, the articles in Thomas’ disputed questions are organized according to the method of the medieval disputatio. However, whereas a typical article in ST fields three or four objections, it is not uncommon for an article in a disputed question to field 20 objections to the position the master wants to defend. Consider, for example, the question of whether there is power in God. Whereas the article in ST that treats this question fields four objections, the corresponding article in Thomas’ Disputed Questions on the Power of God fields 18 objections. Nonetheless, it would be a mistake to think that Thomas’ disputed questions necessarily represent his most mature discussions of a topic. Although the disputed questions can be regarded as Thomas’ most detailed treatments of a subject, he sometimes changed his mind about issues over the course of his writing career, and the disputed questions do not necessarily represent his last word on a given subject.

2. Faith and Reason

Thomas’ views on the relationship between faith and reason can be contrasted with a number of contemporary views. Consider first an influential position we can label evidentialism. For our purposes, the advocate of evidentialism believes that one should proportion the strength of one’s belief B to the amount of evidence one has for the truth of B, where evidence for a belief is construed either (a) as that belief’s correspondence with a proposition that is self-evident, indubitable, or immediately evident from sense experience, or (b) as that belief’s being supported by a good argument, where such an argument begins from premises that are self-evident, indubitable, or immediately evident from sense experience (see Plantinga [2000, pp. 67-79] and Rota [2012]). Evidentialism, so construed, is incompatible with a traditional religious view that Thomas holds about divine faith: if Susan has divine faith that p, then Susan has faith that p as a gift from God, and Susan reasonably believes that p with a strong conviction, not on the basis of Susan’s personally understanding why p is true, but on the basis of Susan’s reasonably believing that God has divinely revealed that p is true. In other words, divine faith is a kind of certain knowledge by way of testimony for Thomas.

Fideism is another position with which we can contrast Thomas’ views on faith and reason. For our purposes, consider fideism to be the view that states that faith is the only way to apprehend truths about God. Put negatively, the fideist thinks that human reason is incapable of demonstrating truths about God philosophically.

Finally, consider the position on faith and reason known as separatism. According to separatism, philosophy and natural science, on the one hand, and revealed theology, on the other, are incommensurate activities or habits. Any talk of conflict between faith and reason always involves some sort of confusion about the nature of faith, philosophy, or science.

In contrast to the views mentioned above, Thomas not only sees a significant role for both faith and reason in the best kind of human life (contra evidentialism), but he thinks reason apart from faith can discern some truths about God (contra fideism), as epitomized by the work of a pagan philosopher such as Aristotle (see, for example, SCG I, chapter 3). Thomas also recognizes that revealed theology and philosophy are concerned with some of the same topics (contra separatism). Although treating some of the same topics, Thomas thinks it is not possible in principle for there to be a real and significant conflict between the truths discovered by divine faith and theology on the one hand and the truths discerned by reason and philosophy on the other. In fact, Thomas thinks it is a special part of the theologian’s task to explain just why any perceived conflicts between faith and reason are merely apparent and not real and significant conflicts (see, for example, ST Ia. q. 1, a. 8). Indeed, showing that faith and reason are compatible is one of the things Thomas attempts to do in his own works of theology. A diverse group of subsequent religious thinkers have looked to Thomas’ modeling the marriage of faith and reason as one of his most important contributions.

One place where Thomas discusses the relationship between faith and reason is SCG, book I, chapters 3-9. Thomas notes there that there are two kinds of truths about God: those truths that can be apprehended by reason apart from divine revelation, for example, that God exists and that there is one God (in the Summa theologiae, Thomas calls such truths about God the preambles to the faith) and those truths about God the apprehension of which requires a gift of divine grace, for example, the doctrine of the Trinity (Thomas calls these the articles of faith). Although the truth of the preambles to the faith can be apprehended without faith, Thomas thinks human beings are not rationally required to do so. In fact, Thomas argues that three awkward consequences would follow if God required all human beings to apprehend the preambles to the faith by way of philosophical argumentation.

First, very few people would come to know truths about God and, since human flourishing requires certain knowledge of God, God wants to be known by as many people as possible. Not everyone has the native intelligence to do the kind of work in philosophy required to understand an argument for the existence of God. Among those who have the requisite intelligence for such work, many do not have the time it takes to apprehend such truths by philosophy, being engaged as they are in other important tasks such as taking care of children, manual labor, feeding the poor, and so forth. Finally, among those who have the natural intelligence and time required for serious philosophical work, many do not have the passion for philosophy that is also required to arrive at an understanding of the arguments for the existence of God.

Second, of the very few who could come to know truths about God philosophically, these would apprehend these truths with anything close to certainty only late in their life, and Thomas thinks that people need to apprehend truths such as the existence of God as soon as possible. (Compare here with a child learning that it is wrong to lie; parents wisely want their children to learn this truth as soon as possible.) In order to understand why Thomas thinks that the existence of God is a truth discernible by way of philosophy only late in life, we need to appreciate his view of philosophy, metaphysics, and natural theology. Philosophy is a discipline we rightly come to only after we have gained some confidence in other disciplines such as arithmetic, grammar, and logic. Among the philosophical disciplines, metaphysics is the most difficult and presupposes competence in other philosophical disciplines such as physics (as it is practiced, for example, in Aristotle’s Physics, which we might call philosophical physics, that is, reflections on the nature of change, matter, motion, and time). Finally, demonstrating the existence of God is the hardest part of metaphysics. If we are to apprehend with confidence the existence of God by way of philosophy, this will happen only after years of intense study and certainly not during childhood, when we might think that Thomas believes it is important, if not necessary, for it to happen.

Third, let us suppose Susan has the native intelligence, time, passion, and experience requisite for apprehending the existence of God philosophically and that she does, in fact, come to know that God exists by way of a philosophical argument. Thomas maintains that such an apprehension is nonetheless going to be deficient for it will not allow Susan to be totally confident that God exists, since Susan is cognizant—being the philosopher she is—that there is a real possibility she has made a mistake in her philosophical reasoning. However, the good life, for example, living like a martyr, requires that we possess an unshakeable confidence that God exists. Since God wants as many people as possible to apprehend his existence, and to do so as soon as possible and with the kind of confidence enjoyed by the Apostles, saints, and martyrs, Thomas argues that it is fitting that God divinely reveals to human beings—even to theologians who can philosophically demonstrate the existence of God—the preambles to the faith, that is, those truths that can be apprehended by human reason apart from divine faith, so that people from all walks of life can, with great confidence, believe that God exists as early in life as possible.

However, does it make sense to believe things about God that exceed the natural capacity of human reason? Thomas thinks the answer is “yes,” and he defends this answer in a number of ways. Two are mentioned here. First, Thomas thinks it sensible of God to ask human beings to believe things about God that exceed their natural capacities since to do so reinforces in human beings an important truth about God, namely, that God is such that He cannot be completely understood by way of our natural capacities. If we say we completely understand God by way of our natural capacities, then we do not understand what “God” means. Talk about God, for Thomas, requires that we recognize our limitations with respect to such a project. God’s asking us to believe things about Him that we cannot apprehend philosophically makes sense for Thomas because it alerts human beings to the fact that we cannot know God in the same way we know the objects of other sciences.

Thomas also notes that believing things about God by faith perfects the soul in a manner that nothing else can. Here Thomas draws on the testimony of Aristotle, who thinks that even a little knowledge of the highest and most beautiful things perfects the soul more than a complete knowledge of earthly things. Although we cannot understand the things of God that we apprehend by faith in this life, even a slim knowledge of God greatly perfects the soul. Just as a bit of real knowledge of human beings is better for Susan’s soul than Susan’s knowing everything there is to know about carpenter ants, Susan’s possessing knowledge about God by faith is better for Susan’s soul than Susan’s knowing scientifically everything there is to know about the cosmos.

Still, we might wonder why Thomas thinks it is reasonable to accept the Catholic faith as opposed to some other faith tradition that, like the Catholic faith, asks us to believe things that exceed the capacity of natural reason. One thing Thomas says is that some non-Catholic religious traditions ask us to believe things that are contrary to what we can know by natural reason. Thomas accepts the medieval maxim that “grace does not destroy nature or set it aside; rather grace always perfects nature.” Although the Catholic faith takes us beyond what natural reason by itself can apprehend, according to Thomas, it never contradicts what we know by way of natural reason. Therefore, any real conflicts between faith and reason in non-Catholic religious traditions give us a reason to prefer the Catholic faith to non-Catholic faith traditions.

In addition, Thomas thinks there are good—although non-demonstrative—arguments for the truth of the Catholic faith. Thomas begins with the accounts of healings, the resurrection of the dead, and miraculous changes in the heavenly bodies, as contained in the Old and New Testaments. These accounts of miracles—which Thomas takes to be historically reliable—offer confirmation of the truthfulness of the teaching of those who perform such works by the grace of God. Even more significant, thinks Thomas, is the fact that simple fishermen were transformed overnight into apostles, that is, eloquent and wise men. Thomas takes this to be a miracle that provides confirmation of the truth of the Catholic faith the apostles preached. Most powerful of all, according to Thomas, the Catholic faith spread throughout the world in the midst of great persecutions. As Thomas notes, the Catholic faith was not initially embraced because it was economically advantageous to do so; nor did it spread—as other religious traditions have—by way of the sword; in fact, people flocked to the Catholic faith—as Thomas notes, both the simple and the learned—despite the fact that it teaches things that surpass the natural capacity of the intellect and demands that people curb their desires for the pleasures of the flesh. Given human nature, Thomas thinks that such conversions were miraculous and so testify to the truth of the faith that such people came to adopt.

3. Philosophy of Language: Analogy

Any discussion of Thomas’ views concerning what something is, for example, goodness or knowledge or form, requires some stage-setting. Much of contemporary analytic philosophy and modern science operates under the assumption that any discourse D that deserves the honor of being called scientific or disciplined requires that the terms employed within D not be used equivocally. Thomas agrees, but with a very important caveat. Thomas distinguishes two different kinds of equivocation: uncontrolled (or complete) equivocation and controlled equivocation (or analogous predication). While the former is incompatible with a discourse being scientific or disciplined, according to Thomas, the latter is not. Thomas therefore distinguishes three different ways words are used: univocally, equivocally (in a sense that is complete or uncontrolled), and analogously, that is, equivocally but in a manner that is controlled. When we use a word univocally, we predicate of two things (x and y) one and the same name n, where n has precisely the same meaning when predicated of x and y. For example, think of the locutions, “the cat is an animal” and “the dog is an animal.” Here, the same word “animal” is predicated of two different things, but the meaning of “animal” is precisely the same in both instances. By contrast, when we use a word equivocally, two things (x and y) are given one and the same name n, where n has one meaning when predicated of x and a different meaning when predicated of y. For example, we use the very same word “bank” to refer to a place where we save money and that part of the land that touches the edge of a river.

Importantly, Thomas notices that some instances of equivocation are controlled, or instances of analogous predication, whereas other instances of equivocal naming are complete or uncontrolled. In a case of complete or uncontrolled equivocation, we predicate of two things (x and y) one and the same name n, where n has one meaning when predicated of x and n has a completely different meaning when predicated of y. English usage of the word “bank” is a good example of complete or uncontrolled equivocation; here the use of the same name is totally an accident of language. It is a matter of linguistic chance that “bank” has these two totally different and unrelated meanings in English.

By contrast, in a case of controlled equivocation or analogous predication, we predicate of two things (x and y) one and the same name n, where n has one meaning when predicated of x, n has a different but not unrelated meaning when predicated of y, and one of these meanings is primary whereas the other derives its meaning from the primary meaning. For example, consider the manner in which we use the word “good.” We sometimes speak of “good dogs,” and sometimes we say things such as “Doug is a good man.” The meanings of “good” in these two locutions obviously differ one from another since in the former no moral commendation is implied, whereas moral commendation is implied in the latter. However, it also seems right to say that “good” is not being used in completely different and unrelated ways in these locutions. Rather, our speaking of “good dogs” derives its meaning from the primary meaning of “good” as a way to offer moral commendation of human beings. We thus use the word “good” as an analogous expression in Thomas’ sense. To take an example Aristotle uses, “healthy” is used in the primary sense in a locution such as “Joe is healthy.” We might also say “Joe’s urine is healthy,” which uses “healthy” to pick out a sign of Joe’s health (in the primary sense of that term), or “exercise is healthy,” which uses “healthy” to pick out a cause of health (again, in the primary sense).

Thomas takes analogous predication or controlled equivocation to be sufficient for good science and philosophy, assuming, of course, that the other relevant conditions for good science or philosophy are met. Although the most famous use to which Thomas puts his theory of analogous naming is his attempt to make sense of a science of God, analogous naming is relevant where many other aspects of philosophy are concerned, Thomas thinks. For example, we also use words analogously when we talk about being, knowledge, causation, and even science itself. Thomas therefore sees a significant difference between complete equivocation and controlled equivocation or analogous naming. Whereas the scientist qua scientist must avoid the former, a discipline that uses words in the latter sense can properly be understood to be scientific or disciplined.

4. Epistemology

a. The Nature of Knowledge and Science

Thomas is aware of the fact that there are different forms of knowledge. One form of knowledge that is particularly important to a 13th-century professor such as Thomas is scientific knowledge (scientia). However, Thomas recognizes that scientific knowledge itself depends upon there being non-scientific kinds of knowledge, for example, sense knowledge and knowledge of self-evident propositions (about each of which, there is more below). We can begin to get a sense of what Thomas means by scientia by way of his discussion of faith, which is a form of knowledge he often contrasts with scientia (see, for example, ST IIaIIae. q. 1, aa. 4-5; q. 2, a. 1). According to Thomas, faith and scientia are alike in being subjectively certain. If I believe that p by faith, then I am confident that p is true. It is likewise with scientific knowledge. However, the reason for one’s being confident that p differs in the cases of faith and scientia. If I know that p by way of science, then I not only have compelling reasons that p, but I understand why those reasons compel me to believe that p. In contrast to scientia, the certainty of faith that p is grounded for Thomas in a rational belief that someone else has scientia or intellectual vision with respect to p. Thus, the certainty of faith is grounded in someone else’s testimony—in the case of divine faith, the testimony of God. For Thomas, faith can and, at least for those who have the time and talent, should be supported by reasons. However, if Susan believes p by faith, Susan may see that p is true, but she does not see why p is true. Susan’s belief that p is ultimately grounded in confidence concerning some other person, for example, Jane’s epistemic competence, where Jane’s competence involves seeing why p is true, either by way of Jane’s having scientia of p, because Jane knows that p is self-evidently true, or because Jane has sense knowledge that p.

We should note that, for Thomas, scientia itself is a term that we rightly use analogously. For example, in speaking of science, we could be talking about an act of inquiry whereby we draw certain conclusions, not previously known, from things we already know, that is, starting from first principles, where these principles are themselves known by way of (reflection upon our) sense experiences, we draw out the logical implications of such principles. We can contrast science as an act of inquiry with another kind of speculative activity that Thomas calls contemplation. Both science (in the sense of engaging in an act of inquiry) and contemplation are acts of speculative intellect according to Thomas, that is, they are uses of intellect that have truth as their immediate object. (In contrast, practical uses of intellect are acts of intellect that aim at the production of something other than what is thought about, for example, thinking at the service of doing the right thing, in the right way, at the right time, and so forth, or thinking at the service of bringing about a work of art.) Thomas thinks that, whereas an act of scientific inquiry aims at discovering a truth not already known, an act of contemplation aims at enjoying a truth already known.

We can speak of science not only as an act of inquiry, but also as a particularly strong sort of argument for the truth of a proposition that Thomas calls a scientific demonstration. If a person possesses a scientific demonstration of some proposition p, then he or she understands an argument that p such that the argument is logically valid and he or she knows with certainty that the premises of the argument are true.

In addition to the senses of science mentioned above, Thomas also recognizes the Aristotelian sense of scientia as a particular kind of intellectual habit or disposition or virtue, which habit is the fruit of scientia as scientific inquiry and requires the possession of scientific demonstrations. But science in the sense of a habit is more than the fruit of inquiry and the possession of arguments. Science as a habit is a person’s possession of an organized body of knowledge of and demonstrative argumentation about some subject matter S, where possessing an organized body of knowledge of and demonstrative argumentation about some subject matter is a function of knowing (a) the basic facts about S, that is, the characteristic properties or powers of things belonging to S, as well as (b) the principles, causes, or explanations of these properties or powers of S, and (c) the logical connections between (a) and (b). For example, according to this model of science, I have a scientific knowledge of living things qua living things only if I know the basic facts about all living things, for example, that living things grow and diminish in size over time, nourish themselves, and reproduce, and I know why living things have these characteristic powers and properties. According to Thomas, a science as habit is a kind of intellectual virtue, that is, a habit of knowledge about a subject matter, acquired from experience, hard work, and discipline, where the acquisition of that habit usually involves having a teacher or teachers. A person who possesses a science s knows the right kind of starting points for thinking about s, that is, the first principles or indemonstrable truths about s, and the scientist can draw correct conclusions from these first principles. In other words, if one has a science of s, one’s knowledge of s is systematic and controlled by experience, and so one can speak about s with ease, coherence, clarity, and profundity.

Thomas notes that the first principles of a science are sometimes naturally known by the scientist, for example in the cases of arithmetic and geometry (ST Ia. q. 1, a. 2). According to Thomas, the science of sacred theology does not fit this characterization of science since the first principles of sacred theology are articles of faith and so are not known by the natural light of reason but rather by the grace of God revealing the truth of such principles to human beings. Of course, contemporary philosophers of science would not find sacred theology’s inability to fit neatly into a well-defined univocal conception of science to be a problem for the scientific status of sacred theology. Think of the demarcation problem, that is, the problem of identifying necessary and sufficient conditions for some discourse counting as science. The demarcation problem suggests that science is a term we use analogously. This is what Thomas thinks. For example, Thomas recognizes that, even among those sciences whose first premises are known to some human beings by the natural light of reason, there are some sciences (call them “the xs”) such that scientists practicing the xs, at least where knowledge of some of the first principles of the xs is concerned, depend upon the testimony of scientists in disciplines other than their own. For example, optics makes use of principles treated in geometry, and music makes use of principles treated in mathematics. If, for example, all musicians had to be experts at mathematics, most musicians would never get to practice the science of music itself. Thus, musicians take the principles and findings of mathematics as a starting point for the practice of their own science. Like optics and music, therefore, sacred theology draws on principles known by those with a higher science, in this case, the science possessed by God and the blessed (see, for example, ST Ia. q. 1, a. 2, respondeo). 
Unlike optics, music, and other disciplines studied at the university, the principles of sacred theology are not known by the natural light of reason. However, sacred theology is nonetheless a science, since those who possess such a science can, for example, draw logical conclusions from the articles of faith, argue that one article of faith is logically consistent with the other articles of faith, and answer objections to the articles of faith, doing all of these things systematically, clearly, and with ease by drawing on the teachings of other sciences, including philosophy (ST Ia. q. 1, a. 8).

b. The Extension of Science

Given his notion of science (whether taken as activity, demonstrative argument, or intellectual virtue), we might think that Thomas understands the extension of science to be wider than what most of our contemporaries would allow. There is a sense in which this is true. Although there is certainly disagreement among our contemporaries over the scientific status of some disciplines studied at modern universities, for example, psychology and sociology, all agree that disciplines such as physics, chemistry, and biology are to be counted among the sciences. The demarcation problem notwithstanding, we tend to think of science as natural science, where a natural science constitutes a discipline that studies the natural world by way of looking for spatio-temporal patterns in that world, where “the way of looking” tends to involve controlled experiments (Artigas 2000, p. 8). Thomas would have known something of science in this sense from his teacher St. Albert the Great (c. 1206-1280). However, for Thomas (for whom science is understood as a discipline or intellectual virtue), disciplines such as mathematics, music, philosophy, and theology count as sciences too, since those who practice such disciplines can talk about the subjects studied in those disciplines in a way that is systematic, orderly, capacious, and controlled by common human experience (and, in some cases, in the light of the findings of other sciences).

On the other hand, there is a sense in which Thomas’ understanding of science is more restrictive than the contemporary notion. Thomas follows Aristotle in thinking that we know something x scientifically only if our knowledge of x is certain. That is to say, we have demonstrative knowledge of x, that is, our knowledge begins from premises that we know with certainty by way of reflection upon sense experience, for example, all animals are mortal or there cannot be more in the effect than in its cause or causes, and ends by drawing logically valid conclusions from those premises. However, it seems to be a hallmark of the modern notion of science that the claims of science are, in fact, fallible, and so, by definition, uncertain.

c. The Four Causes

No account of Thomas’ philosophy of science would be complete without mentioning the doctrine of the four causes. Following Aristotle, Thomas thinks the most capacious scientific account of a physical object or event involves mentioning its four causes, that is, its efficient, material, formal, and final causes. Of course, some things (of which we could possibly have a science of some sort) do not have four causes for Thomas. For example, immaterial substances will not have a material cause. However, Thomas thinks that material objects—whether natural or artificial—do have four causes. For example, any material object O has four causes: the material cause (what O is made of), the formal cause (what O is), the final cause (what the end, goal, purpose, or function of O is), and the efficient cause (what brings—or conserves—O in(to) being). One has a scientific knowledge of O (or O’s kind) only if one knows all four causes of O or the kind to which O belongs. Here follows a more detailed account of each of the four causes as Thomas understands them.

i. The Efficient Cause

An efficient cause of x is a being that acts to bring x into existence, preserve x in existence, perfect x in existence, or otherwise bring about some feature F in x. For example, Michelangelo was the efficient cause of the David. Thomas thinks that there are different kinds of efficient causes, which kinds of efficient causes may all be at work in one and the same object or event, albeit in different ways. For example, Thomas thinks that God is the primary efficient cause of any created being, at every moment in which that created being exists. That is, if it were not for God’s timelessly and efficiently causing a creature to exist at some time t, that creature would not exist at t. God’s act of creation and conservation with respect to some creature C does not rule out that C also simultaneously has creatures as secondary efficient causes of C. This is because God and creatures are efficient causes in different and yet analogous senses. God is the primary efficient cause as creator ex nihilo, timelessly conserving the very existence of any created efficient cause at every moment that it exists, whereas creatures are secondary efficient causes in the sense that they go to work on pre-existing matter such that matter that is merely potentially F actually becomes F. For example, we might say that a sperm cell and female gamete work on one another at fertilization and thereby function as secondary efficient causes of a human being H coming into existence. To continue with this example, Thomas thinks that God, too, is at work as the primary efficient cause of H’s coming into existence, since, for example, (a) God is the creating and conserving cause of (i) any sperm cell as long as it exists, (ii) any female gamete as long as it exists, and (iii) all aspects of the environment necessary for successful fertilization. In addition, Thomas thinks (b) God is the creating and conserving cause of the existence of H itself as long as H exists.

ii. The Material Cause

Thomas thinks that “material cause” (or simply “matter”) is an expression that has a number of different but related meanings. Perhaps the most obvious sense of “matter” is what “garden-variety” objects and their “garden-variety” parts are made of. In this sense of “matter,” the material cause of an axe is some iron and some wood.

There is one sense of “matter” that is very important for an analysis of change, thinks Thomas. Matter in this sense explains why x is capable of being transformed into something that x currently is not. The material cause in this sense is the subject of change—that which explains how something can lose the property not-F and gain the property F. For example, the material cause for an accidental change is some substance. Socrates himself is the material cause of the change that consists in Socrates’ losing the property of not-standing and gaining the property of standing. Such a change is accidental since the substance we name Socrates does not in this case go out of existence in virtue of losing the property of not-standing and gaining the property of standing.

The material cause for a substantial change is what medieval interpreters of Aristotle such as Thomas call prima materia (prime or first matter). Prime matter is that cause of x that is intrinsic to x (we might say, is a part of x) that explains why x is subject to substantial change. For Thomas, substances are unified objects of the highest order. Substances, for example, living things, are thus to be directly contrasted with heaps or collections of objects, for example, a pile of garbage or an army. Thomas thinks that if substantial changes had actual substances functioning as the ultimate subjects for those substantial changes, then it would be reasonable to call into question the substantial existence of those so-called substances that are (supposedly) composed of such substances. If Socrates were composed, say, of Democritean atoms that were substances in their own right, then Socrates, at best, would be nothing more than an arrangement of atoms. He would merely be an accidental being—an accidental relation between a number of substances—instead of a substance. At worst, Socrates would not exist at all (if we think the only substances are fundamental entities such as atoms, and Socrates is not an atom). Since Thomas thinks of Socrates as a paradigm case of a substance, he thus thinks that the matter of a substantial change must be something that is in and of itself not actually a substance but is merely the ultimate material cause of some substance. Thomas calls this ultimate material cause of a substance that can undergo substantial change prime matter. For example, consider that a bear eats a bug at t, so that the bug exists in space s, that is, the bear’s stomach, at t. Some prime matter therefore is configured by the substantial form of a bug in s at t such that there is a bug in s at t. 
At time t+1, when the bug dies in the bear’s stomach, the prime matter in s loses the substantial form of a bug and that prime matter comes to be configured by a myriad of substantial forms such that the bug no longer exists at t+1. What exists in s at t+1 is a collection of substances, for example, living cells arranged bug-wise, where the cells themselves will soon undergo substantial changes so that what will exist is a collection of non-living substances, for example, the kinds and numbers of atoms and molecules that compose the living cells of a living bug.

That being said, Thomas thinks prime matter never exists without being configured by some form. First of all, matter always exists under dimensions, and so this prime matter (rather than that prime matter) is configured by the accidental form of quantity, and more specifically, the accidental quantity of existing in three dimensions (see, for example, Commentary on Boethius’ De trinitate q. 4, a. 2, respondeo). In addition, it is never the case that some prime matter exists without being configured by some substantial form. For example, some quantity of prime matter m might be configured by the substantial form of an insect at t, be configured by the substantial forms of a collection of living cells at t+1 (for example, some moments after the insect has been eaten by a frog), be configured by the substantial forms of a collection of chemical compounds at t+2, and be incorporated into the body of a frog as an integral part of the frog such that it is configured by the frog’s substantial form at t+3. A portion of prime matter is always configured by a substantial form, though not necessarily this or that substantial form.

Note the theoretical significance of the view that material substances are composed of prime matter as a part. Prime matter is the material causal explanation of the fact that a material substance S’s generation and (potential) corruption are changes that are real (contra Parmenides of Elea), substantial (contra atomists such as Democritus), natural (contra those who might say that all substantial changes are miraculous), and intelligible (contra Heraclitus of Ephesus and Plato of Athens).

iii. The Formal Cause

Like the material cause of an object, the expression formal cause is said in many ways. There are at least three for Thomas. First, formal cause might mean “the nature or definition of a thing,” that is, what-it-is-to-be S. The formal cause of a primary substance x in this sense is the substance-sortal that picks out what x is most fundamentally or the definition of that substance-sortal. For example, for Socrates this would be human being, or, what-it-is-to-be-a-human being, and, given that human beings can be defined as rational animals, rational animal. Although Socrates certainly belongs to other substance-sortals, for example, animal, living thing, rational substance, and substance, such substance-sortals only count as genera to which Socrates belongs; they do not count as Socrates’ infima species, that is, the substance-sortal that picks out what Socrates is most fundamentally. Of course, Socrates can be classified in many other ways, too, for example, as a philosopher or someone who chose not to flee his Athenian prison. However, such classifications are not substantial for Thomas, but merely accidental. Socrates need not be (or have been) a philosopher (for example, Socrates was not a philosopher when he was two years old), nor need he have been someone who chose not to flee his Athenian prison, for even Socrates might have failed to live up to his principles on a given day.

A second sense that formal cause can have for Thomas is that which is intrinsic to or inheres in x and explains that x is actually F. There are two kinds of formal cause in this sense for Thomas. First, there are accidental forms (or simply, accidents). Accidental forms inhere in a substance and explain that a substance x actually is F, where F is a feature that x can gain or lose without x’s ceasing to exist, for example, Socrates’ being tan, Socrates’ weighing 180 lbs, and so forth. Second, there are substantial forms. According to Thomas, substantial forms are particulars—each individual substance has its own individual substantial form—and the substantial form of a substance is the intrinsic formal cause of (a) that substance’s being and (b) that substance’s belonging to the species that it does. A substantial form is a form intrinsic to x that explains the fact that x is actually F, where F is a feature that x cannot gain or lose without ceasing to exist, for example, Socrates’ property being an animal.

A third sense of formal cause for Thomas is the pattern or definition of a thing insofar as it exists in the mind of the maker. Thomas calls this the exemplar formal cause. For example, the form of a house can exist insofar as it is instantiated in matter, for example, in a house. However, the form of (or plan for) a house can also exist in the mind of the architect, even before an actual house is built. It is the form in this latter sense that serves as the exemplar formal cause. For Thomas, following St. Augustine, some of the ideas of God are exemplar formal causes in this sense: God’s idea of the universe in general, God’s idea of what-it-is-to-be a human being, and so forth, function, as it were, as plans or archetypes in the mind of the Creator for created substances.

iv. The Final Cause

The final cause of an object O is the end, goal, purpose, or function of O. Some material objects have functions as their final causes, namely, artifacts and the parts of organic wholes. For example, the function of a knife is to cut, and the purpose of the heart is to pump blood. Therefore, the final cause of the knife is to cut; the final cause of the heart is to pump blood. Thomas thinks that all substances have final causes. However, Thomas (like Aristotle) thinks of the final cause in a manner that is broader than what we typically mean by function. It is a mistake, therefore, to think that all substances for Thomas have functions in the sense that artifacts or the parts of organic wholes have functions as final causes (we might say that all functions are final causes, but not all final causes are functions). For example, Thomas does not think that clouds have functions in the sense that artifacts or the parts of organic wholes do, but clouds do have final causes. In the broadest sense, that is, in a sense that would apply to all final causes, the final cause of an object is an inclination or tendency to act in a certain way, where such a way of acting tends to bring about a certain range of effects. For example, a knife is something that tends to cut. A cloud is a substance that tends to interact with other substances in the atmosphere in certain ways, ways that are not identical to the ways that either oxygen per se or nitrogen per se tends to interact with other substances.

For Thomas, the final cause is “the cause of all causes” (On the Principles of Nature, ch. 4) and so the final, formal, efficient, and material causes go “hand in hand.” If an object has a tendency to act in a certain way, for example, frogs tend to jump and swim, that tendency (final causality) requires that the frog has a certain formal cause, that is, it is a thing of a certain kind. In addition, things that jump and swim must be composed of certain sorts of stuffs and certain sorts of organs. Frogs, since they are by nature things that flourish by way of jumping and swimming, are composed of bone, blood, and flesh, as well as limbs that are good for jumping and swimming. Finally, a frog’s jumping is something the frog does insofar as it is a frog, given the frog’s form and final cause. That is to say, it is clear that the frog acts as an efficient cause when it jumps, since a frog is the sort of thing that tends to jump (rather than fly or do somersaults). Contrast the frog that is unconscious and pushed such that it falls down a hill. In so falling, the frog is not acting as an efficient cause.

As we have seen, some final causes are functions, whereas it makes better sense to say that some final causes are not functions but rather ends or goals or purposes of the characteristic efficient causality of the substances that have such final causes. In closing this section, we can note that some final causes are intrinsic whereas others are extrinsic. According to Thomas, each and every substance tends to act in a certain way rather than other ways, given the sort of thing it is; such goal-directedness in a substance is its intrinsic final causality. However, sometimes an object O acts as an efficient cause of an effect E (partly) because of the final causality of an object extrinsic to O. Call such final causality extrinsic. For example, John finds Jane attractive, and thereby John decides to go over to Jane and talk to her. John’s own desire for happiness, happiness that John currently believes is linked to Jane, is part of the explanation for why John moves closer to Jane and is a good example of intrinsic final causality, but Jane’s beauty is also a final cause of John’s action and is a good example of extrinsic final causality.

d. The Sources of Knowledge: Thomas’ Philosophical Psychology

Thomas thinks there are different kinds of knowledge, for example, sense knowledge, knowledge of individuals, scientia, and faith, each of which is interesting in its own right and deserving of extended treatment where its sources are concerned. For present purposes, we shall focus on what Thomas takes to be the sources of knowledge requisite for knowledge as scientia, and, since Thomas recognizes different senses of scientia, what Thomas takes to be the sources for knowledge as a scientific demonstration of a proposition in particular.

As we have seen, if a person possesses scientia with respect to some proposition p for Thomas, then he or she understands an argument that p such that the argument is logically valid and he or she knows the premises of the argument with certainty. Therefore, one of the sources of scientia for Thomas is the operation of the intellect that Thomas calls reasoning (ratiocinatio), that is, the act of drawing a logically valid conclusion from other propositions (see, for example, ST Ia. q. 79, a. 8). Thomists sometimes call reasoning the third act of the intellect.

How do we come to know the premises of a demonstration with certainty? Our coming to know with certainty the truth of a proposition, Thomas thinks, potentially involves a number of different powers and operations, each of which is rightly considered a source of scientia. Before we speak of the intellectual powers and operations (in addition to ratiocination) that are at play when we come to have scientia, we must first say something about the non-intellectual cognitive powers that are sources of scientia for Thomas.

Thomas agrees with Aristotle that the intellectual powers differ in kind from the sensitive powers such as the five senses and imagination. Nonetheless, Thomas also thinks that all human knowledge in this life begins with sensation. Even our knowledge of God begins, according to Thomas, with what we know of the material world. Since God, for Thomas, is immaterial, the claim that “knowledge… begins in sense” (Disputed Questions on Truth, q. 1, a. 11, respondeo) should not be thought to mean that knowledge of x requires that we can form an accurate image of x. Thomas’ claim rather means that knowledge of any object x presupposes some (perhaps prior) activity on the part of the senses. Indeed, Thomas thinks that sensation is so tightly connected with human knowing that we invariably imagine something when we are thinking about anything at all. Of course, if God exists, that means that what we imagine when we think about God bears little or no relation to the reality, since God is not something sensible. Given the importance of sense experience for knowledge for Thomas, we must mention certain sense powers that are preambles to any operation of the human intellect.

In addition to the five exterior senses (see, for example, ST Ia. q. 78, a. 3), Thomas argues that a capacious account of human cognition requires that we mention various interior senses as preambles to proper intellectual activity (see, for example, ST Ia. q. 78, a. 4). For in order for perfect animals (that is, animals that move themselves, such as horses, oxen, and human beings [see, for example, Commentary on Aristotle’s De Anima, n. 255]) to make practical use of what they cognize by way of the exterior senses, they must have a faculty that senses whether or not they are, in fact, sensing, for the faculties of sight, hearing, and so forth themselves do not confer this ability. In addition, none of the exterior senses enables their possessor to distinguish between the various objects of sense, for example, the sense of sight does not cognize taste, and so forth. Therefore, the animal must have a faculty in addition to the exterior senses by which the animal can identify different kinds of sensations, for example, of color, smell, and so forth with one particular object of experience. We might think that it is some sort of intellectual faculty that coordinates different sensations, but not all animals have reason. Therefore, animals must have an interior sense faculty whereby they sense that they are sensing, and that unifies the distinct sensations of the various sense faculties. Thomas calls this faculty, following Avicenna, the common sense (not to be confused, of course, with common sense as that which most ordinary people know and professors are often accused of not possessing). Since, for Thomas, human beings are animals too, they also possess the faculty of common sense.

In addition to the common sense, Thomas argues that we also need what philosophers have called phantasy or imagination to explain our experience of the cognitive life of animals (including human beings). For, clearly, perfect animals sometimes move themselves to a food source that is currently absent. Therefore, such animals need to be able to imagine things that are not currently present to the senses but have been cognized previously in order to explain their movement to a potential food source. On the assumption that, in corporeal things, to receive and retain are reduced to diverse principles, Thomas argues the faculty of imagination is thus distinct from the exterior senses and the common sense. He also notes that imagination in human beings is interestingly different from that of other animals insofar as human beings, but not other animals, are capable of imagining objects they have never cognized by way of the exterior senses, or objects that do not in fact exist, for example, a golden mountain.

In Thomas’ view, we cannot explain the behavior of perfect animals simply by speaking of the pleasures and pains that such creatures have experienced. Thus, we need to posit two additional powers in those animals. The estimative power is that power by which an animal perceives certain cognitions instinctively, for example, the sheep’s cognition that the wolf is an enemy or the bird’s cognition that straw is useful for building a nest (for neither the sheep nor the bird knows this simply by way of what it cognizes by way of the exterior senses). The memorative power is that power that retains cognitions produced by the estimative power. Since (a) the estimative sense and common sense are different kinds of powers, (b) the common sense and the imagination are different kinds of powers, and (c) the estimative power can be compared to the common sense whereas the memorative power can be compared to the imagination, it stands to reason that the estimative power and the memorative power are different powers.

Just as intellect in human beings makes a difference in the functioning of the faculty of imagination for Thomas, so also does the presence of intellect in human beings transform the nature of the estimative and memorative powers in human beings. As Thomas notes, this is why the estimative and memorative powers have been given special names by philosophers: the estimative power in human beings is called the cogitative power and the memorative power is called the reminiscitive power. The cogitative power in human beings is that power that enables human beings to make an individual thing, event, or phenomenon, qua individual thing, event, or phenomenon, an object of thought. For example, if Joe comes to believe “this man is wearing red,” he does so partly in virtue of an operation of the cogitative power, since Joe is thinking about this man and his properties (and not simply man in general and redness in general, both of which, for Thomas, are cognized by way of an intellectual and not a sensitive power; see below). Similarly, if I come to think, “I should not steal,” I do so partly by way of my cogitative power according to Thomas insofar as I am ascribing a property to an individual thing, in this case, myself. As for the reminiscitive power, it enables its possessor to remember cognitions produced by the cogitative power. In other words, it helps us to remember intellectual cognitions about individual objects. For example, say that I am trying to remember the name of a particular musician. I employ the reminiscitive power when I think about the names of other musicians who play on recordings with the musician whose name I cannot now remember but want to remember.

Having said something about the non-intellectual, cognitive sources of scientia for Thomas, we can return to speaking of the properly intellectual powers and activities of human beings necessary for scientia. According to Thomas, there are two powers of the intellect, powers Thomas calls the active intellect and the passive intellect, respectively. Thomas thinks that the intellect has what he calls a passive power since human beings come to know things they did not know previously (see, for example, ST Ia. q. 79, a. 2). In being able to do this, human beings are unlike the angels, Thomas thinks, since, according to Thomas, the angels are created actually knowing everything they will naturally know. (According to Thomas, the blessed angels do come to have supernatural knowledge, namely, knowledge of the essence of God in the beatific vision.) Following Aristotle, Thomas believes that the intellect of a human being, in contrast to that of an angel, is a tabula rasa at the beginning of its existence. The passive intellect of a human being is that which receives what a person comes to know; it is also the power by which a human being retains, intellectually, what is received. For Thomas, therefore, the passive intellect plays the role of memory where knowledge of the nature of things is concerned (see, for example, ST Ia. q. 79, a. 7). For example, say John does not know what a star is at time t. He reads about stars at t+1 and in doing so comes to know the nature of a star. Since John’s intellect has been altered such that he knows something he did not know before, there must be a power that explains this ability to receive knowledge; for Thomas, it is John’s passive intellect, that is, the intellect insofar as John can come to know something he did not know before.

Whereas the passive intellect is that which receives and retains an intelligible form, what Thomas calls the active intellect is the efficient cause intrinsic to the knowing agent that makes what is potentially knowable actually so. In Thomas’ view, anything that is understood is understood in virtue of its form. However, the forms of material things, although potentially intelligible, are not actually intelligible insofar as they configure matter, but human beings can understand material things. Therefore, since that which is brought from potency to act is done so only by that which is appropriately actual, we do not know things innately, and we sometimes experience ourselves actually understanding things, there must be a power in human beings that can cause the forms of material objects to become actually intelligible. That power is what Thomas calls the active intellect.

We can round out our discussion of Thomas’ account of the sources of scientia by speaking of the three activities of the powers of the intellect. The first act of the intellect is what Thomists call the act of simple apprehension; this is the intellect’s act of coming to understand the essence of a thing (see, for example, Commentary on Aristotle’s On Interpretation, Proeemium, n. 1). The intellectual act of simple apprehension is simple in the sense that it does not yet imply a judgment on the part of an intellect about the truth or falsity of a proposition. For example, it is by the intellect’s act of simple apprehension that a person cognizes what a thing is, that is, its quiddity, without forming true or false propositions about that quiddity such as, it exists, or it is F rather than not-F.

According to Thomas, the intellect’s simple act of apprehension is the termination of a process that involves not only the activities of intellectual powers but sensory powers, too, both exterior and interior. As we have seen, Thomas thinks that all intellection begins with sensation. Therefore, when we come to understand the essence of a material object, say a bird, the form of the bird is first received spiritually in a material organ, for example, the eye. To say that the form of the bird is received spiritually is simply to say that what is received is received as a form, where the form in question does not exist in the sense organ as it exists extra-mentally. As Stump (2003, p. 253) notes, we might think of this form, as it exists in the sense organ, as encoded information. Thomas calls this immaterial reception of the bird in the eye “the sensible species” of the object cognized. We do not, as of yet, have enough to explain an animal’s conscious awareness of what is sensed. In order for this to occur, Thomas speaks of the need of the sensible species being worked on by the power of phantasia. At that point, the agent has a phantasm of the bird; she is at least conscious of a blue, smallish object with wings. From the phantasm, including experiences of similar phantasms stored in phantasia or the reminiscitive power, the power of active intellect abstracts what Thomas calls the intelligible species from the phantasm(s), that is, leaves to one side those features the agent recognizes are accidental to the object being cognized in order to focus on the quiddity, nature, or essence of what is being cognized. The resulting quiddity is received in the possible intellect. Finally, the intelligible species is transformed into an “inner word” or “concept,” that is, there is conscious awareness of the quiddity of what has been cognized such that the quiddity is recognized as corresponding to a word such as “bird.”

So far we have spoken of the third and first acts of the intellect. The second activity of the intellect is what Thomists call judgment, but Thomas himself typically speaks of the intellect’s composing and dividing (see, for example, Commentary on Aristotle’s On Interpretation, Proeemium, n. 1, and ST Ia. q. 85, a. 5). In this act of the intellect, the intellect compares quiddities and judges whether or not this property or accident should be attributed to this quiddity. For example, Joe comes to know the quiddity of mammality and animality through the first act of intellect and judges (correctly) that all mammals are animals by way of the second act of understanding.

Since scientia for Thomas involves possessing arguments that are logically valid and whose premises are obviously true, one of the sources of scientia for Thomas is the second act of the intellect, composing and dividing, whereby the scientist forms true premises, or propositions, or judgments about reality. Since such judgments have the intellect’s first act of understanding as a prerequisite (one cannot truly judge that all mammals are animals until one apprehends animality and mammality), acts of simple apprehension are also a source of scientific knowledge for Thomas. This brings us back to where we started, with the third act of intellect, namely, ratiocination, the intellect’s ability to derive a logically valid conclusion from some other proposition or propositions. For example, judging that all mammals are animals and all animals are living things, we reason to the conclusion that all mammals are living things. To take a more interesting example, if we judge that all human beings have intellectual souls and all intellectual souls are by nature incorruptible, it follows that any human being has a part that survives the biological death of that human being.

We would be remiss not to mention God as a source of all forms of knowledge for Thomas. For all human intellection involves many instances of change, of going from a state of not-knowing that p to knowing that p, and each and every change, Thomas thinks, requires as part of its sufficient explanation the action of one being that is itself absolutely immutable (see, for example, Thomas’ so-called first way of demonstrating the existence of God at ST Ia. q. 2, a. 3, respondeo). Thomas believes (by faith) that the God of Abraham, Isaac, and Jacob is this one immutable being. Therefore, in Thomas’ view God is the primary uncaused cause of each and every act of human intellection. However, all of this is consistent, Thomas thinks, with human intellects also being real and active secondary causes of their own acts of knowing. Unlike some of his forerunners in philosophical psychology, Thomas thinks that each and every human being has his or her own agent intellect by which he or she can “light up” the phantasms in order to actually understand a thing. (Here we can contrast Thomas’ views with those of St. Augustine of Hippo, Ibn Sina [Avicenna], and Ibn Rushd [Averroes], all of whom think God or some non-human intellect plays the role of agent intellect). Although God’s act of creating and sustaining any intellectual activity is a necessary condition and the primary efficient cause for any human act of coming to know something not previously known, it is neither a sufficient condition nor the sole cause of such activity, Thomas thinks. For a human being, too, is a secondary, efficient cause of his or her coming to know something.

5. Metaphysics

a. On Metaphysics as a Science

In Thomas’ Aristotelian understanding of science, a science S has a subject matter, and a scientist with respect to S knows the basic facts about the subject matter of S, the principles or starting points for thinking about the subject matter of S, the causes of the subject matter of S, and the proper accidents of the subject matter of S. Following Aristotle, Thomas thinks of metaphysics as a science in this sense. For Thomas, the subject matter of the science of metaphysics is being qua being or being in common, that is, being insofar as it can be said of anything that is a being. (Contrast, for example, the narrower subject matters of philosophical physics, which studies physical being insofar as it can be investigated philosophically, and natural theology, which studies immaterial being insofar as it can be studied by the power of natural reason alone.) Thomas also thinks intelligent discussion of the subject matter of metaphysics requires that one recognize that “being is said in many ways,” that is, that there are a number of different but non-arbitrarily related meanings for being, for example, being as substance, quality, quantity, or relation, being qua actual, being qua potential, and so forth. The metaphysician, minimally, can speak intelligently about the proper relationships between these many different but related meanings of “being.”

The principles of being qua being include those principles that are ever and always employed but are never themselves considered carefully in all disciplines, for example, the principle of identity and the principle of non-contradiction. The causes of being qua being are the efficient, formal, and final causes of being qua being, namely, God. Finally, the proper accidents of being qua being are “one,” “good,” “beautiful,” “same,” “whole,” “part,” and so forth. For Thomas, metaphysics involves not only disciplined discussion of the different senses of being but rational discourse about these principles, causes, and proper accidents of being.

Note that Thomas therefore thinks about the subject matter of metaphysics in a manner that differs from that of contemporary analytic philosophers. Contemporary analytic philosophers tend to think about metaphysics as the philosophical discipline that treats a collection of questions about ultimate reality (see, for example, Van Inwagen 2015, p. 3). However, this contemporary understanding of the subject matter of metaphysics is too broad for Thomas since he thinks there are philosophical disciplines distinct from metaphysics that treat matters of ultimate reality. For example, the ultimate causes of being qua movable are treated in philosophical physics or natural philosophy, and the ultimate principles of human being are treated in philosophical anthropology.

b. On What There Is: Metaphysics as the Science of Being qua Being

For Thomas, when we think about the meaning of being wisely, we recognize that we use it analogously and not univocally. Thus, one of the things the metaphysician does, thinks Thomas, is identify, describe, and articulate the relationship between the different senses of being. Let us catalogue some of the ways Thomas uses “being.” These uses are best understood by attending to Thomas’ own examples.

In one place Thomas distinguishes four different senses of being (Disputed Questions on Truth q. 21, a. 4, ad4). Being in the primary sense is substantial being, for example, Socrates, or a particular tree. However, there are also extended senses of being; there is being in the sense of the principles of substances, that is, form and matter, being in the sense of the dispositions or accidents of a substance, for example, a quality of a substance, and being in the sense of a privation of a disposition of a substance, for example, a man’s blindness. Again, although the same word is used to speak of these four realities, the term being does not have precisely the same meaning in these four cases, although all four meanings are related to the primary meaning of being as substance.

Another distinction Thomas makes where being is concerned is the distinction between being in act and being in potency. Being in potency does not actually exist now but is such that it can exist at some point in the future, given the species to which that being in potency belongs. In contrast, being in act exists now. For example, say Socrates is not tan right now but can be tan in the future, given that he is a rational animal, and rational animals are such that they can be tan. Socrates is therefore not tan in act, but rather tan in potency (see, for example, On the Principles of Nature, ch. 1). The distinction between being in act and being in potency is important because it helps solve a puzzle raised by Parmenides, namely, how something can change. If “being” can only refer to what exists in act, then there can be no change. However, if being is said in many ways, not only of what actually is but also what can be in the sense of what can become what it is not, then change can be understood as something intelligible (see, for example, Commentary on Aristotle’s Physics, lec. 6, n. 39). The viability of the distinction between being in act and being in potency can be confirmed by thinking about the way we commonly speak and think. For example, compare a rock and a very young person who is not yet old enough to see. Both of them do not actually see, but not in the same sense. For we rightly negate the ability to see of a rock; it does not actually have the ability to see, nor does it potentially have such an ability, given the sort of thing that it is. However, although a very young human person, like the rock, does not actually have the ability to see, that young person is nonetheless potentially something that sees.

If a being were fully actual, then it would be incapable of change. If a being were purely potential, then it would not, by itself, actually exist. Thus, actually existent beings capable of change are composites of act and potency. The principle of actuality in a composite being explains that the being in question actually exists or actually has certain properties whereas the principle of potentiality in a composite being explains that the being in question either need not exist—it is not in the nature of that thing to exist—or is a thing capable of substantial change such that its matter can become part of some numerically distinct substance.

Where act and potency are concerned, Thomas also distinguishes, with Aristotle, between first and second act on the one hand and active and passive potency on the other. A substance s is in first act or actuality insofar as s, with respect to some power P, actually has P. For example, the newborn Socrates, although actually a human being, only potentially has the power to philosophize and so is not in first act with respect to the power to philosophize. On the other hand, Socrates, when awaiting his trial, and being such that he is quite capable of defending the philosophical way of life, is in first act with respect to the habit of philosophy, that is, he actually has the power to philosophize. A substance s is in second act insofar as, with respect to some power P, s not only actually has P but is currently making use of P. For example, imagine that Socrates is sleeping, say, the night before he makes his famous defense of the philosophical way of life. When he is sleeping, although Socrates is in first act with respect to the power to philosophize, he is not in second act with respect to that power (although he is in potency to the second act of philosophizing). Socrates, when he is actually philosophizing at his trial, is not only in first act with respect to the power to philosophize, but also in second act.

Consider now the difference between active and passive potency. Imagine Socrates is not now philosophizing. He is resting. Nonetheless, he is potentially philosophizing. However, his potency with respect to philosophizing is an active potency, for philosophizing is something one does; it is an activity. Insofar as Socrates is not now philosophizing, but is potentially philosophizing, he has an active potency.

Now imagine Socrates is hit by a tomato at time t at his trial. Socrates can be hit by a tomato at t because he has, among other passive potencies, the ability to be hit by an object. Having the ability to be hit by an object is not an ability (or potentiality) Socrates has to F, but rather an ability (or potentiality) to have F done to him; hence, being able to be hit by an object is a passive potentiality of Socrates.

Where being is concerned, Thomas also distinguishes between beings in nature and intentional beings or beings of reason (see, for example, Commentary on Aristotle’s Metaphysics IV, lec. 4, n. 574). Thomas thinks that nothing can be understood, save insofar as it has being. Natural being is what philosophers (and empirical scientists) study, for example, non-living things, plants, animals, human beings, colors, virtues, and so forth. However, some beings that we think about follow upon the consideration of thinking about beings of nature, notions such as genus, species, and difference. These are the sorts of beings studied in logic, Thomas thinks. In addition to logical beings, we could also mention fictional beings such as Hamlet as examples of beings of reason.

Where the meanings of being are concerned, Thomas also recognizes the distinction between being in the sense of the essentia (essence or nature or form) or quod est (what-it-is) of a thing on the one hand and being in the sense of the esse or actus essendi or quo est (that-by-which-it-is) of a thing on the other hand (see, for example, SCG II, ch. 54). To say that a being B’s essentia differs from its esse is to say that B is composed of essentia and esse, which is just to say that B’s esse is limited or contracted by a finite essentia, which is also to say that B’s esse is participated esse, which itself is to say that B receives its esse from another. If esse and essentia do not differ in a being B1, then B1’s esse is not limited by a finite essentia, B1’s esse is not participated and so uncreated, and B1’s esse is unreceived. For Thomas, only in God are God’s esse and essentia identical.

According to Thomas, all created substances are composed of essentia and esse. The case where there is the clearest need to speak of a composition of essentia and esse is that of the angels. In speaking of act and potency in the angels, Thomas does not speak in terms of form and matter, since for Thomas matter as a principle of potentiality is always associated with an individual thing existing in three dimensions. Thomas’ Franciscan colleague at the University of Paris, St. Bonaventure, did indeed argue that angels were composed of form and spiritual matter. However, Thomas thinks the notion of spiritual matter is a contradiction in terms, for to be material is to be spread out in three dimensions, and the angels are not spread out in three dimensions. Angels are essentially immaterial beings, thinks Thomas. (This is not to say that angels cannot on occasion make use of a body by the power of God; this is how Thomas would make sense of the account of the angel Gabriel talking with the Blessed Virgin Mary in the Gospel according to Luke; whatever Mary saw when she claimed to talk to the angel Gabriel, according to Thomas, it was not a part of Gabriel. Compare the notion that angels are purely immaterial beings that nonetheless make use of bodies as instruments with Plato’s view (at least in the Phaedo) that the human body is not a part of a human being but only an instrument that the soul uses in this life.) However, because angels are not pure act—this description is reserved for the first uncaused efficient cause alone for Thomas—there is need to make sense of the fact that an angel is a composite of act and potency. Thus, Thomas speaks of a composition of essentia (being in the sense of what something is) and esse (being in the sense that a thing is) in the angels, for it does not follow from what an angel is that it exists. 
In other words, where we can distinguish essentia and esse in a thing, that thing is a creature, that is, it exists ever and always because God creates and conserves it in being. Of course, substances composed of form and matter, for example, human beings, non-rational animals, plants, and minerals, are creatures too and so they are also composed of essentia and esse. In general, talk of essentia/esse composition in created substances is Thomas’ way of making sense of the fact that such substances do not necessarily exist but depend for their existence, at every moment that they exist, upon God’s primary causal activity.

6. Natural Theology

a. Some Methodological Considerations

Thomas thinks there are two kinds of truths about God: (a) those truths that can be demonstrated philosophically and (b) those truths that human beings can come to know only by the grace of divine revelation. Although Thomas has much of great interest to say about (b)—see, for example, SCG, book IV, ST Ia. qq. 27-43, and ST IIIa.—this article focuses on (a): those truths that according to Thomas can be established about God by philosophical reasoning.

Thomas thinks there are at least three mutually reinforcing approaches to establishing truths about God philosophically: the way of causation, the way of negation, and the way of perfection (or transcendence). Thomas makes use of each one of these methods, for example, in his treatment of what can be said truly about God by the natural light of reason in ST.

b. The Way of Causation: On Demonstrating the Existence of God

Thomas offers what he takes to be demonstrations of the existence of God in a number of places in his corpus. (On the meaning of the term “demonstration,” see the section on Thomas’ epistemology). His most complete argument is found in SCG, book I, chapter 13. There is also an argument that Brian Davies (1992, p. 31) calls “the existence argument,” which can be found at, for example, ST Ia. q. 65, a. 1, respondeo. The most famous of Thomas’ arguments for the existence of God, however, are the so-called “five ways,” found relatively early in ST.

There are a number of things to keep in mind about the five ways. First, the five ways are not complete arguments; for example, we should expect to find some suppressed premises in these arguments. To see this, we can compare the first way of demonstrating the existence of God in ST Ia. q. 2, a. 3, which is an argument from motion, with Thomas’ complete presentation of the argument from motion in SCG, book I, chapter 13. Whereas the former is offered in one paragraph, the latter is given in 32 paragraphs.

Second, Thomas’ arguments do not try to show that God is the first mover, first efficient cause, and so forth in a temporal sense, but rather in what we might call an ontological sense, that is, in the sense that things other than God depend ultimately upon God causing them to exist at every moment that they exist. Indeed, as we shall see, Thomas does not think that God could be first in a temporal sense because God exists outside of time.

Third, as Thomas makes clear in SCG I, 13, 30, his arguments do not assume or presuppose that there was a first moment in time. As he notes there, given that the universe has a beginning, it is easier to show there is a God: “the most efficacious way to prove that God exists is on the supposition that the world is eternal. Granted this supposition, that God exists is less manifest” (Anton Pegis, trans.). Nor do the five ways attempt to prove that there was a first moment of time. Although Thomas believes there was a first moment of time, he is very clear that he thinks such a thing cannot be demonstrated philosophically; he thinks that the temporal beginning of the universe is a mystery of the faith (see, for example, ST Ia. q. 46, a. 2). Thus, if we should assume anything, for the sake of argument, about time or the duration of the world where Thomas’ arguments for the existence of God are concerned, we should assume that there is no first moment of time, that is, that the universe has always existed. Interestingly, even on such a supposition, Thomas thinks he can demonstrate philosophically that there is a God.

Fourth, as will be seen, the five ways are simply five ways of beginning to demonstrate God’s existence. For example, in ST the demonstrations of God’s existence continue beyond Ia. q. 2, a. 3, as Thomas attempts to show that a first mover, first efficient cause, first necessary being, first being, and first intelligence is also ontologically simple (q. 3), perfect (q. 4), good (qq. 5-6), infinite (q. 7), ontologically separate from finite being (q. 8), immutable (q. 9), eternal (q. 10), one (q. 11), knowable by us to some extent (q. 12), nameable by us (q. 13), knowledgeable (q. 14), such that there are ideas in that being’s mind (q. 15), such that life is properly attributed to that being (q. 18), such that will is properly attributed to that being (q. 19), and such that love is properly attributed to that being (q. 20). However, as Thomas says at the end of each of the five ways, such a being is what everyone calls “God.”

For our purposes, let us focus on one of Thomas’ five ways (ST Ia. q. 2, a. 3), the second way. Here is Thomas’ text (note that numbers have been inserted in the following text, corresponding to premises in the detailed formulation of the second way that follows):

The second way is from the nature of the efficient cause. [(1)] In the world of sense we find there is an order of efficient causes. [(3)] There is no case known (neither is it, indeed, possible) in which a thing is found to be the efficient cause of itself; for so it would be prior to itself, which is impossible. Now [(12)] in efficient causes it is not possible to go on to infinity, because [(6)] in all efficient causes following in order, the first is the cause of the intermediate cause, and the intermediate is the cause of the ultimate cause, whether the intermediate cause be several, or only one. Now [(7)] to take away the cause is to take away the effect. Therefore, [(8)] if there be no first cause among efficient causes, there will be no ultimate, nor any intermediate cause. But [(9)] if in efficient causes it is possible to go on to infinity, there will be no first efficient cause, [(10)] neither will there be an ultimate effect, nor any intermediate efficient causes; [(11)] all of which is plainly false. Therefore, [(13)] it is necessary to admit a first efficient cause, [(14)] to which everyone gives the name of God (Fathers of the English Dominican Province, trans.).

This argument might be formulated as follows:

  1. In the world that can be perceived by the senses, there is an order of efficient causes, for example, there is something E that is an effect of an efficient cause or causes at a time t, for example, there is an animal whose existence at t is an effect of a number of efficient causes, for example, the warmth of the earth’s atmosphere at t, there being oxygen in the atmosphere for the animal to breathe at t, and the proper functioning of biological systems within the animal at t, and so forth, and some of those efficient causes of E are themselves effects of other efficient causes at t, for example, the warmth of the earth’s atmosphere at t is an effect of the sun’s warming the atmosphere of the earth at t and the proper functioning of biological systems within the animal at t is an effect of the action of certain bio-chemicals within those biological systems at t, and so forth [assumption].
  2. If there is an order of efficient causes, for example, there is some effect E that has x as an efficient cause at t, and x itself has y as an efficient cause at t, and y itself has z as an efficient cause at t, and so forth, then (a) there is an order of efficient causes of E at t that is infinite, (b) there exists something (E or a cause of E) that is the efficient cause of itself at t, or (c) there is an absolutely first efficient cause of E’s existence at t, that is, E’s existence has an efficient cause at t where that efficient cause does not itself have an efficient cause [assumption].
  3. Nothing can be the efficient cause of itself, all by itself, otherwise it would be metaphysically prior to itself, which is impossible [assumption].
  4. Therefore, if there is an order of efficient causes, for example, there is some effect E that has x as an efficient cause of its existence at t, and x itself has y as an efficient cause at t, and so forth, then (a) there is an order of efficient causes of E at t that is infinite or (c) there is an absolutely first efficient cause of E’s existence at t [from (2) and (3), conditional introduction].
  5. (a) There is an order of efficient causes of E at t that is infinite or (c) there is an absolutely first efficient cause of E’s existence at t [from (1) and (4), MP].
  6. In an order of efficient causes such that a is an efficient cause of b and b is an efficient cause of an effect c, a is a first cause of b and c and b is an intermediate cause of the effect c [assumption].
  7. To take away the cause is to take away the effect [assumption].
  8. Therefore, if it is not the case that there is an absolutely first efficient cause of an effect E’s existence at t, then there are no intermediate causes and so no effect E at t [from (6) and (7)].
  9. If there is an order of efficient causes of E at t that is infinite, then it is not the case that there is an absolutely first efficient cause of E’s existence at t [assumption].
  10. Therefore, if there is an order of efficient causes of E at t that is infinite, then there are no intermediate causes and no effect E [from (8) and (9), HS].
  11. It is not the case that there are no intermediate causes and no effect E [from (1)].
  12. Therefore, it is not the case that there is an order of efficient causes of E at t that is infinite [from (10) and (11), MT].
  13. Therefore, there is an absolutely first efficient cause of E’s existence at t [from (5) and (12), DS].
  14. An absolutely first efficient cause of E’s existence at t is what everyone calls “God” [assumption].
  15. Therefore, there is a God [from (13) and (14)].
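
The propositional skeleton of the formulation above can be checked mechanically. What follows is only a sketch in Lean 4, not a rendering of Thomas’ own text: the proposition names (Order, Infinite, SelfCaused, First, NoEffect, God) are labels introduced here for illustration, the quantification over E and t is suppressed, and step (8) is simply taken as a hypothesis rather than derived from premises (6) and (7), since that inference depends on the interpretation of premise (7) and is not purely propositional.

```lean
-- Hypothetical propositional abstraction of the second way.
-- Each hypothesis is tagged with the step of the formulation it stands for.
theorem second_way
    (Order Infinite SelfCaused First NoEffect God : Prop)
    (h1  : Order)                                  -- premise (1)
    (h2  : Order → Infinite ∨ SelfCaused ∨ First)  -- premise (2)
    (h3  : ¬SelfCaused)                            -- premise (3)
    (h8  : ¬First → NoEffect)                      -- step (8), assumed here
    (h9  : Infinite → ¬First)                      -- premise (9)
    (h11 : ¬NoEffect)                              -- step (11)
    (h14 : First → God)                            -- premise (14)
    : God :=
  -- steps (10) and (12): an infinite order would leave no effect,
  -- but there is an effect, so the order is not infinite
  have h12 : ¬Infinite := fun hInf => h11 (h8 (h9 hInf))
  -- steps (4), (5), and (13): rule out the infinite and self-caused disjuncts
  have h13 : First :=
    (h2 h1).elim
      (fun hInf => absurd hInf h12)
      (fun hRest => hRest.elim (fun hSelf => absurd hSelf h3) (fun hFirst => hFirst))
  -- step (15)
  h14 h13
```

That the skeleton type-checks shows only that the argument is valid in this simplified form; whether it is sound depends on the truth of the premises, which is what the discussion of premises (2), (3), (7), (8), and (14) addresses.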

The second premise, third premise, seventh premise, the inference to the eighth premise, and the fourteenth premise likely require further explanation. As for premise (2), we should note that Thomas assumes the truth of a principle often called the principle of causality. The principle of causality states that every effect has a cause. The principle of causality is a piece of common sense that arguably also plays a pivotal role in all scientific inquiry. If, for example, Susan was eating Wheaties for breakfast and suddenly a blueberry appeared on the top of her cereal, it would be reasonable for Susan to ask, “What caused the blueberry to be there?” We would not accept the following answer as a legitimate response to that question: “Nothing caused it to be there.” Of course, we might not be able to find out precisely what caused the blueberry to be there. However, we should not therefore conclude that the blueberry’s coming to be on the top of Susan’s cereal bowl does not have a cause. The principle of causality is also being invoked when scientists ask a question such as, “What causes plants to grow?” A scientist assumes the principle of causality when he or she assumes there is an answer to this question that involves causes. Of course, when it comes to our understanding of the nature of ultimate causes, it may be that we run into certain limits to human understanding. This is something Thomas admits, as will be seen below. However, we get premise two of the formulation of Thomas’ second way by applying the principle of causality to the case of the existence of some effect. Given the importance of the principle of causality in everyday life and scientific work, to deny the principle of causality in the context of doing metaphysics would seem to be ad hoc (see Feser 2009, p. 51ff. for more discussion of this point).

Premise (3) is a metaphysical principle. Consider a scenario that would constitute a denial of premise (3): there is an x such that, absolutely speaking, x causes itself to exist. However, this is not possible. Although x can be the efficient cause of itself in one respect, for example, an organism is an efficient cause of its own continued existence insofar as it nourishes itself, it cannot be the efficient cause of itself in every respect. This is easiest to see in the case of something bringing itself into existence. In order for x to perform the act of bringing x into existence at time t, x must already exist at t in order to perform such an act. However, if x already exists at t to perform the act of bringing x into existence at t, then x does not bring itself into existence at t, for x already exists at t. However, the same kind of reasoning works if x is a timelessly eternal being. To say that x is timelessly the efficient cause of its own existence is to offer an explanatory circle as an efficient causal explanation for x’s existence, which for Thomas is not to offer a good explanation of x’s existence, since circular arguments or explanations are not good arguments or explanations.

Premise (7) shows that Thomas is not in this argument offering an ultimate efficient causal explanation of what is sometimes called a per accidens series of efficient causes, that is, a series of efficient causes that stretches (perhaps infinitely) backward in time, for example, Rex the dog was efficiently caused by Lassie the dog, and Lassie the dog was efficiently caused by Fido the dog, and so forth. If he did have such a per accidens causal series in mind, then premise (7) would be subject to obvious counter-examples, for example, a sculptor is the efficient cause of a sculpture. However, it routinely happens that a sculpture outlives its sculptor. In such a case, we can take away the efficient cause (the sculptor) without taking away the effect of its efficient causation (the sculpture). Unless we are comfortable assigning to Thomas a view that is obviously mistaken, we will look for a different interpretation of premise (7).

A typical and more charitable interpretation of premise (7) is that Thomas is talking here about concurrent efficient causes and their effects, for example, in a case where a singer’s song exists only as long as the singer sings that song. This interpretation of premise (7) fits well with what we saw Thomas say about the arguments for the existence of God in SCG, namely, that it is better to assume (at least for the sake of argument) that there is no beginning to time when arguing for the existence of God, for, in that case, it is harder to prove that God exists.

With such an interpretation of premise (7) in the background, we are in a position to make sense of the inference from premises (6) and (7) to premise (8). If there were no absolutely first cause in the order of efficient causes of any effect E, then there would be nothing that ultimately existentially “holds up” E, since none of the supposed intermediate causes of E would themselves exist without an efficient cause that is not itself an effect of some efficient cause.

Finally, premise (14) simply records the intuition that if there is an x that is an uncaused cause, then there is a God. Of course, Thomas does not think he has proved here the existence of the Triune God of Christianity (something, in any case, he does not think it possible to demonstrate). Rather, Thomas believes by faith that the absolutely first efficient cause is the Triune God of Christianity. However, to show philosophically that there is a first uncaused efficient cause is enough to show that atheism is false. To put this point another way, Thomas thinks Jews, Muslims, Christians, and pagans such as Aristotle can agree upon the truth of premise (14). As will be seen, Thomas thinks it possible, upon reflection, to draw out interesting implications about the nature of an absolutely first efficient cause from a few additional plausible metaphysical principles. The more inferences Thomas draws out regarding the nature of the absolutely first efficient cause, the easier it will be to say with him (whether or not we think his arguments sound), “But this is what people call ‘God’.”

c. The Way of Negation: What God is Not

As we saw in discussing his philosophical psychology, Thomas thinks that when human beings come to know what a material object is, for example, a donkey, they do so by way of an intelligible species of the donkey, which intelligible species is abstracted from a phantasm by a person’s agent intellect, where the phantasm itself is produced from a sensible species that human beings receive through sense faculties that cognize the object of perception. Thomas thinks I can know what a thing is, for example, a donkey, since the form of a donkey and my intelligible species of a donkey are identical in species (see, for example, SCG III, ch. 49, 5). However, in Thomas’ view, we cannot possess an idea of the first cause, that is, God, in this life that is isomorphic with God’s essence, for he thinks any likeness of God that we have in our minds in this life is derived from what we know of material objects, and such a likeness is not the same in species as the form or essence of God Himself (for reasons that will become clear in what follows). Therefore, we cannot naturally know what God is. (Thomas thinks this is true even of the person who is graced by the theological virtues of faith, hope, and charity in this life; knowing the essence of God is possible for human beings, Thomas thinks, but it is reserved for the blessed in heaven, the intellects of whom have been given a special grace called the light of glory [see, for example, ST Ia. q. 12, a. 11, respondeo].) Although we cannot know what God is in this life, by deducing propositions from the conclusions of the arguments for the existence of God, Thomas thinks we can, by natural reason, come to know what God is not. For our purposes, let us focus on three pieces of negative theology in Thomas’ natural theology: that God is not composed of parts; that God is not changeable; that God does not exist in time.

i. God is Not Composed of Parts

To say that God is not composed of parts is to say that God is metaphysically simple (see, for example, ST Ia. q. 3), for whatever has parts has a cause of its existence, that is, is the sort of thing that is put together or caused to exist by something else. Since nothing can cause itself to exist all by itself, whatever is composed of parts has its existence caused by another. However, God, the first uncaused cause, does not have God’s existence caused by another. Therefore, God does not have parts.

As Thomas notes, the denial that God the Creator has parts shows how much God is unlike those things God creates, for all the things with which we are most familiar are composed of parts of various kinds. However, there are a number of ways in which something might be composed of parts. The most obvious sense is being composed of quantitative parts, for example, there is the top inch of me, the rest of me, and so forth. Since God is not composed of parts, God is not composed of quantitative parts.

Thomas thinks that material objects, at any given time, are also composed of a substance and various accidental forms. The substance of an object explains why that object remains numerically one and the same through time and change. For example, Thomas would say that a human being, say, Sarah, is numerically the same yesterday and today because she is numerically the same substance today as she was yesterday. However, Sarah is not absolutely the same today compared to yesterday, for today she is cheerful, whereas yesterday she was glum. Thomas calls such characteristics—forms a substance can gain or lose while remaining numerically the same substance—accidental forms or accidents. At any given time, Sarah is a composite of her substance and some set of accidental forms. Now, we have shown that God is not composed of parts. Therefore, God also is not a composite of substance and accidental forms.

ii. God is Not Changeable

God’s not being composed of substance and accidental forms shows that God does not change, for if a being changes, it has a feature at one time that it does not possess at another. However, features that a being has at one time that it does not have at another are accidental forms. Thus, beings that change are composed of substance and accidental forms. However, God is not composed of substance and accidents. Therefore, God does not change (see, for example, ST Ia. q. 9).

Indeed, the fact that God is not composed of parts shows that God is not only unchanging, but also immutable (unchangeable), for if God can change, then God has properties or features that he can gain or lose without going out of existence. However, properties or features that a being can gain or lose without going out of existence are accidental forms. Therefore, if God can change, then God is composed of substance and accidental forms. However, God is not composed of parts, including the metaphysical parts that we call substance and accidental forms. Therefore, God cannot change, that is, God is immutable.

iii. God is Not in Time

Thomas contends that God does not exist in time (see, for example, ST Ia. q. 10). To see why he thinks so, consider what he thinks time is: a measurement of change with respect to before and after. (Thomas thinks time is neither a wholly mind-independent reality—hence it is a measurement—nor is it a purely subjective reality—it exists only if there are substances that change.) Therefore, if something does not change, it is not measured by time, that is, it does not exist in time. However, as has been seen, God is unchanging. Therefore, God does not exist in time.
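
The three negative conclusions of this section form a short chain of inferences, which can be laid out as a sketch in Lean 4. The proposition names (HasParts, Caused, Changes, SubstAcc, InTime) are labels introduced here for illustration and are to be read as claims about the first cause. The sketch assumes the premises as stated in the text and ignores the modal distinction between being unchanging and being immutable.

```lean
-- Hypothetical propositional abstraction of the way of negation.
theorem way_of_negation
    (HasParts Caused Changes SubstAcc InTime : Prop)
    (hParts    : HasParts → Caused)    -- whatever has parts has a cause of its existence
    (hUncaused : ¬Caused)              -- the first cause is uncaused
    (hAcc      : SubstAcc → HasParts)  -- substance and accidents are metaphysical parts
    (hChange   : Changes → SubstAcc)   -- what changes is composed of substance and accidents
    (hTime     : ¬Changes → ¬InTime)   -- what does not change is not measured by time
    : ¬HasParts ∧ ¬Changes ∧ ¬InTime :=
  -- (i) not composed of parts
  have noParts : ¬HasParts := fun h => hUncaused (hParts h)
  -- (ii) not changing, since change would require composition
  have noChange : ¬Changes := fun h => noParts (hAcc (hChange h))
  -- (iii) not in time, since time measures change
  ⟨noParts, noChange, hTime noChange⟩
```

Laid out this way, it is clear that each later conclusion rests on the earlier one, so that all three ultimately depend on the claim that the first cause is uncaused.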

d. The Way of Excellence: Naming God in and of Himself

Thomas thinks that we can not only know that God exists and what God is not by way of philosophy, but we can also know—insofar as we know God is the first efficient cause of creatures, exemplar formal cause of creatures, and final cause of creatures—that it is reasonable and meaningful to predicate of God certain positive perfections such as being, goodness, power, knowledge, life, will, and love. Nonetheless, in knowing that, for example, God is good is a correct and meaningful thing to say, we still do not know the essence of God, Thomas thinks, and so we do not know what God is good means with the clarity by which we know things such as triangles have three sides, mammals are animals, or this tree is flowering right now. Why this is the case will become clear in what follows.

In Thomas’ view, words are signs of concepts and concepts are likenesses of things. (For Thomas, concepts are not [usually] the objects of understanding; they are rather that by which we understand things [see, for example, ST Ia. q. 85, a. 2], like a window in a house is that by which we see what is outside the house.) Therefore, words relate to things through the medium of intellectual conception. We can therefore meaningfully name a thing insofar as we can intellectually conceive it. Although we cannot know the essence of God in this life, we can know that God exists as the absolutely first efficient cause of creatures, we can know what God is not, and, insofar as we know God as the absolutely first efficient cause of creatures and what God is not, we can know God by way of excellence. It is this last way of knowing God that allows us to meaningfully predicate positive perfections of God, thinks Thomas. Knowing God by way of excellence requires some explanation.

First, whatever perfection P exists in an effect must in some way exist in its cause or causes, otherwise P would come from absolutely nothing, and ex nihilo nihil fit (from nothing, nothing comes). (Note that the traditional theological doctrine of creation ex nihilo, which Thomas accepts, does not contradict the Greek axiom, ex nihilo nihil fit. Whereas the latter means that nothing can come from absolutely nothing, the former does not mean that creatures come from absolutely nothing. Rather, creation ex nihilo is shorthand for the view that creatures do not have a first material cause; according to the traditional doctrine of creation ex nihilo, creatures do, of course, have a first efficient, exemplar formal, and extrinsic final cause, that is, God.) Some perfections are pure and others are impure. A pure perfection is a perfection the possession of which does not imply an imperfection on the part of the one to which it is attributed; an impure perfection is a perfection that does imply an imperfection in its possessor. For example, being able to hit a home run is an impure perfection: it is a perfection, but it implies imperfection on the part of the one who possesses it. Something that can hit a home run is not an absolutely perfect being, since being able to hit a home run entails being mutable, and an absolutely perfect being is not mutable, since a mutable being has a cause of its existence.

Second, creatures possess perfections such as justice, wisdom, goodness, mercy, power, and love. However, justice, wisdom, goodness, mercy, power, and love are pure perfections.

Third, God is the absolutely first efficient cause, which cause is simple, immutable, and timeless. Therefore, whatever pure perfections exist in creatures must pre-exist in God in a more eminent way (ST Ia. q. 4, a. 2, respondeo). Therefore, we can apply positive predicates to God, for example, just, wise, good, merciful, powerful, and loving, although not in such a way that defines the essence of God and not in a manner that we can totally understand in this life (ST Ia. q. 13, a. 1).

Not only can we meaningfully apply positive predicates to God, some such predicates can be applied to God substantially, Thomas thinks (see, for example, ST Ia. q. 13, a. 2, respondeo). One applies a name substantially to x if that name refers to x in and of itself and not merely because of a relation that things other than x bear to x. For example, the terms “Creator” and “Lord” are not said substantially of God, Thomas thinks, since such locutions imply a relation between creatures and God, and, for Thomas, it is not necessary that God bring about creatures (God need not have created and so need not have been a Creator, a Lord, and so forth). Although we come to know God’s perfection, goodness, and wisdom through reflecting upon the existence of creatures, Thomas thinks we can know that predicates such as perfect, good, and wise apply to God substantially and do not simply denote a relation between God and creatures since, as we saw above, God is the absolutely first efficient cause of the perfection, goodness, and wisdom in creatures, and there cannot be more in the effect than in the cause.

However, given the radical metaphysical differences between God and creatures, what is the real significance of substantially applying words such as good, wise, and powerful to God? Thomas knows of some philosophers, for example, Moses Maimonides (1138-1204), who take positive predications with respect to God to be meaningful only insofar as they are interpreted simply as statements of negative theology. For example, on Thomas’ reading, Maimonides thinks “God is good” should be understood simply as “God is not evil.” Thomas notes that other theologians take statements such as “God is good” to simply mean “God is the first efficient cause of creaturely goodness.” Thomas thinks there are a number of problems with these reductive theories of God-talk, but one problem that both of them share, he thinks, is that neither of them does justice to the intentions of people when they speak about God. Thomas states, “For in saying that God lives, [people who speak about God] assuredly mean more than to say that He is the cause of our life, or that He differs from inanimate bodies” (ST Ia. q. 13, a. 2, respondeo; English Dominican Fathers, trans.). According to Thomas, positive predicates such as God is good “are predicated substantially of God, although they fall short of a full representation of Him. . . So when we say, God is good, the meaning is not God is the cause of goodness, or, God is not evil, but the meaning is, Whatever good we attribute to creatures, pre-exists in God, and in a more excellent and higher way” (ST Ia. q. 13, a. 2, respondeo; English Dominican Fathers, trans.). Although it is correct to say that goodness applies to God substantially and that God is good “in a more excellent and higher way” than the way in which we attribute goodness to creatures, given that we do not know the essence of God in this life, we do not comprehend the precise meaning of “good” as applied substantially to God.

As has been seen, Thomas thinks that even within the created order, terms such as “being” and “goodness” are “said in many ways” or used analogously. Thus, we should not be surprised that Thomas thinks that a proper use of positive predications when it comes to God, for example, in the phrase, “God is wise,” involves predicating the term wise of God and human beings analogously and not univocally or equivocally (ST Ia. q. 13, a. 5). Why can we not properly predicate the term wise of God and human beings univocally? When we attribute perfections to creatures, the perfection in question is not to be identified with the creature to which we are attributing it. For example, when we say, John is wise, we do not mean to imply John is wisdom. However, given the divine simplicity, the perfections of God are to be identified with God’s very existence so that when we say God is wise, we should also say God is wisdom itself. In fact it is important to say both God is wise and God is wisdom itself when speaking of the wisdom of God, Thomas thinks. For if we say only the latter, then we may fall into the trap of thinking that God is an abstract entity such as a number (which is false, as the ways of causality, negation, and excellence imply). If we say only the former, we run the risk of thinking about God’s wisdom as though it were like our own, namely, imperfect, acquired, and so forth (which the ways of causality, negation, and excellence also show is false). Thus, when we use the word wise of John and God, we are not speaking univocally, that is, with precisely the same meaning in each instance.

On the other hand, if we merely equivocate on wise when we speak of John and God, then it would not be possible to know anything about God, which, as Thomas points out, is against the views of both Aristotle and the Apostle Paul, that is, both reason and faith. Rather, Thomas thinks we predicate wise of God and creatures in a manner between these two extremes; the term wise is not completely different in meaning when predicated of God and creatures, and this is enough for us to say we know something about the wisdom of God. Although we do name God from creatures, we know God’s manner of being wise super-exceeds the manner in which creatures are wise. It is correct to say, for example, God is wise, but because it is also correct to say God is wisdom itself, the wisdom of God is greater than human wisdom; in fact, it is greater than human beings can grasp in this life. That being said, we can grasp why it is that God’s wisdom is greater than we can grasp in this life, namely, because God is the simple, immutable, and timelessly eternal uncaused cause of creaturely perfections, including creaturely wisdom, and that is to know something very significant about God, Thomas thinks.

7. Philosophical Anthropology: The Nature of Human Beings

Thomas attributes to Plato of Athens the following view:

(P) A human being, for example, Socrates, is identical to his soul, that is, an immaterial substance; the body of Socrates is no part of him.

Thomas thinks (P) is false. In fact, in his view there are good reasons to think a human being is not identical to his or her soul. To take just one of his arguments, Thomas thinks the Platonic view of human beings does not do justice to our experience of ourselves as bodily beings. For Thomas, Plato is right that we human beings do things that do not require a material organ, namely, understanding and willing (for his arguments that acts of understanding do not make use of a material organ per se, see, for example, ST Ia. q. 75, aa. 2, 5, and 6). However, anything that sees, hears, touches, tastes, and smells is clearly also a bodily substance. We experience ourselves as something that sees, hears, touches, tastes, and smells. In short, I smell things, therefore, I am not an immaterial substance (see, for example, ST Ia. q. 76, a. 1, respondeo).

Although Thomas does not agree with Plato that we are identical to immaterial substances, it would be a mistake—or at least potentially misleading—to describe Thomas as a materialist. Like Aristotle, Thomas rejects the atomistic materialism of Democritus. In other words, Thomas would also reject the following view:

(M) Human beings are composed merely of matter.

For Thomas, (M) is false since human beings, like all material substances, are composed of prime matter and substantial form, and forms are immaterial. In fact, even non-living things such as instances of water and bronze are composed of matter and form for Thomas, since matter without form has no actual existence.

However, Thomas thinks (M) is false in the case of human beings for another reason: the substantial form of a human being—what he calls an intellect or intellectual soul—is a kind of substantial form specially created by God, one that for a time continues to exist without being united to matter after the death of the human being whose substantial form it is. To make some sense of Thomas’ views here, note that Thomas thinks a kind of substantial form is the more perfect insofar as the features, powers, and operations it confers on a substance are, to use a contemporary idiom, “emergent,” that is, features of a substance that cannot be said to belong to any of the integral parts of the substance that is configured by that substantial form, whether those integral parts are considered one at a time or as a mere collection. Here is Thomas:

It must be considered that the more noble a form is, the more it rises above (dominatur) corporeal matter, the less it is merged in matter, and the more it exceeds matter by its operation or power. Hence, we see that the form of a mixed body has a certain operation that is not caused by [its] elemental qualities (ST Ia. q. 76, a. 1, respondeo; English Dominican Fathers, trans.).

In other words, a substance’s substantial form is something above and beyond the properties of that substance’s integral parts. Why think a thing like that? Substances have powers and operations that are not identical to any of the powers and operations of that substance’s integral parts taken individually, nor are the powers conferred by a substantial form of a substance x identical to a mere summation of the powers of the integral parts of x. Thus, a mixed body such as a piece of bronze has certain powers that none of its elemental parts have by themselves nor when those elemental parts are considered as a mere sum.

Consider that Thomas thinks substantial forms fall into the following sort of hierarchy of perfection. The least perfect kind of substantial form corresponds with the least perfect kind of material substance, namely, the elements (for Thomas, elemental substances are individual instances of the kinds water, air, earth, and fire; for us they might be fundamental particles such as quarks and electrons). Thomas says that the substantial forms of the elements are wholly immersed in matter, since the only features that elements have are those that are most basic to matter. In contrast, the substantial forms of compounds, that is, instances of those non-living substance-kinds composed of different kinds of elements, for example, blood, bone, and bronze, have operations that are not caused by their elemental parts. Above the substantial forms of compounds, the substantial forms of living things, including plants, reach a level of perfection such that they get a new name: “soul” (see, for example: Disputed Question on the Soul [QDA] a. 1; ST Ia. q. 75, a. 1; and ST Ia. q. 76, a. 1). For those of the 21st century, soul almost always means “immortal substance.” Thomas rather uses soul (anima) in Aristotle’s deflationary sense of “a substantial form which is the explanation for why a substance is alive rather than dead.” To see this, consider the English word “animate.” Soul (anima), for Thomas, is the principle or explanation for life or animation in a living substance. Souls are therefore substantial forms that enable plants and animals to do what all living things do: move, nourish, and reproduce themselves, things non-living substances cannot do. Next in line come the souls or substantial forms of non-human animals, which have emergent properties to an even greater degree than the souls of plants, since in virtue of these substantial forms non-human animals not only live, move, nourish themselves, and reproduce, but also sense the world. Finally, the substantial forms of human beings have operations (namely, understanding and willing) that do not require bodily organs at all in order to operate, although such operations are designed to work in tandem with bodily organs (see, for example, SCG II, ch. 68). Since human souls do not require matter for their characteristic operations, given the principle that something’s activity is a reflection of its mode of existence (for example, if something acts as a material thing, it must be a material thing; if something acts as an immaterial thing, it must be an immaterial thing), human souls can exist apart from matter, for example, after biological death. In contrast, the substantial forms of non-human material substances are immersed in matter such that they go out of existence whenever they are separated from it (see, for example, ST Ia. q. 75, a. 3).

Since the human soul is able to exist apart from the matter it configures, the soul is a subsistent thing for Thomas, not simply a principle of being as are material substantial forms (see, for example: QDA a. 1; QDA a. 14; and ST Ia. q. 75, a. 2). However, even when it is separated from matter, a human soul remains the substantial form of a human being. As Thomas states (see, for example, ST Ia. q. 75, a. 4), a human being such as Socrates is not identical to his soul (for human beings are individual members of the species rational animal). Nonetheless, the individual soul can preserve the being and identity of the human being whose soul it is. In other words, although the soul is not identical to the human person, a human person can be composed of his or her soul alone. Thomas explains the point as follows: God creates the human soul such that it shares its existence with matter when a human being comes to exist (see, for example, SCG II, ch. 68, 3). Because the being of the human soul is numerically the same as that of the composite—again, the soul shares its being with the matter it configures whenever the soul configures matter—when the soul exists apart from matter between death and the general resurrection, the being of the composite is preserved insofar as the soul remains in existence (see, for example: SCG IV, ch. 81, 11; ST Ia. q. 76, a. 1, ad5; and ST IaIIae. q. 4, a. 5, ad2).

Consider an analogy: say Ted loses his arms and legs in a traffic accident but survives the accident. After the accident, Ted is not identical to the parts that compose him. Otherwise, we would have to say, by the law of the transitivity of identity, that Ted’s arms and legs (or the simples that composed them) were not parts of Ted before the accident. Composition is not identity. Something analogous can be said about Thomas’ views on the human soul and the human person. Although the human soul is never identical to the human person for Thomas, it is the case that after death and before the general resurrection, some human persons are composed merely of their soul.

Although the human soul can exist apart from matter between death and the general resurrection, existing separately from matter is unnatural for the human soul. The human soul, by its very nature, is a substantial form of a material substance (see, for example, SCG II, chs. 68 and 83). Given Thomas’ belief in a good and loving God, he thinks such a state can only be temporary (see, for example, SCG IV, ch. 79). Indeed, as a Catholic Christian, Thomas believes by faith that it will be only temporary, since the Catholic faith teaches there will one day be a general resurrection of the dead in which all human beings rise from the dead, that is, all intellectual souls will reconfigure matter. At that time not only will all separated souls configure matter again, by a miracle the separated soul of each human being will come to configure matter such that each human being will have numerically the same human body that he or she did in this life (see, for example: ST Suppl. q. 79, a. 1; and SCG IV, chs. 80 and 81). Human beings will then be restored to their natural state as embodied beings that know, will, and love.

Finally, since human souls are immaterial, subsistent entities, they cannot have their origin in matter (see, for example, SCG II, ch. 86). Thus, unlike material substantial forms, human souls only come to exist by way of a special act of creation on the part of God (see, for example, SCG II, ch. 87). Therefore, for Thomas, the beginning of the existence of every human person is both natural (insofar as the human parents of that person supply the matter of the person) and supernatural (insofar as God creates a person’s substantial form or intellectual soul ex nihilo).

8. Ethics

Thomas has one of the most well-developed and capacious ethical systems of any Western philosopher, drawing as he does on Jewish, Christian, Greek, and Roman sources, and treating topics such as axiology, action-theory, the passions, virtue theory, normative ethics, applied ethics, law, and grace. His ST alone devotes some 1,000 pages in English translation to ethical issues. Where many philosophers have been content to treat topics in meta-ethics and ethical theory, Thomas also devotes the largest part of his efforts in ST, for example, to articulate the nature and relations between the particular virtues and vices. In this summary of his ethical thought, we treat, only in very general terms, what Thomas has to say about the ultimate end of human life, the means for achieving the ultimate end, the human virtues as perfections of the characteristic human powers, the logical relationship between the virtues, moral knowledge, and the ultimate and proximate standards for moral truth.

a. The End or Goal of Human Life: Happiness

Thomas argues that in order to make sense of any genuine action in the universe we must distinguish its end or goal from the various means that a being employs in order to achieve such an end, for if a being does not act for an end, then that being’s acting in this or that way would be a matter of chance. In that case there would be no reason why the being acted as it did. In other words, the act would be unintelligible. However, for any act A in the universe, A is intelligible. Therefore, every being acts for an end (see, for example, SCG III, ch. 2). An end of an action is something (call it x) such that a being is inclined to x for its own sake and not simply as a means to achieving something other than x. A means to an end refers to something (call it y) such that a being is inclined to y for the sake of something other than y. However, some ends are what Thomas calls “ultimate.” An ultimate end is an end of action such that a being is inclined to it merely for its own sake, not also as a means to some further end.

Thomas thinks we can apply this general theory of action to human action. For example, although wealth might be treated as an end by a person relative to the means that a person employs to achieve it, for example, working, Thomas thinks it is obvious that wealth is not an ultimate end, and even more clearly, wealth is not the ultimate end. This distinction between an ultimate end and the ultimate end is important and does not go unnoticed by Thomas. He is willing to take seriously the possibility that human life might have several ultimate ends (see, for example, ST IaIIae. q. 1, a. 5). For example, we might think that knowledge, virtue, and pleasure are each ultimate ends of human life, that is, things we desire for their own sake and not also as means to some further end. However, Thomas thinks it is clear that a human being really has only one ultimate end. This is because the ultimate end, as Thomas understands the term, is more than simply something we seek merely for its own sake; it is something such that all by itself it entirely satisfies one’s desire. Say that John desires pleasure and virtue as ends in themselves, and pleasure and virtue do not necessarily come and go together in this life (some things that are pleasant are not compatible with a life of virtue; sometimes the virtuous life entails doing what is unpleasant). Thus, neither of these could be equivalent to the ultimate end for John; for if John had one without the other, there would still be something that John desires, whereas possession of the ultimate end sates all of one’s desires. In that case, if pleasure and virtue are both ends in themselves, then at most they must be component parts of an ultimate end construed as a complex whole.

Thus, for Thomas, each and every human being (like all beings) has one ultimate end. However, do all human beings have the same ultimate end? Thomas thinks so, and he believes that, in one sense, this should not be controversial. All human beings think of happiness as the ultimate end of human beings. Of course, Thomas recognizes that to speak about the ultimate end as “happiness” is still to speak about the ultimate end in very abstract terms, or, as Thomas puts it, to speak merely of the “notion of the ultimate end” (rationem ultimi finis) (ST IaIIae. q. 1, a. 7). Four people might agree that their goal in life is to be happy but disagree with one another (greatly) about that in which a happy life consists. For Thomas, this claim is not the same as the claim that human beings choose different means to achieving happiness. Although this is undoubtedly true, what Thomas means to say here is that people disagree about the nature of the happy life itself, for example, some think the ultimate end itself is the acquisition of wealth, others enjoying certain pleasures, whereas others think the happy life is equivalent to a life of virtuous activity. To see Thomas’ point, compare John and Jane, both of whom plan to rob a bank. John (unthinkingly) takes the acquisition of a great sum of wealth to be his ultimate end. Jane realizes that wealth is really merely an instrumental good and has already planned to retire to a vacation resort, which she (still shortsightedly) takes to be the object of human happiness.

Although people certainly disagree about what happiness is in the concrete, Thomas maintains that there are objective truths about the nature of happiness. (It is important to emphasize here that if one thinks that there are ways in which all of us must live if we are to be counted as genuinely happy, for example, by displaying and acting in accord with the moral virtues, then one can also think there are nearly an infinite number of ways that we can manifest those virtues, for example, as doctors, lawyers, teachers, artists, mechanics, engineers, priests, lay persons, and so forth.) If we take Thomas’ manner of speaking about human happiness in ST as demonstrative of his own position—what we have here, after all, is one long chain of arguments—Thomas also thinks that it is possible to offer a convincing argument for what it is that, objectively, fulfills a human being qua human being. However, Thomas also shows sensitivity to the role that our moral habits play in forming our beliefs—and so which arguments we will find convincing—regarding the nature of the good life for human beings (see, for example, ST IaIIae. q. 1, a. 7).

Before leaving the subject of the ultimate end of human action, we should note two other respects in which Thomas thinks the expression “ultimate end” (or “happiness”) is ambiguous. First, it is one thing to speak about the happiness that human beings can possess in this life, what Thomas sometimes calls “imperfect human happiness,” and another to speak about the happiness possessed by God, the angels, and the blessed, which Thomas considers to be perfect (see, for example, ST IaIIae. q. 4, a. 5). Thomas calls this worldly human happiness imperfect not only because he thinks it pales by comparison with the perfect happiness enjoyed by the saints in heaven, but also because he reads Aristotle—whose discussion of happiness is very important for Thomas’ own—as thinking about this worldly human happiness as imperfect. Thomas notes that, after Aristotle identifies the general characteristics of human happiness in NE, book I, ch. 7, Aristotle goes on to note in chapter 10 that human beings cannot be happy in this life, absolutely speaking, or perfectly, since human beings in this life can lose their happiness, and not being able to lose their happiness is something human beings desire. Thus, Aristotle himself thinks of human happiness in this life as imperfect in comparison to the conditions he lays out in NE, book I, ch. 7. Aristotle thinks humans are happy in this life merely as human beings, that is, as beings whose nature is mutable.

Second, Thomas recognizes two different kinds of questions we might wish to raise when we think about the nature of human happiness (see, for example, ST IaIIae. q. 1, a. 8 and q. 2, a. 7). When asking about the nature of human happiness, we might be asking what is true about the person who is happy. As Thomas puts it, this is to focus our attention on the use, possession, or attainment of happiness by the one who we are describing as (at least hypothetically) happy. To speak about happiness in this sense is to make claims about what has to be true about the soul of the person who is happy, for example, that happiness is an activity of the soul and not merely a state of the soul or an emotion, that it is a speculative rather than a practical activity, that this activity does not require a body, and so forth. However, in asking about the happiness of human beings, we might rather be asking about the object of happiness, or as Thomas puts it, “the thing itself in which is found the aspect of good” (ST IaIIae q. 1, a. 8). For example, the end of a hungry man in the sense of the object of his desire is food; the end of the hungry man in the sense of attainment is eating.

What constitutes happiness for Thomas? Thomas agrees with Aristotle that the attainment of happiness consists in the soul’s activity expressing virtue and, particularly, the best virtue of contemplation where the object of such contemplation is the best possible object, that is, God. Thus, the object of human happiness, whether perfect or imperfect, is the cause of all things, namely, God, for human beings desire to know all things and desire the perfect good. However, this is just another way to talk about God. Therefore, whether they consciously know it or not, all human beings desire contemplative union with God. Thomas thinks that human beings in this life—even those who possess the infused virtues, whether theological or moral (about which more is said below)—at best attain happiness only imperfectly since their contemplation and love of God is, at best, imperfect. For Thomas, only human happiness in heaven is perfect insofar as God brings it about that persons in heaven enjoy a perfect intellectual and volitional union with God. Thomas calls such a union the beatific vision.

b. Morally Virtuous Action as the Way to Happiness

Thomas thinks that happiness is the goal of all human activity. That suggests that human beings normally achieve happiness by means of human actions, that is, embodied acts of intellect and will (see, for example, ST IaIIae. q. 6, prologue). However, Thomas also thinks there are certain kinds of human actions that conduce to happiness. One complication, however, arises from the fact that Thomas thinks that we can speak about both imperfect and perfect happiness, the latter of which is a happiness that human beings can only possess by God’s grace helping us transcend (but not setting aside) human nature. This latter happiness culminates for the saints in the beatitudo (blessedness) of heaven. Thus, according to Thomas, there are, in reality, two mutually reinforcing stories to tell about those human actions that lead to happiness. Since our focus here is on Thomas’ philosophy, we shall focus in what follows on what Thomas has to say about the relation between virtuous actions and imperfect happiness in this life. (We will nonetheless have occasion to discuss a few things about Thomas’ views on perfect happiness.)

Thomas’ primary concern in the place where he provides his most detailed outline of the good human life (ST IaIIae.) is explaining how human beings achieve happiness by means of virtuous human actions, especially morally virtuous actions (for more on the difference between intellectual virtue and moral virtue, see the section below on Human Virtues as Perfections of Characteristically Human Powers). Thomas, like Aristotle and Jesus of Nazareth (see, for example, Matthew 5:48), is a moral perfectionist in the sense that the means to human happiness comes not by way of merely good human actions, but by way of perfect or virtuous moral actions. Thus, in order to grasp Thomas’ understanding of morality and the good life, we have to say something about his understanding of virtuous moral activity. However, what are morally virtuous human actions? In general terms, Thomas thinks virtuous human actions are actions that perfect the human agent that performs them, that is, good human actions are actions that conduce to happiness for the agent that performs them. An act is perfective of an agent relative to the kind to which the agent belongs. Since human beings are rational animals by nature, virtuous human actions are actions that perfect the rationality and animality of human beings. Of course, this is still to speak about actions that conduce to happiness in very abstract terms. Thomas has much to say about the specific characteristics of virtuous human action, especially morally virtuous action.

i. Morally Virtuous Action as Pleasurable

First of all, good or happiness-conducive human actions are pleasant for Thomas. Thomas goes so far as to say that intellectual pleasure (or delight) is even a necessary or proper accident of human activity in heaven (see, for example, ST IaIIae. q. 4, a. 1; and ST IaIIae. q. 34, a. 3). Thomas also sees pleasure as a necessary feature of the kind of happiness humans can have in this life, if only because virtuous activity, which is at the center of the good life for Thomas, involves taking pleasure in those virtuous actions (see, for example, ST IaIIae. q. 31, a. 4; ST IaIIae. q. 31, a. 5, ad1; and ST IaIIae. q. 35, a. 5). Both intellectually and morally virtuous actions are pleasant in themselves, thinks Thomas; in fact, he thinks they are the most pleasant of activities in themselves (ST IaIIae. q. 31, a. 5).

However, it is not just intellectual pleasure that belongs to virtuous human action in this life for Thomas, but bodily pleasure, too. For we are bodily creatures and not simply souls, and so human perfection (happiness) must make reference to the body (ST IaIIae. q. 59, a. 3). Thomas rejects the view, held by some Stoics, that all bodily pleasures are evil. As Thomas notes, it is natural for human beings to experience bodily and sensitive pleasures in this life (ST IaIIae. q. 34, a. 1). Therefore, the perfection of a bodily nature such as ours will involve not only intellectual pleasures, but bodily and sensitive pleasures, too.

Nonetheless, Thomas thinks it is true that bodily pleasure tends to hinder the use of reason, and this for three reasons (ST IaIIae. q. 33, a. 3). First, bodily pleasures, as powerful as they are, can distract us from the work of reason. Second, bodily pleasures can be contrary to reason, particularly those that are enjoyed in excess. Third, bodily pleasures can weaken or fetter the reason in a way analogous to how the drunkard’s use of reason is weakened. However, despite all of this, Thomas does not think that bodily pleasure is something evil by definition, and this for two reasons. First, pleasure is taking repose in an apparent good; but if we take repose in a manner that is consistent with reason, such pleasure is good, otherwise, it is not. Second, taking pleasure in an action is more akin to that action than a desire to act since the desire to act precedes the act whereas the pleasure in acting does not. However, desiring to do good is something good, whereas desiring to do evil is itself evil. A fortiori, taking pleasure in doing good is itself something good whereas taking pleasure in evil is something evil.

However, perhaps some bodily pleasures are evil by definition. For example, there have been philosophers and religious teachers that teach that sexual pleasure is evil insofar as it hinders reason. Although Thomas agrees that sexual pleasure hinders reason, he disagrees that sexual pleasure is bad per se. Recall that a bodily pleasure hinders reason for one of three reasons: it distracts us from using reason, it is inconsistent with reason, or it weakens reason. Thomas does not think that sexual pleasure per se is inconsistent with reason, for it is natural to feel pleasure in the sexual act (indeed, Thomas says that, before the Fall, the sexual act would have been even more pleasurable [see, for example, ST Ia. q. 98, a. 2, ad3]), and performing the sexual act within marriage is, all other things being equal, something natural and good. Thus, sexual pleasure must hinder reason insofar as it distracts us from using reason or weakens reason. However, this need not be morally evil, even a venial sin, as long as it is not inconsistent with reason, just as sleep, which hinders reason, is not necessarily evil, for as Thomas notes, “Reason itself demands that the use of reason be interrupted at times” (ST IaIIae. q. 34, a. 1, ad1).

ii. Morally Virtuous Action as Perfectly Voluntary and the Result of Deliberate Choice

Although virtuous actions are pleasant for Thomas, they are, more importantly, morally good as well. What does this mean for Thomas? We can begin with the fact that, according to Thomas, morally good actions are moral rather than amoral. However, moral actions have being voluntary as a necessary condition. Voluntary acts are acts that arise (a) from a principle intrinsic to the agent and (b) from some sort of knowledge of the end of the act on the part of the agent (see, for example, ST IaIIae. q. 6, a. 1). For example, the movements of a plant do not meet the necessary condition of being voluntary, according to Thomas. This is because plants do not have cognitive powers and so have no apprehension of the end of their actions. To take another example, insofar as a squirrel moves towards an object on the basis of apprehending that object by way of its sense faculties, the squirrel’s act is, in a sense, a voluntary one (see, for example, ST IaIIae. q. 6, a. 2).

However, an action’s being voluntary is not a sufficient condition for that action counting as a moral action according to Thomas. More than being voluntary, moral actions must be perfectly voluntary in order to count as moral actions. A perfectly voluntary action is an action that arises (a) from knowledge of the end of an action, understood as an end of action, and (b) from knowledge that the act is a means to the end apprehended (see, for example, ST IaIIae. q. 6, a. 2). This is just to say that perfectly voluntary actions are caused by rational appetite, or will, for Thomas. Therefore, although irrational animals (such as squirrels) can be said, in a sense, to act voluntarily, they cannot be understood to be acting morally, since they do not cognize the end as an end and do not understand their actions to be a means to such an end. Indeed, insofar as an act of a human being does not arise from an act of will, for example, when someone moves his or her arm while he or she is asleep, that action is not perfectly voluntary and so is not a moral action for Thomas (see, for example, ST IaIIae. q. 1, a. 1).

Morally virtuous action is moral (rather than amoral) action, and so it is perfectly voluntary. However, morally virtuous activity is also intentional and deliberate. Here we see a connection between the virtue of prudence and the other moral virtues. Prudence is that virtue that enables one to make a virtuous decision about what, for example, courage calls for in a given situation, which is often (but not always) acting in a mean between extremes. In other words, prudence is the virtue of rational choice (see, for example, ST IaIIae. q. 57, a. 5). Without prudence, human action may be good but not virtuous, since virtuous activity is a function of rational choice about what to do in a given set of circumstances. As we shall see, although virtuous action arises from a virtuous habit, it is not habitual in the sense that we “do it without even thinking about it.”

iii. Morally Virtuous Action as Morally Good Action

Although morally virtuous action is more than simply morally good action, it is at least that. However, how does Thomas distinguish morally good actions from bad or indifferent ones? First of all, Thomas thinks that some kinds of actions are bad by definition. As Thomas would put it, such actions are bad according to their genus or species, no matter the circumstances in which those actions are performed. For example, an act of adultery is a species of action that is immoral in and of itself insofar as such acts necessarily have the agent acting immoderately with respect to sexual passion as well as putting preexisting or potential children at great risk of being harmed (ST IIaIIae. q. 154, a. 8, respondeo). An action, therefore, that counts as morally good—and so is conducive to living what we might call a good life—cannot be an action that is morally bad according to its genus or species.

Second, there are circumstances surrounding an action that affect the moral goodness or badness of an action. For example, Thomas thinks that it is morally permissible for a community to put a criminal to death on the authority of the one who governs that community. However, if those in authority in a community have set a timetable for an execution, say, that it should occur no sooner than Wednesday at 5 PM, and John the executioner, on his own authority, kills the prisoner on Wednesday at 10 AM (where John is not also an authority in the community), then the circumstances of John’s act of killing make what might otherwise have been a morally permissible act to be an immoral act. Sometimes circumstances make an action that is bad according to its species even worse. For example, it is morally wrong to murder. However, if someone murders his father, he commits patricide, which is a more grievous act than the act of murdering a stranger.

Third, motivations count as another form of circumstance that makes an action bad, good, better, or worse than another. If Jane obeys her parents because of her love for God while Joan does so because she is afraid of being punished, although Joan’s act can still be morally praiseworthy, it is not as praiseworthy as Jane’s, since Jane’s motivation for moral action is better than Joan’s.

In putting these three “sources” for offering a moral evaluation of a particular human action together—kind of action, circumstances surrounding an action, and motivation for action—Thomas thinks we can go some distance in determining whether a particular action is morally good or bad, as well as how good or bad that action is. For example, Thomas thinks lying by definition is morally bad (see, for example, ST IIaIIae. q. 110, a. 3). However, not all lies are equally bad. If one lies in order to get an innocent person killed, one commits a mortal sin (the effect of which is, if one dies without repenting of such a sin, one will go to hell). However, if one tells a lie in order to save an innocent person’s life, one does something morally wrong, but such moral wrongdoing counts only as a venial sin, where venial sins harm the soul but do not kill charity or grace in the soul (see, for example, ST IIaIIae. q. 110, a. 4).

iv. Morally Virtuous Action as Arising from Moral Virtue

Morally virtuous action, therefore, is minimally morally good action—morally good or neutral with respect to the kind of action, good in the circumstances, and well-motivated. However, it is also action that arises from a good moral habit, that is, a moral virtue, which good moral habits make it possible easily and gracefully to act with moral excellence. To be sure, in many cases, moral virtues are acquired by way of good actions. However, one morally good action is not necessarily a morally virtuous act. This is because virtuous actions arise from a habit such that one wills to do what is virtuous with ease. The person who does what the virtuous person does, but with great difficulty, is at best continent or imperfectly virtuous—a good state of character compared to being incontinent or vicious to be sure—but not perfectly virtuous.

One way that Thomas often sums up the conditions for morally virtuous action we have been discussing is to say that morally virtuous action consists in a mean between extremes (see, for example, ST IaIIae. q. 60, a. 4). In acting temperately, for example, one must eat the right amount of food in a given circumstance, for the right reason, in the right manner, and from a temperate state of moral character. If, for example, John eats the right amount of food on a day of feasting (where John rightly eats more on such days than he ordinarily does), but does so for the sake of vain glory, his eating would nonetheless count as excessive. If, on the other hand, John eats the right amount of food on a day of mourning (where John rightly eats less on such days than he ordinarily does) for the sake of vain glory, this would be deficient (compare ST IaIIae. q. 64, a. 1, ad 3). Of course, John might also eat too much on a given day, or too little, for example, on a day marked for feasting and celebration. Such actions would also be excessive and deficient, respectively, and not morally virtuous.

c. Human Virtues as Perfections of Characteristically Human Powers

So far we have discussed Thomas’ account of the nature of the means to happiness as moral virtue bearing fruit in morally virtuous action. One might wonder how we acquire the virtues. Although we have a natural desire for some of the virtues, the actual possession of the virtues is not in us by nature. How do we come to possess the virtues according to Thomas? Here, it is again worth pointing out that there are two stories to tell, since Thomas thinks there are really two different kinds of virtue, one which disposes us to act perfectly in accord with human nature and one which disposes us to perform acts which transcend human nature (see, for example, ST IaIIae. q. 54, a. 3). These two kinds of virtues correspond with the two different ends of human beings for Thomas, one that is natural, that is, the imperfect happiness attainable by human beings in this life by the natural light of reason and the natural inclination of the will, and one that is supernatural and comes to us only by grace, that is, the perfect happiness of the saints in heaven, in which happiness Christians can begin to participate even in this life, Thomas thinks.

According to Thomas, human beings can acquire virtues that perfect human beings according to their natural end by repeatedly performing the kinds of acts a virtuous person performs, that is, by habituation. Thomas calls such virtues human (see, for example, ST IaIIae. q. 54, a. 3; ST IaIIae. q. 55, aa. 1-3; and ST IaIIae. q. 61, a. 1, ad2) in order to distinguish such virtues from “infused” (or, to use concepts Thomas finds in Aristotle, “god-like,” “heroic” or “super-human”) virtues, which are virtues we have only by way of a gift from God, not by habituation. For example, we can imagine that, apart from any special gift of God, Socrates was courageous in the sense that Socrates acquired the ability to habitually say “yes” to pains that are in accord with right reason in much the same way that an athlete or a musician voluntarily becomes more skilled or proficient in what they do through practice, that is, by doing (or at least approximating) what good athletes and virtuosi do. Before saying more about human virtue, which is our focus here, it will be good to say a few things about infused virtue since this is an important topic for Thomas, and Thomas’ views on infused virtue are historically very important.

i. Infused Virtues

Like human virtues, infused virtues are perfections of our natural powers that enable us to do something well and to do it easily. For example, the virtue of faith enables its possessor, on a given occasion, to believe that “God exists and rewards those who seek Him” (Hebrews 11:6) and to do so confidently and without also thinking it false that God exists, and so forth. In addition, as in the case of human virtues, we are not born with the infused virtues; virtues, for Thomas, are acquired.

However, infused virtues differ from human virtues in a number of interesting ways. First, unlike human virtues, which enable us to perfect our powers such that we can perform acts that lead to a good earthly life, infused virtues enable us to perfect our powers such that we can perform acts in this life commensurate with—and/or as a means to—eternal life in heaven (ST IaIIae. q. 62, a. 1).

Second, whereas a human virtue, for example, human temperance, is acquired by habituation, that is, by repeatedly performing the kinds of actions that are performed by the temperate person, infused virtues are wholly gifts from God. Thomas cites St. Augustine in this regard: “Virtue is a good quality of the mind, by which we live righteously, of which no one can make a bad use, which God works in us, without us” (ST IaIIae. q. 55, a. 4, obj. 1; emphasis mine). To see clearly this difference between human and infused virtue according to Thomas, note that Thomas thinks that neither infused nor human virtue makes a human being impervious to committing mortal sin. (For Thomas, a mortal sin is a sin that kills supernatural life in the soul, where such supernatural life makes one fit for the supernatural reward of heaven. Mortal sins require intentionally and deliberately doing what is grievously morally wrong. Contrast a mortal sin with a venial sin. Although venial sin can lead to mortal sin, and so ought to be avoided, a venial sin does not destroy supernatural life in the human soul.) Does Socrates lose his human virtue, for example, his courage, if he commits a mortal sin? Thomas thinks the answer is “no.” This is because naturally acquired virtues are virtues acquired through habituation, and one sinful act does not destroy a habit acquired by way of the repetition of many acts of one kind (see, for example, ST IaIIae. q. 63, a. 2, ad2). However, since infused virtues are not acquired through habituation but are rather a function of being in a state of grace as a free gift from God, and sinning mortally causes one to no longer be in a state of grace, just one mortal sin eliminates the infused virtues in the soul (although imperfect forms of them can remain, for example, unformed faith and hope [see below]). Of course, such mortal sins can be forgiven, Thomas thinks, by God’s grace through the sacrament of penance, thereby restoring a soul to the state of grace (see, for example, ST IIIa. q. 86, a. 1, respondeo).

Thomas speaks of at least two different kinds of infused virtue. First, there are the well-known theological virtues of faith, hope, and charity (see, for example, St. Paul’s First Letter to the Corinthians, ch. 13). In general, the theological virtues direct human beings toward their supernatural end, specifically in relation to God himself. In other words, they are gifts of God that enable human beings to look to God himself as the object of a happiness that transcends the natural powers of human beings. Faith is the infused virtue that enables its possessor to believe what God has supernaturally revealed. Hope is the infused virtue that enables its possessor to look forward to God Himself—and not some created image of God—being the object of his or her perfect bliss. Finally, the virtue of charity creates a union of friendship between the soul of its possessor and God—a union that is not natural to human beings but requires that God raise up the nature of its possessor to God. In comparison to charity, faith and hope are imperfect infused virtues, since, unlike charity, faith and hope connote the lack of complete possession of God (see, for example, ST IaIIae. q. 66, a. 6, respondeo). As has been seen, perfect human happiness (qua possession) consists of the beatific vision. However, if we have faith, we do not have vision. If we have hope, we do not yet possess that for which we hope. Therefore, among the theological virtues, only charity remains in the saints in heaven. Thomas thinks this is one reason why St. Paul says, “The greatest of these [three virtues, that is, faith, hope, and charity] is charity.”

Unlike the intellectual and moral virtues—whether infused or human—the theological virtues do not observe the mean where their proper object, that is, God, is concerned, for Thomas thinks it is not possible to put faith in God too much, to hope too much in God, or to love God more than one should (see, for example, ST IaIIae. q. 64, a. 4).

Second, in addition to the theological virtues, there are also the infused versions of the intellectual and moral virtues (see, for example, ST IaIIae. q. 63, a. 3; on the distinction between intellectual and moral virtue, see below). Why infused virtues of this type? Whereas the theological virtues direct human beings to God Himself as object of supernatural happiness, the infused intellectual and moral virtues are those virtues that are commensurate with the theological virtues—and thus direct us to a supernatural perfection—where things other than God are concerned. Just as human beings are naturally directed to both God and creatures through their natural desires and through virtues that can be acquired naturally, so human beings, by the grace of God, can be supernaturally directed both to God and creatures through the theological and the infused intellectual and moral virtues, respectively. As Thomas says in one place, where the human moral virtues, for example, enable human beings to live well in a human community, the infused moral virtues make human beings fit for life in the kingdom of God (see, for example, ST IaIIae. q. 63, a. 4).

ii. Human Virtues

Thomas thinks there are a number of human virtues, and so in order to offer an account of what he has to say about humanly virtuous activity (and its relationship to the imperfect human happiness we can have in this life), we need to mention the different kinds of human virtues. In order to do this, we have to examine the various powers that human beings possess, since, for Thomas, mature human beings possess various powers, and virtues in human beings are perfections of the characteristically human powers (see, for example, ST IaIIae. q. 55, a. 1).

First, there are the rational powers of intellect and will. Although Thomas thinks that intellect enables human beings to do a number of different things, most important for the moral life is intellect’s ability to allow a human being to think about actions in universal terms, that is, to think about an action as a certain kind of action, for example, a voluntary action, or as a murder, or as one done for the sake of loving God. Our ability to do this—which separates us from irrational animals, Thomas thinks—is a requisite condition for being able to act morally. Since a gorilla, we might suppose, cannot think about actions in universal terms, it cannot perform moral actions.

Second, Thomas also distinguishes between the apprehensive powers of the soul, that is, powers such as sense and intellect that are productive of knowledge of some sort, and the appetitive powers of the soul, which are powers that incline creatures to a certain goal or end in light of how objects are apprehended by the senses and/or intellect as desirable or undesirable. The will, according to Thomas, is an appetitive power always linked with the operation of intellect. For Thomas, intellect and will always act in tandem. Since the object of will—that is, what it is about—is being insofar as the intellect presents it as desirable, Thomas thinks of will as rational appetite. The will is therefore an inclination in rational beings towards an object or act because of what the intellect of that being presents of that object or act as something desirable or good in some way.

In addition to the appetitive power of the will, there are appetitive powers in the soul that produce acts that by nature require bodily organs and therefore involve bodily changes, namely, the acts of the soul that Thomas calls passions or affections. These include not only emotions such as love and anger, but pleasure and pain, as well (see, for example, ST IaIIae. q. 31, a. 1).

Thomas thinks there are two different kinds of appetitive powers that produce passions in us, namely, the concupiscible power and the irascible power. The object of the concupiscible power is sensible good and evil insofar as a creature desires such sensible goods, or wants to avoid such sensible evils, in and of themselves. Thus, the concupiscible power produces in us the passions of love, hate, pleasure, and pain or sorrow. By contrast, the object of the irascible power is sensible good and evil insofar as such good is difficult to acquire or such evil is difficult to avoid. Thomas therefore associates the passions of anger, fear, and hope with the irascible power.

In contrast to Socrates of Athens, who, according to Thomas, thinks all human virtues are intellectual virtues (see, for example, ST IaIIae. q. 58, a. 2), Thomas distinguishes intellectual and moral virtues since he thinks human beings are both intellectual and appetitive beings. Since virtues are dispositions to make a good use of one’s powers, Thomas distinguishes virtues perfecting the intellect—called the intellectual virtues—from those that perfect the appetitive powers, that is, the moral virtues. Unlike the moral virtues, which automatically confer the right use of a habit, intellectual virtues merely confer an aptness to do something excellently (ST IaIIae. q. 57, a. 1). For example, John might have an intellectual virtue such that he can easily solve mathematical problems. However, John might use such a habit for evil purposes. On the other hand, if John is courageous, he cannot make use of his habit of courage to do what is wrong. If John were to do what is morally wrong, it would be in spite of his moral virtues, not because of them.

Following Aristotle, Thomas mentions five intellectual virtues: wisdom (sapientia), understanding (intellectus), science (scientia), art (ars), and prudence (prudentia). First, there are the purely speculative intellectual virtues. These intellectual virtues do not essentially aim at some practical effect but rather aim simply at the consideration of truth. Understanding is the speculative intellectual virtue concerning the consideration of first principles, that is, those propositions that are known through themselves and not by way of deduction from other propositions, for example, the principle of non-contradiction, and propositions such as all mammals are animals and it is morally wrong to kill an innocent person intentionally. Wisdom is the intellectual virtue that involves the ability to think truly about the highest causes, for example, God and other matters treated in metaphysics. As we saw in the section on the nature of knowledge and science above, science (considered as a virtue) is the intellectual ability to draw correct conclusions from first principles within a particular subject domain, for example, there is the science of physics, which is the ability to draw correct conclusions from the first principles of being qua material being.

Second, there are two intellectual virtues, namely, art and prudence, to which it belongs essentially to bring about some practical effect. Thomas defines art as “right reason about certain works to be made” (ST IaIIae. q. 57, a. 3, respondeo). Art is therefore unlike the first three of the intellectual virtues mentioned—which virtues are purely speculative—since art necessarily involves the practical effect of bringing about the work of art (if I simply think about a work of art without making a work of art, I am not employing the intellectual virtue of ars). Thomas considers art nonetheless to be an intellectual virtue because the goodness or badness of the will is irrelevant where the exercise of art itself is concerned. (Beethoven may or may not have been a morally bad man all the while he composed the 9th symphony, but we need not consider the moral status of Beethoven’s appetites when we consider the excellence of his 9th symphony qua work of art).

Finally, there is prudence. Prudence is the habit that enables its possessor to recognize and choose the morally right action in any given set of circumstances. As Thomas puts it: “Prudence is right reason of things to be done” (ST IaIIae. q. 57, a. 4, respondeo). Prudence is not a speculative intellectual virtue for the same reason ars is not: the human being exercising the virtue of prudence is not simply thinking about an object but engaged in bringing about some practical effect (so, for example, the philosopher who is simply thinking about the right thing to do without actually doing the morally right thing is not exercising the virtue of prudence, even if said philosopher is, in fact, prudent). Prudence also differs from ars in a crucial way: whereas one can exercise the virtue of ars without rectitude in the will, for example, one can bring about a good work of art by way of a morally bad action, one cannot exercise the virtue of prudence without rectitude in the will. Indeed, we do not find prudence in a person without also finding in that person the moral virtues of justice, courage, and temperance. Thus, not only is prudence necessarily practical, its exercise necessarily involves someone (a) habitually acting with a good will and (b) possessing appetites for food, drink, and sex that are habitually measured by right reason.

Why, then, is prudence an intellectual virtue for Thomas? Recall that Thomas thinks that virtue is the perfection of some power of the soul. Thomas therefore thinks the essential difference between the intellectual and moral virtues concerns the kinds of powers they perfect. Intellectual virtues perfect the intellect while moral virtues are perfections of the appetitive powers. However, prudence is essentially a perfection of intellect, and so it is an intellectual virtue. Nonetheless, it “has something in common with the moral virtues,” (ST IaIIae. q. 58, a. 3, ad1) Thomas says, insofar as it is concerned with things to be done. This is why, Thomas thinks, prudence is also reckoned among the moral virtues by authors such as Cicero and St. Augustine. Indeed, some philosophers call prudence a “mixed” virtue, partly intellectual and partly moral.

According to Thomas, moral virtue “perfects the appetitive part of the soul by directing it to good as defined by reason” (ST IaIIae. q. 59, a. 4, respondeo). Since the moral virtues are perfections of human appetitive powers, there is a cardinal or hinge moral virtue for each one of the appetitive powers (recall that prudence is the cardinal moral virtue that perfects the intellect thinking about what is to be done in particular circumstances). As has been seen, Thomas thinks there are three appetitive powers: the will, the concupiscible power, and the irascible power. Thus, there are three cardinal moral virtues: justice (which perfects the faculty of will), temperance (perfecting the concupiscible power), and fortitude (perfecting the irascible power). Where prudence perfects intellect itself thinking about what is to be done, justice is intellect disposing the will such that a person is “set in order not only in himself, but also in regard to another” (ST IaIIae. q. 66, a. 4). According to Thomas, temperance is the virtue whereby the passions of touch participate in reason so that one is habitually able to say “no” to desires of the flesh that are not in accord with right reason (ST IaIIae. q. 61, a. 3). Finally, fortitude is the virtue whereby the desire to avoid suffering participates in reason such that one is habitually able to say “yes” to suffering insofar as right reason summons us to do so (ST IaIIae. q. 61, a. 3).

This is just the tip of the iceberg of what Thomas has to say by way of characterizing the human virtues and their importance for the good life. In addition, Thomas has a lot to say about the parts of the cardinal virtues and the virtues connected to the cardinal virtues, not to mention the vices that correspond with these virtues (see, for example, his treatment of these issues in ST IIaIIae).

d. The Logical Relations between the Human Virtues

Virtue ethicists have traditionally been interested in defending a position on the logical relations between the human virtues. For example, we might wonder whether one can really be courageous without also being temperate. Thomas is no exception to this rule. As has been seen, there are two kinds of human virtues, intellectual and moral. Where specifying the relations between the human moral virtues is concerned, Thomas thinks it important to distinguish two senses of human moral virtue, namely, perfect human moral virtue and imperfect human moral virtue (see, for example, ST IaIIae. q. 65, a. 1). An imperfect human moral virtue, for example, imperfect courage, is a disposition such that one simply has a strong inclination or desire to do good deeds, in this case, courageous deeds. Perfect human moral virtues, by contrast, are dispositions such that one is inclined to do good deeds well, that is, in the right way, at the right time, for the proper motive, and so forth. Where imperfect human moral virtues are concerned, these can be possessed independently of the others. For example, Joe is inclined (by nature or by acquired habit) to perform deeds that would be rightly (if loosely) described as just, but Joe is not inclined to virtuous activity where his desires for eating, drinking, and sex are concerned. By contrast, perfect human moral virtues cannot be possessed apart from one another. If Joe is perfectly just, then he also is perfectly temperate. Thomas has two reasons for accepting this “unity of the virtues” thesis. As he notes, these two reasons correspond with two different ways we can distinguish the cardinal virtues from one another (ST IaIIae. q. 65, a. 1, respondeo).

First, we might distinguish the virtues “according to certain general properties of the virtues: for instance, by saying that discretion belongs to prudence, rectitude to justice, moderation to temperance, and strength of mind to courage” (ST IaIIae. q. 65, a. 1, respondeo). Given this way of distinguishing the virtues, discretion is not perfectly virtuous without strength of mind, strength of mind is not virtuous without moderation, and so forth. Thomas notes that it is for this sort of reason that, for example, Pope St. Gregory the Great and St. Augustine believe the unity of the virtues thesis.

Second, we might distinguish the cardinal virtues as Thomas himself prefers to do, after the example of Aristotle, namely, insofar as the different virtues perfect different powers. Given this way of distinguishing the virtues, it still follows that one cannot have any one of the perfect cardinal virtues without also possessing the others. This is because one cannot have courage, temperance, or justice without prudence, since part of the definition of a perfect virtue is acting in accord with rational choice, where rational choice is a function of being prudent. For example, if I am able to act courageously in a given situation, not only does my irascible power need to be perfected, that is, I have to perfectly desire to act rationally when experiencing the emotion of fear, but I need to know just what courageous action calls for in that given situation. For example, it may be that the prudent thing to do in that situation is to “run away in order to fight another day.” However, knowing just what to do in a given situation where one feels afraid is a function of the virtue of prudence. Thus, one cannot be perfectly courageous without having perfect prudence (ST IaIIae. q. 65, a. 1; see also ST IaIIae. q. 58, a. 4).

However, according to Thomas, it is also the case that one cannot be perfectly prudent unless one is also perfectly temperate, just, and courageous. This is because the prudent person has a perfected intellect where deciding on the virtuous thing to do in any given situation is concerned. However, such knowledge requires a perfected knowledge about the rational ends or principles of human action, for one cannot perfectly know how to apply the principles of action in a given situation if one does not perfectly know the principles of action. Moreover, a perfect knowledge of the ends or principles of human action requires the possession of those virtues that perfect the irascible appetite, the concupiscible appetite, and the will; otherwise, one will have a less than perfect, that is, a distorted, picture of what ought to be pursued or avoided. For example, if John is a coward, then he will be inclined to think that one always ought to avoid what causes pain. However, if John is inclined to believe such a thing, then he will not be able to think rightly, that is, prudently, about just what he should do in a particular situation that potentially involves him suffering pain. What goes for courage goes for temperance and justice, too. Therefore, the perfectly prudent person has the perfect virtues of courage, temperance, and justice.

Finally, we can also note that, for Thomas, Joe cannot be perfectly temperate if he is not also perfectly courageous and just (where we are speaking about perfect human virtue). This is because Joe cannot be temperate if he is not also prudent. However, for Thomas, Joe cannot be prudent if he is not also temperate, courageous, and just. Therefore, Joe cannot be temperate if he is not also courageous and just. For the same kinds of reasons, it follows, according to Thomas, that all of the human cardinal virtues come with one another. It is for these sorts of reasons that Thomas affirms the truth of the “unity of the virtues” thesis.

Where perfect human virtue is at issue, what of the relation between the human intellectual virtues and the human moral virtues for Thomas? Since prudence is a mixed virtue—at once moral and intellectual—there is at least one human intellectual virtue that requires possession of the moral virtues and one intellectual virtue that is required for possession of the moral virtues. In addition, since the possession of prudence requires a knowledge of the principles of human action that are naturally known, that is, natural law precepts (see the section on moral knowledge below), and understanding is the virtue whose possessor has knowledge of, among other things, the principles of human action that are naturally known, possession of the moral virtues requires possession of the intellectual virtue of understanding (although one may have understanding without possessing the moral virtues, if only because one can have understanding without prudence).

As for the other intellectual virtues—art, wisdom, and science—none of these virtues can be possessed without the virtue of understanding. To give Thomas’ example, if one does not know a whole is greater than one of its parts—knowledge of which is a function of having the intellectual virtue of understanding—then one will not be able to possess the science of geometry. Aside from its dependence on understanding, the possession of the virtue of art does not require the moral virtues or any of the other intellectual virtues. The possession of science with respect to a particular subject matter seems to be similar to the virtue of art in this regard, that is, although it requires possessing the virtue of understanding, it does not require the possession of moral virtues or any other intellectual virtues.

The possession of the intellectual virtue of wisdom—habitual knowledge of the highest causes—seems to differ for Thomas from science and art insofar as possession of wisdom presupposes the possession of other forms of scientific knowledge (see, for example, SCG I, ch. 4, sec. 3). Nonetheless, like art and the other sciences, one can possess the virtue of wisdom without possessing prudence and the other moral virtues. That being said, Thomas seems to suggest that possession of the virtue of wisdom is less likely if one lacks the moral virtues (SCG I, ch. 4, sec. 3).

e. Moral Knowledge

In order to make sense of Thomas’ views on moral knowledge, it is important to distinguish between different kinds of moral knowledge, which different kinds of moral knowledge are produced by the (virtuous) working of different kinds of powers.

Thomas thinks that all human beings who have reached the age of reason and received at least an elementary moral education have a kind of moral knowledge, namely, a knowledge of universal moral principles. One place he says something like this is in his famous discussion of law in ST. In that place he argues that there are at least three different kinds of universal principles of the natural law, that is, principles that apply in all times, places, and circumstances, which principles can be learned by reflecting on one’s experiences by way of the natural light of human reason, apart from faith (although Thomas notes that knowledge of these principles often is inculcated in human beings immediately through divinely infused faith [see, for example, ST IaIIae. q. 100, a. 3, respondeo]).

First, there are those universal principles of the natural law that function as the first principles of the natural law, for example, one should do good and avoid evil (ST IaIIae. q. 100, a. 3, respondeo). Such universal principles are known to be true by every human person who has reached the age of reason without fail. Of course, most people—unless they are doing theology or philosophy—will not make such principles of practical action explicit. In being usually implicit in our moral reasoning, Thomas compares the first principles of the natural law with the first principles of all reasoning, for example, the principle of identity and the principle of non-contradiction.

Second, there are those universal principles of the natural law that, with just a bit of reflection, can be derived from the first principle of the natural law (ST IaIIae. q. 100, a. 3, respondeo). We can call these the secondary universal precepts of the natural law. For example, we all know we should do good and avoid evil. We also know, when we reflect upon it, that failing to honor those who have given us extremely valuable gifts we cannot repay would be to do evil. Now, we all know that our father and mother have given us extremely valuable gifts we cannot repay, for example, life and a moral education. Therefore, we can naturally know that we ought to honor our mother and our father. Of course, most of us do not need to make such reasoning explicit in order to accept such moral principles as absolute prescriptions or prohibitions. As with the first universal principles of the natural law, the truthfulness of these secondary universal precepts of the natural law is immediately obvious to us, whether we know this by the natural light of reason (insofar as the truth of such propositions is obvious to us as soon as we understand the meaning of the terms in those propositions) or immediately know them to be true by the light of faith (see, for example, ST IaIIae. q. 100, a. 1, respondeo). Thomas thinks that (at least abstract formulations of) the commandments of the Decalogue constitute good examples of the secondary, universal principles of the natural law (see, for example, ST IaIIae. q. 100, a. 3, respondeo). To know the primary and secondary universal precepts of the natural law is to have what Thomas calls the human virtue of understanding with respect to the principles of moral action. Moral knowledge of other sorts is built on the back of having the virtue of understanding with respect to moral action.
As we have seen, it is possible to have the virtue of understanding (say, with respect to principles of action) without otherwise being morally virtuous, for example, prudent, courageous, and so forth (see, for example, ST IaIIae. q. 58, a. 5).

Third, Thomas thinks there are also universal principles of the natural law that are not immediately obvious to all but which can be inculcated in students by a wise teacher (see, for example, ST IaIIae. q. 58, a. 5; ST IaIIae. q. 100, a. 1, respondeo; and ST IaIIae. q. 100, a. 3, respondeo). We might call this third kind of universal principle of the natural law the tertiary precepts of the natural law. Thomas gives as an example of such a principle a precept from Leviticus 19:32: “Rise up before the hoary head, and honor the person of the aged man,” that is, respect your elders (ST IaIIae. q. 100, a. 1, respondeo). Other examples Thomas would give of tertiary precepts of the natural law are that one ought to give alms to those in need (ST IIaIIae. q. 32, a. 5, respondeo), that one must not intentionally spill one’s seed in the sex act (ST IIaIIae. q. 154, a. 11, respondeo), and that one should not lie with a person of the same sex (ST IIaIIae. q. 154, a. 11, respondeo).

It is easy to be confused by what Thomas says here about natural law as conferring moral knowledge if we think Thomas means that all people have good arguments for their moral beliefs. People sometimes say that they “just see” that something is morally wrong or right. Thomas thinks it is possible to know the general precepts of the moral law without possessing a scientific kind of moral knowledge (which, as has been seen, does require having arguments for a thesis). One way to talk about this “just seeing” that some moral propositions are true is by making reference to what Thomas calls natural law. People do not typically argue their way to believing the general norms of morality, for example, it is wrong to murder, one should not lie. Rather, the truth of these norms is “self-evident” (per se nota) to us, that is, we understand such norms to be true as soon as we understand the terms in the propositions that correspond to such norms (see, for example, ST IaIIae. q. 94, a. 2). Of course, that does not mean that arguments cannot be given for the truth of such norms, at least in the case of the secondary and tertiary precepts of the natural law, if only for the sake of possessing a science of morals. The truth of such basic moral norms is thus analogous to the truth of the proposition “God exists” for Thomas, which for most people is not a proposition one (needs to) argue(s) for, although the theologian or philosopher does argue for the truth of such a proposition for the sake of scientific completeness (see, for example, ST Ia. q. 2, a. 2, ad2).

So far we have simply talked about the fact that, in Thomas’ view, human beings have some knowledge of universal moral principles. However, unless such knowledge is joined to knowledge of particular cases in the moral agent or there is a knowledge of particular moral principles in the agent, then the moral agent will not know what he or she ought to do in a particular circumstance. For example, all human beings know they should seek happiness, that is, they should do for themselves what will help them to flourish. However, in a particular case, Joe really wants to go to bed with Mike’s wife. In fact, given his passions and lack of temperance, it seems to Joe that going to bed with Mike’s wife will help him to flourish as an individual human being. That is, it seems good to Joe to commit adultery. Thomas thinks that ordinarily a person such as Joe knows by the universal principles of the natural law, that is, he understands not only that he should not commit adultery but that committing adultery will not help him flourish. In addition, Joe knows that going to bed with Mike’s wife would be an example of an adulterous act. However, such knowledge can be destroyed or rendered ineffective (and perhaps partly due to Joe’s willingness that it be so) in a particular case by his passion, which reflects a lack of a virtuous moral disposition in Joe, that is, temperance, which would support the judgment of Joe’s reason that adultery is not happiness-conducive. Thus, it may seem genuinely good to Joe to go to bed with Mike’s wife. In this particular case, (we are supposing) Joe lacks effective moral knowledge of the wrongness of going to bed with Mike’s wife. 
(Again, Joe could be morally responsible for his lack of temperance, and so for his lack of resolve to act in accord with what he knows about the morality of going to bed with Mike’s wife; in that case, his passion would simply render him vincibly ignorant of the principles of this particular case and so would not excuse his moral wrongdoing, although it would make intelligible why he wills as he does.) In order for knowledge of the universal principles of the natural law to be effective, the agent must have knowledge of moral particulars, and such knowledge, Thomas thinks, requires possessing the moral virtues. Without the virtues, a person will have at best a deficient, shallow, or distorted picture of what is really good for one’s self, let alone others (see, for example, ST IaIIae. q. 58, a. 5, respondeo).

Finally, we should mention another kind of knowledge of moral particulars that is important for Thomas, namely, knowing just what to do in a particular situation such that one does the right thing, for the right reason, in the right way, to the proper extent, and so forth. This is knowledge had by way of the possession of prudence. As we noted above, the knowledge that comes by prudence has the agent’s possession of the other moral virtues as a necessary condition, for the knowledge we are speaking of here is knowing just how to act courageously in this situation; to know this, one must have one’s passions ordered such that, whatever one chooses to do, one knows one always ought to act courageously. However, the prudent person is also able to decide to act in a particular way in a given situation. Such deciding, of course, involves a sort of knowing just what the situation in question calls for, morally speaking. In order for one’s temperance, for example, to be effective, one needs not only to have a habit of desiring food, drink, and sex in a manner consistent with right reason, but one needs to decide how to use that power in a particular situation. For example, the prudent person knows what temperate eating will look like on this given day, at this given time, and so forth. The moral knowledge that comes by prudence is another kind of moral knowledge, Thomas thinks, one necessary for living a good human life.

f. The Proximate and Ultimate Standards of Moral Truth

According to Thomas, the proximate measure for the goodness and badness of human actions is human reason insofar as it is functioning properly, or to put it in Thomas’ words, right reason (recta ratio) (see, for example, ST IaIIae. q. 34, a. 1). Thomas sometimes speaks of this proximate measure of what is good in terms of that in which the virtuous person takes pleasure (see, for example, ST IaIIae. q. 1, a. 7; and ST IaIIae. q. 34, a. 4).

However, since right reason in human beings is a kind of participation in God’s mind (see, for example, ST IaIIae. q. 91, a. 2, respondeo), we can also speak of the mind of God as the ultimate standard for whether a human action is morally good or bad. In fact, given Thomas’ doctrine of divine simplicity, we can say simply that God is the ultimate measure or standard of moral goodness.

One way Thomas speaks about God being the measure of morally good acts is by using the language of law. According to Thomas, God’s idea regarding His providential plan for the universe has the nature of a law (ST IaIIae. q. 91, a. 1; see the section below on political philosophy for more on Thomas on law). This idea of how the universe ought to go, like any other of God’s ideas, is not, in reality, distinct from God Himself, for by the divine simplicity God’s intellect and will are in reality the same as God Himself. God’s own infinite and perfect being (we might even say “God’s character,” if we keep in mind that applying such terms to God is done only analogously in comparison to the way we use them of human moral agents) is the ultimate rule or measure for all creaturely activity, including normative activity. This is why Thomas can say that none of the precepts of the Decalogue are dispensable (ST IaIIae. q. 100, a. 8), for each one of the Ten Commandments is, Thomas thinks, a fundamental precept of the natural law. Now, it would be a contradiction in terms for God to will that a fundamental precept of the natural law be violated, since the fundamental precepts of the natural law are necessary truths (we could say that they are true in all possible worlds) that reflect God’s own necessary, infinite, and perfect being. For God to will to dispense with any of the Ten Commandments, for example, for God to will that someone murder, would be tantamount to God’s willing in opposition to His own perfection. Since God’s will and God’s perfection (being) are the same, for God to will in opposition to His own perfect being would be a contradiction in terms.

9. Political Philosophy

a. Law

i. The Nature of Law

For Thomas, law is (a) a rational command (b) promulgated (c) by the one or ones who have care of a perfect community (d) for the sake of the common good of that community (ST IaIIae. q. 90, a. 4). First, a law is a command. It is not simply a suggestion or an act of counsel. If John merely suggests a course of action A to Mike, or Mike asks John what to do about some moral decision D and John merely offers counsel to Mike about what to do where D is concerned, then, all other things being equal, Mike is not morally obligated to perform A or follow John’s advice where D is concerned, even if John is related to Mike as Mike’s moral or political superior. Mike may indeed be likely to perform A or follow John’s advice about D out of fear or out of respect for John, but Mike would not necessarily do something morally wrong if he did not perform A or follow John’s counsel about D. On the other hand, if John commands Mike to do something (and all the other conditions for a law are met), then Mike does something morally wrong if he fails to act in accord with John’s command. According to Thomas, law morally obligates those to whom it is directed. That being said, not all immoral acts are equally morally wrong for Thomas. It may be that Susan’s breaking a law in a given situation merely counts as a venial sin. (For the distinction between venial and mortal sin, see the section on infused virtue above.)

A law is also a rational command. That means that, minimally, John’s command must be coherent. In addition, for John’s command to have the force of law, it must not contradict any pre-existing law that has the force of law. Such a pre-existing law could be a higher law. For example, if John (a mere human being) commands that all citizens sacrifice to him as an act of divine worship once a year, Thomas would say that such a command does not have the force of law insofar as (Thomas thinks) such a command is in conflict with a natural law precept that ordains that only divine beings deserve to be worshiped by way of an act of sacrifice. One is not obliged to obey a human being’s ordinance that is in conflict with the commands of a higher power (see, for example, ST IIaIIae. q. 104, a. 5). In his Letter from the Birmingham Jail, Martin Luther King Jr. invokes precisely this aspect of Thomas’ understanding of law in arguing that segregation ordinances are unjust when he notes that, according to Thomas, “an unjust law is a human law that is not rooted in eternal law and natural law” (1963, p. 82).

A command C of a human being could also be in conflict with a pre-existing human law. C would not, in such a case, have the force of law. Take an example: John’s mother commands him to run some errands for her. As John is about to do so, John’s father says to him: “Stop what you’re doing right now and do your homework!” Assuming that John’s mother and father have equal authority in John’s home, and that both of these commands meet all of the other relevant conditions for a law, the command issued by John’s father does not have the force of law for John, since it contradicts a pre-existing law.

Second, commands that get to count as laws must have as their purpose the preservation and promotion of the common good of a particular community. When Thomas speaks about the common good of a community, he means to treat the community itself as something that has conditions for its survival and its flourishing. For example, if a tyrant issues an edict that involves taxing his citizens so heavily that the workers in that community would not be able to feed themselves or their families, such an edict would violate the very purpose of law, since the edict would, in short order, lead to the destruction of the community.

Third, in addition to being a rational command that promotes the common good of a community, a law must be issued by those who have true political authority in that community. There is no need to think that the authority figures in question here have to be political authorities in the sense that we take elected officials or kings to be. Within the confines of a household, for example, parents have the authority to make laws, that is, rational commands that morally obligate those to whom the laws are addressed. It is worth stressing that a command’s being issued by the requisite authority is a necessary but not sufficient condition for that command’s having the force of law. The political authorities in Birmingham, Alabama may have been genuine authorities and enjoyed real power to make laws. However, if Martin Luther King Jr. was right that segregation ordinances were unjust—and so irrational—then such ordinances, despite the fact that they were issued by authorities that were legitimate, did not have the force of law and so did not morally obligate those who, in their conscience, recognized that such segregation ordinances were unjust.

Finally, a command must be promulgated in order to have the force of law, that is, to morally bind in conscience those to whom it is directed. Thomas accepts the principle that ignorance of the law excuses, but not just any kind of ignorance does so. For ignorance comes in at least two varieties, invincible and vincible. If I am invincibly ignorant of p, it is not reasonable to expect me to know p, given my circumstances. For example, say John has been extremely ill for a year, and in that time a law was passed of which, under normal circumstances, John should have made himself aware. Because of John’s circumstances, however, it would be correct to say he remains invincibly ignorant of the law. For John, then, the law does not bind in conscience (at least as long as John remains invincibly ignorant of it). If John were to transgress the law, John would not be morally culpable for such a transgression. On the other hand, someone might really be ignorant of a law but still be culpable for transgressing it. Such a person would be vincibly ignorant of that law. Someone is vincibly ignorant of a law just in case that person does not know about the law but should have taken actions so as to know about it.

ii. The Different Kinds of Law

1. The Eternal Law

In his famous discussion of law in ST, Thomas distinguishes four different kinds of law: eternal, natural, human, and divine. The eternal law is “God’s idea of the government of things in the universe” (ST IaIIae. q. 91, a. 1, respondeo). This description of the eternal law follows Thomas’ definition of law in general, which definition mentions the four causes of law. Recall that, according to Thomas, a law is a rational command (this is a law’s formal cause) made by the legitimate authority of a community (a law’s efficient cause) for the common good of that community (the final cause) and promulgated (the material cause). The community in question here is the whole universe of creatures, the legitimate authority of which is God the creator. In Thomas’ view, God the creator is provident over, that is, governs, his creation (see, for example, ST Ia. q. 22, aa. 1-2). Since God is perfect Being and Goodness itself (see, for example, ST Ia. q. 4, a. 2; and ST Ia. q. 13, a. 2, respondeo), God’s governing of the universe is perfectly good, and so God’s idea of how the universe should be is a rational command for the sake of the common good of the universe.

How does God promulgate the eternal law? God communicates the eternal law to creatures in accord with their capacity to receive it. Now, God’s eternal law is not distinct from God, but God is perfection itself. Therefore, God communicates Himself, that is, perfection itself, to creatures insofar as this is possible, that is, insofar as God creates things as certain reflections of God’s own perfection.

For example, God communicates His perfection to non-rational, non-living creatures insofar as God creates each of these beings with a nature that is inclined to perfect itself simply by exhibiting those properties that are characteristic of its kind. A carbon atom, for instance, reflects the divine perfection (and so has God’s eternal law communicated to it) insofar as God gives a carbon atom a nature such that it tends to exhibit the properties characteristic of a carbon atom, for example, being such that it can form such and such bonds with such and such atoms, and so forth. God communicates the eternal law to plants insofar as God creates plants with a nature such that they not only tend to exhibit certain properties, each of which is a certain limited reflection of the Creator, but also insofar as plants are inclined by nature to perfect themselves by nourishing themselves, growing, and maturing so as to contribute to the perpetuation of their species through reproduction. Non-rational animals, of course, have all of these perfections plus the added perfection of being conscious of other things, thereby having the eternal law communicated to them in an even more perfect sense than in the case of non-living things and plants. Finally, rational creatures, whether human beings or angels, have the eternal law communicated to them in the most perfect way available to a creature, that is, in a manner analogous to how human beings promulgate the law to other human beings, namely, insofar as they are self-consciously aware of being obligated by said law. In other words, God gives rational creatures a nature such that they can naturally come to understand that they are obligated to act in some ways and refrain from acting in other ways. This reception of the law by rational creatures is what Thomas calls the natural (moral) law (see, for example, ST IaIIae. q. 91, a. 2, respondeo).

2. The Natural Law

More specifically, by natural law Thomas understands that aspect of the eternal law that has to do with the flourishing of rational creatures insofar as it can be naturally known by rational creatures—in contrast to that aspect of the eternal law insofar as it is communicated by way of a divine revelation. (In this section, we are interested in natural law only insofar as it is relevant for the development of a political philosophy; for the importance of natural law where moral knowledge is concerned, see the discussion of that topic in the ethics section above.) To put this another way, the natural law implies a rational creature’s natural understanding of himself or herself as a being that is obligated to do or refrain from doing certain things, where he or she recognizes that these obligations do not derive their force from any human legislator. As we saw Martin Luther King Jr. say above, there are some moral laws that constitute the foundation of any just human society; if such laws are transgressed, or legislated against, we act or legislate unjustly. This set of moral laws that transcends the particularities of any given human culture is what Thomas and King call the natural law.

There is another way to think about natural law in the context of politics that is commensurate with what was said above. As in the case of all creatures, the nature possessed by human beings represents a certain way of participating in God, a certain finite degree of perfection that is therefore limited and imperfect in comparison to God’s absolute, infinite perfection. As Thomas famously says in one place, “The natural law is nothing else than the rational creature’s participation of the eternal law” (ST IaIIae. q. 91, a. 2, respondeo). Now, like all created beings, human beings are naturally inclined to perfect themselves, since their nature is an image of the eternal law, which is absolutely perfect. One way in which all creatures show that they are creatures, that is, created by Perfection itself, is in their natural inclination toward perfecting themselves as members of their species. However, human beings are rational creatures and rational creatures participate in the eternal law in a characteristic way, that is, rationally; since the perfection of a rational creature involves knowing and choosing, rational creatures are naturally inclined to know and to choose, and to do so well. In addition, like other animals, human beings must move themselves (with the help of others) from merely potentially having certain perfections to actually having perfections that are characteristic of flourishing members of their species. Although everything is perfect to some extent insofar as it exists (since existence itself is a perfection that reflects Being itself), actually possessing a perfection P is a greater form of perfection than merely potentially possessing P. Therefore, the natural law is a human being’s natural understanding of his or her inclination to perfect himself or herself according to the kind of thing he or she naturally is, that is, a rational, free, social, and physical being.
Thus, we know naturally that we should act rationally, protect life, educate our children, increase liberty for ourselves and others, work for the common good of the community, and, given the precept to act rationally, apply all these principles in a rational manner, a manner that reflects a natural understanding that we are animals of a certain sort. We therefore are naturally inclined to pursue those goods that are consistent with human flourishing, as we understand it, that is, the flourishing of a rational, free, social, and animal being. Insofar as we see that a particular activity or apparent good promotes human flourishing, we conclude that it is a real good for us, a good we can, or ought to, seek. Insofar as we see that a particular activity or apparent good undermines human flourishing, we conclude that such an activity or apparent good is something bad and so should not be sought, but rather avoided.

3. The Divine Law

The chief reason the natural law is called natural is that it is that aspect of the eternal law that rational creatures can (given the right sort of circumstances) discern to be true by unaided human reason, that is, apart from a special divine revelation. What human beings can know of God’s eternal law only by way of a special divine revelation from God is what Thomas calls divine law (ST IaIIae. q. 91, a. 4, respondeo and ad2). Thomas also contrasts the divine law with the natural law by noting that the natural law directs us to perform those actions we must habitually perform if we are to flourish in this life as human beings (what Thomas calls our natural end, that is, our end qua created). The divine law, on the other hand, directs us to perform actions that are proportionate to living an eternal life with God (what Thomas calls our supernatural end, that is, our end qua grace and glory). It is not as though the natural law is irrelevant where our supernatural end is concerned since, as Thomas often says, “grace perfects nature; it does not destroy it” (see, for example, ST Ia. q. 1, a. 8, ad2). Therefore, living in a manner that violates the natural law is inconsistent with a human being’s achieving his or her supernatural end too. That being said, to live merely in accord with the natural law is not proportionate to the life that human beings live in heaven, which life, by the grace of God, human beings can, in a limited sense, begin to live even in this life. Thus, one reason God gives the divine law is to instruct human beings about which acts are proportionate to a supernatural life, that is, flourishing in heaven, so as to make human beings fit for heaven (see, for example, ST IaIIae. q. 91, a. 4, respondeo).

4. Human Law and its Relation to Natural Law

Thomas develops his account of human law by way of an analogy (see ST IaIIae. q. 91, a. 3). He posits that the human law is to the natural law what the conclusions of the speculative sciences (for example, metaphysics and mathematics) are to the indemonstrable principles of those sciences. Just as all science begins from premises the truth of which cannot themselves be demonstrated, for example, the law of non-contradiction, and proceeds by the work of reason to particular conclusions, so, in practical matters (such as politics), authorities begin with the knowledge of indemonstrable precepts, for example, that good should be rewarded and evil punished and that the punishment must fit the crime, and proceed to apply those precepts in light of the particular circumstances, needs, and realities of the communities of which they are the rightful leaders. These particular practical applications of the natural law, as long as they meet the conditions of law, have the force of law. Such laws Thomas calls human laws. For example, the relevant authorities in community A might decide to enact a law that theft should be punished as follows: the convicted thief must return all that was stolen and refrain from going to sea for one day for each ducat that was stolen. On the other hand, community B enacts the following law: the thief will be imprisoned for up to one day for each dollar stolen.

Thomas would want us to notice a couple of things about these human laws. First, neither of these laws follows logically from the precepts of the natural law. Just as one cannot deduce empirical truths from the law of non-contradiction alone, one cannot deduce human laws simply from the precepts of the natural law. That being said, the natural law functions as a kind of control on what can count as a legitimate (morally and legally binding) law. Just as any scientific theory that contradicts itself is not a good theory, although a number of proposed theories meet this minimal condition of rationality, so no binding law contradicts the precepts of the natural law, although there may be any number of proposed human laws that are consistent with the natural law.

Second, notice that the human laws addressing the appropriate punishment of thievery mentioned above reflect the circumstances in which the members of those communities find themselves. For example, say the members of community A belong to a society where sea-faring is important, and so restriction of such sea-faring is appropriately painful. On the other hand, the members of community B, say, do not live in circumstances where it is so important to travel at sea, and so the punishment for thievery reflects that. Some human laws, Thomas thinks, will be different in different times and places, if only because they are enacted in times and places where there are different geographical, moral, political, and religious circumstances and needs.

b. Authority: Thomas’ Anti-Anarchism

Unlike some political philosophers, who see the need for human authority as, at best, a consequence of some moral weakness on the part of human beings, Thomas thinks human authority is logically connected with the natural end of human beings as rational, social animals. Thomas, therefore, rejects anarchism in all of its forms, and he does so for philosophical reasons. Human authority is in itself good and is necessary for the good life, given the kind of thing human beings are. One place where we can see clearly that Thomas holds this position is in his discussion of what human life would have been like in the Garden of Eden had Adam and Eve (and their progeny) not fallen into sin.

In a section of ST where he is discussing what life was (and in some cases would have been) like for the first human beings in the state of innocence, that is, before the Fall, Thomas entertains questions about human beings as authorities over various things in that state of innocence (Ia. q. 96). Particularly relevant for our purposes are articles three and four.

In article three, Thomas asks whether all human beings would have been equal in the state of innocence. Thomas answers this question by saying, “In some senses, human beings would have been equal in the state of innocence, but in other senses, they would not have been equal.” Thomas thinks human beings would have been equal, that is, the same, in the state of innocence in two significant senses: (a) all human beings would have been free of defects in the soul, for example, all human beings would have been equal in the state of innocence insofar as none would have sinned, and (b) all human beings would have been free of defects in the body, that is, no human beings would have experienced bodily pain, suffered disease, and so forth in the state of innocence. It is worth mentioning that Thomas believes that the state of innocence was an actual state of affairs, even if it probably did not last very long. However, it certainly could have lasted a long time. In fact, assuming Adam and Eve and their progeny chose not to sin, the state of innocence could have been perpetual or could have lasted until God translated the whole human race into heaven (see, for example, ST Ia. q. 102, a. 4, respondeo).

Interestingly, Thomas thinks that there are a number of different ways in which human beings would have been unequal (by which he simply means, not the same) in the state of innocence. First of all, since God intended there to be families in the state of innocence, some would have been male and others female, since human sexual reproduction, which was intended by God in the state of innocence, requires diversity of the sexes. In addition, some people would have been older than others, since children would have been born to their parents in the state of innocence.

Second, there would have been inequalities having to do with the souls of those in the state of innocence. For example, although none would have had a defect in the soul, some would have had more knowledge or virtue than others. Thomas mentions the following sort of reason: those in the state of innocence had free choice of the will. Thus, some would have freely chosen to make a greater advance in knowledge and virtue than others. In addition, although the first human persons were created with knowledge and all the virtues, at least in habit (see ST Ia. q. 95, a. 3), those born as children in paradise would not have had knowledge and the virtues, being too young (ST Ia. q. 101, aa. 1 and 2). Therefore, adult human persons in the state of innocence would have had more knowledge and virtue than children born in paradise.

Third, since human bodies would not have been exempt from the influence of the laws of nature, the bodies of those in paradise would have been unequal, for example, some would have been stronger or more beautiful than others, although, again, all would have been without bodily defect. Since those in the state of innocence have the virtues—or at the very least, have no defects in the soul—such disparity in knowledge, virtue, bodily strength, and beauty among those in paradise would not have necessarily occasioned jealousy and envy.

In the fourth article in this question on authority in the state of innocence, Thomas asks whether some human beings would be master of other human beings in the state of innocence. In answering this question, Thomas distinguishes two senses of “mastership.” First, there is the sense of “mastership” that is involved in the master/slave relationship. Second, there is a broader sense of “mastership” where one person is in authority over another, for example, a father in relation to his child.

Thomas argues that “mastership” in the first sense would not exist in the state of innocence. According to Thomas, a slave is contrasted with a politically free person insofar as the slave, but not the free person, is compelled to yield to another something he or she naturally desires, and ought, to possess himself or herself, namely, the liberty to order his or her life according to his or her own desires, insofar as those desires are in accord with reason. This provides Thomas with two reasons for thinking there would be no slavery in the state of innocence. First, since all persons naturally desire political freedom, not having it would be painful. However, there is no pain in the state of innocence. Second, all persons ought to enjoy political freedom. Slaves do not have it. However, there is no sin in the state of innocence. Therefore, there is no “mastership” in the state of innocence that implies the existence of slavery.

Nonetheless, Thomas argues there would have been human authorities, that is, some human beings governing others, in the state of innocence. Why? Thomas offers two reasons. He begins from the belief that human beings are by nature rational and social creatures, and so would have led a social life with other human beings, ordered by reason, in the state of innocence. This means that, in the state of innocence, human beings would seek not just their own good but the common good of the society of which those individuals are a part. However, where there are many reasonable individuals, there will be many reasonable but irreconcilable ideas about how to proceed on a variety of different practical matters. For the sake of the common good, there must therefore be those who have the authority to decide which of many reasonable and irreconcilable ideas will have the force of law in the state of innocence. Therefore, there would have been some human beings in authority over other human beings in the state of innocence.

Thomas’ second reason that there would have been human authorities in the state of innocence has him drawing on positions he established in ST Ia. q. 96, a. 3. Recall that he argues there that human beings would have been unequal in the state of innocence insofar as some would have been wiser and more virtuous than others. However, it would be unfitting if the wiser and more virtuous did not share their gifts with others for the sake of the common good, namely, as those who have political authority. Given that (as Thomas believes) human beings are not born with knowledge and virtue, it seems obvious that this would have been true in the case of the relation between parents and their children. However, Thomas sees that human authorities would have been necessary and fitting at all levels of society.

Since law is bound up with authority for Thomas, what has been said about authority has an interesting consequence for Thomas’ views on law too. It is not essential to law that there be evil-doers. Given that human beings are rational and social creatures, that is, they were not created to live independently and autonomously with respect to other human beings, even a perfect human society will have human laws. (This also assumes that God has willed to share His authority with others; this is precisely what Thomas thinks; in fact, Thomas thinks that having authority over others is part of what it means to be created in the image of God.) Recall the definition of law: it says nothing about curbing appetites or protecting the innocent. In a world where the strong try to take advantage of the weak, law, of course, does do these things. However, the fact that law protects the weak from the strong is accidental to law for Thomas.

c. The Best Form of Government

Thomas thinks that a just government is one in which the ruler or rulers work(s) for the common good and not simply for the good of one class of citizens. In his view, there are a number of unmixed forms of government that are, in principle, legitimate or just, for example, kingship (regnum), that is, rule by one virtuous man, aristocracy, that is, rule by a few virtuous men, and polity, rule by a large number of citizens. Following Aristotle in Politics, book III, chapter 7, Thomas identifies three unjust forms of unmixed government that are opposed to these just forms, namely, tyranny, that is, rule by one man who looks after his own benefit rather than the common good, oligarchy, that is, rule by a few wealthy men who look after their own good rather than the common good, and democracy, rule by the many poor people for their own good rather than the common good (see, for example, De regno ad regem Cypri, I, ch. 2 [chapter 1 in some editions]).

Of the various just unmixed forms of government, Thomas thinks that a kingship is, in principle, the best form of government. He offers a number of arguments for this thesis. Consider just one of these. Thomas thinks the chief concern of a good ruler is to secure the unity and peace of the community. Therefore, the better a form of government is able to secure unity and peace in the community, the better is that form of government, all other things being equal. What itself has the nature of unity and peace is better able to secure unity and peace than what is many. However, kingship has the nature of unity and peace more so than rule by many men (whether or not these men are virtuous; recall from our discussion of authority above that Thomas does not think that a group of virtuous people will necessarily agree on a course of action). Therefore, all other things being equal, kingship is better able to secure unity and peace than rule by many. Therefore, kingship is the best unmixed form of government (De regno, book I, ch. 3 [ch. 2]; compare this argument with Thomas’ argument at SCG IV, ch. 76 that there needs to be one bishop, that is, the Pope, functioning as the visible head of the Church in order to secure the unity and peace of the Church).

Thomas is aware of the possibility that a good man can become a tyrant (De regno, book I, ch. 7 [ch. 6 in some editions]). Furthermore, since the contrary of the best is the worst, and tyranny is the contrary of kingship, tyranny is the worst form of government (De regno, book I, ch. 4 [ch. 3 in some editions]). Thomas therefore thinks kingship should be limited in a number of ways in order to ensure a ruler will not be(come) a tyrant.

First, in a limited kingship the king is selected by others who have the authority to do so (De regno, book I, ch. 7 [ch. 6]); such authorities should choose a king with a moral character such that it is unlikely he will become a tyrant. In one place Thomas speaks of an ideal situation where the king is selected from among the people (presumably for his virtue) and by the people (ST IaIIae q. 105, a. 1, respondeo). Second, in order to ensure the king does not become a tyrant, the government (and its constitution) should be written so as to limit the power of the king (De regno, book I, ch. 7 [ch. 6]). Finally, Thomas thinks kingship ideally should be limited in that the community has a right to depose or restrict the power of the king if he becomes a tyrant (De regno, book I, ch. 7 [ch. 6]). Although early in his career he seems to sanction tyrannicide (In Sent. Book II, d. 44, qu. 2, ad5), by the time he writes De regno (book I, ch. 7 [ch. 6]) Thomas rejects that view not only as imprudent, but also as inconsistent with the teaching of the Apostles (compare 1 Peter 2:19). Rather, those who have the authority to appoint the king have the authority and responsibility to depose him if need be (De regno, book I, ch. 7 [ch. 6]). If no human authorities can or are willing to help a community ruled by a tyrant, Thomas counsels that the people should have recourse to God. However, in doing so, they should first look to expiating their own sins, since God sometimes allows a people to be ruled by the impious as a punishment for sin (De regno, book I, ch. 7 [ch. 6]).

Notably, in one place in ST, Thomas argues that a certain kind of mixed government is really the best form of government (ST IaIIae. q. 105, a. 1, respondeo). Thomas notes there that both Aristotle (Politics, book iii) and divine revelation (Deuteronomy 1:15; Exodus 18:21; and Deuteronomy 1:13) agree that the ideal form of government combines kingship, aristocracy, and democracy insofar as one virtuous man rules as king, the king has a few virtuous men under him as advisors, and not only are all eligible to govern (the virtuous can come from the populace and not simply from the wealthy class), but all also participate in governance insofar as all participate in choosing who will be the king.

Thomas argues that this form of mixed government—part kingship, part aristocracy, and part democracy—is the best form of government as follows. As Aristotle states in Politics ii, 6, a form of government where all take some part in the government ensures peace among the people, commends itself to all, and is most enduring. However, a form of government that ensures peace among the people, commends itself to all, and is most enduring is, all other things being equal, the best form of government. Therefore,

(G1) A form of government where all take some part in the government is, all other things being equal, the best form of government.

However, given the soundness of the kind of argument for the superiority of kingship as a form of government we noted above, and the importance of virtuous politicians for a good government, we have the following:

(G2) The best non-mixed form of government is kingship.

(G3) The second-best form of non-mixed government is an aristocracy.

However, there is a mixed form of government (call it a limited kingship or limited democracy) that is part kingship, since a virtuous man presides over all, part aristocracy, since the king takes to himself a set of virtuous advisors and governors, and part democracy, since the rulers can be chosen from among the people and the people have a right to choose their rulers.

However, there is no form of government other than a limited kingship or limited democracy that takes the truths of (G1), (G2), and (G3) into account. Therefore, the best form of government is a limited kingship or limited democracy. Thus, interestingly, we have in Thomas a 13th-century theologian advocating for a limited form of democracy as the best form of government.

10. References and Further Reading

a. Thomas’ Works

Thomas authored an astonishing number of works during his short life. Other than the first entry below, which cites the ongoing project of providing a critical edition of Thomas’ Opera Omnia (entire body of work), the entries mentioned here are those works of Thomas’ cited in the body of this article. For a complete list of Thomas’ works, see Torrell 2005, Stump 2003, or Kretzmann and Stump 1998.

  • Opera Omnia (Complete Works), 1248-1273. Ed. Leonine Commission, S. Thomae Aquinatis Doctoris Angelici. Opera Omnia. Iussu Leonis XIII, P.M. edita, Rome: Vatican Polyglot Press, 1882- (on-going).
  • De principiis naturae, ad fratrem Sylvestrum (On the Principles of Nature, for Brother Sylvester), 1248-1252 or 1252-1256.
    • English translation: Eleonore Stump and Stephen Chanderbhan, trans. In The Hackett Aquinas: Basic Works. Jeffrey Hause and Robert Pasnau, eds. (Indianapolis: Hackett Publishing Company, 2014), pp. 2-13.
  • De ente et essentia, ad fratres et socios suos (On Being and Essence, for His Brothers and Companions), 1252-1253.
    • English translation: Peter King, trans. In The Hackett Aquinas: Basic Works. Jeffrey Hause and Robert Pasnau, eds. (Indianapolis: Hackett Publishing Company, 2014), pp. 14-35.
  • Scriptum super libros Sententiarum (Commentary on [Lombard’s] Sentences), 1252-1256.
  • Quaestiones disputatae de veritate (Disputed Questions on Truth), 1256-1259.
    • English translation: Mulligan, Robert W., James V. McGlynn, and Robert W. Schmidt, trans. Truth. 3 vols. Library of Living Catholic Thought (Chicago: Regnery, 1952-1954; reprint, Indianapolis: Hackett, 1994).
  • Beata gens (Sermon on the Feast of All Saints, the First of November), ca. 1256-1259 or 1268-1272?
    • English translation: Mark-Robin Hoogland, trans. In The Fathers of the Church: Medieval Continuation. Vol. II. Thomas Aquinas: The Academic Sermons (Washington, DC: The Catholic University of America Press, 2010), pp. 295-312.
  • Expositio super librum Boethii De trinitate (Commentary on Boethius’ De trinitate), 1257-1258 or 1259 (incomplete).
    • English translation: Maurer, Armand, trans. Faith, Reason and Theology: Questions I-IV of his Commentary on the De Trinitate of Boethius. Mediaeval Sources in Translation, 32 (Toronto: Pontifical Institute of Mediaeval Studies, 1987). Maurer, Armand, trans. The Division and Methods of the Sciences: Questions V and VI of his Commentary on the De Trinitate of Boethius. 4th rev. ed. Mediaeval Sources in Translation, 3 (Toronto: Pontifical Institute of Mediaeval Studies, 1986).
  • Summa contra gentiles (Synopsis [of Christian Doctrine] Directed against Unbelievers) [SCG], 1259-1265.
    • English translation: Pegis, Anton C., James F. Anderson, Vernon J. Bourke, and Charles J. O’Neil, trans. Summa contra gentiles (1955; reprint, Notre Dame, IN: University of Notre Dame Press, 1975).
  • Glossa continua super Evangelia (Catena aurea) (A Continuous Gloss on the Evangelists [collected from the writings of the Church Fathers]), 1262-1265 (Matthew); 1265-1268 (Mark, Luke, and John).
    • English translation: M. Pattison, J. D. Dalgairns, and T. D. Ryder, trans. John Henry Newman, ed. 4 vols. (1841-1845; reprint, Boonville, NY: Preserving Christian Publications, 2009).
  • Expositio super Iob ad litteram (Literal Commentary on Job), 1263-1265.
    • English translation: Yaffe, Martin D., and Anthony Damico, trans. The Literal Exposition on Job: A Scriptural Commentary Concerning Providence. Classics in Religious Studies, 7 (Atlanta, GA: Scholars Press, 1989).
  • Expositio et lectura super Epistolas Pauli Apostoli (Commentary and lectures on the Epistles of Paul the Apostle), 1263-1265 (1 Cor. 11—Philemon); 1271-1272 (Romans), and 1272-1273 (Hebrews).
    • English translations: multiple.
  • Officium de festo Corporis Christi ad mandatum Urbani Papae (The Office of the Feast of the Body of Christ, Commissioned by Pope Urban), 1264.
    • English translation: The Aquinas Prayer Book: The Prayers and Hymns of St. Thomas Aquinas, R. Anderson and J. Moser, trans. (Manchester, NH: Sophia Institute Press, 2000).
  • Adoro te devote (Hymn) (Humbly I Adore Thee), 1264 or 1274.
    • English translation: The Aquinas Prayer Book: The Prayers and Hymns of St. Thomas Aquinas, R. Anderson and J. Moser, trans. (Manchester, NH: Sophia Institute Press, 2000).
  • Quaestiones disputatae de potentia (Disputed Questions on [the] Power [of God]), 1265-1266.
    • English translation: The English Dominican Fathers, trans. (1932; reprint, Eugene, OR: Wipf and Stock, 2004).
  • Compendium theologiae, ad fratrem Reginaldum socium suum (A Compendium of Theology, for Brother Reginald, his Companion), 1265-1267 (incomplete).
    • English translation: Vollert, Cyril, trans. Light of Faith: The Compendium of Theology (1947; reprint, Manchester, NH: Sophia Institute, 1993).
  • Expositio super librum Dionysii De divinis nominibus (Commentary on Pseudo-Dionysius’ De divinis nominibus), 1265-1268.
    • English translation: Marsh, Harry C., trans. “A Translation of Thomas Aquinas’ In Librum beati Dionysii de divinis nominibus expositio.” In his “Cosmic Structure and the Knowledge of God: Thomas Aquinas’ In Librum beati Dionysii de divinis nominibus expositio,” 265–549. Ph.D. diss. (Vanderbilt University, 1994).
  • Summa theologiae (Synopsis of Theology) [ST], 1265-1268 (Prima Pars); 1271 (Prima Secundae); 1271-1272 (Secunda Secundae), and 1271-1273 (Tertia Pars) (incomplete).
    • English translation: Fathers of the English Dominican Province, trans. (1911; reprint, Allen, TX: Christian Classics, 1981).
  • Quaestiones disputatae de anima (Disputed Questions on the Soul), 1266-1267.
    • English translation: Robb, James H., trans. Questions on the Soul. Mediaeval Philosophical Texts in Translation, 27 (Milwaukee: Marquette University Press, 1984).
  • De regno [or De regimine principum], ad regem Cypri (On Kingship [or On the Governance of Rulers], for the King of Cyprus), 1266-1267.
    • English translation: Phelan, Gerald B., and I.T. Eschmann, trans. On Kingship to the King of Cyprus. Mediaeval Sources in Translation, 2 (Toronto: Pontifical Institute of Mediaeval Studies, 1949).
  • Sententia super De anima (Commentary on Aristotle’s De anima), 1267-1268.
    • English translation: Pasnau, Robert C., trans. Commentary on Aristotle’s De anima (New Haven: Yale University Press, 1999).
  • Expositio Libri Physicorum (Commentary on Aristotle’s Physics), 1268-1270.
    • English translation: Blackwell, Richard J., Richard J. Spath, and W. Edmund Thirlkel, trans. Commentary on Aristotle’s Physics. Rare Masterpieces of Philosophy and Science (New Haven: Yale University Press, 1963; reprint, Aristotelian Commentary Series. Notre Dame, IN: Dumb Ox Books, 1999).
  • Quaestiones disputatae de malo (Disputed Questions on Evil), 1269-1271.
    • English translation: Trans. Jean Oesterle (Notre Dame, IN: The University of Notre Dame Press, 1995).
  • Expositio Libri Peryermenias (Commentary on Aristotle’s De interpretatione), 1270-1271.
    • English translation: Oesterle, Jean, trans. Aristotle on Interpretation: Commentary by St. Thomas and Cajetan. Mediaeval Philosophical Texts in Translation, 11 (Milwaukee: Marquette University Press, 1962. Reprinted, with a new introduction, as Commentary on Aristotle’s On Interpretation, Notre Dame, IN: Dumb Ox Books, 2004).
  • Sententia super Metaphysicam (Commentary on Aristotle’s Metaphysics), 1270-1273.
    • English translation: Rowan, John P., trans. Commentary on the Metaphysics of Aristotle. 2 vols (Chicago: Regnery, 1964; reprinted in one volume with revisions as Commentary on Aristotle’s Metaphysics, Aristotelian Commentary Series, Notre Dame, IN: Dumb Ox Books, 1995).
  • De aeternitate mundi, contra murmurantes (On the Eternity of the World against Murmurers), 1271.
    • English translation: In St. Thomas, Siger de Brabant, and St. Bonaventure, On the Eternity of the World, Cyril Vollert, Lottie Kenzierski, and Paul M. Byrne, trans. Mediaeval Philosophical Texts in Translation, 16 (Milwaukee: Marquette University Press, 1964).
  • Sententia libri Ethicorum (Commentary on Aristotle’s Nicomachean Ethics), 1271-1272.
    • English translation: Litzinger, C.I., trans. Commentary on the Nicomachean Ethics. 2 vols. Library of Living Catholic Thought (Chicago: Regnery, 1964; reprinted in 1 vol. with revisions as Commentary on Aristotle’s Nicomachean Ethics, Aristotelian Commentary Series, Notre Dame, IN: Dumb Ox Books, 1993).
  • Expositio super librum Boethii De hebdomadibus (Commentary on Boethius’ De hebdomadibus), 1271-1272?
    • English translation: Schultz, Janice L., and Edward A. Synan, trans. An Exposition of the ‘On the Hebdomads’ of Boethius. Thomas Aquinas in Translation (Washington, DC: The Catholic University of America Press, 2001).
  • Expositio super librum De causis (Commentary on Liber de causis), 1272-1273.
    • English translation: Guagliardo, Vincent A., Charles R. Hess, and Richard C. Taylor, trans. Commentary on the Book of Causes. Thomas Aquinas in Translation (Washington, DC: The Catholic University of America Press, 1996).

b. Secondary Sources and Works Cited

The secondary literature on Thomas is vast. Here follow just a few important studies of Thomas’ thought in English that will be particularly helpful to someone who wants to learn more about Thomas’ philosophical thought as a whole. Also included in this section are works cited within the article (other than Thomas’ own).

  • Artigas, Mariano. The Mind of the Universe: Understanding Science and Religion (Philadelphia: Templeton Foundation Press, 2000).
  • Chesterton, G. K. The Dumb Ox (New York: Image Books, 1956).
    • Originally published in 1933, this is a wryly written study by the famous English journalist that attempts to convey the spirit and significance of Thomas’ thought. The eminent 20th-century Thomas scholar Etienne Gilson once called it “the best book ever written on St. Thomas.” The book is readily available in many different editions.
  • Clarke, W. Norris. The One and the Many: A Contemporary Thomistic Metaphysics (Notre Dame: University of Notre Dame Press, 2001).
    • An excellent attempt to articulate Thomas’ metaphysical views in light of the phenomenological and personalist traditions of 20th-century philosophy.
  • Copleston, F.C. Aquinas (London: Penguin Books, 1955).
    • A still classic study that attempts to explain Thomas’ views with an eye toward analytic philosophical idioms.
  • Davies, Brian. The Thought of Thomas Aquinas (Oxford: Clarendon Press, 1992).
    • A clear and philosophically interesting summary of Thomas’ theological and philosophical thought, one that follows the structure of Thomas’ Summa theologiae.
  • Davies, Brian and Eleonore Stump, eds. The Oxford Handbook of Aquinas (Oxford: Oxford University Press, 2012).
    • A recent and excellent collection of scholarly articles on all aspects of Thomas’ thought.
  • Eberl, Jason. The Routledge Guidebook to Aquinas’ Summa Theologiae (London: Routledge, 2015).
    • A close reading and explanation of the philosophical views contained in Thomas’ greatest work.
  • Feser, Edward. Aquinas: A Beginner’s Guide (Oxford: Oneworld, 2009).
    • Despite the title, this is a sophisticated, very readable, articulation and defense of ideas central to Thomas’ thought.
  • Gilson, Etienne. The Christian Philosophy of St. Thomas Aquinas. Trans. L. K. Shook (1956; reprint, Notre Dame, IN: University of Notre Dame Press, 1994).
    • A classic study by the famous 20th-century Thomist and scholar of medieval philosophy. Among other things, Gilson argues that Thomas’ concept of actus essendi is the key to understanding his thought and its unique contribution to the history of Western philosophy.
  • King, Jr., Martin Luther. “Letter from the Birmingham Jail,” in Why We Can’t Wait (New York: Signet Books, 1963).
  • Kretzmann, Norman and Eleonore Stump, eds. The Cambridge Companion to Aquinas (Cambridge: Cambridge University Press, 1993).
    • An excellent collection of scholarly introductions to all the major facets of Thomas’ thought.
  • Pieper, Josef. A Guide to Thomas Aquinas. Trans. Richard and Clara Winston (San Francisco: Ignatius, 1991).
    • Gives a helpful introduction to Thomas’ thought by way of clearly presenting the historical context in which Thomas lived and taught.
  • Plantinga, Alvin. Warranted Christian Belief (New York: Oxford University Press, 2000).
  • Rota, Michael W. “What Aristotelian and Thomistic philosophy can contribute to Christian theology,” in Theology and Philosophy: Faith and Reason, eds. O. Crisp, G. D’Costa, M. Davies, and P. Hampson (London: T & T Clark, 2012), pp. 102-115.
  • Stump, Eleonore. Aquinas. Arguments of the Philosophers (London: Routledge, 2003).
    • A detailed presentation of Thomas’ philosophical thought, one that articulates and defends Thomas’ views in light of contemporary analytic philosophical discussions in metaphysics, epistemology, the philosophy of religion, the philosophy of mind, and ethics.
  • Torrell, Jean-Pierre. Aquinas’s Summa: Background, Structure, and Reception. Trans. Benedict M. Guevin (Washington, DC: The Catholic University of America Press, 2005).
    • Helpfully explains the context, content, and the history of the reaction to Thomas’ greatest work.
  • Van Inwagen, Peter. Metaphysics. 4th ed. (Boulder, CO: Westview Press, 2015).

c. Bibliographies and Biographies

  • Ingardia, Richard. Thomas Aquinas: International Bibliography 1977-1990 (Bowling Green, KY: The Philosophical Documentation Center).  
  • Kretzmann, Norman and Eleonore Stump. “Aquinas, Thomas,” in The Routledge Encyclopedia of Philosophy. Vol. 1. Edward Craig, ed. (London: Routledge, 1998), pp. 326-350.
    • A scholarly, concise, and very informative account of Thomas’ life and works. Also contains a good bibliography.
  • Miethe, T. L. and Vernon Bourke. Thomistic Bibliography 1940-1978 (Westport, CT: Greenwood Press, 1980).
  • Torrell, Jean-Pierre. Saint Thomas Aquinas: The Person and His Work. Trans. Robert Royal. Revised Edition (Washington, DC: The Catholic University of America Press, 2005).
    • The most up-to-date, scholarly, book-length treatment of Thomas’ life and works.
  • Tugwell, Simon. Albert and Thomas: Selected Writings. The Classics of Western Spirituality (Mahwah, NJ: Paulist Press, 1988).
    • The introduction to this work contains a concise and helpful account of Thomas’ life and works.
  • Weisheipl, J. Friar Thomas D’Aquino: His Life, Thought, and Works (Washington, DC: The Catholic University of America Press, 1983).
    • A classic study, which is nonetheless superseded by Torrell 2005.

 

Author Information

Christopher M. Brown
Email: chrisb@utm.edu
University of Tennessee at Martin
U. S. A.

Scotus: Knowledge of God

Any discussion of John Duns Scotus (1266–1308) on our knowledge of God has to be a discussion of Scotus’s thesis that we have concepts univocal to God and creatures. By this, Scotus means that some one idea can equally represent both God and other types of things. This is striking even to modern ears and was perhaps more so for Scotus’s contemporaries. There are religious objections. Some call Scotus an idolater. But beyond this, as Scotus himself pointed out, the metaphysical ramifications of his thesis threaten to “destroy all philosophy.” By this, he means Aristotle’s thought, which did much to set the philosophical terrain of the thirteenth century. For Aristotle, words that refer to things that are different yet somehow related are analogical, words like ‘healthy’ said of both persons and medicine. Medievals adopted Aristotle’s scheme to make sense of the meaning of religious language, which uses words like ‘good’ to talk about God and creatures. For thinkers in the Latin West, concerns did not focus so much on whether God talk is analogical as they did on exactly what type of analogy was at play. Imagine the reception when Scotus insisted that analogy (and with analogy, religious language) in fact rests tacitly on concepts univocal to God and creatures. In an Aristotelian universe, this would seem to require that God and creatures really do have something in common, that they differ only in kind, like cats and persons. But everyone, including Scotus, agreed that this was not so. Hence, Scotus and Aristotle would seem to be irremediably at odds with one another and Scotus would indeed destroy all philosophy. As bad as things seem for Scotus, these difficulties recede in light of the fact that his univocity thesis is about religious language, not things. Yes, we can think of radically distinct types of things using just one concept, but this does not mean that they really share in some feature. Thoughts and things need not line up that neatly. This is the way that it is for Scotus. The univocal concepts that represent God and creatures are high level abstractions, mental constructs formed through experience and conceived apart from the limits that attended the things that gave rise to them. These concepts are sufficiently vague to conceive God and creatures, provided that we see the concepts for the abstractions that they are. They really do not refer to anything, because every being is either finite (creatures) or infinite (God), and this makes all the difference. Scotus recognizes that the complex concept formed when a univocal concept is linked with the concept of God’s infinite being is about something that is metaphysically distinct from any creature. But the genesis of the concept lies in concepts that are of creatures, and creatures imitate God as effects imitate their causes. Therefore, to the extent that imperfect creatures imperfectly imitate the perfect creator, univocal concepts are of God. But only to this extent, which medieval thinkers, including Scotus, agree falls far short of the perfection of the divine essence.

Table of Contents

  1. Introduction
  2. Preliminaries
    1. Scotus’s Writings and Early Thought
    2. The Aristotelian Paradigm: Thirteenth-Century Categorial Metaphysics
  3. Contemporary Scholarship
  4. Scotus on our Natural Knowledge of God
  5. Univocity
    1. Univocity and Natural Theology
    2. Illumination Theory and Abstraction
    3. Analogy and Univocity
  6. Metaphysics as Natural Theology
    1. Metaphysics and the Transcendentals
    2. Does Scotus “Destroy All Philosophy?”
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Introduction

John Duns Scotus (1266—1308) defends the existence of concepts univocal to God and creatures (on the medieval understanding of concepts, see below, section 2a), most importantly the concept of being. In so doing, Scotus helped to expand medieval thinkers’ understanding of the scope of metaphysics, which, following Aristotle, was conceived of as the science of being qua being or being as such (Metaphysics (Metaph.) 4.1). Because for Scotus being as such pertains to both God and creatures, he thinks of metaphysics as a natural theology (Wolter 1946). More specifically, Scotus believes that certain attributes characterize everything that exists (examples of such attributes include one, true and good) (Ordinatio (Ord.) I, d. 8, q. 3, Vat. IV). These transcendental attributes transcended the then generally accepted classification of types of beings (see below, sections 2b and 6a) and thereby apply to everything. Metaphysics is then for Scotus the science of the transcendentals.

Scotus’s work was vulnerable to a variety of objections on several fronts. First, on strictly philosophical grounds, medievals did not think that there was any concept broad enough to take in everything (see below, section 2b and King 2003). Again, were God similar to and different from creatures, God would be metaphysically complex. For medieval thinkers, this would mean that God is not God, but rather a contingent thing like any other and hence unable to function as the uncaused cause of all things. Scotus recognizes that his univocity thesis threatens “to destroy all philosophy” (Lectura (Lect.) 1, d. 3, n. 105) and is at pains to answer these various charges. His strategy is to insist that the univocity thesis is about concepts, not things (Lect. 1, d. 8, n. 129, Vat. XVII:46). More specifically, the informational or intensional content of concepts that are common to God and creatures does not involve the respectively infinite and limited being of either. The concepts really do not apply to anything until these considerations are introduced (or, strictly speaking, reintroduced in the case of creatures, whence concepts are derived). This introduction renders a composite concept comprising the univocal concept and a concept of degree (limited or unlimited as regards creatures or God, respectively). This composition does not alter the content of the univocal concept, which therefore carries the same content in each application and, with that, the unity of meaning that is requisite for it to support the types of sound proof that interest most theologians (see below, section 5a). As the thesis is about how we talk about things and not how they are, univocal terms can apply to all things and the fact that we use these terms to speak of God does not entail any real metaphysical complexity of the divine essence. But inasmuch as univocal concepts are drawn from creatures and creatures imperfectly represent or imitate God (Ord. I, d. 3, pt. 1, qq. 1-2, n. 56, Vat. III:38-39; pt. 2, q. 2, q. un., n. 294, Vat. III:179), univocal concepts are also of God (see below, sections 3 and 6b).

The conceptual apparatuses that Scotus deploys in support of his thesis regarding religious language, along with his construal of metaphysics as a natural theology, have deeply influenced the Western tradition. For instance, Scotus’s thesis that things that do not share in any real feature may fall under one univocal concept opens the door to William of Ockham’s nominalism, which allows for universal concepts absent common natures (see Summa Logicae I.14 and Klima 2010). Scotus’s influence is likewise seen in Francisco Suárez, David Hume, Immanuel Kant, Charles Sanders Peirce and Martin Heidegger, to name just a few.

Scotus’s earliest disciples were divided over how to understand the Subtle Doctor on the key issue of the univocity of being. Antonius Andreas (d. 1320 C.E.), for instance, looks for some reality common to God and creatures as the real basis for Scotus’s univocal concepts; whereas Peter of Navarre (d. 1347 C.E.) finds in Scotus a weak sense of univocity, whereby concepts univocal to God and creatures are such only inasmuch as they are so indifferent as to apply properly to neither. Peter thereby works to bring Scotus’s thought into line with the common opinion (which he links to Thomas Aquinas, d. 1274 C.E.) that denies that there are any concepts univocal to God and creatures, maintaining rather that theological discourse is analogical (Dumont 1992). This view recognizes that any idea that we have of God is proper rather to creatures, because ideas are in their genesis of creatures, not God. In light of this generally empiricist outlook, these theologians acknowledge that religious language is analogical. As creatures imitate or represent the creator, we are justified in assigning various perfections to God. But, creatures are limited and imperfect. Therefore, barring any special revelation, our grasp of God in this life must be inaccurate (see below, section 5c).

Debates of the kind that unfolded soon after his death, over whether Scotus held that concepts univocal to God and creatures pick out a reality common to both, persist. A consensus has nonetheless emerged that Scotus’s univocal concepts are vague abstractions that apply properly to neither God nor creatures absent certain modal considerations, relevant to finite and infinite being, that serve to delimit the scope of these concepts (Cross 2001). Scotus was led to this account by his belief that (1) analogy (and with analogy, religious language) tacitly relies on univocity (see below, section 5c) and (2) God and creatures do not share in any common reality.

Two facts hamper the efforts of scholars to present a clear picture of Scotus’s considered opinion on the univocity of being. His thought on the matter shifts over the course of his career, and his sudden passing in 1308 around the age of 43 left the bulk of his writings in a state of partially edited disarray that can obscure the development of his thought, thereby lending credence to conflicting readings. Recent scholarship has done much to ameliorate this difficulty, but an understanding of Scotus’s development must also be parsed along the lines of the Aristotelian categorial metaphysics that dictate his approach. Accordingly, section 2 of this essay is devoted to such preliminary considerations. Section 2a lays out a chronology of Scotus’s writings, presenting his early view on the univocity of being (which appears to have reflected the view then standard amongst Oxford scholars). This shows where Scotus’s mature thought on univocity is to be found and helps to resolve difficulties posed by the decidedly mixed and confused treatment that the univocity thesis receives in Scotus’s commentary on Aristotle’s Metaphysics. Section 2b is a study of the thirteenth-century Aristotelian categorial metaphysics that led medieval theologians to rely on analogy to speak of God and creatures and determined the trajectory of Scotus’s thought.

2. Preliminaries

a. Scotus’s Writings and Early Thought

What we take to be Scotus’s mature position on the univocity of the concepts by which we conceive both God and creatures is drawn primarily from his Ordinatio. (Scotus’s Ordinatio is a revised version of the lectures on the Sentences of Peter Lombard (d. 1160) that Scotus delivered in partial fulfillment of the curriculum of the faculty of Theology. Scotus was at work revising his Ordinatio up until his death.) Yet, Scotus’s thought on the univocity of concepts appears to shift over the course of his career. Whereas the Ordinatio presents Scotus’s considered opinion, other texts (such as his earliest, logical writings) conform to what we now know to be the standard, mid-thirteenth-century Oxford tradition that looks on the term ‘being’ as either equivocal (from the point of view of the logician, who deals with concepts as such, that is, independently of the entities they conceive) or analogous (for the metaphysicians and natural philosophers (that is, physicists), who consider mind-independent realities) (Pini 2005a). Hence there is a striking dissonance between early and late Scotus as regards the univocity of the concept of being, and we find Scotus in his early-period commentary on Aristotle’s Categories (construed by medievals as a work on the properties of terms) stating that the term ‘being’ is in fact equivocal or analogical, from the respective standpoints of logic, on the one hand, and metaphysics and physics, on the other:

This term ‘being’ is simply equivocal . . . . However an utterance that for the logician is simply equivocal . . . is analogous for the metaphysician or the natural philosopher . . . Thus . . . ‘being’ is posited by the metaphysician as analogous . . . But for the logician, it is simply equivocal (Questions on the “Categories” of Aristotle Q. 4, sect. 37-38, trans. mine). (Unless otherwise noted, all translations are my own.)

Medieval semiotic theories develop along the lines laid down by Aristotle in his logical treatise on the properties of statements (the Perihermenias), better known today by its Latin title De Interpretatione. The first chapter presents a semiotic that remained open to a variety of interpretations throughout the middle ages as to whether it construes words primarily as signs of ideas or of extramental things. Scotus allows that there are good reasons to hold either opinion and leaves open the matter (Quaestiones in libros Perihermenias Aristotelis 1.2.51; see also Buckner and Zupko 186-87; Read 16-18). Aristotle’s account is as follows:

Spoken sounds are symbols of affections in the soul, and written marks symbols of spoken sounds . . . What these [that is, spoken sounds] are in the first place signs of – affections in the soul – are the same for all; and what these affections are likenesses of – actual things – are also the same (trans. Ackrill 16a3-8).

Affections in the soul, or concepts, are for medievals the components of mental propositions. Ultimately, concepts are traceable to impressions that things make on our minds and hence the medieval account is an empiricist one. Concepts are likenesses of the entities that they represent and are thereby that by which things are cognitively present to one who conceives. The property of a term whereby it brings to mind a concept is described by medieval thinkers as the term’s signification (Read 9-10).

Logic, for Scotus, considers, among other things, the signification of terms. Signification should not be understood in light of our contemporary notion of meaning, which allows us to speak loosely of the meaning of a term as what it brings to mind. The meaning of a term can be looked up, whereas an expression’s signification is ultimately traceable to something. Again, Scotus conceives of signification as a mental event, a state (or, in scholastic terminology, derived from Aristotle, an ‘accident’) of the mind through which an entity is cognitively present for the person who is conceiving (Pini 2015). On this view, significative utterances are said to be either univocal or equivocal depending on what concepts they bring to mind. Drawing from Aristotle’s account in Categories 1, univocal (or synonymous) things have a name and definition or account in common. By extension, a term is univocal over two or more entities when it signifies the same idea for each. Again, things are equivocal (or homonymous) when they only have a name in common, the corresponding definition or account differing in each case. Accordingly, an equivocal term signifies differently in its different applications. Hence the term ‘man’ is univocal to Socrates and Plato, whereas it is equivocal when applied to Socrates and a painted image. As this example suggests, equivocity is not always the result of happenstance, absent any real relation, as it is with, for instance, the word ‘bank’, which refers to financial institutions and spots along rivers. Borrowing from Boethius’s (d. c. 524 C.E.) influential commentary on Aristotle’s Categories, medievals termed these types of equivocity ‘deliberate (a consilio)’ and ‘chance (a casu)’, respectively. Analogy is an instance of deliberate equivocity. Interestingly, the Oxford tradition of the latter-thirteenth and early-fourteenth centuries does not allow that analogy derived from real relationships carries over to the semantic level. Other thinkers, such as Thomas Aquinas, who was active in Paris and Cologne, believe that the signification of analogous terms reflects real relations.

So, the term ‘healthy’ is analogous, signifying either an individual, her complexion or the medicine she takes:

This mode of community of idea is a mean between pure equivocation and simple univocation. For in analogies the idea is not, as it is in univocals, one and the same, yet it is not totally diverse as in equivocals; but a term which is thus used in a multiple sense signifies various proportions to some one thing; thus ‘healthy’ applied to urine signifies the sign of animal health, and applied to medicine signifies the cause of the same health (Summa Theologica (ST) Ia.13.5c) (All translations of ST are from the Fathers of the English Dominican Province).

The Oxford logicians, on the other hand, tend toward decoupling ontology and semantics. Terms that bring to mind different things (a person and her complexion) elicit discrete concepts and are therefore simply equivocal, regardless of the real relation that holds between the things signified. For instance, the term ‘being’ could be used to signify subsistent entities (substances) and inherent entities (accidents such as complexion that are parasitic upon substances for their existence). Substances and accidents stand in a real relationship to one another in reality; but, for the logician, this relationship is not reflected in the discrete concepts of each. The expression ‘being’ said of a substance signifies differently than the same expression said of an accident and hence, for the logician, the expression used in these ways is equivocal. For the metaphysician and the physicist, who are concerned primarily with things in the world, on the other hand, the expression is analogous. Perhaps this distinction between concepts and things proved influential for the later Scotus, inasmuch as the univocity thesis considers concepts apart from the real features of the extramental things that they conceive. Be that as it may, the influence of the Oxford tradition explains why early in his career Scotus holds that the term ‘being’ is equivocal in logic and analogous in metaphysics and physics.

In contrast to the logical writings’ clear rejection of the univocity of the concept of being, the account of the Questions on the Metaphysics of Aristotle is confused and indecisive. The unedited state of Scotus’s writings at the time of his early death is likely to blame and Giorgio Pini’s recent scholarship (2005a) persuasively casts the work as a conflation of early and late drafts that upon completion would have rendered an account in keeping with Scotus’s mature theory. Nevertheless, the impression remains that Scotus was never satisfied with the univocity thesis. In his Quodlibetal Questions (a record of Scotus’s participation in a public theological inquiry, composed near the end of Scotus’s life in either 1306 or 1307), Scotus states that it does not matter for the purpose of the investigation whether the concept of being at issue is analogical or univocal, provided that we allow that being is somehow common to God and creatures (14.39). Hence Steven Marrone (1983) suggests that perhaps Scotus was ready to give up univocity should anything better come along.

Owing at least in part to these conflicting claims and open questions, contemporary interpretations of Scotus’s thought on the univocity of concepts have reflected something of the dichotomy seen in Antonius and Peter over whether Scotus’s account entails a reality common to God and creatures (see above, section 1). And yet, with the recent completion of the critical editions of both Scotus’s Ordinatio (in 2013) and his philosophical writings (the Opera philosophica, in 2006), we are better able to chart the evolution of his thought on the univocity of the concept of being. And certain long-standing difficulties seem to have disappeared. The aforementioned logical writings along with Scotus’s earliest draft of the Questions on the Metaphysics endorse the denial of the univocity of being standard at the time and place of their composition. His Ordinatio, by contrast, reflects Scotus’s theological studies in Paris, where he was introduced to conceptual devices that led him to rethink the univocity of being. Scotus was at work revising his early writings in light of his mature position on univocity right up until his untimely death in 1308, hence the odd character of his Questions on the Metaphysics (Pini 2005a). Contemporary research then paints a developmental picture of Scotus’s thought. That said, as has been noted, there remains evidence that even late into his career Scotus was never completely satisfied with the univocity thesis, which threatened “to destroy all philosophy.” By ‘all philosophy’ Scotus means Aristotelian philosophy and more specifically the thirteenth-century categorial metaphysics that developed out of Aristotle’s thought. Section 2b is an account of this system.

b. The Aristotelian Paradigm: Thirteenth-Century Categorial Metaphysics

Medieval thinkers view Aristotle’s Categories as a work on the properties of terms. Under this broad classification, there was much debate over whether Aristotle was discussing: (1) mental acts or concepts; (2) linguistic entities; (3) extra-mental features or properties of the things about which we think and speak; or (4) words, concepts and properties, but in different ways (Gracia). Scholars divide the work into three parts: the Prepredicamenta (chapters 1-4), the Predicamenta (chapters 5-9) and the Postpredicamenta (chapters 10-15). (The Greek title of Categories is ‘katēgoriai’ meaning ‘predicates’, translated into Latin as ‘praedicamenta’.) The Prepredicamenta presents distinctions between homonyms and synonyms (see above, section 2a), substances and accidents and universal and particular terms (for example, ‘whiteness’ and ‘white’) as well as a list of things said (praedicamenta) without combination. The Predicamenta discusses these ‘things said’, the Postpredicamenta taking up a variety of tangential topics. Medieval debate over the subject matter of the Categories focused on the nature of the things said without combination, viz., substances and the nine categories of accidents. During the thirteenth century, it was generally agreed that the Categories treats words, concepts and things. For the metaphysician, the text presents a classification of extramental things (with substance most properly termed ‘being’ and the things in the nine categories of accidents derivatively so called), whereas the logician studies the work as a treatise on concepts of second intention. (Concepts of second intention are concepts derived by reflection on first-intention concepts, which are of things in the world. The concept species is a second intention concept.)

Each category (or highest genus) may be characterized as an ordered hierarchy of predicates (Ord. II, d. 3, p. 1, q. 4, n. 89). At the bottom of this hierarchy exists the individual substance or accident that is the ultimate locus of predication. Individuals are classed into species by means of differentiae (differences) that set species within the same category apart from one another (as, medievals believe, humans are set apart from other animals by virtue of the quality of rationality). Genera, in turn, can serve as species under still higher genera (animal, for instance, is a species of body – the animate type). The hierarchy culminates in a highest genus or category that is said of everything contained within it (as, for example, all things in the category of substance are termed ‘substances’). Things in different categories are primarily diverse, meaning that they are not classified in terms of any common predicate. Hence, the predicates that go into the definition of a particular type of thing within a particular category are particular to that category. Things in the same category, on the other hand, are different rather than diverse, meaning that (by virtue of belonging to the same category) they share in some predicate or predicates (as, for example, both humans and cats are sensitive, living, animate material substances) (Metaph. 10.3, 1054b13-32). As the predicates are category specific, univocity is not transcategorial; terms taken from one category apply only analogically to things in other categories, as, famously, when accidents (or inherent beings) receive the denomination ‘being’ by virtue of their relationship to the substances in which they inhere, of which ‘being’ is more properly said (Metaph. 4.2). (See above, section 2a.)

Recall that Scotus’s thesis regarding the univocity of religious language needs a concept of being broad enough to apply to everything, whence metaphysics as a natural theology studies the transcendental attributes that are coextensive with being as such (see above, section 1 and below, 6a). So characterized, Scotus’s project faces several interwoven difficulties. First, although metaphysicians of the Latin West had explicitly held that there exist transcendental attributes of being since the time of Philip the Chancellor (d. 1236 C.E.), they abided by the aforementioned restriction on univocity that it not be transcategorial and relied on analogy to secure an approximation of conceptual unity across the diverse genera (see above, section 2a and below, 5c). Hence, for Scotus’s system to get off the ground in an Aristotelian universe, he had to find a way to accommodate the transcategorial prohibition. The move to render being a super-genus or highest category over and above the ten highest genera and thereby secure univocity by denying that it is transcategorial would appear tempting, but it is ruled out, as the medievals agree with Aristotle that being cannot be a genus. Genera are predicated of the species and individuals that fall under them, but not of the differences that constitute these species (Topics 122b17-23). For instance, ‘animal’ is said of human beings but not of the rational quality that sets humans apart from other types of animals; for no quality that pertains to animal as such can serve to distinguish one type of animal from another. If, then, being were a genus, the differential qualities that constitute types of beings would not fall under the genus of being and hence would not exist. Differences do in fact exist and hence being is not a genus (Metaph. 3.3, 998b14-999a22).
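
The reductio in the preceding paragraph can be set out schematically. What follows is only a sketch: the letters G, D, P, E and b are notation assumed here for illustration and appear in neither Aristotle nor Scotus, and premise (2), that a genus is divided by differences, is left implicit in the text and supplied here as an assumption.

```latex
% Requires amsmath. Notation (assumed for illustration, not from the sources):
%   G(g):   g is a genus
%   D(d,g): d is a difference dividing g into species
%   P(g,x): g is predicated of x
%   E(x):   x exists
%   b:      being
\begin{align*}
&(1)\quad \forall g\,\forall d\,\bigl[(G(g)\land D(d,g))\rightarrow \lnot P(g,d)\bigr] && \text{Topics 122b17--23}\\
&(2)\quad G(b)\rightarrow \exists d\, D(d,b) && \text{assumed: a genus is divided by differences}\\
&(3)\quad \forall x\,\bigl[\lnot P(b,x)\rightarrow \lnot E(x)\bigr] && \text{what is not a being does not exist}\\
&(4)\quad \forall g\,\forall d\,\bigl[D(d,g)\rightarrow E(d)\bigr] && \text{differences do in fact exist}\\
&(5)\quad G(b) && \text{supposition for reductio}\\
&(6)\quad D(d_0,b) && \text{from (2), (5), for some } d_0\\
&(7)\quad \lnot P(b,d_0) && \text{from (1), (5), (6)}\\
&(8)\quad \lnot E(d_0) && \text{from (3), (7)}\\
&(9)\quad E(d_0) && \text{from (4), (6)}\\
&(10)\quad \lnot G(b) && \text{(8) contradicts (9); discharge (5)}
\end{align*}
```

On this reading, the weight of the argument falls on premise (1): it is because a genus is never said of its own differences that treating being as a genus would leave those differences outside of being altogether.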

Scotus avoids transcategorial predication without rendering being a genus by casting his univocity thesis as a semantic thesis:

It is plain, therefore, from what has been said that God and creatures are in reality wholly diverse, agreeing in no reality . . . and nevertheless they agree in one concept such that there may exist one concept common to God and creatures fashioned by an imperfect intellect (Lect. 1, d. 8, n. 129, Vat. XVII:46; see also Ord. I, d. 3, pt. 1, qq. 1-2, nn. 38-40, Vat. III:25-27; q. 3, n. 163, Vat. III:100-101; d. 8, n. 136, Vat. IV:221).

The type of being that figures into the univocity thesis is not a highest genus above both God and creatures. It is, rather, a mental construct arrived at through abstraction. Both God and creatures are beings, but they need not agree in anything real. Rather, they fall under a common concept that lacks any immediate referent as it is prescinded from the considerations of degree that make all the difference as regards its instantiation.

3. Contemporary Scholarship

For Scotus, univocal concepts must refer to both God and creatures without collapsing the metaphysical space that separates them. Accordingly, Scotus is careful to note that univocal concepts are, as Richard Cross (2001, 13) puts it, “vicious abstractions,” referring properly to neither God nor creatures. This being the case, it appears as if Scotus’s univocal concepts may leave the theologian empty-handed. As Scotus is working to make space for univocity absent real commonality, it is perhaps unsurprising that he appears as a protean figure in contemporary minority reports that criticize him as either idolatrous or apophatic. David Burrell thinks that Scotus does not appreciate the problematic nature of our conceptual access to mystery and N. Trakakis reads into Scotus an anthropic conception of God on account of which he accuses Scotus of idolatry. Catherine Pickstock, on the other hand, believes that by rendering the distinction between God and creatures one of degree, Scotus paradoxically makes the distinction over into one of kind, inasmuch as there would exist a nonnegotiable epistemic gulf between the two (1998, 2005). By way of contrast, Richard Cross (2001), Stephen Dumont (1992), Steven Marrone (1983), Jan Aertsen and Wouter Goris (2013), Peter King (2003) and Thomas Williams (2005) count among the scholars who represent the majority view that recognizes that Scotus’s univocity theory is a semantic theory that does not require that God and creatures share in any real, common trait. To the contrary, the claim that Scotus’s univocity thesis entails that there is some reality common to the two does not fully appreciate the importance of the distinction between semantics and ontology that Scotus is careful to draw:

Note how there can be a first intention [that is, a real concept] of a and b which is indifferent, and nothing of a single nature corresponds in reality, but two formal objects wholly diverse are understood in one first intention (Ord. I, d. 8, n. 136, Vat. IV:221). (See also above, section 2b.)

The univocity thesis concerns concepts, not things, and this distinction is crucial for Scotus. But this is not to suggest that Scotus thinks that we do not have any knowledge of God. We know God through univocal concepts of empirical origin:

Those things that are known of God are known through the species [that is, a mental grasp] of creatures . . . Creatures, which imprint proper species in the intellect are also able to imprint the species of the transcendentals which agree in common with them and God. And then the intellect by means of its proper power is able to use many species simultaneously for the purpose of conceiving simultaneously those of which these are species, e.g. the species good and the species highest and the species act for the purpose of conceiving some highest and most actual good (Ord. I, d. 3, pt. 1, n. 61, Vat. III:42).

God contains the perfection of every creature (Ibid., d. 8, pt. 1, q. 3, n. 116, Vat. IV:207-208). Univocal concepts are of God inasmuch as univocal concepts are in their genesis of creatures who imitate or represent God (Ibid., d. 3, pt. 1, qq. 1-2, n. 56, Vat. III:38-39; pt. 2, q. 2, q. un., n. 294, Vat. III:179). Hence univocal concepts are mere mental constructs only to the extent that they pertain properly to neither God nor creatures when entertained apart from relevant modal considerations, which considerations (as has been stressed) make all the difference. Scotus invites us to think of these univocal concepts along the lines of a concept of whiteness absent any particular degree of intensity (Ord. I, d. 8, q. 3, n. 138).

4. Scotus on our Natural Knowledge of God

Scotus offers various proofs that we possess concepts univocal to God and creatures. Perhaps the best-known argument is that from certain and doubtful concepts:

Every intellect certain of one concept and doubtful of others has the concept of which it is certain as other than the concept of which it is doubtful . . . But the intellect . . . can be certain of God that God is a being, doubting as to this being whether it is finite or infinite . . . Therefore, the concept of being as regards God is other than this [concept] and that [concept]. And therefore for its part it is neither and it is included in both. Therefore, it is univocal (Ord. I, d. 3, n. 27, Vat. III:18).

Scotus cites the debates amongst Presocratic philosophers over the nature of the first principle to show that the concept of being is common to the composite concepts infinite-being and finite-being, which overlap at the simple concept of being. This simple concept is common to the composite concepts such that in both its modalization does not alter its intensional content, that is, the common concept of being remains exactly the same concept of being in each case. A person who mistakenly thinks of fire as the uncaused first principle or constituent of all things (and therefore as infinite with respect to being) can be corrected but would not then cease to think of fire as a being. The intensional content of the concept of being is thus univocal for the concepts infinite-being and finite-being: “This certain concept, which is for its part neither of the doubtful ones, is preserved in both of them” (Ibid., n. 29, Vat. III:19). Scotus’s point is that concepts said of God and creatures retain a core content univocal to both instances. Joining the concept of finitude or infinity to a concept does not alter its meaning but merely produces a new, composite concept.
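
The argument from certain and doubtful concepts can be laid out step by step. This is a schematic reconstruction rather than Scotus’s own formulation: the symbols C and D and the concept labels b, f and i are assumptions introduced here for illustration, and steps (4) and (5) are stated informally because the text does not supply a formal account of inclusion or univocity.

```latex
% Requires amsmath. Notation (assumed for illustration, not from Scotus):
%   C(x): the intellect is certain that concept x applies to God
%   D(x): the intellect is in doubt whether concept x applies to God
%   b: being;  f: finite being;  i: infinite being
\begin{align*}
&(1)\quad \forall x\,\forall y\,\bigl[(C(x)\land D(y))\rightarrow x\neq y\bigr] && \text{major premise}\\
&(2)\quad C(b)\land D(f)\land D(i) && \text{minor premise}\\
&(3)\quad b\neq f \;\land\; b\neq i && \text{from (1), (2)}\\
&(4)\quad b \text{ is included in both } f \text{ and } i && \text{however the doubt is resolved, } C(b) \text{ stands}\\
&(5)\quad b \text{ is univocal to } f \text{ and } i && \text{from (3), (4)}
\end{align*}
```

The fire example fits step (4): correcting the belief that fire is infinite replaces i with f, while the concept b that was held with certainty carries over unchanged.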

We arrive at these concepts univocal to God and creatures through experience (see below, section 5b), but when these concepts are joined with the concept of infinity, they apply only to God. There are two types of concepts that are proper to God:

I say that it is possible to arrive at many concepts that are proper to God and that do not agree with creatures. Concepts of this type are the concepts of all of the perfections taken simply, in the highest degree. And the most perfect concept, through which as if by description we most perfectly know God, is by conceiving every perfection simply and in the highest degree. Nevertheless, a concept more perfect and yet simpler, available for us, is the concept of infinite being. This concept is simpler than the concept ‘good being’, ‘true being’ or concepts of other, similar things; because ‘infinite’ is not a quasi-attribute or property of being, or of that of which it is said. Rather it signals an intrinsic mode of that entity, such that when I say ‘infinite being’ I don’t have a concept that is like an accidental concept, composed out of the subject and property, but, rather, I have an essential concept of a subject in a certain grade of perfection, viz. infinity, just as ‘intense white’ doesn’t express the same things as an accidental concept like ‘visible white’, indeed the intensity expresses an intrinsic grade of whiteness in itself. And thus the simplicity of the concept ‘infinite being’ is evident (Ord. I, d. 3, pt. 1, qq. 1-2, n. 58, Vat. III:40).

Concepts that are proper to God describe only God. The first type of concept proper to God is a descriptive, cluster-concept composed of attributes and perfections conceived in the highest possible degree – infinite-goodness, infinite-wisdom, and so forth, bundled together as it were (Frank and Wolter 150-51). Another type of concept, yet more appropriate to the divine nature, is of God conceived simply as infinite being. This latter concept is superior to the cluster-concept for several reasons. First, the concept of infinite being only applies to God (Ibid., n. 60, Vat. III:41-42). Second, unlike the cluster-concept, infinite being does not explicitly comprise distinct traits. This respects the medieval insistence that God is metaphysically simple (see, for example, De primo principio 4). The essence of God cannot comprise aspects that stand in a relationship of potentiality with respect to one another. Otherwise, the existence of God would require some account as to why it is the way that it is and God would not be God, that is, the uncaused (or unaccounted for) first cause. Third, the concept of infinite being is superior to the cluster-concept because the transcendental attributes and perfections are coextensive with being as such, so that as infinite being God has every perfection (see below, section 6a). Fourth, the distinction between infinite and finite being is one of degree and it is because the distinction between God and creatures is one of degree that Scotus avoids rendering being a genus over and above both God and creatures (see below, section 6b). (Nevertheless, we should not think of infinite being as a divisible quantity. Infinite being is, for Scotus, indivisible. See Cross 2001.) Finally, by Scotus’s estimation, the concept of God’s infinite being is the most fertile ground available to the natural theologian seeking to deduce various divine attributes.

In his De primo principio (A Treatise on God as First Principle), Scotus describes God’s infinite being as a “most fertile conclusion, which if it had been proved of you at the outset, would have made obvious so many of the conclusions we have mentioned so far” (4.47, trans. Wolter, 1966). Following a lengthy series of demonstrations that the divine essence is infinite, Scotus then concludes that (among other things):

Catholics can infer most of the perfections which philosophers knew of you . . . You are the first efficient cause, the ultimate end, supreme in perfection, transcending all things. You are uncaused in any way and therefore incapable of becoming or perishing; indeed it is simply impossible that you should not exist . . . You are therefore eternal . . . You live a most noble life . . . You are happy . . . You are the clear vision of yourself and the most joyful love . . . You . . . understand in a single act everything that can be known . . . You possess the power to freely and contingently will each thing that can be caused and by willing it through your volition to cause it to be. Most truly then you are of infinite power . . . You alone are simply perfect, not just a perfect angel, or a perfect body, but a perfect being . . . You are one God, than whom there is no other (Ibid., 4.84-87).

Just as our understanding of the goodness and wisdom that we ascribe to God originates in experience, so too does our notion of infinity. In his fifth Quodlibetal Question, Scotus walks us through the process of reflection by which we arrive at this concept. Aristotle defines the infinite as infinite with respect to quantity. No matter how many discrete quantities one takes away from it, an infinite amount remains (Physics 3.6, 207a7-9). Hence, Scotus notes that this infinity can never be in existence as a whole. We are then asked to imagine per impossibile that the entirety of this infinity should be present at once, hence “if this could be done we would have an actually infinite in quantity, because it would be as great in actuality as it was potentially” (5.6, trans. Alluntis and Wolter). God’s infinite perfection is conceived along the lines of this model of a quantitative infinity taken as a whole:

If we think of something among beings that is actually infinite in entity, we must think of it along the lines of the actual infinite quantity we imagined, namely as an infinite being that cannot be exceeded in entity by any other being. It will truly have the character of something whole and perfect. It will indeed be whole or complete (Ibid., n. 7).

God is then “infinite in perfection or power” (Ibid., n. 8).

In summary, we know God through the numerous concepts of various perfections and attributes by which creatures imitate and represent God. We know of God that God is an infinite being, and because God is an infinite being, God possesses these perfections in the highest possible degree. For this reason, of all of the things that we know of God, the most significant is that God is infinite being. It remains, then, to discuss the intensional or informational content of concepts univocal to God and creatures (section 5) with an eye toward why Scotus thinks we need these concepts (section 5a), how we acquire them (section 5b), why analogical concepts will not do in their place for the natural theologian (section 5c) and, finally, how Scotus uses concepts univocal to God and creatures to render metaphysics a natural theology (section 6a) without thereby “destroying all philosophy” (section 6b).

5. Univocity

Theological considerations are at the heart of Scotus’s univocity thesis. First, Scotus holds that theology is pointless absent any concepts univocal to God and creatures, as theologians would literally have no idea what they are talking about. (Section 5c, below, discusses why Scotus does not believe that the generally accepted account of God talk as analogical will do.) Again, theologians present certain conclusions as the product of sound reasoning and Scotus (naturally enough) holds that sound reasoning requires the univocity of concepts (Cross 2006). As regards this second point, Scotus’s description of univocal concepts draws attention to their role in demonstration:

I say that God is not only conceived in a concept analogous to the concept of a creature (namely a concept which is entirely different from a concept said of a creature), but also in some concept that is univocal to God and creatures. And so that there won’t be any contention about the term ‘univocal’, I call a concept univocal which is one such that its unity suffices for a contradiction when the concept is affirmed and denied of the same thing; likewise, it suffices for a syllogistic middle, so that the extreme [minor and major] terms joined in the middle that is one in this way are concluded to be joined to one another without the fallacy of equivocation (Ord. I, d. 2, qq. 1-2, n. 26, Vat. III:18).

a. Univocity and Natural Theology

We can think of Scotus’s univocity thesis as a thesis regarding how theological language has to work if it is to furnish the concepts that are needed to render theology a deductive science. The need for univocity is evident in even the simplest syllogism. Consider: ‘A loving parent cares for her child. God is as a loving parent and hence God cares for God’s creatures’. If the signification of the term ‘loving’ is not fixed across the demonstration but rather shifts from premise to premise, then what we know of parental love might not have any bearing or relevance when it comes to a proper understanding of the love of God. But then it would seem that natural theology is a dead-end practice. As Scotus says, we need a univocity sufficient to avoid the fallacy of equivocation.

Again, just as demonstration cannot proceed absent terms whose meanings are fixed, Scotus believes that without such terms we literally do not have any idea what we are saying when we talk about God. If some ideas do not pertain equally well to God and creatures, if the data of experience do not somehow map onto the divine essence, then we know nothing of God: the correct account (ratio) of any divine attribute or perfection need not have anything at all in common with a similar correct accounting of the attribute as it is manifest in creatures. As Scotus puts it, if things were really this bad, we would have no better reason to call God wise than a rock (Ord. I, d. 2, qq. 1-2, n. 40, Vat. III:27).

Apart from univocity and analogy, Scotus had another option when it came to knowledge of God: the negative theology of Rabbi Moses Maimonides (d. 1204 C.E.), who held that we know of God only what God is not (negative theology, or strong apophaticism, is also referred to as the way of remotion or the via negativa). On Scotus’s reckoning, even this supposed lack of knowledge presupposes some positive knowledge of God. Every denial entails an assertion, and when we deny that God has some attribute, this is on the basis of positive knowledge that shows us that it is inconsistent to affirm this attribute of God. Likewise, and in keeping with Thomas Aquinas (ST Ia.13.2c), Scotus notes that negative theology is incompatible with Christian faith: “We don’t fall intensely in love with negations” (Ord. I, d. 2, qq. 1-2, n. 10, Vat. III:5).

In sum, if we lack concepts univocal to God and creatures, Scotus believes that natural theology must fail on several counts. We could not construct sound proofs with God as the subject and all of our concepts of God would prove to be vacuous inasmuch as all knowledge is tied to experience and experience could not then serve to provide any correct account of God. Hence, Scotus charges that “All masters and theologians appear to use a concept common to God and creatures, though they deny this when they do it” (1 Lect. d. 3, n. 29).

b. Illumination Theory and Abstraction

Medieval illumination theory holds that God is somehow responsible for our having knowledge. God’s role in our acquisition of concepts is seen as more or less active, often depending on a thinker’s adherence to either a Platonic or an Aristotelian framework. During his middle period (c. 365-c. 347 B.C.E.), Plato (428/427-348/347 B.C.E.) states that various natural kinds, attributes (and perhaps even artefacts) acquire essential predicates by means of a type of vaguely described participation in unique, eternal, immutable, archetypical exemplars (termed ‘forms’ or ‘ideas’) that are more and less perfectly imitated by these various particulars. (See, for example, Republic 504e–518c and 596e–597a, Phaedo 100b–102a3, and Phaedrus 247c3–247e6. For the dating of these works in Plato’s middle period, see Kraut.) Direct access to Plato’s writings in the Middle Ages was limited to a fragment of his Timaeus. Nevertheless, Plato’s thought was transmitted to medieval thinkers in a variety of ways, including the writings of Augustine (354-430 C.E.). By way of contrast, by the end of the twelfth century, the Latin West had access to more or less the entirety of the surviving writings of Aristotle that today comprise his corpus (prior to this, medievals had access only to the Categories and On Interpretation as well as Porphyry’s (d. 305 C.E.) Isagoge, a tremendously influential introduction to Aristotle’s logic). Whereas Plato grants ontological priority to the immaterial forms and insists that the best knowledge we have is of these archetypical templates, Aristotle’s Categories upends this picture by rendering everyday substances the primary locus of predication:

All the other things are either said of the primary substances as subjects or in them as subjects. So if the primary substances did not exist it would be impossible for any of the other things to exist (trans. Ackrill, 2b4-6).

For Aristotle, predicates apply only to individual substances; they do not correspond to hypostasized, otherworldly Platonic essences such as goodness and beauty. Substances are therefore prior “by nature (tē phusei)” and hence responsible for the existence of the accidents, for which to be is to be in another (Cat. 14b11–13). Mundane substances, not otherworldly forms, ground our knowledge. It was accordingly natural for Aristotelian and Augustinian accounts respectively to downplay and emphasize the need for illumination.

Augustine straightforwardly identifies Plato’s forms with the divine ideas (De diversis Quaestionibus octoginta tribus liber unus, q. 46, 1-2). And Augustine’s account of knowledge of God incorporates direct illumination. Augustine’s On The Trinity spells out the abstractive process whereby we approach knowledge of God’s goodness and details the requisite illumination in which the idea of God’s goodness is “impressed” on us:

[Reflect on] ‘this [particular] good’ and ‘that [particular] good’; [and then] take away ‘this’ and ‘that’, and see good itself if you can; so you will see God who is good not by another good, but is the good of every good . . . In all these good things . . . we would be unable to call one better than the other . . . if the idea of the good itself had not been impressed upon us, according to which we approve of something as good, and also prefer one good to another (8.3, quoted in Frank and Wolter, 138).

For his part, Scotus holds that were the human intellect so weak as to require an illumination to form concepts of God, this very weakness would likewise undercut our ability to receive these concepts (Ord. I, d. 3, pt. 1, q. 4, n. 225, Vat. III:136). Rather, Scotus will allow for a general form of illumination to the extent that God both produces objects in intelligible being and is also that in virtue of which these objects move us to understanding (Ord. I, d. 3, pt. 1, q. 4, n. 268, Vat. III:163-64). Accordingly, Scotus believes that we can form concepts proper to God and creatures through purely natural means, apart from any special activity on the part of God over and above God’s having put into place certain factors. Scotus spells out how we do this in a discussion that runs parallel to Augustine’s while avoiding any reference to special illumination in the acquisition of concepts univocal to God and creatures:

Every metaphysical inquiry about God proceeds in this fashion: the formal notion of something is considered; the imperfection associated with this notion in creatures is removed, and then, retaining the same formal notion, we ascribe to it the ultimate degree of perfection and then attribute it to God . . . Consequently, every inquiry regarding God is based upon the supposition that the intellect has the same univocal concept which it obtained from creatures (Ibid., qq. 1-2, n. 39, Vat. III:26).

c. Analogy and Univocity

Scotus believes that natural theology rests (tacitly or otherwise) on the assumption that experience furnishes concepts univocal to God and creatures. But is not Scotus overhasty in his univocity-or-nothing approach to knowledge of God (see above, section 5a)? After all, for Scotus’s contemporaries, analogy is good enough for the purposes of natural theology. Thomas Aquinas makes this point when he states that whereas we lack terms univocal to God and creatures, demonstration can nevertheless proceed by means of analogical terms:

No name is predicated univocally of God and of creatures. Neither, on the other hand, are names applied to God and creatures in a purely equivocal sense, as some have said. Because if that were so, it follows that from creatures nothing could be known or demonstrated about God at all; for the reasoning would always be exposed to the fallacy of equivocation. Such a view is against the philosophers, who proved many things about God, and also against what the Apostle says: “The invisible things of God are clearly seen being understood by the things that are made” (Romans 1:20). Therefore, it must be said that these names are said of God and creatures in an analogous sense (ST Ia.13.5c).

Medieval theories of analogy develop out of Aristotle’s Physics and Metaphysics, where he discusses the many meanings that we attach to the term ‘being’. In the latter work, Aristotle investigates the possibility of metaphysics as the universal science of being qua being, ranging from substances and their modes or accidents to the first unmoved mover (4.1, 6.1). But demonstration is not transcategorial, as diverse entities have nothing in common (Posterior Analytics 1.7; see also above, section 2b). Hence, to function as a science that cuts across the categories, metaphysics uses what Aristotle terms ‘pros hen (toward one)’ equivocation or analogy, which conceives diverse entities under a concept that applies primarily to one and in a secondary or derivative sense to the other. Hence even accidents are termed ‘beings’ inasmuch as they derive their existence from substances, of which being is properly said (Metaph. 4.2). Aquinas uses Aristotle’s scheme to cast metaphysics as the study of creatures and God as their source, reliant on terms that signify in prior and posterior senses to supply the science with its universality (Wippel).

Separating the likes of Aquinas, on the one hand, and Scotus, on the other, were the Condemnations of 1277, drafted as a reaction to so-called Latin Averroist readings of Aristotle that developed out of the reception of the commentaries on Aristotle of the Muslim philosopher Averroes (d. 1198 C.E.). Latin Averroism suggested a possible disparity between truths of reason, on the one hand, and revelation, on the other. Though he was a strident and successful critic of this interpretation of Aristotle, some of Aquinas’s views were lumped in with those of the Averroists, leading to the condemnation of certain of Aquinas’s positions, for instance, that we can know of God only that God is, or exists. (In 1325, two years after Aquinas’s canonization, the Condemnations were repealed to the extent that they touched on his works.) Henry of Ghent (d. 1293), who had a hand in drafting the Condemnations, viewed Aquinas as too apophatic. Whereas Aquinas held that metaphysics studies God only indirectly as the cause of categorial beings, Henry construes metaphysics as the study of being taken absolutely, comprising both God and creatures (Dumont 1998b). Again, Henry holds that we have essential or quidditative knowledge of God (quidditative knowledge answers the question ‘What is it (Quid est)?’). Quidditative knowledge of God is of the divine attributes grasped in an imperfect or quasi-accidental manner (Dumont 1998a). Such knowledge of God and so broad a metaphysics requires concepts that are general enough to apply to God and creatures without suggesting that the two share in anything real, so as to avoid collapsing the metaphysical distance between them. As we shall see, Henry’s attempt to accommodate this demand will open the door to Scotus’s univocity thesis.

Henry seeks concepts sufficiently general to apply to God and creatures in a model of pseudo-concepts of being and the various perfections, which concepts initially strike us as common to God and creatures owing to the concepts’ vagueness. On reflection, these pseudo-concepts are exposed as each being the conflation of two concepts, one proper to God, the other to creatures. As the pseudo-concept in fact comprises utterly distinct concepts, its existence does not entail that God and creatures actually share in any real feature. Henry calls these distinct concepts ‘analogical’ with respect to one another as they are of traits that apply primarily to God and in a derivative sense to creatures – though Henry sometimes speaks of the vague pseudo-concept itself as an analogous concept (Summa, a. 21, q. 3). The analogous concept that pertains only to God is ‘negatively undetermined’ (not open to any further determination by means of some advening perfection), whereas its creaturely counterpart is ‘privatively undetermined’ (conceived apart from the determinations that are bound up with its instantiations in creatures). It is because in either case the concepts are of being and its attributes as undetermined (either negatively or privatively) that the concepts were initially conflated (see Dumont 1998a and 1998b, and Quodlibeta 13, q. 10; Summa a. 21, q. 2; a. 24, qq. 6-7).

Henry’s pseudo-concept that merely seems common to God and creatures is the progenitor of Scotus’s univocal concepts under which we conceive both. Scotus sees that if Henry’s account is correct, concepts of creatures tell us nothing of the creator and hence experience teaches us nothing of God. Hence, Scotus contends that on Henry’s account, an analogical concept of God is in fact “entirely different from a concept said of a creature” (Ord. I, d. 2, qq. 1-2, n. 26, Vat. III:18). Scotus therefore replaces the analogous pseudo-concept with the univocal concept and modalizes negative and privative indetermination into the degrees of intensity that characterize the instantiation of traits in God and creatures, respectively (Dumont 1992). Scotus’s attack on analogy is then directed at Henry’s version of analogy, which supposes radically distinct concepts only mistakenly thought to pertain to God and creatures. As regards the traditional sense of analogy wherein terms apply primarily to God and in a secondary sense to creatures, Scotus would likely insist that if religious language does not preserve a univocal conceptual content common to both senses, it devolves into chance equivocity as discussed above in section 2a (Williams 2005 and Cross 2012).

6. Metaphysics as Natural Theology

Scotus takes up the notion of the univocal concept of being that renders metaphysics a natural theology in response to the question as to whether we have natural knowledge of God. Ultimately, Scotus will conclude that although we cannot naturally grasp the divine essence in its individuality as it is distinct from all things, we nevertheless can naturally acquire a concept whereby we conceive God essentially and quidditatively as the subject of inherence with respect to the divine attributes. Scotus distinguishes his theory from Henry’s on the grounds that the latter’s quidditative knowledge of God does not pertain directly to the divine essence but is rather “quasi accidental (quasi per accidens)” (Ord. I, d. 3, pt. 1, qq. 1-2, nn. 25, 56, Vat. III:16-17, 38-39). As regards the properties of the divine essence that we arrive at in metaphysics, for Scotus these remain identical with the divine essence and yet formally distinct from one another inasmuch as they may be considered without reference to one another. (Scotus recognizes a formal distinction between inseparable aspects (or formalities) of one and the same individual, for example, an individual’s rationality and animality, such that they may be considered apart from one another. In the case of the divine essence the formal distinction implies even less composition than in that of creatures, wherein various formal aspects united in an essence perfect one another, as, for example, the rational quality may perfect animal nature. See Hall 136, n. 38; Noone; King, n. 13; Ross and Bates n. 13; Alluntis and Wolter 505-09).

Nevertheless, as our grasp of what we would attribute to God leaves off at the level of a univocal concept under which we conceive both God and creatures in a manner that is proper to neither, we do not know the divine essence in a proper and particular manner; our finite mind’s finite grasp of a concept univocal to God and creatures proves inadequate when we allow that the attribute thus conceived is constituent of the infinite divine essence (see above, section 4). Scotus’s caution on this point grows out of his understanding of the unity in diversity of the divine essence. An infinite entity must possess every perfection of being (Quodlibet 5.8-9) while remaining utterly simple (De primo principio 4.75). Moreover, God’s infinite being exceeds finite being beyond any relative measure or proportion (Quodlibet 5.9). Hence the distance between God and creatures is secure.

a. Metaphysics and the Transcendentals

Scotus conceives of metaphysics as the universal science of what he terms the transcendentals as such (Questions on the Metaphysics, prologue). The medieval theory of the transcendentals has its roots in Plato and Aristotle and was developed by Augustine, Boethius, Pseudo-Dionysius the Areopagite (late-fifth or early-sixth century C.E.) and Avicenna (d. 1037 C.E.). Philip the Chancellor codifies the theory in his Summa de bono, which asks how we speak of both God and creatures as good and proposes that goodness pertains to God and creatures (in respectively absolute and relative senses) inasmuch as goodness (and unity and truth) are transcendental attributes or properties of being as such.

Following Aristotle (Cat. 5), medieval thinkers recognize ten categories or highest genera of things that are: substance, on the one hand, and its various accidental modes (such as quantity, quality, relation, and so forth), on the other (see above, section 2b). The ten categories together comprise all things except God. Since the transcendentals are the attributes of being as such (that is, as conceptually prior to its division into finite (categorial) and infinite (divine) being), they cut horizontally across the various categories and extend vertically to take in God and creatures. As regards unity, truth and goodness, these were thought to be coextensive properties of being. Apart from the coextensive properties of being, Scotus’s account of the transcendentals recognizes transcendental pure perfections and disjunctions. From Anselm’s (d. 1109 C.E.) Monologion, Scotus derives the notion of pure perfections as perfections that are absolutely and unqualifiedly better than whatever is incompatible with them. Hence it is better to be wise than not, and if a dog cannot be wise it would be better for it were it not a dog but rather something that can attain wisdom (De primo principio 4.10). Transcendental disjunctions, on the other hand, are disjunctions whose extremes take in all things, for example, finite-infinite (Ord. I, d. 8, q. 3). Note that only the attributes of being are coextensive with all beings. The disjunctions are opposed to one another in the sense that they are mutually exclusive within one and the same individual, and the pure perfections do not characterize all entities (neither dogs nor instances of whiteness are wise). Hence, strictly speaking, the pure perfections and disjunctions are transcendental only inasmuch as they aren’t contained under any one particular genus and not because they characterize all things.

As noted, transcendentals pertain to being as such prior to its division into finite and infinite being. Yet Scotus does not hypostasize being as such. He does not maintain that being as such exists somehow independently of either God or creatures. Rather, all being is modalized being, infinite or finite being. Scotus’s talk of being considered in its indifference to finite and infinite modes refers to the univocal concept of being that pertains to both God and creatures in a manner that is not proper to either inasmuch as the univocal concept does not take into account the relevant modal characteristics that govern its various instantiations. But when we account for the relevant modal factors, this results in the production of new, complex concepts. As Scotus points out, we can be certain that God is a being, whereas we remain in doubt as to whether God is a finite or an infinite being, and hence the complex concept of infinite being that is affirmed of God differs from both the simple, univocal concept of being, on the one hand, and that of creaturely, finite being, on the other (Ord. I, d. 3, pt. 1, qq. 1-2, n. 27, Vat. III:18) (see above, section 4).

Working out the implications of metaphysics as the science of the transcendentals as such, Scotus believes that the metaphysician is able to demonstrate that God exists and can ascribe to God various perfections and attributes. Proof of the existence of God relies on the principle that “as a general rule by positing the less noble extreme of some being, we can conclude that the nobler extreme is realized in some other being” (Ibid., d. 39, n. 13). Hence, Scotus’s strategy is to demonstrate God’s existence by means of transcendental disjunctions such as ‘necessary-contingent’:

If some being is contingent, then some being is necessary. For . . . it is not possible for the more imperfect extreme of the disjunction to be existentially predicated of being particularly taken, unless the more perfect extreme be existentially verified of some other being upon which it depends (Ibid.). (For the complete proof, with commentary, see Frank and Wolter, 40-107. Other versions of the proof are at Lect. 1, d. 2, q. 1, nn. 38–135; Reportatio 1, d. 2, q. 1; and De primo principio).

Moreover, the metaphysician’s proof is superior to the natural philosopher’s Aristotelian proofs of an unmoved mover, inasmuch as we know God more perfectly and immediately when we conceive God as necessary being rather than as first mover (as the former attribute is more intimately bound up with the divine essence). As regards perfections and attributes of the divine essence, the latter are known to belong to God inasmuch as they characterize all things, whereas the former are ascribed by means of a perfect-being theology that endorses the principle that pure perfections belong necessarily and in the highest degree to the highest nature (De primo principio, 4.3). Hence when the natural theologian has deduced that God is the highest being, the pure perfections are then known to apply to the divine essence, with our creaturely understanding of these perfections serving as the basis of our knowledge of God (Wolter, 1950).

b. Does Scotus “Destroy All Philosophy?”

Scotus’s natural theology rises or falls with the success or failure of the univocity thesis. Univocity is not supposed to be transcategorial; hence, Scotus’s contemporaries use analogy to predicate across the highest genera and of God and creatures (see above, sections 2a and 5c). On this scheme, Scotus’s claim that being (and with being its transcendental attributes) is univocal to God and creatures risks elevating being to a highest genus (see above, section 2b); how else can the concepts of being and its transcendental attributes be univocal across the categories on an Aristotelian worldview? Yet being cannot be a genus: genera are not said of their differences, and yet the differences that specify types of beings certainly do exist (see above, section 2b). Perhaps even worse, were being a genus over God and creatures, God and creatures would then agree in some reality, rendering God metaphysically complex (composed of that common reality along with a reality that would uniquely determine the divine essence) and of a kind with creatures. God would no longer be God.

Scotus recognizes that the thesis of the univocity of being and its transcendental attributes would appear to ask that being function as a super-genus above the ten highest genera and that his scheme therefore threatens to collapse the metaphysical space that separates God and creatures. Scotus’s solution is to use the concept of being as such (that is, as conceptually prior to its division into finite (categorial) and infinite (divine) being) as a stand-in for any such super-genus. Unlike such a super-genus, however, the concept of being as such is a mental abstraction that does not pertain to anything at all until the relevant modal considerations have been introduced, and hence God and creatures needn’t agree in anything real in order to be conceived under the concepts of being as such and its transcendental attributes. As noted, however, Scotus suggests that these modal differences entail a difference of kind: “The infinite exceeds the finite in being beyond any relative measure or proportion that could be assigned” (Quodl. 5.9). Be that as it may, the univocity thesis does not concern what God is. The thesis is about how we think and talk about God and the conditions to which religious language must conform in order to advance sound arguments. Hence the univocity of concepts under which both God and creatures are conceived is compatible with the metaphysical gap between God and creatures. But it should be noted that the distance between God and creatures does not prevent our learning about God through experience. Like other medieval thinkers, Scotus holds that the attributes we ascribe to God belong primarily to God and in a secondary or derivative manner to creatures. Though the understanding of the attributes in question that we build up through experience is admittedly imperfect, it is nevertheless an understanding of God.

Accordingly, Scotus’s univocity thesis conforms to the medieval consensus that inasmuch as concepts are of creatures that imperfectly imitate or represent God, they are of God imperfectly conceived (Ord. I, d. 3, pt. 1, qq. 1-2, n. 56, Vat. III:38-39; pt. 2, q. 2, q. un., n. 294, Vat. III:179).

7. References and Further Reading

a. Primary Sources

  • Aquinas, Thomas. Summa Theologica. Translated by Fathers of the English Dominican Province.
  • Aristotle. The Complete Works of Aristotle: The Revised Oxford Translation. Edited by J. Barnes. 2 volumes. Bollingen Series. Princeton: Princeton University Press, 1984.
  • Duns Scotus, John. Opera omnia. Edited by C. Balić, et al. Vatican Scotistic Commission. Rome: Polyglot Press, 1950-
  • Duns Scotus, John. Duns Scotus on Time and Existence: The Questions on Aristotle’s ‘De interpretatione’. Translated with introduction and commentary by Edward Buckner and Jack Zupko. Washington, D.C.: The Catholic University of America Press, 2014.
  • Duns Scotus, John. Duns Scotus, Metaphysician. Translated and edited with commentary by William A. Frank and Allan B. Wolter. West Lafayette, Indiana: Purdue University Press, 1995.
  • Duns Scotus, John. John Duns Scotus, Philosophical Writings: A Selection. Translated with introduction and notes by Allan Wolter. Foreword by Marilyn McCord Adams. Indianapolis: Hackett, 1987.
  • Duns Scotus, John. John Duns Scotus, God and Creatures: The Quodlibetal Questions. Translated with introduction, notes, and glossary by Felix Alluntis and Allan B. Wolter. Princeton, N.J: Princeton University Press, 1975.
  • Duns Scotus, John. John Duns Scotus, A Treatise on God as First Principle. Translated and edited with commentary by Allan B. Wolter. Chicago: Franciscan Herald, 1984
  • Ghent, Henry. Quodlibeta Magistri Henrici Goethals a Gandavo Doctoris Solemnis. Paris, I. Badius, 1518; repr. in 2 vols., Louvain, Bibliothèque SJ, 1961.
  • Ghent, Henry. Summa quaestionum ordinariarum. Paris 1520; repr. in 2 vols., St. Bonaventure, NY, Franciscan Institute, 1953.
  • Ockham, William. Opera Philosophica I: Summa Logicae. Edited by Philotheus Boehner, Gedeon Gál, and Stephen Brown. St. Bonaventure, N.Y.: Editiones Instituti Franciscani Universitatis S. Bonaventurae, 1974.

b. Secondary Sources

  • Burrell, David. “John Duns Scotus: The Univocity of Analogous Terms.” The Monist 49 (October 1965) 639-58.
  • Cross, Richard. Duns Scotus. Oxford, 1999.
  • Cross, Richard. “Where Angels Fear to Tread.” Antonianum 76 (2001): 7-41.
  • Cross, Richard. Duns Scotus on God. Ashgate, 2005.
  • Cross, Richard. “Univocity and Mystery.” In New Essays on Metaphysics as Scientia Transcendens. Edited by Roberto Hofmeister Pich. Fédération Internationale des Instituts d’Études Médiévales, 2007.
  • Cross, Richard. “Duns Scotus and Analogy: A Brief Note.” The Modern Schoolman 89:3/4 (2012): 147-54.
  • Dumont, Stephen D. “The Univocity of the Concept of Being in the Fourteenth Century: John Duns Scotus and William of Alnwick.” Mediaeval Studies 49 (1987): 1-31.
  • Dumont, Stephen D. “Transcendental Being: Scotus and Scotists.” Topoi 11 (Sept. 1992): 135-48.
  • Dumont, Stephen D. “Henry of Ghent and Duns Scotus.” Medieval Philosophy 3 (1998a): 291-328.
  • Dumont, Stephen D. “Scotus’s Doctrine of Univocity and the Medieval Tradition of Metaphysics.” In Was ist Philosophie im Mittelalter? Edited by Jan Aertsen and Andreas Speer. Walter de Gruyter, 1998b.
  • Goris, Wouter and Aertsen, Jan, “Medieval Theories of Transcendentals”, The Stanford Encyclopedia of Philosophy (Summer 2013 Edition), Edward N. Zalta (ed.), URL = http://plato.stanford.edu/archives/sum2013/entries/transcendentals-medieval/
  • Gracia, Jorge J. E. “Categories Vs. Genera: Suárez’s Difficult Balancing Act.” In Categories and What is Beyond. Proceedings of the Society for Medieval Logic and Metaphysics Volume 2. Cambridge Scholars Publishing, 2011: 7-18.
  • Hall, Alexander. Thomas Aquinas and John Duns Scotus: Natural Theology in the High Middle Ages. Bloomsbury, 2009.
  • Hall, Alexander. “Confused Univocity?” In Proceedings of the Society for Medieval Logic and Metaphysics 7 (2007): 18-31; reprinted in Medieval Metaphysics; or is It “Just Semantics”? Cambridge Scholars Publishing, 2011. Co-edited with Gyula Klima.
  • Ingham, Mary Beth. “Re-Situating Scotist Thought.” Modern Theology 21:4 (2005): 609-618.
  • King, Peter. “Scotus on Metaphysics.” In The Cambridge Companion to Duns Scotus. Edited by Thomas Williams, 15-68. Cambridge: Cambridge University Press, 2003.
  • Klima, Gyula. “Nominalist Semantics.” In The Cambridge History of Medieval Philosophy. Volume 1. Edited by Robert Pasnau and Christina Van Dyke, 159-172.
  • Kraut, Richard. The Cambridge Companion to Plato. Cambridge: Cambridge University Press, 1992.
  • Marrone, Steven. “The Notion of Univocity in Duns Scotus’s Early Works.” Franciscan Studies 43 (1983): 347-95.
  • Noone, Timothy. “Alnwick on the Origin, Nature and Function of the Formal Distinction.” In Franciscan Studies 53 (1993): 231-61.
  • Pickstock, Catherine. After Writing: On the Liturgical Consummation of Philosophy. Oxford: Blackwell Publishers, 1998.
  • Pickstock, Catherine. “Duns Scotus: His Historical and Contemporary Significance.” Modern Theology 21, no. 4 (2005): 543-574.
  • Pini, Giorgio. Categories and Logic in Duns Scotus: An Interpretation of Aristotle’s Categories in the Late Thirteenth Century. Studien und Texte zur Geistesgeschichte des Mittelalters, Bd. 77. Brill, 2002.
  • Pini, Giorgio. “Univocity in Scotus’s Quaestiones Super Metaphysicam: The Solution to a Riddle.” Medioevo 30 (2005a): 69-110.
  • Pini, Giorgio. “Scotus’s Realist Conception of the Categories: His Legacy to late Medieval Debates.” In Vivarium 43.1 (2005b): 63-110.
  • Pini, Giorgio. “Two Models of Thinking: Thomas Aquinas and John Duns Scotus on Occurrent Thoughts.” In Intentionality, Cognition, and Mental Representation in Medieval Philosophy, edited by Gyula Klima, 81-103. Medieval Philosophy: Texts and Studies. Fordham University Press, 2015.
  • Read, Stephen. “Concepts and Meaning in Medieval Philosophy.” In Intentionality, Cognition, and Mental Representation in Medieval Philosophy, edited by Gyula Klima, 9-28. Medieval Philosophy: Texts and Studies. Fordham University Press, 2015.
  • Ross, James and Todd Bates. “Duns Scotus on Natural Theology.” In The Cambridge Companion to Duns Scotus. Edited by Thomas Williams, 193-238. Cambridge: Cambridge University Press, 2003.
  • Trakakis, N. N. “Does Univocity Entail Idolatry?” Sophia 49 (2010): 535-555.
  • Williams, Thomas, ed. The Cambridge Companion to Duns Scotus. Cambridge: Cambridge University Press, 2003.
  • Williams, Thomas. “The Doctrine of Univocity is True and Salutary.” Modern Theology 21, no. 4 (2005): 575-585.
  • Wippel, John. “Metaphysics.” In The Cambridge Companion to Aquinas. Edited by Norman Kretzmann and Eleonore Stump, 85-127. Cambridge: Cambridge University Press, 1993.
  • Wolter, Allan B. Transcendentals and their Function in the Metaphysics of Duns Scotus. New York: St. Bonaventure, 1946.

 

Author Information

Alexander Hall
Email: AlexanderHall@clayton.edu
Clayton State University
U. S. A.

Arnold Geulincx (1624—1669)

Arnold (or Arnout) Geulincx was an early-modern Flemish philosopher who initially taught at Leuven (Louvain) University, but fled the Catholic Low Countries when he was fired there in 1658. He settled at Leiden, in the Protestant North, where he worked under the patronage of the Cartesian Calvinist theologian Abraham Heidanus (1597-1678), and tried to obtain a post at Leiden University. Geulincx was never to procure a steady position in his new surroundings, and ultimately died in poverty as a victim of the 1669 Leiden plague. On the basis of Descartes’ philosophy, he developed a range of philosophical ideas that sometimes closely resemble Spinoza’s, but always have a particular flavour of their own. His contributions in the fields of logic, metaphysics and ethics have earned him a place not only in the history of Dutch Cartesianism, but in Western intellectual history at large.

As a result of accusations that he had been a Spinozist in disguise, Geulincx’ name was almost erased from history after 1720, but nineteenth-century historians rehabilitated Geulincx for having been a forerunner of Immanuel Kant. Nowadays, Arnold Geulincx is primarily known as a representative of seventeenth-century “occasionalism”, and as an original thinker between Descartes and Spinoza. Despite a certain impact he made on his immediate Leiden pupils, such as the Dutch Cartesians Cornelis Bontekoe (c. 1644-1685) and Johannes Swartenhengst (1644-1711), and on the English philosopher Richard Burthogge (1638-1705), as well as on a number of enlightened members of the Dutch Calvinist clergy during the last quarter of the seventeenth century, Geulincx’ most significant influence in intellectual history to date has been on the novels and plays of Samuel Beckett (1906-1989), as well as, through Beckett, on late twentieth-century French philosophy.

Table of Contents

  1. Life
  2. Logic and Method
  3. Metaphysics
  4. Ethics
  5. Anti-Aristotelianism
  6. A Philosophy of Wonder
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Arnold Geulincx was born in the city of Antwerp, which, despite having lost its former glory as a hub of world trade and a centre of the arts, had regained new vigour as the home of Counter-Reformation culture in the Southern Netherlands. This new spirit was evidenced in the paintings of Peter-Paul Rubens, Jacob Jordaens and Anthonie van Dijck, as well as in the Baroque church of Saint Carolus Borromeus, a Jesuit monument consecrated in 1625. Geulincx’ father apparently did well as the city’s messenger to Brussels, since he bought a large house just around the corner from Saint Carolus Borromeus’ Church when Arnold was around thirteen years of age, and another, adjacent one, a year later. While Jan Geulincx, one of Arnold’s younger brothers, studied with Jacob Jordaens for some time, Arnold was destined for an academic career and left Antwerp to go to university in January 1640. In Leuven, he studied arts and philosophy at the College of the Lily, obtaining his licentiate on November 19, 1643, ranking second in a class of 159 students. After reading theology for some time, Geulincx was appointed junior professor in philosophy at Lily College in December 1646.

Not much is known about Geulincx’ early career, but it is reasonable to assume that he made a strong impression with his rhetorical skills, founded on the remarkable proficiency in Latin he had already exhibited during his Antwerp school days. In the autumn of 1649, Geulincx’ career prospects seemed secure enough for his parents to give up their life in Antwerp and join their son in Leuven, the area they originally came from. Three years later, Geulincx became senior professor and was asked to deliver a series of speeches during the end-of-the-year Saturnalia festivities. Protests against the nova philosophia at Leuven may have been prompted by Geulincx’ opening address on December 16, 1652, in which he aired Baconian ideas and outlined recommendations for changes to be made to the university curriculum. Initially, however, this did not in any way hinder the successful continuation of his academic career.

Disgrace and downfall came only in 1658, when Geulincx was dismissed, presumably on account of attempting to breach the rule of celibacy for university professors by planning to marry his cousin Susanna Strickers, a privilege that had been granted to his Lily tutor William Philippi (1600-1665) in 1630 only after mediation by the Brabant Council. Reportedly, the 1630 agreement had been made on the explicit condition that this would be the last time. A later, eighteenth-century source mentions disputes with his colleagues and debts as reasons for Geulincx’ dismissal, but these have been impossible to trace. Since there is no evidence that the committee that sacked him had any problems with Geulincx personally, it may well have been the case that he simply had to choose either not to marry or to leave.

Religious considerations, however, may also have played a part. A letter of recommendation signed May 3, 1658 by the three Leiden theologians Abraham Heidanus, Johannes Coccejus (1603-1669) and Johannes Hoornbeek (1617-1666) not only indicates that Geulincx had turned his back on the Catholic faith after he had taken in St. Augustine’s theory of grace, but also that he had initially visited Holland of his own accord in January 1658, “under the pretext of another trip to this province” (Eekhof, 1919: 19). Upon his return to Leuven, Geulincx had found that a successor had been appointed in his place. Although, as the text tells us, he had already decided to give up his position at Leuven, he had not expected the hostile reaction he was met with, since the letter specifies that “he barely escaped a life sentence.” If it is indeed the case that Geulincx confronted his colleagues in January 1658 with the embarrassing fact that he had left Leuven prompted by the intention to convert to Protestantism, it is likely they treated his case with utmost efficiency and discretion. Such intentions would have come at a very untimely moment. Amidst condemnations of Jansenism issued by the Vatican, and declarations of political freedom made by the Brabant Council, there was extremely little room to maneuver for Leuven’s university professors. They may well have been happy to explain Geulincx’ dismissal, if at all, in terms of marriage plans rather than dogmatic preferences. Rumors about Geulincx’ debts, moreover, may have had their origin in the fact that, under these circumstances, Geulincx had to leave everything behind in a hurry, and flee to Leiden penniless.

In his new home town, Geulincx was to graduate in medicine on September 17, 1658, no doubt in order to be able to earn a living. He married Susanna on December 8. Rather than become a doctor, however, his ambition was to resume his career as a professor of philosophy. After a series of appointments and dismissals, Geulincx was finally appointed as junior lecturer in late 1662, with the help of Heidanus, first in logic and then in metaphysics. He was temporarily appointed as Professor extraordinarius in 1665, but was allowed to teach ethics only in February 1667. From June to November 1669, Geulincx was again newly appointed, now in order to teach rhetoric. He died in poverty in November 1669, having failed to pay any rent for the apartment he had shared with Susanna since October 1668. When Susanna died around the New Year, there was only some furniture left to compensate the couple’s creditors.

2. Logic and Method

With two works on the subject of logic, Geulincx had nevertheless started off his Leiden career in a positive mood. In the first of these works, his Logica suis fundamentis restituta (Logic Restored to its Foundations, 1662), Geulincx interprets negation mainly as propositional negation, that is, as acting on the whole of a proposition, not on terms. The other book, Methodus inveniendi argumenta (A Method for Finding Arguments, 1663), used set-theoretic relations to demonstrate logical principles. On account of this approach, the Dutch philosopher Gabriël Nuchelmans (1922-1996) would later refer to Geulincx’ logic as a “containment theory of logic”, in which relations of containment illustrate how statements are implied by other statements. Containment may explain logical consequence, for instance, since the propositional content of a statement q may be implied by that of p, just as, according to Geulincx, every proposition p will entail any number of further statements implied by p. Interpreting the way in which subjects relate to predicates in terms of relations of containment as well, Geulincx considered subjects as the denumerable “parts” of conceptual “wholes”, and considered the connection between subjects and predicates to be made on the basis of the “relation in which they stand to one another within the hierarchical structure of a conceptual field” (Nuchelmans, 1988: 40).
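The containment idea can be illustrated with a small modern sketch. It is not Geulincx’ own notation, and it rests on a simplifying assumption: a statement is represented by the set of atomic claims it jointly asserts, and a concept by the set of things falling under it. The function names and example data are hypothetical.

```python
# Toy model of "containment" (a modern illustration, not Geulincx' notation).
# Assumption: a statement is the set of atomic claims it jointly asserts.

def entails(p: frozenset, q: frozenset) -> bool:
    """p entails q when the content of q is contained in the content of p."""
    return q <= p


def all_are(subject: frozenset, predicate: frozenset) -> bool:
    """'Every S is P' holds when S is a part of the conceptual whole P."""
    return subject <= predicate


# Hypothetical example data.
rain_and_cold = frozenset({"it rains", "it is cold"})
rain = frozenset({"it rains"})
humans = frozenset({"Socrates", "Plato"})
animals = humans | {"Bucephalus"}

print(entails(rain_and_cold, rain))  # True
print(entails(rain, rain_and_cold))  # False
print(all_are(humans, animals))      # True
```

On this toy reading, both logical consequence and predication reduce to the same subset test, which is one way of picturing what a “containment theory” amounts to.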

Producing a modernised summary of Geulincx’ propositional logic in the 1939 issue of Erkenntnis, the Swiss logician Karl Dürr (1888-1970) portrayed Geulincx as an early representative of symbolic logic. Geulincx presented his logical principles in a purely conceptual form and evidently depended on earlier scholastic traditions, such as in his formulation of De Morgan’s laws, which reproduce the fifteenth-century account John Versor offered in his commentary on Peter of Spain. Yet, according to Dürr, Geulincx’ logic contained all the elements of a mathematical logic, complete with variables and logical constants, as well as other remarkable features, such as a Tarskian definition of truth.
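For readers unfamiliar with them, De Morgan’s laws can be checked mechanically with a truth table. The following is a minimal sketch in modern notation, not a reconstruction of Geulincx’ or Versor’s formulation. The function name is hypothetical. Negation is applied to a whole compound proposition, in line with the propositional reading of negation described above.

```python
from itertools import product


def de_morgan_holds() -> bool:
    """Check both De Morgan laws for every assignment of truth values to p and q."""
    for p, q in product([True, False], repeat=2):
        # not (p and q) is equivalent to (not p) or (not q)
        if (not (p and q)) != ((not p) or (not q)):
            return False
        # not (p or q) is equivalent to (not p) and (not q)
        if (not (p or q)) != ((not p) and (not q)):
            return False
    return True


print(de_morgan_holds())  # True
```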

To weigh the sophistication of a seventeenth-century system of logic against its medieval forerunners, or to assess its significance for the development of later formal logic is, however, a complicated matter. In a later study, Dürr compared Geulincx’ achievement with similar works in logic, such as the Port-Royal Logic (1662), and works by Johannes Clauberg (1622-1665), Leibniz and Girolamo Saccheri (1667-1733). Dürr came to the conclusion that, especially in the area of propositional logic, Geulincx’ system was richer than that of most of his contemporaries, whilst in the field of term logic, his basic rules for the formal validity of syllogisms surpassed even those of Leibniz in elegance and precision (Dürr, 1965).

As a senior professor in Leuven, Geulincx had previously shown an interest in Baconian philosophy and had proposed to revise the university’s curriculum in such a way that natural philosophy might be studied as a separate field that also included logic and mathematics, as well as forms of experimentation. It is unknown whether Geulincx developed Cartesian views in Leuven as well, as did his Leuven colleague William van Gutschoven (c. 1618-1667) and his tutor William Philippi (c. 1600-1665) at some point. However this may be, it was only in Leiden that Geulincx began to develop a Cartesian line of argument in natural philosophy, metaphysics and ethics, and expounded views on God’s causal role in nature that would later be interpreted as “occasionalist”.

3. Metaphysics

The appeal to God’s causal activity would become a central feature of both Geulincx’ metaphysics and his ethics, but the way in which he justified and explained the need for a divine administration of the activities normally attributed to “secondary causes” (that is to say, to individual persons and things) differs markedly from the arguments seen in the works of medieval Islamic “occasionalists” and Cartesian contemporaries such as Louis de la Forge, Géraud de Cordemoy and Nicolas Malebranche. Rather than developing the theological view that God exercises full power over man’s causal and epistemological functions; or questioning the metaphysically problematic notion of an exchange of accidents between substances; or, finally, dismissing the possibility that purely corporeal bodies might have a power to move either themselves or other bodies, Geulincx developed his so-called “occasionalist” position on the basis of an interpretation that grounds the idea of causality on the inner experience of active involvement (Renz & Van Ruler, 2010). What may pass for causality in the strictest sense is revealed by what human beings are familiar with, and what they experience within themselves as their own activities: the conscious awareness of “doing” things. Geulincx thus turns the Cartesian focus on human awareness, with its potential for deliberate and conscious activity, into the bedrock of a metaphysics of causal activity. With the notion of activity being linked to states of mental awareness, causality itself becomes the privilege of conscious minds, and a phenomenon for which the subject “doing” things is uniquely responsible.

At the same time, the scope of human activity is greatly reduced on the basis of such a criterion. Since the Cogito, or human consciousness, realises that there are many thoughts (cogitationes) that do not depend on the subject having them, Geulincx very early on in his Metaphysica Vera drew the conclusion that, “[t]here is a knowing and willing being distinct from me” (Geulincx, 1892: 150). It is this being, God, who arouses in us, through his manipulation of matter, the thoughts for which, not knowing how they come about, we cannot claim responsibility ourselves. On the basis of this consideration, Geulincx came to formulate the maxim that has become known as the first axiom of his philosophy: Quod nescis quomodo fiat, id non facis, in other words: “What you do not know how to do, is not your action” (Geulincx, 1892: 150).

In the Metaphysica Vera, or True Metaphysics, first published posthumously in 1691, the focus on the various causal roles of God and man gives rise to a tripartition of the discipline into an Autologia, a philosophy of the Self; a Somatologia, or a metaphysics of the World; and, finally, a Theologia, on God. To include a discussion of the physical universe in an exposition on metaphysics is something that would have been uncharacteristic of Descartes, but it is a move towards a deeper, metaphysical, understanding of nature that Geulincx shares with Spinoza. In fact, although the Metaphysica Vera is an unfinished text that was never authorized and leaves many questions unanswered, it testifies to the way in which various ontological conceptualisations in Spinozism have their antecedents in Geulincx. One of these is the distinction of causal levels into substantial and modal spheres. A significant aspect of Geulincx’ understanding of physical reality is his duplication of the world into a world of “becoming” and a world of “being”, a distinction Geulincx relates to Plato. According to this view, all individual bodies, with their states of “presence” and “absence”, belong to the world of becoming. Based on the idea that a world of mere effects cannot be all there is, Geulincx’ Platonic interpretation of the Cartesian universe introduces the notion of a Body-as-such, in which these effects find their ontological foundation. Carefully avoiding any reintroduction of the Aristotelian terminology of “substance” and “accident”, Geulincx thereby reintroduces the idea of an ontological distinction between the enduring entities of Mind and Body on the one hand, and their varying “modal”, that is, spatio-temporal manifestations on the other.
Formulated in Platonic terms in Geulincx and in Aristotelian terms in Spinoza, this quasi-scholastic strategy to distinguish substantial from accidental levels of being results in a metaphysical interpretation of reality in terms of a diversity of ontological spheres – an interpretation that goes well beyond Descartes, but that we find in both Geulincx’ Metaphysica Vera and Spinoza’s Principia Philosophiae Cartesianae, Short Treatise and Ethics (Van Ruler, 2009). In Geulincx, moreover, Descartes’ indistinct metaphysical categorizations, in which a single universal matter occurs next to a set of countless individual minds and a single God, are transformed into a strict metaphysical dualism according to which there are only two things: God, or Mind, on the one hand, and World, or Matter, on the other. Placing human minds in God, moreover, Geulincx also prefigured Spinoza in his way of arguing that human minds, like human bodies, are parcels of a larger field, or “modes”.

4. Ethics

Parallels with Spinozistic ways of thinking equally occur in Geulincx’ treatment of the subject of ethics. In both authors, Descartes’ natural philosophy serves as a new basis for the neo-Stoic view that morality should primarily be seen as a way of mentally dealing with inevitable patterns of causality in nature and human social life. According to Geulincx, moreover, the application of reason to all areas of experience is the practical upshot of a mental attitude focused on a “love of God”. Contrary to Spinoza, Geulincx had no qualms about the idea that one is free whether or not to align oneself mentally to the necessary course of things. To put it in Geulincx’ own words: whereas one always obeys God, one has the option whether or not to obey reason – and this is what constitutes the criterion of morality.

With his focus on reason, Geulincx conforms to a general tendency within Renaissance moral philosophy. At the same time, he interprets what is reasonable in his own peculiar way, introducing a new set of four cardinal virtues, namely diligence, obedience, justice and humility, in place of the old quadriga of temperance, fortitude, justice and prudence. These virtues are all aimed at reason. Accordingly, rather than being directed towards other human beings, what Geulincx prescribes as obedience is an obedience to reason, just as humility is a mental humility in the face of reason, diligence involves a diligent attention to reason, and justice is the acceptance of a just and reasonable mean.

Reason should always be followed, but in the context of such encouragements to mental subservience, the example with which Geulincx illustrates obedience is easily misread. Even the wretched life of a slave, Geulincx argues, may be lived in freedom, as long as the slave is able to direct his will to the call of reason and to endure even “an appalling and cruel slavery” by obeying orders not because it is the will of his master, but because it is his own (Geulincx, 1986: 82; see also 1893: 23; 2006: 24). Despite its awkward way of seemingly sanctioning slavery, this argument only carries to the extreme another conception predominant in both classical and Renaissance traditions of Western moral philosophy, and most straightforwardly expressed in (neo-)Stoic sources: the notion that mental freedom does not depend on the relative force of outward circumstances, but is brought about exclusively by an inner consent to the demands of reason.

In combination with his metaphysical view on the limitations of human causal activity, such a radical endorsement of intellectualist and indifferentist arguments would seem inevitably to lead to a moral position emphasising a passive or even submissive attitude. Geulincx, however, did not preach quietism. The complete text of his Ethics was published only posthumously in 1675, presumably by Bontekoe, under the title of Gnōthi seauton, or Know Thyself, but Geulincx had already issued a Dutch version of the first of its six “Treatises” as Van de Hooft-deuchden (“On the Cardinal Virtues”) in 1664. Far from teaching resignation, the book contains an exceptionally practical list of ethical maxims and reads like a self-help manual in popular psychology rather than a moral treatise in the traditional sense of the word. What, according to Geulincx, is reasonable for a human being to do in the light of the “human condition”, a concept he may have taken over from the French moralist Pierre Charron, or from his Leuven professor in theology Libert Froidmont (1587-1653), is to abide by seven moral guidelines, or “obligations”: to accept death, to avoid suicide, to take care of one’s health and of that of one’s species, to learn a trade, to earn a living, to relax now and again, and never to curse one’s ancestry or day of birth.

With respect to all of these guidelines, Pierre Charron’s De la Sagesse (1601; revised edition 1603) may have provided Geulincx with a model for the kind of things a moral treatise should instruct (De Vleeschauwer, 1974). Geulincx, however, explained his obligations on the basis of a quasi-Cartesian metaphysical groundwork that at first sight seems to undermine rather than to support them. Denying, like Spinoza, the possibility of any interaction between the body and the mind, Geulincx comes to the conclusion that the human being is only an onlooker, a “spectator” of the outside universe: “I am a mere spectator of a machine whose workings I can neither adjust nor readjust” (Geulincx, 1893: 33; 2006: 34). This would seem to make all human activity not only irrelevant, but downright impossible. Geulincx, however, argues that we should nevertheless be mindful to fulfil certain actions we know from experience God wishes us to perform. We have to search for food, for instance, in order to survive, and we should try to comply in as far as we are able with such evident commitments. Indeed, an attentiveness to the basic facts of life is what links the two aspects of what Geulincx presents as his ethics of ‘humility’. On the one hand, this is the “occasionalist” Inspection of Oneself that tells us we find ourselves in a situation we neither control nor really understand; and, on the other hand, the list of “Obligations” that mark the obvious tasks we have to fulfil, and thus comprise a Disregard of Oneself. We should always choose what we know to be best. The only thing we should not do, according to Geulincx, is to bother about the outcome of our wishes, all of which are ultimately up to God. Thus, in the end, it is only our intentions that matter. In a famous example, Geulincx argued that it is for God to decide whether or not one is killed by the dagger with which one penetrates one’s heart. 
How one’s volitions are matched by activities produced in the material sphere “outside” is necessarily beyond us.

Geulincx does not speculate on the question to what extent we may rely on God’s resolve. Since the way in which God links physical to mental states is unknown to us, it is unclear whether Geulincx himself expected God either to have established a permanent world order or to produce an incessant number of miracles. As German commentators argued in late-nineteenth century debates on the possible impact of Geulincx on Leibniz, the analogy of two independent but synchronised clocks that Geulincx introduced in order to explain the relation between body and mind, seems to accentuate the Cartesian idea of a law-like regularity in nature. This is a position consistent with the emphasis laid on the notion of reason in Geulincx’ ethics. Yet where human volitions are in play, such as in Geulincx’ example of the dagger, or in his references to the phenomenon of paralysis, it would seem that God might have a more immediate role to play.

In the end, a solution to such metaphysical questions is not Geulincx’ primary concern in the context of ethics. As far as morality is concerned, it does not matter whether God makes a singular decision or whether he lets all physical conditions play their proper roles whenever one wishes to pierce one’s heart with a dagger. The moral point is that this should not have been one’s intention in the first place. In this sense, the example is not so much meant to elucidate a metaphysical viewpoint, as it is indicative of Geulincx’ preoccupation with questions of life and death, and with the idea that the realm of the moral is defined by the mental attitude one takes with respect to preserving the condition that one finds oneself in as a conscious being. This is also the way in which to read the ethical axiom that Geulincx introduced as a counterpart to his earlier metaphysical maxim. The slogan Ubi nihil vales, ibi nihil velis, “Wherein you have no power, therein you should not will” (Geulincx 1893: 164; 2006: 178), applies not so much to any specific activities, but rather to human existence, “the human condition”, as such.

Although it is very likely that Spinoza (whose friend Lodewijk Meyer studied with Geulincx) must at least have known Geulincx by name, and although one may trace many coincidences in their works, there is no convincing evidence that the two men either knew each other, or knew each other’s work (Van Ruler, 2006). Likewise, it is unknown to what extent Geulincx’ moral philosophy may have inspired Spinoza. Spinoza may have been thinking of Geulincx for instance when, in his own Ethics, he explicitly denied that humility is a virtue, since this was the single most important of the four cardinal virtues for Geulincx. Spinoza may, on the other hand, also have wished simply to make an unequivocal statement against the traditional glorification of humility in Christian theological contexts.

Contrary to Spinoza, Geulincx presented his own moral philosophy as a philosophy compatible with Christian views. Interpreting certain Christian themes in purely naturalistic ways, such as by taking the “devil” solely to stand for a mental propensity to persist in inconsiderate behaviour, Geulincx’ Christian philosophy was unorthodox, but it was also paradoxical in various respects. He considered his moral philosophy to be an ethics exclusively founded on reason. Still, God’s word, or so Geulincx argued, had worked for him like a microscope: once Scripture had revealed the truth, he was now able to decide questions of right and wrong without its help – in other words, purely on the basis of reason. The obvious implication of this is that pagan philosophers could never have been able to find their way in matters of moral philosophy, and this is indeed what Geulincx concluded. True spiritual redemption was open only to philosophers acquainted with what Scripture had shown to be reasonable: the idea that one has no title to one’s life and that this insight should bear fruit in an attitude of humility. Geulincx might still include Platonic, Stoic and even Aristotelian concepts, and stick to classical forms of philosophical analysis in his ethics, but he dismissed all pagan philosophies for having been developed on the basis of inappropriate motivations. All pagans had urged “for the Land of Cockaigne”; they had craved for pleasure rather than having searched for God (Geulincx 1893: 52-54; 1966: 116-118). The pagans, in other words, had consciously aimed at achieving happiness, when all they should have been doing was to look for what is right. The difference between these two roads, Geulincx admits, is a very subtle one, for since reason and Christianity themselves lead to happiness, one has to be extremely careful to avoid the pitfall of self-centered motivations even as a Christian philosopher.

If, as Geulincx argues, one has to flee happiness in order to pursue it, it may seem tempting to try to flee happiness exactly for the reason of acquiring it. In that case, however, Geulincx argues, happiness “will not pursue you” (Geulincx, 1893: 58; 2006: 57). In other words, while one knows happiness will result from the fulfilment of a duty, one still needs to fulfil the duty without doing it with the aim of acquiring happiness, or it will not work. The notions of “Obligation” and “Law” may help to avert any psychological dilemmas here. Laws, according to Geulincx, never correspond to obvious forms of self-interest, or they would not be laws. If only we direct our mind “to refer nothing of what we do or do not do to our Happiness, but everything to our Obligation” and thus “pledge” ourselves “wholly to God” (Geulincx, 1893: 58 and 57, respectively; 2006: 57 and 56), there is no problem. Libertas will be the immediate, if paradoxical, effect of obedience; and happiness, Felicitas (or beatitude, Beatitudo, a word Geulincx uses for Felicitas only when explaining matters in the accepted scholastic terminology) will present itself automatically as the mental bonus for abiding by the way of virtue. Simply doing what God and reason demand, the wise man is able to disconnect his mind from sensory impressions, and to assent to what happens in God’s universe not according to what is most agreeable to him, but according to the way in which reason presents things as they are.

The complicated dialectics of receiving happiness in return for virtue caused Geulincx to touch upon theological questions as well. If Geulincx became a Jansenist in Leuven after having imbibed Augustine’s theory of grace, and a Calvinist later in Leiden, he must at some point have become aware that the whole idea of devising a Protestant moral philosophy was something inherently problematic. Theologically speaking, there could be no question of a Jansenist or Calvinist God distributing happiness in return for our effort. Geulincx was well aware of this, and therefore attempted to deny that he ever implied that God acts in reply to our achievement: “But mark: I did not say that the Humble first love God, and are then loved in return by God. Certainly not, I did not say this, and this should suffice” (Geulincx, 1893: 64; 2006: 63). Philosophically speaking, the rewards of virtue were nevertheless exactly this: God’s love in return for our love of God and reason. In line with a wider tendency in Dutch Cartesianism, Geulincx inevitably had to argue for a strict separation between philosophy and theology in order to save the practical relevance of his moral philosophy.

Besides classical, Christian and Cartesian themes, there may also have been biographical factors involved in shaping Geulincx’ ethics. The precarious living conditions of his Leiden years in particular seem to be reflected in his preoccupation with the insecurities of life and with the possibility of suicide, both of which topics are central to his ethics. Yet such interests may also have had their origin in a special talent for the experience of wonder in Geulincx, as well as an exceptionally subtle philosophical imagination.

5. Anti-Aristotelianism

It is in foreshadowing quasi-Kantian themes that Geulincx’ philosophical discernment appears most conspicuously. Essentially a criticism of Aristotelian ways of thinking, Geulincx’ Metaphysica ad mentem peripateticam, a book that was published only posthumously in 1691, argued that there was an illusory quality to thinking, aside from the illusiveness of sense perception. Not only was it true that our senses, as Descartes had argued, yield a subjective view of the world; according to Geulincx, our intellectual “ways of thinking” (modi cogitandi) distort our conception of reality just as much. Indeed, it is with intelligible species that we impose our ways of thinking on outside things similarly to the way in which we impose sensible species onto the world that do not apply to things as they are in themselves. Both ways, we “always attribute the phantasms (phasmata) of sense and intellect to things themselves”— even if “there is something divine in us that always tells us it is not so” (Geulincx, 1892: 301).

Once more giving a stricter format to Cartesian intuitions than Descartes himself would have done, and prefiguring Spinoza on both counts, Geulincx distinguished four different kinds of knowledge and drew a sharp distinction between the realm of “imaginations” and the realm of “ideas”. Holding on to a classical notion of scientia that limits the notion of “idea” to the knowledge of the “essence” of a thing, Geulincx interpreted the gradual development of epistemological stages in Platonic rather than in Aristotelian terms and classified the respective levels of knowledge as (1) sense perception, (2) knowledge, or cognitio, (3) scientia, or knowledge with an account; and, finally, (4) the ultimate kind of scientia that is called sapientia or wisdom, which is available only to whoever is accountable for the thing known. Thus offering a seemingly Augustinian-inspired understanding of “ideas” as the kind of things in God’s mind that we must somehow have access to in order to intuit the essences of things, Geulincx in fact denied man any wisdom apart from the wisdom related to his own mental activities, such as our mental activities of love and hate, affirmation and negation and so forth, the reason for this being that to understand these and to will are, in the end, the only things one can actually “do”.

Wisdom accordingly presents itself in Geulincx mainly in a negative way; that is to say, in the form of a recognition that our intellectual capacities are extremely limited with respect to understanding things that occur outside the realm of consciousness. Although the mind knows that all things are either minds or bodies and that infinite mind and infinite extension (that is, God and Body) are ultimately all there is, our “modes of thinking”, in other words, our ways of apprehending reality, misrepresent things as they are in themselves by seeing them as separate “beings” that may function as the subject of predication. Yet we have to see them in this way, if we wish to say something about them.

Although there is an immediate Cartesian context as well as a Scotist terminological background to these arguments, and although, like Geulincx, authors such as Clauberg and Johannes de Raey (1622-1702) had also tried to come to terms with the indistinct manner in which Descartes had discussed general metaphysical concepts in the Principia Philosophiae (Aalderink: 2009), Geulincx’ position stands out for the way in which it emphasizes how the human intellect is liable to characterize the outside world in terms of forms of propositional content that portray whatever there is as being divided into objects possessing certain properties. As Geulincx himself remarks (Geulincx, 1892: 199), “few people seem to observe” that this logical mould introduces ontological classifications for which there is actually no basis in reality itself.

Geulincx thus came to criticise a philosophical viewpoint that had been almost universally shared since Aristotle, the idea, namely, that ontological concepts such as the concept of “substance” may function in parallel ways in metaphysics and logic. His criticism of this view (which is not merely a Peripatetic, but in fact a virtually universal human assumption) launched the epistemologically radical idea that the linguistic and logical ways in which our concepts function within our intellectual representations of the outside world, should actually be a warning against taking them seriously in metaphysical terms. According to Geulincx, logical and linguistic distinctions do not necessarily represent things as they are in themselves. Indeed, notions such as “being (ens), substance, accident, relation, subject, predicate, whole and part” only illustrate how we think about objects. As modes of thought we use these notions to express what we mean when we distinguish a thing from its activity or from our judgement of it. Our manner of understanding, however, should not be confused with the way things are structured and organised independently of our representations of them. Nor should we uncritically build philosophical systems on the categories and logical forms that help us to analyse what we experience.

Because of the way in which he gave prominence to, and ultimately dealt with, the question of the knowability of “things as they are in themselves”, Geulincx’ position has often been associated with the critical philosophy of Immanuel Kant. Ernst Cassirer (1874-1945), for instance, saw both Geulincx’ thesis of the unknowability of “things in themselves” (translated into German as Dinge an sich) and his view that all human understanding is dependent on “forms of thought” brought in by ourselves, as prefigurations of the Kantian position. Although he remained careful not to deny the differences between Geulincx and Kant, the Flemish Geulincx scholar (and former Nazi sympathiser in exile) Herman de Vleeschauwer (1899-1986) in 1957 agreed that if one defines “Criticism” as the theory according to which “we know things only by the medium of our forms of thought”, one could no longer “regard it as the personal discovery of Kant” (De Vleeschauwer, 1957: 63).

In general terms, Geulincx’ alertness to the possible incongruity between the logic of our thoughts and the structure of the outside world may indeed be compared to Kant’s. It may even be extended beyond Kant to serve as a comparison between the Flemish Cartesian’s criticisms of scholastic views and Wittgensteinian as well as postmodern censures of the metaphysical suggestion that logical forms reflect an ontological structure of things. At the same time, Geulincx stood closer to other seventeenth-century denunciations of Aristotelianism inspired by Descartes, such as John Locke’s. Exposing scholastic metaphysics as a logical scheme functional only within the domain of our daily interaction with macroscopic objects, Geulincx’ evaluation of Peripatetic metaphysics, although it is cast in a rather scholastic terminology itself, anticipates Locke’s view in so far as it confirms the idea that there is a purely nominal aspect to the Aristotelian manner of metaphysical categorisation.

And yet Geulincx’ Metaphysica ad mentem peripateticam creates a sense of epistemological alienation that goes far beyond Locke’s criticism of the notion of substance. If, as a direct consequence of Cartesian natural philosophy, Geulincx argued that scholastic types of analysis in metaphysics might be exposed as logico-linguistic frameworks only, this not only meant that there is a certain contingency to the “essences” derived from mere experience; it also meant that the logic of substance itself was mistaken, and that, accordingly, the search for “substantiality” was ill-conceived. Geulincx did, of course, accept the existence of a universal “Body”, but for him, this idea was not dependent on the vague conceivability of a substantial substrate to which one might attach accidental properties. For Geulincx, the notion of Body-as-such may simply be deduced from the fact that one finds many “thoughts” (cogitationes) in one’s conscious experience that do not depend on oneself. Accordingly, there is something out there, something orchestrated by God. This is the World itself, no less, but there is no sense in continuing, like Locke, to see this World as a substance with properties, or to lament the indistinctness of this “something”. With respect to substantiality, we should rather be aware that we are misled by our own intellect into searching for it in the everyday world of things. In the Principia philosophiae, Descartes himself had already argued against trying to conceive of substantial beings behind the forms of “extension” and “thought” that we find in nature. Geulincx drew the ultimate conclusion by arguing that the search for a universal “something” of which the property of being extended is an “accident”, arises from the mistaken belief that the world is structured along the lines of our “modes of thought”.

As a consequence, Geulincx does indeed come close to Kant in the sense that his emphasis on the unknowability of things is modified by the idea that the world as it is “in itself”, remains hidden to our observation and eludes our limited epistemological capabilities to grasp what is actually there. Still, Geulincx’ arguments are very different from Kant’s. According to the Flemish philosopher, our intellect imposes a grid on our experience on account of which we necessarily envision the external world as a world of “things”. Doing so, our metaphysical imagination follows the linguistic and logical habit of distinguishing substantives from adjectives in language and subjects from predicates in logic. The problem with scholastic metaphysics is that it draws ontological conclusions from such cognitive ways of dealing with reality. Just as we attribute our sense impressions to the outside world even though, at least at a certain level of mental development, we become aware that such attributions are incorrect, so too should we, with respect to our intellectual understanding of things, come to doubt the way in which we attribute our cogitationes to things in themselves.

According to Geulincx, there is hardly a way to avoid this, and we cling to the idea of distinguishing beings from properties with even more tenacity than we adhere to the idea of attributing mentally experienced qualities to external things in sense experience. Posing the question how we come to conclude that there is a real basis for distinguishing between subjects and predicates, Geulincx rejects the common scholastic ways of arguing for an actual relation of “inherence” between them. He does, however, offer an alternative ground for our habit of seeing things this way. Whenever we refer to things either as “beings” or as “properties”, it may be that we do so because of the relative stability of our various sense impressions: “The real cause (…) may be, that people see some things as more firm, stable and lasting, others as more fluid, fleeting and frail. Thus (…) light and darkness, colours and sounds and all similar things are regarded as more fluid than body or extension” (Geulincx, 1892: 305). What, in other words, modern psychology and evolutionary biology might consider to be innate propensities, Geulincx was tempted to explain on empiricist grounds. Repeatedly confirming that fluid impressions find their support in firmer ones, rather than that firm marks rest on fleeting signals, our senses will encourage our intellect to follow suit and conceptualise the world in terms of independent beings and their dependent properties.

Rather than to Locke, Kant, or Wittgenstein, Geulincx accordingly compares best to Geulincx himself. A similar interpretation of the way in which the human intellect conceptually rearranges sense experience is found only in the works of his pupil Richard Burthogge. According to Burthogge, the senses give us “external qualia, which reason interprets as predicable of substances or subjects”; a position that, in the terminology of analytical philosophy, has been interpreted as a form of “idealism” (Ayers, 2005: 195). As with so many other statements of this underestimated English philosopher, however, this particular view derives straight from Geulincx’ Metaphysica ad mentem peripateticam.

Again coming closer to Kant than to Locke, Geulincx developed his epistemological arguments vis-à-vis Aristotelianism not so much in order to make room for a new understanding of nature, but rather in order to heighten our philosophical awareness of the fact that we are fundamentally ignorant of what the world is like independently of our experience. As with Kant, moreover, there is a certain religious susceptibility at play in Geulincx’ philosophical concerns. Exhibiting a mental predisposition coloured by Augustinianism in all of his works, Geulincx would always keep wondering at the ineffable character of God’s universe and our position in it.

6. A Philosophy of Wonder

If Geulincx hardly compares to other philosophers in the Western tradition, others did take their inspiration from Geulincx. Having developed an interest in seventeenth-century philosophy during his assistantship at the École Normale Supérieure in Paris from 1928 to 1930, the Irish poet and novelist Samuel Beckett would take up a close study of Geulincx’ works (the Metaphysica vera and Ethics in particular) at Trinity College Dublin, in the spring of 1936. As a direct result of this interest, Arnold Geulincx was to play a crucial role in Beckett’s Murphy (finished in June 1936 and published in 1938), a book that presents its leading character preferably sitting naked in his London apartment, tied to a teakwood rocking chair. Implicit references to Spinoza and explicit references to Geulincx accompany the way in which Murphy’s inner experience is detailed and further explained in later chapters.

It has been well established how Geulincxian imagery, such as that of the cradle (which Geulincx used to explain the relationship between our will and God’s, and which Beckett turned into a rocking chair), the two synchronised clocks, and the passenger walking on the deck of a ship against the vessel’s direction (an image Geulincx himself may have derived from Justus Lipsius), was continually reused by Beckett well beyond Murphy; how Geulincxian expressions, such as “coming hither, acting here, departing hence”, turn into metaphorically rich elements of literary structure in Beckett; and how Geulincx’ overall theme of power and impotency would continue to resonate in Beckett’s plays, prose and cinematographic works. If it is true that “[what] chiefly endured for Beckett from Geulincx was his acceptance of ignorance as the basic human condition, his ethic of humility and his advocacy for ascetic withdrawal and rigorous self-examination” (Herren, 2012: 195), it is also clear why Geulincx might come to function as a replacement for Descartes in Beckett, and as “a philosopher who spoke to [Beckett] as no other had” (Cordingley, 2012: 49). The contrast between Geulincx and Descartes may also serve to accentuate that there was a Geulincxian conceptual background to what twentieth-century philosophers may have derived from Beckett’s plays. On account of the element of ineffability that Geulincx added to Cartesianism, it has been argued that the notion of an “absence of self-presence”, particularly in thinking and in authorship (a theme taken up by French philosophers such as Blanchot, Foucault and Derrida), found a Geulincxian inspiration in Beckett (Uhlmann, 2006: 113).

The great difference, however, between Geulincx on the one hand and twentieth-century French philosophers inspired by Beckett’s absurdist plays on the other, is that Geulincx, like Beckett himself for that matter, had no inclination to diminish the importance of subjective experience. Indeed, it is precisely in this respect that Geulincx’ “experiential” defence of occasionalist arguments was squarely at odds with Malebranche’s alternative notion of God’s pre-ordination of human minds. Later commentators have been surprised by such disparities within occasionalist philosophy (Nadler, 1999), or have even drawn the misguided conclusion that Geulincx was an inconsistent occasionalist (Terraillon, 1912; Rousset, 1999). In fact, rather than explaining away human mental activity on the grounds of theological or determinist dogma, Geulincx not only took the inner world of consciousness as his starting point in philosophy, but also saw it as a cause for wonder at the singularity of the human condition. If things prove themselves to be ineffable, it is to the human subject that they do so. Similarly, if outside things remain inscrutable, it is only of inner experience itself that our knowledge is genuine and absolute.

Samuel Beckett is believed to have broken away from making further dogmatic use of philosophy after his post-war realisation that “All I am is feeling” (Uhlmann, 2006: 72). His interest in Geulincx, however, did not suffer from this. If a mood of estrangement, coupled to a painstaking examination of the inner life, is what Beckett found familiar in Geulincx, it is significant that Beckett never studied the quasi-Kantian arguments from the Metaphysica ad mentem peripateticam, arguably Geulincx’ most radical philosophical text. Apart from some transcriptions taken from the Metaphysica vera, Beckett took his notes mainly from Geulincx’ Ethics. A familiarity of viewpoints must have been obvious to Beckett in these texts as well, which may add to our conviction that Beckett’s prolonged interest in Geulincx was based primarily on an affection that went beyond specific images or doctrines of philosophy.

There was obviously “something of a friendship across centuries” between Beckett and Geulincx (Tucker, 2012: 181), apparently motivated by the articulation of a shared experience that Beckett cherished in Geulincx, and that presumably involved a recognition of something very intimate and relatively rare, even if it had been expressed in such technical philosophical contexts as a religiously motivated metaphysics and a theory of ethics combining classical and Christian themes.

The ultimate secret to Geulincx’ appeal may be that his philosophical texts, despite their traditional setting, have a captivating strangeness to them, which is linked to the alienating topics they address. Whatever his philosophy may have done for Beckett’s artistic development, it is beyond doubt that, just as in Samuel Beckett’s case, Geulincx’ Baroque blend of Augustino-Cartesianism will continue to impress like-minded readers by its unique evocation of the timeless motif of human metaphysical ignorance, as well as by its humbling expression of amazement at the mystery of existence.

7. References and Further Reading

a. Primary Sources

  • Geulincx, Arnold, Opera philosophica, vol. 1, ed. J.P.N. Land, The Hague: Martinus Nijhoff, 1891.
    • Geulincx’ Orationes and the Logica restituta
  • Geulincx, Arnold, Opera philosophica, vol. 2, ed. J.P.N. Land, The Hague: Martinus Nijhoff, 1892.
    • The Methodus, as well as the metaphysical and physical works
  • Geulincx, Arnold, Opera philosophica, vol. 3, ed. J.P.N. Land, The Hague: Martinus Nijhoff, 1893.
    • Ethics, ethical disputations and Notes on Descartes
  • Geulincx, Arnold, Sämtliche Schriften in fünf Bänden, ed. H.J. de Vleeschauwer, Stuttgart-Bad Cannstatt: Frommann-Holzboog, 1965–1968.
    • A handy reprint (in 3 vols.) of the Opera Philosophica
  • Geulincx: Présentation, choix de textes et traduction, ed. Alain De Lattre, Philosophes de tous les temps vol. 69, Paris: Seghers, 1970.
    • A selection of Geulincx’ texts in French
  • Geulincx, Arnout, Van de hoofddeugden. De eerste tuchtverhandeling, ed. Cornelis Verhoeven, Baarn: Ambo, 1986.
    • The first part of the Ethics in a modern version of the Dutch original
  • Geulincx, Arnold, Metaphysics, ed. Martin Wilson, Wisbech: Christoffel Press, 1999.
    • First English edition of the Metaphysica vera
  • Geulincx, Arnold, Ethics, With Samuel Beckett’s Notes, ed. Han van Ruler, Anthony Uhlmann and Martin Wilson. Leiden and Boston: Brill, 2006.
    • The complete Ethics in English with a transcription of Beckett’s notes
  • Geulincx, Arnold, Éthique, ed. Hélène Bah-Ostrowiecki, Turnhout: Brepols, 2010.
    • The Ethics in a modern French edition

b. Secondary Sources

  • Aalderink, Mark, ‘Spinoza and Geulincx on the human condition, passions, and love’, Studia Spinozana vol. 15 / Wiep van Bunge (ed.), Spinoza and Dutch Cartesianism, Würzburg: Königshausen & Neumann, 2006, pp. 67-87.
    • On the Augustinian concept of love and its impact on Geulincx and Spinoza
  • Aalderink, Mark, Philosophy, Scientific Knowledge, and Concept Formation in Geulincx and Descartes, Utrecht: Zeno, 2010.
    • Published dissertation on the epistemological differences between Descartes and Geulincx
  • Armogathe, Jean-Robert, and Vincent Carraud, ‘The First Condemnation of Descartes’ Œuvres: Some Unpublished Documents from the Vatican Archives’, in: Daniel Garber and Steven Nadler (eds.), Oxford Studies in Early Modern Philosophy, vol. 1, Oxford: Clarendon, 2003, pp. 67-109.
    • Contains the only known reference to Geulincx’ marriage plans as a reason for his dismissal
  • Ayers, M.R., ‘Richard Burthogge and the Origins of Modern Conceptualism’, in: Tom Sorell and G.A.J. Rogers (eds.), Analytic Philosophy and History of Philosophy, Oxford: Clarendon, 2005, pp. 179-200.
    • On Geulincx’ most important pupil in epistemology
  • Cassirer, Ernst, Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit (Berlin, 1906-1923), ed. Dagmar Vogel in 2 vols., Hamburg: Meiner, 1999.
    • On Geulincx and Kant
  • Cooney, Brian, ‘Arnold Geulincx: A Cartesian Idealist’, Journal of the History of Philosophy, vol. 16 (1978), pp. 167-180.
    • English language introduction to Geulincx
  • Cordingley, Anthony, ‘École Normale Supérieure’, in: Anthony Uhlmann (ed.), Samuel Beckett in Context, Cambridge: Cambridge U.P., 2013, pp. 42-52.
    • On Samuel Beckett’s intellectual development during the late 1920s and early 1930s
  • Dürr, Karl, ‘Die mathematische Logik des Arnold Geulincx’, The Journal of Unified Science (Erkenntnis), vol. 8 (1939-40), pp. 361-8.
    • A translation of Geulincx’ logic in modern terms
  • Dürr, Karl, ‘Arnold Geulincx und die klassische Logik des 17. Jahrhunderts’, Studium Generale 18 (1965-8), pp. 520-541.
    • Geulincx’ logic in the context of other seventeenth-century sources in the field
  • Eekhof, A., ‘De wijsgeer Arnoldus Geulincx te Leuven en te Leiden’, in Nederlandsch Archief voor Kerkgeschiedenis, new series, vol. 15 (1919), pp. 1-24.
    • On the Letter of Recommendation of 3 May 1658
  • Herren, Graley, ‘Working on Film and Television’, in: Anthony Uhlmann (ed.), Samuel Beckett in Context, Cambridge: Cambridge U.P., 2013, pp. 192-202.
    • On Beckett’s psychology and Geulincx’ influence on his screenplays
  • Kossmann, E.F., ‘De laatste woning van Arnold Geulincx’, in Bijdragen voor Vaderlandsche Geschiedenis en Oudheidkunde 7-3, pp. 136-138.
    • On Geulincx’ last residence and debts
  • Land, J.P.N., ‘Arnold Geulincx te Leiden (1658-1669)’, in Verslagen en Mededeelingen der Koninklijke Akademie van Wetenschappen, Afdeeling Letterkunde, 3rd series, vol. 3 (1887), pp. 277-327.
    • On Geulincx’ Leuven dismissal and Leiden career
  • Land, J.P.N., ‘Aanteekeningen betreffende het leven van Arnold Geulincx’, in Verslagen en Mededeelingen der Koninklijke Akademie van Wetenschappen, Afdeeling Letterkunde, 3rd series, vol. 10 (1894), pp. 99-119.
    • On Geulincx’ life in Flanders
  • Land, J.P.N., Arnold Geulincx und seine Philosophie. The Hague: Martinus Nijhoff, 1895.
    • A Geulincx biography.
  • Lattre, A. de, L’occasionalisme d’Arnold Geulincx, Paris: Les Editions de Minuit, 1967.
    • Published dissertation on Geulincx’ philosophy
  • McCracken, J.D., Thinking and Valuing: An Introduction, Partly Historical, to the Study of the Philosophy of Value, London: Macmillan, 1950.
    • Interpretation of Descartes, Geulincx and Spinoza as a particular school of ethics
  • Monchamp, Georges, Histoire du Cartésianisme en Belgique, Bruxelles et St. Trond: F. Hayez, 1886.
    • An as yet unsurpassed history of Cartesianism in the Southern Netherlands
  • Nadler, Steven, ‘Knowledge, Volitional Agency and Causation in Malebranche and Geulincx’, British Journal for the History of Philosophy 7 (1999-2), pp. 263-274.
    • On similarities and differences between Malebranche and Geulincx
  • Nuchelmans, Gabriël, Geulincx’ Containment Theory of Logic, Amsterdam: Koninklijke Nederlandse Akademie van Wetenschappen / Noord-Hollandsche Uitgevers Maatschappij, 1988.
    • Detailed account of Geulincx’ use of set theory in logic
  • Paquot, Jean Noël, Memoires pour servir a l’histoire litteraire des dix-sept provinces des Pays-Bas, de la principauté de Liege, et de quelque contrées voisines, vol. 13, Louvain: De l’imprimerie academique, 1768.
    • Reference to Geulincx’ presumed Leuven quarrels and debts
  • Pfleiderer, Edmund, Leibniz und Geulincx: Mit besonderer Beziehung auf ihr beiderseitiges Uhrengleichniss, Tübingen: Tübinger Universitäts-Schriften, 1884.
    • Start of the controversy on the image of the synchronised clocks in Geulincx and Leibniz
  • Renz, Ursula, and Han van Ruler, ‘Okkasionalismus’, in: Hans Jörg Sandkühler (ed.), Enzyklopädie Philosophie, Hamburg: Felix Meiner, 2010, vol. 2, pp. 1843-1846.
    • On the diversity of occasionalisms
  • Rousset, Bernard, Geulincx entre Descartes et Spinoza, Paris: Vrin, 1999.
    • Posthumously published monograph on Geulincx
  • Ruler, Han van, ‘“Something, I know not what.” The Concept of Substance in Early Modern Thought’, in Lodi Nauta and Arjo Vanderjagt (eds.), Between Imagination and Demonstration. Essays in the History of Science and Philosophy Presented to John D. North, Leiden: Brill, 1999, pp. 365-93.
    • On Geulincx, Locke and the notion of individuality in scholastic and Cartesian thought
  • Ruler, Han van, ‘Geulincx, Arnold (1624-1669)’, in Wiep van Bunge, Henri Krop, Bart Leeuwenburgh, Han van Ruler, Paul Schuurman and Michiel Wielema (eds.), The Dictionary of Seventeenth and Eighteenth-Century Dutch Philosophers, in 2 vols, Bristol: Thoemmes, 2003, vol. 1, pp. 322-331.
    • Extended dictionary entry on Geulincx and his works
  • Ruler, Han van, ‘Geulincx and Spinoza: Books, Backgrounds and Biographies’, in Studia Spinozana 15 / Wiep van Bunge (ed.), Spinoza and Dutch Cartesianism. Würzburg: Königshausen & Neumann, 2006, pp. 89-106.
    • On whether Geulincx and Spinoza knew each other or each other’s work
  • Ruler, Han van, ‘Spinozas doppelter Dualismus’, transl. Andreas Fliedner, in: Deutsche Zeitschrift für Philosophie 57 (2009-3), pp. 399-417.
    • On parallel forms of dualism in Geulincx and Spinoza
  • Terraillon, Eugène, La morale de Geulincx dans ses rapports avec la philosophie de Descartes, Paris: Alcan, 1912.
    • Short work on Geulincx’ occasionalism
  • Thijssen-Schoute, C. Louise, Nederlands Cartesianisme, Amsterdam: Noord-Hollandsche Uitgevers Maatschappij, 1954; new ed. by Theo Verbeek, Utrecht: HES, 1989.
    • Source book on Dutch Cartesianism
  • Tucker, David, Samuel Beckett and Arnold Geulincx: Tracing ‘a literary fantasia’, London: Continuum, 2012.
    • Detailed study and interpretation of all of Beckett’s references to Geulincx
  • Uhlmann, Anthony, Samuel Beckett and the Philosophical Image, Cambridge: Cambridge U.P., 2006.
    • On Beckett’s use of philosophical themes and their literary and philosophical impact
  • Uhlmann, Anthony, Chris Conti and Andrea Curr (eds.), Arnold Geulincx Resource Site, funded by the Australian Research Council: www.geulincx.org
    • A website dedicated to Geulincx research by The Beckett and Geulincx Research Project
  • Uhlmann, Anthony (ed.), Samuel Beckett in Context, Cambridge: Cambridge U.P., 2013.
    • A volume of articles on Beckett’s intellectual biography
  • Vander Haeghen, Victor, Geulincx. Étude sur sa vie, sa philosophie et ses ouvrages, Diss. Liège, Gent: Vanderhaeghen, 1886.
    • Complete intellectual biography
  • Vanpaemel, Geert, Echo’s van een wetenschappelijke revolutie. De mechanistische natuurwetenschap aan de Leuvense Artesfaculteit, Brussel: KAWLSK, 1986.
    • On Leuven University’s curriculum and Geulincx’ proposals for change
  • Verbeek, Theo, ‘Geulincx, Arnold (1624-69)’, in Edward Craig (ed.), Routledge Encyclopedia of Philosophy, vol. 4, London: Routledge, 1998, pp. 59-61.
    • Concise account of Geulincx’ philosophy and its relation to Descartes
  • Vleeschauwer, Herman J. de, Three Centuries of Geulincx Research, Mededelings van die Universiteit van Suid-Afrika / Communications of the University of South Africa, Pretoria 1957.
    • Bibliographical outline of Geulincx-interpretations
  • Vleeschauwer, Herman J. de, ‘Ha Arnold Geulincx letto il « De la Sagesse » de Pierre Charron?’, Filosofia 25 (1974-2 and 1974-4), pp. 117-134 and 373-388.
    • On Charron’s De la Sagesse as a model for Geulincx’ Ethics.

 

Author Information

Han van Ruler
Email: vanruler@fwb.eur.nl
Erasmus University
The Netherlands

Neocolonialism

The term “neocolonialism” generally refers to the actions and effects of certain remnant features and agents of the colonial era in a given society. Post-colonial studies have shown extensively that, despite the achievement of independence, the influences of colonialism and its agents are still very much present in the lives of most former colonies. Practically every aspect of the ex-colonized society still harbors colonial influences. These influences, their agents and effects constitute the subject matter of neocolonialism.

Jean-Paul Sartre’s Colonialism and Neocolonialism (1964) contains the first recorded use of the term neocolonialism. The term has become an essential theme in African philosophy, most especially in African political philosophy. In the book, Sartre argued for the immediate disengagement of France’s grip upon its ex-colonies and for total emancipation from the continued influence of French policies on those colonies, particularly Algeria. However, it was at one of the All African People’s Conferences (AAPC), a movement of political groups from countries in Africa under colonial rule which held conferences in the late 1950s and early 1960s in Accra, Ghana, that the term was first officially used in Africa. In the AAPC’s “1961 Resolution on Neocolonialism,” the term neocolonialism was given its first official definition. It was described as the deliberate and continued survival of the colonial system in independent African states, by turning these states into victims of political, mental, economic, social, military and technical forms of domination carried out through indirect and subtle means that did not include direct violence. With the publication of Kwame Nkrumah’s Neo-colonialism: The Last Stage of Imperialism in 1965, the term neocolonialism finally came to the fore. Neocolonialism has since become a theme in African philosophy around which a body of literature has evolved, written and studied by scholars in sub-Saharan Africa and beyond. As a theme of African philosophy, neocolonialism requires critical reflection upon the present socio-economic and political state of Africa after independence from colonial rule and upon the continued influence of the ex-colonizers’ socio-economic and political ideologies in Africa.

Table of Contents

  1. Introduction
  2. History of Neocolonialism
  3. Neocolonialism: Related Concepts
     a. Colonialism
     b. Imperialism
     c. Decolonization
  4. Neocolonialism: The Last Stage of Imperialism
  5. The Myth of Neocolonialism
  6. Neocolonialism Today in Africa: The Era of Globalization
  7. Conclusion
  8. References and Further Reading

1. Introduction

Neocolonialism can be described as the subtle propagation of socio-economic and political activity by former colonial rulers aimed at reinforcing capitalism, neo-liberal globalization, and cultural subjugation of their former colonies. In a neocolonial state, the former colonial masters ensure that the newly independent colonies remain dependent on them for economic and political direction. The dependency and exploitation of the socio-economic and political lives of the now independent colonies are carried out for the economic, political, ideological, cultural, and military benefits of the colonial masters’ home states. This is usually carried out through indirect control of the economic and political practices of the newly independent states instead of through direct military control as was the case in the colonial era.

Conceptually, the idea of neocolonialism can be said to have developed from the writings of Karl Marx (1818-1883) in his influential critique of capitalism as a stage in the socio-economic development of human society. The continued relevance of Marxist socio-economic philosophy in contemporary times cannot be denied. The model of society as structured by an economic basis, legal and political superstructures, and a definite form of social consciousness that Marx presented both in The Capital (1972) and in the Preface to the Critique of Political Economy (1977) remains important to socio-economic theory. Marx presents theories which explain a certain kind of evil in capitalism. Today, capitalism has produced multinational corporations that can assemble far more effective intelligence behind their often nefarious designs than any nation’s government can assemble to try to hold them at bay. As things stand with the capitalist system, some of Marx’s prognostications appear to have shown foresight. The world seems to continue to acquiesce to the vast control of economic and political resources by the wealthiest 1%. Arguably, Marx’s prognostications have been vindicated in more ways than they have been refuted.

Subsequent analysis by Sartre, in his critique of French economic policies on Algeria, was an attempt to combine his existentialist idea of human freedom with Marx’s economic philosophy in order to better establish his opposition to France’s economic colonization of Algeria. Proper coinage of the term neocolonialism in Africa, however, is attributed to Nkrumah, who used it in his 1963 preamble to the Organization of African Unity (OAU) Charter and later as the title of his 1965 book, Neo-colonialism: The Last Stage of Imperialism.

In simple terms, neocolonialism is a class name for all policies, infrastructures and agents active in a society which indirectly serve to grant continuity to the practices of the colonial era. The essence of neocolonialism is that while the state appears to be independent and to have total control over its dealings, it is in fact controlled by outside economic and political influences (Nkrumah 1965, 7). The loss of control of the machinery of the state to the neocolonialists forms the basis of Nkrumah’s discourse.

In his article “Philosophy and Post-Colonial Africa”, Tsenay Serequeberhan explicates the nature of neocolonialism in Africa in a manner that reveals how Europe propagates its policy of socio-economic and political dominance in post-colonial Africa. For Serequeberhan, neocolonialism in Africa is that which internally replicates in a disguised manner what was carried out during the colonial period. This disguised form constitutes the nature of the European neocolonial subjugation as it concerns the politics of economic, cultural, and scientific subordination of African states (Serequeberhan 1998, 13). With this, we can describe the general nature of neocolonialism as a divergence in national power (political, economic, or military) which is used rather lopsidedly by the dominant power to subtly compel sectors of the dominated society to do its bidding. The method and praxis of neocolonialism lie in enjoining leaders of the newly independent states to accept development aid and support, through which the imperial powers continue to penetrate and control their ex-colonies. Under the guise of development aid and support, and of technological and scientific assistance, the ex-colonial masters impose their hegemonic political and cultural control in the form of neocolonialism (Serequeberhan 1998, 13). In such a situation, the leaders of the seemingly independent African states become minions to the whims and caprices of the ex-colonial lords or their multinational corporations in terms of the management of the affairs of the new states. Prima facie, it would seem that the neocolonial state is free of the influence of imperialists, and it appears to be governed completely by its own indigenes. In truth, though, the state remains under its former colonial masters and their accomplices. Being under the continued impression that the former colonialists are superior and more civilized, the leaders of the supposedly independent new states continue to practice, and encourage the people to imbibe, the ways and cultural practices, and more essentially the economic control, of the imperialists.

Within a neocolonial situation, therefore, the imperialists usually maintain their influence in as many sectors of the former colony as possible, making it less of an independent state and more of a neo-colony. To this end, in politics, economics, religion, and even education, the state looks up to its imperialists, rather than improving upon its own indigenous culture and practices. Through neocolonialism, the more technologically advanced nations ensure their involvement with low income nations, such that this relationship practically annihilates the potential for the development of the smaller states and contributes to the capital gain of the technologically advanced nations (Parenti 2011, 24).

In On the Postcolony, Achille Mbembe further examines the nature of neocolonialism in Africa and says that the underpinning theory on which neocolonialism rests consists of bald assertions with no tenable arguments to support it. Evidently, in his view, after colonialism ended in Africa, the West did not consider that Africans were capable of organizing themselves socially, economically and politically. To Mbembe, the reason for holding such ideas and advancing them is simply that the African is believed to be intellectually poor and is reducible to the level of irrationality. In his words, the capacity for Africans to rationally organize themselves is “understood through a negative interpretation” (Mbembe 2001, 1). This interpretation presents the African as never possessing things and attributes that are properly part of human nature, or rather (even if reluctantly granted the status of the human) those things and attributes are generally of lesser value, little importance, and of poor quality (Mbembe 2001, 1). In other words, since Africans and other people that are different in race, language, and culture from the West are held not to possess the power, the rigour, the quality, and the intellectual analytical abilities that characterize Western philosophical and political traditions (Mbembe 2001, 2), it is then difficult, on this view, to assume that they would have the rational capacity to organize themselves socially, economically and politically. In a rejoinder to this bald assertion and negative interpretation, Mbembe retorts that the West has always had insurmountable difficulties with accepting an African theory on the “experience of the Other”, or on the issue of the “I” of others, which the West seems to perceive as foreign to it. In other words, the typical Western tradition has always denied the existence of any “self” but its own. It has always denied the idea of a common human nature, such that, “a humanity shared with others, long posed, and still poses, a problem for Western consciousness” (Mbembe 2001, 2).

Fundamentally, this denial is not peculiar to the period of neocolonialism alone. It has a history that dates back to the period of the trans-Atlantic slave trade and colonialism. In his book, The Invention of Africa, V. Y. Mudimbe asserts that there are three methods that are representative of the colonial structure in Africa: the domination of physical space, the reformation of natives’ minds, and the integration of local economic histories into the Western perspective. This structure constitutes the three complementary aspects of the colonial organization which embraces the physical, human, and spiritual elements of the colonizing experience (Mudimbe 1988, 2). This colonial structure is aimed at emphasizing a historicity that promotes discourses on African primitiveness, which is used in justifying why the continent needed to be conquered and colonized in the first place (Mudimbe 1988, 20). Citing what Ignacy Sachs calls “europeocentricism”, Mudimbe says this model of colonialism is to “dominate our thought and given its projection on the world scale by the expansion of capitalism….it marks contemporary culture imposing itself as a strongly conditioning model for some and forced deculturation for others” (Sachs 1971, 22). In all of this, according to Mudimbe, europeocentricism is anchored on the denial of the “Other” in European consciousness. It continues to be a denial in spite of the assertion of the “existence of the other” in Paul Ricoeur’s meditation on the irruption of the other: “when we discover that there are several cultures instead of just one…be it illusory or real, we are threatened with destruction by our own discovery. Suddenly, it becomes possible that there are just others, that we ourselves are an “other” among others” (Ricoeur 1965, 278). To this end, if one accepts Ricoeur’s assertion on the existence of cultural pluralism one would be able to affirm the foundation of Mudimbe’s submission in his The Idea of Africa that in each continent, for example Africa, “there are natural features, cultural characteristics, and, probably, values that contribute to its reality different from those of, say, Asia and Europe” (Mudimbe 1994, xv).

It is based on this distinct reality of each culture that William Abraham in The Mind of Africa examines the problems and challenges that face post-colonial Africa vis-à-vis the continent’s interaction with Europe. Abraham acknowledges the existence of neocolonialism in Africa, but proposes an integrative form of culture whereby certain positive aspects of Western culture may be integrated with African culture in order to forge a common bond (Abraham 1962, 83). Abraham emphasizes, however, that in spite of the ongoing social, economic and political change in Africa due to the impact of neocolonialism, Africa’s culture must be guarded from being eroded by Western influence and civilization, or what he refers to as “the externality of an outsider” (Abraham 1962, iv).

The above description of the nature of neocolonialism and its different dimensions briefly elaborates on the themes of subjugation and the imposition of a hegemonic economic, political, and social order, mostly in the guise of trade relations or development aid grants by the imperialists. The attendant implication relates to how post-colonial African states have seemingly failed to apply themselves to the problems of self-maintenance.

2. History of Neocolonialism

From the late nineteenth century through to the latter half of the twentieth century, some European countries, such as Britain, France, Belgium, and Portugal, colonized a large number of African nations, setting up economic systems that allowed for extensive exploitation. Decades after World War II, these European nations granted political independence to their colonies in Africa, but still found a way to retain their economic influence and power over the former colonies. As many African colonies began to gain independence from the 1950s onward, they soon realized that the actual liberation they had anticipated was illusory. So, in spite of the assumption of political leadership positions by Africans, the economic and political atmosphere was still under some form of control by the former colonial masters. By implication, post-colonial Africa continued to experience the domination of the Western-styled economic model that was prevalent during the period of colonialism. It does appear that the former colonial masters only wanted to grant political independence to their former colonies, and did not want them to be liberated from colonialism. This is why it is inferred that the situation which informs the ideological implementation of neocolonialism in Africa began immediately after the political independence of most African states.

In postcolonial Africa, events and situations have revealed how neocolonialism was nurtured from the moment independence was granted. The elements of neocolonial influence that continue to be apparent in the interactions between former colonial masters and their former colonies attest to this assertion. For example, the point could be made with regard to the ongoing interactions between France and Francophone African countries such as Cameroon, Togo and Ivory Coast, as well as between Britain and Anglophone African countries such as Ghana, Nigeria and the Gambia.

In the case of Cameroon, particularly after the amalgamation of French Cameroon with Southern British Cameroon in 1961, the granting of political independence to Cameroon by France was dependent on certain negotiations on matters of defense, foreign policy, finance, and economy, as well as technical assistance. This resulted in the adoption and institutionalization of the French 5th Republic constitutional model, alongside French political, economic, monetary, and cultural dominance in the new Cameroon (Martin 1985, 192). Following the creation of the French franc zone, which established the CFA franc as the general currency for all Francophone countries, the West African colonies became tied in a fixed parity of 50:1 to the French franc, automatically granting the French government control over all financial and budgetary activities (Goldsborough 1979, 72). France also continued its military presence in Cameroon after independence, establishing military and defense assistance agreements with the country (Martin 1985, 193). Furthermore, the French institutionalized linguistic and cultural links with all their former colonies, thereby creating “La Francophonie”, which served as a platform for reinforcing the assimilation of the French language, culture and ideology (Martin 1985, 198).

Although Britain may have continued to maintain an indirect economic influence on its former colonies through multinational corporations, the direct effects of Britain’s neocolonial socio-political ideologies have diminished significantly over the years. However, the West in general maintains an indirect form of domination over all developing African countries through means such as loans from the World Bank or the International Monetary Fund (IMF). This form of neocolonialism operates through foreign aid or foreign direct investment on which strict or severe financial conditionalities are imposed. Such conditionalities often render the neocolonial state subservient to the economic and sometimes political will of the foreign donor.

Clearly, from this brief history of neocolonialism in Africa, one can see that colonialism itself had an epochal impact on the history of contemporary Africa. This is why the study of neocolonialism has become critical to the study of African history and politics. It is, however, more crucial to the field of African philosophy because of the need to reflect on the socio-political and moral impacts of neocolonialism on Africa.

3. Neocolonialism: Related Concepts

Included in the themes of African philosophy, especially African social and political philosophy, are the related concepts of neocolonialism, colonialism, imperialism and decolonization. This is rightly so because of the events and activities of the European expansionist agenda that occurred, especially in Africa’s modern history. Although these epochal events preceded one another, the methods and praxis of colonialism, imperialism, and neocolonialism are only slightly different. Their common denominators include social, economic, political, and cultural subjugation of the colonized. However, the concept of decolonization differs essentially from the others. While the others are conceptually linked with exploitation and domination, decolonization is a means for liberation, which could be through a social, cultural, political and economic form of revolution.

a. Colonialism

Broadly construed, the term colonialism can be described as the deliberate imposition of the rules and policies of a nation on another nation. Its strategy is the forced placement of a nation over another that gives room for the opportunity to exploit the colonized nation in order to facilitate the economic development of the colonialist home state.

A definition by Ronald J. Horvath sees colonialism as a “form of domination – the control by individuals or groups over the territory and/or behavior of other individuals or groups” (Horvath 1972, 46). Clearly, colonialism is a tool for expansion and a form of exploitation on all fronts. This is why Robert Young’s view on colonialism is that it “involved an extraordinary range of different forms and practices carried out with respect to radically different cultures, over many centuries” (Young 2001, 17).

The idea behind colonialism basically is the conquest and rule over a country or region by another, allowing for the exploitation of the resources of the conquered for the profit of the conqueror. Colonialism is an instrumental process through which a state acquires and maintains colonies in another territory. The outcome of this, which is the colonial stage of society, alters mildly or altogether the economic, political, social and even intellectual structure of the conquered state.

Between the 1860s and 1900s, Africa as a whole was subjected to various forms of aggression from Europe, ranging from diplomatic pressures to military invasions until almost all African states were finally conquered and colonized. The process of colonization came to its complete stage with invasions of the political, economic and socio-cultural spheres of the African societies.

The first attempts at colonization occurred when the Europeans began to pursue trade outside their own continent, and thus discovered that many other nations, particularly in Africa, had a wealth of natural resources with potential for European economic gain.

We can simply say that the nature of colonialism involves a forced relationship between an indigenous majority and a minority of foreign invaders. Of course, its history can be traced to slavery, where indigenous people, particularly of Africa, were forcibly and violently taken as slaves to plantations in Europe and the Americas. Through slavery, Africa’s sons and daughters in large numbers were violently seized and taken to Europe as sellable commodities (Nwolize 2001, 25). However, as slavery was ending in the 1850s, Europe was preparing another round of violent visitation against Africa. This invasion took off in earnest after the Berlin Conference of 1884-1885. Colonialism came with further violence. Vandalism, murder, torture, looting, rape, death, and destruction were the order of the day (Afisi 2009a, 62).

Certain basic assumptions seem to have informed the colonial construction of African savagery which was used to justify the nature of colonial warfare. Works of Enlightenment philosophers such as Georg Wilhelm Friedrich Hegel’s (1770-1831) Lectures on the Philosophy of World History (1975) and Immanuel Kant’s Anthropology from a Pragmatic Point of View (1798, 2006) essentially informed these assumptions. Hegel speculates about the continent of Africa and asserts that Africa proper “is enveloped in the dark mantle of Night”. To Hegel, “The peculiarly African character is difficult to comprehend, for the very reason that in reference to it, we must give up the principle which naturally accompanies all our ideas–the category of Universality” (1975, 174). Hegel here states that Africans lack the category of Universality and, also, situates the African at the level of irrationality. “The Negro,” Hegel writes, “exhibits the natural man in his completely wild and untamed state” (174). In Hegel’s portrayal, the African was a complete moron who had no idea of decency and could not distinguish his right from his left. Similarly, the racist proclivities of Kant lie in his denial of any intellectual endowments and rational abilities to non-white races.

In a further attempt to rationalize colonialism, Lucien Lévy-Bruhl (1985, 63) standardized the colonial discourse when he claimed rationality as a Western signature, thus ascribing what he termed mystic or pre-logical thinking to non-Western peoples. These denigrating words refer in particular to the African. Such arguments were used to justify the colonialists’ actions and reasons for invading and conquering the territories of the perceived Dark Continent. With this invasion, the entirety of the lives of the indigenous majority came to depend solely upon the powerful invaders. The fundamental decisions affecting the lives of the indigenes were made by the colonial masters. The colonialists gradually penetrated the socio-economic and political spheres of the state, and finally, the minds of the people. The conquered were made to believe that they were inferior and, as such, that only the ways of the colonialists were worthy to be imbibed.

In his article, “Modern Western Philosophy and African Colonialism”, E. Chukwudi Eze queries the rationality that underlies the thoughts and assumptions that emanate from the European Enlightenment philosophers who promoted the ideals of individual freedom and the dignity of the human person on the one hand, and who, on the other hand, were associated with the thoughts and promotion of slavery and colonialism (Eze 1998, 217). For European Enlightenment philosophers, Africans were not in the same logical set as normal humans. Therefore, their advocacy for the ideals of humanity and democracy did not apply to Africans. This justified their arguments for the promotion of “imperial and colonial subjugation of non-European peoples” (Eze 1998, 218). This suggests that there is a distinction for the enlightenment philosophers between, in Cornel West’s words, “sterling rhetoric and lived reality” or, in Abiola Irele’s, between the “word and deed” (Eze 1998).

Refutations have been made against these assumptions, which suggest that Africa, in the words of Walter Rodney (1972), was developing at its own pace before the advent of colonialism. However, due to the debilitating effects of colonial rule, African scholars and political thinkers were faced with the serious challenges of socio-political and cultural reconstruction of Africa. The colonialists had imposed European beliefs and values on Africa. Thus, European languages, belief systems, and social, economic, and political systems replaced pre-colonial African ones. As a reaction to the effects of colonialism, there was the need to find an alternative ideology for decolonization. The reflective attitude and the thought process in the search for an ideology for decolonization resulted in the abstraction of different philosophical ideas and the development of theories in political philosophy. Consequently, what is known as African social and political philosophy started as a reaction to colonialism. This explains why colonialism is an important theme in African philosophy.

b. Imperialism

Imperialism dates further back in history, as it can be traced to the Roman Empire. Imperialism can be described as an orientation which holds that a country can gain political or economic power over another through imposed sovereignty or more indirect mechanisms of control. Imperialism focuses not only on political dominance but also on conquest and expansion. It is particularly focused on the acquisition of power by a state over another group of people. It is also described as a state policy, practice or advocacy of extending power and dominion, especially by direct territorial acquisition or by gaining political and economic control of other areas. As Michael Parenti describes it, Western European imperialism first took place against other Europeans, such as when Ireland became the first colony of what later became known as the British Empire (Parenti 2011, 11). However, those who have borne the brunt of the European, North American, and Japanese imperial powers have been states in Africa, Asia, and Latin America (Parenti 2011, 13).

An understanding of the basic modus operandi of imperialism suggests that foreign governments can govern a territory without significant settlement, quite unlike colonialism in which settlement is a key feature. Imperialism is merely an exercise of power over the conquered regions without immigration of any form.

In his book Decolonising the Mind, Ngugi wa Thiong’o explicates the nature of imperialism, particularly as it affects the culture and language of the African. Wa Thiong’o asserts that imperialism has absolute effects on the economic, political, military, cultural and psychological wellbeing of the people affected. He describes the effect of imperialism on Africa from two main perspectives. First is the socio-economic and political effect of the imperialist tradition on “consolidated finance capital” (Wa Thiong’o 1986, 2). He maintains that the subjugation of Africa’s economic life is carried out through multinational corporations, and particularly through the way most African countries have been lured into accepting loans from the International Monetary Fund (IMF). Wa Thiong’o’s concern about the IMF is that the economic life of every worker and peasant in the countries that have taken the loans is mortgaged forever. This is because, as such countries continue to service the IMF loans, the organization is entrusted with the power to dictate the direction of the economic policies of those states. The same can be said of the imperialist domination of politics, which ensures that African states rely on Western models of politics, policing, judiciary practice, and education.

Wa Thiong’o’s second perspective on the consequences of imperialism relates to what he calls “effect of cultural bomb” (Wa Thiong’o 1986, 3). According to him, imperialism uses a cultural bomb to isolate people and estrange them from their identity. This is done by annihilating the people’s belief in their heritage, their environment, their names and, above all, their language. Wa Thiong’o asserts that language remains the most essential vehicle through which the human soul can be held captive. The imperialists are fully aware of this and deliberately use “language as a means of spiritual subjugation” (Wa Thiong’o 1986, 9). So in Wa Thiong’o’s submission, this cultural and psychological form of imperialism remains the biggest weapon that undermines the value of the human person and erodes the dignity of the people’s identity. This form of imperialism tends to make people embrace the imperialists’ alien culture, language, and way of life, and become far removed from their indigenous heritage and identity.

The study of the dignity of the African in every form is central to African philosophy. The need to embrace the value of Africa’s cultural heritage that is devoid of any form of imperialist subjugation is essential for the promulgation of African philosophy. It is in the light of this that, as a theme of African philosophy, the study of imperialism remains crucial to understanding its methods and its effects on the socio-economic, political, and cultural life of the African.

c. Decolonization

As a theme in African philosophy, the term decolonization connotes an ideology for true emancipation in post-colonial Africa, that is, emancipation from cultural, economic, political, and psychological forms of colonialism. African philosophers have consistently been concerned with the liberation of the mind, spirit and body, as well as the emancipation of the African from all elements and influences of colonialism. It is as a result of these concerns that the study of decolonization is essential to the project of African philosophy.

The term decolonization can be described as the abolition of colonialism and the enthronement of a people’s or nation’s powers over its own territories. It is typically understood not merely as independence from colonialism but as a total liberation from the influences and powers of imperial neocolonialism. It is a situation in which a new state acts under its own volition, free from the direct control of foreign actors. Decolonization refers to the ability or willingness of the previously colonized nation to become free from imperial rule in order to control its own domestic and international affairs. It is also the mechanism by which, or the ability of, a people to be liberated from the cultural and psychological domination of foreign influences.

In Africa, for instance, many theoretical assumptions informed the need to decolonize. In principle, we can trace the spark for an ideology of decolonization to the rise of communism in the former Soviet Union. The teachings of Marx, Frederick Engels (1820-1895), and Vladimir Lenin (1870-1924) against the exploitation of the masses remain the backdrop of this decolonization. The influence of the teachings of Frantz Fanon on decolonization also cannot be gainsaid. These teachings and influences seem to have convinced the early African political thinkers of the need to decolonize radically and to end the influences of neocolonialism. Some post-independence African thinkers, such as Leopold Sedar Senghor of Senegal, Sekou Toure of Guinea, Julius Nyerere of Tanzania, Obafemi Awolowo of Nigeria, and Kwame Nkrumah of Ghana, among others, faced the serious challenges of socio-economic, political, and cultural reconstruction of the postcolonial African states. They faced the task of liberating Africa from the imposition of neocolonial European values, languages, belief systems, and social, economic, and political systems, which seemed to have replaced the pre-colonial African ones. Consequently, the principle of individualism, believed to have been a European signature, seemed to have replaced the African cultural context of brotherhood, which suggests a welfare system of communalism, collectivism, and egalitarianism; hence the search for an ideology for decolonization (Afisi 2009b, 33).

As noted above, among the writers on decolonization in Africa, Fanon was one of the prominent figures. In fact, his writings are notably extensive on the process and methods of decolonization and of true liberation. Central to Fanon is the idea that only through decolonization can there be true liberation. Fanon’s strong advocacy of decolonization results from his commitment to the preservation of individual human dignity.

Fanon’s works, Black Skin, White Masks (1952) and The Wretched of the Earth (1962), contain radical critiques of French colonization. He views colonialism as the forcible control of another state, with the word ‘force’ as key. Fanon accuses the colonizers of using force to exploit raw materials and labour from colonized countries. To justify their actions, however, the colonial masters proclaimed that the natives were savages and that European culture was the ideal one to adopt.

Fanon claims that the colonial situation is by definition a violent one. He condemns the violence inflicted on the colonized by the colonizer. However, he distinguishes three categories of violence: physical, structural, and psychological. Physical violence is the somatic injury inflicted on human beings, the most radical manifestation of which is the killing of an individual. Structural violence reflects the fact of exploitation and the institutional form it necessarily takes in the colonial situation. Psychological violence is the injury or harm done to the human psyche (Fanon 1952, 39). This third category includes indoctrination of various kinds and threats which tend to diminish the victims’ mental potentialities. As a way out of all of these, Fanon advocates a reprisal use of violence against the settlers to enable the colonized to regain their self-respect. To him, since the colonial situation is itself a violent one, the colonized masses can only achieve liberation through a corresponding form of violence. His submission is that for liberation to be total and genuinely achieved, it has to be accompanied by violence (Fanon 1962, 102).

For Fanon, decolonization requires violence on the part of the colonized. Violence plays a critical role in the decolonization struggle. The colonized must see violence in decolonization as that which leads not to retrogression but to liberation. Fanon sees decolonization as the implementation of the concept of ‘the last shall be the first’. It is a psycho-social process, a historical process that changes the order of the world. Decolonization involves a struggle for the mental elevation of the colonized African people (Fanon 1962, 116). From all of this, Fanon contends that Africa is in need of true liberation, which can only result from decolonization. In his submission, resisting a colonial power using only politics cannot be effective; violence is the best way to attain decolonization.

Arguing from a similar position, Kwasi Wiredu, in his book Conceptual Decolonization in African Philosophy, contends that colonialism has affected not only Africa’s political society but also its mental reasoning. He advocates the need for Africans to go through a process of mental decolonization. Wiredu recommends that decolonization proceed along two conceptual lines: first, ‘avoiding through a critical conceptual self-awareness the unexamined assimilation in our thought… of the conceptual frameworks embedded in foreign philosophical tradition’, and second ‘exploiting as much as judicious the resources of our own indigenous conceptual schemes in our philosophical meditations…’ (Wiredu 1998, 117). For Wiredu, the most important function of post-colonial philosophy is what he refers to as “conceptual decolonization”. This simply implies “divesting African philosophical thinking of all undue influences emanating from our colonial past” (Wiredu 1998). Wiredu sees decolonization as a necessary tool for developing an authentic African philosophy that is devoid of any neo-positivist influences.

Wa Thiong’o believes that decolonization can only take place in Africa when the “cultural bomb” is defused. This process begins when “writing” is done in the various indigenous African languages. Such writings, which would enhance the renaissance of African cultures, must also carry with them the spirit and content of anti-imperialist struggles. This would ultimately help in liberating the mind of the people from foreign control (Wa Thiong’o 1986, 29). For total decolonization to occur, Wa Thiong’o enjoins writers in African languages to form a revolutionary vanguard in the struggle to decolonize the mind of Africans from imperialism.

4. Neocolonialism: The Last Stage of Imperialism

After the independence of most African nations, Africans soon began to notice that their countries were being subjected to a new form of colonialism, waged by their former colonialists and some other developed nations. It is pertinent to mention that even though neocolonialism is a subtle propagation of the socio-economic and sometimes political activities of former colonial overlords in their ex-colonies, documented evidence has shown that a country that was never colonized can also become a neocolonial state. Countries such as Liberia and Ethiopia that never experienced colonialism in the classical sense have become neocolonial states by dint of their reliance on international finance capital, courtesy of their fragile economic structures (Attah 2013, 71). On this basis, neocolonialism can be said to be a new form of colonial exploitation and control of the newly independent states of Africa, and of other African states with fragile economies.

Nkrumah views neocolonialism as a new form of subjugation of the economic, social, cultural, and political life of the African. His postulation is that European imperialism in Africa has passed through several stages, from slavery to colonization and subsequently to neocolonialism, the last stage of the imperialist process of subjugation and exploitation. Nkrumah’s (1965) classic, Neo-colonialism: The Last Stage of Imperialism, is an analysis of neocolonialism in relation to imperialism. The book emphasizes the need to recognize that colonialism had yet to be abolished in Africa. Rather, it had evolved into what he calls neocolonialism. Nkrumah reveals the methods that the West used in its shift in tactics from colonialism to neocolonialism. In his words: “without a qualm it dispenses with its flags, and claims that it is ‘giving’ independence to its former subjects, to be followed by ‘aid’ for their development. Under cover of such phrases however, it devises innumerable ways to accomplish objectives formerly achieved by naked colonialism” (Nkrumah 1965). This explains the condition under which a nation remains bound by the fetters of neocolonialism: it is independent in theory and has the outward trappings of international sovereignty, yet it is actually directed politically and economically from the outside.

Nkrumah contends that neocolonialism is usually exercised through economic or monetary means. In a neocolonial state, the imperialist power gains control over the state by contributing to the cost of running it, by promoting civil servants into positions that allow them to dictate and wield power, and by controlling foreign exchange through the imposition of a banking system that favors the imperial system.

Nkrumah further explains that neocolonialism results in the exploitation of different sectors of the nation, using different forms and methods: “[t]he result of neocolonialism is that foreign capital is used for the exploitation rather than for the development of the less developed parts of the world. Investment under neocolonialism increases rather than decreases the gap between the rich and the poor countries of the world” (Nkrumah 1965).

On the link between neocolonialism and imperialism, Nkrumah writes that neocolonialism is the worst and most heightened form of imperialism. For those who practice it, it ensures power without responsibility; for those who suffer it, it means unchecked exploitation. He explains that neocolonialist exploitation is implemented in the political, religious, ideological, economic, and cultural spheres of society. He further provides details of the infiltration and manipulation of organized labour by agencies of the West in African countries. He describes how the mass media is used as an instrument of neocolonialism: “[w]hile Hollywood takes care of fiction, the enormous monopoly press, together with the outflow of slick, clever, expensive magazines attends to what it chooses to call ‘news’” (Nkrumah 1965). Religion too, according to Nkrumah, is distorted and used to support the cause of neocolonialism.

Nkrumah’s projection, however, is that as dangerous as neocolonialism is to the future of Africa, it will eventually, like colonialism, be defeated by the unity of all those who are being oppressed and exploited. He prescribes unity and awareness amongst all Africans.

Buttressing the above submission, Noah Echa Attah, in his paper “The historical conjuncture of neo-colonialism and underdevelopment in Nigeria” (2013), traces the root of underdevelopment in Africa, particularly in Nigeria, to the effects of neocolonialism. In his assertion, African countries have never been truly independent since colonial rule ended, because the idea of partnering with the ex-colonialists has continued to guide state economic policies. Foreign firms have continued to dominate the business sectors of the economy, such that relatively few but large and integrated foreign firms, otherwise called multinational corporations, have made themselves indispensable to the growth or otherwise of the economy. Local industries in Africa are extensions of metropolitan firms, such that the raw materials the industries need have a very high import content of over 90% from the capitalist economies (Attah 2013, 76). Thus, the continued dependence of industrial investments in Africa on capital-intensive technology is strictly aimed at further developing the metropolitan economies.

Attah explicates how Western neocolonialists have collaborated with local bourgeoisie in Africa to perpetuate the exploitation of the people and state economies in Africa. According to him, most of the local bourgeoisie collaborators are not committed to national interest and development, and their aim is to ensure the continued reproduction of foreign domination of the African economic space. The local bourgeoisie are bereft of ideas capable of engendering growth and development. The objective of foreign capital, therefore, is to continue to co-opt the weak and nascent local bourgeoisie into its operations. “The co-optation of the local bourgeoisie into the network of foreign capital condemned the former to the position of ‘comprador’” (Attah 2013, 77).

Drawing on the above exposition, Attah also asserts that neocolonialism is a new form of imperial rule characterized by the domination of foreign capital. His claim is that instead of real independence, what Africa has is pseudo-independence with the trappings of an illusory freedom. To him, neocolonialism in Africa is made possible by the roles and actions of the local bourgeoisie in collusion with foreign capital. He is concerned that the different African economies have become willing tools in the hands of the West because of their fragility. In many cases, the African states have inadvertently authorized the dependency of African economies on foreign capital, which gives neocolonialism the legitimacy it needs. Neocolonialism, Attah submits, leads to underdevelopment, since the local bourgeoisie and foreign capital are interested in the economy for personal accumulation rather than for the national development of the neocolonial state.

5. The Myth of Neocolonialism

Thomas Molnar (1965), in his paper “Neocolonialism in Africa?”, asserts that African nations continue to depend on the Western industrial nations for economic aid, loans, investment, markets, and other technical assistance because they require this dependency for their development. He acknowledges that the colonial regime left Africa in destitution, not only materially but also in terms of education and technical training. Molnar also affirms that nobody will deny that the colonialist period sanctioned abuses and exploitation in Africa (Molnar 1965, 177).

However, in spite of the end of colonialism in Africa, Molnar is concerned that African economies have not functioned properly independent of foreign aid and investment. He claims that the economic presence of the West is imperative for the future progress of Africa’s socio-economic and political stability. The assumption is that only fruitful economic arrangements with western industrialized countries may guarantee Africa’s future (Molnar 1965, 182).

Further, Molnar asserts that decolonization in Africa took place in a hurried, haphazard way. Africa was unready and immature for economic and political independence at the time it achieved it. As a result, the West is under obligation in post-colonial Africa to keep up its aid, not as a tribute paid for the past colonial situation but as one half of a two-way process of cooperation (Molnar 1965, 183). For its development, Africa needs the West. Since the newly independent African countries will continue to be economically dependent on the West, neocolonialism is not a negative term. In fact, “‘neo-colonialism’ is the only way of getting Africa to the take-off stage” (Molnar 1965, 183).

In a reaction to Molnar’s glorification of neocolonialism in Africa, Tunde Obadina’s article, “The Myth of Neocolonialism” (2000), a critical analysis of the colonial situation in Africa and of the myth surrounding indigenous growth and development in post-colonial Africa, takes issue with the ‘apologist’ claim about the positive influences of colonialism. According to Obadina, these apologists contend that despite the exploitation of resources perpetrated by the colonialists, their overall influence on African society, in terms of reducing the economic gap between Africa and the West, is positive. The argument here is that colonialism improved the living conditions of Africans, providing necessary tools for civilization such as formal education, modern medicine, and enlightenment, and shaping political organization. However, Obadina notes that in spite of these apologetic claims, Africa is still today considered a continent in economic and political crisis. The apologists, for their part, are usually quick to point out that the failure of Africa’s development is due to the conscious or unconscious refusal to adopt the legacies of the colonialists. They even go to the extent of saying that Africa is in such a state because it gained independence a lot sooner than it should have.

Citing D. K. Fieldhouse as one of the apologists, Obadina mentions Fieldhouse’s view that it would be difficult to imagine what would have become of African countries had colonial rule not come. Fieldhouse had contended that pre-colonial Africa by itself lacked the capacity and the social and economic organization to transform itself into modern states with advanced economies. According to Fieldhouse, African states today would be a direct replica of what they were in the primitive days if they had not encountered European culture and civilization.

Fieldhouse’s Eurocentric orientations result from the contention that Africans are by nature irrational, incompetent, and unable to produce anything useful. These orientations have gone to great extents to undermine Africa’s indigenous culture, tradition, religion and even philosophy. Even Gene Blocker opines that ‘the more philosophical African philosophy becomes, the less African it is in content and the more African it becomes, the less philosophical it is in content’ (Blocker 1987, 2).

In contrast to these apologetic arguments, Obadina points out that many nationalist African scholars have criticized the apologists’ positions on the basis that Africa would definitely have developed in a unique way, different from the European system. Obadina asserts that colonialism bore nothing but negative effects on Africa. According to him, Africans lived more ‘enriched’ lives before colonialism, and would have continued in that manner had they not been colonized. Colonial rule has left the continent in a more dilapidated state; it has compromised the nations’ capabilities to develop. Obadina cites Walter Rodney’s assessments of how colonialism only succeeded in making Africa underdeveloped and, worse still, dependent on Western nations. The colonial juxtaposing of people from different cultures, ignoring the already established borders and redrawing Africa’s map, created and still produces today various degrees of ethnic conflict. Furthermore, colonialism undermined pre-colonial political systems which used to be effective for Africans and imposed foreign political concepts, including multi-party democracy. This, according to Rodney and many other critics, has left Africa in serious social and political crises. Obadina takes Nigeria as an example: the very qualities it had, its great population and natural resources, seem to be leading eventually to its destruction. Party politics introduced by the colonialists was, according to Obadina, the major cause of ethnic conflicts in Africa.

Obadina acknowledges the difficulty in providing an objective analysis of the impact of colonialism in Africa. Despite this, he avers that colonialism in Africa may have some positives. However, what cannot be denied is the fact that it was something imposed, which had no regard for the existing structures already in place. Furthermore, colonial rule was not an idea geared towards the development of the colonized states in any way, but something established solely for the benefit of the colonial states.

Furthermore, Obadina forthrightly asserts that African nations are to be blamed for their continued reliance on their former colonial lords for economic and political direction. This neocolonial situation poses serious danger to the evolution of indigenous-based economic growth and, at the same time, has adverse effects on political stability. It has, according to him, hampered the growth of movements geared towards change. He believes that African nations, after independence, should have shut the door against imports and exports from the West and sought to develop themselves using their own resources, without depending on foreign corporations. This would, he says, have improved Africa’s infrastructure and its economic and political growth. In Obadina’s view, if African nations had pursued this independent economic agenda, they would have survived; Cuba, for example, did so and survived. Obadina opines that the traditional agenda that came with colonialism was the false ‘idea of progress’. With this idea as the fundamental gospel, Africans were made to believe that their living conditions could be positively altered. This, among other things, smoothed the colonialists’ way into the continent, since people have, after all, always desired material improvement. It created in Africans the desire for Western civilization; but the West failed to hand over to Africans the tools for realizing such civilization.

Africa in the early 21st century is a neocolonial continent, according to Obadina. Africa continues to face the problem of dealing with the overbearing presence of Western civilization. In the quest for modernization, the focus is mostly on the Western world, with little or no focus on the urgent need for internal changes. Although colonial rule in Africa ended only near the end of the twentieth century, Obadina submits that African nations at the beginning of the 21st century have the responsibility to develop themselves by making changes in their internal structures using indigenous knowledge, while at the same time learning all they can from the influence of the Western world and putting this to use for their own benefit.

In all of this, African philosophers have continued to interrogate the idea of neocolonialism and its effects on Africa’s development. The outcomes of such interrogations continue to form content that needs to be taught and studied within the project of African philosophy.

6. Neocolonialism Today in Africa: The Era of Globalization

The heavy dependence on foreign aid and the apparent activities of the multinational corporations in Africa reveal that Africa at the beginning of the 21st century is still in a neocolonial stage of development. The activities of the corporations in Africa, particularly those from Europe and America, reveal nothing short of economic exploitation and cultural domination. Early 21st century Africa is witnessing neocolonialism on different fronts, from the influence of transnational corporations from Europe and America to a new imperial China, to which many African governments now seem beholden. The multinational corporations, and more recently Chinese interests in Africa pursued through Chinese companies, appear to exist mainly for the benefit of the home economies of the neocolonialists rather than to infuse local African economies with cash to stimulate growth and increase local capacity.

In the Africa of the early 21st century, some scholars, such as Ali Mazrui, have opined that the new form of neocolonialism is globalization. Much as neocolonialism has been variously described, Mazrui describes how globalization “allows itself to be a handmaiden to ruthless capitalism, increases the danger of warfare by remote control, deepens the divide between the haves and have-nots, and accelerates damage to our environment” (Mazrui 2002, 59). This negative perspective on globalization, particularly as it relates to extreme capitalism, essentially corroborates the assertion by Michael Maduagwu that “globalization is only the latest stage of European economic and cultural domination of the rest of the world which started with colonialism, went through imperialism and has now arrived at the globalization stage” (Maduagwu 1999, 65).

Looking at globalization in this way, Oseni Afisi also consigns it to the corridor of neocolonialism and cultural subjugation. Globalization becomes the imposition of a particular culture and value system upon other nations with the direct intent of exploitation. What this indicates is that globalization is indeed the engine room for the propagation of neocolonialism and new imperialism on African soil. While colonialism has ended, the reality on the ground in Africa in the years immediately after it is that political independence in many African states has not culminated in the much desired economic and cultural freedom (Afisi 2011, 5).

Afisi further opines that the most significant negative impact of globalization in Africa is the erosion of Africa’s cultural heritage. On the erosion of this heritage hinges the political, economic, social, and educational subjugation of the continent. The forcible integration of Africa into globalization through slavery and colonialism has created a problem of personal identity and a cultural dilemma for the African. Africa has had to be dependent upon Europe and America and, more recently, upon China for its development and, one might add, for the development of its identity and culture.

By contrast, Olufemi Taiwo, in Africa Must Be Modern: A Manifesto, takes a radical position intended to bear on how Africa weighs the inherent benefits of globalization. Taiwo uncompromisingly defends globalization and suggests that its benefits must be harnessed by Africans. He berates the level of hostility that Africa has shown towards modernity, noting the regrettable impact of such hostility on the economic, social, and political development of the continent (Taiwo 2014). To Taiwo, Africa must address the challenges of modernity and globalization by embracing them instead of being hostile to them. As he further posits, Africa needs to fully engage with and derive benefits from globalization and its attendant capitalist democracy (Taiwo 2014).

In a similar vein, D. A. Masolo, in African Philosophy in Search of Identity, remarks that the needs and experiences of Africans today are conditioned by their peculiar cultural circumstances. The nature of these cultural circumstances is that the African of the post-colonial period has embraced modernity, science and technology as part of his or her culture. These Africans understand the world around them by being open minded and ridding themselves of any traction that may mold their thinking. This is the nature of African identity after colonialism, an identity which will aid in the intellectual construction of a modern African philosophy (Masolo 1994, 251).

7. Conclusion

As a theme of African philosophy, the term neocolonialism became widespread in use, particularly in reference to Africa, as soon as the process of decolonization began there. Its widespread use began when Africans realized that even after independence their countries were still being subjected to a new form of colonialism. The challenges that neocolonialism poses to Africa relate to the socio-economic, cultural, and political development of the people and states of the continent. Neocolonialism has, however, been credited with both positive and negative impacts on the continent.

On the whole, this article is an exposition of the theme of neocolonialism within the project of African philosophy. The introduction takes a cursory look at the term “neocolonialism” with a view to clarifying its basic concept. The history of neocolonialism is a historical analysis of the beginning of neocolonialism in Africa. This analysis reveals how the idea of neocolonialism was nurtured before independence was granted to most African states. The term neocolonialism is closely related to several other concepts, which is why the terms colonialism, imperialism, decolonization, and globalization are essential to a better understanding of neocolonialism. The article also sets out discussions of the negatives and some positives of colonialism and its offshoot, neocolonialism, in Africa. These suggest that neocolonial elements may continue to be an integral part of Africa’s socio-economic, cultural, and political existence. However, some of the social and political philosophical questions which may continue to preoccupy the minds of Africans include: Can neocolonialism be abolished from Africa? Can the positives of neocolonialism outweigh the negatives, or vice versa, in terms of the impacts on the African economy? Will Africa ever be truly decolonized?

8. References and Further Reading

  • Abraham, William. The Mind of Africa. Chicago: University of Chicago Press, 1962.
    • A discourse on neocolonialism and integrative form of culture in Africa.
  • Afisi, Oseni Taiwo. “Tracing Contemporary Africa’s Conflict Situation to Colonialism: A Breakdown of Communication among Natives”. Philosophical Papers and Review Vol.1 (4): 59-66 (2009a).
    • A historico-philosophical analysis of colonialism as the root of tribal conflicts in Africa.
  • Afisi, Oseni Taiwo. “Human Nature in Marxism-Leninism and African Socialism”, Thought and Practice: A Journal of the Philosophical Association of Kenya, New Series. Vol. 1(2): 25-40 (2009b).
    • A comparative analysis of the nature of man in the philosophical ideologies of Marxism and African philosophy.
  • Afisi, Oseni Taiwo. “Globalization and Value System”. LUMINA. Vol. 22.(2): 1-12 (2011).
    • A discourse on the nature of globalization: its negatives and benefits on Africa’s value system.
  • Attah, Noah Echa. “The historical conjuncture of neo-colonialism and underdevelopment in Nigeria”. Journal of African Studies and Development. Vol.5 (5): 70-79 (2013).
    • A historical analysis of the effect of colonialism in Africa.
  • Blocker, Gene. “African Philosophy”. African Philosophical Inquiry. Vol.1(2): 1-12 (1987).
    • A critical discussion on the idea and content of African Philosophy.
  • Lévy-Bruhl, Lucien. How Natives Think. Princeton, N.J.: Princeton University Press, 1985.
    • A discussion on the distinction between the mindset of the primitive and the mindset of the civilized human beings.
  • Eze, E. Chukwudi. “Modern Western Philosophy and African Colonialism”. E. Chukwudi Eze (ed) African Philosophy: An Anthology. Massachusetts: Blackwell Publishers Ltd, 1998.
    • A discourse on the contradictory nature of the European Enlightenment period and its simultaneous promotion of slavery and colonialism.
  • Fanon, Frantz. Black Skin, White Masks. London: MacGibbon, 1952.
    • An analysis of the psychology of the colonial situation.
  • Fanon, Frantz. The Wretched of the Earth. New York: Grove Press, 1962.
    • A critical analysis of colonialism, cultural and political decolonization.
  • Goldsborough, J. “Dateline Paris: Africa’s Policeman”. Foreign Policy 33. (1979).
    • An analysis of the French African economic policy.
  • Hegel, G.W.F. Lectures on the Philosophy of World History. Trans. H. B. Nisbet. Cambridge: Cambridge University Press, 1975.
    • Hegel’s philosophical analysis of world history.
  • Horvath, Ronald J. “A Definition of Colonialism”. Current Anthropology 13 (1): 45-57 (1972).
    • A general discussion on definition and classification of colonialism.
  • Kant, Immanuel. Anthropology from a Pragmatic Point of View. Robert B. Louden, ed. Introduction by Manfred Kuehn. Cambridge: Cambridge University Press, 2006.
    • A collection of Kant’s lectures on Anthropology and the nature of man.
  • Lenin, Vladimir. Imperialism: The Highest Stage of Capitalism. Moscow: Progress Publishers, 1916.
    • An exposition of the nature and the process of the imperialist’s financial capital in generating greater profits from their colonies.
  • Maduagwu, O. Michael. “Globalization and its challenges to National Cultures and Values: A Perspective from Sub-Saharan Africa” being a paper presented at the International Roundtable on the challenges of Globalization, University of Munich, 18-19 March. (1999).
    • A presentation of globalization as a tool of European economic and cultural domination of the rest of the world.
  • Martin, Guy. “The Historical, Economic and Political Bases of France’s African Policy”. The Journal of Modern African Studies 23 (2): 189-208 (1985).
    • An analysis of France’s continued influence and power on its former African colonies.
  • Marx, Karl. Capital: The Process of Capitalist Production. Trans. Fowkes. Knopf Doubleday. 1972.
    • A critical analysis of the economic law of capitalist mode of production.
  • Marx, Karl. Economy, Class and Social Revolution. London: Nelson Publishers, 1977.
    • Marx’s selected writings on capitalism and the process of social revolution.
  • Masolo, D.A. African Philosophy in Search of Identity. Bloomington: Indiana University Press, 1994.
    • A discussion on the debate of what constitutes the study of African philosophy.
  • Mazrui, Ali A. “Nkrumanizm and the Triple Heritage in the Shadow of Globalisation” being a paper presented at the Aggrey-Fraser-Guggisberg Memorial Lectures, University of Ghana, Legon, Accra, (2002).
    • A presentation of the effects of globalization on Africa.
  • Mbembe, Achille. On the Postcolony. Berkeley: University of California Press, 2001.
    • A discourse on the nature of neocolonialism and its negative impact in Africa.
  • Molnar, Thomas. “Neocolonialism in Africa?” Modern Age. Spring (1965).
    • An analysis of the nature of the neocolonial economy in Africa and its positive impacts.
  • Mudimbe, V.Y. The Invention of Africa: Gnosis, Philosophy and the Order of Knowledge. Bloomington and Indianapolis: Indiana University Press, 1988.
    • A discourse on the interplay of Western colonialism in Africa, and its denial of the existence of “the other” in European consciousness.
  • Mudimbe, V.Y. The Idea of Africa. Bloomington: Indiana University Press, 1994.
    • A discourse on the distinct cultural values that constitute the African reality.
  • Ngugi, wa Thiong’o. Decolonising the Mind: The Politics of Language in African Literature. London: James Currey, 1986.
    • A discourse on cultural imperialism and on written African languages as a vehicle for Africa’s decolonization.
  • Nkrumah, Kwame. Neo-Colonialism: The Highest Stage of Imperialism. London: Heinemann, 1965.
    • An analysis of the nature of neocolonial economy and its relationship with Imperialism.
  • Nwolize, OBC. “The Fate of Women, Children and the Aged in Contemporary Africa’s Conflict Theatres”, Paper delivered at the Public Annual lecture of the National Association of Political Science Students, University of Ibadan, (2001).
    • A presentation of the effects of colonialism and conflict situations in contemporary Africa.
  • Obadina, Tunde. “The myth of Neo-colonialism” in Africa Economic Analysis, 2000.
    • A critical analysis of colonialism and neocolonialism in Africa.
  • Parenti, Michael. The Face of Imperialism. New York: Paradigm Publishers, 2011.
    • An exposition of the role of multinational corporations in the imperialist conquests.
  • Sartre, Jean-Paul. Colonialism and Neocolonialism, translated by Steve Brewer, Azzedine Haddour, Terry McWilliams; Paris: Routledge, 1964.
    • A critical analysis of French colonial policies on Africa, especially in Algeria.
  • Serequeberhan, Tsenay. “Philosophy and Post-Colonial Africa” in E. Chukwudi Eze (ed) African Philosophy: An Anthology. Massachusetts: Blackwell Publishers Ltd, 1998.
    • A discussion on the nature of neocolonialism in Africa.
  • Taiwo, Olufemi. Africa Must Be Modern: A Manifesto. Indianapolis: Indiana University Press, 2014.
    • A discourse on globalization and modernity in Africa.
  • Young, Robert. Postcolonialism: An Historical Introduction. Oxford: Blackwell, 2001.
    • A discussion on the historical and theoretical origins of postcolonial theory.
  • Wiredu, Kwasi. Conceptual Decolonization in African Philosophy: 4 Essays. Ibadan: Hope Publications, 1998.
    • A discourse on decolonization and the development of an authentic African philosophy.

 

Author Information

Oseni Taiwo Afisi
Email: oseni.afisi@lasu.edu.ng
Lagos State University
Nigeria

Artistic Medium

Artistic medium is an art critical concept that first arose in 18th century European discourse about art. Medium analysis has historically attempted to identify that out of which works of art and, more generally, art forms are created, in order to better articulate norms or standards by which works of art and art forms can be evaluated. Since the 19th century, medium analysis has emerged in two different forms of critical and theoretical discourse about art. Within traditional art forms, such as painting and sculpture, modernist artists and critics began to interrogate art forms and the history of their possibilities in order to discover the necessary conditions for instances of those forms. This modernist interest in medium aimed to strip away unnecessary traditional artistic conventions in order to identify that which is essential to the form. Within newly emergent forms of popular art, such as movies, comics, and video games, artists and critics have attempted to articulate the ways the norms of these forms of popular art arise from new material and technological modes of creating and interacting with reproducible images.

The possibilities for an art form, whether traditional or newly emergent, can only be discovered by artists in acts of artistic creation. For this reason, the relation between art forms and their media develops and changes as the art forms continue to be discovered and reimagined by artists.

Table of Contents

  1. Introduction
  2. The Challenge of Medium Skepticism
    1. Carroll’s Medium Skepticism
    2. The Need for the Concept of Artistic Medium
  3. Theorizing About Art Forms Before the Emergence of the Concept of Artistic Medium
    1. Aristotle
    2. Aristotle and Horace as Models for Theorizing Art
    3. Music
  4. Gotthold Lessing and the Problem of Art in the 18th Century
    1. Art in the 18th Century
    2. Lessing on Painting and Poetry
    3. Herder and Hegel
  5. The Invention of Photography and the Discovery of Its Artistic Possibilities
    1. The Etymology of the Term “Artistic Medium”
    2. The Challenge of Photography
    3. Accounting for Photography’s Artistic Possibilities
  6. Modernism as the Discovery of Medium
    1. The Emergence of Modernism
    2. Modernism and 20th Century Music
    3. Fried on the Value of Modernism
    4. Postmodernism
  7. New Forms of Popular Art in the 20th Century
    1. Movies
    2. Comics
    3. Video Games
  8. Conclusion
  9. References and Further Reading

1. Introduction

Artistic medium is a term that is used by artists and art critics to refer to that out of which a work of art or, more generally, a particular art form, is made. There are, generally speaking, two related ways of using artistic medium in critical or artistic discourse. On the one hand, we often talk about an artistic medium by reference to the material out of which a work of art is made. Works of art in museums or galleries will often have the medium listed along with the title and the artist’s name on the display card. A painting might have “oil on canvas” or “watercolor” listed along with the artist’s name and the work’s title; a sculpture might have “marble,” “steel,” or “papier-mâché” listed in the same way. On the other hand, we also talk about medium to refer to the way a work of art organizes its audience’s experience in space and time. An actor might talk about the differences in performing on television and on film as performing in two different artistic media. Or a critic might describe television as a “writer’s medium” and movies as a “director’s medium.” Sometimes there may be no interesting differences regarding the material out of which the work was made; for this way of using medium, the crucial differences have to do with the spatiotemporal organization of the audience’s experience of the work of art.

Much of the critical and theoretical interest in the concept of artistic medium stems from a belief that analyzing the material conditions that underlie a particular art form allows us to articulate its norms and standards. Often critics and theorists who make use of the concept of artistic medium do so in order to connect an analysis of an art form’s material basis and conditions with some claim about what artistic norms or standards are proper to the art form. Because the connection between a description of a medium, an art form’s material basis, and the artistic experiences appropriate to that medium is a matter of some controversy, clarification of the philosophical insights and confusions associated with the concept of artistic medium must start not by arriving at its comprehensive definition, but rather by noting the characteristic forms of reasoning in which the concept is used.

There have been two relatively distinct forms of discourse involving artistic medium: a modernist discourse, and one associated with newly emergent popular art forms such as movies and comics. The uses of artistic medium in these discursive traditions have shared important similarities, especially a reliance on the concept to identify what is distinctive about a particular art form and an interest in grounding the norms governing a particular art form in the form’s material basis. But there are important differences as well. Modernist uses of the concept appeal to artistic medium as a way of justifying avant-garde approaches to traditional art forms by making clear how contemporary experimental instances of a form are genuine instances of that form because they inherit the tradition in question by purifying it of all that is inessential and accidental. Proponents of newly emergent popular art forms, on the other hand, are interested in articulating what is unique about the new forms in order to locate their possibilities in distinction from traditional or older forms and to demonstrate how their best instances are works of art.

In recent years, some analytic philosophers of art have suggested that the concept of artistic medium is necessarily a confused one and should be abandoned in favor of other art-critical concepts such as style or genre. For example, Noël Carroll, in his theorization of film in the 1980s and 90s, suggested abandoning medium as a critically inert and confused category. More recently, Carroll has found critical uses for the concept of artistic medium, especially in the analysis of exemplary instances of avant-garde film. Though Carroll does now recognize legitimate applications of the concept of artistic medium in film criticism and theory, it is nonetheless worthwhile to take seriously his initial radical skeptical challenge to critical and theoretical uses of the concept of artistic medium. Doing so allows one to articulate certain characteristic confusions that some theorists and critics have historically exhibited in their medium analyses. But, equally, it allows for the opportunity to clarify what, historically, has characterized the richest and most insightful critical and theoretical uses of artistic medium. As we shall see, these kinds of confusions are apt to arise when the theorist or critic does not remember that artistic medium is an art critical concept. As an art critical concept, what a medium for an art form is can only be known through artists discovering its possibilities in the creation of works within the form.

In general, confusions arise in using artistic medium when theorists and critics do not treat the concept as a critical one, but instead picture a medium as something that could be identified prior to and independently of any particular artistic uses to which it is put. In Art as Experience (1934), John Dewey attempts to combat this possibility for confusion by distinguishing between an artistic medium and raw material. When we identify some collection of matter prior to and independent of any particular artistic context, then we have identified some raw material, which may, it is true, be put to use for various artistic ends. But we cannot specify what artistic possibilities are available to artists by identifying and analyzing that material. Rather, when a given vehicle is taken up and explored within a particular artistic problematic or tradition, artists discover it as an artistic medium. It is thus through the work of artists that the artistic possibilities of an artistic medium can be discovered, and not by analyzing the material in isolation. In this sense artistic medium essentially is a critical concept. What is possible within a particular medium is discovered by artists as they attempt to explore a particular artistic problematic or inherit a particular artistic tradition. For this reason, what the medium of an art form is, as Theodor Adorno insists in Philosophy of New Music (1948), is a historical question. There is no fixed, ahistorical answer to the question, “What are the material conditions for painting, or music, or any particular art form?”

In order to clarify the nature of the concept artistic medium, this article takes two different, although closely related, lines of approach. This article will first clarify the roles artistic medium can rightfully play within critical and theoretical discourses by responding to the challenge of medium skepticism, which takes the concept to be necessarily confused. Then, it will outline the history of artistic medium’s emergence by describing the forms of critical reasoning in which the concept has been characteristically used. In so doing, the article will articulate why the concept has been so important for the development of new forms of popular art and for avant-garde and modernist experimentation, and why the concept has been vulnerable to characteristic confusions.

The first section of this article will engage with the challenge of medium skepticism. Medium skepticism, a position recently prominent in the philosophy of art, holds that artistic medium gives rise to a set of characteristic confusions because the concept is both essentializing and one that grounds its reasoning in a priori reflection upon the nature of the material basis of an art form. As we shall see, those two theoretical temptations are not inherent in the concept but are dangers only given a certain picture of how we determine what the medium is.

Then, employing Adorno’s thought that our understanding of what a medium is must be located in the history of the development of its art form, the article describes the emergence of the concept of artistic medium and the history of its critical and theoretical uses in the development of modern arts. First, there is a brief account of how philosophers and critics in the ancient world and the European tradition theorized artistic possibilities relative to a given art form prior to the emergence of artistic medium as a critical and theoretical category: namely, by identifying an art form and its norms and standards by specifying its proper experience. Then, the two sites of emergence for the concept of artistic medium are described: first, in the 18th century, in the critical work of Gotthold Lessing and, most importantly, his reflections on the differences between painting and poetry; second, most decisively, in the 19th century, in response to the invention of photography, its potential as a new art form, and its relation to painting. This complex historical field within which the concept of artistic medium emerged allows us to locate the centrality of the concept in 20th-century artistic discourses and also the philosophical confusions associated with it.

2. The Challenge of Medium Skepticism

a. Carroll’s Medium Skepticism

In the late 1980s and early 1990s, Noël Carroll issued what we may call the challenge of medium skepticism; he argued that medium analysis of film is necessarily confused and that, although film’s medium can be identified, it has no artistically normative implications. Carroll’s interest in the concept of artistic medium originated from within film theory, but his claims about medium analysis being an essentializing discourse, and his recommendations that theorists and critics abandon the concept of artistic medium in favor of other art theoretic concepts like genre and style, apply to the use of medium as an art theoretic concept in general. More recently, Carroll has moved away from his medium skepticism and has acknowledged uses for artistic medium, especially in describing and evaluating avant-garde or structural film. Nonetheless, evaluating the challenge of medium skepticism is valuable in order to clarify how critics and theorists use artistic medium in characteristically confused ways and how the concept can be used in ways that avoid those confusions. In responding to the challenge of medium skepticism, we articulate the value of artistic medium.

At the root of Carroll’s concern is his contention that medium analysis ultimately depends, either implicitly or explicitly, on an illicit judgment from the nature of material conditions underlying the work of art to a set of norms and prescriptions meant to govern an art form. Carroll contends that medium analysis must slip into a theory of medium specificity in which each art form has a single medium and each medium has a distinctive feature that does and should characterize artistic creation within the form: the medium’s distinctive feature or power provides the aim the art form and its practitioners should pursue. Carroll’s rejection of medium specificity, which he sees as the inescapable heart of medium analysis, consists of two related objections: first, that medium analysis necessarily essentializes by identifying an art form with a single medium and a medium with a unique characteristic; and second, that medium analysis is structured around a priori reflection on the nature of the medium, yielding normative prescriptions about which artistic experiences are appropriate that in fact merely reflect one’s theoretical biases or idiosyncratic tastes.

b. The Need for the Concept of Artistic Medium

It is true that much medium analysis essentializes, but critical or theoretical discourse involving medium is not necessarily essentializing. It may seem that the modernist tradition, for example, supports Carroll’s contention that use of artistic medium is necessarily essentializing, since exploration of an art form by means of its media often meant stripping away all that could be removed in order to discover what was essential to the art form. However, the modernist question of what is essential to the art form is itself a particular, historically-located question about artistic medium, and whatever answers modernist artists generated need not be taken as definitive of the timeless and unchanging essence of some particular art form. In fact, rejection of essentializing claims is revealed as necessary for sound medium analysis, since there is no independent grasp on what counts as an artistic medium outside of the context provided by a particular artistic problem or concern. This can be somewhat obscured for us because of the importance of modernism for our understanding of artistic medium as a concept; it is characteristic of modernist artists to take an art form itself as a question or problem to be explored. The modernist question of what, for example, constitutes the conditions of painting is part of an artistic project that takes as its starting point the history of painting and looks to inherit that tradition by stripping away all that is inessential to painting. But modernist artists and critics identify and explore shape or surface or color as they arise as problems or conditions for painting at a particular historical moment, not because of some timeless understanding they have of the nature of the media as such.

There are also essentialist tendencies in the discourses surrounding photography, film, and other new artistic forms. These essentialist claims often arise because the medium analysis starts from the problem of the new technological bases for these art forms. In this tradition of medium analysis, theorists and critics start with reflection on the nature of, say, photography as a new technological, productive process and draw aesthetic prescriptions or standards from the ontological structure of photographic experiences. Rudolf Arnheim, a film theorist prominent in the 1930s and after, offers a perspicuous example of a commitment to medium specificity in his theorization of film. Arnheim identifies the characteristic differences between a black and white photographic moving image and reality, and prescribes their accentuation as the basis for film art. Arnheim’s theoretical commitments to the purity of film as a medium led him to reject the development of color film and sound film as detracting from the artistic possibilities of silent black and white movies. As an example of medium analysis gone wrong, Arnheim’s restrictive prescriptions exemplify the reasons for Carroll’s rejection of medium specificity theories, for Arnheim’s commitment to the purity of silent film and rejection of the possibilities of sound movies reflects his own theoretical views about the unique characteristics of film itself, but voiced as an essentialist understanding of film. Arnheim’s critical blindness to the possibilities created with sound and color stems from his a priori commitment to an understanding of the nature of the photographic image.

But if Arnheim’s use of the concept artistic medium is subject to the confusions of medium specificity that Carroll warns of, other paradigmatic instances of medium analysis for emergent popular art forms do not fall prey to them. The critic and philosopher Walter Benjamin, in his “Little History of Photography” (1931), for example, articulates unique characteristics of photography that make possible artistic expression. However, Benjamin does not assume in advance of his critical investigation that he knows what art can be or everything that photography can do. Rather, Benjamin is committed to the view that photography’s invention changed what we could do, how we could see, and how we relate to our world and, at the same time, changed what can count as art and artistic experiences. Benjamin starts not with an essentializing, a priori analysis of the nature of photography in itself and for any possible use, but with a critically honed appreciation for the photographer Eugène Atget’s work and how that work discovers and explores certain characteristic possibilities for photography. Benjamin aims to identify the artistic possibilities within emergent artistic practices, what he identifies in his essay “The Work of Art in the Age of Its Technological Reproducibility” as art’s developmental tendencies. He does not attempt to offer an analysis of the medium independent of its emerging artistic uses and prescribe what those uses should be. Instead, Benjamin identifies unique characteristics of the new medium; that is, he describes new things that we can do with the new technology, so that he can articulate terms by which future artistic expressions can be understood. This is a critical judgment, to be evaluated in light of past and future instances of photography and photographic arts.

Carroll, then, worries about certain forms of theoretical confusion and critical blindness that can arise around commitments to medium specificity. Sometimes critics and theorists concerned with emergent popular art forms, like Arnheim, do succumb to the temptation towards medium specificity and prescribe some range of appropriate artistic experiences based on a supposed a priori, essentialist understanding of the nature of the technological basis for the art form. But the best critics and theorists engaged in medium analysis are not attempting to prescribe to artists which experiences are proper to the medium; rather, they are critically evaluating works of art in order to offer an analysis of why the art works on its audience in the ways that it does, and to articulate new possibilities that change what art can be.

Within the modernist tradition of medium analysis, most artists, critics, and theorists do not begin with an a priori, essentialist analysis of the medium in order to identify how it should be used in particular art forms. Instead, it is characteristic of the modernist tradition that the art form itself is taken up by artists as a problem or a question. Such artists, theorists, and critics do not draw out artistic prescriptions based on an understanding of the medium in isolation from any particular use. Rather, questions of shape and color are explored so as to find ways in which shape and color are conditions of possibility for painting. As we shall see, some postmodernist theorists and critics, such as Rosalind Krauss, share a version of Carroll’s worry that interest in exploring a medium demonstrates a commitment to some form of medium specificity: they object to the role of medium exploration in modernist arts, believing that modernist artists continually rediscover the same few automatisms or forms of repetition that they then explore as if they have, through the originality of their creation, taken on the art form itself.

It is common in critical discussions of art to use art form and medium more or less interchangeably and to talk about, for example, “the medium of painting.” It is worth distinguishing between an art form as a particular form of experience and a medium as the material conditions that underlie that form of experience and make it possible. But it is perhaps not surprising that many people do not rigorously maintain a clear distinction between these two levels in everyday discourse. More importantly, widespread talk about the medium of an art form need not indicate an implicit commitment to there being a specific material uniquely associated with each particular art form. Instead, we may take talk about the medium of an art form as indicating a level of analysis. Furthermore, what is picked out by some claim about medium is a contextual question relative to the history of the art form and the particular artistic problematic the work in question explores. If talking about the medium of an art form can indicate a level of analysis rather than a commitment to the medium specificity thesis, then we are open to thinking about the medium as a shifting collection of automatisms or forms of productive repetition that can evolve through the history of the art form.

Though Carroll worries that confusions can arise in using the concept of artistic medium, he acknowledges in his later work that critics and theorists have found productive critical and theoretical uses of the concept. In general, medium analysis that is grounded in exploring how characteristic experiences constituting a work of art or an art form are structured and achieved does not fall prey to the dangers of medium specificity that worried Carroll. For medium analysis pursued in this manner is critical, articulating the automatisms that underlie ongoing artistic practices. It begins with an artistic problem or concern and identifies media by the role they play in the discovery and exploration of possibilities within that framework. It does not start with reflection on the nature of the medium itself in order to deduce what characteristic properties or features are appropriate to exploit for artistic ends. Theorists and critics are most likely to fall victim to medium specificity confusions when they picture the medium of an art form as some type of raw material that can be grasped independently of and prior to its uses within an art form. Instead, our grasp on what a medium is and can be only arises within the context established by an artistic problem or tradition, to be discovered by artists and articulated by critics.

Furthermore, Carroll warns against attempting to identify an art form with a single artistic medium. There is no reason to think that there is some single material condition that constitutes the possibilities within an entire art form. Instead, within art forms, there are different media, different forms of repetition or automatisms, which have significance in structuring characteristic instances of the form. This means that, for example, film itself, considered as a technological innovation, may not always provide the most productive level of unity for medium analysis. Rather, we can ask: What are the various media at work in popular movies, in documentaries, in cartoons, and in particular film arts? How do those automatisms allow for the discovery of the artistic possibilities within particular forms of film art? Approaching medium analysis in this way allows one to locate the concept of artistic medium as a critical tool for understanding modern art experiences in which art is taken to have its distinct forms of experience that have their own norms and standards without necessarily committing to any essentialist assumptions.

3. Theorizing About Art Forms Before the Emergence of the Concept of Artistic Medium

Prior to the 18th century, theorists of art articulated the norms and standards characteristic of a given art form without any reference to a medium or the material conditions that underlie the work of art. Instead, theorists in the ancient and medieval worlds identified particular art forms by articulating the artistic experiences characteristic of those art forms. In so doing, they could then develop an account of what was and was not appropriate within an art form, given the experience at which the art form necessarily aimed.

a. Aristotle

Aristotle’s account of tragedy in his Poetics is one of the earliest examples of this way of theorizing about an art form and is a paradigmatic instance of it. Aristotle develops his account of tragedy as an art form by identifying the characteristic experience—catharsis—instances of the art form provide for audiences. He is able to develop a number of claims about how a tragedy should be structured in order to achieve this characteristic experience. Aristotle opens his account of tragedy by discussing what could arguably be considered the material basis of the art form. He identifies rhythm, melody, and verse as the means by which the effects fundamental to tragedy are achieved. However, he immediately notes that other art forms also utilize those same means and argues that what is specific to tragedy as an art form cannot be identified by analysis of the means by which instances of the art form are achieved. Instead, he turns his attention to the features of tragedy that are specific to the art form by analyzing the experience that structures the form.

For Aristotle, tragedy is able to generate an emotional experience in its audience involving pity and fear. His name for this experience is catharsis. The precise nature of this Aristotelian catharsis has been and continues to be a matter of great debate: it has been taken to be an experience of emotional discharge, of emotional purification, and of moral education, to name just a few interpretations. Fortunately, we do not need to determine what exactly Aristotle took catharsis to be in order to note the general shape of his reasoning about tragedy as an art form.

That tragedy aims for catharsis, a specific emotional experience for those who watch the drama, defines the nature of tragedy as an art form for Aristotle and organizes his analysis of how tragedies are structured. Most importantly, for Aristotle, tragedy can best achieve an experience of catharsis in its audience because it, unlike history or epic poetry, has a dramatic form. The events of a tragedy unfold as the audience watches; the audience apprehends events as they unfold, constituting the unity of the plot as a single action. The audience of a history or epic poem, on the other hand, learns of many actions and events, and their interrelation and history, by means of a narrative rather than dramatic form. Because the audience members for a tragedy witness the events of the drama as they play out in front of them, they are able to understand how those events, on the one hand, have an inexorable logic and, on the other, arise from choices that the protagonist makes that could have been otherwise. It is tragedy’s dramatic form that allows the audience to experience the choices made by the protagonist as both highly contingent, in that other options or other paths are always available, and necessary, in that the protagonist is such that the central action of the tragedy is characteristic of him. This tension between the contingency and the inevitability of the events depicted in a tragedy arises because the audience witnesses the protagonist make a choice and come to live with, or be destroyed by, its consequences. There is, therefore, an intimate connection between the dramatic structure of tragedy as an art form and the emotional cathartic experience available only to tragedy’s audiences.

Tragedy’s characteristic experience of emotional catharsis for the audience accounts for a number of features of the art form. Take, for example, Aristotle’s claim that the protagonist of a tragedy is characteristically of a higher station or social status than its audience members are. On his view, audience members who recognize the protagonist as their social better are in a position to achieve the appropriate emotional catharsis because the events that unfold are understood to be the result of the protagonist’s choices and not merely the result of intractable or unfortunate circumstances; the heights from which the protagonist falls clarify the consequences of his action. Whether or not we agree with Aristotle about the reasons he offers for the fact that Greek tragedies characteristically centered on royal figures or even demigods, we can see that he is identifying the art form’s characteristic experience in order to explain why the art form has the features it does. Other features of tragedy that Aristotle accounts for include the extent to which the central action is grounded within basic familial structures and tensions and the role of the protagonist’s ignorance in the completion of the action.

Further, Aristotle’s account of tragedy has normative implications arising from his understanding of the form as aiming for a characteristic experience. In articulating the characteristic experience for the audience at which the art form aims, Aristotle identifies and explains why certain features, such as the high social status of its protagonist and the protagonist’s ignorance, aid in producing catharsis. Thus, his account of tragedy and how it is structured outlines a set of norms and standards that arise from the aim of achieving the art form’s characteristic experience.

b. Aristotle and Horace as Models for Theorizing Art

Aristotle’s analysis of tragedy as structured by the aim of achieving a characteristic experience in its audience is paradigmatic of artistic analysis and theory in the ancient Greek and Roman world and then again in Europe up through the 18th century. Horace’s Ars Poetica, for example, identifies poetry by its characteristic experience, namely, an experience of apprehending unity of action. Beginning in the Renaissance, Europeans theorized more extensively about art and art forms, relying on Horace and, to a lesser extent, Aristotle to justify a variety of accounts of art forms and the characteristic experiences that establish their norms and standards. For example, literary theorists such as Lodovico Castelvetro and Julius Caesar Scaliger, writing in 16th century Italy, both defended poetry as an imitative art (following Horatian principles) and argued for the legitimacy of literature produced in popular, vernacular languages. Such arguments were soon adopted across Europe by theorists such as Joachim du Bellay and Philip Sidney, who developed largely Horatian-style defenses of vernacular poetry as an imitative art. In the 17th century, Aristotle’s Poetics achieved a kind of prominence as a model for theorizing the norms of art forms; especially in France, dramatists such as Pierre Corneille and literary figures associated with the Académie française took Aristotle to argue for the unity of action, of place, and of time as fundamental to dramatic structure.

Although it is certainly possible to identify appeals to medium within this tradition of literary criticism inspired by Aristotle and Horace, most obviously in the arguments made in favor of works written in vernacular language as a legitimate means of artistic expression, by and large this Horatian tradition of poetics centers on analysis of the art form’s norms and standards by reference to an artistic aim, characteristically imitation, which structures the particular type of artistic experience.

c. Music

By contrast, theorizing about music, both in the ancient world and in the European tradition prior to the 18th century, did make appeal to what we might think of as the medium of music in articulating the norms underlying musical practices. However, what was identified as music’s medium itself changed over time. This is not a weakness of these theoretical accounts but rather an illustration of Adorno’s contention, noted above, that our understanding of the nature of the medium of some art form is itself a product of the history of that form: there is no fixed, ahistorical characterization of music’s medium because what musical experiences are is not fixed but continually discovered in composing and playing music.

The earliest theories about the nature of music within the Western tradition held that music is the expression of a set of natural ratios that are equally expressed macroscopically in the movement of the celestial spheres and microscopically in the harmonies of the human soul. Pythagoras and his followers placed mathematical and musical knowledge at the center of their studies and influenced Plato and other ancient philosophers. One prominent example of this ancient tradition of theorizing music as an expression of natural harmonies is Boethius, a Roman statesman and Neoplatonist philosopher from the 6th century C.E. In his De institutione musica, Boethius distinguishes between three types of music: music of the spheres, music of the human spirit, and instrumental music. Boethius’ account of music as the expression of the natural macroscopic and microscopic harmonies of the universe remained important throughout the European Middle Ages.

As the musical possibilities within the European tradition developed from monophony to polyphony in the late Middle Ages and Renaissance and then, during the Baroque era, to more complex contrapuntal forms of composition, theorizing about the nature of music also changed. Much medieval theorizing was directly inspired by Boethius and oriented around practical instructional concerns for composers and musicians. Indeed, much of the history of Western musical theory is bound up with theories of tuning. By the 16th century, there emerged in the work of theorists such as Gioseffo Zarlino a new theoretical category, temperament, which made possible accounts better suited to describing the innovations in polyphonic and contrapuntal composition. Another strand of European music theory that emerged during and after the Renaissance drew on concepts from ancient rhetorical theories in order to describe the space of musical possibilities. By the end of the 18th century, such rhetorical analysis had largely been supplanted by a new type of analysis of musical forms, such as the sonata, in the work of Heinrich Koch and others. These theoretical developments around the nature of musical experience were responsive to the evolving nature and increasing complexity of music composition and performance in the 18th and 19th centuries.

The preceding is not meant to be a comprehensive overview of theorizing about art and art forms in the ancient and European traditions prior to the 18th century. Importantly, theorizing about artistic medium did not have the central place in theorizing about art forms more generally that it came to occupy beginning in the 18th and 19th centuries. As art established itself as a relatively autonomous region of experience, the concept of medium emerged as a critical means of understanding distinct types of artistic experience. Thus while we can, in retrospect, identify candidates for music’s medium, there is something anachronistic about thinking of them as examples of theorizing about artistic medium, since theorists like Boethius, for example, did not categorize music as one type of artistic experience among others. As Europeans began to conceive of art as a distinct region of experience, the Aristotelian and Horatian model for articulating the norms of an art form by reflection on its overall aim was fundamentally modified with the introduction of sustained consideration of the medium as the means for achieving a particular type of artistic experience.

4. Gotthold Lessing and the Problem of Art in the 18th Century

With his Laocoön: An Essay on the Limits of Painting and Poetry, published in 1766, Gotthold Ephraim Lessing is often identified as the first theorist or critic to engage in medium analysis. In that essay, he articulates the standards by which painting and poetry should depict bodies in action through an analysis of the spatiotemporal conditions under which the art forms are experienced. Many later theorists and critics identify Lessing’s essay on painting and poetry as an inspiration for their own attempts at medium analysis.

a. Art in the 18th Century

Before examining the particulars of Lessing’s own analysis of painting and poetry, it will be helpful to note something of the wider context of art and art criticism in 18th century Europe. By the mid-18th century in Europe, art had become its own distinct realm of experience, the culmination of a long and complex process in which artistic creation gradually decoupled from religious expression. It is a reflection of art’s new status as a distinct form of experience that the 18th century in Europe saw art history and art criticism emerge as intellectual practices and aesthetic experience become an important topic for philosophers. Johann Winckelmann, for instance, developed the first comprehensive account of ancient art, distinguishing between Greek, Greco-Roman, and Roman art, and explicitly took up the Greeks in particular as a model for contemporary artists. Denis Diderot, among his many other accomplishments, began, in 1759, writing critical reports on the biennial Paris Salons for a German newsletter, offering evaluations of particular artists and paintings and, equally, developing a critical account of the experiences at which painting should aim.

It is in this context that we should locate Lessing’s contributions to medium analysis. In writing his Laocoön essay, Lessing shared with Winckelmann and Diderot (all three writing more or less concurrently in the 1750s and 1760s) an awareness of art as a distinct form of experience and thus as posing its own particular questions and distinct problems. Lessing, like Winckelmann, held that art, inasmuch as it was distinct from religious experience, should take beauty as its ultimate aim. Further, the ancient Greek, Hellenic, and Roman artists provide the best model for contemporary artists in large part because, being pre-Christian, their work reflects an unadulterated focus on artistic beauty for beauty’s sake. Later Christian artists were, on this view, required to maintain a sort of double allegiance to the demands of beauty and the teachings of the Church, to the detriment of their work artistically.

Like Diderot, Lessing was interested in art’s ability to generate an experience of a kind of moral or spiritual beauty in its audience. Both Diderot and Lessing believed that painting, for example, can show moments of beauty that are not exclusively visual in nature by encouraging audiences to imagine moral and spiritual possibilities that we do not ordinarily encounter or recognize in our everyday lives. This aesthetic aim means that painters should choose a revelatory moment within the action depicted that offers the chance to think through the nature of that action.

b. Lessing on Painting and Poetry

Lessing’s work of medium analysis in the Laocoön essay begins with an art historical question: did the Laocoön Group, a statue excavated in Rome in 1506 and currently on display at the Vatican, precede or come after Virgil’s account of Laocoön and his death in the Aeneid? In the Aeneid, Virgil recounts the story of Laocoön, a Trojan priest, who warns against bringing the Greek offering of a giant horse statue into Troy. Snakes sent by the gods kill Laocoön and his sons; the Trojans interpret this as a sign from the gods that Laocoön should not be heeded and bring the offering into the city, ensuring their ultimate doom. The question Lessing sets out to answer, whether the sculptors inspired the poet or vice versa, serves as a jumping-off point for Lessing’s broader interest in establishing the different norms and standards that govern painting and poetry.

Lessing characterizes painting and poetry in quite abstract and capacious terms. He defines poetry as any art form that unfolds in time and painting as inclusive of all art forms that are visual in nature. Distinctions between the different materials out of which works of art are made (between marble and oil paint on canvas, say) are not relevant for Lessing’s analysis: he abstracts away from what will later be thought of by some critics and theorists as distinct artistic mediums in order to characterize painting and poetry in terms of the spatiotemporal experience of the audience in apprehending the work. This stands in contrast with later analysis of artistic medium, which often centers on the particular matter out of which works of art are made.

In fact, though often (rightly) credited as the first critic to offer an analysis of artistic medium, Lessing himself does not describe painting and poetry as artistic mediums. It is worth noting that there was no widespread appeal to a concept of artistic medium until the middle of the 19th century, when a term that had its home in scientific contexts was extended into artistic contexts. So, although Lessing is correctly credited with the first developed medium analysis, he describes painting and poetry as different methods for achieving a particular artistic experience.

Lessing’s account of painting and poetry as distinct methods for achieving a particular artistic experience modifies the mode of analysis within which theorists of art since Aristotle had worked. That Aristotle-inspired mode of analysis developed an account of the norms and standards governing an art form by identifying the experience characteristic of the art form and generating an account of the features of the form in light of their contribution to the overall experience aimed at. Similarly, Lessing’s analysis of painting and poetry takes as its starting point a particular artistic aim, namely, the audience’s imaginative apprehension of bodies in action as beautiful, and distinguishes between two different methods for achieving that experience. Unlike Aristotle, Lessing has in mind a general type of experience, the imaginative apprehension of bodies in action as beautiful, achieved by two different methods. Lessing is able to offer an account of the different artistic norms governing painting and poetry because he takes them up as different means by which a kind of artistic experience can be achieved, where each means is constituted by a distinct spatiotemporal structure.

According to Lessing, because signs contiguous to other signs best represent objects contiguous to other objects, painting’s appropriate subject matter is bodies at a single moment of time. Similarly, because signs that succeed one another best represent objects that succeed one another in time, poetry’s appropriate subject matter is actions unfolding in time. Lessing argues that the material conditions of the method determine what is appropriate to that art form:

If it is true that in its imitations painting uses completely different means or signs than does poetry, namely figures and colors in space rather than articulated sounds in time, and if these signs must indisputably bear a suitable relation to the thing signified, then signs existing in space can express only objects whose wholes or parts coexist, while signs that follow one another can express only objects whose wholes or parts are consecutive.

Objects or parts of objects which exist in space are called bodies. Accordingly, bodies with their visible properties are the true subjects of painting.

Objects or parts of objects which follow one another are called actions. Accordingly, actions are the true subject of poetry. (78)

Some subsequent critics and theorists have read Lessing here as offering two distinct tasks for painting and poetry, depicting bodies and depicting action. But Lessing considers these to be two different methods for achieving a single effect, namely, getting the audience to imagine bodies in action. For he completes the thought by noting that “painting too can imitate actions, but only by suggestion through bodies…. Poetry also depicts bodies, but only by suggestion through action.” (78) Because Lessing is interested in painting and poetry inasmuch as they are different methods for encouraging audiences to experience beauty through imagining bodies in action, he immediately goes on to outline the norms that govern these different methods. Poets should construct their descriptions of actions by referencing each body participating in the overall action in terms of a single characteristic as it makes its contribution to the action. That allows the audience to imagine each body’s role in the overall action and so not be distracted by unnecessary detail or description. Painters should, according to Lessing, choose to depict the single moment of an overall action that best encourages the audience to imagine the action and that offers the audience particular insight into what is at stake in the action.

Homer and the sculptors who created the Laocoön Group are exemplary artists in Lessing’s view because they grasp the norms at work in their respective art forms. Homer’s mastery in part stems from his ability to offer descriptions of scenes that are centered around and grow out of an action, as when he describes Agamemnon’s armor and regalia as he is in the act of donning it. According to Lessing, Homer’s usual practice is not to linger on description for its own sake but only to describe objects in the midst of action and only in terms of a single distinct characteristic, so as to encourage the audience’s imagination: for example, he evokes black-prowed ships skimming over a wine-dark sea. Likewise, the sculptors of the Laocoön Group are exemplary inasmuch as they have chosen a moment before the snakes have crushed and broken Laocoön and he has started to scream. Instead, they show his resistance to his suffering, the way in which he is enduring the pain and suffering that will inevitably overwhelm him. In this way, the audience is able to imagine both the enormity of his suffering and the spiritual beauty of his resistance in the face of immense suffering.

Lessing’s analysis of painting and poetry, therefore, identifies them as distinct methods for exploring a shared artistic problematic and considers them insofar as they constitute the aimed-for artistic experience. He is able to articulate a set of critical norms and standards for the art forms by reflecting on their different underlying spatiotemporal conditions. Lessing’s analysis establishes the dependence of the norms of painting and poetry on the spatiotemporal conditions of the experiences of those works of art. This dependence is the result of his choice to begin not with the material conditions of the art forms, but by locating those art forms as participating in a particular artistic aim, that is, the demand that painting and poetry encourage their audiences’ imaginative apprehension of the beauty possible for bodies in action. In turn this aim determines painting and poetry as methods and gives Lessing’s normative recommendations the force they have.

c. Herder and Hegel

Lessing’s theorization of artistic medium proved influential as questions of art and aesthetic experience moved to the center of philosophical thought at the end of the 18th and the beginning of the 19th century. Johann Herder, for example, in his Sculpture: Some Observations on Form and Shape from Pygmalion’s Creative Dream (1778) extends and complicates Lessing’s approach to artistic medium by denying that painting and sculpture could, as Lessing held, be understood in the same terms because they both offer a single moment of an action up to the audience for contemplation. Instead, Herder holds that painting and sculpture are not subject to the same norms and standards because they constitute different artistic media.

Most decisively, the concept of artistic medium plays a central role in the thought of Georg Wilhelm Friedrich Hegel, arguably the most influential philosopher of the early 19th century. On Hegel’s view, the norms and ideals that structure human interactions develop out of and so are made explicit in the history of human political development, intellectual development, and artistic development. Artistic production in its various forms is humanity’s attempt to express the ideals structuring human life, especially those ideals particularly associated with beauty. But if this is the case, then the question arises: Why are there different art forms, given that they all are structured around a shared, if at times inchoate, desire? Hegel’s answer to this question, most developed in his Lectures on Aesthetics (published posthumously in 1835), is that different artistic media serve as the basis for and so give rise to the possible forms of expression within particular art forms. In Hegel’s work, we have perhaps the clearest example of the consequences of conceiving of art as a distinct form of human experience, separate from religion and politics, for example. In thinking of art as a general field of experience within which we can distinguish distinct art forms, the concept of artistic medium is critical in allowing Hegel to maintain the unity of art in general while still distinguishing clearly between particular art forms and the norms and standards that govern them.

5. The Invention of Photography and the Discovery of Its Artistic Possibilities

a. The Etymology of the Term “Artistic Medium”

As noted above, Lessing does not describe his analysis of painting and poetry as an analysis of artistic medium. He talks about painting and poetry as different methods for achieving an experience. Indeed, it was not until the middle of the 19th century, almost 90 years after Lessing’s Laocoön essay, that medium began to be used in artistic contexts, referring to the material conditions out of which works of art are made. The Oxford English Dictionary notes that the earliest use of medium in an artistic context, signifying the raw material out of which a work of art is made, is from 1861. This new use of medium in an artistic context grew out of an earlier use that describes the substance (such as oil or water) that painters mix with pigment to create paint. We still speak of oil paint as a distinct medium that differs from watercolor; this use is an extension from an earlier one that identifies oil and water as media in which pigments are mixed.

b. The Challenge of Photography

While the first uses of medium in artistic contexts often referenced the material out of which paint and then paintings were made, the timing of this development is likely connected to a radical problem that gripped the art world of the 19th century, namely, the emergence of technologies that reproduced images: first lithography, and then photography, which reproduces images of a world now past. In the 1820s, Nicéphore Niépce developed the first photoetching process; in the 1830s, a number of inventors, most prominently Niépce’s partner Louis Daguerre and William Fox Talbot, worked independently on developing photographic processes that were capable of capturing images mechanically with much shorter exposures. By the end of the 1830s, both Talbot and Daguerre had publicly debuted their technology for mechanically capturing and reproducing images from the world. The invention of photography was widely felt as a challenge to the received understanding of what could be art and what artistic experiences were proper to painting specifically. But the debates surrounding photography and painting in the 19th century largely centered on whether or in what ways photography could serve as the means of artistic expression.

The new photographic technology was, in many cases, quickly distinguished from processes by which works of art were produced and dismissed as being incapable of producing art. The most prominent argument made against the possibility that photography could be art was based on the mechanical nature of the photographic process. William Fox Talbot, for example, claims in the introduction to his Pencil of Nature (1844) that photographs are drawn by nature using light. Talbot’s view that photography was the production of natural images by mechanical means alone, without the intervention of any human artistry, was widely shared in the 19th century and taken as grounds for concluding that photography could not ultimately be an artistic medium. If the photograph is made by the interaction of natural processes of light and chemicals, then it cannot be a work of art, any more than a tree or a sunset could be.

The 19th century debates about the possibility of photographic art seem, from our 21st century vantage point, hopelessly misguided. But this is largely because we are the recipients of an understanding of what can count as art that has been altered by the development of reproductive technologies like photography and film. Throughout most of the 19th century, it seemed obvious to a large number of critics that the mechanical nature of photography excluded it straightforwardly from consideration as art. Charles Baudelaire, for example, in his Salon of 1859, worries that the public is starting to mistake photography for art by taking a mechanical means of image reproduction to be capable of inspiring imagination. Photographers, on this view, were mere technicians, only capable of reproducing natural images by exploiting the laws of nature. Because there was no human creativity at work in producing the photographic images, those images could not be art and the photographers were not artists.

Those who argued that photography could be art generally took one of two lines of response. On the one hand, many held that, while photography was ultimately a mechanical process, it could be artistic inasmuch as it is able to mimic painting and the artistic experiences of which painting is capable. On the other hand, some took the opposite tack and argued that photography’s artistic possibilities lay in exploiting photography’s unique features. On the first line of response, photography was artistic to the extent that it was able to look like painting or otherwise reproduce it. On the second, photography was artistic to the extent that it distinguished itself from painting’s artistic possibilities by taking advantage of the features that only it possessed.

Photographers attempted to imitate painting in several ways. One of the earliest uses to which photography was put was the reproduction of paintings in order to disseminate widely what would otherwise require a pilgrimage to see. Further, some early photographers took photographs that essentially reproduced the subject matter of earlier paintings by staging scenes reminiscent of those paintings. These genre photographs are the primary objects of Baudelaire’s denunciation of photography as essentially lacking in human creativity. Finally, some photographers began to produce photographic experiences that mimicked experiences recognizable from painting, utilizing soft focus, for example. Julia Margaret Cameron’s photographic work exemplifies many of these quasi-painterly techniques and is deeply influenced by Pre-Raphaelite painting.

On the other hand, many photographers and critics, beginning with the earliest instances of photography, emphasized aspects of photographic production that were taken to be unique to it and therefore unlike painting. In announcing Daguerre’s invention to the French Academy of Sciences in 1839, François Arago emphasized the precision of the photographic image and its ability to allow its audience to see aspects of their world that they would otherwise be unable to see. During the 19th century, photographic practices evolved to capture moments and aspects of the world that are fleeting. Eugène Atget, for example, developed his photographic productive practice around capturing aspects of Parisian life that were disappearing as the city continually modernized, memorializing scenes from a form of life that was fading away. Walter Benjamin, looking back at the development of 19th and early 20th century photography from the vantage point of the 1930s, articulates photography’s unique features in terms of its ability to allow us to become conscious of what otherwise passes before our eyes unrecognized. On Benjamin’s view, photography’s ability to reveal what he terms the optical unconscious constitutes its distinctive power and its value as an artistic medium.

c. Accounting for Photography’s Artistic Possibilities

The problem posed by photography and its relation to art generally and painting specifically led to an approach towards questions of artistic medium distinct from the approach pioneered by Lessing. Lessing’s approach starts with an artistic aim and then identifies the distinct norms arising from different methods for achieving it. Photography instead presented itself as a problem: the question is not how best to achieve a particular artistic experience with a mode of expression or set of material conditions, but instead, to what extent, if any, can this new mode of expression be artistic? The emergence of this new technology raised a pressing question: Can it serve as the basis for artistic creation and, if so, what aspects or features of it are most appropriate for creating art? This approach to medium analysis begins by identifying a particular medium and its unique features or characteristics and determining what artistic experiences artists using the medium should pursue.

Both the early defenders of and objectors to photography’s artistic value follow the same basic template for analyzing medium. First, the critic identifies the medium (the photographic technology that constitutes a new mode of expression) and determines its unique features. Then, the theorist or critic evaluates the ways in which those unique features can generate artistic experiences. There are two ways this evaluation can happen. One can reflect on the nature of the medium and the unique features of its productive process and try to deduce an absolute a priori claim about its artistic possibilities independently of critical examination of the works produced; this approach generates confusions or, at best, tendentious critical prescriptions. Alternatively, one can engage in a critical investigation of work that utilizes the unique features of the medium in order to articulate how new artistic possibilities are being generated. Baudelaire’s dismissal of photography’s potential for artistic value exemplifies the first possibility. Benjamin’s critical examination of Atget’s work exemplifies the second.

6. Modernism as the Discovery of Medium

a. The Emergence of Modernism

As modernism transformed almost all traditional art forms more or less simultaneously during the first half of the 20th century, artistic medium became one of the crucial art critical concepts not just for theorists and critics but for artists as well. For modernist artists, inheriting traditional art forms meant querying the conditions of possibility underlying the art form in order to determine, through discovery and exploration, the necessary conditions for contemporary instances of the art form. For this reason, modernist arts often seemed to critics and some artists to be exercises in shedding, as some things taken to be essential to the form earlier in the tradition are discovered to be mere conventions and thus no longer conditions for contemporary instances of the art form.

There are too many important modernist artists across a wide variety of traditional art forms to give a comprehensive survey of them here. However, it is worth identifying a few of them in order to emphasize the modernist concerns that were deeply shared by artists across a broad swath of different art forms. In dance, for example, Isadora Duncan, the American dancer, rejected the inherited tradition of ballet techniques and thought of her own practice as the exploration of dance’s medium, the human body in its freedom of movement and gesture. In his “Ornament and Crime” (1913), Adolf Loos, an Austrian architect, critic, and theorist, argued against unnecessary architectural ornamentation in ways that heralded the modernist emphasis on purity of form and design. The modernist architectural commitment to form following function in design and the general dictum that buildings should be “machines for living” culminated in the work of architects such as Le Corbusier, Walter Gropius, and Ludwig Mies van der Rohe. In literature, the work of Gertrude Stein, James Joyce, and Franz Kafka, to name only a few, all differently exemplify modernist commitments. In Joyce’s work, for example, we can see a broad modernist development from an exploration of the history of forms of literary expression in Ulysses (1922) to an obsessive examination of the expressive possibilities of language itself in his final book, Finnegans Wake (1939).

b. Modernism and 20th Century Music

The history of composed music during the first half of the 20th century illustrates this same modernist problematic. By the 1920s, a number of composers began to explore new and unconventional forms of composition, including serial and atonal composition. Among the most prominent of these modernist composers was Arnold Schoenberg, who developed twelve-tone technique or row composition. Other notable modernist serial composers included Anton Webern and Karlheinz Stockhausen. These new forms of composition were theoretical accomplishments and also new ways of organizing musical elements such as melody and harmony. Not only did these developments in composition provide new systems of musical organization, but they readjusted audiences’ understanding of older forms of composition now thought of as, for example, limited to tonal relations in contrast with modernist atonality. As Theodor Adorno observes, Schoenberg’s twelve-tone technique, for example, stands in contrast with 19th century composers who manipulate and transform the repetition of certain basic musical relations. In Schoenberg’s compositions, however, there is no room for repetition. Rather, the composer moves through a series of distinct relations between pitches. Usually these rows of related pitches do not get repeated but explored once, and then a new row is generated. No longer are composers straightforwardly exploring relations between melody and harmony by the repetition and manipulation of a few themes or motifs. Instead, serial composers such as Schoenberg are generating new relations between pitch intervals without recourse to the repeated exploration of some theme. Thus, in the modernist exploration of music’s possibilities, the medium of music itself undergoes radical developments. In other words, modernist composers no longer took for granted compositional techniques or assumptions that had for prior generations seemed obvious or unproblematic. Instead, modernist composers aimed to generate an entire system of composition and so too a theoretical articulation of the constraints and rules by which their particular system of composition operates. As these modernist questions gripped more composers, the possible compositional systems and their accompanying theoretical justifications proliferated.

c. Fried on the Value of Modernism

Why modernism should have taken hold in a number of traditional art forms more or less simultaneously in the early part of the 20th century remains an important question, one that cannot be answered directly in this article. We will simply note that it did happen and that although many artists and critics embraced the modernist moment within traditional art forms as the promise of clarifying what was truly necessary for those arts, the modernist moment also clearly marked a kind of crisis for those traditional art forms, in which that which had previously been accepted as the possible basis for serious work within the form no longer satisfied artists or audiences.

The logic of modernism is important for understanding the concept of artistic medium because the exploration of the art form’s medium in its purity was central to it. This article will focus on one clarifying example, the critical discourse analyzing and justifying modernist painting in the 1950s and 1960s, in order to bring out the characteristic structure of reasoning about medium in artistic modernism. Three critics in particular, Clement Greenberg, writing in the 1940s and 1950s, and Stanley Cavell and Michael Fried, writing in the 1960s, championed the modernist project in American painting and sculpture; their work offers a perspicuous example of the logic of modernism as an exploration of artistic medium. Greenberg’s “Modernist Painting,” in particular, is an early statement of critical purpose, justifying the modernist project of medium exploration for its own sake. Greenberg saw the modernist project as akin to the Kantian commitment to critical philosophy: like Kant, modernist artists, on Greenberg’s view, engage in a project of criticism, reflecting on the nature of the form in its purity by discovering and articulating its limits. Cavell, in his “A Matter of Meaning It” (1969), identifies modernism as the realization of an art form’s artistic media through the discovery of its contemporary conditions of possibility. The work of the modernist artist, according to Cavell, is to find the criteria for an instance of an art form in the act of inheriting that form.

The dominance of this modernist problematic was challenged in the 1960s as minimalist or conceptual art on the one hand and pop art on the other developed alternative artistic possibilities to be explored. These alternative artistic programs competed with modernist painting by rejecting painting and sculpture altogether as forms for artistic expression. Instead, the aim was to cultivate forms of experience in ways not bound by painting’s forms, its problematics, and its media. For example, pop art was interested in exploring the image and contemporary experiences of images as such, rather than posing the image as a problem situated merely within the history of painting.

This confrontation between modernism on the one hand and pop art, minimalism, or conceptual art on the other was felt as a crisis involving the very existence of painting and sculpture as art forms by a number of artists and critics. Michael Fried, in “Art and Objecthood” (1967), offers perhaps the strongest critical polemic on behalf of modernist painting and sculpture. Fried identifies recent developments in painting as responding to a conflict between minimalists and modernists about how shape should function as an artistic medium:

What is at stake in this conflict is whether the paintings or objects in question are experienced as paintings or objects, and what decides their identity as painting is their confronting of the demand that they hold as shapes. Otherwise they are experienced as nothing more than objects. This can be summed up by saying that modernist painting has come to find it imperative that it defeat or suspend its own objecthood, and that the crucial factor in this undertaking is shape, but shape that must belong to painting—it must be pictorial, not, or not merely, literal. (151)

Fried here identifies the minimalist project as taking what he terms a literal approach to shape, for example, in which shape on its own is apparently explored for its artistic possibilities. By contrast, for Fried the modernist project takes the art form itself as an artistic problematic or a contemporary question and the medium exploration is in service of the exploration of that problematic: What now are the conditions of painting?

Fried’s objection in condemning conceptual artists and minimalists as literalists is that exploration of the medium as such loses connection with what is possible within traditional art forms like painting and sculpture; namely, an aesthetic experience. Painting and sculpture aim at the production in their audiences of a shared moment of judgment, a moment of judgment that audiences together take pleasure in extending and contemplating. The literalists, on the other hand, construct for their audiences experiences that cannot be shared in a single moment of judgment but are necessarily individual explorations of objects within a space over some duration. Fried thinks conceptual and minimal artists offer audiences theatricalized experiences, unfolding for each individual in time without the possibility of a shared moment of aesthetic judgment. For Fried, it is not possible to arrive at the unity of an aesthetic experience simply by the exploration of material conditions in themselves, cut loose from any artistic problematic or aim. In contrast, modernist artists committed to the traditional art forms are interested in discovering the material conditions for experiences that demand aesthetic judgment. The modernist worry, articulated by Fried and Cavell, is that the possibility for authentic experiences of art is lost when the questions of artistic medium no longer arise in relation to an existing art form and its traditions. Substituting theatricalized experiences for serious artistic experiences will mean that people no longer have experiences that are both aesthetic and ascetic. Grounding one’s explorations within the history of an art form in order to offer a contemporary instance of the art form calls for an appropriately serious response on the part of the beholder, a response that demands self-work on the part of the beholder in ways that enrich both the experience and the beholder. In contrast to aesthetic literalism, the modernist project thus involves the cultivation of aesthetic judgment; through contemplation, better understanding of the relation between the present instance of the form and the history of the form is achieved.

d. Postmodernism

For those artists and critics committed to modernist art, the task at hand was the survival of traditional art forms through a radical exploration of what is most essential to a particular form. In so doing, the modernist artist aims to continue the art form by making an original contribution to the tradition and creating work that discovers artistic possibilities on behalf of the art form. But to those artists and critics who emerged in the wake of the modernist moment, this stance of the heroic artist revealing possibilities for an art form through creating new instances of the form came to seem inappropriate and a bit self-aggrandizing. Postmodern critics and artists in the 1970s and after developed new approaches to the history of traditional art forms. Rosalind Krauss in “The Originality of the Avant-Garde” (1981) argues that modernists and avant-garde artists imagine that they make themselves the new origin of the art form as they continuously discover its essential conditions. Such modernist artists continually rediscover a few prominent automatisms, forms of repetition, as if they were the essence of painting and their discovery were an act of artistic originality. Krauss argues that rather than discover the essential material conditions of the art, modernist artists returned again and again to a fundamental form of repetition activated throughout the history of painting; namely, the grid. Avant-garde and modernist artists from this point of view do nothing but treat the various forms of repetition and automatisms that constitute the history of the art form as original individual discoveries of the grid and its possibilities for painting.

Postmodern art is characterized by a change in relation to an art form’s tradition. Rather than attempt to investigate the necessary material conditions for contemporary expression within the art form, the postmodernist attempts to otherwise activate a tradition’s discarded conventions through exaggeration, juxtaposition, and unabashed repetition. Modernism’s approach to tradition is to strip away everything conventional and inessential in order to discover the fundamental conditions of the art form. Postmodernism instead approaches an art form’s tradition as a collection of automatisms to be explored and activated again through conscious repetition. Rather than discarding all that is unnecessary, the postmodern artist juxtaposes or exaggerates disparate conventions and so hopes to rediscover possibilities within forgotten automatisms that modernism would have discarded. From the point of view of postmodernism, Fried’s attachment to the experience of working on oneself in order to better behold modernist works of art looks like a self-aggrandizing response analogous to the modernist artist’s “genius” in laying bare the conditions of painting as such. On the other hand, if Benjamin is correct that the invention of lithography and subsequent technologies that produce and reproduce images irrevocably accelerated European art’s transition from forms of experience centered around cult value to forms of experience centered around exhibition value, then modernist commitments to the possibilities of aesthetic-ascetic experience in inheriting traditional art forms may be seen as late attempts to sustain the possibilities of art with cult value.

Modernism coheres around the concept of artistic medium, for the aims of modernism are to discover the possibilities and thus the limits, the strengths, the tensions, the contradictions within an art form and its history by discovering how the form’s material conditions can be transformed into new and newly definitive instances of the form. Because postmodern artists return to the history of the form to discover its discarded conventions and automatisms rather than discarding them, they no longer think of media in terms of the essence of an art form. But although medium is not central to postmodern art, it is nonetheless still useful in critically evaluating works of art. Postmodernism emerged in the wake of modernism; the break with the history of traditional art forms that constituted the modernist moment was a break from conventions that no longer provided conviction for artistic expression. Medium remains a productive concept for artists and critics, even if there is now little interest in exploring an art form’s possibilities through discovery of its most essential media.

7. New Forms of Popular Art in the 20th Century

The 20th century saw the emergence of a succession of new forms of popular art, including movies, comics, and video games. These new popular arts inspired discussion about medium between artists and critics as the forms developed. Especially early in the lives of new popular art forms, questions of medium and medium analysis seem pressing to both artists and critics. This is because new art forms grow by borrowing artistic problems and aims from related earlier forms and by exploring a different material basis that makes new forms of artistic expression possible and in which artistic questions and interests can be pursued, critiqued, or otherwise engaged.

a. Movies

Movies and film criticism are an exemplary instance of a new form of popular art generating elaborate and often productive discourses about medium. Much of the early history of film criticism and film theory is marked out by exploration of a number of questions related to film as a medium. As film theory began to establish itself as an academic field of interest in the early 1970s, interest shifted away from questions of medium. But early in the development of the movies as an art form, film’s potential for artistic expression was a prominent critical conversation between theorists and artists.

In the 1920s, Soviet filmmakers and critics such as Sergei Eisenstein, Vsevolod Pudovkin, and Dziga Vertov were engaged in a critical discourse about film’s potential for popular art in writing and with their movies. Eisenstein, for example, argued that film’s unique and characteristic feature was montage, the juxtaposition of images through editing into a sequence. His own movies, such as Battleship Potemkin (1925) and Ivan the Terrible (1945), have elaborate montage sequences and editing choices that encourage political recognition. Likewise, Pudovkin claimed that montage and juxtaposition of images through editing can change the meaning of images. For both, the emphasis on montage as a unique and characteristic feature of film being central to its possible artistic experiences stems from their interest in the ways in which juxtaposing images can generate both abstract judgments and strong emotional responses. Dziga Vertov’s exploration of the photographic and mechanical basis of the film image was central to his artistic and political project of discovering new ways of allowing his audiences to understand the world around them. He emphasized film’s ability to reveal minute features and gestures, otherwise unseen or unnoticed, so that the audience is able to recognize them as characteristic of the overall environment. One exemplary instance of Vertov’s exploration of the nature of film as a medium of expression is his experimental movie, Man with a Movie Camera (1929).

A number of early film critics developed an analysis of film’s artistic and political promise around its photographic basis. In “The Work of Art in the Age of Its Technological Reproducibility” (1939), Walter Benjamin emphasized photography’s ability to make visible minute aspects and gestures so as to display the character of people and environments. Similarly, popular movies offer the opportunity to develop new habits of perception that allow audiences to recognize fraught meaningful gestures. Benjamin’s medium analysis is exemplary, for he explicitly asks of all reproductive technologies that have developed after lithography not, “What features of these material conditions are unique and thus capable of artistic experiences that take advantage of those features?” but rather, “How does the existence of these new reproductive technologies change what art can be?” Benjamin’s form of medium analysis is historically and critically grounded in successful instances of emergent forms of popular art.

Rudolf Arnheim, also writing film criticism in the 1920s and 1930s, offered an analysis of film’s potential as an artistic medium. Unlike Benjamin, who emphasizes film’s ability to reveal the optical unconscious, Arnheim identifies the ways in which the film image differs from everyday images and derives from those features the norms that should serve as the basis for film art. Arnheim’s critical blinders and commitment to an idea about purity of medium led him to argue against the possibility of film art that includes sound because film and sound are distinct media and should not be mixed.

Another early theorist committed to a medium analysis of film and photography is Siegfried Kracauer. Kracauer’s critical articulation of film’s artistic possibilities stems, like Benjamin’s, from the unique capabilities of photography and film, and so he identifies and encourages film’s documentary and democratizing impulses.

André Bazin, a midcentury French film critic who championed the Italian neorealists and cultivated a generation of French film critics and filmmakers, developed an analysis of film’s photographic basis. Bazin emphasizes photography’s ability to satisfy absolutely the desire to preserve the world as the basis for a critical understanding of film and its possibilities. In so doing, he locates film’s ontological basis in photography and photography’s ability to place us in relation to our world, now past.

This tradition of exploring the meaning of film’s ontological status as photographic includes Stanley Cavell’s work, especially his The World Viewed: Reflections on the Ontology of Film (1971). Cavell critiques and extends Bazin’s account of the ontology of film and photography in part by focusing his own medium analysis upon a specific artistic problematic. The World Viewed offers a medium analysis not of film as such, but of popular Hollywood movies. According to Cavell, popular movies prior to the 1960s explored the possibilities and tensions within a problematic of modern action that emerged in the 19th century concerning the possibilities for urbane, stylish, and productive action, but by the late 1960s, a new problematic concerning the contemporary possibilities for action simpliciter was emerging. The World Viewed was written in observance of this transition within popular movies and draws on the conceptual tools of medium analysis in order to register the fact of this transformation.

If the beginning of the 1970s saw the emergence of a new problematic for popular movies to discover and explore, it also saw the establishment of film theory as an academic discipline. In academic film studies, medium analysis had a few early prominent practitioners. Leo Braudy’s The World in a Frame: What We See in Films (1976), for example, offers an analysis of film’s artistic possibilities by distinguishing between the ways in which movie worlds are both closed off from and open to and interpenetrate with our world. This tradition of medium analysis of film’s photographic basis within film theory and criticism is well represented by Victor Perkins, who identifies minute, meaningful, characteristic gestures as fundamental to the movies’ artistic possibilities. Perkins’ commitment to the fundamental role that artistic medium has within his critical practice points to an intimate nexus of considerations of medium and artistic experience within the creation of movies and artistic practices more generally.

However, soon after film theory and criticism found an academic home within film studies in the 1970s, theorists and critics moved away from sustained medium analysis of film or the film arts. Instead, academics developed alternative interpretative frameworks, prominently Lacanian, feminist, and Marxist ones, that displaced the prominence of medium analysis within film theory. In analytic philosophy, as philosophy and film established itself as a domain of inquiry, instances of medium analysis gave way primarily to cognitive science approaches to theorizing film and film experiences. Medium analysis depends on the unity of the aesthetic experience to which the medium in question is able to contribute. A cognitive science approach to the effects possible in certain modes of filmmaking need not concern itself with the unity of aesthetic experience.

In the 1990s, Noël Carroll, a leading advocate of the cognitive-science approach in analytic philosophy and film, argued that medium was necessarily a confused category and should be eschewed by philosophers interested in theorizing movies and other film arts. In the mid-2000s, Carroll adjusted his view and acknowledged uses for the concept of medium, especially in describing the practices of certain experimental or avant-garde film artists. Regardless, for many years after its inception, indeed until the mid-1970s, medium analysis was one prominent form that film theory took.

b. Comics

Comics, as they have developed as an art form, have also developed critical and theoretic discourses that participate in some form of medium analysis. Much of the most prominent medium analysis has been by artists adopting a critical and theoretic stance with respect to their own artistic practices. Prominent instances of this medium analysis of comics by comics artists include Will Eisner’s Comics and Sequential Art (1985) and Scott McCloud’s Understanding Comics (1993). Both Eisner and McCloud offer paradigmatic instances of medium analysis, in that both are theorizing the particular ways in which comics, as an art form, are able to achieve forms of aesthetic unity in relating image and action.

c. Video Games

Video games and 21st century gaming offer another instance of an emergent popular art form that has inspired early practitioners, critics, and theorists to engage in medium analysis. Much of the academic discourse analyzing gaming grows out of film studies and necessitates some medium consideration as terms and interpretative frameworks are applied in new contexts or, alternatively, theorists attempt to distinguish clearly between experiences that are proper to movies and other narrative visual forms and experiences that are proper to games and gaming. Medium analysis has been an important aspect of developing theoretic and critical discourses about gaming in which game creators and theorists are in conversation.

8. Conclusion

Currently within academia, medium analysis is largely pursued in media studies and disciplines exploring the emergence of new media. In philosophy, medium analysis has recently been utilized in numerous ways within the philosophy of gaming and video games. Given the ways in which screens and screen technology continue to interpenetrate contemporary reality, we can anticipate further recourse to medium analysis in theorizing these new forms of experience. Even if the collapse of interest in modernist projects in the arts has moved contemporary theorizing about art away from medium as a central concept, academic theorists of new media and new popular arts still participate in a discourse of medium analysis.

Artistic medium continues to be a productive critical concept as well for working artists and critics interested in articulating the means by which an artistic experience is structured and organized. That theorists and critics attempting to theorize medium should run into characteristic confusions in defining and theorizing medium stems from the picture they share of medium as an object. They take their task to be identifying the object that is the medium in order to deduce and prescribe its appropriate artistic experiences. But working critics and artists are less likely to think of artistic medium as an object to be studied for its own sake. For such critics and artists, thinking about medium is thinking about how something functions in creating a particular effect or in structuring a particular form of experience. What photography, for example, can be is to be discovered by artists as they pursue particular projects or lines of exploration. In this sense, artistic medium is a critical concept; we can only say what media are constitutive of an art form by critically examining instances of the form. This way of approaching medium analysis, as necessarily a critical pursuit, conceives of artistic media as the capacities for organizing and structuring the audience’s experience as the means of exploring and discovering the possibilities and tensions within an artistic problematic. These capacities for organizing artistic experience are forms of repetition or automatisms that have significance as the means by which a form of artistic experience is structured.

Medium analysis emerged with, and has developed in response to, modern art. As critics and theorists began to argue for art and the aesthetic as a distinct form of experience in the 18th century, independent of its former subservience to religion and able to dedicate itself to aiming only at beauty, medium analysis developed. First developed in Lessing’s work, medium analysis is a critical tool for understanding the norms constituting a capacity for structuring an artistic experience. The value of artistic medium in theoretical and critical discourse is realized when medium is approached not as some raw material to be investigated in advance of its possible artistic uses but as a means by which artists discover and explore possibilities within a particular artistic problematic. Discovery of a medium’s possibilities happens by artists in the creation of new instances of an art form, and by audiences and critics in the experience of particular instances of the art form. Theorists can avoid confusion by remembering that medium is an essentially critical concept, in that what is possible within a medium is discovered by artists as they continue to create and explore.

9. References and Further Reading

  • Adorno, T. W. (2006). Philosophy of new music. R. Hullot-Kentor (Ed.). Minneapolis: University of Minnesota Press.
    • In this work, first published in 1949, Adorno analyzes the work of Schoenberg and Stravinsky as exemplary of the new possibilities in 20th century music and identifies the medium of music as a historical phenomenon.
  • Adorno, T. W. (2014). Current of music. Cambridge, United Kingdom: Polity.
    • This is a collection of Adorno’s work on radio, much of it unpublished during his lifetime, which analyzes how radio determines possibilities for experiencing music.
  • Aristotle. (2014). Poetics. In J. Barnes (Ed.). Complete works of Aristotle: The revised Oxford translation (Vol. 2). Princeton, NJ: Princeton University Press.
    • This is the standard English translation of Aristotle’s analysis of tragedy as an art form.
  • Arnheim, R. (1957). Film as art. Berkeley: University of California Press.
    • Arnheim argues that film’s artistic potential is best realized by taking advantage of the features unique to the medium.
  • Atget, E., & Abbott, B. (1964). The world of Atget. New York, NY: Horizon.
    • This is a collection of the work of Eugène Atget, whose photographs of Paris streets, according to Walter Benjamin, exemplify artistic possibilities for photography as an artistic medium.
  • Baudelaire, C. (1980). “The modern public and photography.” In A. Trachtenberg (Ed.), Classic essays on photography (pp. 83–90). Stony Creek, CT: Leete’s Island Books.
    • In this essay, Baudelaire argues that photography cannot be an artistic medium because it does not engage the imagination appropriately.
  • Bazin, A. (1968). “The ontology of the photographic image.” In H. Gray (Trans.), What is cinema? (Vol. 1, pp. 9–16). Berkeley: University of California Press.
    • Bazin holds that photography satisfies once and for all the desire to preserve reality and thus opens up new artistic possibilities.
  • Benjamin, W. (1999). Little history of photography. In M. W. Jennings, H. Eiland, and G. Smith (Eds.), Selected writings (Vol. 2, Part 2, pp. 507–530). Cambridge, MA: Harvard University Press.
    • Benjamin describes the history of 19th and early 20th century photographic theories and practices.
  • Benjamin, W. (2003). The work of art in the age of its technological reproducibility: Third version. In H. Eiland and M. W. Jennings (Eds.), Selected writings (Vol. 4, pp. 251–283). Cambridge, MA: Harvard University Press.
    • In this seminal essay, Benjamin identifies the transition to technological reproducibility as a fundamental shift in the nature of art and argues that film constitutes a new mode of perception.
  • Blasius, L. (2006). Mapping the terrain. In T. Christensen (Ed.), The Cambridge history of Western music theory (pp. 27–45). Cambridge, United Kingdom: Cambridge University Press.
    • This is a helpful overview of the history of Western music theories.
  • Bower, C. (2006). The transmission of ancient music theory into the Middle Ages. In T. Christensen (Ed.), The Cambridge history of Western music theory (pp. 136–167). Cambridge, United Kingdom: Cambridge University Press.
    • This essay describes the reception of ancient music theory during the European Middle Ages.
  • Braudy, L. (2002). The world in a frame: What we see in films. Chicago, IL: University of Chicago Press.
    • In this book, originally published in 1976, Braudy describes the artistic possibilities particular to popular movies.
  • Burnham, S. (2006). Form. In T. Christensen (Ed.), The Cambridge history of Western music theory (pp. 880–906). Cambridge, United Kingdom: Cambridge University Press.
    • This essay gives an overview of the emergence of musical form as a central theoretical category in the 18th and 19th centuries.
  • Carroll, N. (1985). The specificity of media in the arts. Journal of Aesthetic Education, 19(4), 5–20.
    • This essay is the earliest instance of Carroll’s critique of medium specificity.
  • Carroll, N. (1996). Theorizing the moving image. Cambridge, United Kingdom: Cambridge University Press.
    • In the opening chapters of this book, Carroll offers his most developed criticism of medium specific theories and his most sustained skepticism about the coherence of the concept of artistic medium.
  • Carroll, N. (2006). Philosophizing through the moving image: The case of serene velocity. The Journal of Aesthetics and Art Criticism, 64(1), 173–185.
    • In this essay, Carroll acknowledges the need for the concept of artistic medium in describing the experience of structural films such as Serene Velocity.
  • Cavell, S. (1969). A matter of meaning it. In Must we mean what we say?: A book of essays (pp. 213–237). Cambridge, United Kingdom: Cambridge University Press.
    • In this early essay, Cavell articulates an understanding of artistic medium as something discovered and explored by artists as they create.
  • Cavell, S. (1979). The world viewed: Reflections on the ontology of film. Cambridge, MA: Harvard University Press.
    • In this seminal work in the philosophy of film, Cavell articulates the medium of film as a succession of automatic world projections.
  • Cavell, S. (1981). Pursuits of happiness: The Hollywood comedy of remarriage. Cambridge, MA: Harvard University Press.
    • Cavell develops an account of movie genre as artistic medium by articulating Hollywood remarriage comedies as a distinct genre.
  • Cavell, S. (1996). Contesting tears: The Hollywood melodrama of the unknown woman. Chicago, IL: University of Chicago Press.
    • Cavell returns to the concept of genre as medium by developing an account of a companion genre, the melodrama of the unknown woman.
  • Cavell, S. (2014). The fact of television. In Themes out of school: Effects and causes (pp. 235–281). Chicago, IL: University of Chicago Press.
    • Cavell develops an account of television as medium by contrast with his account of film in The World Viewed.
  • Christensen, T. (2006a). Introduction. In The Cambridge history of Western music theory (pp. 1–26). Cambridge, United Kingdom: Cambridge University Press.
    • This is a helpful précis of the history of Western music theory.
  • Christensen, T. (Ed.). (2006b). The Cambridge history of Western music theory. Cambridge, United Kingdom: Cambridge University Press.
    • This collection offers in-depth essays on various crucial aspects of the history of Western music theory.
  • Cox, J., & Ford, C. (2003). Julia Margaret Cameron: The complete photographs. Los Angeles, CA: Getty.
    • This is a comprehensive collection of Julia Margaret Cameron’s photography.
  • Curtis, W. J. (1996). Modern architecture since 1900. London, United Kingdom: Phaidon.
    • This is a good overview of architecture in the 20th century, including modernist architecture.
  • de Font-Reaulx, D. (2013). Painting and photography: (1839–1914). Paris, France: Flammarion.
    • This describes the complicated relations and lines of influence between painting and photography in the 19th and early 20th centuries.
  • Dewey, J. (2005). Art as experience. New York, NY: Penguin.
    • In his classic text on art as a form of experience, Dewey distinguishes between an artistic medium and the raw material out of which works of art are made.
  • Diderot, D. (1995). Diderot on art, volume 1. The salon of 1765 and notes on painting. J. Goodman (Ed.). New Haven, CT: Yale University Press.
    • This collection contains much of Diderot’s critical writing on painting.
  • Duncan, I. (2013). My life (revised and updated). New York, NY: Liveright.
    • Duncan’s autobiography also includes her reflections on her dance practices and commitments.
  • Eisenstein, S. (2014). Film form: Essays in film theory. Chicago, IL: Houghton Mifflin Harcourt.
    • This is a collection of Eisenstein’s essays on film technique and theory.
  • Eisner, W. (2008). Comics and sequential art: Principles and practices from the legendary cartoonist (Will Eisner instructional books). New York, NY: W. W. Norton.
    • Eisner offers an account of the principles underlying his approach to comics based on his decades as a comic artist.
  • Frascina, F., Harrison, C., & Paul, D. (Eds.). (1982). Modern art and modernism: A critical anthology. London, United Kingdom: Sage.
    • This is a good collection of critical writings about modernist arts.
  • Fried, M. (1980). Absorption and theatricality: Painting and beholder in the age of Diderot. Berkeley: University of California Press.
    • Fried offers an account of the artistic problematic articulated by Diderot and explored in 18th century painting.
  • Fried, M. (1998a). Art and objecthood. In Art and objecthood: Essays and reviews (pp. 148–172). Chicago, IL: University of Chicago Press.
    • Fried’s essay argues that the turn from modernism to minimalism and conceptual art in the 1960s depends on a misunderstanding of what exploring an artistic medium can be.
  • Fried, M. (1998b). Manet’s Modernism: Or, the face of painting in the 1860s. Chicago, IL: University of Chicago Press.
    • In this book, Fried argues that Manet’s exploration of the artistic problematic he inherited from 18th and 19th century French painting constituted a form of modernism.
  • Fried, M. (2008). Why photography matters as art as never before. New Haven, CT: Yale University Press.
    • Fried argues that the cross-pollination of painting and photography has led to the most important artistic developments of the late 20th and early 21st centuries.
  • Greenberg, C. (1982). Modernist painting. In F. Frascina, C. Harrison, & D. Paul (Eds.), Modern art and modernism: A critical anthology (pp. 5–10). London, United Kingdom: Sage.
    • Greenberg’s essay argues that modernist painting approaches the medium of painting on the model of Kantian criticism, in order to discover it in its purity.
  • Greenberg, C. (1984a). Modernist sculpture, its pictorial past. In Art and culture: Critical essays (pp. 158–163). Boston, MA: Beacon.
    • In this essay, Greenberg outlines the ways in which modernist sculpture distinguishes itself from painting.
  • Greenberg, C. (1984b). Art and culture: Critical essays. Boston, MA: Beacon.
    • This is a collection of many of Greenberg’s essays on the project of modernism in painting and sculpture.
  • Halliwell, S. (1998). Aristotle’s Poetics. Chicago, IL: University of Chicago Press.
    • This is an insightful critical analysis of Aristotle’s Poetics.
  • Harrison, C., Wood, P., & Gaiger, J. (Eds.). (1998). Art in theory, 1815–1900: An anthology of changing ideas. Oxford, United Kingdom: Blackwell.
    • This is a comprehensive collection of 19th century art theory.
  • Harrison, C., Wood, P., & Gaiger, J. (Eds.). (2000). Art in theory, 1648–1815: An anthology of changing ideas. Oxford, United Kingdom: Blackwell.
    • This is a comprehensive collection of Western art theories prior to the 19th century.
  • Harrison, C., & Wood, P. (Eds.). (2003). Art in theory, 1900–2000: An anthology of changing ideas. Oxford, United Kingdom: Blackwell.
    • This is a comprehensive collection of 20th century art theories.
  • Hegel, G. W. F. (1975a). Hegel’s aesthetics: Lectures on fine art (Vol. 1). (T. M. Knox, Trans.). Oxford, United Kingdom: Oxford University Press.
    • Hegel’s lectures on art begin with reflections on the idea of art and its relation to thought.
  • Hegel, G. W. F. (1975b). Hegel’s aesthetics: Lectures on fine art (Vol. 2). (T. M. Knox, Trans.). Oxford, United Kingdom: Oxford University Press.
    • Hegel’s lectures on art conclude with reflections on the differentiation and development of particular art forms.
  • Herder, J. G., & Gaiger, J. (2002). Sculpture: Some observations on shape and form from Pygmalion’s creative dream. Chicago, IL: University of Chicago Press.
    • Herder’s reflections on the nature of sculpture develop a critical response to Lessing’s account of painting as inclusive of sculpture.
  • Horace. (1989). Ars Poetica. In N. Rudd (Ed.). Horace: Epistles book II and Ars Poetica (Vol. 2). Cambridge, United Kingdom: Cambridge University Press.
    • Horace describes the form of poetic experience as structured by imitation.
  • Joyce, J. (1986). Ulysses. New York, NY: Vintage Books.
    • Joyce’s classic work of modernism, first published in 1922, explores the history of literature and its forms.
  • Joyce, J. (1999). Finnegans wake. New York, NY: Penguin.
    • Joyce’s final work explores the nature of language and its expressive possibilities.
  • Kracauer, S. (1997). Theory of film: The redemption of physical reality. Princeton, NJ: Princeton University Press.
    • Kracauer’s account of film identifies the medium’s central feature to be its ability to connect us with reality.
  • Krauss, R. (1986a). The originality of the avant-garde. In The originality of the avant-garde and other modernist myths (pp. 151–170). Cambridge, MA: MIT Press.
    • Krauss’ essay calls into question the cult of originality underlying the critical reception of modernist art and articulates the theoretical framework for postmodernist responses.
  • Krauss, R. (1986b). The originality of the avant-garde and other modernist myths. Cambridge, MA: MIT Press.
    • This is a collection of Krauss’ critical essays arguing for postmodern artistic possibilities.
  • Lear, J. (1992). Katharsis. In Essays on Aristotle’s Poetics (pp. 315–340). Princeton, NJ: Princeton University Press.
    • Lear’s essay offers an insightful interpretation of the role of catharsis in Aristotle’s account of tragedy.
  • Lessing, G. E. (1962). Hamburg dramaturgy. (H. Zimmern, Trans.). Mineola, NY: Dover.
    • This volume collects Lessing’s theatrical criticism and contains his reflections on Aristotle’s account of tragedy.
  • Lessing, G. E. (1984). Laocoön: An essay on the limits of painting and poetry. (E. A. McCormick, Trans.). Baltimore, MD: Johns Hopkins University Press.
    • Lessing’s essay is arguably the classic work of medium analysis; in it, he distinguishes between painting and poetry as two different methods for imagining action.
  • Levenson, M. (Ed.). (2011). The Cambridge companion to modernism. Cambridge, United Kingdom: Cambridge University Press.
    • This is a collection of critical essays reflecting on modernist art.
  • Loos, A. (1998). Ornament and crime. In A. Opel (Ed.), Ornament and crime: Selected essays (pp. 167–177). Riverside, CA: Ariadne.
    • Loos’ essay is a polemic for architectural modernism and against unnecessary ornamentation.
  • McCloud, S. (1994). Understanding comics: The invisible art. New York, NY: William Morrow.
    • McCloud’s book is the articulation of the nature of the medium of comics by a contemporary comic artist.
  • Medium, n. and adj. 2016. OED Online. Web.
    • This article tracks the etymology and evolution of the term “artistic medium.”
  • Perkins, V. F. (1990). Must we say what they mean? Film criticism and interpretation. Movie, 34(5), 1–6.
    • In this essay, Perkins defends the critical value of the concept of medium and identifies the ability to capture minute but meaningful gestures at the heart of the medium of the movies.
  • Perkins, V. F. (1993). Film as film: Understanding and judging movies. Boston, MA: Da Capo.
    • Perkins’ book articulates normative standards specific to movies and draws on the concept of medium to do so.
  • Perron, B., & Wolf, M. J. (Eds.). (2009). The video game theory reader. New York, NY: Routledge.
    • This collection of recent essays of video game theory includes a number of essays that use the concept of medium in order to articulate the experiences specific to video game play.
  • Pudovkin, V. I. (2013). Film technique and film acting: The cinema writings of V. I. Pudovkin. Redditch, United Kingdom: Read Books.
    • This collection of Pudovkin’s writing on film theory includes many reflections on what is particular to the medium of film.
  • Rasch, R. (2006). Tuning and temperament. In T. Christensen (Ed.), The Cambridge history of Western music theory. Cambridge, United Kingdom: Cambridge University Press.
    • This essay gives a helpful account of the emergence of temperament as a music theoretic category in the European music tradition.
  • Rorty, A. (1992). The psychology of Aristotelian tragedy. In A. Rorty (Ed.), Essays on Aristotle’s Poetics (pp. 1–22). Princeton, NJ: Princeton University Press.
    • This essay describes Aristotle’s views on the audience’s experience of tragedy.
  • Talbot, W. H. F. (1969). The pencil of nature. Boston, MA: Da Capo.
    • This essay is a reflection on the nature of photography by one of its inventors.
  • Trachtenberg, A. (Ed.). (1980). Classic essays on photography. Stony Creek, CT: Leete’s Island Books.
    • This collection contains a number of important essays on photography and its possibilities from the 19th century.
  • Vertov, D. (1984). Kino-eye: The writings of Dziga Vertov. Berkeley: University of California Press.
    • Vertov’s writings reflect on the radical possibilities for contemporary perception offered by film.
  • Vertov, D. (1998). Man with a movie camera [Motion Picture]. United States: Image Entertainment.
    • Released in 1929, Vertov’s experimental film explores the range of documentary possibilities for film.
  • Wack, D. (2013). Medium and the end of myths: Transformation of the imagination in The world viewed. Conversations: The Journal of Cavellian Studies, 1, 39–58.
    • This essay describes the transformation in the medium of movies that Cavell identifies in The World Viewed.
  • Wack, D. (2014). How movies do philosophy. Film and Philosophy, 18.
    • This essay argues that movies, documentaries, structural films, cartoons, and so on all constitute distinct artistic mediums and identifies the medium of the movies as structured around the apprehension of action.
  • Wellbery, D. E. (1984). Lessing’s Laocoön: Semiotics and aesthetics in the age of reason. Cambridge, United Kingdom: Cambridge University Press.
    • Wellbery’s book is an insightful and sustained interpretation of Lessing’s Laocoön essay.
  • Winckelmann, J. J., & Potts, A. (2006). History of the art of antiquity. Los Angeles, CA: Getty.
    • Winckelmann’s book on ancient art was widely influential in the 18th century and definitive for the development of art history as an intellectual discipline. 

 

Author Information

Daniel Wack
Email: dwack@knox.edu
Knox College
U. S. A.

Political Revolution

Revolutions are commonly understood as instances of fundamental socio-political transformation. Since “the age of revolutions” in the late 18th century, political philosophers and theorists have developed approaches aimed at defining what forms of change can count as revolutionary (as opposed to, for example, reformist types of change) as well as determining if and under what conditions such change can be justified by normative arguments (for example, with recourse to human rights). Although the term has its origins in the fields of astrology and astronomy, “revolution” has witnessed a gradual politicization since the 17th century. Over the course of significant semantic shifts that often mirrored concrete political events and experiences, the aspect of regularity, originally central to the meaning of the term, was lost: Whereas in the studies of, for example, Nicolaus Copernicus, “revolution” expressed the invariable movements of the heavenly bodies and, thus, the repetitive character of change, in its political usage the term particularly stresses the moments of irregularity, unpredictability, and uniqueness.

In light of the marked heterogeneity of the ways in which thinkers such as Thomas Paine (1737-1809), J.A.N. de Condorcet (1743-1794), Immanuel Kant (1724-1804), G.W.F. Hegel (1770-1831), Mikhail Bakunin (1814-1876), Karl Marx (1818-1883), Hannah Arendt (1906-1975), and Michel Foucault (1926-1984) reflect on the possibilities and conditions of radically transforming political and social structures, this article concentrates on a set of key questions confronted by all these theories of revolution. Most notably, these questions pertain to the problems of the new, of violence, of freedom, of the revolutionary subject, the revolutionary object or target, and of the temporal and spatial extension of revolution. In covering these problems in turn, it is the goal of this article to outline substantial arguments, analyses, and aporias that shape modern and contemporary debates and, thereby, to indicate important conceptual and normative issues concerning revolution.

This article is divided into three main sections. The first section briefly reconstructs the history of the concept “revolution.” The second section gives an overview of the most important strands of politico-philosophical thought on revolution. The third section examines paradigmatic positions developed by theorists with respect to the central problems mentioned above. As the majority of thinkers who address revolution do not elaborate comprehensive theories and as there is comparatively little thematic secondary literature on the subject, this part proposes a framework for individually situating and systematically relating the differing approaches.

Table of Contents

  1. History of the Concept
  2. Three Traditions of Thought
    1. The Democratic Tradition
    2. The Communist Tradition
    3. The Anarchist Tradition
  3. Concepts of Revolution
    1. The Question of Novelty
    2. The Question of Violence
    3. The Question of Freedom
    4. The Question of the Revolutionary Subject
    5. The Question of the Revolutionary Object
    6. The Question of the Extension of Revolution
  4. Conclusion
  5. References and Further Reading

1. History of the Concept

In preparation for presentation of the different philosophical approaches to revolution in the following article, this section is concerned with providing a concise outline of the history of the concept. In so far as “revolution” is employed to describe political transformation, conceptual historians understand its origins to be genuinely modern. Critically informed by the experience of the revolutions in England, America, and France, the term in common usage designates the epitome of political change, that is, change not only in laws, policies, or government but in the established order that is both profound and durable. Earlier conceptions of political change are missing the notions of a people’s autonomous ability to act or of its right to emancipation. Further, the absence of two structural preconditions explains why revolution in the sense of fundamental politico-social transformation is not conceived prior to modernity. On the historical level, it is the formation of the “strong” state that is conducive to a political imagination of radical liberation from state oppression and the subsequent founding of an essentially different order. The extent of the Hobbesian type of the state’s disciplining power and the impossibility of direct political participation thus lay the ground for revolutionary projects. On the conceptual level, the supersession of cyclical conceptions of history as advocated by Aristotle, Polybius, Cicero, or Machiavelli by linear models of thought allows for the idea of irreversible progress in politics and society. In the course of this shift in historical thinking revolution is eventually looked upon as a catalyzing, even enabling factor of progress. 
Since history is no longer understood as dependent on forces beyond human control (such as, for example, divine providence), human agency comes to be regarded as the decisive factor in shaping its course (compare Koselleck, 1984 and 2004; for arguments that revolution, both as a concept and a phenomenon, does have pre-modern origins, compare Rosenstock-Huessy, 1993 [1938]; Berman, 1985).

The history of political thought largely attests to the assessment that the idea of revolution as structural, justifiable change is unknown prior to modernity. Aristotle’s reflections on political change (metabolé tes politeías) in books III and IV of Politics show that the alterations he takes into consideration do not amount to the complete breakdown of an existing order, its organizing hierarchy, and its principles of inclusion/exclusion. Despite certain arguable similarities to modern concepts (for instance, with respect to the element of violence), conceptual predecessors of “revolution” such as stasis and kinesis in the Greek tradition or seditio, secessio, and tumultus in the Roman tradition have strong negative connotations. In ancient and medieval political thought, they are primarily related to anarchy and civil war. Even in the works of an early-modern thinker like Machiavelli the idea of an absolute hiatus, a fundamental rupture on the continuum of politics is not developed fully. Although he is occupied with political change, key concepts related to the topic (most importantly, rinovazione, mutazione, and alterazione) are overridden by the conviction that all shifts as to forms of constitutions ultimately do not break out of a cycle of historical recurrence. In short, the notion of a world-shaping human “power to interrupt” and “to begin” (compare Merleau-Ponty, 2005 [1945]) and the corresponding “pathos of novelty” (compare Arendt, 2006 [1963]) remain alien to pre-modern thought.

In the 17th and 18th centuries, the discovery of revolution as a relevant political category is reflected and supported by political and moral philosophy. John Locke, in his Second Treatise on Civil Government (1689), develops an influential defense of the right of resistance, rebellion, and even revolution. Going beyond Thomas Hobbes’s considerations on a subject’s right to defend herself against the sovereign if her life is under threat, Locke’s social contract theory presents this protective right against state coercion and oppression as a necessary political concretization of the individuals’ inalienable natural right to “life, liberty, and estate.” Jean-Jacques Rousseau, in the Discourse on the Origin of Inequality (1755) and the Social Contract (1762), aims at exposing the morally degenerate, politically illegitimate state of the Ancien Régime and proposing a liberal, egalitarian political and legal constitution to replace it. According to Rousseau, the “general will” ousts the particular will of the monarch as the guideline in politics, thereby implying that the people attain autonomy, sovereignty, and, thus, the status of full political subjectivity. Locke’s and Rousseau’s considerations thus importantly add to a revaluation of acts of protest and insurrection: Such acts can no longer be dismissed as the work of political offenders or public enemies as was the case prior to the undermining of the “political theology” of absolutism and feudalism, which was largely based on the doctrine of divine right (compare Kantorowicz, 1997 [1957]; Walzer, 1992). Instead, thanks to the political thought of the Enlightenment in general and to Lockean and Rousseauian social contract and natural rights theory in particular, such acts can now be interpreted as an exercise of rationally and morally justifiable political self-determination. 
Although neither Locke nor Rousseau presents an elaborated theory of revolution, they develop positions that are inherently critical of any political order that is not built on the principles of consent and trust and that are, thus, potentially revolutionary. Their reflections on legitimate governance and on citizens’ rights go beyond earlier discussions of justified resistance to monarchs (such as the 1579 Vindiciae contra Tyrannos, published under the pseudonym Stephen Junius Brutus), which rely on expertocratic leadership as opposed to political self-determination of the people. Their works thus prepare the ground for the two main ideas of the revolutionary age: “natural” human rights and national sovereignty (compare Habermas, 1990; Menke/Raimondi, 2011).

Resulting from a plethora of intellectual and material factors, the distinctly modern understanding of “revolution” takes shape on the eve of the historical revolutions of the late 18th century: It is both a “combat term” (R. Koselleck) in political praxis and an “essentially contested concept” (W.B. Gallie) in political theory. It is in the works of thinkers like Condorcet, Kant, or Marx that this contest is henceforth held and that the specific political and philosophical meaning of the term is spelled out, albeit in widely differing ways.

2. Three Traditions of Thought

Before turning to a detailed examination of important conceptual and normative issues concerning revolution, this section aims at giving an overview of three dominant lines of thought on revolution. Given the considerable discontinuities and breaks within each of these strands on the one hand and the numerous overlaps and interchanges between them on the other, the lines of thought presented here have to be understood as ideal types. Although it is likely that there are alternative perspectives, very few theories of revolution resist classification into one of these strands.

a. The Democratic Tradition

A primarily democratic strand of theory is influenced by the works of Locke, takes shape in Thomas Jefferson’s and J.A.N. de Condorcet’s thinking, and is further developed in Kant’s reflections on gradual, yet profound transformation. Throughout the 19th and 20th centuries, it is continued selectively in the late writings of Friedrich Engels and in Hannah Arendt’s and Jürgen Habermas’s considerations of the subject. This strand is characterized by a strong emphasis on non-violent, legal means and on politico-legal liberty and equality as the essential aims of revolution. Its representatives understand revolution as a continuing project or task that cannot reach a point of completion and satisfaction. Correspondingly, these thinkers, for the most part, reject notions of instantaneous rupture and absolute novelty, whereby they undermine rigid distinctions between revolutionary and reformist change. Key elements of this tradition resonate in the work of a contemporary thinker like Etienne Balibar. He suggests an understanding of revolution as a progressive power that operates from within the democratic system. Instead of aiming at the radical overthrow of this system, democratic citizens assume the role of the revolutionary subject by advocating constant additions to and revisions of the existing order and its institutions (for example, an extension of what Arendt calls “the right to have rights” to non-citizens, increased possibilities for political participation, or a more consistent adherence to human rights), allowing for its continued legitimacy (compare Balibar, 2014).

b. The Communist Tradition

A primarily communist line of revolutionary theory begins with the works of Rousseau. This line is elaborated decisively in the thinking of Karl Marx and Friedrich Engels. Significant modifications notwithstanding, it is continued in the writings of Vladimir Lenin and Jean-Paul Sartre during the 20th century. The majority of its representatives share the belief in the possibility of revolutions being finalized and completed. Although they offer different suggestions as to justifiable forms and degrees of violence, they further share the idea that violence, in general, can function as an acceptable means of revolution. They also agree that the main goal of revolution is the realization of material liberty and equality (as opposed to merely “formal,” that is, legal liberty and equality) in the social sphere. As this sphere includes apolitical institutions such as the market, substantial revolutionary transformation cannot satisfy itself with abstract political principles but needs to affect the concrete conditions in which a society exists (for example, the relations of production). In addition, the notion of solidarity is central to these thinkers’ vision of revolutionary action and of a post-revolutionary society that is realized through these actions. Key elements of this strand of revolutionary thought shape the works of contemporary theorists such as Alain Badiou and Slavoj Zizek. Interpreting existing democratic orders as regimes of radical immanence, they hold that genuine transcendence (a “communism to come”) has to manifest itself as a supersession of this order. To overcome the inherently bourgeois structures and discourses of power that are ceaselessly reproduced by late-capitalist democracies, radical disruptions are needed. Taking the form of acts of “terror” or “subtraction,” such disruptions express the “eternal truths” of the suffering of the masses (compare Badiou, 2012; Zizek, 2012).

c. The Anarchist Tradition

An anarchist tradition of revolutionary theory has its sources in 19th-century America (Josiah Warren), France (Pierre-Joseph Proudhon), and in the thought of the Russian theorists Mikhail Bakunin and Peter Kropotkin. This tradition is later taken up in the works of, for example, Emma Goldman, Rosa Luxemburg, and Paul Goodman. Although these thinkers differ considerably in their assessment of revolutionary violence, they converge as to the crucial emancipatory aim of revolution: As any form of institutionalized authority is considered incompatible with human autonomy, their vision is the creation of a society independent of “imperial institutions” in the economic, social, and political realms. Consequently, they do not content themselves with a redistribution of political power, however radical, within the framework of the state, but aim at its abolition instead. David Graeber, in his contemporary reformulation of anarchism, describes the way in which the envisaged revolutionary abolition of vertical structures is linked to the emergence of new forms of horizontal relations, that is, of communal existence. These forms are no longer organized by the logic of dominance and of cost/benefit; instead, they are shaped by the principles of mutual aid and free cooperation, which are not guided by instrumental rationality (compare Graeber, 2004).

3. Concepts of Revolution

The following section discusses central questions addressed in the works of theorists from these main strands: The questions of novelty, violence, freedom, the revolutionary subject, the revolutionary object or target, and the extension of revolution. As it is neither possible to comprehensively discuss relevant concepts of revolution proposed by political philosophers and theorists nor to comprehensively include thematic considerations of the theorists presented here, this section contents itself with highlighting certain crucial features. Since this article is concerned with concepts of revolution as developed by political philosophers and theorists, important historical (compare Furet/Ozouf, 1989; Hobsbawm, 1996 [1962]; Palmer, 2014 [1959]), sociological (compare Skocpol, 1979), and politological (compare DeFronzo, 2011) studies that primarily concentrate on the phenomenon of revolution, its empirical forms and causes, are not taken into account. Further, a number of theoretical explorations of revolution are also not taken into consideration. This applies to the works of partisans of revolution such as Georges Sorel or Georg Lukács as well as to the works of critics of revolution such as Edmund Burke, Jeremy Bentham, Joseph de Maistre, or Carl Schmitt.

The exclusive focus on the six questions mentioned above is justified by the fact that they constantly appear in the theoretical debates regarding revolution as criteria in determining (a) if and under what conditions political change can be considered as revolutionary and (b) if and under what conditions such revolutionary change can be considered as legitimate. Despite the differing historical settings as well as the differing political and philosophical commitments of the individual thinkers, these questions thus constitute the common themes that connect their heterogeneous approaches to revolution. For each of these questions, the intent is to display the extremes of the spectrum on which important theorists of revolution operate and to indicate paradigmatic stances they take on this spectrum. It is with the help of this analytical framework that the various approaches to revolution since its intellectual discovery can be individually situated and systematically related to one another: The original revolutionary experience in the context of the American and French Revolution as reflected in the writings of Jefferson, Paine, Sieyès, and Condorcet; its reception in German Idealism; the further development of revolutionary thought in different versions of Marxism; its application to the problem of colonialism in the 20th century; and, finally, contemporary debates about the relevance and meaning of revolution informed, among other things, by the crises of late capitalism and representative democracy.

a. The Question of Novelty

The question of novelty pertains to the degree of revolutionary transformation and to the mode in which such transformation is achieved. Whereas some theorists of revolution argue that the post-revolutionary state needs to be absolutely new and different in comparison to the pre-revolutionary state, others hold that revolution is conceivable as a realization of relative novelty. Although some theorists argue that transformation needs to take place in a historically disruptive or discontinuous fashion in order to be revolutionary in character, others hold that effective revolutionary change can unfold in a continuous or stepwise manner.

For Thomas Paine, there can be no doubt that the American revolutionary struggle for independence from colonial rule, understood as a practical application of enlightenment thought, amounts to a radical break in history. According to his remarks, the liberation of the colonies from monarchical government must be seen as the unique and irreversible establishment of a fundamentally new political order. Employing nature as a timeless criterion for revolution, he describes monarchy not only as an anachronistic, unjustifiable “absurdity” but as a grave violation of natural law. In Paine’s view, its supersession by consent-based, liberal, and egalitarian republicanism is therefore tantamount to “begin[ning] the world over again” (Paine, 2000: 44). In contrast to Paine’s considerations that often oscillate between conceptual analyses and calls to revolutionary action (and, thus, indicate the difficulty inherent to addressing the subject of revolution in an objective, non-partisan manner), his contemporary Condorcet suggests an understanding of revolution that is not informed by a comparatively strong concept of novelty. Condorcet’s understanding becomes particularly apparent in his stance towards the trial and execution of Louis XVI. Rejecting the extra-legalism advocated by, among others, Robespierre and Saint-Just, he develops a theoretical position that argues for the compatibility of profound change of the political system and historical continuity: For him, the largely unprecedented challenge of bringing the king to court can only be met by taking recourse to elements of previous politico-legal systems. What is more, it is precisely such elements that—under the condition that they are not just imitated, but innovatively rearranged—make the necessary “regulation” of revolutionary dynamics possible and, thus, guarantee revolutionary progress (compare Condorcet, 2012; Walzer, 1992). 
Instead of interpreting novelty in terms of the political creation of a “new world” without historical parallel, the new, here, is comprehended in terms of a reconfiguration of constitutive parts of the old, that is, of the pre-revolutionary world.

As represented here by Paine and Condorcet, the axis of the new, crucial for conceptually grasping revolution, runs between the extremes of absolute and relative rupture or inception. The ends of this spectrum are reflected in numerous later theories of revolution. For instance, Friedrich Engels (1820-1895), in late works such as, for example, his introduction to the reprint of Marx’s The Class Struggles in France, describes revolutionary struggle as ongoing and procedural in character. For Engels, this struggle cannot be detached from existing political, legal, and economic conditions, meaning that radical revolutionary breaks or leaps are inconceivable. As his moderate understanding of the new allows for minor modifications of the state of affairs to be labeled as revolutionary, it is inclined to tie revolution closely to reform. This propensity is reflected in his programmatic idea of a re-appropriation of universal suffrage, which turns it from a means of bourgeois dominance into an ultimately revolutionary means of proletarian liberation. As opposed to Engels’s approach to the question of the new, Walter Benjamin (1892-1940), in On the Concept of History, propounds an understanding of revolution as a state of exception in which the continuum of history is “burst open.” According to his “messianic” concept of novelty, revolutions are unforeseeable, kairological events that suspend the regular, chronological order of time: They constitute a leap into an epoch that is incommensurable with what has previously existed.

Immanuel Kant, in his thoughts on revolution, attempts to avoid similarly one-sided answers to the question of the new. Rather, his complex considerations on progressive transformation aim at undermining the dichotomy between either emphatic or deflationary notions of the new by closely associating “complete change” or “complete revolution” (völlige Umwälzung) and “thorough reform” (gründliche Reform) (compare Kant, 2006c [1795/96]). Yet, Kant’s remarks on the subject of political or politico-moral change—scattered over writings such as What is Enlightenment?, Toward Perpetual Peace, The Metaphysics of Morals, and The Contest of the Faculties—seem marked by a tension between a reformist bias and revolutionary tendencies. Whereas the former is expressed in his privileging of enlightened monarchs such as Frederick II of Prussia as the agents of change or in his explicit criticism of the French Revolution on the grounds of excessive use of violence, the latter becomes apparent in his comments on the “enthusiasm” with which contemporary Europeans observe the revolutionary events in France or in his reflections on the radical switch from “despotism” to “republicanism,” that is, from the old absolutist order to a new order of freedom and morality. Kant evidently considers the difference between the two types of order to be tremendous: An order responsible for the heteronomous subjugation of the individual by the ruler is overcome by an order primarily characterized by the proliferation of individual autonomy and political participation as well as the decrease of armed conflict and war. Kant appears to resolve what presents itself as a tension between differing, even incompatible concepts of the new by taking into account the specific temporal constitution of profound political change: For him, such change is grasped adequately only as a process that is mediated in multiple ways, but not as a sudden gestalt switch. 
Rejecting the sharp, static dichotomy between relative and absolute novelty (and, with it, the dichotomy between reform and revolution) and integrating the two instead, Kant shows that there is no necessary interdependence between the suddenness and the depth of political change. He thus does not accept the common assumption among theorists of revolution and active revolutionaries alike that only abrupt, immediate transformation can count as profound and progressive in a relevant sense. Although republican states, according to Kant, are fundamentally different from despotic states, the principles of which are superseded entirely, the emancipatory transition from heteronomy to autonomy is achieved stepwise. Kant’s idea of “complete change” reflects his teleological understanding of history as an imperfect, yet steady development “from worse to better” as expounded in his considerations on the conditions of the possibility of progress in Idea for a Universal History from a Cosmopolitan Perspective and Conjectural Beginning of Human History; it crystallizes in concepts such as “gradualness” (Allmählichkeit) and “approximation” (Annäherung) used by Kant to illustrate his notion of progressive transformation. It follows that, with Kant, the new cannot possibly be conceived in theologically charged terms of the miracle or the “event.” Yet, the terminal phase of this gradual, indeterminate transition, for him, does mark the inception of a genuinely new age in the history of humanity, which is not only “an age of ‘enlightenment’ but ‘an enlightened age’” (compare Kant, 2006a [1784]). Politically, the latter manifests itself in consent-based republican systems essentially guided by the humanity formulation of the Categorical Imperative and, thus, in a “political body the likes of which the earlier world has never known” (Kant, 2006b [1784]: 14).

Within the theoretical debates, further problems arise that are immediately tied to the question of revolutionary novelty. For instance, several theorists of revolution do not merely reflect upon the new in terms of its degree and its mode. Instead, they also investigate its sources: The new is conceived as a result made possible by acts of re-appropriation (as expressed, for example, in Jefferson’s recourse to classical antiquity), by acts of reconfiguration (as expressed, for example, in Condorcet’s approach to assembling individual elements of various previous and present legal systems), or by acts of creation (as expressed, for example, in Bakunin’s idea of creative destruction by revolutionary “bandits”).

b. The Question of Violence

The question of violence pertains to legitimate means of revolutionary transformation. While some thinkers of revolution approve of violence as an essential vehicle for bringing about radical change and assert its creative capacities, others advocate its unreserved exclusion from the realm of progressive politics and make recourse to right and law instead. Again, numerous intermediate positions between the extremes of permissive and prohibitive attitudes toward violence can be found in which theorists try to identify specific conditions under which the use of violence is legitimate (for example, if violence contributes to a measurable increase in freedom) or to determine specific forms of violence that are justifiable (for example, violence against property). In addition, this section focuses on prevalent strategies for justifying revolutionary violence with recourse to, among others, utilitarian and politico-theological arguments.

Anarchist theorist and activist Mikhail Bakunin, in his thoughts on radical socio-political transformation, stresses the creative power of humans in general and the creative potential of violence in particular. For him, revolution begins with the forcible destruction of the old (statist) order, which prepares the “fertile” ground for a fundamentally new (non-statist) order (compare Bakunin, 1990 [1873]). Even though Bakunin declares the institutions that constitute the political and economic centers of power to be the primary target of acts of revolutionary “bandits,” he holds that such violence can also legitimately affect the persons who are present at these centers. In order to justify the use of revolutionary violence Bakunin argues for an understanding of such violence as reactive and necessary: Confronted with the repressive violence of the state, its police and military units, partisans of the “social revolution” must resort to violence. In his view, such violence is justified both as an act of self-defense and as a means of a progressive politics that transcends a deeply unjust status quo in which autonomy is made impossible by the existence and the authority of the state. Thus, for Bakunin, violence is not merely an extreme alternative in case non-violent (for example, legal) vehicles of transformation fail. Instead, it is an inherent factor of revolution. In his comments on revolution, provoked by the experience of the Iranian Revolution, Michel Foucault agrees with this assessment insofar as he considers manifestations of violence an important motor of transformative politics (compare Foucault, 2005 [1978-79]). Based on irreconcilable concepts of the political and further fueled by resentment, intolerance, and hatred, a quasi-Schmittian fighting position between “friends” and “enemies” of the revolution, that is, between the supporters of the “saint” (Ayatollah Khomeini) and the “king” (Shah Reza Pahlevi) emerges. 
This fighting position, for Foucault, is to be seen as an inevitable element of radical change. Despite his constative judgment that violent conflict essentially enables revolutionary dynamics, he does not present an elaborate justification of revolutionary violence.

Contrary to Bakunin and Foucault, Kant understands violence as neither a necessary nor a justifiable element of revolution. Not only do his remarks reveal a pronounced reservation resulting from empirical observations of the cruelties committed in the course of the revolution in France (compare Kant, 1991 [1798]); his rejection of the idea that violence could be considered a legitimate means of progress is also a matter of principle. His position becomes particularly manifest in his reflections on the trial against Louis XVI as presented in the Doctrine of Right (compare Kant, 1996 [1797]). From the standpoint of his practical philosophy, there can be no doubt that the execution of the previous monarch is not acceptable. For Kant, this form of legally regulated and sanctioned regicide differs from historically well-known simple regicide, that is, the killing of a king on impulse or motivated by political power strategies: For in the trial, the established political principle of the inviolable nature of sovereign power is undermined and ultimately replaced by the principle of violence. Since the prosecution, in trying and finally executing the former king of France, does not appeal to a singular, exceptional situation but, instead, lends general juridical character to it, violent revolutionary insurrection against the sovereign is turned into a principle or Grundsatz of politics. To understand the right to violent resistance and revolution as a political principle (as is the case in the trial), for Kant, anticipates the Great Terror of 1793/94 (for a similar critique of the trial and execution of Louis XVI, compare Camus, 1991 [1951]). More importantly, it passes off the violent protest against sovereign governments as generally permissible and problematically normalizes it.
As a consequence of the legalization of permanent insurrection, the consolidation of political (and, with it, moral) order is considerably complicated while civil disorder and war, in Kant’s view the key impediment to politico-moral progress, become the rule. A comparably unambiguous rejection of violence as an instrument of revolution can be found in Arendt’s On Revolution where she describes violence as a “limit” of the realm of the political: For her, the revolutionary praxis of violence (as exercised in the revolutions in France and Russia) as well as theoretical justifications of revolutionary violence (as given by, for example, Bakunin) are inherently anti-political.

Condorcet is one of the thinkers who neither understands violence as an integral part of revolution and gives carte blanche to its use nor completely rules out that it can serve as a justifiable means in processes of radical transformation. His intermediate position crystallizes in his considerations on the trial against Louis XVI: Representing the standpoint of the Girondins, he argues that the charges against the former king (or, rather, the “citizen Louis Capet”) cannot be based on “enmity” as suggested by Jacobins like Robespierre and Saint-Just, but have to refer to “treason” instead. The binary logic of the Jacobins according to which any monarch has to either rule or die and their corresponding attempt to apply the laws of war in the trial against the king are thus curbed. The position suggested by Condorcet allows for an at least tentative maintenance of the rule of law and of the validity of principles of justice. Like any other laws and measures, revolutionary laws and measures as developed in the course of the trial are subject to the rules of justice (compare Condorcet, 2012). In stark contrast to the Jacobins’ enthusiasm for unrestricted, extralegal, and decisionist self-authorization, what is emphasized here is the necessity of revolutionary self-restraint. According to Condorcet, the exceptional, unprecedented situation of the revolutionary trial has to be modeled on the ideal of due process of law if it is to remain distinguishable from mere revolutionary terror. Thus, revolutionary violence as it manifests itself in the eventual execution of the former king is not categorically rejected. However, it can only be considered as justified if it is legally channeled and, as a result, compatible with certain demands of justice. 
Insisting on the significance of revolutionary justice (however imperfect in its practical realization) in the exercise of legally qualified violent acts, Condorcet avoids the common opposition of either violence or law as the decisive tools of transformation. On the one hand, this treatment of the representatives of the old system, in not suspending the law, sets an example for the new order and for the way in which it interprets law and justice. It thus contributes to the transformation of revolutionary violence into legitimate authority. On the other hand, this treatment of the collapsed regime contributes to facilitating the peaceful co-existence of partisans and opponents of the revolution in a post-revolutionary society: Instead of declaring the former king to be a “moral monster” to be immediately “annihilated” and instead of declaring war against supporters of the monarchy and all other “enemies of freedom” as suggested by Robespierre and Saint-Just, Condorcet’s insistence on legal equality aims at finding peaceful trading zones and common ground between the factions so that previous political opponents can be repositioned as potential future partners.

Intermediate positions between the extremes of approval and rejection of violence as an instrument of revolution are also developed by Walter Benjamin, Herbert Marcuse, and, more recently, by Slavoj Zizek. With regard to the question of justification, these thinkers propose alternatives to Condorcet’s idea of legalized and, thus, legitimate revolutionary violence. Benjamin, taking recourse to political theology, interprets and justifies revolutionary movements as inner-worldly manifestations of unmediated “divine violence” that overcomes the oppressive “mythical violence” exercised by the state. With respect to the content and effect of “divine violence,” Benjamin’s remarks remain sketchy. On the one hand, the notion can be taken to imply the use of force against representatives of the state’s “mythical” authority; on the other hand, it can be interpreted as resulting in a fundamental transformation of the law which becomes critical of itself by recognizing and counter-balancing its inherent violent potential. At any rate, revolutionary movements, for Benjamin, represent a form of justice that incommensurably exceeds the existing legal order. If they are successful, they cathartically suspend the “serfdom” and “barbarity” characteristic of human history and realize the possibility of the fundamentally new (compare Benjamin, 1999 [1921]). Marcuse (1898-1979), in contrast, proposes a quasi-utilitarian justification of revolutionary violence. In Ethics and Revolution (1964) he argues that only a “brutal calculus” can determine whether a specific revolutionary project is legitimate. The suggested calculus amounts to a cost-benefit analysis of the probable number of victims on the one hand and the probable gains in human progress on the other (in terms of, for example, tolerance or human rights). 
For Marcuse, the historical events in England, America, and France prove the dialectical character of revolutionary violence, that is, the fact that violent conflict can contribute decisively to substantial economic and social, political and moral improvements. However, he insists that such violence is justifiable only if its use (a) is directly and recognizably tied to specific moral goals and (b) ceases at the earliest possible stage of the revolutionary process. Zizek (b. 1949) attributes a central role to violence as an instrument to break out of the absolutely immanent “deadlock” represented by the current order of liberal democracy and market economy. His reflections concentrate on the revolutionary capacities of passive forms of violence, which he presents as particularly justifiable. Most importantly, he suggests a “Bartlebian politics” of refusal and withdrawal that undermines the discursive power of the dominant system. Such a politics, which has an expressive, communicative function, rejects the prevailing “hegemonic” language and counters the existing system’s power to name with subversive silence. For Zizek, political forms of direct non-action, guided by Bartleby’s maxim of “I would prefer not to,” allow for a first negative step in the revolutionary process in creating a “vacuum” of effective power which, in a second step, can be filled with positive content. Thus, in arguing that, in the present circumstances, “doing nothing is the most violent thing to do” (an idea that also informs the traditions of strikes, pickets, and silent vigils), he designates radical non-action as a justifiable mode of revolutionary violence (compare Zizek, 2008).

Debates within and around contemporary movements with fundamentally transformative social and political agendas attest to the continued significance of violence, of its permissibility and justifiability, as the central normative problem in the context of revolution. Supporters of the Occupy movement deny the legitimacy of physical violence and, in particular, of physical violence directed against persons, as a means of revolutionary change. Instead, they largely subscribe to a “Bartlebian” revolutionary politics of non-violent violence, that is, a politics of subversive silence and, respectively, creative re-naming. The adherence to this kind of inactive, discursive violence was expressed performatively during the 2013 Gezi Park protests in Istanbul. Whereas the “standing man” actions enacted a “bodily politics” of obstruction (compare Butler, 2015) and an attitude of refusal through silence and passivity, the derogatory term çapulcu (looter, marauder) used by government officials to discredit the protesters was creatively appropriated by them and re-interpreted as an honorific title. In Egypt, supporters of the Arab Spring movement took recourse to certain strands within the Islamic legal tradition when considering the question of violence. It was not only in terms of human rights and democratic governance but also in terms of the Islamic law of rebellion and of war that the question of violence was discussed. Although the positions of the main legal schools of thought differ considerably in their assessment of the question, there is a pronounced tendency to attempt to avoid or, at least, limit violence in internal conflicts and to consider it justifiable only if all other means of bringing about change have been exhausted (compare El Fadl, 2006; Al Dawoody, 2011).

c. The Question of Freedom

The question of freedom pertains to the primary objective of revolutionary transformation. Here, the spectrum established by theorists of revolution spans between the poles of freedom as liberation from oppression (that is, negative revolutionary freedom) and of freedom as the foundation and realization of a new political order (that is, positive revolutionary freedom).

Post-colonial theorist Frantz Fanon (1925-1961), in his reflections on revolutionary change, primarily concentrates on the aspect of liberation. For Fanon, whose work attests to the de-Europeanization of revolution during the 20th century, decolonization is to be understood as a process of “rehabilitation” of the suppressed that importantly implies a justifiably violent moment of radical riddance of the structural cornerstones of political, social, economic, and cultural domination and exploitation. Revolutionary liberation thus leads to the creation of a “tabula rasa,” which is the precondition for the subsequent development of a new institutional order and, what is more, the emergence of “sovereign” forms of post-colonial subjectivity (compare Fanon, 1967 [1961]). A comparable focus on revolutionary freedom as freedom from oppression characterizes the thinking of critical theorist Herbert Marcuse. For him, breaking free from the existing order is the essential element of a revolution. He argues that in light of the extent to which an inherently “repressive” socio-political order, the order of late capitalism, clearly dominates, strategies of resisting and undermining have to be considered before anything else. As Marcuse makes clear in Ethics and Revolution, such strategies of liberation do not only include forms of passive resistance as indicated in the concept of “the great refusal” but also the use of violence. Both can serve as a means to unsettle the systemic “paralysis” or blockage of human needs and potentials in industrialized Western societies. Consequently, Marcuse’s understanding of freedom is shaped by the idea of emancipation from a system of extreme immanentism that produces entirely controlled, uniform, “one-dimensional” humans. 
In spite of the emphasis on revolutionary freedom as liberation from prevailing modes of materialistic existence and instrumentally rational thought, he also points to a more positive notion of freedom: With explicit recourse to the thought of Jean-Paul Sartre, he discusses the necessity of “projects” that allow for forms of free (for example, artistic) action to be released (compare Marcuse, 1991 [1964]).

As opposed to Fanon and Marcuse, Hannah Arendt holds that the content of revolutionary freedom is “participation in public affairs,” that is, the positive freedom to act politically. In historical terms, this kind of freedom is exemplified for Arendt in the American Revolution where the foundation of a new political constitution—a republican constitution which codifies participatory citizenship—is itself achieved by participatory, autonomous “speech and action.” Although Arendt admits that an element of negative freedom is integral to thorough transformation, she is unequivocal in qualifying the “desire for liberation” as an insufficient objective of revolution if the latter is to be genuinely “political” as opposed to merely “social.” Employing the term “political” in a normative rather than a descriptive way, she appropriates the Aristotelian distinction between “political” and “despotic” forms of constitutional order and transposes it to the problem of revolutionary disorder. Consequently, in Arendt’s view, not every revolution can automatically be considered political. Instead, processes of profound, sustainable transformation have to meet certain conditions if they are to be labeled as political. The delineation Arendt suggests is essentially based on two criteria: For her, a revolution is apolitical or even anti-political if (a) what she calls “the social question” is its essential driving force and if (b) violence plays a central role in bringing about a new order (compare Arendt, 2006 [1963]). Similarly, Thomas Jefferson (1743-1826), himself a central intellectual and political figure of the American Revolution, insists on the importance of positive aspects of revolutionary freedom. 
This becomes apparent when he directly relates the idea of an “empire for liberty” to the notion of “self-government.” It is underlined in his remarks on resistance and rebellion: Despite their potential legitimacy and their “refreshing” effects on the “tree of liberty,” such attempts to be free from forms of “despotism” and “tyranny” remain insufficient in that they fail to found an alternative order that reliably rests on a constitution conducive to the realization of “life, liberty, and the pursuit of happiness” (compare Jefferson, 2004).

Karl Marx endeavors to relativize the opposition between either negative or positive freedom as definitive of revolutionary freedom. For him, revolution has to be conceived as a temporal process spanning different stages. An element of liberation plays a crucial role at the beginning of radical change insofar as it contributes to the liquefaction of an existing, oppressive system (such as the system of bourgeois, capitalist “class rule”). However, Marx’s theory of revolution expounds that this deconstructive element needs to be complemented by a reconstructive element once, in the later stages of the revolutionary process, the solidification of its transformative dynamics, that is, the formation of a new system, becomes the essential task. The final paragraph of the 1848 Communist Manifesto paradigmatically reveals Marx’s (and Engels’s) understanding of revolutionary freedom as necessarily encompassing both negative and positive moments: The communist revolution both casts off the “chains” and “wins” a new, classless “world.” According to Marx, such a world makes possible the exercise of “real freedom” in the positive sense of individual “self-realization” that is embedded in a community and manifested in labor. As already stated in On the Jewish Question (1843/44), freedom thus understood differs from the bourgeois conception of freedom which is based on a “monadic” view of humans who only relate to each other in terms of competition. Marx argues that under the guise of this strictly individualist and merely formal kind of freedom, it is exclusively capital, not humans, that can be considered as free. Thus, it is the idea of commitment, grounded in the communal and practical orientation of his notion of “self-realization,” which, against the background of his criticism of capitalist society, characterizes Marx’s concept of post-revolutionary freedom.
In his understanding, the indeterminacy or openness of this concept as regards content guarantees that the spontaneity constitutive of freedom is not prefigured and, thereby, inhibited or even suppressed: For Marx, it is evident that the precise results of authentically free human action and interaction cannot be predicted. Thus, the significance of his vision of a future free society, in which the difference between oppressors and oppressed is overcome, is underlined in his deliberate refusal to further specify its shape.

d. The Question of the Revolutionary Subject

The question of the revolutionary subject pertains to the primary agent of radical transformation. Here, the spectrum ranges from history unfolding largely independent of man’s decisions and actions on the one end to autonomous, history-shaping man on the other. In the latter case, the agent can take a variety of forms ranging from exceptional individuals to a transnational “multitude,” from a distinct avant-garde to an amorphous crowd.

G.W.F. Hegel’s concept of revolution is thoroughly determined by his concept of history. Radicalizing Kant’s teleological conception, Hegel understands history as a rational process in which the “idea of freedom” successively realizes itself. According to his macro-perspective, this progressive development, the self-actualization of objective “spirit,” unfolds based on the principle of dialectics. It becomes manifest in the “oriental” civilizations of China, India, and Persia, in ancient Greece, in the Roman Empire, and, finally, in the “Germanic” age of reformation and enlightenment which supersedes the “dark night” of the Middle Ages, Renaissance, and the era of feudalism (compare Hegel, 1991 [1832-45]). From this it follows that the revolutions in the United States and France or the 1791 slave uprising in Haiti on which Hegel comments have to be interpreted as indicative of the current stage of development of the idea of freedom. As a consequence, revolutions, for Hegel, cannot be “made” by humans as autonomous agents. Rather, they mark epochal transitions in the “necessary” progression of history, which finds expression in the thoughts and deeds of humans. Hegel’s remarks on the French Revolution reveal that revolutionary achievements (most importantly, man’s historically unparalleled attempt to govern reality through ideas) and revolutionary failures (most importantly, the “abstract,” “subjective,” and, thus, deficient understanding of freedom which leads to the Terreur) are to be seen primarily as reflections of the imperfect level reached by “spirit” thus far (compare Hegel, 1977 [1807]).

In opposition to Hegel’s accentuation of the progressive dynamics inherent to history, a wide range of theorists emphasize the principal role of human action with regard to the question of revolutionary subjectivity. However, these thinkers suggest various concretizations of man as the driving force of profound transformation: Bakunin emphasizes the world-changing potential of individual “bandits” (compare Bakunin, 1990 [1873]); Lenin points to a revolutionary avant-garde of limited size (compare Lenin, 1987 [1902]); Foucault attributes this role to the entirety of a people united by an experience of “political spirituality” (compare Foucault, 2005 [1978-79]); Fanon understands revolutionary subjectivity to be actualized by the “wretched” victims of colonialism (compare Fanon, 1967 [1961]; Sartre, 1967); Marcuse sees the heterogeneous group of the marginalized and “hopeless” both within and without Western societies as the key agent of revolution (compare Marcuse, 1991 [1964]); finally, contemporary theorists like Michael Hardt and Antonio Negri present a global “multitude” as the only political unit capable of realizing a revolution against the system of late capitalism (compare Hardt/Negri, 2004; Negri, 2011).

In Marx’s thought, the dichotomy between the idea that revolution is the effect of history’s independent development and the idea that revolution is the immediate product of human action is put into question. On the one hand, Marx’s position is strongly influenced by Hegelian philosophy: Despite modifying Hegel’s dialectics materialistically, he reiterates the thought of an internal logic to history (for Marx, the logic of “class struggle”) on the basis of which all processes of transformation can be explained as “necessary.” Yet, on the other hand, a specific social class is needed to concretely carry out such processes. In the historical context of the 19th century, this social class is the “proletariat,” which is presented as the decisive factor of revolutionary change (compare Marx/Engels, 2012 [1848]). Thus, although Marx and Engels hold that revolution cannot be “made” by human will and action alone, it cannot become manifest without human will and action. With respect to the problem of the revolutionary subject, a similar interplay between history’s inaccessible movement and self-determined human agency is described by theorists concerned with the kairos, that is, the right moment or timing for radical change. Rousseau, for instance, argues that specific historical constellations (“crises”) are necessary for humans (here, a people) to successfully initiate revolutions (compare Rousseau, 2012 [1762]). For Jefferson, such constellations (“precious occasions” beyond human planning and control) are the precondition for successfully consolidating the progress thus far achieved by bringing the revolutionary dynamics to a halt before they escalate into continuing violence and irreversible political and social decomposition (compare Jefferson, 2010). In both cases, human will and action are autonomous. Yet, according to Rousseau and Jefferson, revolutionary subjectivity is strongly affected and limited by what historical situations grant or deny.

Further questions arise once theorists have identified man as the subject that actively makes revolution. For instance, it is to be determined whether the revolutionary subject’s capacity to act in a world-transforming way is the result of minute “organization,” as argued, for example, by Lenin, or whether it emerges “spontaneously,” as, for example, Kropotkin claims. Another debate in this context concerns the driving motivational forces behind revolutionary subjectivity. Here, some theorists emphasize material, that is, social or economic factors, while others understand immaterial, that is, intellectual or spiritual factors, to be decisive. This tension between “being” and “consciousness” is reflected in the controversy between Jean-Paul Sartre and Maurice Merleau-Ponty: Whereas the former understands the revolutionary subject’s actions as caused by a concrete material “situation” of oppression (compare Sartre, 1955 [1946]), the latter insists that such actions constitute a form of “significance” (“Sinn-gebung”), that is, a form of freely creating meaning through revolutionary projects, which is irreducible to materialist causality (compare Merleau-Ponty, 2005 [1945]). Finally, the positions diverge with respect to the attitudes that are considered particularly conducive to effective individual or collective revolutionary action. Foucault, based on his observations of the overthrow of the Shah, underlines the influence of the “profane register” of indignation, resentment, even hatred that crucially fuels the revolutionary movement in Iran (compare Foucault, 2005 [1978-79]). Pointing to the deeply transformative political projects of Mahatma Gandhi, Martin Luther King, and Nelson Mandela, Martha Nussbaum attributes their success to an attitude that overcomes negative, destructive emotions and is committed to “non-anger” instead (compare Nussbaum, 2013).
In her view, this mental commitment to non-anger is more decisive for revolutionary justice and for post-revolutionary reconciliation between former opponents than the practical commitment to non-violence.

e. The Question of the Revolutionary Object

The question of the revolutionary object pertains to the primary target of revolutionary change. Two predominant strands can be distinguished: While some theorists hold that revolutions should primarily aim at converting the attitudes, convictions, belief systems and world-views of individuals, others argue that the material, institutional frameworks within which humans act and interact constitute the main object or site of revolutionary change. Once more, a variety of positions can be found in between these extremes. Such positions hold both dimensions not only to be necessary conditions of radical change but also to mutually affect each other.

Fanon is one of the thinkers who argue that revolution cannot be limited to a remaking of the external world, that is, to the establishment of a different political, economic, social, and cultural order. Instead, full transformation is only achieved by an internal process of “creation” in which the carriers of the revolution, individually as well as collectively, re-humanize themselves in their struggle for liberation from systemically de-humanizing colonial rule. According to Fanon’s politico-psychological theory of revolution, the inner sphere of attitudes towards oneself, one’s community, and one’s former oppressors is the essential locus of revolutionary change: It is there that a radical transformation of the revolutionaries’ status occurs which turns them from an “animalized” and “objectified,” anonymous and disposable mass into “sovereign” subjects capable not only of self-determination but also of self-respect (compare Fanon, 1967 [1961]).

In contrast, the anarchist theorists Mikhail Bakunin and Peter Kropotkin (1842-1921) point to the institutional conditions as the main target of the “social revolution” they advocate. In their understanding, it is above all the institution of the state that has to be destroyed if freedom, morality, and solidarity are to be realized among humans: Being a source of “artificial” authority, any state, independent of its specific form, makes the unrestricted, free flourishing of men impossible (compare Bakunin, 2009 [1871]). Therefore, conquering freedom in its totality is tantamount to establishing an order that abolishes every political or religious institution that exercises authority. Such a society organizes itself according to the principles of decentralization, social diversity, and horizontal interconnectedness, which allow for harmony and happiness on both the subjective and inter-subjective level (compare Kropotkin, 2008 [1892]). This line of thought, which emphasizes the primacy of institutional transformation, is also represented by Kant. Far from suggesting the abolition of the state, however, Kant marks the essential institutions of the state (its politico-legal constitution and system of law) as the decisive lever to unhinge despotism and promote progress with respect to freedom, rationality, and morality in a process of “complete revolution.” In his view, a program of political pedagogy that aims at directly transforming the way in which humans understand themselves and the world is not only empirically unreliable, but also categorically insufficient. What is needed instead is a progressive shift in the systemic conditions that make it possible for a “spirit of freedom” to unfold successively. It is conditions founded on principles of right that will eventually lead to a fuller realization of the individuals’ moral and rational potential (compare Kant, 1991 [1798]; 1996 [1797]).

Insisting on the comprehensive character of revolution, Rousseau, when thinking about its adequate object or target, attempts to avoid comparable predeterminations. He argues that both the modus operandi of individual humans (that is, their ways of thinking, feeling, and acting) and that of political institutions (that is, their ways of being structured and of acting upon citizens) have to be tackled for thorough transformation to occur. Consequently, if “moral” and “civil” liberty and equality are to be realized, it takes the contribution of education, as elaborated in Emile, as well as of institutional restructuring, as elaborated in the Social Contract: According to Rousseau, both the individual and the framework of politico-legal institutions constitute necessary targets of revolutionary change. Rousseau’s considerations thus underline the interdependence of both transformative dimensions.

f. The Question of the Extension of Revolution

This question pertains to (a) the temporality or, more narrowly, the duration and (b) the expansion of revolutionary transformation. Theorists disagree considerably as to whether such transformation has to be conceived as momentary, procedural, or permanent; they also disagree as to whether revolutions are to be understood as local, national, international, or global instances of profound, lasting politico-social change.

On the basis of his “messianic” conception of time and history that rejects the conventional understanding of time as “empty” (that is, as continuous and homogeneous), Benjamin interprets revolution as a “shock” that kairologically disrupts the prevailing chronological and, with it, social and political order. For him, revolution thus constitutes a momentary event that makes a switch from a state of historical normalcy to a state of historical exception possible. This switch is as radical as it is sudden: “Every second” has the potential to serve as the gate through which “the messiah” can enter to fundamentally transform the world (compare Benjamin, 2009 [1940/42]). As opposed to Benjamin, thinkers like Hegel or Antonio Gramsci (1891-1937) understand revolution as a process that extends over time before it leads to substantial, intelligible change, that is, to new political, legal, economic, cultural, linguistic, and aesthetic principles being implemented and effectively taking root. Although Hegel describes the French Revolution as a “glorious dawn,” it is evident for him that the political events of the late 1780s and early 1790s are belated, derivative effects of a long-lasting historical epoch of revolution that encompasses the ages of the reformation and the enlightenment (compare Hegel, 1991 [1832-45]). Discussing revolution in more narrowly political terms, Gramsci describes its realization as a tedious “war of position” against “hegemonic” power structures: It is only by means of persistently working their way through numerous struggles with the opponents of revolution over time that its carriers can hope to supersede an established order (compare Gramsci, 1992 [1929-35]). Similarly, Marx and Engels put emphasis on the aspect of duration.
Modeling their understanding of revolution on the Israelites’ exodus from Egypt (compare Walzer, 1985), they attribute great significance to the interval that lies between the status quo at the time of the failed revolutions of 1848 and the future actualization of a classless society. Given the considerable distance between the initial and the terminal point of revolution, they propose a notion and practical program of “permanent revolution” that links immature democratic revolution to mature “proletarian” revolution. In modernizing and democratizing this idea, Étienne Balibar (b. 1942) expounds an understanding of revolution as a continuous, open-ended task. According to his view, revolution cannot hope for a final stage of satisfaction and completion (compare Balibar, 2014). Instead, it means an ongoing exercise in responsible citizenship and in “democratizing democracy.” This exercise allows for an ever-increasing inclusion of groups and individuals who, heretofore, have been denied the ability to “take part,” that is, for their unrestricted recognition as full subjects of “equaliberty,” which is a hybrid term indicating the two main trajectories of modern emancipatory politics: On one side, the Lockean liberal and individualist strand and, on the other, the Rousseauian socialist and collectivist strand, which Balibar takes to be interdependent and co-constitutive elements of democratic revolution.

Other thinkers discuss revolution primarily in terms of its spatial extension. Contemporary anarchist theorist David Graeber (b. 1961) argues that revolutionary projects can be pursued by the creation of “autonomous spaces” on a local scale. Within such spheres, alternatives to dominant forms of coexistence and interaction, of politics and economy can be practiced whereby the existing order is unmasked as contingent. What is more, in drawing on exemplary practices from other epochs and cultures, the contours of an order devoid of institutions such as the state or capitalism and of repressive convictions such as racism and misogyny are “pre-figured.” For Graeber, the narrow spatial limits of these alternative micro-worlds characterized by autonomy, mutual aid, and direct democracy do not negatively affect their subversive, transformative capacities (compare Graeber, 2004). Whereas thinkers such as, for example, Sieyès and Foucault see the nation state as the adequate space for revolution to occur (compare Sieyès, 2003 [1789]; Foucault, 2005 [1978-79]), others claim that this is too limited a scope for radical transformation to have profound and lasting impact. For instance, Lenin, not unlike Sartre in his “revolutionary humanism” (compare Sartre, 1955 [1946]), follows Marx in emphasizing the transnational implications of revolution even if its scope, especially in its early phases, has to be national for reasons of mere practicability. According to Lenin, emancipatory projects carried out by a “revolutionary people” send shockwaves across neighboring as well as distant countries. Thus, it is evident to him that the Russian Revolution ultimately represents the “interests of world socialism,” which outweigh mere national interests (compare Lenin, 1987 [1902]; 1978 [1917]).
This position takes up the universalism inherent to the American and French Revolutions, which finds its expression in pronounced references to the “rights of man” in the writings and speeches of Paine or Mirabeau as well as in the essential political documents of the revolutionary period: the 1776 Declaration of Independence and the 1789 Declaration of the Rights of Man and of the Citizen.

4. Conclusion

Even when the plurality of manners in which “revolution” is used in the domains of technology and science, culture and art, is left aside and when the term is applied in the domain of politics only, the heterogeneity and contested nature of understandings remain considerable. In spite of the wide range of specific approaches, arguments, and agendas characteristic of the individual theories of political revolution, they can be situated within one multifaceted, yet unified intellectual space: From the theoretical enablers and “inventors” of revolution like Rousseau, Paine, or Kant to contemporary thinkers of revolution like Balibar or Graeber, their theories have been confronted with a number of central problems and questions which open up, shape, and sustain this space. It is primarily in terms of these central questions that they have attempted to conceptually grasp revolution. Six of these questions have been outlined in the above sections: (1) the question of revolutionary novelty, which is discussed on a spectrum between the extremes of absolute and relative notions of rupture and beginning; (2) the question of revolutionary violence and its legitimacy, discussed on the spectrum between unqualified approval and unreserved exclusion as a means of revolution; (3) the question of revolutionary freedom, discussed on the spectrum between negative (liberation) and positive (foundation) concepts of freedom as the aim of revolution; (4) the question of the revolutionary subject, discussed on the spectrum between individual doers on the one end and a global “multitude” on the other; (5) the question of the revolutionary object or target, discussed on the spectrum between political, social institutions and individual, subjective attitudes, convictions, and beliefs; and (6) the question of the temporal and spatial extension of revolution, discussed on the spectrum between momentary and local on the one end, permanent and global on the other.
Despite their pronounced heterogeneity and their attempts to periodically redefine revolution, it is with respect to these key questions that the theories presented here share family resemblances to one another.

Defining whether political change can be considered revolutionary constitutes the conceptual issue at the core of these theories. In particular, they aim at circumscribing revolution in regard to related, yet distinct concepts such as revolt, rebellion, and reform, whereby the questions of the new, of liberty, and of the legitimacy of violence serve as the most relevant criteria for demarcation. The first two criteria play a central role in the distinction between revolution on the one hand, revolt and rebellion on the other. As a consequence of the underlying main goal of casting off an unjust, oppressive regime, both revolt and rebellion are based on limited notions of novelty and liberty. Thus, in comparison to revolutionary change, the specific kind of change they aspire to is more marginal in its scope. However, once revolution is not conceived as momentary but as procedural (as is the case in Kant’s or Marx’s considerations), drawing such a clear conceptual line seems less feasible: If revolution is understood as a temporal sequence that encompasses multiple stages, an initial “revolting” or “rebellious” phase is conceivable, for which the aspect of durable foundation of a new order is secondary. For the differentiation of revolution and reform, the criteria of novelty and violence are central. Whereas the criterion of violence reliably allows for a demarcation, temporalized understandings of revolution entail the blurring of a seemingly obvious difference with respect to the aspect of novelty: Here, a concluding “reformist” phase of revolution is thinkable in which the configuration of an institutional order or the establishment of a common ground with former “enemies of the revolution” takes precedence. Accordingly, when Kropotkin links revolution and revolt or when Kant explicitly associates revolution with reform, the relatedness between these concepts, not to mention the phenomena, is reflected.
In light of these resemblances, attempts at a precise conceptual critique of revolution, which distinguishes it sharply from revolt, rebellion, or reform, remain heuristic in character.

Determining if and under what conditions revolutionary action and, especially, revolutionary violence are morally justified constitutes the normative issue at the core of theories of revolution. Although revolution represents the most radical expression of dissent and protest, the determination of its legitimacy reveals points of contact with debates on less extreme forms of a politics of resistance and transformation such as, for example, civil disobedience (compare Rawls, 1999). Despite the differences as to, inter alia, the scope of the envisaged transformation, their legitimacy essentially depends on the underlying cause and motivation. Revolutionary action and, with it, at least temporary political disorder, can only be considered legitimate if it aims at overcoming continued violations of the basic rights of specific groups or entire nations by the regime in power that are both severe and systematic. While conflict between ruling powers and revolutionary movements typically takes place within the context of a state, broader issues independent of the policies of a specific state can also be invoked as a justified cause to engage in radically transformative politics. The Occupy movement and its appeal to the inequalities brought about by the current global economic system is a case in point. Within and beyond the context of the state, the intention to right the wrongs—that is, the injustices as to dignity, liberty, and equality—committed by a regime and secured by unjust political, legal, social, or economic institutions is the primary precondition for a revolutionary project’s justifiability.

Furthermore, the (il)legitimacy of revolutionary politics is determined by the heavily disputed question of the permissibility of revolutionary violence. In relation to this question, the focus is not on the just cause, the right reason and intention of such a politics, but on the conduct in the course of its realization. The dispute pertains to different dimensions: It concerns the general issue whether violence can be considered a politically and, more importantly, morally justifiable means of revolution, in other words, whether, based on strategic or principled considerations, its use can be justified at all. In addition, it concerns more specific issues such as its justifiable form (for example, violence against property), scope (for example, violence limited to early stages of the revolutionary process), and status (for example, violence as a last resort once all peaceful alternatives have failed). Here, the discussion on revolution resembles theoretical debates on just war (Arendt, 2006 [1963]; Walzer, 2006 [1977]). For instance, much like in the case of the ius in bello, attempts to formulate essential criteria of acceptable revolutionary conduct aim at ensuring the proportionality of the use of violence, at discriminating between legitimate and illegitimate targets, and at prohibiting hostile acts which are “vile in themselves” (compare Kant, 2006c [1795/96]). Besides the perspectives of cause (in analogy to the terminology of just war theory: ius ad revolutionem) and conduct (ius in revolutione), there is a third critical perspective, in terms of which the legitimacy of revolutionary action and violence is determined. This perspective focuses on the ius post revolutionem, that is, on the final stage of a revolution, and assesses its capacity to terminate the state of exception in order to transition into a new and stable political order. 
The stability of such a reconstitution is largely predicated on reconciliation with and inclusion of former adversaries. It is mainly thanks to the criteria of cause, conduct, and reconstitution that revolutionary violence becomes distinguishable from the violence used by criminals and, especially, terrorists. However, largely on the basis of formative historical experiences of excessive revolutionary violence (of revolutions not only harming their enemies, but also “devouring their children”) as well as of Gandhi’s or Mandela’s successful transformative projects, non-violent revolutionary action generally has a greater claim to justification.

A further relevant issue with regard to just revolution theory pertains to the self-authorization of revolutionary movements, which raises the questions of whom such movements speak for and whose interests they represent. This issue crystallizes in revolutionary declarations that often appeal to “the people” (compare Habermas, 1990; Derrida, 2002). In this case, the legitimacy of a revolutionary project depends, among other things, on whether the revolutionaries’ political power and the sovereignty of the regime they establish are based on force or on discourse, that is, on oppression or persuasion of the majority.

To conclude, this article provides a sample of the rich theoretical discourse surrounding the contested concept of revolution. While the positions developed within the three dominant schools of thought (democratic, communist, and anarchist) are strongly shaped by broader commitments to the underlying political philosophies and often indebted to other debates (for example, on war), this discourse has distinctive features due to the specificity of its object of investigation and the controversial exchange of views between the different traditions. Given both its breadth and unsettledness, there are significant conceptual and normative issues for philosophers to address. It is not only in light of the often problematic history of revolutions that it is expedient to theoretically “provide yardsticks and measurements” (Hannah Arendt); a thorough analysis and critical assessment of transformative concepts, agendas, and strategies is also required because of the contemporary re-emergence of movements with revolutionary aspirations from the Zapatistas to the Arabellion, Occupy, or the Indignados.

5. References and Further Reading

  • Arendt, H., 2006, On Revolution [1963], New York: Penguin.
  • Badiou, A., 2012, The Rebirth of History, trans. G. Elliott, London/New York: Verso.
  • Balibar, É., 2014, Equaliberty: Political Essays, trans. J. Ingram, Durham: Duke University Press.
  • Bakunin, M., 2009, God and the State [1871], New York: Cosimo.
  • Bakunin, M., 1990, Statism and Anarchy [1873], trans. & ed. M.S. Shatz, Cambridge: Cambridge University Press.
  • Benjamin, W., 2009, On the Concept of History [1940/42], New York: Classic Books America.
  • Benjamin, W., 1999, Zur Kritik der Gewalt [1921], in Walter Benjamin Gesammelte Schriften, vol. II.1, eds. R. Tiedemann & H. Schweppenhäuser, Frankfurt: Suhrkamp, 179–204.
  • Berman, H., 1985, Law and Revolution: The Formation of the Western Legal Tradition, Cambridge, MA: Harvard University Press.
  • Butler, J., 2015, Notes Toward a Performative Theory of Assembly, Cambridge, MA: Harvard University Press.
  • Camus, A., 1991, The Rebel: An Essay on Man in Revolt [1951], trans. A. Bower, New York: Vintage Books.
  • Condorcet, J.A.N. de, 2012, Political Writings, eds. S. Lukes & N. Urbinati, Cambridge, UK/New York: Cambridge University Press.
  • Dawoody, M., 2011, The Islamic Law of War: Justifications and Regulations, London: Palgrave Macmillan.
  • DeFronzo, J., 2011, Revolution and Revolutionary Movements, Boulder: Westview Press.
  • Derrida, J., 2002, “Declarations of Independence”, in Negotiations: Interventions and Interviews 1971-2001, ed. & trans. E. Rottenberg, Stanford: Stanford University Press, 46–54.
  • Engels, F., 1969, Germany: Revolution and Counter-Revolution [1851/52], with the collaboration of Karl Marx, ed. E. Marx, London: Lawrence & Wishart.
  • Fadl, K., 2006, Rebellion and Violence in Islamic Law, Cambridge: Cambridge University Press.
  • Fanon, F., 1967, The Wretched of the Earth [1961], trans. C. Farrington, Harmondsworth: Penguin.
  • Foucault, M., 2005, “Writings on the Iranian Revolution” [1978-79], in Foucault and the Iranian Revolution: Gender and the Seductions of Islamism, eds. J. Afary & K.B. Anderson, Chicago: University of Chicago Press, 179–277.
  • Furet, F., and M. Ozouf (eds.), 1989, A Critical Dictionary of the French Revolution, Cambridge, MA: Belknap Press.
  • Graeber, D., 2004, Fragments of an Anarchist Anthropology, Chicago: Prickly Paradigm Press.
  • Gramsci, A., 1992–, Prison Notebooks [1929-35], New York: Columbia University Press.
  • Habermas, J., 1990, “Naturrecht und Revolution”, in Theorie und Praxis, Frankfurt: Suhrkamp, 89–127.
  • Hardt, M., and A. Negri, 2004, Multitude: War and Democracy in the Age of Empire, New York: Penguin Press.
  • Hegel, G.W.F., 1977, Phenomenology of Spirit [1807], trans. A.V. Miller, Oxford: Clarendon Press.
  • Hegel, G.W.F., 1991, The Philosophy of History [1832-45], trans. J. Sibree, Buffalo, NY: Prometheus Books.
  • Hobsbawm, E., 1996, The Age of Revolution: Europe 1789-1848 [1962], New York: Vintage Books.
  • Jefferson, Th., 2004–, The Papers of Thomas Jefferson: Retirement Series, ed. J.J. Looney, Princeton: Princeton University Press.
  • Jefferson, Th., 2010, The Selected Writings of Thomas Jefferson: Authoritative Texts, Contexts, Criticism, ed. W. Franklin, New York: W. W. Norton & Co.
  • Kant, I., 2006a, “An Answer to the Question: What is Enlightenment?” [1784] in Toward Perpetual Peace and other Writings on Politics, Peace, and History, ed. P. Kleingeld, trans. D.L. Colclasure, New Haven: Yale University Press, 17–23.
  • Kant, I., 2006b, “Idea for a Universal History from a Cosmopolitan Perspective” [1784], in Toward Perpetual Peace and other Writings on Politics, Peace, and History, ed. P. Kleingeld, trans. D.L. Colclasure, New Haven: Yale University Press, 3–16.
  • Kant, I., 1991, “The Contest of Faculties” [1798], in Kant: Political Writings, ed. H. Reiss, Cambridge: Cambridge University Press, 176–190.
  • Kant, I., 1996, The Metaphysics of Morals [1797], trans. & ed. M. Gregor, Cambridge/New York: Cambridge University Press.
  • Kant, I., 2006c, “Toward Perpetual Peace: A Philosophical Sketch” [1795/96], in Toward Perpetual Peace and other Writings on Politics, Peace, and History, ed. P. Kleingeld, trans. D.L. Colclasure, New Haven: Yale University Press, 67–109.
  • Kantorowicz, E., 1997, The King’s Two Bodies: A Study in Medieval Political Theology [1957], Princeton: Princeton University Press.
  • Koselleck, R., 2004, Futures Past: On the Semantics of Historical Time, trans. K. Tribe, New York: Columbia University Press.
  • Koselleck, R. et al., 1984, “Revolution (Rebellion, Aufruhr, Bürgerkrieg)”, in Geschichtliche Grundbegriffe. Historisches Lexikon zur politisch-sozialen Sprache in Deutschland, volume 5, eds. O. Brunner, W. Conze, & R. Koselleck, Stuttgart: Klett-Cotta, 653–788.
  • Kropotkin, P., 2008, The Conquest of Bread [1892], Oakland, CA: AK Press.
  • Kropotkin, P., 1972, Mutual Aid: A Factor of Evolution [1902], ed. P. Avrich, New York: New York University Press.
  • Lenin, V.I., 1987, Essential Works of Lenin: What is to be done? and other writings, ed. H.M. Christman, New York: Dover Publications.
  • Lenin, V.I., 1978, State and Revolution: Marxist Teaching about the Theory of the State and the Tasks of the Proletariat in the Revolution [1917], Westport: Greenwood Press.
  • Locke, J., 1986, Second Treatise on Civil Government [1689], Amherst, NY: Prometheus Books.
  • Marcuse, H., 1984, “Ethik und Revolution” [1964], in Herbert Marcuse Schriften, volume 8, Frankfurt: Suhrkamp, 100–114.
  • Marcuse, H., 1991, One-dimensional Man: Studies in the Ideology of Advanced Industrial Society [1964], Boston: Beacon Press.
  • Marx, K., 2001a, Capital: A critique of Political Economy. Vol. I, Book One, The Process of Production of Capital [1867], trans. S. Moore & E. Aveling, London: Electric Book Co.
  • Marx, K., 2001b, The Class Struggles in France [1850], London: Electric Book Co.
  • Marx, K., 2012, The Communist Manifesto [1848], with Friedrich Engels, London: Verso.
  • Menke, Ch., and F. Raimondi (eds.), 2011, Die Revolution der Menschenrechte. Grundlegende Texte zu einem neuen Begriff des Politischen, Berlin: Suhrkamp.
  • Merleau-Ponty, M., 2005, Phenomenology of Perception [1945], London/New York: Routledge.
  • Nail, T., 2012, Returning to Revolution: Deleuze, Guattari, and Zapatismo, Edinburgh: Edinburgh University Press.
  • Nussbaum, M., 2013, Political Emotions: Why Love Matters for Justice, Cambridge, MA: Harvard University Press.
  • Paine, Th., 2012, Common Sense [1776], ed. R. Beeman, New York: Penguin.
  • Paine, Th., 2000, Political Writings, ed. B. Kuklick, Cambridge/New York: Cambridge University Press.
  • Paine, Th., 1992, Rights of Man [1791], ed. G. Claeys, Indianapolis: Hackett Pub. Co.
  • Palmer, R., 2014, The Age of Democratic Revolution: A Political History of Europe and America 1760-1800 [1959], Princeton: Princeton University Press.
  • Rawls, J., 1999, “The Justification of Civil Disobedience”, in John Rawls: Collected Papers, S. Freeman (ed.), Cambridge, MA: Harvard University Press, 176–189.
  • Rosenstock-Huessy, E., 1993, Out of Revolution: Autobiography of Western Man [1938], Providence: Berg Publishers.
  • Rousseau, J.-J., 1992, Discourse on the Origin of Inequality [1755], trans. D.A. Cress, Indianapolis: Hackett Pub. Co.
  • Rousseau, J.J., 2012, Of the Social Contract and other Political Writings [1762], ed. C. Bertram, trans. Q. Hoare, London: Penguin.
  • Sartre, J.-P., 1962, “Materialism and Revolution” [1946], in Literary and Philosophical Essays, trans. A. Michelson, New York: Criterion Books, 185–239.
  • Sartre, J.-P., 1967, “Preface”, in The Wretched of the Earth, F. Fanon, Harmondsworth: Penguin.
  • Sieyès, E.J., 2003, Political Writings: Including the Debate between Sieyès and Tom Paine, with a Translation of What is the Third Estate?, ed. M. Sonenscher, Indianapolis: Hackett Pub. Co.
  • Skocpol, T., 1979, States and Social Revolutions, Cambridge/New York: Cambridge University Press.
  • Walzer, M., 1985, Exodus and Revolution, New York: Basic Books.
  • Walzer, M., 2006, Just and Unjust Wars: A Moral Argument with Historical Illustrations [1977], New York: Basic Books.
  • Walzer, M., 1992, Regicide and Revolution: Speeches at the Trial of Louis XVI [1972], New York/Oxford: Columbia University Press.
  • Zizek, S., 2012, The Year of Dreaming Dangerously, London/Brooklyn, NY: Verso.
  • Zizek, S., 2008, Violence: Six Sideways Reflections, New York: Picador.

Author Information

Florian Grosser
Email: florian.grosser@unisg.ch
University of St. Gallen
Switzerland

Everettian Interpretations of Quantum Mechanics

Between the 1920s and the 1950s, the mathematical results of quantum mechanics were interpreted according to what is often referred to as “the standard interpretation” or the “Copenhagen interpretation.” This interpretation is known as the “collapse interpretation” because it supposes that an observer external to a system causes the system, upon observation, to collapse from a quantum mechanical state to a state in which the elements of the system appear to have a determinate value for the property measured. Although this interpretation is largely successful at explaining our experiences of the world, it fails in that it gives rise to what has become known as the measurement problem, a problem described in section 2.

In addition to this problem, there is another problem tied to the role of the observer. In the 1950s, Hugh Everett III (1930-1982) was considering quantum mechanics as it might apply to the entirety of the universe. Surely if quantum mechanics were true on the local level of laboratories and experiments taken as closed systems, it would also be true for the entire universe taken as a closed system. The problem with this approach is that there is no external observer available at the scale of the entire universe to cause a collapse of the quantum state, a state that the laws of quantum mechanics say the universe would be in, were it unobserved. Thus, Everett suggested that we abandon the notion of an observer-caused collapse and we consider all quantum states to be always non-collapsed.

Everett published one short paper in 1957—his doctoral dissertation (Everett 1957a, 1957b)—and after that, he left academia. He later published the longer, original version of his dissertation at the request of Bryce DeWitt and Neill Graham (Everett 1973). In his dissertation, Everett develops the mathematical theory that is the foundation of Everettian quantum mechanics [EQM]; but many people have believed the theory itself needs interpretation. Although Everett was not interested in the philosophical implications of his work, there has been great interest among philosophers in trying to interpret what EQM implies about the metaphysical structure of the world.

This article surveys the various ways philosophers have attempted to interpret Everett. To begin, the standard interpretation, as well as its attendant problems, is discussed briefly. Following that, the bare theory, the single and many minds theories, and versions of a many worlds theory are discussed. The article closes by discussing two relational interpretations of Everett.

Table of Contents

  1. Preamble
  2. The Standard Interpretation
  3. Everettian Interpretations
    1. Everett Plus Nothing (The Bare Theory)
    2. Everett Plus Minds
    3. Everett Plus Worlds (DeWitt’s Splitting Worlds)
      1. Problems with the Notion of Splitting
      2. The Preferred Basis Problem
  4. The Evolution of Many Worlds Interpretations
    1. The Problem of Probability and Graham’s Attempt at Adding a Measure
    2. The Oxford MWI
    3. Objections to the Oxford MWI
  5. Relational Interpretations of Everettian Quantum Mechanics
    1. Simon Saunders’ Relational Interpretation
    2. The Relative Facts Interpretation
  6. Conclusion
  7. References and Further Reading

1. Preamble

Before beginning to survey the various ways philosophers have attempted to interpret Everett, we must address the question of whether or not there even are rival interpretations of Everett. Some of the most influential physicists and philosophers working on EQM have either taken it as fact (DeWitt 1970) or explicitly argued (Deutsch 2010; Wallace 2012) that there is only one “interpretation” of EQM: some version of the many worlds interpretation [MWI].

Bryce DeWitt (1923-2004) took it to be the case that the only way one can “interpret” Everett was through a many worlds theory. He wrote, “The mathematical formalism of the quantum theory is capable of yielding its own interpretation” (DeWitt 1970: 160). And that formalism “forces us to believe in the reality of all the simultaneous worlds represented in the superposition described by equation [(2), below], in each of which the measurement has yielded a different outcome” (DeWitt 1970: 161).

David Deutsch (1953- ) and David Wallace (1976- ) have argued that there are no rival “interpretations” of Everett: “Other ‘interpretations’ . . . are really alternative physical theories . . .” (Wallace 2012: 382, Wallace’s emphasis). They see the “Everett interpretation” to be “just quantum mechanics itself, read literally, straightforwardly—naively, if you will—as a direct description of the physical world, just like any other microphysical theory” (Wallace 2012: 2). Deutsch writes:

. . . insisting that parallel universes are ‘only an interpretation’ and not a – what? a scientifically established fact or something (as if there were such a thing) – has the same logic as those stickers that they paste in some American biology textbooks, saying that evolution is ‘only a theory’, by which they mean precisely that it’s just an ‘interpretation’. Or, in terms of the analogy that Everett used in his famous exchange of letters with Bryce DeWitt, it’s like claiming that the motion of the Earth about its axis is only an ‘interpretation’ that we place on our observations of the sky (2010: 543, Deutsch’s emphasis).

Wallace writes:

The ‘Everett interpretation of quantum mechanics’ is just quantum mechanics itself, ‘interpreted’ the same way we have always interpreted scientific theories in the past: as modelling the world. Someone might be right or wrong about the Everett interpretation – they might be right or wrong about whether it succeeds in explaining the experimental results of quantum mechanics, or in describing our world of macroscopically definite objects, or even in making sense – but there cannot be multiple logically possible Everett interpretations any more than there are multiple logically possible interpretations of molecular biology or classical electrodynamics (2012: 38, Wallace’s emphasis).

The arguments Deutsch and Wallace provide may be persuasive to some readers. But the purpose of the current article is to survey what has historically been done by philosophers attempting to draw metaphysical pictures from Everett’s pure wave mechanics. Wallace is explicit about the fact that he is not attempting to do historical exegesis of Everett’s views (2012: 2), and whether Everett would be sympathetic to or supportive of an MWI is an open question. (See Barrett 2010, Barrett and Byrne 2012, and Bevers 2011.) So whether or not a version of the MWI is the correct interpretation of Everett, or even the only interpretation of Everett, is a question that can be adjudicated in other venues. Our purpose here is to consider the ways in which philosophers have attempted to interpret Everett’s pure wave mechanics, and so, after one final preliminary note, it is to this that we shall turn.

There is one other debate that ought to be considered before embarking on our project. And that is the debate over the appropriate way to explain the results of quantum mechanical experiments. Everett’s proposal for pure wave mechanics is but one way physicists explain what seem to be counterintuitive outcomes of quantum mechanical experiments. Other ways include Bohmian mechanics (de Broglie 1928, Bohm 1952) and GRW (Ghirardi, Rimini & Weber 1986). Whether or not the unitary dynamics proposed by Everett are the correct laws for describing the world is a question that is far from decided. But as this article is concerned with the question of Everett interpretation, a full rehearsal of this debate goes beyond its scope. There is no assumption made here about what the correct theory of the world is; rather there is only a historical discussion of the way people have interpreted Everett.

For more on interpretations of quantum mechanics, see “Interpretations of Quantum Mechanics” in this encyclopedia and also (Lewis 2016).

2. The Standard Interpretation

In Schrödinger’s cat thought experiment (Schrödinger, 1935), there is a cat locked inside a box along with a glass vial of cyanide; a hammer set to potentially break the vial; and a Geiger counter inside of which there is a sample of a radioactive substance small enough that there is a 50% chance of one of the atoms decaying in the course of one hour and a 50% chance that none will. If an atom decays, the Geiger counter will click, which causes the hammer to fall, the vial to break, the cyanide gas to be released and the cat to die. If no atom decays, the cat remains alive. If the inside of the box is not observed during this hour, Schrödinger took the formalism of quantum mechanics to imply that the cat would be in a superposition of being alive and dead.

Superpositions are states of systems that are represented mathematically by a weighted sum of the possible values for the property in question. Each summand represents one of the possible values for the property and is accompanied by a complex number coefficient. When this coefficient is multiplied by its complex conjugate (in other words, when its norm is squared), the standard interpretation takes the result to be the probability of the system collapsing into that value for the property. So in our cat example, there will be two terms in the superposition of the state of the system that includes the cat, each with a coefficient of 1/√2 which, when its norm is squared, gives us a ½ probability of the system collapsing into the state that includes the cat being alive and a ½ probability of its collapsing into the state of the system that includes the cat being dead.
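The arithmetic in the cat example can be checked with a minimal sketch. The function name born_probabilities and the dictionary labels are illustrative choices, not part of the standard formalism or of any library; the only rule encoded is that each probability is the coefficient times its complex conjugate.

```python
import math

def born_probabilities(amplitudes):
    """Return c * conj(c) for each coefficient c, after checking normalization."""
    probs = [(c * c.conjugate()).real for c in map(complex, amplitudes)]
    total = sum(probs)
    if not math.isclose(total, 1.0, abs_tol=1e-12):
        raise ValueError(f"state is not normalized: probabilities sum to {total}")
    return probs

# Cat example: two summands, each with coefficient 1/sqrt(2).
# The labels "alive" and "dead" are illustrative names for the two terms.
cat_state = {"alive": 1 / math.sqrt(2), "dead": 1 / math.sqrt(2)}
cat_probs = dict(zip(cat_state, born_probabilities(cat_state.values())))
```

Here cat_probs assigns a probability of one half (up to floating-point rounding) to each of the two terms, matching the figures given in the text.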

We never seem to observe cats (or any other macroscopic objects) as being in superpositions. The standard interpretation assumes that when an observer interacts with a system, that observation causes a collapse of the superposition, and the objects in the system take on definite values for the property being measured. So when the box is opened and the observer looks into it, the system randomly and instantaneously collapses into either cat alive or cat dead with a 50% probability of finding either.

Now we can transition from cats to electrons. Electrons have a property called “spin” (an intrinsic angular momentum) that can take a definite value of either spin up or spin down along the x, y or z axis. These values are mutually exclusive in the sense that if an electron has a definite value for one of the properties, it does not have a definite value for any of the others. It has been experimentally determined that when electrons have a definite spin property along one axis, they are in a superposition of having spin up and spin down along each of the other axes. So, for example, when an electron is determinately x-spin up, it is in a superposition of being y-spin up and y-spin down, and it is in a superposition of being z-spin up and z-spin down.
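A short sketch can make the claim about the three axes concrete. It assumes the usual textbook representation of spin states as pairs of complex amplitudes written in the z-basis; that representation, and the names STATES and overlap_probability, are assumptions introduced here for illustration and do not appear in the discussion above.

```python
import math

S = 1 / math.sqrt(2)

# Spin states as amplitude pairs in the z-basis (standard textbook convention,
# assumed here for illustration).
STATES = {
    "z-up": (1, 0), "z-down": (0, 1),
    "x-up": (S, S), "x-down": (S, -S),
    "y-up": (S, S * 1j), "y-down": (S, -S * 1j),
}

def overlap_probability(outcome, state):
    """Squared norm of <outcome|state>: the weight of `outcome` in `state`."""
    amp = sum(
        complex(a).conjugate() * b
        for a, b in zip(STATES[outcome], STATES[state])
    )
    return abs(amp) ** 2
```

On this representation, overlap_probability("z-up", "x-up") and overlap_probability("y-up", "x-up") both come out at one half, while overlap_probability("x-down", "x-up") is zero: an electron that is determinately x-spin up is an equal-weight superposition along the other two axes.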

The standard interpretation tells us that when we observe electrons that are in such superpositions, they instantaneously and randomly collapse from the superposition they were in to one of the definite properties that make up that superposition. So when we take an electron that is x-spin up, for example, and measure its z-spin, the standard interpretation tells us that it collapses from being in a superposition of being z-spin up and z-spin down into either being z-spin up or being z-spin down. In the standard interpretation, this collapse explains the determinate measurement records that we get in experiments with quantum mechanical systems. The standard interpretation also tells us that if we do not observe quantum particles, then that collapse will not happen and they will remain in their superpositions.

The difference in these empirical results is captured in two of the laws that are part of the theory of quantum mechanics:

  1. When no measurement, or other observation, is made of a system, then that system evolves in a deterministic and linear fashion.
  2. When a measurement, or other observation, is made of a system, then that system instantaneously and non-deterministically collapses into a definite value for the property being measured.

In the standard interpretation, the second law accounts for the fact that when we measure any property of an object, it has a definite value.
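The contrast between the two laws can be sketched in a few lines. This is only an illustration under stated assumptions: the state is a pair of amplitudes for a two-valued property, the matrix HADAMARD is an arbitrary example of a linear, deterministic evolution, and the function names evolve and collapse are hypothetical.

```python
import math
import random

def evolve(unitary, state):
    """Law 1: deterministic, linear evolution (2x2 matrix applied to an amplitude pair)."""
    (a, b), (c, d) = unitary
    x, y = state
    return (a * x + b * y, c * x + d * y)

def collapse(state, rng):
    """Law 2: random jump to one definite value, with probability |amplitude|^2."""
    p0 = abs(state[0]) ** 2
    outcome = 0 if rng.random() < p0 else 1
    return outcome, ((1, 0) if outcome == 0 else (0, 1))

S = 1 / math.sqrt(2)
HADAMARD = ((S, S), (S, -S))  # illustrative unitary, chosen only for the example

superposed = evolve(HADAMARD, (1, 0))               # always the same result
outcome, post_state = collapse(superposed, random.Random(0))  # depends on chance
```

Applying evolve twice to the same input always gives the same output, whereas repeated calls to collapse on the same superposition give different outcomes from run to run. Nothing in the sketch says when one function rather than the other should be called, which is the gap the measurement problem points to.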

These two laws are not compatible, and there is no clear explanation of when one is to be used instead of the other. In other words, there is no explanation of what constitutes an act of observation in the standard interpretation. In addition to this, there is the problem that if we take measurement devices to be physical systems like any other, then the first law says that the quantum system that makes up the measuring device will evolve deterministically, but the second law says that it will take on a definite value with a certain probability. In other words, it would have to follow both a deterministic law and a law governed by chance. This is not logically possible. This, in short, is the quantum measurement problem. It is part of what has driven the search for different interpretations of quantum mechanics.

Another part of what drove the search for a new interpretation of quantum mechanics was an interest in being able to apply quantum mechanics to the entire universe. This is what, at least in part, led Hugh Everett III to suggest that instead of having two mutually incompatible laws for the description of the evolution of states, we drop the law that is used when systems are observed (Everett 1957a, 1957b). The implication of this is that while the standard interpretation suggests that when a measurement is made, the superposition of a quantum particle collapses and the particle has a determinate measurable (and measured) property, the so-called “Everett interpretation” claims that there is no such collapse.  All the theories that have sprung from Everett’s “pure wave mechanics” have come to be known as “no-collapse” theories since they propose that there is no collapse of the superpositions.

One difficulty with no-collapse theories is making sense of how it is that we seem to have determinate measurement records for quantum particles even though those particles do not have a determinate value for the property measured, since they never collapse out of their superpositions. Another is the question of probability in a universe in which everything happens. Various interpretations of Everett have answered these issues differently. It is to a discussion of these various interpretations that we now turn.

3. Everettian Interpretations

a. Everett Plus Nothing (The Bare Theory)

Everett’s pure wave mechanics suggests that there is generally no determinate fact about the everyday properties of the objects in our world, since the equations that are supposed to describe such properties are such that they describe superpositions of those properties. Rather, Everett takes there to be only “relative states” and thus “relative properties” of quantum systems.

To see what he means by “relative states” and “relative properties,” consider the following. When we want to learn the value of a property for some system, we measure for that property.  But Everett treats measuring devices just as he would any other system with which the object system interacts, and so the measuring device will become correlated with the system that it is measuring.  In order to learn anything about one subsystem, even the reading on a measuring device, one must make reference to the complement of the subsystem:

As a result of the [measurement] interaction the state of the measuring apparatus is no longer capable of independent definition. It can be defined only relative to the state of the object system. In other words, there exists only a correlation between the two states of the two systems. It seems as if nothing can ever be settled by such a measurement . . . There is no longer any independent system state or observer state, although the two have become correlated in a one-one manner (Everett 1957b: 144, 146).

Everett explains “correlation” this way: “If one makes the statement that two variables, X and Y, are correlated, what is basically meant is that one learns something about one variable when he is told the value of the other” (Everett 2012: 61; this definition also shows up in Everett 1973: 17). Even in this case, it only “seems” as if nothing can be settled because of this correlation between an object system and a measuring device.  In fact, one can settle matters with the use of relative states:

. . . a constituent subsystem cannot be said to be in any single well-defined state, independently of the remainder of the composite system.  To any arbitrarily chosen state for one subsystem there will correspond a unique relative state for the remainder of the composite system.  This relative state will usually depend upon the choice of state for the first subsystem.  Thus the state of one subsystem does not have an independent existence, but is fixed only by the state of the remaining subsystem.  In other words, the states occupied by the subsystems are not independent, but correlated.  Such correlations between systems arise whenever systems interact (Everett 1957b: 142; Everett’s emphasis).

Consider again what happens when we measure the z-spin of an x-spin up electron. Before the measurement interaction, the state of the system consisting of the electron, e, and the measuring device, m, can be expressed in this way:

(1) $|m\rangle_{\text{ready}}\,\tfrac{1}{\sqrt{2}}\left(|\uparrow_z\rangle_e + |\downarrow_z\rangle_e\right)$

Where “$|m\rangle_{\text{ready}}$” is what we use to express that the measuring device is ready to make a measurement, “$|\uparrow_z\rangle_e$” expresses that the electron is z-spin up and “$|\downarrow_z\rangle_e$” expresses that it is z-spin down. When we measure the z-spin, the measuring device interacts with the system it is measuring and becomes a part of the system. After that interaction, the state of the system that consists of the measuring device and the electron can be expressed in this way:

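(2) $\tfrac{1}{\sqrt{2}}\left(|\uparrow_z\rangle_e\,|\text{“}\uparrow\text{”}_z\rangle_m + |\downarrow_z\rangle_e\,|\text{“}\downarrow\text{”}_z\rangle_m\right)$

Where “$|\text{“}\uparrow\text{”}_z\rangle_m$” expresses that the measuring device has recorded z-spin up and “$|\text{“}\downarrow\text{”}_z\rangle_m$” expresses that the measuring device has recorded z-spin down.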

In this state, the measuring device is in a superposition of reading z-spin up and z-spin down. Everett writes, “one can . . . look upon the total wave function . . . as a superposition of pairs of subsystem states, each element of which has a definite q value . . . for each of which the apparatus has recorded a definite value . . .” (Everett 1957a: 58, 59; Everett’s emphasis). So the way we get an explanation of our determinate measurement records is by understanding that they are records of relative states of a system.
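The step from state (1) to state (2), and the idea of a relative state, can be sketched as follows. The encoding of a state as a dictionary from (electron value, device reading) pairs to amplitudes, and the function names measure_z and relative_state, are assumptions made for this illustration only.

```python
import math

def measure_z(state):
    """Linear measurement interaction: in every term, the device record
    is set to match the electron's z-spin in that term."""
    out = {}
    for (electron, device), amp in state.items():
        if device != "ready":
            raise ValueError("device must start in the ready state")
        key = (electron, f'"{electron}"')
        out[key] = out.get(key, 0) + amp
    return out

def relative_state(state, electron_value):
    """Device state relative to a chosen electron state:
    keep the matching terms and renormalize."""
    terms = {dev: amp for (el, dev), amp in state.items() if el == electron_value}
    norm = math.sqrt(sum(abs(a) ** 2 for a in terms.values()))
    return {dev: amp / norm for dev, amp in terms.items()}

S = 1 / math.sqrt(2)
before = {("up", "ready"): S, ("down", "ready"): S}  # state (1)
after = measure_z(before)                            # state (2)
device_given_up = relative_state(after, "up")
```

In this sketch, after contains two terms of equal weight, so the device has no record of its own; but device_given_up contains only the record "up", which is the sense in which the record is definite relative to the electron’s being z-spin up.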

Using the concept of relative states, we can say two things: (1) relative to the electron’s being z-spin up, the measuring device recorded “z-spin up” as the value of the electron’s z-spin; (2) relative to the electron’s being z-spin down, the measuring device recorded “z-spin down” as the value of the electron’s z-spin. This explains our determinate measurement results because if we ask a reliable observer whether he got a determinate measurement result for an experiment, he will always say “yes,” even when his measurement result is a superposition of outcomes and not one outcome determinately (Albert 1992, Barrett 1999). Here is why:

Let us say that there is a reliable observer who is about to measure the z-spin of an x-spin up electron. By “reliable” we mean that when we ask him whether he has a measurement record, if he has one then he will answer that he does; if he does not have one, he will answer that he does not. Let us also say that our observer is truthful.  He will always truthfully report what he has as a measurement record—if he recorded “z-spin down,” then he will answer “z-spin down” when asked what he recorded. Recall that x-spin up electrons are in a superposition of being z-spin up and z-spin down. So when we describe the state of the electron mathematically there will be one summand that describes the electron as z-spin up and one that describes it as z-spin down.

According to Everett’s pure wave mechanics, when our observer makes a measurement of the electron he does not cause a collapse, but instead becomes correlated with the electron. What this means is that where once we had a system that consisted of just an electron, there is now a system that consists of the electron and the observer. The mathematical equation that describes the state of the new system has one summand in which the electron is z-spin up and the observer measured “z-spin up” and another in which the electron is z-spin down and the observer measured “z-spin down.” In both summands our observer got a determinate measurement record, so in both, if we ask him whether he got a determinate record, he will say “yes.” If, as in this case, all summands share a property (in this case the property of our observer saying “yes” when asked if he got a determinate measurement record), then that property is determinate.

This is strange because he did not in fact get a determinate measurement record; he instead recorded a superposition of two outcomes.  After our observer measures an x-spin up electron’s z-spin, he will not have determinately gotten either “z-spin up” or “z-spin down” as his record. Rather he will have determinately gotten “z-spin up or z-spin down,” since his state will have become correlated with the state of the electron due to his interaction with it through measurement. Everett believed he had explained determinate experience through the use of relative states (Everett 1957b: 146; Everett 1973: 63, 68–70, 98–9). That he did not succeed is largely agreed upon in the community of Everettians.

This sparse interpretation of Everett, adding no metaphysics or special assumptions to the theory, has come to be known as the “bare theory.” One might say that the bare theory predicts disjunctive outcomes, since the observer will report that she got “either z-spin up or z-spin down,” with no determinate classical outcome and without being in a state where she would determinately report that she got “z-spin up” or determinately report that she got “z-spin down” (Barrett 1999). So, if the problem is to explain how we end up with determinate measurement results, the bare theory does not provide us with that explanation. Something must be added to Everett’s account.
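The bare theory’s rule that a property is determinate when every summand shares it can be sketched as below. The representation of the post-measurement state as a list of terms (with amplitudes omitted, since only agreement across terms matters here) and all function names are illustrative assumptions.

```python
def determinate_value(branches, question):
    """Return the shared answer if every term gives the same answer
    to `question`; otherwise return None (no determinate answer)."""
    answers = {question(branch) for branch in branches}
    return answers.pop() if len(answers) == 1 else None

# The two terms of the state after the observer measures z-spin.
branches = [
    {"electron": "z-up", "record": "z-spin up"},
    {"electron": "z-down", "record": "z-spin down"},
]

def has_record(branch):
    """Answer to: did you get a determinate record?"""
    return "yes" if branch["record"] in ("z-spin up", "z-spin down") else "no"

def which_record(branch):
    """Answer to: which result did you record?"""
    return branch["record"]

report_has_record = determinate_value(branches, has_record)
report_which = determinate_value(branches, which_record)
```

Here report_has_record is "yes" while report_which is None: the observer determinately reports having a result without there being a determinate result that he reports.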

For more on the bare theory see Albert and Loewer 1988, Albert 1992 and Barrett 1999. That Everett was uninterested in the philosophical implications of his work has been argued by Barrett 2010, Barrett and Byrne 2012 and Bevers 2011, though Deutsch 2010 and Wallace 2012 differ with this conclusion.

b. Everett Plus Minds

One suggestion about what to add to Everett’s account is to suppose that every time we are faced with an entangled state such as the state in which our observer found himself in the last section, we conclude that he got a determinate result from a particular perspective. To see what motivates this, consider what happens when an observer goes from being ready to read the result on a measuring device:

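(3) $|o\rangle_{\text{ready}}\,\tfrac{1}{\sqrt{2}}\left(|\uparrow_z\rangle_e\,|\text{“}\uparrow\text{”}_z\rangle_m + |\downarrow_z\rangle_e\,|\text{“}\downarrow\text{”}_z\rangle_m\right)$

to having read the device:

(4) $\tfrac{1}{\sqrt{2}}\left(|\uparrow_z\rangle_e\,|\text{“}\uparrow\text{”}_z\rangle_m\,|\text{“}\uparrow\text{”}_z\rangle_o + |\downarrow_z\rangle_e\,|\text{“}\downarrow\text{”}_z\rangle_m\,|\text{“}\downarrow\text{”}_z\rangle_o\right)$.

Here we use “$|\text{“}\uparrow\text{”}_z\rangle_o$” to represent the state of the observer having read “z-spin up” off the measuring device’s pointer. If the observer forms beliefs about the z-spin of the electron based on the results of the experiment (as it seems reasonable to presume), then the state of the system that now contains the observer will be the following: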

(5) $\tfrac{1}{\sqrt{2}}\left(|\uparrow_z\rangle_e\,|\text{“}\uparrow\text{”}_z\rangle_m\,|\text{“}\uparrow\text{”}_z\rangle_o\,|\text{BEL “}\uparrow\text{”}_z\rangle_o + |\downarrow_z\rangle_e\,|\text{“}\downarrow\text{”}_z\rangle_m\,|\text{“}\downarrow\text{”}_z\rangle_o\,|\text{BEL “}\downarrow\text{”}_z\rangle_o\right)$

Here “$|\text{BEL “}\uparrow\text{”}_z\rangle_o$” expresses that the observer believes that the electron is z-spin up. This state of the system implies that our observer is in a superposition of belief states (Albert and Loewer, 1988: 197). But our observer doesn’t feel like he is in a superposition of belief states. He feels like he has a definite result for the z-spin measurement of the electron. David Albert and Barry Loewer set out to produce an interpretation of Everett’s pure wave mechanics that “explains how it is that we always ‘see’ (mistakenly so . . .) macroscopic objects as not being in superpositions and never experience ourselves as in superpositions” (Albert and Loewer, 1988: 203). They propose that it is a function of the evolution of one’s mental state that explains one’s experiences (Albert and Loewer, 1988; Albert 1992).

Albert and Loewer begin by supposing that mental states supervene on particular brain states and are accurately accessible to introspection (204). They take it that the state of our observer believing that he read “z-spin up” is identical with a physical brain state that is distinct from the physical brain state that is associated with our observer believing he read “z-spin down” (204). Albert and Loewer do not explain this supervening relation; they merely state it as a given fact. But, they argue, this fact is inconsistent with taking our observer to be reliable and with his holding the belief that a determinate measurement record was obtained when he measured the z-spin of an x-spin up electron. This is because, in the state expressed by (5), our observer will answer “yes” when asked, “Does the electron have a determinate z-spin?” since it does have a determinate z-spin in each term. But our observer does not believe that the electron is z-spin up, nor does he believe that it is z-spin down, since his brain is not in the state that corresponds to either of those beliefs; his brain is in a superposition of states (Albert and Loewer, 1988: 204). So Albert and Loewer give up the connection between brain states and belief states. They accept a “modest . . . non-physicalism” (205).

The first way they explain this is with what they call the single mind view. This view adds to quantum theory the principle that the evolution of an observer’s mental states is probabilistic. In the state expressed by (3), our observer starts out with no beliefs about the z-spin of the electron. But when he measures the z-spin, the probability is 50% that he ends up believing it is z-spin up and 50% that he ends up believing it is z-spin down. The association of belief states with physical states is dictated by probability and is determined by the quantum evolution of the system that consists of him, the electron and the measuring device (Albert and Loewer, 1988: 205–206). Thus, mental states are never in superpositions, even if physical brain states are.
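The single mind view’s extra principle can be sketched as a sampling rule layered on top of an unchanged physical state. This is a rough illustration, not Albert and Loewer’s own formalism: the dictionary encoding of state (5), the belief labels, and the function name single_mind_belief are all assumptions introduced here.

```python
import random

def single_mind_belief(state, rng):
    """Pick the one belief the mind ends up with, with probability equal to
    the squared norm of the matching term. The physical state is not modified."""
    beliefs = list(state)
    weights = [abs(amp) ** 2 for amp in state.values()]
    return rng.choices(beliefs, weights=weights, k=1)[0]

# Terms of state (5), keyed by the belief each term contains.
state_5 = {
    "believes z-spin up": 2 ** -0.5,
    "believes z-spin down": 2 ** -0.5,
}

# The seed is fixed only so the example is reproducible.
belief = single_mind_belief(state_5, random.Random(0))
```

After the call, state_5 still has both terms, so the brain remains in a superposition, while belief holds exactly one of the two labels. Which term the mind is associated with is recorded nowhere in state_5 itself, which is the feature the objections below turn on.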

Albert and Loewer immediately recognized certain problems with this view (1988: 206). The dualism this implies is particularly problematic since it implies that mental states do not supervene on brain states, or any physical states in general, since “one cannot tell from the state of a brain what its single mind believes” (206). Additionally, in the superposition that describes the state of the system in question, all the terms but one will represent “mindless brains” (often referred to as “mindless hulks”), and which one represents a mind is impossible to determine from the quantum formalism or from experiment. Jeffrey Barrett describes this problem nicely (1999). If two observers are measuring the z-spin of our x-spin up electron, then the system of the observers, the measuring devices and the electron will evolve from this state

(6) |o1>ready|o2>ready 1/√2(|↑z>e |“↑”z>m + |↓z>e|“↓”z>m)

to this one

(7) 1/√2(|↑z>e |“↑”z>m |“↑”z>o1|“↑”z>o2 + |↓z>e|“↓”z>m|“↓”z>o1|“↓”z>o2).

There is nothing in the dynamics proposed by Albert and Loewer that prevents the mental states of each observer from being associated with the same term or from being associated with different terms. But there is nothing that tells us which is the case. In each term of (7) there is a physical brain state for each observer, but with only one is there a mental state for each observer. Thus, it is possible that observer 1 is associated with the first term of (7) and observer 2 is associated with the second, but that neither of them realizes it. As such, observer 1 will (correctly) remember having gotten “z-spin up” as the result of her measurement but (incorrectly) remember that her friend got the same result. The same will hold true, mutatis mutandis, for observer 2. There is no way to determine whether or not one is speaking to a mindless hulk rather than a set of physical states with which there is associated a mental state (Barrett, 1999: 189–190).
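
As a rough illustration of Barrett’s point, the following Python sketch enumerates the ways in which two single minds could be associated with the two terms of (7). It assumes, purely for illustration, that each observer’s mind is assigned to a term independently and with the norm-squared weights; the dynamics described above does not say whether the assignments are correlated. The names TERM_WEIGHTS, joint_mind_distribution and mismatch_probability are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Squared norms of the two terms in state (7); each coefficient is 1/sqrt(2).
TERM_WEIGHTS = {"up": Fraction(1, 2), "down": Fraction(1, 2)}


def joint_mind_distribution(weights):
    """Probability of each (observer 1 term, observer 2 term) pairing.

    ASSUMPTION: the two minds are assigned to terms independently.
    This is an illustrative choice, not part of the single mind view itself.
    """
    return {
        (t1, t2): weights[t1] * weights[t2]
        for t1, t2 in product(weights, repeat=2)
    }


def mismatch_probability(weights):
    """Chance that the two minds end up associated with different terms."""
    joint = joint_mind_distribution(weights)
    return sum(p for (t1, t2), p in joint.items() if t1 != t2)


if __name__ == "__main__":
    for pairing, p in joint_mind_distribution(TERM_WEIGHTS).items():
        print(pairing, p)
    print("mismatch:", mismatch_probability(TERM_WEIGHTS))
```

Under that independence assumption, the two minds are associated with different terms half of the time, which is exactly the situation in which each observer misremembers her friend’s result.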

Barrett also points out that the single minds view predicts that when an observer repeats a measurement she may get a different result from what she first got and falsely remember having gotten a first measurement that matches her second (1999: 187–188). So this observer is in a position where she cannot trust that what she remembers is what actually happened. Here is why. Consider again the state represented in (4). If our observer were to repeat her measurement of the z-spin of the electron, then the second measurement would be identical to the first. The state would then be

(8) 1/√2(|↑z>e |“↑”z“↑”z>m |“↑”z“↑”z>o + |↓z>e |“↓”z“↓”z>m |“↓”z“↓”z>o).

Even if our observer ended up in the state in which she read “z-spin up” for her first measurement, there is still a 50/50 chance that her mental state will be associated with each term in (8). Thus, there is a 50% chance that her mental state will evolve in such a way that she (correctly) believes that the results of her two measurements agreed, but she will (incorrectly) believe that she got “z-spin down” each time.

To avoid some of these problems, Albert and Loewer propose what they call the many minds view. This view takes it that every observer is associated with an infinite set of minds. This immediately solves the mindless brains problem since there are minds associated with each term in the state of a system that includes an observer. It also solves the problem of dualism in the sense that “mental states are determined by or supervenient on brain (or brain + environment) states” (206), though the minds are still non-physical in that they are not subject to the rules of quantum mechanics for their evolution. This is a benefit of the many minds view since mental states will never (in accordance with our experience) be in a superposition. But this benefit leads to an additional difficulty in that it retains a certain element of dualism. The only mental state that does any supervening on an observer’s physical state is what might be called the “global” mental state. This is the state that is associated with the entire set of minds. It is the only thing that evolves according to quantum mechanical dynamics. Importantly, what might be called the “local” state, the state to which an observer has direct introspective access, does not so supervene. If it did, it would also evolve according to the dynamics, but Albert and Loewer take it as a benefit to the many minds view that it does not. (For more on this objection see Barrett, 1999: 194–196 and Lockwood, 1996: 174.)

Determinate experience is just one part of the dilemma facing any interpretation of Everett. The other is probability. The many minds view interprets the norm squared of the coefficient on each term as giving the proportion of minds associated with that term in the state of the system. In this view, probabilities “refer . . . to sequences of states of individual minds” (208). In (4) our observer is in a brain state that has a mind associated with it that has no belief about the z-spin of the electron, but she can predict what the probability is of ending up with a mind that believes she observed z-spin up or z-spin down; she can predict what the sequence of her mental states will be, according to the norm squared of the coefficients. Knowing that (3) evolves into (4) after a measurement, and accepting the reasonable claim that beliefs are formed when the observer looks at the measuring device, our observer knows that half her minds will believe that she measured the electron as being z-spin up and half will believe that she measured it as z-spin down.

It is debatable whether Albert and Loewer’s many minds view succeeds in answering the questions of determinate experience and probability that come with any interpretation of Everett and whether the dualism that is inherent in their view is a price worth paying. Michael Lockwood (1996) believed that it was not a price worth paying and so worked to develop a competing many minds view that did not suffer this problem.

Lockwood’s version of many minds argues that “associated with a sentient being at any given time, there is a multiplicity of distinct conscious points of view . . .  it is these conscious points of view or ‘minds’ . . . that are to be conceived as literally dividing or differentiating over time” (1996: 170). For him there is a sense in which our observer can regard herself as having just one mind. He calls this her “multimind” or “Mind,” and it consists of all the minds (lower-case “m”) that are described in the terms of the state of the system of which the observer is a part (1996: 177). Each mind has a “maximal experience” that describes its complete state of consciousness, but it should not be identified with a state of the Mind of the observer. Lockwood takes it that there is “complete supervenience of the mental on the physical” and so he avoids the dualism that plagues Albert and Loewer’s version (1996: 184). To do this, though, he agrees with Albert and Loewer that one must give up “the assumption that . . . there is a uniquely correct way of linking earlier and later maximal experiences of the same Mind together to form persisting minds . . .” but he does not take that to be fatal to his project (1996: 183–184). For Lockwood, each mind makes up one subset of the Mind and “each stands in an equal relation of succession to the given . . . maximal experience . . . with which we started” (Lockwood, 1996: 183). To go back to our example, there is a mind associated with each term in (4), and these go to make up the Mind of the observer. Each of these minds has equal claim to be the successor to the mind in (3) that was in the ready state. So it is unclear who the observer in (3) should expect to be once the measurement is made.

Diachronic personal identity leads to the problem of how to interpret probabilities from this viewpoint. Lockwood believes that he has a meaningful interpretation of probability as “a naturally preferred measure on sets of simultaneous maximal experiences” (1996: 182). He wants to analogize this to duration so that in analogy to “this pain lasted twice as long as the last one” he can say something like, “this pain is, superpositionally speaking, twice as extensive as that” (182; Lockwood’s emphasis). The extensiveness of an experience is a function of its “temporal ‘length’ and of superpositional ‘width’” in the sense that it has a higher coefficient associated with its term (182). It is largely agreed that Lockwood does not have a sufficient explanation for probability. Loewer (1996) and Barrett (1999) both argue that Lockwood’s conception of probability is insufficient for explaining how we are to take probabilities to be guides to rational decision making and prediction. Lockwood cannot explain how the norm squared of the coefficient ought to guide an observer’s predictions about what her experience will be in the future since there is no way to track her over time. In our example above, when we ask what the probability will be of our observer in (3) seeing “z-spin up” upon measurement, we have to say it will be 1. The same is true for “z-spin down,” since in some term there is a mind that goes to make up the Mind of the observer that observes each measurement result.

Barrett (1999) argues that aside from this problem, Lockwood has a “very unusual notion of probability in mind” (209). He wants to “introduce a (sic) entirely new notion of probability,” and one that Barrett finds puzzling and insufficient for explaining an observer’s determinate experiences (209–210).

There are several other interpretations of Everett that are in the same vein as the single minds and many minds views of Albert and Loewer and Lockwood, but none of them are quite as fully worked out as those presented here. The interested reader may want to look at Zeh 1981, Squires 1987, Donald 1990, Squires 1991, Stapp 1991, Squires 1993, Donald 1993, Donald 1995, and Page 1995.

c. Everett Plus Worlds (DeWitt’s Splitting Worlds)

What is arguably the most common interpretation of Everett’s formulation is what Bryce DeWitt called the many-worlds interpretation [MWI] (DeWitt 1968 and Wheeler 1998: 269–70). In his 1967 lecture on what he calls the “Everett-Wheeler interpretation” of quantum mechanics, DeWitt takes Everett’s claim that:

. . . with each succeeding observation (or interaction), the observer state “branches” into a number of different outcomes of the measurement . . . for the object-system state.  All branches exist simultaneously in the superposition after any given sequence of observations (Everett 1957a: 25–6; Everett 1957b: 146).

to imply that we are forced:

. . . to believe in the ‘reality’ of all the simultaneous ‘worlds’ represented in the superposition [in which we find the universe after a measurement interaction] . . . in each of which the measurement has yielded a different outcome (DeWitt 1968: 326).

The first published version of Everett’s dissertation began to gain popularity when it was reprinted in The Many Worlds Interpretation of Quantum Mechanics (DeWitt and Graham 1973).  In his papers in this volume, DeWitt explicitly refers to the “reality of all simultaneous worlds” that he believes are implied by his reading of Everett and to the “reality composed of many worlds” that he takes Everett’s formalism to have taught us is true about our universe (DeWitt 1970: 161, DeWitt 1971: 182). So from this point on, Everett’s interpretation has most commonly been known as the “many worlds interpretation” of quantum mechanics.

The branching that occurs in DeWitt’s MWI can be interpreted in several different ways. One possibility is DeWitt’s way, which suggests that:

[o]ur universe must be viewed as constantly splitting into a stupendous number of branches, all resulting from the measurementlike interactions between its myriads of components . . . every quantum transition taking place on every star, in every galaxy, in every remote corner of the universe is splitting our local world on earth into myriads of copies of itself (DeWitt 1970: 178).

DeWitt takes a strong realist position in regards to the worlds that are the result of the branches splitting.  He takes each branch to be “a possible universe-as-we-actually-see-it” (DeWitt 1970: 163) and believes that in spite of the fact that “all branches must be regarded as equally real” (DeWitt 1970: 178; see also Everett 1957b: note added in proof), we inhabit only one of the worlds that go to make up reality and we have no access to other worlds (DeWitt 1970: 182).

In order to see how this fares with regards to the question of determinate measurement results, let us make a distinction between a local I, IL, and a global I, IG. IL is the self who experiences determinate measurement results when conducting quantum mechanical experiments; IL sees only one outcome of every experiment. IL is who we generally consider ourselves to be. So when DeWitt says that we inhabit only one world of the many possible worlds, we can take him to be referring to weL as a collection of ILs. We then can understand him to believe that there will be an IL for each world that is created in the split. In contrast, there is always only one IG. We can think of the IG as the self, were it to exist, which has access to all the worlds. IG can be thought of as someone outside the theory who can see the branching structure of the world and who knows every outcome of quantum mechanical experiments, someone who has a “god’s-eye view” of the universe as a whole. We as ILs do not seem to have access to the experiences of IG. Thus, the only thing that needs to be explained in terms of determinate measurement results is the perspective of IL since this is the only perspective we seem to have, and according to Everett, it is the only perspective that is of any importance to an observer (Everett 1973: 99).

Everett argues that relative states provide a way to understand an observerL’s determinate measurement record. The only place where the indeterminacy shows up is in the global perspective of a state; the local, relative perspective will always be determinate. Everett argues that an observer will never have access to the global state and therefore will always and only have determinate measurement results relative to his state (Everett 1973).

DeWitt’s branching worlds follow along the same line. For him, there are many different branches of the universe and each branch splits every time there is an observation or correlation made. On each branch is an observerL who will split every time the branch does. Each observerL gets a determinate record of whatever outcome occurs on his branch, because that is exactly what happened on that branch.

i. Problems with the Notion of Splitting

However, objections have been raised to DeWitt’s proposal that we take each branch to be “a possible universe-as-we-actually-see-it.” The main objection has been that the idea that we in some way split or branch seems absurd given our experience in the world (Everett 1957c: 2; DeWitt 1971: 179; DeWitt and Graham 1973: 161; Xavier 1962: Tuesday Morning, p. 20).

Additionally, one might object to what turns out to be an infinite (possibly uncountable) number of worlds (Healey 1984; Saunders 1997). Some early many-world interpreters have taken Everett to be implying such a profligacy of worlds when he writes: “[a]ll branches exist simultaneously in the superposition after any given sequence of observations” (Everett 1957b: 146), and so “[f]rom the viewpoint of the theory all elements of a superposition (all ‘branches’) are ‘actual’, none any more ‘real’ than the rest” (Everett 1957b: note added in proof; Everett’s emphasis).

There appear to be two distinct issues raised.  The first is that a multitude of worlds, always splitting, seems to defy our sense that we are not splitting.  The second is that it seems to run counter to intuition to propose that there are multiple copies of the world, and of people, that constitute the universe.  Let us see how these objections are addressed one at a time as doing so will shed light on Everett’s thoughts about the relative state formulation.

The first objection, that we do not feel the splitting, is addressed by Everett both in his response to DeWitt’s letter in 1957 (Everett 1957c) and ultimately in the short dissertation (Everett 1957b).  In the letter to DeWitt he writes:

I must confess that I do not see this “branching process” as the “vast contradiction” that you do.  The theory is in full accord with our experience (at least insofar as ordinary quantum mechanics is).  It is in full accord just because it is possible to show that no observer would ever be aware of any “branching,” which is alien to our experience as you point out.

The whole issue of the “transition from the possible to the actual” is taken care of in a very simple way – there is no such transition, nor is such a transition necessary for the theory to be in accord with our experience.

From the viewpoint of the theory, all elements of a superposition (all “branches”) are “actual,” none any more “real” than another. It is completely unnecessary to suppose that after an observation somehow one element of the final superposition is selected to be awarded with a mysterious quality called “reality” and the others condemned to oblivion – they won’t cause any trouble anyway because all the separate elements of the superposition (“branches”) individually obey the wave equation with complete indifference to the presence or absence (“actuality” or not) of any other elements.

This is only to say that the theory manages to avoid the difficulty of the “transition from possible to actual” – and I consider this to be not a weakness, but rather a great strength of the theory.  The theory is isomorphic with experience when one takes the trouble to see what the theory itself says our experience will be.  Little more can be asked of it without exposing a naked philosophic prejudice of one kind or another (Everett 1957c: 3; Everett’s emphasis).

In the short dissertation this becomes:

Arguments that the world picture presented by this theory is contradicted by experience, because we are unaware of any branching process, are like the criticism of the Copernican theory that the mobility of the earth as a real physical fact is incompatible with the common sense interpretation of nature because we feel no such motion.  In both cases the argument fails when it is shown that the theory itself predicts that our experience will be what it in fact is.  (In the Copernican case the addition of Newtonian physics was required to be able to show that the earth’s inhabitants would be unaware of any motion of the earth.) (Everett 1957b: note added in proof).

Everett had no problem with an observerL not feeling the split because his theory predicts that the splitting of worlds is not something of which IL could be aware.  An observerL will never have access to the global state of a system and so will never be able to observe anything but the state of his branch.  It requires a perspectiveG to observe any branching event.  Since there can never be any observation of a branching event, there can never be a physical record (memory) of the event.

The second objection comes from those who want the most economical metaphysics possible. Such philosophers might argue that to have an infinite number of worlds after a split is metaphysically extravagant. While some theorists who first encounter this idea initially balk at it, ultimately most do not find anything terribly objectionable in it. In fact, most MWI theorists embrace the notion of a multitude of existing worlds. If the MWI as proposed by DeWitt solves the quantum measurement problem, then this metaphysical extravagance may be worth the cost.

For more on the story of the evolution from the long to the short thesis, and on Everett’s life, see Byrne 2010.

ii. The Preferred Basis Problem

DeWitt’s MWI fails to provide us with any explanation of when worlds split.  Unfortunately, saying when they do is just as difficult as saying when a collapse occurs, which is one way to understand the quantum measurement problem. To determine when a world splits, we would first need to know in which basis we should write the universal state. If we knew this, then we would know that a split has occurred because a new term would show up in the global state when it is written in the preferred basis.  The choice of basis also determines which properties in the universe are determinate—namely, those that are represented by vectors that are on the axes of the basis—and which worlds exist after a measurement interaction. DeWitt does not provide any way to choose a particular basis, and any way that we might suggest in the context of his MWI would be blatantly ad hoc.  This problem has come to be known as the preferred basis problem.

Even though DeWitt’s MWI seems to provide us with an explanation of why we get determinate measurement records, that explanation assumes that there is a basis in which the universal state has been written that guarantees that those measurement records are in fact determinate.  In order to be able to assert the determinateness of records in a particular branch, the basis that we choose ought to be one in which those records are determinate.  But the basis in which the pointer on our measuring device has a determinate position and the basis in which our mental states are determinate are not necessarily the same.  It is not clear that we can choose a basis in which they will both be determinate, not to mention all the other things that we want to have as determinate properties in order to be conscious beings capable of successfully completing a quantum mechanical experiment.

Without a preferred basis in which to write the universal state, one is free to choose any basis one likes with the result being that in different bases there will be different decompositions of the universal state vector.  If each term in the expansion of the universal state is taken to represent a different possible world, or a different description of a world, then with each choice of basis there will be different terms and so different worlds. Thus, without a preferred basis, there is no fact of the matter as to when splits occur, no fact of the matter as to which properties in the universe are determinate and no fact of the matter as to which worlds go to make up the universe; rather, these are all a function of the choice of basis.
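
The dependence of the decomposition on the choice of basis can be seen in a deliberately small case. The Python sketch below expands a single spin state, the x-spin up state, in the z-spin basis and in the x-spin basis, and counts the nonzero terms in each expansion. Treating each nonzero term as a “world” is the naive reading under discussion here, adopted only for illustration, and the helper names expand and count_terms are hypothetical.

```python
import math


def expand(state, basis):
    """Coefficients of a real two-component state in an orthonormal basis.

    Both the state and the basis vectors are given by their z-basis components.
    """
    return [sum(b_i * s_i for b_i, s_i in zip(vec, state)) for vec in basis]


def count_terms(coeffs, tol=1e-12):
    """Number of terms with a non-negligible coefficient."""
    return sum(1 for c in coeffs if abs(c) > tol)


r = 1 / math.sqrt(2)
Z_BASIS = [(1.0, 0.0), (0.0, 1.0)]  # z-spin up, z-spin down
X_BASIS = [(r, r), (r, -r)]         # x-spin up, x-spin down

x_spin_up = (r, r)  # the x-spin up state written in z-basis components

if __name__ == "__main__":
    print("terms in z basis:", count_terms(expand(x_spin_up, Z_BASIS)))
    print("terms in x basis:", count_terms(expand(x_spin_up, X_BASIS)))
```

The same vector has two terms in one basis and a single term in the other, so a rule that reads worlds off the terms of an expansion gives different answers until a basis has been fixed.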

So we are left without answers to several questions: What worlds are there?  Which properties are determinate?  When do worlds split?  Solving this is as difficult as solving the original quantum measurement problem and in fact is a version of it (Barrett 1999: 176).  Thus, no progress has been made toward solving the measurement problem, insofar as this is a problem that could be solved by saying when collapses or splits occur.  So the cost of DeWitt’s MWI is the reintroduction of the measurement problem, something Everett’s pure wave mechanics was developed to avoid.  Fortunately for those sympathetic to Everett’s ideas, quantum mechanics without the collapse postulate has evolved in the decades since DeWitt’s writing.  In what follows we will see how the concept of “many worlds” has been developed within the interpretation of Everett’s physics.

4. The Evolution of Many Worlds Interpretations

a. The Problem of Probability and Graham’s Attempt at Adding a Measure

Something that the standard interpretation of quantum mechanics provides that is missing in pure wave mechanics is a way to account for probabilistic claims. In any MWI of Everett’s relative state formulation, there is in some sense a different world in which every possible outcome of an experiment occurs; every world is real, and therefore every outcome occurs.  We lose the ability to say, “Event e happens with probability p” where p is less than 1. Everett writes, “In order to establish quantitative results, we must put some sort of measure (weighting) on the elements of the final superposition” (Everett 1957b: 147).

In the standard interpretation, this is done with the use of the Born rule (Born 1926). The Born rule is what physicists use to assign probabilities to the outcomes of quantum mechanical experiments. For each term in the state of the system, written in some basis, there is associated with it a complex number, the amplitude of that region of the wave function. The Born rule says to take that number and square its norm in order to get the probability of that term being the outcome of a measurement. But in EQM, a deterministic theory in which everything happens, the Born rule does not prima facie seem to be applicable.
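
The rule just described can be written out in a few lines. The Python sketch below takes a state given as a mapping from outcome labels to complex amplitudes and returns the squared norm of each amplitude; the function name born_probabilities and the example amplitudes are illustrative assumptions, not drawn from any of the works cited.

```python
import math


def born_probabilities(amplitudes):
    """Map each outcome label to the squared norm of its complex amplitude."""
    total = sum(abs(a) ** 2 for a in amplitudes.values())
    if not math.isclose(total, 1.0, rel_tol=1e-9):
        raise ValueError("state is not normalized")
    return {label: abs(a) ** 2 for label, a in amplitudes.items()}


# Hypothetical example: equal weights, with a relative phase on one term.
state = {"z-spin up": 1 / math.sqrt(2), "z-spin down": 1j / math.sqrt(2)}

if __name__ == "__main__":
    for outcome, prob in born_probabilities(state).items():
        print(outcome, round(prob, 3))
```

The relative phase on the second amplitude makes no difference to the result, since only the norm of each coefficient enters the rule.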

To be able to derive probability from within his theory, Everett first needed to define what he meant by a “typical element of the final superposition” since “typical” presupposes some notion of probability. Presumably he meant an element in which the predictions of quantum mechanics are borne out. But if he determined what is “typical” by counting up all the branches, taking there to be one world for each term in the expansion of the universal state when it is written in the preferred basis, and calling a world “typical” when it displays the same results as most of the others, then there is a problem because in a large majority of the worlds, by this measure, the quantum statistics will not even be close to true (Graham 1973: 235).

Neill Graham was the first to suggest that there was something missing from Everett’s derivation of a probability measure. He writes that the worlds that display the proper relative frequency are “in a numerical minority” and “any attempt to show that the probability interpretation holds in the majority of the resulting Everett worlds is doomed to failure” (Graham 1973: 236). (See also Barrett 1999: 168–70 for a very clear explanation of why the statistics fail to work for any coefficient other than 1/√2.) But Everett does not suggest that we use a counting measure; he believes that he has shown that the “only choice of measure” is the square amplitude measure (Everett 1957b: 147). A typical world is then one for which the value of the square amplitude measure is high. But Everett’s use of this measure is not a solution to the original problem. The original problem was to derive probability from a deterministic theory in which everything happens. To solve this one needs to add the assumption that after a measurement one should expect to find oneself in a “typical” world, that typicality being determined by the norm-square of the coefficient on the term that describes that world; the higher the value, the more “typical” the world. But this is akin to claiming that the Born rule holds. (For more on this see Barrett 1999: 168–173, Wallace 2007 and 2012.)
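
The contrast between counting branches and weighting them by squared amplitude can be made concrete with a small calculation. The Python sketch below assumes n repeated measurements on identically prepared systems, one branch per sequence of outcomes, and a Born weight p_up for “up” on each trial. It then compares the fraction of branches, by simple count, whose relative frequency of “up” lies within eps of p_up with the total squared-amplitude weight of those same branches. The function name branch_fractions and the numbers used are illustrative assumptions rather than Graham’s or Barrett’s own calculation.

```python
from math import comb


def branch_fractions(n, p_up, eps):
    """Return (count_fraction, born_weight) for the 'well-behaved' branches.

    A branch is a sequence of n outcomes. It counts as well-behaved when its
    relative frequency of 'up' lies within eps of p_up.
    count_fraction: share of all 2**n branches, each branch counted once.
    born_weight: total squared-amplitude weight of the same branches.
    """
    ks = [k for k in range(n + 1) if abs(k / n - p_up) <= eps]
    count_fraction = sum(comb(n, k) for k in ks) / 2 ** n
    born_weight = sum(
        comb(n, k) * p_up ** k * (1 - p_up) ** (n - k) for k in ks
    )
    return count_fraction, born_weight


if __name__ == "__main__":
    for p in (0.5, 0.9):
        counted, weighted = branch_fractions(100, p, 0.055)
        print(f"p_up={p}: by count {counted:.3g}, by weight {weighted:.3g}")
```

With equal weights the two measures agree. With p_up = 0.9, the branches showing roughly the right statistics are a vanishing fraction by count, yet they carry most of the squared-amplitude weight, which is why the choice of measure cannot simply be read off the branching structure.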

Early 21st century MWI theorists take a very different view of the multiplicity of the world, thereby solving some of the problems inherent in earlier MWIs. This view is in line with what David Wallace has proposed, namely that the multiple branches of the universe that arise from the mathematical theory of quantum mechanics are emergent (Wallace 2012). Because most of those who have developed and are proponents of Wallace’s emergent branching universe view are or have been located around Oxford, we will call this view the “Oxford MWI.”

b. The Oxford MWI

The main difference between Wallace’s emergent branching universe view (the Oxford MWI) and the MWIs that came before is that prior MWI theorists, in large part, understood the wave function to be a real entity, leading to a real multiplicity of worlds at the fundamental level of the theory. (For more on the question of realism regarding the wave function, see Ney and Albert 2013.) Wallace, on the other hand, sees these worlds as emergent from the underlying microphysical description of the universe. They are no less real, but they are structural facts that are instantiated within EQM (Wallace 2012: Chapter 2). Wallace explains how he conceptualizes these structures with what he calls “Dennett’s criterion”: “A macro-object is a pattern, and the existence of a pattern as a real thing depends on the usefulness – in particular, the explanatory power and predictive reliability – of theories which admit that pattern in their ontology” (Wallace 2010a: 58, Wallace 2012: 50).

Wallace asks us to consider a tiger and its hunting patterns. We can describe both in terms of electrons and atoms, but in that description we do not see the patterns that emerge when we consider them at the macroscopic level. The tiger atoms and the swirl of atoms and energy that make up the hunting patterns are real, objective parts of the microphysical system, but they are not practically useful for predicting how tigers behave in the wild. Zoology cannot be reduced to physics. Rather, physics instantiates zoology. Wallace illustrates the instantiation relation with the example of the relation between quantum mechanics and a classical conception of the solar system. Classical mechanics is instantiated by quantum mechanics and is applicable to the solar system because some of the solar system’s properties “approximately instantiate a classical-mechanical dynamical system” (Wallace 2012: 56).

Applying these considerations to EQM, one can say that the microphysical description of the universe contains descriptions of states of affairs that are structured like the macroscopic objects we encounter in the world. When two of these states are superposed with one another, the quantum state of the system instantiates two different macroscopic systems at once. Thus, one can say that two different worlds emerge from the microphysical-level description that instantiates them. Wallace writes that “there are entities whose existence is entailed by the theory which deserve the name ‘worlds’” (Wallace 2010a: 68); thus we should take these worlds to be real entities.

Wallace considers decoherence the only consideration that one should use to help determine how the universe ought to be carved up. A system is said to “decohere” when it becomes correlated with something in its environment and thereby destroys the interference effects that would otherwise have been present if the system were in a pure state, the state in which one finds an entangled system. Decoherence theorists take the destruction of the interference effects to be all that is required to explain determinate experience. Because such correlations are ubiquitous and radically swift, and because it is, in practice, impossible to isolate a macroscopic system from its environment sufficiently to prevent such correlations (even the best isolated system, if it is above absolute zero, will radiate heat and therefore interact with its environment), these correlations will produce results that seem to indicate that a collapse has occurred and not that the system is still in a superposition. The property that decoherence picks out is very close to position when one is considering an experiment that requires us to see the position of a pointer on a measuring device (as pointing either “up” or “down,” for example). (For more on decoherence see the original formulations of it in Zeh 1970, Zeh 1973, Zurek 1981, and Zurek 1982. In Zurek 1991, Zeh 1995, Giulini et al 1996, Butterfield 2001, Zurek 2002 and Schlosshauer 2007 the reader can find very accessible introductions to and discussions of decoherence.)
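
A toy model may help to show what the destruction of interference effects amounts to formally. The Python sketch below assumes a spin that has become correlated with an environment, so that the joint state is a|up>|env_up> + b|down>|env_down>, and computes the reduced density matrix of the spin alone. The off-diagonal entry, which carries the interference between the two terms, is proportional to the overlap of the two environment states. The two-component environment vectors and the function names are hypothetical simplifications and are not taken from the decoherence literature cited above.

```python
import math


def inner(u, v):
    """Inner product <u|v> of two complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


def reduced_density_matrix(a, b, env_up, env_down):
    """Reduced state of the spin for a|up>|env_up> + b|down>|env_down>.

    ASSUMPTION: env_up and env_down are normalized environment states.
    The off-diagonal entry is a * conj(b) * <env_down|env_up>.
    """
    off_diag = a * b.conjugate() * inner(env_down, env_up)
    return [
        [abs(a) ** 2, off_diag],
        [off_diag.conjugate(), abs(b) ** 2],
    ]


r = 1 / math.sqrt(2)

# Environment has not distinguished the two terms: interference term intact.
uncorrelated = reduced_density_matrix(r, r, (1.0, 0.0), (1.0, 0.0))

# Environment states are orthogonal: interference term vanishes.
decohered = reduced_density_matrix(r, r, (1.0, 0.0), (0.0, 1.0))

if __name__ == "__main__":
    print("interference term before:", abs(uncorrelated[0][1]))
    print("interference term after: ", abs(decohered[0][1]))
```

In both cases the diagonal entries are unchanged; only the interference term is suppressed. The global state is still a superposition, which is why decoherence on its own is compatible with the no-collapse picture described here.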

Given that there is no preferred way to carve up the universe, aside from decoherence considerations, and no particular “most-fine-grained” way to describe the quantum structure of the universe, there is also no fact of the matter about how many branches there are, but Wallace does not see this as a problem. Rather he sees it as misguided to even ask the question “How many worlds are there?” much as it is misguided to ask “How many experiences did you have yesterday?” (Wallace 2010a: 67–8, Wallace 2012: 99–102, 120).

Recall that part of the job of an Everett interpretation is to explain how it is that we get determinate measurement records when we do quantum mechanics experiments. Wallace argues that “the emergence of a classical world from quantum mechanics is to be understood in terms of the emergence from the theory of certain sorts of structures and patterns” (2003: 5), and it is these structures that are in superpositions, not the emergent macroscopic objects. To see what he means here, it is worth quoting him at some length:

To see in a different way how the ideas of Sections 4-5 resolve the problem of macroscopic indefiniteness, consider the following sketch of the problem.

  1. After the experiment, there is a linear superposition of a live cat and a dead cat.
  2. Therefore, after the experiment the cat is in a linear superposition of being alive and being dead.
  3. Therefore, the macroscopic state of the cat is indefinite.
  4. This is either meaningless or refuted by experiment.

But (1) does not imply (2). The belief that it does is based upon an oversimplified view of the quantum formalism, in which there is a Hilbert space of cat states such that any vector in the space is a possible state of the cat. This is superficially plausible in view of the way that we treat microscopic subsystems: an electron or proton, for instance, is certainly understood this way, and any superposition of electron states is another electron state. But any state of a cat is actually a member of a Hilbert space containing states representing all possible macroscopic objects made out of the cat’s subatomic constituents. Because of Dennett’s criterion, this includes states which describe

a live cat;
a dead cat;
a dead dog;
this paper . . .

We can say (if we want, and within nonrelativistic quantum mechanics) that the particles which used to make up the cat are now jointly in a linear superposition of being a live cat and being a dead cat. But cats themselves are not the sort of things which can be in superpositions. Cats are by definition “patterns which behave like cats”, and there are definitely two such patterns in the superposition.

The point can be made more generally:

It makes sense to consider a superposition of patterns, but it is just meaningless to speak of a given pattern as being in a superposition (Wallace 2003: 12; Wallace’s emphases).

In the Oxford MWI, decoherence gives rise to the branching structures that make up the various worlds. And it is these structures that give rise to the patterns from which macroscopic objects emerge. Decoherence also causes the interference between branches to “wash out,” and so systems appear to have determinate values in the decoherence basis. (For more on the Oxford MWI see Albert 2010, Kent 2010, Maudlin 2010, Price 2010, Vaidman 2014 and Bacciagaluppi and Ismael 2015.)

There has been quite a bit of work on developing a theory of probability in the context of the Oxford MWI (Saunders 1995, Saunders 1998, Vaidman 1998, Wallace 2002, Wallace September 2003, Saunders 2005, Wallace 2006, Greaves January 2007, Wallace 2007, Greaves and Myrvold 2010, Saunders 2010, Wallace 2010b, Tappenden 2011, Vaidman 2012, Wallace 2012, Wilson 2013). The work generally focuses on two different concerns in probability theory: explaining how one can recover uncertainty from a deterministic world, and explaining why we seem to be able to take branch weights to be related to probability, as the Born rule suggests that we can in a standard collapse interpretation.

Simon Saunders presents a tripartite view of what role chance plays in standard single world views of probability:

(i) Chance is measured by statistics, and perhaps, among observable quantities, only statistics, but only with high chance. (ii) Chance is quantitatively linked to subjective degrees of belief, or credences: all else being equal, one who believes that the chance of E is p will set his credence in E equal to p (the so-called “principal principle”). (iii) Chance involves uncertainty; chance events, prior to their occurrence, are uncertain (Saunders 2010: 181).

Linking chance to statistics (as in (i)) was originally argued by Everett (1973), as we have seen above. But while this might explain certain aspects of probability, it does not explain why we ought to take branch weights to be probability. The arguments for (ii) and (iii) aim to answer this question.

Wallace and others have argued that we can make sense of probability in a theory in which all possible outcomes occur (what has been called the “incoherence problem” in Greaves 2004, Wallace 2005, Wallace 2006, Baker 2007, Lewis 2007, and Saunders and Wallace 2008a; and what has been alternatively termed “Subjective Certainty” by Greaves 2004, Baker 2007, Greaves 2007, Lewis 2007, and Greaves and Myrvold 2010). Wallace’s justification for the claim that we can make sense of probability in a deterministic universe is the emergent structure that we have discussed above. It is only at the fundamental level that EQM is deterministic; at the emergent level it is not (Wallace 2012: 115). Once this has been established, two other problems remain to be solved, which Wallace calls the “practical problem” and the “epistemic problem” (Wallace 2012: 158). The first of these is how to justify allowing branch weights to play the role in decision making that ordinary probability plays in a classical context ((ii), above). The second is how to justify taking branch weights to play the role of probability in showing that the results of experiments support quantum mechanics. Albert (2010) explains this general concern clearly when he writes, “Why (for example) should it come as a surprise, on a picture like [EQM], to see what we would ordinarily consider a low-probability string of experimental results? Why should such a result cast any doubt on the truth of this theory (as it does, in fact, cast doubt on quantum mechanics)?” (Albert 2010: 356).

The general principle of rationality on which Wallace’s argument is founded is David Lewis’ Principal Principle (Lewis 1980). In Wallace’s terminology: “For any real number x, a rational agent’s personal probability of an event E conditional on the objective probability of E being x, and on any other background information, is also x” (Wallace 2012: 140). Wallace’s goal is to show that there is an “Everett-specific derivation of the Principle” and,

to prove, rigorously and from general principles of rationality, that a rational agent, believing that (Everett-interpreted) quantum mechanics correctly gives the structure and dynamics of the world and that the quantum states of his branch is |ψ⟩, will act for all intents and purposes as if he ascribed probabilities in accordance with the Born Rule, as applied to |ψ⟩ (Wallace 2012: 150, 159).

To provide this rigorous proof, Wallace builds on an argument of David Deutsch’s (1997, 1999). He uses Deutsch’s results and furthers them to argue that branch weight non-circularly serves as objective probability. As the details of both Deutsch’s and Wallace’s arguments are quite technical, I leave it to the interested reader to investigate more fully. (More on the Deutsch-Wallace derivation of probability and refinements to their work can be found in Wallace 2002, Wallace September 2003, Greaves 2004, Saunders 2005, Greaves January 2007, Greaves March 2007, Wallace 2007, Greaves and Myrvold 2010.)
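What it would mean for an agent to act “as if he ascribed probabilities in accordance with the Born Rule” can be made concrete with a small sketch. This is not the Deutsch-Wallace derivation, which aims to justify the rule; it only illustrates the behavior the derivation is meant to license. The function names, the two-branch example, and the payoffs are all hypothetical.

```python
def branch_weights(amplitudes):
    """Born-rule weights |c_i|^2, normalised so that they sum to one."""
    squared = [abs(c) ** 2 for c in amplitudes]
    total = sum(squared)
    return [w / total for w in squared]


def expected_utility(amplitudes, payoffs):
    """Weight the payoff received on each branch by that branch's weight."""
    return sum(w * u for w, u in zip(branch_weights(amplitudes), payoffs))


def preferred_act(amplitudes, acts):
    """Return the name of the act with the highest weighted payoff."""
    return max(acts, key=lambda name: expected_utility(amplitudes, acts[name]))


if __name__ == "__main__":
    # Hypothetical measurement with an "up" branch and a "down" branch.
    amplitudes = [0.9 ** 0.5, 0.1 ** 0.5]
    # Each act lists its payoff on the "up" branch, then on the "down" branch.
    acts = {
        "bet_on_up": [10.0, -10.0],
        "bet_on_down": [-10.0, 10.0],
        "no_bet": [0.0, 0.0],
    }
    print(branch_weights(amplitudes))
    print(preferred_act(amplitudes, acts))
```

The sketch simply assumes that squared amplitudes are the right weights to use. Whether a rational agent in a branching universe is required to use them is exactly what the decision-theoretic program tries to establish and what its critics dispute.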

The arguments for (iii), that chance involves uncertainty, are the focus of quite a number of papers. Saunders believes that to have a fully worked out view of probability, one must explain uncertainty in EQM (2010: 189–90). A view called Subjective Uncertainty [SU] is described by Saunders (1998). There he argues that there is uncertainty even in the deterministic physics of Everettian quantum mechanics. He asks us to consider a pre-measurement observer at time t1. Call her she1. When she1 measures the x-spin of a z-spin particle, this results in a branching of her world and she1 ends up with two successors at time t2, each of whom is a “she2”: one who sees “up” as her measurement result, and one who sees “down” as her measurement result. The question is, “Who should she1 expect to become?”

There seem to be three possibilities: She can expect to become one of them, both of them or neither of them (Saunders 1998: 383). Saunders claims that it is nonsense to suggest that she1 should expect to become neither of them. In the emergent branching view, branching events are ubiquitous and yet we have the experience of continuing to move through time. She would not expect to become both of them because we do not have the experience of seeing two measurement results of our experiments. So the only remaining option is that she1 should expect to become one of her successors, but she does not know which one. This is the basis for the SU attitude about uncertainty. (For more on the link between uncertainty and probability in EQM see Ismael 2003, Greaves 2004, Wallace 2006, Baker 2007, Greaves January 2007, Greaves March 2007, Lewis 2007, and Wallace 2007.)

Saunders links uncertainty to questions of personal identity in EQM (2010). If an agent, Alice, does not know what branch she is on, that can account for a type of uncertainty in EQM. This implies that some of the questions of uncertainty, and therefore probability, will depend upon answers to the question of diachronic personal identity in the context of EQM. (For more on self-locating uncertainty and diachronic identity in EQM see Vaidman 1998, Wallace 2005, Lewis 2007, Lewis 2008, Saunders and Wallace 2008a, 2008b, Tappenden 2008, Wallace 2012, Conroy 2016, and Sebens and Carroll 2016.)

c. Objections to the Oxford MWI

There are of course objections to the Oxford MWI. The first set to consider are those that focus on the use of decoherence to solve the preferred basis problem—a problem that must be solved to explain determinate measurement records and probability.

Recall that a system is said to decohere when it becomes correlated with something in its environment and thereby destroys the interference effects that would otherwise have been present if the system were in a pure state. The coefficients on each term of the state of the system, when one traces over the environment (that is, when one essentially ignores the effects that are caused by only the environment), approximately match the probabilities one obtains from the standard collapse formulation. But even though decoherence theorists believe they have solved the problem of accounting for what seem to be determinate measurement records (Zeh 1997), such interactions do not produce determinate results (Barrett 1999). The interaction between, say, a pointer on a measuring device and its environment does not produce a determinate position for the pointer. Decoherence destroys the interference effects, but it does not produce determinate measurement records. It seems as if a collapse has occurred, but all we really end up with is a more complex entangled superposition (Albert 1992, Barrett 1999).
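The point can be put in a short worked equation. This is a standard textbook-style sketch, not a formula taken from the works cited; the symbols α, β, and the environment states |E↑⟩ and |E↓⟩ are introduced here for illustration. Suppose a pointer that can read “up” or “down” has become correlated with its environment:

```latex
\begin{align}
|\Psi\rangle &= \alpha\,|{\uparrow}\rangle|E_{\uparrow}\rangle
              + \beta\,|{\downarrow}\rangle|E_{\downarrow}\rangle, \\
\rho_S &= \mathrm{Tr}_E\,|\Psi\rangle\langle\Psi| \\
       &= |\alpha|^2\,|{\uparrow}\rangle\langle{\uparrow}|
        + |\beta|^2\,|{\downarrow}\rangle\langle{\downarrow}|
        + \alpha\beta^{*}\,\langle E_{\downarrow}|E_{\uparrow}\rangle\,
          |{\uparrow}\rangle\langle{\downarrow}|
        + \alpha^{*}\beta\,\langle E_{\uparrow}|E_{\downarrow}\rangle\,
          |{\downarrow}\rangle\langle{\uparrow}| .
\end{align}
```

When the two environment states are nearly orthogonal, the last two (off-diagonal) terms are nearly zero, and the remaining coefficients |α|² and |β|² match the probabilities that the collapse formulation assigns. The first line is unchanged by any of this: the total state remains an entangled superposition in which neither pointer reading has been singled out.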

Related to this worry is the fact that using decoherence considerations to choose a preferred basis does not clarify exactly which basis is chosen. In every interaction the property that decoheres most quickly and completely will always be one that is close to position, but it will not always be the same property each time (Barrett 1999: 242–4). This is not going to be troubling to an MWI theorist like Wallace, however, as he does not take there to be a particular fine-grained way to carve up the universe, provided that it is done in a way that preserves the emergence of macroscopic entities from the underlying structure (Wallace 2012).

An additional problem with taking a property very close to position to be the preferred basis in which to write the state of the system is that this is adding an additional principle to quantum mechanics, one that says that whatever decoheres most quickly and completely is the preferred observable chosen to be determinate (Barrett 1999). This does not concern most Oxford MWI theorists, though, as they do not take themselves to be doing Everett exegesis.

A more pressing problem that faces the MWI theorist who relies on decoherence is the question of circularity. David Baker (2007) and Ruth Kastner (2014) have both argued that the use of decoherence begs the question regarding the derivation of probability because the use of decoherence presupposes a concept of objective probability. As Baker puts it, “proofs of decoherence depend implicitly upon the truth of the Born rule. Without first justifying the probability rule, Everettians cannot establish the existence of a preferred basis or the division of the wave function into branches. But without a preferred basis or a specification of branches, there can be no assignment of probabilities to measurement outcomes” (Baker 2007: 3). Kastner puts her concern this way: “the goal of decoherence is to obtain vanishing of the off-diagonal terms, which corresponds to the vanishing of interference and the selection of the observable R as the one with respect to which the universe purportedly ‘splits’ in an Everettian account . . . the vanishing of the off-diagonal terms is crucially dependent on an assumption that makes the derivation circular” (Kastner 2014: 57). Both Baker and Kastner are pointing to the fact that in order to ignore the off-diagonal terms, the crucial step in decoherence that provides the preferred basis, one must already have a conception of probability.

The best that Oxford MWI theorists can do, according to Baker, is to show that because the off-diagonal terms are incredibly close to zero, the observer can ignore them as part of her decision making (she ought not care much about them). If, however, that is justified by saying that it is because those terms are very unlikely to occur, then one is bringing in an illegitimate notion of probability (Baker 2007: 19ff). Kastner points to a different problem, the problem of the arbitrariness of the division between system, measuring device and environment (Kastner 2014: 57). It is crucial that one be able to ignore the effects of the environment, but if the division is arbitrary, then there is no non-circular way of isolating only the environment.

There are at least two other objections raised against the Oxford MWI that are in this vein, Barnum et al 2000 and Hemmo and Pitowsky 2007. The former points to a hidden assumption in Deutsch’s proof, one that “is not just a minor addition to Deutsch’s list of assumptions, but rather a major conceptual shift. The assumption is akin to applying Laplace’s Principle of Insufficient Reason to a set of indistinguishable alternatives, an application that requires acknowledging a priori that amplitudes are related to probabilities” (Barnum et al 2000: 1180). Hemmo and Pitowsky criticize the use of the Everettian Principal Principle, essential to Wallace’s derivation of the Born rule, on the grounds that it is incoherent to claim that observers should treat a term with a measure of zero as if it had a probability of zero. They also argue that there is no non-question-begging reason to believe that rational observers would be justified in believing the statistical predictions of quantum mechanics if they also believed the Oxford MWI (Hemmo and Pitowsky 2007: 334). Some of these objections were addressed by Wallace and others in the intervening years. But other criticisms have also been raised. Most of these criticize the use of decision theory to guide the derivations of probability in EQM.

David Albert (2010) argues that the entire program undertaken by Deutsch and Wallace (and others supporting them) misses the point. He writes, “the question out of which this entire program arises, seems like the wrong question. The questions to which this program is addressed are questions of what we would do if we believe that the fission hypothesis were correct. But the question at issue here is precisely whether to believe that the fission hypothesis is correct!” (359). The “fission hypothesis” is the hypothesis that the Schrödinger equation is the complete story of the evolution of the world and that each branch that results from a branching event has an observer with an actual experience. He goes on to say, “The decision-theoretic program seems to act as if what primarily and in the first instance stands in need of being explained about the world is why we bet the way we do” (359). This is not what Albert believes needs to be explained. Rather, “What we need is an account of our actual empirical experiences of frequencies” (360).

Peter Lewis (2010) has argued that the decision-theoretic considerations that guide both Deutsch’s and Wallace’s arguments are not sufficient to show that the only rational guide to an observer’s decision-making procedures is the Born rule. Lewis argues that there is a gap in Deutsch’s proof by showing that there are other rules that an observer can follow and still be consistent with the rationality constraints assumed by Deutsch. Lewis acknowledges that Wallace has filled the gaps in Deutsch’s argument by adding a new axiom of rationality, but argues that, unlike Deutsch’s axioms, Wallace’s addition is “not an innocuous and general axiom of rationality . . . [rather] it is a substantive claim about decision-making in the specific context of Everettian quantum mechanics, and so requires a substantive justification” (Lewis 2010: 21). Lewis does not believe that the justification that Wallace provides is sufficient. Alastair Wilson (2013, 2015) has worked to counter Lewis’s criticism by proposing new principles that tie the physics of EQM to modal metaphysics, thereby helping to provide the justification for some of Wallace’s most contentious claims.

Adrian Kent (2010) argues that Wallace’s attempt to axiomatize rational decision making in EQM, using decision theory as its model, is incoherent, and that in fact “Wallace’s axioms are not constitutive of rationality either in Everettian quantum theory or in theories in which branchings and branch weights are precisely defined” (307). Huw Price (2010) argues that because in the Oxford MWI view there are multiple future successors to an observer that she ought to care about, it is irrelevant to the observer (as far as rationality requirements go) which future branch she will subjectively occupy (378–79). Following this and further detailed argument, Price concludes that “there seems little prospect that a Deutschian decision rule can be a constraint of rationality, in a manner analogous to the classical case” (389).

Despite these objections, the Oxford MWI has had a great deal of influence on many philosophers, and so work on the question of probability has not stopped. For more work on the question of probability see Dizadji-Bahmani 2013, Dawid and Thébault 2014, Wilson 2015, Jansson 2016, and Sebens and Carroll 2016.

5. Relational Interpretations of Everettian Quantum Mechanics

Given the centrality of Everett’s notion of relative states, it seems important to consider those interpretations that also highlight their importance. In section 3a we considered the bare theory interpretation of Everett. The two interpretations considered in this section are Simon Saunders’ relational interpretation and the relative facts interpretation.

a. Simon Saunders’ Relational Interpretation

Simon Saunders developed an interpretation of Everett in which values for systems can only be defined relative to a point of view, or a context of interest or relevance.  He proposes that just as we understand facts about tense to be relations, we should also understand facts about the properties of systems to be relations (Saunders 1993, 1995, 1996a, 1996b, 1997, 1998).

Let us assume that what appears to be the case is in fact the case, and that we are tracking the truth about the world when we make statements about how the world appears to us.  Then, there must be something about the world that makes true the proposition

(9) My coffee cup is on my desk.

This is what is known as the truthmaker principle.  Saunders has proposed that we understand the truthmaker for (9) analogously to the way B-theorists about time understand the truthmaker for a proposition like

(10) My coffee cup was at home.

A B-theorist takes the truthmaker for (10) to be a fundamentally relational fact. For the B-theorist a statement like (10) can be reduced to a statement like

(11) The event of my coffee cup’s being at home is earlier than now.

For a B-theorist, relations such as the one in (11) are permanent dyadic relations that order positions in time. The B-series relations include simultaneous with, earlier than, and later than. (For more on the B-theory of time, see McTaggart 1908, Maudlin 2002, and Markosian 2008, as well as the article on time in this encyclopedia.)

Saunders draws an analogy between what he considers the nature of the truthmaker for a proposition about the property of an object and what a B-theorist considers the nature of the truthmaker for a proposition about the property of an event like

(12) Event e is past.

In both cases the truthmaker is some fundamentally relative fact (Saunders 1995).

So now suppose that there are two non-concurrent events, E and E’. Then the following two statements, while both true, appear to be contradictory:

(13) Event E is now.

(14) Event E’ is now.

However, if we introduce two other events, W and W’, that are not identical and not concurrent, then we can resolve the apparent contradiction by instead saying:

(15) Event E is now relative to event W.

(16) Event E’ is now relative to event W’.

And these two statements are not contradictory.

Saunders suggests that we extend this analogy to the consideration of truthmakers for propositions like (9).  If we do, then a seeming contradiction in pure wave mechanics has a solution that is analogous to the solution to the apparent contradiction in tense metaphysics (Saunders 1995).

In a relational interpretation of EQM, most every physical systemG will typically have most every possible relative value for a property, just as every event has all of the qualities of past, present and future.  So, it is true to say:

(17) X has value x.

and

(18) X has value x’.

even if x ≠ x’ and the two are mutually exclusive.  But if so, then (17) and (18) are contradictory.

However, if we now introduce two parameters, Y and Y’, that can take values and are such that Y ≠ Y’, we can restate (17) and (18) as:

(19) X has value x relative to Y having value y.

(20) X has value x’ relative to Y’ having value y’.

It is clear that (19) and (20) are not contradictory.

In this view, an event’s having a seemingly particular (tensed) time is analogous to an object having a seemingly particular (determinate) value for a property.  An event happened in the past (or future) relative to another time.  Likewise, an objectL has a determinate value relative to some parameter.

By relativizing the property of an object, what we are doing is explicitly changing the focus of our discussion from that of the properties of an objectG to that of the properties of an objectL.  So in (17) and (18) it is to XG that we are referring, but in (19) and (20) it is to XL that we are referring.  In the tense case, we change our focus from one concept of “now” in pair (13) and (14), to a different, relative concept of “now” in the pair (15) and (16).

The question then becomes, what are the relativizing parameters Y and Y’?  For Saunders, the parameters are worlds, or branches, at a particular time. In light of this, consider again (9).  In analogy with the B-theory, one recourse for explaining its truth (when it is in fact true) is to say that it is true because there is a determinate relative fact that consists in the coffee cup being on my desk.  Each possible fact about the value for the coffee cupG’s position occurs in a different world.  So Saunders makes sense of the truth of a proposition like (9) by relativizing the cupL’s position value to the world in which it finds itself.  Relative to being in this world, the coffee cupL is on my desk; relative to being in a different world, the coffee cupL is in the Mariana Trench; relative to being in yet another world, the coffee cupL is in my mother’s kitchen.  Thus, the fact that makes (9) true is a relation between the position value for the coffee cupL and the world in which the coffee cupL finds itself. (For more discussion and criticism of Saunders’ relational interpretation see Barrett 1999, Conroy 2010, and Laudisa and Rovelli 2008.)

b. The Relative Facts Interpretation

Another attempt to read a metaphysical picture from EQM is the relative facts interpretation [RFI]. It takes seriously Everett’s claim of the centrality of the notion of relative states and adds no additional principles to his pure wave mechanics. It bears a great deal of similarity to Saunders’ relational interpretation, but whereas Saunders’ view is a many-worlds view, the RFI takes there to be just one world in which all objects generally have relational properties (Conroy 2010, 2012, 2016).

Consider the problem described in the last section of resolving the contradiction between (17) and (18). The RFI can do so by defining the parameters Y and Y’ as being the (relative) state of the complement of the system that we are considering.  So we can make sense of the truth of a proposition like (9) by relativizing a coffee cupL’s position to the state of the complement of the system of which it is a part.  My coffee cupL is on my desk relative to my having put it there, to my desk being in my office, to my not having knocked the cup off, and so forth.  The fact that my desk is in my office is relative to some other collection of relative facts about the complement of its system (the movers that put it there being a part of that), and this goes on ad infinitum.  Likewise, my coffee cupL is in the Mariana Trench relative to my having decided to take a cruise in the Pacific, and my having dropped my coffee cup over the side of the ship at the right time, and so on. The truthmaker for a proposition like (9), in the RFI, is a relative fact.

That there can be a metaphysics of relative facts has been argued in the context of quantum mechanics in a general sense. Because entanglement is an inescapable part of the quantum mechanical world, several philosophers have argued that non-separable, entangled quantum mechanical states imply that there are relations that fail to be supervenient upon or be grounded by non-relational properties of their relata, and that this leads to quantum holism and a metaphysics consisting of non-reducible relations. Thus it seems reasonable to take a relational metaphysics and apply it to Everett’s interpretation given the importance he places on the notion of relative states. (For more on the development of a relational metaphysics in light of considerations of quantum entanglement see Cleland 1984, Teller 1986, Teller 1989, French 1989, Healey 1991, Esfeld 2003, Esfeld 2004, Schaffer 2010, Calosi 2014, McKenzie 2014, and Esfeld 2016.)

Both the RFI and Saunders’ relational interpretation can explain determinate measurement results by saying that an observer has a determinate relative result. While Saunders has worked on solving the problem of probability (see section 4b above), the RFI still lacks a development of a clear sense of how probability is meant to function.

6. Conclusion

Although there is no consensus about the correct way to interpret the outcome of quantum mechanical experiments, one very influential way of doing so is due to Hugh Everett III, who suggested that we drop the collapse postulate and take the universe to be such that its quantum state is an incredibly complex superposition that never collapses. The work that Everett did in the late 1950s has led many philosophers and philosophers of physics to attempt to build a metaphysical picture of the world based on the physics that he proposed. This article has surveyed the major developments in the work that was inspired by Everett’s 1957 dissertation. Just as there is no consensus on whether or not Everett has the best physics for the description of the world, there is no consensus on the best way to read a metaphysical picture off of the world Everett described. It is left to the reader to adjudicate these debates.

7. References and Further Reading

  • Albert, David Z. Quantum Mechanics and Experience.  Cambridge: Harvard University Press, 1992.
  • Albert, David Z. “Probability in the Everett Picture.” In Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.).  New York: Oxford University Press. 2010: 355–368.
  • Albert, David Z. and Barry Loewer.  “Interpreting the Many Worlds Interpretation.”   Synthese 77 (1988): 195–213.
  • Bacciagaluppi, Guido and Jenann Ismael. “Book Review: The Emergent Multiverse: Quantum Theory According to the Everett Interpretation.” Philosophy of Science 82, no. 1 (January 2015): 129–148.
  • Baker, David J. “Measurement Outcomes and Probability in Everettian Quantum Mechanics.” Studies in History and Philosophy of Modern Physics 38 (2007): 153–69.
  • Barnum, Howard, Carlton M. Caves, Jerome Finkelstein and Ruediger Schack. “Quantum Probability from Decision Theory.” Proceedings of the Royal Society of London A, 456 (2000): 1175–1182.
  • Barrett, Jeff.   The Quantum Mechanics of Minds and Worlds.  New York: Oxford University Press, 1999.
  • Barrett, Jeff.  “Ithaca Interpretation of Quantum Mechanics.” In Compendium of Quantum Physics. Greenberger, Daniel, Klaus Hentschel and Friedel Weinert (eds.). Berlin: Springer, 2009: 325–6.
  • Barrett, Jeff. “A Structural Interpretation of Pure Wave Mechanics.” Humana Mente 13  (April 2010): 225–235.
  • Barrett, Jeffrey and Peter Byrne. The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Princeton, NJ: Princeton University Press, 2012.
  • Bevers, Brett. “Everett’s ‘Many Worlds’ Proposal.” Studies in History and Philosophy of Modern Physics, 42 (1) (February 2011): 3–12.
  • Birkhoff, Garrett and John von Neumann.  “The Logic of Quantum Mechanics.” Annals of Mathematics 37, no. 4 (1936): 823–843.
  • Birman, Fernando.  “Quantum Mechanics, Correlations, and Relational Probability.”  CRÍTICA, Revista Hispanoamericana de Filosofía 41, no. 121 (April 2009): 3–22.
  • Bohm, David. “A suggested interpretation of the quantum theory in terms of ‘hidden’ variables, I and II.” Physical Review, 85 (1952): 166–193.
  • Bohr, Niels.  “Quantum Mechanics and Physical Reality.” Nature 136 (1935): 1025–1026. Reprinted in Quantum Theory and Measurement.  Wheeler, John. A., and Wojciech. H. Zurek, (eds.).  Princeton: Princeton University Press, 1983: 144.
  • Bohr, Niels.  “Causality and Complementarity.” Philosophy of Science 4, no. 3 (July 1937): 289–98.
  • Bohr, Niels.  “Quantum Physics and Philosophy: Causality and Complementarity” (1958a).  In Essays 1958-1962 on Atomic Physics and Human Knowledge.  Vol. 3 of The Philosophical Writings of Niels Bohr.  Woodbridge, Conn.: Ox Bow Press, 1987: 1–7.
  • Bohr, Niels.  Atomic Physics and Human Knowledge.  New York: Wiley, 1958b.
  • Born, Max. “Zur Quantenmechanik der Stoßvorgänge.” Zeitschrift für Physik 37, no. 12 (December 1926): 863–67. English translation, “On the Quantum Mechanics of Collisions.” In Quantum Theory and Measurement.  Wheeler, John. A., and Wojciech. H. Zurek, (eds.).  Princeton: Princeton University Press, 1983: 52–55.
  • Brown, Matthew. J.  “Relational Quantum Mechanics and the Determinacy Problem.” British Journal for the Philosophy of Science 60 (2009): 679–95.
  • Butterfield, Jeremy. “Some Worlds of Quantum Theory.” In Scientific Perspectives on Divine Action. Robert John Russell, Philip Clayton, Kirk Wegter-McNelly, and John Polkinghorne (eds.). Vatican City: Vatican Observatory Publications, 2001.
  • Byrne, Peter. The Many Worlds of Hugh Everett III.  New York: Oxford University Press, 2010.
  • Calosi, Claudio. “Quantum Mechanics and Priority Monism.” Synthese 191, no. 5 (2014): 915–28.
  • Cleland, Carol. E.  “Space:  An Abstract System of Non-Supervenient Relations.” Philosophical Studies 46, no. 1 (July 1984): 19–40.
  • Clifton, Rob.  Ed.  Perspectives on Quantum Reality.  Boston: Kluwer Academic Press, 1996.
  • Conroy, Christina. “A relative facts interpretation of Everettian quantum mechanics.” PhD Thesis. Irvine: University of California, 2010.
  • Conroy, Christina. “The Relative Facts Interpretation and Everett’s note added in proof.” Studies in History and Philosophy of Modern Physics 43 (2012): 112–120.
  • Conroy, Christina.  “Branch-Relative Identity.” In Individuals Across the Sciences. Edited by Alexandre Guay and Thomas Pradeu. Oxford University Press: New York, 2016: 250–271.
  • Dawid, Richard and Karim Thébault. “Against the Empirical Viability of the Deutsch-Wallace-Everett Approach to Quantum Mechanics.” Studies in History and Philosophy of Modern Physics, 47 (2014): 55–61.
  • Dizadji-Bahmani, Foad. “The probability problem in Everettian quantum mechanics persists.” British Journal for the Philosophy of Science 0. (2013): 1–27.
  • de Broglie, Louis. Solvay Congress (1927). Electrons et Photons: Rapports et Discussions du Cinquième Conseil de Physique tenu à Bruxelles du 24 au 29 Octobre 1927 sous les Auspices de l’Institut International de Physique Solvay, Paris: Gauthier-Villars, 1928.
  • Deutsch, David. The Fabric of Reality. New York: Penguin Books, 1997.
  • Deutsch, David. “Quantum Theory of Probability and Decisions.” Proceedings of the Royal Society of London A458 (1999): 2911–23.
  • Deutsch, David. “Apart from Universes.” 2010. In Many Worlds? Everett, Quantum Theory, & Reality. Saunders, Simon, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press: 542-552.
  • DeWitt, Bryce. S. Letter to John Wheeler. 1957. Reprinted in The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Barrett, Jeffrey A. and Peter Byrne (eds.). Princeton: Princeton University Press, 2012: 242–251.
  • DeWitt, Bryce. “Everett-Wheeler Interpretation of Quantum Mechanics.”  1968. In Battelle Rencontres. DeWitt, Cecile M. and John. A. Wheeler, (eds.).  New York: Benjamin, 1 January 1968.
  • DeWitt, Bryce.  1970. “Quantum Mechanics and Reality.”  Physics Today 23, no. 9.  Reprinted in The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973.
  • DeWitt, Bryce.  “The Many-Universes Interpretation of Quantum Mechanics.”   Foundations of Quantum Mechanics.  New York: Academic Press Inc., 1971.  Reprinted in The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973.
  • DeWitt, Bryce. S., and Neill Graham, (eds.).  The Many-Worlds Interpretation of Quantum Mechanics. Princeton: Princeton University Press, 1973.
  • DeWitt, Cecile M. and John. A. Wheeler, eds.  Battelle Rencontres. New York: Benjamin, 1 January 1968.
  • Esfeld, Michael.  “Do Relations Require Underlying Intrinsic Properties?  A Physical Argument for a Metaphysics of Relations.” Metaphysica: International Journal for Ontology and Metaphysics 4, no. 1 (2003): 5–25.
  • Esfeld, Michael.  “Quantum Entanglement and a Metaphysics of Relations.” Studies in History and Philosophy of Modern Physics 35 (2004): 601–617.
  • Esfeld, Michael. “The Reality of Relations: The Case from Quantum Physics.” In The Metaphysics of Relations. Marmodoro, Anna and David Yates, (eds.). New York: Oxford University Press, 2016: 218–34.
  • Everett, Hugh III. “On the Foundations of Quantum Mechanics.”  PhD Thesis.  Princeton University. 1957a.  Reprinted in The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973: 3–140.
  • Everett, Hugh III.  “’Relative State’ Formulation of Quantum Mechanics.” Reviews of Modern Physics 29 (1957b): 454–462. Reprinted in The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973: 141–49.
  • Everett, Hugh III. Letter to Bryce DeWitt.  May 31, 1957c. Reprinted in The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Barrett, Jeffrey A. and Peter Byrne, (eds.). Princeton: Princeton University Press, 2012: 252–256.
  • Everett, Hugh III. “The Theory of the Universal Wave Function.” 1973. In The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973: 3–140.
  • Everett, Hugh III. “Quantitative Measure of Correlation.” Printed in The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Barrett, Jeffrey A. and Peter Byrne, (eds.). Princeton: Princeton University Press, 2012: 61–63.
  • French, Steven.  “Individuality, supervenience and Bell’s Theorem.” Philosophical Studies 55 (1989): 1–22.
  • Ghirardi, Giancarlo, Alberto Rimini and Tullio Weber. “Unified dynamics for microscopic and macroscopic systems.” Physical Review D34 (1986): 470.
  • Graham, Neill.  “The Measurement of Relative Frequency.” 1973. In The Many-Worlds Interpretation of Quantum Mechanics.  DeWitt, Bryce. S., and Neill Graham, (eds.).  Princeton: Princeton University Press, 1973: 229–253.
  • Greaves, Hilary. “Understanding Deutsch’s Probability in a Deterministic Multiverse.” Studies in History and Philosophy of Modern Physics 35 (2004): 423–56.
  • Greaves, Hilary. “Probability in the Everett Interpretation.” Philosophy Compass 2, 1 (January 2007): 109–128.
  • Greaves, Hilary. “On the Everettian Epistemic Problem.” Studies in History and Philosophy of Modern Physics 38, 1 (March 2007): 120–152.
  • Greaves, Hilary and Wayne Myrvold. “Everett and Evidence.” In Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press. 2010: 264–304.
  • Healey, Richard A.  “How Many Worlds?” Noûs 18 (1984): 591–616.
  • Healey, Richard A.  “Holism and Nonseparability.” Journal of Philosophy 88 (1991): 393–421.
  • Hemmo, Meir and Itamar Pitowsky. “Quantum Probability and Many Worlds.” Studies in the History and Philosophy of Modern Physics 38 (2007): 333–350.
  • Hooker, Clifford A. The Logico-Algebraic Approach to Quantum Mechanics, vol. 1.  Dordrecht: D. Reidel, 1975.
  • Ismael, Jenann. “How to Combine Chance and Determinism: Thinking About the Future in an Everett Universe.” Philosophy of Science 70 (2003): 776–90.
  • Jammer, Max.  The Philosophy of Quantum Mechanics.  New York: McGraw Hill, 1974.
  • Jansson, Lina. “Everettian Quantum Mechanics and Physical Probability: Against the Principle of ‘State Supervenience’.” Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics 53 (February 2016): 45–54.
  • Joos, Erich, H. Dieter Zeh, Claus Kiefer, Domenico Giulini, Joachim Kupsch, and Ion-Olimpiu Stamatescu, (eds.). Decoherence and the Appearance of a Classical World in Quantum Theory. Berlin: Springer; second revised edition, 2003.
  • Kastner, Ruth E. “‘Einselection’ of Pointer Observables: The New H-Theorem?” Studies in History and Philosophy of Modern Physics 48 (2014): 56–58.
  • Kent, Adrian. “One World Versus Many: The Inadequacy of Everettian Accounts of Evolution, Probability and Scientific Confirmation” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.).  New York: Oxford University Press. 2010: 307–354.
  • Laudisa, Federico and Carlo Rovelli.  “Relational Quantum Mechanics.” Stanford Encyclopedia of Philosophy, (2008). http://plato.stanford.edu/entries/qm-relational/.
  • Lewis, David. “A Subjectivist’s Guide to Objective Chance” in Studies in Inductive Logic and Probability, Volume II. Richard C. Jeffrey, (ed.). Berkeley: University of California Press. 1980: 263–293.
  • Lewis, Peter. “Interpretations of Quantum Mechanics.” Internet Encyclopedia of Philosophy. https://www.iep.utm.edu/int-qm/.
  • Lewis, Peter. “Uncertainty and Probability for Branching Selves.” Studies in History and Philosophy of Modern Physics 38 (2007): 1–14.
  • Lewis, Peter. “Probability, Self-Location and Quantum Branching.” Philosophy of Science 76 (5), (December 2009): 1009–1019.
  • Lewis, Peter.  “Probability in Everettian Quantum Mechanics.” Manuscrito, 33 (2010): 285–306.
  • Lewis, Peter. Quantum Ontology. New York: Oxford University Press, 2016.
  • Mackey, George W.  “Quantum Mechanics and Hilbert Space.” American Mathematical Monthly 64 (1957): 45–57.
  • Mackey, George W.  Foundations of Quantum Mechanics.  New York: W. A. Benjamin, 1963.
  • Markosian, Ned.  “Time.” Stanford Encyclopedia of Philosophy (2008). http://plato.stanford.edu/entries/time.
  • Maudlin, Tim.  Quantum Non-Locality and Relativity: Metaphysical Intimations of Modern Physics. Oxford: Blackwell, 2002.
  • Maudlin, Tim. “Can the World Be Only Wavefunction?” in Many Worlds? Everett, Quantum Theory, & Reality. Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.).  New York: Oxford University Press. 2010: 121–143.
  • McKenzie, Kerry. “Priority and Particle Physics: Ontic Structural Realism as a Fundamentality Thesis.” British Journal for the Philosophy of Science 65 no. 2, (2014): 353–80.
  • McTaggart, J. M. Ellis. “The Unreality of Time.” Mind 17 (October 1908): 457–474.
  • Mermin, David. “What is Quantum Mechanics Trying to Tell Us?” American Journal of Physics 66 (1998): 753–767.
  • Ney, Alyssa and David Albert. The Wave Function. New York: Oxford University Press, 2013.
  • Price, Huw. “Decisions, Decisions, Decisions: Can Savage Salvage Everettian Probability?” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.).  New York: Oxford University Press. 2010: 369–390.
  • Putnam, Hilary. “Is Logic Empirical?” Boston Studies in the Philosophy of Science, vol. 5. Cohen, Robert S. and Marx W. Wartofsky, (eds.) Dordrecht: D. Reidel, 1968: 216–241.
  • Reichenbach, Hans. Philosophical Foundations of Quantum Mechanics. Berkeley: University of California Press, 1944.
  • Rovelli, C. “Relational quantum mechanics.” International Journal of Theoretical Physics 35, no. 8 (1996): 1637–1678.
  • Saunders, Simon.  “Decoherence, Relative States, and Evolutionary Adaptation.” Foundations of Physics 23, no. 12 (1993): 1553–1585.
  • Saunders, Simon.  “Time, Quantum Mechanics, and Decoherence.” Synthese 102, no. 2 (1995): 235–266.
  • Saunders, Simon.  “Relativism.”  (1996a). In Perspectives on Quantum Reality. Clifton, Rob, (ed.).  Boston: Kluwer Academic Press, 1996.
  • Saunders, Simon.  “Time, Quantum Mechanics and Tense.” Synthese 107 (1996b): 19–53.
  • Saunders, Simon. “Naturalizing Metaphysics.”  The Monist 80. no. 1 (1997): 44–69.
  • Saunders, Simon.  “Time, Quantum Mechanics and Probability.” Synthese 114 (1998): 373–404.
  • Saunders, Simon. “What is Probability?” In Quo Vadis Quantum Mechanics. Avshalom C. Elitzur, Shahar Dolev, and Nancy Kolenda, (eds.). Berlin: Springer-Verlag, 2005.
  • Saunders, Simon. “Chance in the Everett Interpretation” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press. 2010: 181–205.
  • Saunders, Simon, Jonathan Barrett, Adrian Kent and David Wallace (eds.).  Many Worlds? Everett, Quantum Theory, & Reality. New York: Oxford University Press. 2010.
  • Saunders, Simon and David Wallace. “Branching and Uncertainty.” (2008a). British Journal for the Philosophy of Science 59, (3): 293–305.
  • Saunders, Simon and David Wallace. “Reply.” (2008b). British Journal for the Philosophy of Science 59, (3): 315–37.
  • Schaffer, Jonathan. “Monism: The Priority of the Whole.” Philosophical Review 119 no.1 (2010): 31–76. Reprinted in Spinoza on Monism. Goff, Philip, (ed.). New York: Palgrave, 2012: 149–66.
  • Schaffer, Jonathan and Jenann Ismael. “Quantum Holism: Nonseparability as Common Ground.” Synthese (2016): online published first.
  • Schlosshauer, Maximilian. Decoherence and the Quantum-to-Classical Transition. Heidelberg: Springer. 2007.
  • Schrödinger, Erwin. “Die gegenwärtige Situation in der Quantenmechanik.” Naturwissenschaften, 23 (1935): 807–812, 823–828, 844–849; English translation by Trimmer, J. D. “The Present Situation in Quantum Mechanics: A Translation of Schrödinger’s ‘Cat Paradox’ Paper.” Proceedings of the American Philosophical Society, 124: 323–338, reprinted in Wheeler and Zurek (1983).
  • Sebens, Charles and Sean M. Carroll. “Self-Locating Uncertainty and the Origin of Probability in Everettian Quantum Mechanics.” British Journal for the Philosophy of Science. First published online July 5, 2016.
  • Tappenden, Paul. “Saunders and Wallace on Everett and Lewis” (2008). British Journal for the Philosophy of Science, 59: 307–314.
  • Tappenden, Paul. “Evidence and Uncertainty in Everett’s Multiverse,” British Journal for the Philosophy of Science, 62 (2011): 99–123.
  • Teller, Paul.  “Relational Holism and Quantum Mechanics.” The British Journal for the Philosophy of Science 37, no. 1 (March 1986): 71–81.
  • Teller, Paul. “Relativity, Relational Holism and the Bell Inequalities.” In Philosophical consequences of quantum theory: Reflections on Bell’s theory. Cushing, James and Ernan McMullin, (eds.).  South Bend, IN: University of Notre Dame Press. (1989): 208–23.
  • Vaidman, Lev. “On the Schizophrenic Experiences of the Neutron or Why We Should Believe in the Many-Worlds Interpretation of Quantum Theory.” International Studies in the Philosophy of Science 12, 3 (1998): 245–261.
  • Vaidman, Lev. “Probability in the Many-Worlds Interpretation of Quantum Mechanics.” In Probability in Physics, The Frontiers Collection XII. Yemima Ben-Menahem and Meir Hemmo, (eds.). Berlin: Springer. (2012): 299–311.
  • Vaidman, Lev. “Review: David Wallace The Emergent Multiverse: Quantum Theory According to the Everett Interpretation.” The British Journal for the Philosophy of Science, 0 (2014): 1–4.
  • van Fraassen, Bas C. “Rovelli’s World.” Foundations of Physics 40, no. 4 (2009): 390–417.
  • Wallace, David. “Quantum Probability and Decision Theory Revisited.” (2002). http://arxiv.org/abs/quant-ph/0211104
  • Wallace, David. “Everettian Rationality: Defending Deutsch’s Approach to Probability in the Everett Interpretation.” Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics 34, 3 (September 2003): 415–439.
  • Wallace, David. “Everett and Structure.” (2003). Studies in History and Philosophy of Science, Part B 34, (1): 87–105.
  • Wallace, David. “Language Use in a Branching Universe.”(2005). Preprint: http://philsci-archive.pitt.edu/archive/00002554/
  • Wallace, David. “Epistemology Quantized: Circumstances in Which We Should come to Believe in the Everett Interpretation.” (2006). British Journal for the Philosophy of Science 57, (4): 655–689.
  • Wallace, David. “Quantum Probability from Subjective Likelihood: Improving on Deutsch’s Proof of the Probability Rule.” (2007) Studies in History and Philosophy of Modern Physics 38, 311–32.
  • Wallace, David. “Decoherence and Ontology” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press. 2010a: 53–72.
  • Wallace, David. “How to Prove the Born Rule” in Many Worlds? Everett, Quantum Theory, & Reality.  Simon Saunders, Jonathan Barrett, Adrian Kent and David Wallace, (eds.). New York: Oxford University Press. 2010b: 227–263.
  • Wallace, David. (2012) The Emergent Multiverse. New York: Oxford University Press.
  • Wheeler, John A.  Geons, Black Holes and Quantum Foam.  New York: W. W. Norton, 1998.
  • Wheeler, John. A., and Wojciech. H. Zurek, (eds.). Quantum Theory and Measurement. Princeton: Princeton University Press, 1983.
  • Wilson, Alastair. “Objective Probability in Everettian Quantum Mechanics.” British Journal for the Philosophy of Science 64 (4) (2013): 709–737.
  • Wilson, Alastair. “The Quantum Doomsday Argument.” British Journal for the Philosophy of Science 0 (2015): 1–19.
  • Xavier University Conference Transcript. October 1–5, 1962.  Cincinnati, Ohio. Printed in The Everett Interpretation of Quantum Mechanics: Collected Works 1955-1980 with Commentary. Barrett, Jeffrey A. and Peter Byrne, (eds.). Princeton: Princeton University Press, 2012: 267–279.
  • Zeh, H. Dieter. “On the Interpretation of Measurement in Quantum Theory.” (1970). Foundations of Physics 1: 69–76. Reprinted in Quantum Theory and Measurement. Wheeler, John A., and Wojciech H. Zurek, (eds.). Princeton: Princeton University Press, 1983: 342–49.
  • Zeh, H. Dieter. “Toward a Quantum Theory of Observation.” (1973). Foundations of Physics 3: 109–16.
  • Zeh, H. Dieter. “Basic Concepts and Their Interpretation.” (1995). In Decoherence and the Appearance of a Classical World in Quantum Theory. Giulini, Domenico, Erich Joos, Claus Kiefer, Joachim Kupsch, Ion-Olimpiu Stamatescu, and H. Dieter Zeh, (eds.). Berlin: Springer, 2003: Chapter 2.
  • Zurek, Wojciech H. “Pointer Basis of Quantum Apparatus: Into what Mixture does the Wave Packet Collapse?” (1981). Physical Review D (24): 1516–1525.
  • Zurek, Wojciech H. “Environment-Induced Superselection Rules.” (1982). Physical Review D (26): 1862–1880.
  • Zurek, Wojciech H. “Decoherence and the Transition from Quantum to Classical.” (1991). Physics Today 44 (October): 36–44.
  • Zurek, Wojciech H. “Decoherence and the Transition from Quantum to Classical – Revisited.” (2002). Los Alamos Science 27: 2–25.

 

Author Information

Christina Conroy
Email: c.conroy@moreheadstate.edu
Morehead State University
U. S. A.

Integrated Information Theory of Consciousness

Integrated Information Theory (IIT) offers an explanation for the nature and source of consciousness. Initially proposed by Giulio Tononi in 2004, it claims that consciousness is identical to a certain kind of information, the realization of which requires physical, not merely functional, integration, and which can be measured mathematically according to the phi metric.

The theory attempts a balance between two different sets of convictions. On the one hand, it strives to preserve the Cartesian intuitions that experience is immediate, direct, and unified. This, according to IIT’s proponents and its methodology, rules out accounts of consciousness such as functionalism that explain experience as a system operating in a certain way, as well as ruling out any eliminativist theories that deny the existence of consciousness. On the other hand, IIT takes neuroscientific descriptions of the brain as a starting point for understanding what must be true of a physical system in order for it to be conscious. (Most of IIT’s developers and main proponents are neuroscientists.) IIT’s methodology involves characterizing the fundamentally subjective nature of consciousness and positing the physical attributes necessary for a system to realize it.

In short, according to IIT, consciousness requires a grouping of elements within a system that have physical cause-effect power upon one another. This in turn implies that only reentrant architecture consisting of feedback loops, whether neural or computational, will realize consciousness. Such groupings make a difference to themselves, not just to outside observers. This constitutes integrated information. Of the various groupings within a system that possess such causal power, one will do so maximally. This local maximum of integrated information is identical to consciousness.

IIT claims that these predictions square with observations of the brain’s physical realization of consciousness, and that, where the brain does not instantiate the necessary attributes, it does not generate consciousness. Bolstered by these apparent predictive successes, IIT generalizes its claims beyond human consciousness to animal and artificial consciousness. Because IIT identifies the subjective experience of consciousness with objectively measurable dynamics of a system, the degree of consciousness of a system is measurable in principle; IIT proposes the phi metric to quantify consciousness.

Table of Contents

  1. The Main Argument
    a. Cartesian Commitments
      i. Axioms
      ii. Postulates
    b. The Identity of Consciousness
      i. Some Predictions
    c. Characterizing the Argument
  2. The Phi Metric
    a. The Main Idea
    b. Some Issues of Application
  3. Situating the Theory
    a. Some Prehistory
    b. IIT’s Additional Support
    c. IIT as Sui Generis
    d. Relation to Panpsychism
      i. Relation to David Chalmers
  4. Implications
    a. The Spectrum of Consciousness
    b. IIT and Physics
    c. Artificial Consciousness
      i. Constraints on Structure/Architecture
      ii. Relation to “Silent Neurons”
  5. Objections
    a. The Functionalist Alternative
      i. Rejecting Cartesian Commitments
      ii. Case Study: Access vs. Phenomenal Consciousness
      iii. Challenging IIT’s Augmentation of Naturalistic Ontology
    b. Aaronson’s Reductio ad Absurdum
    c. Searle’s Objection
  6. References and Further Reading

1. The Main Argument

IIT takes certain features of consciousness to be unavoidably true. Rather than beginning with the neural correlates of consciousness (NCC) and attempting to explain what about these sustains consciousness, IIT begins with its characterization of experience itself, determines the physical properties necessary for realizing these characteristics, and only then puts forward a theoretical explanation of consciousness, as identical to a special case of information instantiated by those physical properties. “The theory provides a principled account of both the quantity and quality of an individual experience… and a calculus to evaluate whether a physical system is conscious” (Tononi and Koch, 2015).

a. Cartesian Commitments

IIT takes Descartes very seriously. Descartes located the bedrock of epistemology in the knowledge of our own existence given to us by our thought. “I think, therefore I am” reflects an unavoidable certainty: one cannot deny one’s own existence as a thinker even if one’s particular thoughts are in error. For IIT, the relevance of this insight lies in its application to consciousness. Whatever else one might claim about consciousness, one cannot deny its existence.

i. Axioms

IIT takes consciousness as primary. Before speculating on the origins or the necessary and sufficient conditions for consciousness, IIT gives a characterization of what consciousness means. The theory advances five axioms intended to capture just this. Each axiom articulates a dimension of experience that IIT regards as self-evident.

First, following from the fundamental Cartesian insight, is the axiom of existence. Consciousness is real and undeniable; moreover, a subject’s consciousness has this reality intrinsically; it exists from its own perspective.

Second, consciousness has composition. In other words, each experience has structure. Color and shape, for example, structure visual experience. Such structure allows for various distinctions.

Third is the axiom of information: the way an experience is distinguishes it from other possible experiences. An experience specifies; it is specific to certain things, distinct from others.

Fourth, consciousness has the characteristic of integration. The elements of an experience are interdependent. For example, the particular colors and shapes that structure a visual conscious state are experienced together. As we read these words, we experience the font-shape and letter-color inseparably. We do not have isolated experiences of each and then add them together. This integration means that consciousness is irreducible to separate elements. Consciousness is unified.

Fifth, consciousness has the property of exclusion. Every experience has borders. Precisely because consciousness specifies certain things, it excludes others. Consciousness also flows at a particular speed.

ii. Postulates

In isolation, these axioms may seem trivial or overlapping. IIT labels them axioms precisely because it takes them to be obviously true. IIT does not present them in isolation. Rather, they motivate postulates. Sometimes the IIT literature refers to phenomenological axioms and ontological postulates. Each axiom leads to a corresponding postulate identifying a physical property. Any conscious system must possess these properties.

First, the existence of consciousness implies a system of mechanisms with a particular cause-effect power. IIT regards existence as inextricable from causality: for something to exist, it must be able to make a difference to other things, and vice versa. (What would it even mean for a thing to exist in the absence of any causal power whatsoever?) Because consciousness exists from its own perspective, the implied system of mechanisms must do more than simply have causal power; it must have cause-effect power upon itself.

Second, the compositional nature of consciousness implies that its system’s mechanistic elements must have the capacity to combine, and that those combinations have cause-effect power.

Third, because consciousness is informative, it must specify, or distinguish one experience from others. IIT calls the cause-effect powers of any given mechanism within a system its cause-effect repertoire. The cause-effect repertoires of all the system’s mechanistic elements taken together, it calls its cause-effect structure. This structure, at any given point, is in a particular state. In complex structures, the number of possible states is very high. For a structure to instantiate a particular state is for it to specify that state. The specified state is the particular way that the system is making a difference to itself.

Fourth, consciousness’s integration into a unified whole implies that the system must be irreducible. In other words, its parts must be interdependent. This in turn implies that every mechanistic element must have the capacity to act as a cause on the rest of the system and to be affected by the rest of the system. If a system can be divided into two parts without affecting its cause-effect structure, it fails to satisfy the requirement of this postulate.
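The divisibility test in the fourth postulate can be loosely illustrated with a toy model. The sketch below is only an analogy under stated assumptions: it treats a system as a directed graph of "element A can affect element B" links and calls the system irreducible when every element can, directly or indirectly, affect every other. A graph fails this test exactly when it can be split into two parts with no links running in one of the two directions. This graph test stands in for, and is much cruder than, IIT's own requirement about cause-effect structure; the function names and example systems are hypothetical.

```python
def reachable(start, edges):
    """Return every element that `start` can affect, directly or indirectly."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for source, target in edges:
            if source == node and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def is_toy_irreducible(nodes, edges):
    """Toy proxy (an assumption, not IIT's definition): every element
    must be able to affect every other element."""
    nodes = set(nodes)
    return all(reachable(node, edges) >= nodes for node in nodes)


# A feedback loop: a -> b -> c -> a.
loop = {("a", "b"), ("b", "c"), ("c", "a")}
# A feedforward chain: a -> b -> c, with nothing flowing back.
chain = {("a", "b"), ("b", "c")}
# Causally isolated elements, like the photodiodes of a camera sensor.
isolated = set()

print(is_toy_irreducible({"a", "b", "c"}, loop))      # True
print(is_toy_irreducible({"a", "b", "c"}, chain))     # False
print(is_toy_irreducible({"a", "b", "c"}, isolated))  # False
```

In this toy picture only the loop passes, which mirrors the claim that reentrant architecture is required; it says nothing about how much integration a passing system has.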

Fifth, the exclusivity of the borders of consciousness implies that the state of a conscious system must be definite. In physical terms, the various simultaneous subgroupings of mechanisms in a system have varying cause-effect structures. Of these, only one will have a maximally irreducible cause-effect structure; the others will have smaller cause-effect structures, at least when reduced to non-redundant elements. The maximal one is called the maximally irreducible conceptual structure, or MICS, and precisely this is the conscious state.
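The selection step in the fifth postulate, picking out the one subgrouping that is maximally irreducible, can be sketched as a search over subsets. Everything quantitative here is an assumption made for illustration: the score `toy_integration` counts causal links across the weakest two-way split of a subset, which is a hypothetical stand-in and not IIT's phi, and the four-element system is invented.

```python
from itertools import combinations


def toy_integration(subset, edges):
    """Hypothetical score: over all two-way splits of `subset`, the
    smallest number of links in the weaker direction across the split."""
    members = sorted(subset)
    if len(members) < 2:
        return 0
    first, rest = members[0], members[1:]
    weakest = None
    # Keep `first` in part A so that each split is visited once.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            part_a = {first, *extra}
            part_b = set(members) - part_a
            a_to_b = sum((s, t) in edges for s in part_a for t in part_b)
            b_to_a = sum((s, t) in edges for s in part_b for t in part_a)
            cut = min(a_to_b, b_to_a)
            weakest = cut if weakest is None else min(weakest, cut)
    return weakest


def toy_maximal_subset(nodes, edges):
    """Return the subset with the highest toy score, and that score."""
    best, best_score = None, -1
    for size in range(2, len(nodes) + 1):
        for subset in combinations(sorted(nodes), size):
            score = toy_integration(subset, edges)
            if score > best_score:
                best, best_score = set(subset), score
    return best, best_score


# a, b, c all affect one another; d only receives input from c.
edges = {
    ("a", "b"), ("b", "a"),
    ("b", "c"), ("c", "b"),
    ("c", "a"), ("a", "c"),
    ("c", "d"),
}
best, score = toy_maximal_subset({"a", "b", "c", "d"}, edges)
print(sorted(best), score)  # ['a', 'b', 'c'] 2
```

The whole system scores zero because nothing flows back from d, while the mutually connected trio scores highest. In the toy model that trio plays the role the postulate assigns to the MICS.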

b. The Identity of Consciousness

IIT accepts the Cartesian conviction that consciousness has immediate, self-evident properties, and outlines the implications of these phenomenological axioms for conscious physical systems. This characterization does not exhaustively describe the theoretical ambition of IIT. The ontological postulates concerning physical systems do not merely articulate necessities, or even sufficiencies, for realizing consciousness. The claim is much stronger than this. IIT identifies consciousness with a system’s having the physical features that the postulates describe. Each conscious state is a maximally irreducible conceptual structure, which just is and can only be a system of irreducibly interdependent physical parts whose causal interaction constitutes the integration of information.

An example may help to clarify the nature of IIT’s explanation of consciousness. Our experience of a cue ball integrates its white color and spherical shape, such that these elements are inseparably fused. The fusion of these elements constitutes the structure of the experience: the experience is composed of them. The nature of the experience informs us about whiteness and spherical shape in a way that distinguishes it from other possible experiences, such as of a blue cube of chalk. This is just a description of the phenomenology of a simple experience (perhaps necessarily awkward, because it articulates the self-evident). Our brain generates the experience through neurons physically communicating with one another in systems linked by cause-effect power. IIT interprets this physical communication as the integration of information, according to the various constraints laid out in the postulates. The neurobiology and phenomenology converge.

Theories of consciousness need to account for what is sometimes termed the “binding problem.” This concerns the unity of conscious experience. Even a simple experience like viewing a cue ball unites different elements such as color, shape, and size. Any theory of consciousness will need to make sense of how this happens. IIT’s account of the integration of information may be understood as a response to this problem.

According to IIT, the physical state of any conscious system must converge with phenomenology; otherwise the kind of information generated could not realize the axiomatic properties of consciousness. We can understand this by contrasting two kinds of information. First, there is Shannon information: When a digital camera takes a picture of a cue ball, the photodiodes operate in causal isolation from one another. This process does generate information; specifically, it generates observer-relative information. That is, the camera generates the information of an image of a cue ball for anyone looking at that photograph. The information that is the image of the cue ball is therefore relative to the observer; such information is called Shannon information. Because the elements of the system are causally isolated, the system does not make a difference to itself. Accordingly, although the camera gives information to an observer, it does not generate that information for itself. By contrast, consider what IIT refers to as intrinsic information: Unlike the digital camera’s photodiodes, the brain’s neurons do communicate with one another through physical cause and effect; the brain does not simply generate observer-relative information, it integrates intrinsic information.  This information from its own perspective just is the conscious state of the brain. The physical nature of the digital camera does not conform to IIT’s postulates and therefore does not have consciousness; the physical nature of the brain, at least in certain states, does conform to IIT’s postulates, and therefore does have consciousness.
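The observer-relative character of Shannon information can be made concrete with a small calculation. The sketch below computes the standard Shannon entropy of a sensor's recorded outputs; the two-photodiode sensor and its readings are invented for illustration. The point to notice is that the calculation uses only the statistics of the outputs as tallied from outside, and nothing about whether the photodiodes affect one another.

```python
from collections import Counter
from math import log2


def shannon_entropy(observations):
    """Shannon entropy, in bits, of a list of observed outcomes."""
    counts = Counter(observations)
    total = len(observations)
    return sum((c / total) * log2(total / c) for c in counts.values())


# Hypothetical readings from a two-photodiode sensor (0 = dark, 1 = lit).
varied_scene = [(0, 0), (0, 1), (1, 0), (1, 1)]
constant_scene = [(1, 1), (1, 1), (1, 1), (1, 1)]

print(shannon_entropy(varied_scene))    # 2.0
print(shannon_entropy(constant_scene))  # 0.0
```

The entropy value belongs to the record as read by an observer. On IIT's account, what is missing from such a figure is any measure of the difference the system's state makes to the system itself.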

To identify consciousness with such physical integration of information constitutes an ontological claim. The physical postulates do not describe one way or even the best way to realize the phenomenology of consciousness; the phenomenology of consciousness is one and the same as a system having the properties described by the postulates. It is even too weak to say that such systems give rise to or generate consciousness. Consciousness is fundamental to these systems in the same way as mass or charge is basic to certain particles.

i. Some Predictions

IIT’s conception of consciousness as mechanisms systematically integrating information through cause and effect lends itself to quantification. The more complex the MICS, the higher the level of consciousness: the corresponding metric is phi. Sometimes the IIT literature uses the term “prediction” to refer to implications of the theory whose falsifiability is a matter of controversy. This section will focus on more straightforward cases of prediction, where the evidence is consistent with IIT’s claims. These cases provide corroborative evidence that enhances the plausibility of IIT.

Deep sleep states are less experientially rich than waking ones. IIT predicts, therefore, that such sleep states will have lower phi values than waking states. For this to be true, analysis of the brain during these contrasting states would have to show a disparity in the systematic complexity of non-redundant mechanisms. On IIT, this disparity of MICS complexity directly implies a disparity in the amount of conscious integrated information, because the MICS is identical to the conscious state. The neuroscientific findings bear out this prediction.

IIT cites similar evidence from the study of patients with brain damage. For example, we already know that among vegetative patients, there are some whose brain scans indicate that they can hear and process language: When researchers prompt such patients to think about playing tennis, the appropriate areas of the brain become activated. Other vegetative patients do not respond this way. Naturally, this suggests that the former have a richer degree of consciousness than the latter. When analyzed according to IIT, the former have a higher phi value than the latter; once again, IIT has made a prediction that receives empirical confirmation. IIT also claims that findings in the analysis of patients under anesthesia corroborate its claims.

In all these cases, one of two things happens. First, as consciousness fades, cortical activity may become less global. This reversion to local cortical activity constitutes a loss of integration: The system no longer is communicating across itself in as complex a way as it had. Second, as consciousness fades, cortical activity may remain global, but become stereotypical, consisting in numerous redundant cause-effect mechanisms, such that the informational achievement of the system is reduced: a loss of information. As information either becomes less integrated or becomes reduced, consciousness fades, which IIT takes as empirical support of its theory of consciousness as integrated information.

c. Characterizing the Argument

IIT combines Cartesian commitments with claims about engineering that it interprets, in part by citing corroborative neuroscientific evidence, as identifying the nature of consciousness. This borrows from recognizable traditions in the field of consciousness studies, but the structure of the argument is novel. While IIT’s proponents strive for clarity in the exposition of their work by breaking it down into the simpler elements of axioms, postulates, and identity claims, the nature of the relations between these parts remains largely implicit in the IIT literature. To evaluate the explanatory success or failure of IIT, it should prove helpful to attempt an explication of these logical relations. This requires characterizing the relationship of the axioms with the postulates, and of the identity claims with the axioms, postulates, and supporting neuroscientific evidence.

The axioms, of course, count as premises. These premises seem to lead to the postulates: each postulate flows from a corresponding axiom. At the same time, IIT describes these postulates as unproven assumptions, which seems at odds with their being conclusions drawn from the axioms. Consider the first axiom and its postulate, concerning existence. The axiom states that consciousness exists, and more specifically, exists intrinsically; the postulate holds that this requires a conscious system to have cause-effect power, and more specifically, to have this power over itself. The link involves, in part, the claim that existence implies cause-effect power. This claim that for a thing to exist, it must be able to make a difference, is plausible, but not self-evident. Nor does the axiomatic premise alone deductively imply this postulate. Epiphenomenalists, for example, claim that conscious mental states, although existent and caused, do not cause further events; they do not make a difference. Epiphenomenalists certainly do not go on to identify consciousness with physically causal systems, as IIT does.

Tononi (2015) adopts the position that the move from the axioms to the postulates is one of inference to the best explanation, or abduction. On this line, while the axioms do not deductively imply the postulates, the postulates have more than mere statistical inductive support. For example, consider the observation that human brains, which on IIT are conscious systems, have cause-effect power over themselves. Minimally, this offers a piece of inductive support for describing conscious systems in general as having such a power. Tononi takes a stronger line, claiming that a system’s property of having cause-effect power over itself most satisfyingly explains its intrinsic existence. So, what makes the brain a system at all, capable of having its own consciousness, is its ability to make a difference to itself. This illustrates the relation of postulates such as the first, concerning cause-effect power, to axioms such as the first axiom, concerning intrinsic existence, by appeal to something like explanatory fit, or satisfactoriness, which is to characterize that relation abductively.

In any case, IIT moves from the sub-conclusion about the postulates to a further conclusion, the identity claim: consciousness is identical to a system’s having the physical properties laid out by the postulates, which realize the phenomenology described by the axioms. Here again, the abductive interpretation remains an option. On this interpretation, the conjunction of the physical features of the postulates provides the most satisfactory explanation for the identity of consciousness.

This breakdown of the argument reveals the separability of the two parts. A less ambitious version of IIT might have limited itself to the first part, claiming that the physical features described by the postulates are the actual and/or best ways of realizing consciousness, or more strongly that they are necessary and/or sufficient, without going on to say that consciousness is identical to a system having these properties. The foregoing paragraphs outlined the possible motivation for the identification claim as lying in the abductive interpretation.

The notion of best explanation is notoriously slippery, but also ubiquitous in science. From an intuitive point of view one might regard the content of the conjunction of the postulates as apt for accounting for the phenomenology, but one might, motivated by theoretical conservatism, stop short of describing this as an identity relation. One clue as to why IIT does not take this tack may lie in IIT’s methodological goal of parsimony, something the literature mentions with some regularity. Perhaps the simplicity of identifying consciousness with a system’s having certain intuitively apt physical properties outweighs the non-conservatism of the claim that consciousness is fundamental to such systems the way mass is fundamental to a particle.

2. The Phi Metric

a. The Main Idea

IIT strives, among other things, not just to claim the existence of a scale of complexity of consciousness, but to provide a theoretical approach to the precise quantification of the richness of experience for any conscious system. This requires calculating the maximal amount of integrated information in a system. IIT refers to this as the system’s phi value, which can be expressed numerically, at least in principle.

Digital photography affords particularly apt illustrations of some of the basic principles involved in quantifying consciousness.

First, a photodiode exemplifies integrated information in the simplest way possible. A photodiode is a system of two elements, which together render it sensitive to two states only: light and dark. After initial input from the environment, the elements communicate input physically with one another, determining the output. So, the photodiode is a two-element system that integrates information. A photodiode not subsumed in another system of greater phi value is the simplest possible example of consciousness.

This consciousness, of course, is virtually negligible. The photodiode’s experience of light and dark is not rich in the way that ours is. The level of information of a state depends upon its specifying that state as distinct from others. The repertoire of the photodiodes allows only for the most limited differentiation (“this” vs. “that”), whereas the repertoire of a complex system such as the brain allows for an enormous amount of differentiation. Even our most basic experience of darkness distinguishes it not only from light, but from shapes, colors, and so forth.

Second, the causal arrangement of a digital camera’s photodiodes neatly exemplifies the distinction between integrated and non-integrated information. Putting to one side that each individual photodiode integrates information as simply as possible, those photodiodes do not take input from or give output to one another, so the information does not get integrated across the system. For this reason, the camera’s image is informative to us, but not to itself.

Each isolated photodiode has integrated information in the most basic way, and would therefore have the lowest possible positive value of phi. The camera’s photodiodes taken as a system do not integrate information and have a phi value of zero.

IIT attributes consciousness to certain systems, or networks. These can be understood abstractly as models. A computer’s hardware may be modelled by logic circuits, which represent its elements and their connections as interconnected logic gates. The way a particular connection within this network mediates input and output determines what kind of logic gate it is. For example, consider a connection that takes two inputs, either of which can be True or False, and then gives one output, True or False. The AND logic gate would give an output of True only if both inputs were True. In other words, if both the one AND the other are True, the AND gate gives a True output; otherwise it gives a False output. Such modelling captures the dynamics of binary systems, with “True” corresponding to 1 and “False” to 0. The arrangement of a network’s various logic gates (which include not only AND, but also OR, NOT, XOR, among others) determines how any particular input to the system at time-step 1 will result in output at time-step 2, and so on for that system. The brain can be modelled this way too. Input can come from a prior brain state, or from other parts of the nervous system, through the senses. The input causes a change in brain state, depending on the organization of the particular brain, which can be articulated in abstract logic.
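
The kind of logic-circuit model described here can be illustrated with a minimal sketch in Python. The three-element network, its wiring, and the names GATES, WIRING, and step are assumptions made purely for illustration; they are not taken from the IIT literature, and the sketch shows only how a network's gates fix its state from one time-step to the next, not how phi is computed.

```python
from typing import Callable, Dict, Tuple

# Two-input logic gates over binary states (1 = True, 0 = False).
GATES: Dict[str, Callable[[int, int], int]] = {
    "AND": lambda a, b: a & b,  # 1 only if both inputs are 1
    "OR": lambda a, b: a | b,   # 1 if at least one input is 1
    "XOR": lambda a, b: a ^ b,  # 1 if exactly one input is 1
}

# Hypothetical wiring: each element reads the other two elements.
# element -> (gate type, first input element, second input element)
WIRING: Dict[str, Tuple[str, str, str]] = {
    "A": ("OR", "B", "C"),
    "B": ("AND", "A", "C"),
    "C": ("XOR", "A", "B"),
}


def step(state: Dict[str, int]) -> Dict[str, int]:
    """Compute the network state at the next time-step from the current one."""
    return {
        name: GATES[gate](state[src1], state[src2])
        for name, (gate, src1, src2) in WIRING.items()
    }


state = {"A": 1, "B": 0, "C": 0}
for t in range(1, 4):
    print(f"time-step {t}: {state}")
    state = step(state)
```

Because every element takes its input from the other elements, the state at each time-step depends on the whole network's previous state; this is the sense in which such a model represents a system that makes a difference to itself.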

In order to measure the level of consciousness of a system, IIT must describe the amount of its integrated information. This is done by partitioning the system in various ways. If the digital camera’s photodiodes are partitioned, say, by dividing the abstract model of its elements in half, no integrated information is lost, because all the photodiodes are in isolation from each other, and so the division does not break any connections. If no logically possible partition of the system results in a loss of connection, the conclusion is that the system does not make a difference to itself. So, in this case, the system has no phi.
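
The partitioning idea can be sketched in Python under a strong simplifying assumption: here the "cost" of a partition is just the number of connections it severs. This is a crude stand-in chosen for illustration, not IIT's actual phi measure, which compares cause-effect structures rather than counting links. The function names and the two example networks are hypothetical.

```python
from itertools import combinations
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (source element, target element)


def bipartitions(elements: List[str]) -> Iterable[Tuple[Set[str], Set[str]]]:
    """Yield every split of the elements into two non-empty parts, once each."""
    first, rest = elements[0], elements[1:]
    for size in range(len(rest) + 1):
        for chosen in combinations(rest, size):
            part_a = {first, *chosen}
            part_b = set(elements) - part_a
            if part_b:  # skip the trivial split that leaves one side empty
                yield part_a, part_b


def severed(edges: Iterable[Edge], part_a: Set[str]) -> int:
    """Count connections that cross the boundary between the two parts."""
    return sum((src in part_a) != (dst in part_a) for src, dst in edges)


def weakest_cut(elements: List[str], edges: List[Edge]) -> int:
    """Smallest number of connections severed by any bipartition (needs >= 2 elements)."""
    return min(severed(edges, part_a) for part_a, _ in bipartitions(elements))


photodiodes = ["p1", "p2", "p3", "p4"]

# Camera sensor: the photodiodes are causally isolated, so there are no connections.
camera_edges: List[Edge] = []

# Hypothetical re-entrant loop over the same four elements.
loop_edges: List[Edge] = [("p1", "p2"), ("p2", "p3"), ("p3", "p4"), ("p4", "p1")]

print(weakest_cut(photodiodes, camera_edges))  # 0: no partition breaks anything
print(weakest_cut(photodiodes, loop_edges))    # 2: every split breaks the loop twice
```

In the camera case every bipartition severs zero connections, which mirrors the conclusion that the system as a whole does not make a difference to itself. The loop, by contrast, cannot be divided without loss.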

Systems of interest to IIT will have connections that will be lost by some partitions and not by others. Some partitions will sever from the system elements that are comparatively low in original degree of connectivity to the system, in other words elements whose (de)activation has few causal consequences upon the (de)activation of other elements. A system where all or most elements have this property will have low phi. The lack of strong connectivity may be the result of relative isolation, or locality, an element not linking to many other elements, directly or indirectly. Or it could be from stereotypicality, where the element’s causal connections overlap in a largely redundant way with the causal connection of other elements. A system whose elements are connected more globally and non-redundantly will have higher phi. A partition that not only separates all elements that do not make a difference to the rest of the system for reasons of either isolation or redundancy from those that do make a difference, but also separates those elements whose lower causal connectivity decreases the overall level of integration of the system from those that do not, will thereby have picked out the maximally irreducible conceptual structure (MICS), which according to IIT is conscious. The degree of that consciousness, its phi, depends upon its elements’ level of causal connectivity. This is determined by how much information integration would be lost by the least costly further partition, or, in other words, how much the cause-effect structure of the system would be reduced by eliminating the least causally effective element within the MICS.

It is important to note that not every system with phi has consciousness. A sub- or super-system of an MICS may have phi but will not have consciousness.

If we were to take a non-MICS subsystem of a network, which in isolation still has causal power over itself, articulable as a logic circuit, then that would have phi. Were it indeed in isolation, it would have its own MICS, and its phi would correspond to that system’s degree of consciousness. It is, however, not in isolation, but rather part of a larger system.

IIT interprets the exclusion axiom—that any conscious system is in one conscious state only, excluding all others—as implying a postulate that holds that, at the level of the physical system, there be no “double counting” of consciousness. So, although a system may have multiple subsystems with phi, only the MICS is conscious, and only the phi value of the MICS (sometimes called phi max) measures conscious degree. The other phi values measure degrees of non-conscious integrated information. So, for example, each of a person’s visual cortices does not enjoy its own consciousness, but parts of each belong to a single MICS, which is the person’s one unitary consciousness.

If we were to take a supersystem of an MICS, one that includes the MICS and also other associated elements with lower connectivity, we could again assign it a phi value, but this would not measure the local maximum of integrated information. The supersystem integrates information, but not maximally, and its phi is therefore not a measure of consciousness. This is probably best understood by example: a group of people in a discussion integrate information, but the connective degree among them is lower than the degree of connectivity within each individual. The group as such has no consciousness, but each individual person—or, more properly, the MICS of each—does. The individuals’ MICSs are local maxima of integrated information and therefore conscious.

b. Some Issues of Application

The number of possible partitions of a system, given by the Bell number for its number of elements, grows immensely as the number of elements increases. For example, the tiny nematode, a simple species of worm, has 302 neurons, “and the number of ways that this network can be cut into parts is the hyperastronomical 10 followed by 467 zeros” (Koch, 2012). Calculating phi precisely for much more complex systems such as brains is computationally infeasible in practice, although not in principle. In the absence of precise phi computation, IIT employs mathematical “heuristics, shortcuts, and approximations” (ibid.). The IIT literature includes several different mathematical interpretations of phi calculation, each intended to replace the last; it is not yet clear that IIT has a settled account of it. Proponents of IIT hold that the mathematical details will enable the application, but not bear on the merits, of the deeper theoretical claims. At least one serious objection to IIT, however, attempts a reductio ad absurdum of those deeper claims via analysis of the mathematical implications.
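
The growth of the number of partitions can be made concrete with a short Python sketch that computes Bell numbers using the standard Bell triangle construction. The function name bell_number is an assumption for illustration. The sketch counts set partitions only; it does not compute phi, and no claim is made here about how its output compares with the figure Koch quotes.

```python
def bell_number(n: int) -> int:
    """Return the n-th Bell number: the number of ways to partition a set of n elements."""
    row = [1]  # first row of the Bell triangle; its first entry is B_0
    for _ in range(n):
        # Each new row starts with the last entry of the previous row.
        next_row = [row[-1]]
        # Each further entry is the entry to its left plus the one above that.
        for value in row:
            next_row.append(next_row[-1] + value)
        row = next_row
    return row[0]


for n in (1, 2, 3, 4, 5, 10):
    print(n, bell_number(n))

# Python integers have arbitrary precision, so the count for a 302-element
# network can be computed exactly; only its number of decimal digits is shown.
print(len(str(bell_number(302))))
```

Even at ten elements there are already 115,975 distinct partitions, which gives a sense of why exhaustive evaluation becomes impractical long before a system reaches the size of a nervous system.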

It is clear that, whatever the mathematical details, the basic principles of phi imply that biological nervous systems such as the brain will be capable of having very high phi, because neurons often have thousands of connections to one another. On the other hand, a typical circuit in a standard CPU only makes a few connections to other circuits, limiting the potential phi value considerably. It is also clear that even simple systems will have at least some phi value, and that, provided they constitute local maxima, will have a corresponding measure of consciousness.

IIT does not intend phi as a measure of the quality of consciousness, only of its quantity. Two systems may have the same phi value but different MICS organizations. In this case, each would be conscious to the same degree, but the nature of the conscious experience would differ. The phi metric captures one dimension, the amount of integrated information, of a system. IIT does address the quality of consciousness abstractly, although not with phi. A system’s model includes its elements and their connections, whose logic can be graphed as a constellation of (de)activated points with lines between them representing (de)activated connections. This is, precisely, its conceptual structure. Recall that the maximally irreducible conceptual structure, or MICS, is, on IIT, conscious. A graph of the MICS, according to IIT, captures its unique shape in quality space, the shape of that particular conscious state. In other words, this is the abstract form of the quality of the experience, or the experience’s form “seen from the outside.” The perspective “from the inside” is available only to the system itself, whose making differences to itself intrinsically, integrating information in one of many possible forms, just precisely is its experience. The phenomenological nature of the experience, its “qualia,” are evident only from the perspective of the conscious system, but the logical graph of its structure is a complete representation of its qualitative properties.

3. Situating the Theory

a. Some Prehistory

IIT made its explicit debut in the literature in 2004, but has roots in earlier work. Giulio Tononi, the theory’s founder and major proponent, worked for many years with Gerald Edelman. Their work rejected the notion that mental events such as consciousness will ever find full explanation by reference to the functioning of a system. Such functionalism, to them, ignores the crucial issue of the physical substrate itself. They especially emphasized the importance of re-entry. To them, only a system composed of feedback loops, where input may also serve as output, can integrate information. Feed-forward systems, then, do not integrate information. Even before the introduction of IIT, Tononi was claiming that integrated information was essential to the creation of a scene in primary consciousness.

Christof Koch, now a major proponent of IIT, collaborated for a long time with Francis Crick. Much of their earlier work focused on identifying the neural correlates of consciousness (NCC), especially in the visual system. While such research advances particular knowledge about the mechanisms of one set of conscious states, Crick and Koch came to see this work as failing to address the deeper problems of explaining consciousness generally. Koch also rejects the idea that identifying the functional dynamics of a system aptly treats what makes that system conscious. He came to regard information theory as the correct approach for explaining consciousness.

So, the two thinkers who became IIT’s chief advocates arrived at that position after close neuroscientific research with Nobel Laureates who eschewed functional approaches to consciousness, favoring investigation of the relation of physical substrate to information generation.

b. IIT’s Additional Support

As of 2016, Tononi, IIT’s creator, runs the Center for Sleep and Consciousness at the University of Wisconsin-Madison. The Center has more than forty researchers, many of whom work in its IIT Theory Group. Koch, a major supporter of IIT, heads the prestigious Allen Institute for Brain Science. The Institute has links with the White House Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative, as well as the European Human Brain Project (HBP). IIT’s body of literature continues to grow, often in the form of publications associated with the Center and the Institute. The public reputation of these organizations, as well as of Tononi and Koch’s earlier work, lends a certain authority or celebrity to IIT. The theory has enjoyed ample attention in mainstream media. Nevertheless, IIT remains a minority position among neuroscientists and philosophers.

c. IIT as Sui Generis

IIT does not fit neatly into any other school of thought about consciousness; there are points of connection to and departures from many categories of consciousness theory.

IIT clearly endorses a Cartesian interpretation of consciousness as immediate; such an association is unusual for a self-described naturalistic or scientific theory of consciousness. Cartesian convictions do inform IIT’s axioms and so motivate its overall methodological approach. To label IIT as a Cartesian theory generally, however, would be misleading. For one thing, like most modern theories of consciousness, it dissociates itself from the idea of a Cartesian theatre, or single point in the brain that is the seat of consciousness. Moreover, it is by no means clear how IIT stands in relation to dualism. Certainly, IIT does not advertise itself as positing a mental substance that is separate from the physical. At the same time, it draws an analogy between its identification of consciousness as a property of certain integrated information systems and physics’ identification of mass or charge as a property of particles. One might interpret such introduction of immediate experience into the naturalistic ontology as having parallels with positing a new kind of mental substance.  The literature will occasionally describe IIT as a form of materialism. It is true that IIT theorists do focus on the material substrate of informational systems, but again, one might challenge whether a theory that asserts direct experience as fundamental to substrates with particular architectural features is indeed limiting itself to reference to material in its explanation.

In describing the features of conscious systems, IIT will make reference to function, but IIT rejects functionalism outright. To IIT theorists, articulating the functional dynamics of a system alone will never do justice to the immediate nature of experience.

d. Relation to Panpsychism

The IIT literature, not only from Tononi but also from Koch and others, refers with some regularity to panpsychism—broadly put, any metaphysical system that attributes mental properties to basic elements of the world—as sharing important ground with IIT. Panpsychism comes in different forms, and the precise relationship between it and IIT has yet to be established. Both IIT and panpsychism strongly endorse Cartesian commitments concerning the immediate nature of experience. IIT, however, only attributes mental properties to re-entrant architectures, because it claims that only these will integrate information; this is inconsistent with any version of panpsychism that insists upon attributing mental properties to even more basic elements of the structure of existence.

i. Relation to David Chalmers

Of the various contemporary philosophical accounts of consciousness, IIT intersects perhaps most frequently with the work of David Chalmers. This makes sense, not only given Chalmers’s panpsychist leanings, but also given the express commitment of both to a Cartesian acceptance of the immediacy of experience, and a corresponding rejection of functionalist attempts to explain consciousness.

Moreover, Chalmers’s discussion of the relation of information to consciousness strongly anticipates IIT. Before the introduction of IIT, Chalmers had already endorsed the view of information as involving specification, or reduction of uncertainty. IIT often echoes this, especially in connection with the third axiom and postulate. Chalmers also characterizes information as making a difference, which relates to IIT’s first postulate especially. These notions of specification and difference-making are familiar from the standard Shannon account of information, but the point is that both Chalmers and IIT choose to understand consciousness partly by reference to these concepts rather than to information about something.

A major theme in Chalmers’s work involves addressing the problem of the precise nature of the connection between physical systems and consciousness. Before IIT, Chalmers speculated that information seems to be the connection between the two. If this is the case, then phenomenology is the realization of information. Chalmers suggests that information itself may be primitive in the way that mass or charge is. Now, this does not directly align with IIT’s later claims of how consciousness is fundamental to certain informational systems in the same way that mass or charge is fundamental to particles, but the parallels are clear enough. Similarly, Chalmers’s description of a “minimally sufficient neural system” as the neural correlate of consciousness resembles IIT’s discussion of the MICS. Both also use the term “core” in this context. It comes as no surprise that IIT and Chalmers have intersected when we read Chalmers’s earlier claims: “Perhaps, then, the intrinsic nature required to ground the information states is closely related to the intrinsic nature present in phenomenology. Perhaps one is even constitutive of the other” (Chalmers, 1996).

Still, the relationship between Chalmers’s work and IIT is not one of simple alliance. Despite the apparent similarity of their positions on what is fundamental, there is an important disagreement. Chalmers takes the physical to derive from the informational, and grounds the realization of phenomenal space—the instantiation of conscious experience—not upon causal “differences that make a difference,” but upon the intrinsic qualities of and structural relations among experiences. IIT regards consciousness as being intrinsic to certain causal structures, which might be read as the reverse of Chalmers’s claim.  In describing his path to IIT, Koch endorses Chalmers as “a philosophical defender of information theory’s potential for understanding consciousness” while faulting Chalmers’s work for not addressing the internal organization of conscious systems (Koch, 2012). To Koch, treating the architecture is necessary because consciousness does not alter in simple covariance with change in amounts of bits of information. Because IIT addresses the physical organization, it struck Koch as superior.

There has been some amount of cross-reference between IIT and Chalmers in the literature, although important differences are apparent. For example, Chalmers famously discusses the possibility of a zombie, or operating physical match of a human that does not have experience. On IIT, functional zombies are possible, but not zombies whose nervous connections duplicate our own. In other words, if a machine were built to imitate the behavior of a human perfectly, but whose hardware involved feed-forward circuits, then it would generate possibly no phi, or more likely low, local phi, rather than the high phi of human consciousness. But if we posit that the machine replicates the connections of the human down to the level of the hardware, then it would follow that the system would integrate the same level of phi and would be equally conscious.

Chalmers (2016) writes that IIT “can be construed as a form of emergent panpsychism,” which is true in a sense, but requires qualification. By “emergent panpsychism” Chalmers means that IIT posits consciousness as fundamental not to the merest elements of existence but to certain structures that emerge at the level of particles’ relations to one another.  This is a fair assessment, whether or not IIT’s advocates choose to label the theory this way. But in the difference between emergent and non-emergent lies the substance of IIT: what precisely makes one structure “emerge” conscious and another not is what IIT hopes to explain. Non-emergent panpsychism of the sort associated with Chalmers by definition pitches its explanation at a different level. Indeed, it does not necessarily grant the premise that there exist non-conscious elements, let alone structures, in the first place. Despite the similarities between IIT and some of Chalmers’s work, the two should not be confused.

4. Implications

a. The Spectrum of Consciousness

It is widely accepted that humans experience varying degrees of consciousness. In sleep, for example, the richness of experience diminishes, and sometimes we do not experience at all. IIT implies that brain activity during this time will generate either less information or less integrated information, and interprets experimental results as bearing this out. On IIT, the complexity of physical connections in the MICS corresponds to the level of consciousness. By contrast, the cerebellum, which has many neurons, but neurons that are not complexly interconnected and so do not belong to the MICS, does not generate consciousness.

There does not exist a widely accepted position on non-human consciousness. IIT counts among its merits that the principles it uses to characterize human consciousness can apply to non-human cases.

On IIT, consciousness happens when a system makes a difference to itself at a physical level: elements causally connected to one another in a re-entrant architecture integrate information, and the subset of these with maximal causal power is conscious. The human brain offers an excellent example of re-entrant architecture integrating information, capable of sustaining highly complex MICSs, but nothing in IIT limits the attribution of consciousness to human brains only.

Mammalian brains share similarities in neural and synaptic structure: the human case is not obviously exceptional. Other, non-mammalian species demonstrate behavior associated in humans with consciousness. These considerations suggest that humans are not the only species capable of consciousness. IIT makes a point of remaining open to the possibility that many other species may possess at least some degree of consciousness. At the same time, further study of non-human neuroanatomy is required to determine whether and how this in fact holds true. As mentioned above, even the human cerebellum does not have the correct architecture to generate consciousness, and it is possible that other species have neural organizations that facilitate complex behavior without generating high phi. The IIT research program offers a way to establish whether these other systems are more like the cerebellum or the cerebral cortex in humans. Of course, consciousness levels will not correspond completely to species alone. Within conscious species, there will be a range of phi levels, and even within a conscious phenotype, consciousness will not remain constant from infancy to death, wakefulness to sleep, and so forth.

IIT claims that its principles are consistent with the existence of cases of dual consciousness within split-brain patients. In such instances, on IIT, two local maxima of integrated information exist separately from one another, generating separate consciousness. IIT does not hold that a system need have only one local maximum, although this may be true of normal brains; in split-brain patients, the re-entrant architecture has been severed so as to create two. IIT also takes its identification of MICSs through quantification of phi as a potential tool for assessing other actual or possible cases of multiple consciousness within one brain.

Such claims also allow IIT to rule out instances of aggregate consciousness. The exclusion principle forbids double-counting of consciousness. A system will have various subsystems with phi value, but only the local maxima of phi within the system can be conscious. A normal waking human brain has only one conscious MICS, and even a split-brain patient’s conscious systems do not overlap but rather are separate. One’s conscious experience is precisely what it is and nothing else. All this implies that, for example, the United States of America has no superordinate consciousness in addition to the consciousness of its individuals. The local maxima of integrated information reside within the skulls of those individuals; the phi value of the connections among them is much lower.

Although IIT allows for a potentially very wide range of degrees of consciousness and conscious entities, this has its limits. Some versions of panpsychism attribute mental properties to even the most basic elements of the structure of the world, but the simplest conscious entity admitted on IIT to be conscious would have to be a system of at least two elements that have cause-effect power over one another. Otherwise no integrated information exists. Objects such as rocks and grains of sand have no phi, whether in isolation or heaped into an aggregate, and so no consciousness.

IIT’s criteria for consciousness are consistent with the existence of artificial consciousness. The photodiode, because it integrates information, has a phi value; if not subsumed into a system of higher phi, this will count as a local maximum: the simplest possible MICS or conscious system. Many or most instances of phi and consciousness may be the result of evolution in nature, independent of human technology, but this is a contingent fact. Often technological systems involve feed-forward architecture that lowers or possibly eliminates phi, but if the system is physically re-entrant and satisfies the other criteria laid out by IIT, it may be conscious. In fact, according to IIT, we may build artificial systems with a greater degree of consciousness than humans.

b. IIT and Physics

IIT has garnered some attention in the physics literature. Even if one accepts the basic principles of IIT, it still remains open to offer a different account of the physical particulars. Tononi and the other proponents of IIT coming from neuroscientific backgrounds tend to offer description at a classical grain. They frame integrated information by reference to neurons and synapses in the case of brains, and to re-entrant hardware architecture in the case of artificial systems. Such descriptions stay within the classical physics paradigm. This does not exhaust the theoretical problem space for characterizing integrated information.

One alternative (Barrett, 2014) proposes that consciousness comes from the integration of information intrinsic to fundamental fields. This account calls for reconceiving the phi metric, which in its 2016 form applies only to discrete systems, and not to electromagnetic fields. Another account (Tegmark, 2015) also proposes a non-classical physical description of conscious, integrated information. This generalizes beyond neural or neural-type systems to quantum systems, suggesting that consciousness is a state of matter, whimsically labelled “perceptronium.”

c. Artificial Consciousness

IIT’s basic arguments imply, and the IIT literature often explicitly claims, certain important constraints upon artificial conscious systems.

i. Constraints on Structure/Architecture

At the level of hardware, computation may process information with either feed-forward or re-entrant architecture. In feed-forward systems, information gets processed in only one direction, taking input and giving output. In re-entrant systems, which consist of feedback loops, signals are not confined to movement in one direction only; output may operate as input also.

IIT interprets the integration axiom (the fourth axiom, which says that each experience’s phenomenological elements are interdependent) as entailing the fourth postulate, which claims that each mechanism of a conscious system must have the potential to relate causally to the other mechanisms of that system. By definition, in a feed-forward system, mechanisms cannot act as causes upon those parts of the system from which they take input. A purely feed-forward system would have no phi, because although it would process information, it would not integrate that information at the physical level.
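The structural distinction drawn here can be illustrated with a small sketch. It treats a system as a directed graph of mechanisms and checks for feedback loops. The function name `is_reentrant` and the example wirings are hypothetical, introduced only for illustration. The check is purely structural: it does not compute phi, and the presence of a loop says nothing about how much integrated information a system has.

```python
from typing import Dict, List

def is_reentrant(graph: Dict[str, List[str]]) -> bool:
    """Return True if the directed graph contains at least one feedback loop (cycle)."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in graph}

    def visit(node: str) -> bool:
        colour[node] = GREY
        for nxt in graph.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # An edge back to a node on the current path closes a loop.
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)

# Hypothetical wirings: signals flow one way only in the first, and loop back in the second.
feed_forward = {"sensor": ["hidden"], "hidden": ["output"], "output": []}
re_entrant = {"a": ["b"], "b": ["c"], "c": ["a"]}

print(is_reentrant(feed_forward))  # False
print(is_reentrant(re_entrant))    # True
```

On IIT's account, only the second kind of wiring is even a candidate for non-zero phi; whether it actually has any depends on the further criteria the theory lays out.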

One implication for artificial consciousness is immediately clear: Feed-forward architectures will not be conscious. Even a feed-forward system that perfectly replicated the behavior of a conscious system would only simulate consciousness. Artificial systems would need to have re-entrant structure to generate consciousness.

Furthermore, re-entrant systems may still generate very low levels of phi. Conventional CPUs have transistors that only communicate with several others. By contrast, each neuron of the conscious network of the brain connects with thousands of others, a far more complex re-entrant structure, making a difference to itself at the physical level in such a way as to generate much higher phi value. For this reason, brains are capable of realizing much richer consciousness than conventional computers. The field of artificial consciousness, therefore, would do well to emulate the neural connectivity of the brain.

Still another constraint applies, this one associated with the fifth postulate, the postulate of exclusion. A system may have numerous phi-generating subsystems, but according to IIT, only the network of elements with the greatest cause-effect power to integrate information—the maximally irreducible conceptual structure, or MICS—is conscious. Re-entrant systems may have local maxima of phi, and therefore small pockets of consciousness. Those attempting to engineer high degrees of artificial consciousness need to focus their design on creating a large MICS, not simply small, non-overlapping MICSs.
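The exclusion postulate can be pictured with a toy selection procedure. The sketch assumes that phi values for candidate subsystems are already known; the numbers below are made up for illustration, and computing real phi values is a separate and much harder problem. The function `select_complexes` is a hypothetical name, and the greedy rule it uses (take the highest-phi candidate, discard everything that overlaps it, repeat) is one simple way of reading "only non-overlapping local maxima count," not a procedure drawn from the IIT literature.

```python
from typing import Dict, FrozenSet, List, Tuple

def select_complexes(
    phi_by_subset: Dict[FrozenSet[str], float],
) -> List[Tuple[FrozenSet[str], float]]:
    """Greedily keep the highest-phi candidates that share no elements."""
    chosen: List[Tuple[FrozenSet[str], float]] = []
    ranked = sorted(phi_by_subset.items(), key=lambda item: item[1], reverse=True)
    for subset, phi in ranked:
        if phi <= 0:
            continue  # no integrated information, so not a candidate at all
        if all(subset.isdisjoint(kept) for kept, _ in chosen):
            chosen.append((subset, phi))
    return chosen

# Hypothetical phi values for overlapping candidate subsystems of elements A..E.
candidates = {
    frozenset("ABC"): 1.5,
    frozenset("AB"): 0.2,
    frozenset("CD"): 0.4,
    frozenset("DE"): 0.3,
}

for subset, phi in select_complexes(candidates):
    print(sorted(subset), phi)
# ['A', 'B', 'C'] 1.5
# ['D', 'E'] 0.3
```

In this toy case the subsystems {A, B} and {C, D} have phi values of their own but are excluded because each overlaps the stronger {A, B, C}. This mirrors the point that a design with many small pockets of phi is not thereby a design with one large MICS.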

If IIT is correct in placing such constraints upon artificial consciousness, deep convolutional networks such as GoogLeNet and advanced projects like Blue Brain may be unable to realize high levels of consciousness.

ii. Relation to “Silent Neurons”

IIT’s third postulate has a somewhat counterintuitive implication. The third axiom claims that each conscious experience is precisely what it is; that is, it is distinct from other experiences. The third postulate claims that, in order to realize this feature of consciousness, a system must have a range of possible states, describable by reference to the cause-effect repertoires of its mechanistic elements. The system realizes a specific conscious state by instantiating one of those particular physical arrangements.

An essential component of the phenomenology of a conscious state is the degree of specificity: a photodiode that registers light specifies one of only two possible states. IIT accepts that such a simple mechanism, if not subsumed under a larger MICS, must be conscious, but only to a negligible degree. On the other hand, when a human brain registers light, it distinguishes it from countless other states; not only from dark, but from different shades of color, from sound, and so forth. The brain state is correspondingly more informative than the photodiode state.
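The contrast in specificity can be given a rough numerical form. The sketch below assumes the simplest case, in which all alternative states are equally likely, so that picking out one state among N rules out the rest and carries log2(N) bits. This is only a counting illustration: IIT's own measures are defined over cause-effect repertoires and are considerably more elaborate. The function name and the figure of 2**20 states are arbitrary choices for illustration, not estimates for any brain.

```python
import math

def bits_to_specify(num_states: int) -> float:
    """Bits needed to single out one state among equally likely alternatives."""
    if num_states < 1:
        raise ValueError("a system must have at least one possible state")
    return math.log2(num_states)

photodiode_states = 2        # light or dark
richer_system_states = 2 ** 20  # arbitrary stand-in for a much larger repertoire

print(bits_to_specify(photodiode_states))     # 1.0
print(bits_to_specify(richer_system_states))  # 20.0
```

The photodiode's state rules out a single alternative, whereas the state of the richer system rules out more than a million.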

This means that not only active neuronal firing, but also neuronal silence, determines the nature of a conscious state. Inactive parts of the complex, as well as active ones, contribute to the specification at the physical level, which IIT takes as the realization of the conscious state.

It is important not to conflate silent, or inactive, neurons of this kind with de-activated neurons. Only neurons that genuinely fall within the cause-effect repertoires of the mechanistic elements of the system count as contributing to specification, and this applies to inactive as well as to active ones. If a neuron was incapable all along of having causal power within the MICS, its inactivity plays no role in generating phenomenology. Likewise, IIT predicts that should neurons otherwise belonging to cause-effect repertoires of the system be rendered incapable of such causation (for example, by optogenetics), their inactivity would not contribute to phenomenology.

5. Objections

a. The Functionalist Alternative

According to functionalism, mental states, including states of consciousness, find explanation by appeal to function. The nature of a certain function may limit the possibilities for its physical instantiation, but the function, and not the material details, is of primary relevance. IIT differs from functionalism on this basic issue: on IIT, the conscious state is identified with the way in which a system embodies the physical features that IIT’s postulates describe.

Their opposing views concerning constraints upon artificial consciousness nicely illustrate the contrast between functionalism and IIT. For the functionalist, any system that functions identically to, for example, a conscious human, will by definition have consciousness. Whether the artificial system uses re-entrant or feed-forward architecture is a pragmatic matter. It may turn out that re-entrant circuitry more efficiently realizes the function, but even if the system incorporates feed-forward engineering, so long as the function is achieved, the system is conscious. IIT, on the other hand, expressly claims that a system that performed in a way completely identical to a conscious human, but that employed feed-forward architecture, would only simulate, but not realize consciousness. Put simply, such a system would operate as if it were integrating information, but because its networks would not take output as input, would not actually integrate information at the physical level. The difference would not be visible to an observer, but the artificial system would have no conscious experience.

i. Rejecting Cartesian Commitments

Those who find functionalism unsatisfactory often take it as an inadequate account of phenomenology: no amount of description of functional dynamics seems to capture, for example, our experience of the whiteness of a cue ball. Indeed, IIT entertains even broader suspicions. Beginning with descriptions of physical systems may never lead to explanations of consciousness. Rather, IIT’s approach begins with what it takes to be the fundamental features of consciousness. These self-evident, Cartesian descriptors of phenomenology then lead to postulates concerning their physical realization; only then does IIT connect experience to the physical.

This methodological respect for Cartesian intuitions has a clear appeal, and the IIT literature largely takes this move for granted, rather than offering outright justification for it. In previous work with Edelman (2000), Tononi discusses machine-state functionalism, an early form of functionalism that identified a mental state entirely with its internal, “machine” state, describable in functional terms. Noting that Putnam, machine-state functionalism’s first advocate, came to abandon the theory because meanings are not sufficiently fixed by internal states alone, Tononi rejects functionalism generally. More recently, Koch (2012) describes much work in consciousness as “models that describe the mind as a number of functional boxes” where one box is “magically endowed with phenomenal awareness.” Koch confesses to being guilty of this in some of his earlier work. He then points to IIT as an exception.

Functionalism is not receiving a full or fair hearing in these instances. Machine-state functionalism is a straw man: contemporary versions of functionalism do not commit to an entirely internal explanation of meaning, and not all functionalist accounts are subject to the charge of arbitrarily attributing consciousness to one part of a system. The success or failure of functionalism turns on its treatment of the Cartesian intuitions we all have that consciousness is immediate, unitary, and so on. Rather than taking these intuitions as evidence of the unavoidable truth of what IIT describes in its axioms, functionalism offers a subtle alternative. Consciousness indeed seems to us direct and immediate, but functionalists argue that this “seeming” can be adequately accounted for without positing a substantive phenomenality beyond function. Functionalists claim that the seeming immediacy of consciousness receives sufficient explanation as a set of beliefs and dispositions to believe that consciousness is immediate. The challenge lies in giving a functionalist account of such beliefs: no mean feat, but not the deep mystery that non-functionalists construe consciousness as posing. If functionalism is correct in this characterization of consciousness, it undercuts the very premises of IIT.

ii. Case Study: Access vs. Phenomenal Consciousness

Function may be understood in terms of access. If a conscious system has cognitive access to an association or belief, then that association or belief is conscious. In humans, access is often taken to be demonstrated by verbal reporting, although other behaviors may indicate cognitive access. Functionalists hold that cognitive access exhaustively describes consciousness (Cohen and Dennett, 2012). Others hold that subjects may be phenomenally conscious of stimuli without cognitively accessing them.

Interpretation of the relevant empirical studies is a matter of controversy. The phenomenon known as “change blindness” occurs when a subject fails to notice subtle differences between two pictures, even while reporting thoroughly perceiving each. Dennett’s version of functionalism, at least, interprets this as the subject not having cognitive access to the details that have changed, and moreover as not being conscious of them. The subject overestimates the richness of his or her conscious perception. Certain non-functionalists claim that the subject does indeed have the reported rich conscious phenomenology, even though cognitive access to that phenomenal experience is incomplete. Block (2011), for instance, holds this interpretation, claiming that “perceptual consciousness overflows cognitive access.” On this account, phenomenal consciousness may occur even in the absence of access consciousness.

IIT’s treatment of the role of silent neurons aligns with the non-functionalist interpretation. On IIT, a system’s consciousness grows in complexity and richness as the number of elements that could potentially relate causally within the MICS grows. Such elements, even when inactive, contribute to the specification of the integrated information, and so help to fix the phenomenal nature of the experience. In biological systems, this means that silent but potentially active neurons matter to consciousness.

Such silent neurons are not accessed by the system. These non-accessed neurons still contribute to consciousness. As in Block’s non-functionalism, access is not necessary for consciousness. On IIT, it is crucial that these neurons could potentially be active, so they must be accessible to the system. Block’s account is consistent with this in that he claims that the non-accessed phenomenal content need not be inaccessible. (Koch, separately from his support of IIT, takes the non-functionalist side of this argument (Koch and Tsuchiya, 2007); so do Fahrenfort and Lamme (2012); for a functionalist response to the latter, see Cohen and Dennett (2011, 2012).)

Non-functionalist accounts that argue for phenomenal consciousness without access make sense given a rejection of the functionalist claim that phenomenality may be understood as a set of beliefs and associations, rather than a Cartesian, immediate phenomenology beyond such things.

iii. Challenging IIT’s Augmentation of Naturalistic Ontology

Any account of consciousness that maintains that phenomenal experience is immediately first-personal stands in tension with naturalistic ontology, which holds that even experience in principle will receive explanation without appeal to anything beyond objective, or third-personal, physical features. Among theories of consciousness, those versions of panpsychism that attribute mental properties to basic structural elements depart perhaps most obviously from the standard scientific position. Because IIT limits its attribution of consciousness to particular physical systems, rather than to, for example, particles, it constitutes a somewhat more conservative position than panpsychism. Nevertheless, IIT’s claims amount to a radical reconception of the ontology of the physical world.

IIT’s allegiance to a Cartesian interpretation of experience from the outset lends itself to a non-naturalistic interpretation, although not every step in IIT’s argumentation implies a break from standard scientific ontology. IIT counts among its innovations the elucidation of integrated information, achieved when a system’s parts make a difference intrinsically, to the system itself. This differs from observer-relative, or Shannon, information, but by itself stays within the confines of naturalism: for example, IIT could have argued that integrated information constitutes an efficient functional route to realizing states of awareness.

Instead, IIT makes the much stronger claim that such integrated information, provided it is locally maximal, is identical to consciousness. The IIT literature is quite explicit on this point, routinely offering analogies to other fundamental physical properties. Consciousness is fundamental to integrated information in the same way as it is fundamental to mass that space-time bends around it. The degree and nature of any given phenomenal feeling follow basically from the particular conceptual structure that is the integrated information of the system. Consciousness is not a brute property of physical structure per se, as it is in some versions of panpsychism, but it is inextricable from physical systems with certain properties, just as mass or charge is inextricable from some particles. So, IIT is proposing an addition to what science admits into its ontology.

The extraordinary nature of the claim does not necessarily undermine it, but it may be cause for reservation. One line of objection to IIT might claim that this augmentation of naturalistic ontology is non-explanatory, or even ad hoc. We might accept that biological conscious systems possess neurology that physically integrates information in a way that converges with phenomenology (as outlined in the relation of the postulates to the axioms) without taking this as sufficient evidence for an identity relation between integrated information and consciousness.  In response, IIT advocates might claim that the theory’s postulates give better ontological ground than functionalism for picking out systems in the first place.

b. Aaronson’s Reductio ad Absurdum

The computer scientist Scott Aaronson (on his blog Shtetl-Optimized; see Horgan (2015) for an overview) has compelled IIT to admit a counterintuitive implication. Certain systems, which are computationally simple and seem implausible candidates for consciousness, may have values of phi higher even than those of human brains, and would count as conscious on IIT. Aaronson’s argument is intended as a reductio ad absurdum; the IIT response has been to accept its conclusion, but to deny the charge of absurdity. Aaronson’s basic claim involves applying phi calculation. Advocates of IIT have not questioned Aaronson’s mathematics, so the philosophical relevance lies in the aftermath.

IIT refers to richly complex systems such as human brains or hypothetical artificial systems in order to illustrate high phi value. Aaronson points out that systems that strike us as much simpler and less interesting will sometimes yield a high phi value. The physical realization of an expander graph (his example) could have a higher phi value than a human brain. A graph has points that connect to one another, making the points vertices and the connections edges. This may be thought of as modelling communication between points. Expander graphs are “sparse” (each point has relatively few connections) yet highly connected overall, and this connectivity means that the points have strong communication with one another. In short, such graphs have the right properties for generating high phi values. Because it is absurd to accept that a physical model of an expander graph could have a higher degree of consciousness than a human being, the theory that leads to this conclusion, IIT, must be false.
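The combination of sparseness and strong connectivity that characterizes expander graphs can be made concrete with a small sketch. It measures edge expansion: the smallest ratio, over all groups containing at most half of the vertices, of edges leaving the group to vertices in the group. Two sparse ten-vertex graphs are compared, a simple ring and the Petersen graph. These graphs and the brute-force function `edge_expansion` are illustrative choices, not Aaronson's construction, and the sketch does not compute phi. Strictly speaking, "expander" describes families of ever larger graphs whose expansion stays bounded away from zero; a single small graph can only hint at the property.

```python
from fractions import Fraction
from itertools import combinations
from typing import List, Tuple

def edge_expansion(n: int, edges: List[Tuple[int, int]]) -> Fraction:
    """Brute-force edge expansion of an undirected graph on vertices 0..n-1."""
    best = None
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            inside = set(subset)
            # Count edges with exactly one endpoint inside the subset.
            boundary = sum(1 for u, v in edges if (u in inside) != (v in inside))
            ratio = Fraction(boundary, size)
            if best is None or ratio < best:
                best = ratio
    return best

# Ring: every vertex has 2 neighbours.
ring = [(i, (i + 1) % 10) for i in range(10)]

# Petersen graph: every vertex has 3 neighbours.
petersen = (
    [(i, (i + 1) % 5) for i in range(5)]              # outer pentagon
    + [(i, i + 5) for i in range(5)]                  # spokes
    + [(5 + i, 5 + (i + 2) % 5) for i in range(5)]    # inner pentagram
)

print(edge_expansion(10, ring))      # 2/5
print(edge_expansion(10, petersen))  # 1
```

Both graphs are sparse, but half of the ring can be cut off by removing just two edges, whereas no group of vertices in the Petersen graph can be separated so cheaply. Moreover, the ring's value shrinks as more vertices are added, while an expander family keeps its value bounded away from zero at any size.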

Tononi (2014) responds directly to this argument, conceding that Aaronson has drawn out the implications of IIT and phi fairly, even ceding further ground: a two-dimensional grid of logic gates, even simpler than an expander graph, would have a high phi value and would, according to IIT, have a high degree of consciousness. Tononi has already argued that a photodiode has minimal consciousness; to him, accepting where Aaronson’s reasoning leads is just another case of the theory producing surprising results. After all, science must be open to theoretical innovation.

Aaronson’s rejoinder challenges IIT by arguing that it implicitly holds inconsistent views on the role of intuition. In his response to Aaronson’s original claims, Tononi disparages intuitions regarding when a system is conscious: Aaronson should not be as confident as he is that expander graphs are not conscious. Indeed, the open-mindedness here suggested seems in line with the proper scientific attitude. Aaronson employs a thought-experiment to draw out what he takes to be the problem. Imagine that a scientist announces that he has discovered a superior definition of temperature and has constructed a new thermometer that reflects this advance. It so happens that the new thermometer reads ice as being warmer than boiling water. According to Aaronson, even if there is merit to the underlying scientific work, it is a mistake for the scientist to use the terms “temperature” or “heat” in this way, because it violates what we mean by those terms in the first place: “heat” means, partly, what ice has less of than boiling water. So, while IIT’s phi metric may have some merit, that merit does not lie in measuring degree of consciousness, because “consciousness” means, partly, what humans have and expander graphs and logic gates do not have.

One might, in defense of IIT, respond by claiming that the cases are not as similar as they seem, that the definition of heat necessitates that ice has less of it than boiling water and that the definition of consciousness does not compel us to draw conclusions about expander graphs’ non-consciousness, strange as that might seem. Aaronson’s argument goes further, however, and it is here that the charge of inconsistency comes into play. Tononi’s answer to Aaronson’s original reductio argument partly relies upon claiming that facts such as that the cerebellum is not conscious are totally well-established and uncontroversial. (IIT predicts this because the wiring of the cerebellum yields a low phi and is not part of the conscious MICS of the brain.) Here, argues Aaronson, Tononi is depending upon intuition, but it is possible that although the cerebellum might not produce our consciousness, it may have one of its own. Aaronson is not arguing for the consciousness of the cerebellum, but rather pointing out an apparent logical contradiction. Tononi rejects Aaronson’s claim that expander graphs are not conscious because it relies on intuition, but here Tononi himself is relying upon intuition. Nor can Tononi here appeal to common sense, because IIT’s acceptance of expander graphs and logic gates as conscious flies in the face of common sense.

It is possible that IIT might respond to this serious charge by arguing that almost everyone agrees that the brain is conscious, and that IIT has more success than any other theory in accounting for this while preserving many of our other intuitions (that animals, infants, certain patients with brain-damage, and sleeping adults all have dimmer consciousness than adult waking humans, to give several examples). Because this would accept a certain role for intuitions, it would require walking back the gloss on intuition that Tononi has offered in response to Aaronson’s reductio. Moreover, Aaronson’s arguments show that such a defense of the overall intuitive plausibility of IIT will face difficult challenges.

c. Searle’s Objection

In one of very few published discussions of IIT by a philosopher, John Searle (2013a) has come out against it, criticizing its emphasis on information as a departure from the more promising “biological approach.” His objections may be divided into two parts; Koch and Tononi (2013) have offered a response.

First, Searle claims that in identifying consciousness with a certain kind of information, IIT has abandoned causal explanation. Appeal to cause should be the proper route for scientific explanation, and Searle maintains, as he has throughout his career, that solving the mystery of consciousness will depend upon the explication of the causal powers special to the brain that give rise to experience. Information fails aptly to address the problem because it is observer-relative; indeed, it is relative to the conscious observer. A book or computer, to take typical examples of objects associated with information, does not contain information except insofar as it is endowed by the conscious subject. Information is in the eye of the beholder. The notion of information presupposes consciousness, rather than explaining it.

Second, according to Searle, IIT leads to an absurdity, namely panpsychism, which is sufficient reason to reject it. He interprets IIT as imputing consciousness to all systems with causal relations, and so it follows on IIT that consciousness is “spread over the universe like a thin veneer of jam.” A successful theory of consciousness will have to appreciate that consciousness “comes in units” and give a principled account of how and why this is the case.

Koch and Tononi’s response (2013) addresses both strands of Searle’s argument. First, they agree that Shannonian information is observer-relative, but point out that integrated information is non-Shannonian. IIT defines integrated information as necessarily existing with respect to itself, which they understand in expressly causal terms, as a system whose parts make a difference to that system. Integrated information systems therefore exist intrinsically, rather than relative to observers. Not only does IIT attend to the observer-relativity point, then, but it also does so in a way that, contrary to Searle’s characterization, crucially incorporates causality.

Second, they deny that IIT implies the kind of panpsychism that Searle rejects as absurd. As they point out, IIT only attributes consciousness to local maxima of integrated information (MICS), and although that implies that some simple systems such as the isolated photodiode have a minimal degree of consciousness, it provides a principle to determine which “units” are conscious, and which are not. As Tononi had already put it, before Searle’s charges: “How close is this position to panpsychism, which holds that everything in the universe has some kind of consciousness? Certainly, the IIT implies that many entities, as long as they include some functional mechanisms that can make choices between alternatives, have some degree of consciousness. Unlike traditional panpsychism, however, the IIT does not attribute consciousness indiscriminately to all things. For example, if there are no interactions, there is no consciousness whatsoever. For the IIT, a camera sensor as such is completely unconscious…” (Tononi, 2008).

Although Searle offers a rejoinder (2013b) to Tononi and Koch’s response, it largely rehearses the original claims. Regardless of whether IIT is true, Tononi and Koch have given good reason to read it as addressing precisely the concerns that Searle raises. Arguably, then, Searle might have reason to embrace IIT as a theory of consciousness that at least attempts a principled articulation of the special causal powers of the brain, which Searle has regarded for many years as the proper domain for explaining consciousness.

6. References and Further Reading

  • Barrett, Adam. “An Integration of Integrated Information Theory with Fundamental Physics.” Frontiers in Psychology, 5 (63). 2014.
    • Calls for a re-conception of phi with respect to electromagnetic fields.
  • Block, Ned. “Perceptual Consciousness Overflows Cognitive Access.” Trends in Cognitive Sciences, 15 (12). 2011.
    • Argues for the distinction between access and phenomenal consciousness.
  • Chalmers, David. The Conscious Mind. New York: Oxford University Press. 1996.
    • A major work, relevant here for its discussions of information, consciousness, and panpsychism.
  • Chalmers, David. “The Combination Problem for Panpsychism.” In L. Jaskolla and G. Brüntrup (Eds.) Panpsychism. Oxford University Press. 2016.
    • Updated take on a classical problem; Chalmers makes reference to IIT here.
  • Cohen, Michael, and Daniel Dennett. “Consciousness Cannot be Separated from Function.” Trends in Cognitive Sciences, 15 (8). 2011.
    • Argues for understanding phenomenal consciousness as access consciousness.
  • Cohen, Michael, and Daniel Dennett. “Response to Fahrenfort and Lamme: Defining Reportability, Accessibility and Sufficiency in Conscious Awareness.” Trends in Cognitive Sciences, 16 (3). 2012.
    • Further defends understanding phenomenal consciousness as access consciousness.
  • Dennett, Daniel. Consciousness Explained. Little, Brown and Co. 1991.
    • Classic, comparatively accessible teleofunctionalist account of consciousness.
  • Dennett, Daniel. Sweet Dreams: Philosophical Obstacles to a Science of Consciousness. London, England: The MIT Press. 2005.
    • A concise but wide-ranging, updated defense of functionalist explanation of consciousness.
  • Edelman, Gerald. The Remembered Present: A Biological Theory of Consciousness. New York: Basic Books. 1989.
    • Influential upon Tononi’s early thinking.
  • Edelman, Gerald, and Giulio Tononi. A Universe of Consciousness: How Matter Becomes Imagination. New York: Basic Books. 2000.
    • Puts forward many of the arguments that later constitute IIT.
  • Fahrenfort, Johannes, and Victor Lamme. “A True Science of Consciousness Explains Phenomenology: Comment on Cohen and Dennett.” Trends in Cognitive Sciences, 16 (3). 2012.
    • Argues for the access/phenomenal division supported by Block.
  • Horgan, John. “Can Integrated Information Theory Explain Consciousness?” Scientific American. 1 December 2015. http://blogs.scientificamerican.com/cross-check/can-integrated-information-theory-explain-consciousness/
    • Gives an overview of an invitation-only workshop on IIT at New York University that featured Tononi, Koch, Aaronson, and Chalmers, among others.
  • Koch, Christof. Consciousness: Confessions of a Romantic Reductionist. The MIT Press. 2012.
    • Intellectual autobiography, in part detailing the author’s attraction to IIT.
  • Koch, Christof, and Naotsugu Tsuchiya. “Phenomenology Without Conscious Access is a Form of Consciousness Without Top-down Attention.” Behavioral and Brain Sciences, 30 (5-6) 509-10. 2007.
    • Also argues for the access/phenomenal division supported by Block.
  • Koch, Christof, and Giulio Tononi. “Can a Photodiode Be Conscious?” New York Review of Books, 7 March 2013.
    • Responds to Searle’s critique.
  • Oizumi, Masafumi, Larissa Albantakis, and Giulio Tononi. “From the Phenomenology to the Mechanisms of Consciousness: Integrated Information Theory 3.0.” PLOS Computational Biology, 10 (5). 2014. Doi: 10.1371/journal.pcbi.1003588.
    • Technically-oriented introduction to IIT.
  • Searle, John. “Minds, brains and programs.” In Hofstadter, Douglas and Daniel Dennett, (Eds.). The Mind’s I: Fantasies and Reflections on Self and Soul (pp. 353-373). New York: Basic Books. 1981.
    • Searle’s classic paper on intentionality and information.
  • Searle, John. The Rediscovery of the Mind. Cambridge, MA: The MIT Press. 1992.
    • Fuller explication of Searle’s views on intentionality and information.
  • Searle, John. “Can Information Theory Explain Consciousness?” New York Review of Books 10 January 2013(a).
    • Objects to IIT.
  • Searle, John. “Reply to Koch and Tononi.” New York Review of Books. 7 March 2013(b).
    • Rejoinder to Koch and Tononi’s response to his objections to IIT.
  • Tegmark, Max. “Consciousness as a State of Matter.” Chaos, Solitons & Fractals. 2015.
    • Proposes a re-conception of phi, at the quantum level.
  • Tononi, Giulio. “An Information Integration Theory of Consciousness.” BMC Neuroscience, 5:42. 2004.
    • The earliest explicit introduction to IIT.
  • Tononi, Giulio. “Consciousness as Integrated Information: A Provisional Manifesto.” Biology Bulletin, 215: 216–242. 2008.
    • An early overview of IIT.
  • Tononi, Giulio. “Integrated Information Theory.” Scholarpedia, 10 (1). 2015. http://www.scholarpedia.org/w/index.php?title=Integrated_information_theory&action=cite&rev=147165.
    • A thorough synopsis of IIT.
  • Tononi, Giulio, and Gerald Edelman. “Consciousness and Complexity.” Science, 282 (5395). 1998.
    • Anticipates some of IIT’s claims.
  • Tononi, Giulio, and Christof Koch. “Consciousness: Here, There and Everywhere?” Philosophical Transactions of the Royal Society, Philosophical Transactions B, 370 (1668). 2015. Doi: 10.1098/rstb.2014.0167
    • Perhaps the most accessible current introduction to IIT, from the perspective of its founder and chief proponent.

 

Author Information

Francis Fallon
Email: Fallonf@stjohns.edu
St. John’s University
U. S. A.

Scientific Realism and Antirealism

Debates about scientific realism concern the extent to which we are entitled to hope or believe that science will tell us what the world is really like. Realists tend to be optimistic; antirealists do not. To a first approximation, scientific realism is the view that well-confirmed scientific theories are approximately true; the entities they postulate do exist; and we have good reason to believe their main tenets. Realists often add that, given the spectacular predictive, engineering, and theoretical successes of our best scientific theories, it would be miraculous were they not to be approximately correct. This natural line of thought has an honorable pedigree yet has been subject to philosophical dispute since modern science began.

In the 1970s, a particularly strong form of scientific realism was advocated by Putnam, Boyd, and others. When scientific realism is mentioned in the literature, usually some version of this is intended. It is often characterized in terms of these commitments:

  • Science aims to give a literally true account of the world.
  • To accept a theory is to believe it is (approximately) true.
  • There is a determinate mind-independent and language-independent world.
  • Theories are literally true (when they are) partly because their concepts “latch on to” or correspond to real properties (natural kinds, and the like) that causally underpin successful usage of the concepts.
  • The progress of science asymptotically converges on a true account.

 

Table of Contents

  1. Brief History before the 19th Century
  2. The 19th Century Debate
    1. Poincaré’s Conventionalism
    2. The Reality of Forces and Atoms
    3. The Aim of Science: Causal Explanation or Abstract Representation?
  3. Logical Positivism
    1. General Background
    2. The Logical Part of Logical Positivism
    3. The Positivism Part of Logical Positivism
  4. Quine’s Immanent Realism
  5. Scientific Realism
    1. Criticisms of the Observational-Theoretical Distinction
    2. Putnam’s Critique of Positivistic Theory of Meaning
    3. Putnam’s Positive Account of Meaning
    4. Putnam’s and Boyd’s Critique of Positivistic Philosophy of Science
    5. Inference to the Best Explanation
  6. Constructive Empiricism
    1. The Semantic View of Theories and Empirical Adequacy
    2. The Observable-Unobservable Distinction
    3. The Argument from Empirically Equivalent Theories
    4. Constructive Empiricism, IBE, and Explanation
  7. Historical Challenges to Scientific Realism
    1. Kuhn’s Challenge
    2. Laudan’s Challenge: The Pessimistic Induction
  8. Semantic Challenges to Scientific Realism
    1. Semantic Deflationism
    2. Pragmatist Truth Surrogates
    3. Putnam’s Internal Realism
  9. Law-Antirealism and Entity-Realism
  10. NOA: The Natural Ontological Attitude
  11. The 21st Century Debates
    1. Structuralism
    2. Stanford’s New Induction
    3. Selective Realism
  12. References and Further Reading

1. Brief History before the 19th Century

The debate begins with modern science. Bellarmine advocated an antirealist interpretation of Copernicus’s heliocentrism—as a useful instrument that saved the phenomena—whereas Galileo advocated a realist interpretation—the planets really do orbit the sun. More generally, 17th century protagonists of the new sciences advocated a metaphysical picture: nature is not what it appears to our senses—it is a world of objects (Descartes’ matter-extension, Boyle’s corpuscles, Huygens’ atoms, and so forth) whose primary properties (Cartesian extension, or the sizes, shapes, and hardness of atoms and corpuscles, and/or forces of attraction or repulsion, and so forth) are causally responsible for the phenomena we observe. The task of science is “to strip reality of the appearances covering it like a veil, in order to see the bare reality itself” (Duhem 1991).

This metaphysical picture quickly led to empiricist scruples, voiced by Berkeley and Hume. If all knowledge must be traced to the senses, how can we have reason to believe scientific theories, given that reality lies behind the appearances (hidden by a veil of perception)? Indeed, if all content must be traced to the senses, how can we even understand such theories? The new science seems to postulate “hidden” causal powers without a legitimate epistemological or semantic grounding. A central problem for empiricists becomes that of drawing a line between objectionable metaphysics and legitimate science (portions of which seem to be as removed from experience as metaphysics seems to be). Kant attempted to circumvent this problem and find a philosophical home for Newtonian physics. He rejected both a veil of perception and the possibility of our representing the noumenal reality lying behind it. The possibility of making judgments depends on our having structured what is given: experience of x qua object requires that x be represented in space and time, and judgments about x require that x be located in a framework of concepts. What is real and judgable is just what is empirically real—what fits our system of representation in the right way—and there is no need for, and no possibility of, problematic inferences to noumenal goings-on. In pursuing this project Kant committed himself to several claims about space and time—in particular that space must be Euclidean, which he regarded as both a priori (because a condition of the possibility of our experience of objects) and synthetic (because not derivable from analytical equivalences)—which became increasingly problematic as 19th century science and mathematics advanced.

2. The 19th Century Debate

Many features of the contemporary debates were fashioned in 19th century disputes about the nature of space and the reality of forces and atoms. The principals of these debates—Duhem, Helmholtz, Hertz, Kelvin, Mach, Maxwell, Planck, and Poincaré—were primarily philosopher-physicists. Their separation into realists and antirealists is complicated, but Helmholtz, Hertz, Kelvin, Maxwell, and Planck had realist sympathies and Duhem, Mach, and Poincaré had antirealist doubts.

a. Poincaré’s Conventionalism

By the late 19th century several consistent non-Euclidean geometries, mathematically distinct from Euclidean geometry, had been developed. In Euclidean geometry there is exactly one parallel to a given line through a given point and the angle sum of a triangle equals 180º, whereas in spherical geometry, for example, there are no parallels and the angle sum of a triangle is greater than 180º. These geometries raise the possibility that physical space could be non-Euclidean. Empiricists think we can determine whether physical space is Euclidean through experiments. For example, Gauss allegedly attempted to measure the angles of a triangle between three mountaintops to test whether physical space is Euclidean. Realists think physical space has some determinate geometrical character even if we cannot discover what character it has. Kantians think that physical space must be Euclidean because only Euclidean geometry is consistent with the form of our sensibility.
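
The contrast in angle sums can be made concrete with a short calculation. The sketch below assumes Girard's theorem for a sphere (the angle sum of a geodesic triangle exceeds 180º by the triangle's area divided by the square of the radius, in radians); the function name and the sample values are illustrative choices, not part of the historical discussion.

```python
import math

def spherical_angle_sum_deg(area: float, radius: float) -> float:
    """Angle sum, in degrees, of a geodesic triangle on a sphere.

    Uses Girard's theorem: excess (radians) = area / radius**2.
    """
    excess_rad = area / radius**2
    return 180.0 + math.degrees(excess_rad)

R = 1.0  # hypothetical radius of curvature, arbitrary units

# One octant of the sphere is a triangle with three right angles.
octant_area = 4 * math.pi * R**2 / 8
octant_sum = spherical_angle_sum_deg(octant_area, R)

# A triangle that is tiny relative to R**2 is almost Euclidean.
tiny_sum = spherical_angle_sum_deg(1e-9, R)

print(f"octant triangle: {octant_sum:.6f} degrees")
print(f"tiny triangle exceeds 180 by {tiny_sum - 180.0:.3e} degrees")
```

On these assumptions, a triangle that is small compared with the radius of curvature has an angle sum that is very hard to tell apart from 180º, which is one reason a mountaintop measurement alone settles little.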

Poincaré (1913) argued that empiricists, realists, and Kantians are wrong: the geometry of physical space is not empirically determinable, factual, or synthetic a priori. Suppose Gauss’s experiment gave the angle-sum of a triangle as 180º. This would support the hypothesis that physical space is Euclidean only under certain presuppositions about the coordination of optics with geometry: that the shortest path of an undisturbed light ray is a Euclidean straight line. Instead, for example, the 180º measurement could also be accommodated by presupposing that light rays traverse shortest paths in spherical space but are disturbed by a force, so that physical space is “really” non-Euclidean: the true angle-sum of the triangle is greater than 180º, but the disturbing force makes it “appear” that space is Euclidean and the angle-sum of the triangle is 180º.

Arguing that there is no fact of the matter about the geometry of physical space, Poincaré proposed conventionalism: we decide conventionally that geometry is Euclidean, forces are Newtonian, and light travels in Euclidean straight lines, and we then see whether experimental results fit those conventions. Conventionalism is not an “anything-goes” doctrine, since not all stipulations will accommodate the evidence; it is the claim that the physical meaning of measurements and evidence is determined by conventionally adopted frameworks. Measurements of lines and angles typically rely on the hypothesis that light travels along shortest paths. But this hypothesis lacks physical meaning unless we decide whether shortest paths are Euclidean or non-Euclidean. These conventions cannot be experimentally refuted or confirmed since experiments only have physical meaning relative to them. Which group of conventions we adopt depends on pragmatic factors: other things being equal, we choose conventions that make physics simpler, more tractable, more familiar, and so forth. Poincaré, for example, held that, because of its simplicity, we would never give up Euclidean geometry.

b. The Reality of Forces and Atoms

Ever since Newton, a certain realist ideal of science was influential: a theory that would explain all phenomena as the effects of moving atoms subject to forces. By the 1880s many physicists came to doubt the attainability of this ideal since classical mechanics lacked the tools to describe a host of terrestrial phenomena: “visualizable” atoms that are subject to position-dependent central forces (so successful for representing celestial phenomena) were ill-suited for representing electromagnetic phenomena, “dissipative” phenomena in heat engines and chemical reactions, and so forth. The concepts of atom and force became questionable. The kinetic theory of gases lent support to atomism, yet no consistent models could be found (for example, spectroscopic phenomena required atoms to vibrate while specific heat phenomena required them to be rigid). Moreover, intermolecular forces allowing for internal vibration and deformation could not be easily conceptualized as Newtonian central forces. Newtonian action-at-a-distance forces also came under pressure with the increasing acceptance of Maxwell’s theory of electromagnetism, which attributed electromagnetic phenomena to polarizations in a dielectric medium propagated by contiguous action. Many thought that physics had become a disorganized patchwork of poorly understood theories, lacking coherence, unity, empirical determinacy, and adequate foundations. As a result, physicists became increasingly preoccupied with foundational efforts to put their house in order. The most promising physics required general analytical principles (for example, conservation of energy and action, Hamilton’s principle) that could not be derived from Newtonian laws governing systems of classical atoms. The abstract concepts (action, energy, generalized potential, entropy, absolute temperature) needed to construct these principles could not be built from the ordinary intuitive concepts of classical mechanics. They could, however, be developed without recourse to “hidden mechanisms” and independently of specific hypotheses about the reality underlying the phenomena. Most physicists continued to be realists: they believed in a deeper reality underlying the phenomena that physics can meaningfully investigate; for them, the pressing foundational problem was to articulate the concepts and develop the laws that applied to that reality. But some physicists became antirealists. Some espoused local antirealism (antirealist about some kinds of entities, as Hertz (1956) was about forces, while not espousing antirealism about physics generally).

c. The Aim of Science: Causal Explanation or Abstract Representation?

Others espoused global antirealism. Like contemporary antirealists, they questioned the relationship among physics, common sense and metaphysics, the aims and methods of science, and the extent to which science, qua attempt to fathom the depth and extent of the universe, is bankrupt. While their realist colleagues hoped for a unified, explanatorily complete, fundamental theory as the proper aim of science, these global antirealists argued on historical grounds that physics had evolved into its current disorganized mess because it had been driven by the unattainable metaphysical goal of causal explanation. Instead, they proposed freeing physics from metaphysics, and they pursued phenomenological theories, like thermodynamics and energetics, which promised to provide abstract, mathematical organizations of the phenomena without inquiring into their causes. To justify this pursuit philosophically, they proposed a re-conceptualization of the aim and scope of physics that would bring order and clarity to science and be attainable. They variously took the aim of science to be economy of thought (science is a useful instrument without literal significance (Mach 1893)), the discovery of real relations between hidden entities underlying the phenomena (Poincaré 1913), or the discovery of a “natural classification” of the phenomena (a mathematical organization of the phenomena that is the reflection of a hidden ontological order (Duhem 1991)). These affinities between 19th century global antirealism and 20th century antirealism mask fundamental differences. The former is driven by methodological considerations concerning the proper way to do physics whereas the latter is driven by traditional metaphysical or epistemological concerns (about the meaningfulness and credibility of claims about goings-on behind the veil of appearances).

3. Logical Positivism

Logical positivism began in Vienna and Berlin in the 1910s and 1920s and migrated to America after 1933, when many of its proponents fled Nazism. The entire post-1960 conversation about scientific realism can be viewed as a response to logical positivism. Logical positivism was more a movement than a position: its proponents adopted a set of philosophical stances that were pro-science (including pro-verification and pro-observation) and anti-metaphysics (including anti-cause, anti-explanation, anti-theoretical entities). They are positivists because of their pro-science stance; they are logical positivists because they embraced and used the formal logic techniques developed by Frege, Russell, and Wittgenstein to clarify scientific and philosophical language.

a. General Background

As physics developed in the early 20th century, many of the 19th century methodological worries sorted themselves out: Perrin’s experiments with Brownian motion persuaded most of the reality of atoms; special relativity unified mechanics and electromagnetism and signaled the demise of traditional mechanism; general relativity further unified gravity with special relativity; quantum mechanics produced an account of the microscopic world that allowed atoms to vibrate and was spectacularly supported empirically. Moreover, scientific developments undermined several theses formerly taken as necessarily true. Einstein’s famous analysis of absolute simultaneity showed that Newtonian absolute space and time were incorrect and had to be replaced by the space-time structure of Special Relativity. His Theory of General Relativity introduced an even stranger notion of space-time: a space-time with a non-Euclidean structure of variable curvature. This undermined Kant’s claims that space has to be Euclidean and that there is synthetic a priori knowledge. Moreover, quantum mechanics, despite its empirical success, led to its own problems, since quantum particles have strange properties—they cannot have both determinate position and momentum at a given time, for example—and the quantum world has no unproblematic interpretation. So, though everyone was converted to atomism, no one understood what atoms were.

Logical positivism developed within this scientific context. Nowadays the positivists are often depicted as reactionaries who developed a crude, ahistorical philosophical viewpoint with pernicious consequences (Kuhn 1970, Kitcher 1993). In their day, however, they were revolutionaries, attempting to come to grips with the profound changes that Einstein’s relativity and Bohr’s quantum mechanics had wrought on the worldview of classical physics and to provide firm logical foundations for all science.

Logical positivism’s philosophical ancestry used to be traced to Hume’s empiricism (Putnam 1962, Quine 1969). On this interpretation, the positivist project provides epistemological foundations for problematic sentences of science that purport to describe unobservable realities, such as electrons, by reducing sentences employing these concepts to unproblematic sentences that describe only observable realities. Friedman (1999) offers a different Kantian interpretation: their project provides objective content for science, as Kant had attempted, by showing how it organizes our experience into a structured world of objects, but without commitment to scientifically outdated aspects of Kant’s apparatus, such as synthetic a priori truths or the necessity of Euclidean geometry. Whichever interpretation is correct, the logical positivists clearly began with traditional veil-of-perception worries (§1) and insisted on a distinction that both Hume and Kant advocated—between meaningful science and meaningless metaphysics.

b. The Logical Part of Logical Positivism

This distinction rests on their verificationist theory of meaning, according to which the meaning of a sentence is its verification conditions and understanding a sentence is knowing its verification conditions. For example, knowing the meaning of “This is blue” is being able to pick out the object referred to by “this” and to check that it is blue. While this works only for simple sentences built from terms that directly pick out their referents and predicates with directly verifiable content, it can be extended to other sentences. To understand “No emerald is blue” one need only know the verification conditions for “This is an emerald”, “This is blue” and the logical relations of such sentences to “No emerald is blue” (for example, that “no emerald is blue” implies “if this is an emerald, then this is not blue”, and so forth). Simple verification conditions plus some logical knowledge buys a lot. But it does not buy enough. For example, what are the verification conditions expressed by “This is an electron”, where “this” does not pick out an ostendible object and where “is an electron” does not have directly verifiable content?

To deal with this, the positivists, especially Carnap, hit upon an ingenious program. First, they distinguished two kinds of linguistic terms: observational terms (O-terms), like “is blue”, which have relatively unproblematic, directly verifiable content, and theoretical terms (T-terms), like “is an electron”, which have more problematic content that is not directly verifiable. Second, they proposed to indirectly interpret the T-terms, using logical techniques inherited from Frege and Russell, by deductively connecting them within a theory to the directly interpreted O-terms. If each T-term could be explicitly defined using only O-terms, just as “x is a bachelor” can be defined as “x is an unmarried male human”, then one would understand the verification conditions for a T-term just by understanding the directly verifiable content of the O-terms used to define it, and a theory’s theoretical content would be just its observational content.

Unfortunately, the content of “is an electron” is open-ended and outstrips observational content so that no explicit definition of it in terms of a finite list of O-terms can be given in first-order logic. From the 1930s to the 1950s, Carnap (1936, 1937, 1939, 1950, 1956) struggled with this problem by using ever more elaborate logical techniques. He eventually settled for a less ambitious account: the meaning of a T-term is given by the logical role it plays in a theory (Carnap 1939). Although T-terms cannot be explicitly defined in first-order logic, the totality of their logical connections within the theory to other T-terms and O-terms specifies their meaning. Intuitively, the meaning of a theoretical term like “electron” is specified by: “electron” means “the thing x that plays the Θ-role”, where Θ is the theory of electrons. (This idea can be rendered precisely in second-order logic by a “Ramseyified” definition: “electron” means “the thing x such that Θ(x)”, where “Θ(x)” is the result of taking the theory of electrons Θ (understood as the conjunction of a set of sentences) and replacing all occurrences of “is an electron” with the (second-order) variable “x” (Lewis 1970).)
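
The Ramseyified definition can be displayed schematically. The notation below is an illustrative choice rather than a quotation: a capital X marks the second-order variable that the text writes as “x”, and the iota operator stands for “the thing such that”.

```latex
\begin{align*}
\text{Theory of electrons:}\quad & \Theta(\text{electron}) \\
\text{Ramsey sentence:}\quad & \exists X\, \Theta(X) \\
\text{Ramseyified definition:}\quad & \text{electron} \;=_{\mathrm{df}}\; \iota X\, \Theta(X)
\end{align*}
```

Read this way, the theoretical term no longer appears on the right-hand side; only the theory's logical structure and its O-terms do.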

Two features of this theory of meaning lay groundwork for later discussion. First, the meaning of any T-term is theory-relative since it is determined by the term’s deductive connections within a theory. Second, the positivists distinguished analytic truths (sentences true in virtue of meaning) and synthetic truths (sentences true in virtue of fact). “All bachelors are unmarried” and “All electrons have the property of being the x such that Θ(x)” are analytic truths, whereas “Kant was a bachelor” and “Electrons exist” are synthetic truths. The positivists inherited this distinction from Kant, but, unlike Kant, they rejected synthetic a priori truths. For them, there are only analytic a priori truths (all pure mathematics, for example) and synthetic a posteriori truths (all statements to the effect that a given claim is verified).

c. The Positivism Part of Logical Positivism

The positivists distinguished legitimate positive science, whose aim is to organize and predict observable phenomena, from illegitimate metaphysics, whose aim is to causally explain those phenomena in terms of underlying unobservable processes. We should restrict scientific attention to the phenomena we can know and banish unintelligible speculation about what lies behind the veil of appearances. This distinction rests on the observational-theoretical distinction (§3b): scientific sentences (even theoretical ones like “Electrons exist”) have meaningful verifiable content; sentences of metaphysics (like “God exists”) have no verifiable content and are meaningless.

Because of their hostility to metaphysics, the positivists “diluted” various concepts that have a metaphysical ring. For example, they replaced explanations in terms of causal powers with explanations in terms of law-like regularities so that “causal” explanations become arguments. According to the deductive-nomological (DN) model of explanation, pioneered by Hempel (1965), “Event b occurred because event a occurred” is elliptical for an argument like: “a is an event of kind A, b is an event of kind B, and if any A-event occurs, a B-event will occur; a occurred; therefore b occurred”. The explanandum logically follows from the explanantia, one of which is a law-like regularity.

Because they advocated a non-literal interpretation of theories, the positivists are considered to be antirealists. Nevertheless, they do not deny the existence or reality of electrons: for them, to say that electrons exist or are real is merely to say that the concept electron stands in a definite logical relationship to observable conditions in a structured system of representations. What they deny is a certain metaphysical interpretation of such claims—that electrons exist underlying and causing but completely transcending our experience. It is not that physical objects are fictions; rather, all there is to being a real physical object is its empirical reality—its system of relations to verifiable experience.

4. Quine’s Immanent Realism

Quine, an early critic of logical positivism, acknowledged the positivists’ rejection of transcendental questions such as “Do electrons really exist (as opposed to being just useful fictions)?” Our evidence for molecules is similar to our evidence for everyday bodies, he argued; in each case we have a theory that posits an arrangement of objects that organizes our experience in a way that is simple, familiar, predictive, covering, and fecund. This is just what it is to have evidence for something. So, if we have such an organizing theory for molecules, then we can no more doubt the existence of molecules than we can doubt the existence of ordinary physical bodies (Quine 1955). Quine thus arrived at a realism not unlike the empirical realism of the logical positivists.

However, Quine rejected their theory of meaning and its central analytic-synthetic distinction, arguing that theoretical content cannot be analytically welded to observational content. The positivists, he argued, confuse the event of positing with the object posited. Yes, scientists conventionally introduce posits (an event) as Stoney introduced the term “electron” in 1894: “electron” means “the fundamental unit of electric charge that permanently attaches to atoms”. But no, scientists do not treat the conventions as analytic truths that cannot be revised without a change of meaning. Scientists did not treat Stoney’s definition as a binding analytic truth and “Electrons exist” as a synthetic hypothesis whose truth must be verified. More generally, Quine argued, once the explicit definitional route had failed and Carnap allowed the meaning of “electron” to be a function of the totality of its logical connections within a theory, Carnap had already adopted meaning holism, according to which one cannot separate the analytic sentences, whose truth-values are determined by the contribution of language, from the synthetic sentences, whose truth-values are determined by the contribution of fact.

Quine accepted meaning holism together with another thesis, epistemological holism, a doctrine often called “the Quine-Duhem Thesis”, because Duhem used it to argue against Poincaré’s conventionalism. The Quine-Duhem thesis says that only a group of hypotheses can be falsified because only a group of hypotheses has observational consequences. If a single hypothesis, H, implies an observational consequence O and we get evidence for not-O, then we can deduce not-H. But a single hypothesis will typically not imply any observational consequence. Take, for example, Gauss’s supposed mountaintop triangulation experiment to test whether space is Euclidean (§2a). Let H = “Space is Euclidean” and O = “The measured angle-sum of the triangle equals 180º”. Clearly H does not entail O without auxiliary assumptions: for example, A1 = “Light travels the shortest Euclidean paths”, A2 = “No physical force appreciably disturbs the light”, A3 = “The triangle is large enough for deviations from rectilinear paths to be experimentally detectable”, and so forth. Consequently, if the experiment yields not-O = “The measured angle-sum of the triangle is not equal to 180º”, we cannot deduce not-H = “Space is non-Euclidean”. We can only deduce not-(H and A1 and A2 and A3 and so forth); that is, we can only deduce that one or more of the hypothesis and the auxiliary assumptions is false. Perhaps space is Euclidean but some force is distorting the light paths to make it look non-Euclidean. Poincaré and the positivists reply that it is conventional or analytic that space is Euclidean; there is no fact of the matter. In rejecting conventionalism, Duhem and Quine claim that we may keep H and reject one of the auxiliary assumptions Ai to accommodate not-O: any statement may be held true in light of disconfirming experience. [It is misleading, however, to call epistemological holism “the Quine-Duhem thesis”. For Duhem, epistemological holism holds only for physical theories for rather special reasons; it does not extend to mathematics or logic and is not connected with theses about meaning. Quine extends epistemological holism from physics to all knowledge, including all knowledge traditionally regarded as a priori, including allegedly analytic statements.] Quine, but not Duhem, believed that our reluctance to revise mathematics and logic (because of their centrality to our belief-systems) does not entail their a prioricity (irrevisability based on evidence).
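
The logical point about H and the auxiliary assumptions can be checked by brute force. The sketch below is only an illustration under a simplifying assumption: the list of auxiliaries is cut off at A1, A2, A3, and the conjunction of H with them is taken to entail O. It enumerates every truth-value assignment that remains consistent once not-O is observed.

```python
from itertools import product

NAMES = ("H", "A1", "A2", "A3")

def surviving_assignments():
    """Assignments to H, A1, A2, A3 compatible with observing not-O,
    given that (H and A1 and A2 and A3) entails O."""
    survivors = []
    for values in product([True, False], repeat=len(NAMES)):
        if all(values):
            # The full conjunction would entail O, contradicting not-O.
            continue
        survivors.append(dict(zip(NAMES, values)))
    return survivors

survivors = surviving_assignments()
keep_h = [s for s in survivors if s["H"]]

print(len(survivors), "assignments survive;", len(keep_h), "of them keep H true")
# prints: 15 assignments survive; 7 of them keep H true
```

Only the assignment on which everything is true is ruled out, so the evidence by itself does not single out H for rejection.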

Moreover, if the analytic-synthetic distinction collapses, so too does the positivist separation of metaphysics from science. For Quine, metaphysical questions are just the most general and abstract questions we ask and are decided on the grounds we use to decide whether electrons exist. All questions are “internal” in the sense that they must be formulated in our home language and answered with our standard procedures for gathering and weighing evidence. In particular, questions about the reality of some putative objects are to be answered in terms of whether they contribute to a useful organization of experience and whether they withstand the test of experience.

5. Scientific Realism

In the 1970s, a particularly strong form of scientific realism (SR) was advocated by Putnam, Boyd, and others (Boyd 1973, 1983; Putnam 1962, 1975a, 1975b). When scientific realism is mentioned in the literature, usually some version of SR is intended. SR is often characterized in terms of two commitments (van Fraassen 1980):

SR1     Science aims to give a literally true account of the world.

SR2     To accept a theory is to believe it is (approximately) true.

However, scientific realists’ arguments and their interpretation of SR1 and SR2 often presuppose further commitments:

SR3     There is a determinate mind-independent and language-independent world.

SR4     Theories are literally true (when they are) partly because their concepts “latch on to” or correspond to real properties (natural kinds, and the like) that causally underpin successful usage of the concepts.

SR5     The progress of science asymptotically converges on a true account.

a. Criticisms of the Observational-Theoretical Distinction

Critics of positivism argued that there is no workable, well-motivated distinction between observational and theoretical vocabulary that would make the former unproblematic and the latter problematic (for example, Putnam 1962; Maxwell 1962; van Fraassen 1980). First, O-terms apply to apparently theoretical entities (for example, red corpuscle) and T-terms apply to apparently observable entities (for example, the moon is a satellite). Second, if T-terms were epistemologically or semantically problematic, that would have to be due to the unobservable nature of their referents. But in the continuous gradation between seeing with the unaided eye, with binoculars, with an optical microscope, with an electron microscope, and so on, there is no sharp cut-off between being observable and being unobservable where we could non-arbitrarily say: beyond this we cannot trust the evidence of our senses or apply terms with confidence. Third, the “able” in “observable” cannot be specified in a way that motivates a plausible distinction. Most “theoretical” entities can be detected (like electrons) with scientific instruments or theoretically calculated (like lunar gravity). The positivist may respond that they cannot be directly sensed, and are thus unobservable, but why should being directly sensed be the criterion for epistemological or semantic confidence? Fourth, observation is theory-infected: what we can both observe and employ as evidence is a function of the language, concepts, and theories we possess. A primitive Amazonian may observe a tennis ball (he notices it), but without the relevant concepts he cannot use it as evidence for any claims about tennis. Such arguments undermine a central distinction of the positivist program.

b. Putnam’s Critique of Positivistic Theory of Meaning

Putnam (1975a, 1975b) provides a general argument against all theories of meaning that are classical in the relevant sense (those of Frege, Russell, Carnap, and Kuhn, as well as positivist theories). Classical concepts have two characteristics: they determine their extensions in the world, and we can “grasp” them. To know the meaning of a directly interpretable O-term is to associate it with a concept (verification condition) which determines the term’s extension. In turn, to know the meaning of an indirectly interpretable T-term is to know its logical connections to directly interpretable terms. These two features of the classical view can be stated as:

(1)  To know the meaning of F is to be in a certain psychological state (of grasping F’s associated concept and knowing it is the meaning of “F”);

(2)  The meaning of F determines the extension of F in the sense that, if two terms have the same meaning, they must have the same extension.

If the meaning of “water” is the concept the clear, tasteless, potable, nourishing liquid found in lakes and rivers, then by (1) I must associate that concept with “water” if I’m to know its meaning and by (2) something will be water just in case it satisfies that concept.

Putnam’s famous Twin Earth argument (Putnam 1975b) is intended to show that all classical theories fail because (1) and (2) are not co-tenable. Suppose the year is 1740, when speakers did not know that water is H2O. Suppose too that another planet, Twin-Earth, is just like Earth except that a different liquid, whose chemical nature is XYZ, is the clear, tasteless, potable, nourishing liquid found in lakes and rivers. Suppose finally that Earthling Oscar and Twin-Earthling Twin-Oscar are duplicates and share the very same internal psychological states so that Oscar thinks “water is the clear, tasteless, potable, nourishing liquid found in lakes and rivers” if and only if Twin-Oscar thinks “water is the clear, tasteless, potable, nourishing liquid found in lakes and rivers”. In other words, they grasp the same meaning and associate it with the word “water”; (1) is satisfied. But then (2) cannot be satisfied: meaning does not determine extension because the extension of “water” (in English) = H2O yet XYZ = the extension of “water” (in Twin-English). If (1), then not-(2). Conversely, if meaning does determine extension, then since the extension of “water” (on Earth) differs from the extension of “water” (on Twin-Earth), Oscar and Twin-Oscar must associate different meanings with the term. If (2), then not-(1). Consequently, either (1) or (2) must go. Putnam keeps (2) and revises (1).
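
One way to regiment the argument, offered only as an illustrative reconstruction, is given below. The symbols are assumptions introduced here and are not Putnam’s notation: $\psi(S)$ is speaker $S$’s internal psychological state, $\mu_S$ is the meaning $S$ associates with “water”, and $\varepsilon_S$ is the extension of “water” in $S$’s language; $O$ and $O^{*}$ stand for Oscar and Twin-Oscar. Thesis (1) is read here as the claim that sameness of psychological state guarantees sameness of meaning.

```latex
\begin{align*}
&(1)\quad \psi(S) = \psi(S') \;\rightarrow\; \mu_{S} = \mu_{S'} \\
&(2)\quad \mu_{S} = \mu_{S'} \;\rightarrow\; \varepsilon_{S} = \varepsilon_{S'} \\
&(\mathrm{TE1})\quad \psi(O) = \psi(O^{*}) && \text{(Oscar and Twin-Oscar are duplicates)} \\
&(\mathrm{TE2})\quad \varepsilon_{O} = \mathrm{H_2O} \;\neq\; \mathrm{XYZ} = \varepsilon_{O^{*}} && \text{(the liquids differ)} \\
&(\mathrm{C})\quad \varepsilon_{O} = \varepsilon_{O^{*}} && \text{(from (1), (2), and (TE1))}
\end{align*}
```

Since (C) contradicts (TE2), anyone who grants the Twin-Earth scenario must give up (1) or (2). On this reconstruction, Putnam’s choice to keep (2) amounts to denying that $\mu_S$ is fixed by $\psi(S)$ alone.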

c. Putnam’s Positive Account of Meaning

How is extension determined, if not classically? Putnam develops a causal-historical account of reference for natural kind terms (“water”) and physical magnitude terms (“temperature”). Think of these terms being introduced into the language via an introducing event or baptism. The introducer points to an object (or phenomenon) and intones: “let ‘t’ apply to all and only objects that are relevantly similar (same kind, same magnitude) to this sample (or to whatever is the cause of this phenomenon)”. Later t-users learn conditions that normally pick out the referent of t, use these conditions to triangulate their usage with that of others and with extra-linguistic conditions, and intend their t-utterances to conform to the t-practices initiated in the introducing event. The term passes through the community so that reference is preserved. Then, on Putnam’s view, the extension of the term is part of the meaning of the term, the kind or magnitude that the term “locked on to” in the course of its introduction and historical development. So H2O is part of the English meaning of “water” and (2) is satisfied: meaning determines extension since extension is part of the meaning. This gives an intuitively plausible reading of the Twin-Earth scenario: Oscar is talking about water (H2O) and Twin-Oscar is talking about Twin-water (XYZ).

On classical accounts, a speaker S correctly uses a term “t” to refer to an object x only if x uniquely satisfies a concept, description, verification procedure, or theory that S associates with “t”. In the 1740s English-speakers lacked such uniquely identifying knowledge, though we would naturally say they were using “water” as we do — to refer to H2O. On Putnam’s account, S correctly uses t to refer to x only if S is a member of a linguistic community whose t-usage (via their linguistic and extra-linguistic interactions) is causally or historically tied to the things or stuff that are of the same kind as x. Realistic semantics ties correct usage to things in the world using causal relations. Because truth is defined in terms of reference (for example, “a is F” is true if and only if the referent of “a” has the property expressed by “F”), truth on Putnam’s account is also a causal notion.

We now see why SR is committed to SR3 and SR4 above. Clearly SR1 requires SR3: science can aim at a literally true account of the world only if the world is some determinate way that an account can be literally true of. But Putnam’s semantics requires more: that there be natural kinds and magnitudes that our terms lock onto, which is SR4. Note SR5 also seems to require SR3 and SR4. To many realists who accept SR3, SR4 seems extravagant and mysterious. Natural kinds seem to be an unnecessary traditional philosophical apparatus imposed on realism without the support of, and indeed undermined by, science. Our best science suggests that natural kinds do not exist: water, for example, is not a simple natural kind, H2O, but a more complicated structure of constantly changing polymeric variations, and biological species are anything but simple kinds. And even if there were natural kinds, it seems unreasonable to expect that language could neatly lock onto them: why should our accidental encounters with various samples in our limited part of the universe put us in a position to lock onto universal kinds? Continuity of reference of the kind advocated by Putnam may be too crude. More fine-grained accounts have been proposed (Kitcher 1993; Wilson 1982, 2006) which acknowledge the complicated evolution of science and language yet avoid metaphysical extravagance.

d. Putnam’s and Boyd’s Critique of Positivistic Philosophy of Science

A common argument for SR is the following:

  1. An acceptable philosophy of science should be able to explain standard scientific practice and its instrumental success.
  2. Only SR can explain standard scientific practice and its instrumental success.
  3. Thus SR is the only acceptable philosophy of science.

This is an instance of inference to the best explanation (§5e). Here we look at premise 2, which follows logically from:

2a. There are only two contending explanations: SR and Idealism.

2b. Idealism fails to explain the practice and its success, while SR succeeds.

Premise 2a: For Putnam the distinction between realism and idealism is fundamentally semantic. In realist (or externalist) semantics the world leads and content follows: content is determined causally and historically by the way the world is; the content of “water” is H2O. In idealist (or internalist) semantics content drives and the world follows: the world is whatever satisfies the descriptive content of our thoughts; the content of “water” is the clear, tasteless, potable, nourishing liquid found in lakes and rivers. Idealism is a blanket category covering any account of meaning (including positivist and Kuhnian and pragmatist accounts (§§7-8)) in the family of classical theories (§5b).

Premise 2b: Idealism fails to explain scientific practice and success in several ways: (i) For the positivist, “Electrons exist” means “Θi implies ‘electrons exist’ and Θi is observationally correct” and “‘electron’ refers to x” means “x is a member of the kind X such that Θi(X)” (§3b). Existence, reference, and truth are all theory-relative. Take “electron” in Thomson’s 1898 theory, in Bohr’s 1911 theory, and in full quantum theory (late 1920s). Since the meaning of “electron” changes from theory to theory and meaning determines reference, the referent of “electron” changes from theory to theory. So, Thomson, early Bohr, later Bohr, Heisenberg, and Schrödinger were (a) talking about a different entity and (b) changing the meaning of “electron”. Putnam argues that this is a bizarre re-description of what we would normally say: they were (a) talking about the same entity and (b) making new discoveries about it. By contrast, realist truth and reference are trans-theoretic: once “electron” was introduced into the language by Stoney, it causally “locked onto” the property being an electron; then the various theorists were talking about that entity and making new discoveries about it. So realism, unlike positivism, saves our ordinary ways of talking and acting.

(ii) The conjunction objection: in practice we conjoin theories we accept. Realist truth has the right kind of properties, such as closure under the logical operation of conjunction (if T1 is true and T2 is true, then (T1 and T2) is true), to underwrite this conjunction practice. But positivist surrogates for truth, reference, and acceptance cannot underwrite this practice. From “T1 is observationally correct” and “T2 is observationally correct”, it does not follow that (T1 and T2) is observationally correct—their theoretical parts could contradict each other, for example, so that their conjunction would imply all observational sentences, both true and false. Again realism, but not positivism, succeeds. Similarly, the practice of conjoining auxiliary hypotheses with a theory to extend and test the theory cannot be accounted for by positivism. In {Newton’s theory of gravitation + there is no transneptunian planet}, “gravitation” has one meaning; in {Newton’s theory of gravitation + there are transneptunian planets}, it has another meaning. But the discovery that the latter was true and the former false should not be described as a change of meaning or reference of the word “gravitation”. Again realism succeeds where positivism fails.
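
The logical point behind the conjunction objection can be illustrated with a small sketch. It is only a toy model under stated assumptions: theories are propositional formulas over one observational atom `o` and one theoretical atom `h`, the actual world makes `o` true, and “observationally correct” is read as “entails no false observational sentence”. The names `models`, `entails`, `observationally_correct`, `t1`, and `t2` are hypothetical and are not drawn from Putnam or Boyd.

```python
from itertools import product

ATOMS = ("o", "h")  # toy vocabulary (assumed): o is observational, h is theoretical
ACTUAL_O = True     # assumption: the observational sentence "o" is true in the actual world


def models(theory):
    """Return every truth assignment over ATOMS that satisfies the theory."""
    worlds = (dict(zip(ATOMS, vals)) for vals in product([True, False], repeat=len(ATOMS)))
    return [w for w in worlds if theory(w)]


def entails(theory, sentence):
    """A theory entails a sentence iff the sentence holds in all of the theory's models."""
    return all(sentence(w) for w in models(theory))


# The false observational sentences of the toy language, given ACTUAL_O (here: "not o").
FALSE_OBSERVATION_SENTENCES = [lambda w: w["o"] != ACTUAL_O]


def observationally_correct(theory):
    """Hypothetical surrogate for truth: the theory entails no false observational sentence."""
    return not any(entails(theory, s) for s in FALSE_OBSERVATION_SENTENCES)


def t1(w):
    return w["h"] and w["o"]        # posits h, predicts o


def t2(w):
    return (not w["h"]) and w["o"]  # denies h, predicts o


def t1_and_t2(w):
    return t1(w) and t2(w)          # inconsistent: no assignment satisfies both


print(observationally_correct(t1))         # each conjunct passes on its own
print(observationally_correct(t2))
print(observationally_correct(t1_and_t2))  # the conjunction has no models, entails "not o", and fails
```

Truth at a world, by contrast, is closed under conjunction by construction: any assignment that satisfies `t1` and satisfies `t2` satisfies `t1_and_t2`. That asymmetry is the one the objection trades on.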

(iii) The No-Miracles Argument (NMA): everyone agrees that science is instrumentally successful and increasingly so. Scientists believe that newly proposed theories stand a better chance of success if they resemble current successful theories or if they are tested by methods informed by such theories, and they construct scientific instruments, experiments, and applications relying on current theories. Moreover, scientists are getting better at doing this—consider improvements in microscopy over the past three centuries. Their actions are successful and rely on their beliefs that current theories can be depended upon to produce a likelihood of success. These successes are a miracle on positivist principles. Why should reliance on observationally correct theories be expected to produce success, unless we believe what they say about unobservables? In contrast, SR explains these successes: scientists’ actions rely upon their belief that the theories they use are approximately true; those actions have a high degree of success; the best explanation of their success is that the theories relied upon are approximately true.

e. Inference to the Best Explanation

Argument 1-3 (§5d) is an instance of inference to the best explanation (IBE), an inferential principle that realists endorse and antirealists reject. IBE is the rule that we should infer the truth of the theory (if there is one) that best explains the phenomena. Thus we should infer SR because it best explains scientific practice and its instrumental success.

First, a few clarifications of IBE are in order. If IBE is to be non-trivial, “best” must not simply mean “antecedently most likely”, since of course we should infer the truth of the most likely explanation. Rather the best explanation must be characterized in terms of properties like “loveliest” or “most explainey” (Lipton 2004). Traditional examples of such properties are: it has wide scope and precision; it appeals to plausible mechanisms; it is simple, smooth, elegant, and non-ad hoc; and it underwrites contrasts (why this rather than that). Then IBE says we should accept the theory that optimizes such explanatory virtues when explaining the phenomena. The caveat “if there is one” blocks inferences to the best of a bad lot: the best explanation may not reach a minimally acceptable threshold. Finally, as with any inferential principle that amplifies our knowledge, conclusions inferred by IBE are fallible: while they are more likely to be true, they could be false.

Second, the “justification” for IBE is two-fold. (1) It is needed for science. Simple enumerative induction (which entitles us to move probabilistically from “All observed As are Bs” to “All As are Bs”) cannot handle inferences from observed phenomena to their “hidden” causes. For example, we cannot inductively infer “Galaxy X is receding” from “Light from Galaxy X is red-shifted”, but we can infer by IBE that Galaxy X is receding because that is the best explanation of why its light is red-shifted. More strongly, Harman (1965) argues that IBE is needed to warrant straight enumerative induction: we are entitled to make the induction from “All observed As are Bs” to “All As are Bs” only if “All As are Bs” provides the best explanation of our total evidence. (2) Scientific uses of IBE are grounded in, and are just sophisticated applications of, a principle we use in everyday inferential practice. If I see nibbled cheese and little black deposits in my kitchen and hear scratching noises in the walls, I reasonably infer that I have mice, because that best explains my evidence. IBE thus needs no more justification than does modus ponens: each is part of the very practices that constitute what rational inference is.

Realists employ IBE at different levels. At the ground-level, they observe surprising regularities like the phenomenological gas laws relating pressure, temperature, and volume. These cannot be just cosmic coincidences. Realists argue that observed gas behavior is as it is because of underlying molecular behavior; we have reason to believe the molecular hypothesis (by IBE) because it best explains the observed gas behavior. At this level, antirealist rejections of IBE seem stretched: it seems unsatisfactory to say either that we do not need an explanation (since it appears to be a guiding aim of inquiry to explain regularities where possible) or that observed gas behavior is as it is because gases behave as if they are composed of molecules (since ordinary and scientific practice distinguishes genuine explanations from just-so stories).

Realists also employ IBE at a meta-level (§5d): we should be realists about our current theories because only realism can explain how our methodological reliance on them leads to the construction of empirically successful theories (Boyd) or only realism can explain the way in which scientific theories succeed each other and the methodological constraints scientists impose on themselves when constructing new theories (Putnam). Relativity theorists felt bound to have Newton’s theory derivable in the limit from Einstein’s theory. Why? The realist answer is: “because a partially correct account of a theoretical object (as the gravitational field) must be replaced by a better account of the same theory-independent object (as the metric structure of spacetime)”. Similarly, realists claim that scientific progress is best explained by SR5, the thesis that science is converging on a true account of the world. As Putnam says, realism is the only hypothesis that does not make the success of science a miracle. At the meta-level, the alleged phenomenon is that our best scientific traditions and theories are instrumentally and methodologically successful; SR is alleged to be the best (or only) explanation of that phenomenon; thus we should infer SR. As we will see (§§6d, 7, 11b), it is not clear that these uses of IBE are legitimate, because the alleged phenomenon itself is questionable, or the SR-“explanation” does not explain, or no explanation may be needed, or alternative antirealist explanations may be better.

6. Constructive Empiricism

Van Fraassen (1980) proposed constructive empiricism (CE), arguing that we can preserve the epistemological spirit of positivism without subscribing to its letter. Van Fraassen’s is an antirealism concerning unobservable entities. Recognizing the difficulties of basing antirealism on a “broken-backed” linguistic distinction between O-terms and T-terms, he allows our judgments about unobservables to be literally construed but, he argues, our evidence can never entitle us to our beliefs about unobservables. CE is consistent with SR3 and SR4 (though it does not commit to them, it has no quarrels with realist objectivity or semantics) but replaces SR1, SR2, and SR5 respectively with:

CE1     Science aims to provide empirically adequate theories of the phenomena.

CE2     To accept a theory is to believe it is empirically adequate, but acceptance has further non-epistemic/pragmatic features.

CE5     The progress of science produces increasing empirical adequacy.

A theory T is empirically adequate if and only if what T says about all actual observable things and events is true (that is, T saves all the phenomena, or T has a model that all actual phenomena fit in). Empirical adequacy is logically weaker than truth: T’s truth entails its empirical adequacy but not conversely. But it is still quite strong: an empirically adequate theory must correctly represent all the phenomena, both observed and unobserved. CE2 distinguishes epistemic and pragmatic aspects of acceptance. Epistemic acceptance is belief; beliefs are either true or false. Pragmatic acceptance involves non-epistemic commitments to use the theory in certain ways (basing research, experiments, and explanations on it, for example); commitments are neither true nor false; they are either vindicated or not. CE5 acknowledges that there is instrumental progress without trying to explain it. CE concedes a realist semantics (“electron”-talk is not highly derived talk about observables) but preserves the spirit of positivism by recommending agnosticism about a theory’s literal claims about unobservables.

a. The Semantic View of Theories and Empirical Adequacy

On the positivist view, a theory T is a syntactic object: T is the set of theorems in a language generated from a set of axioms (the laws of T) and derivation rules. The empirical content (the entire literal content) of T is T/O, the theorems expressible in the observational vocabulary. A theory T is empirically (observationally) adequate if T/O is the class of all true observational sentences. Two theories, T and T’, are empirically (observationally) equivalent if T/O = T’/O. Since such theory pairs have the same literal content and differ only in their non-literal, theoretical content, they are merely inter-definable variants of a common observational basis: they say the same thing but express it differently. There is no fact of the matter whether T or T’ is true (both are or neither is), and whether we work with T or T’ is purely a pragmatic matter concerning which is simpler, more convenient, and so forth. For SR and CE there is a fact of the matter: at most one of T, T’ can be true. For SR there may be reasons to believe one of T, T’. For CE there can be no epistemic reason to believe one over the other, though there may be pragmatic reasons to accept (commit to using) one over the other. Van Fraassen needs a different account of theories if he is to agree with realists about literal content and there being a fact of the matter about empirically equivalent theories.

For him, a theory T is a semantic object, the class of models, A = <D, R1, R2, …, Rn>, that satisfy its laws (where D is a set of objects and Ri are properties and relations defined on them). For example, D might contain billiard balls and molecules; the property is elastic in A might be instantiated by both billiard balls and molecules, is a molecule by some members of D, and is a billiard ball by others. Now let A’ = <D’, R’1, R’2, …, R’m> (where m < n, D’ is a proper subset of D, and R’i = Ri/D’ (Ri restricted to D’)). Intuitively A’ is obtained from A by removing all unobservables, so D’ would contain billiard balls but not molecules, is elastic would now be restricted to billiard balls, is a molecule would not be instantiated, and so forth. Then A’ is an empirical substructure of A, the result of restricting the original domain to observables and its properties and relations accordingly. T is empirically adequate if and only if T has an empirical substructure that all observables fit in. Two theories, T and T’, are empirically equivalent if all the observables in a model of T are isomorphic to the observables in a model of T’. Such theory pairs agree in what they say about observables but may disagree in what they say about unobservables. Thus CE can agree with SR that at most one of T, T’ can be true and to be a realist about that theory is to believe it is true (SR2). Yet CE can preserve the spirit of positivism by holding that we can never have reason to believe a theory; at most we have reason to believe it is empirically adequate. Friedman (1982) questions whether van Fraassen achieves this.
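
The restriction that yields an empirical substructure can be sketched in a few lines. This is a minimal illustration, not van Fraassen’s own formalism: the class `Model`, the function `empirical_substructure`, the object names, and the rival “fluid” model are all hypothetical. Two simplifications are assumed: relations left with no instances are dropped (one way of getting m < n), and the two substructures are compared by identity, which is a special case of the isomorphism the definition calls for.

```python
from dataclasses import dataclass


@dataclass
class Model:
    domain: frozenset   # D
    relations: dict     # relation name -> set of tuples over D


def empirical_substructure(model, observable):
    """Restrict the domain to observables and each relation to tuples of observables."""
    domain = frozenset(x for x in model.domain if x in observable)
    relations = {}
    for name, tuples in model.relations.items():
        restricted = {t for t in tuples if all(x in domain for x in t)}
        if restricted:  # modelling choice: drop relations left uninstantiated
            relations[name] = restricted
    return Model(domain, relations)


OBSERVABLE = {"ball1", "ball2"}  # assumed stock of observable objects

molecular = Model(
    frozenset({"ball1", "ball2", "mol1", "mol2"}),
    {
        "is_elastic": {("ball1",), ("ball2",), ("mol1",), ("mol2",)},
        "is_billiard_ball": {("ball1",), ("ball2",)},
        "is_molecule": {("mol1",), ("mol2",)},
    },
)

# A rival model with different unobservable posits (a hypothetical "fluid").
fluid = Model(
    frozenset({"ball1", "ball2", "fluid1"}),
    {
        "is_elastic": {("ball1",), ("ball2",)},
        "is_billiard_ball": {("ball1",), ("ball2",)},
        "is_fluid": {("fluid1",)},
    },
)

sub_molecular = empirical_substructure(molecular, OBSERVABLE)
sub_fluid = empirical_substructure(fluid, OBSERVABLE)

print(sorted(sub_molecular.domain))     # only the billiard balls remain
print(sorted(sub_molecular.relations))  # is_molecule is no longer instantiated
print(sub_molecular == sub_fluid)       # the two models share their observable part
print(molecular == fluid)               # yet they differ about unobservables
```

In this toy case the two models agree on everything observable while disagreeing about what lies behind it, which is the situation in which CE withholds belief and recommends believing only in empirical adequacy.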

b. The Observable-Unobservable Distinction

Since CE recommends agnosticism about unobservables but permits belief about observables, the policy requires an epistemologically principled distinction between the two. Though rejecting the positivists’ distinction between T-terms and O-terms, van Fraassen defends a distinction between observable and unobservable objects and properties, a distinction that grounds his policy of agnosticism concerning what science tells us about unobservables. There is a fact of the matter about what is observable-for-humans: given the nature of the world and of the human sensory apparatus, some objects/events/properties possess the property is observable-for-humans; others lack that property; the former are observables, the latter unobservables. For example, Jupiter’s moons are observable because a human could travel close enough to see them unaided, but electrons are unobservable because a human could never see one (that is just the nature of humans and electrons). Van Fraassen also claims that the limits of observation are disclosed by empirical science and not by philosophical analysis—what is observable is simply a fact disclosed by science. It should be noted that the distinction, as he draws it, has no a priori ontological implications: flying horses are observable but do not exist; electrons may exist but are unobservable.

Critics (Churchland 1985; Musgrave 1985; Fine 1986; Wilson 1985) complain that this distinction cannot ground a sensible epistemological policy. First, van Fraassen runs together different notions, none of which has special epistemological relevance. What is observable is variously taken as: what is detectable by human senses without instruments (Jupiter’s moons); what can be “directly” measured as opposed to “indirectly” calculated; what is detectable by humans-qua-natural-measuring-instruments (as thermometers measure temperature, humans “measure” observables). Critics ask why any of these should divide the safe from the risky epistemic bet. Why is it legitimate to infer the presence of mice from casual observation of their tell-tale signs but illegitimate to infer the presence of electrons from careful and meticulous observation of their tell-tale ionized cloud-chamber tracks?

Second, many critics find van Fraassen’s agnosticism about unobservables unwarrantedly selective. CE claims that we ought to believe what science tells us about all observables (both observed and unobserved) but not about unobservables. In each case there is a gap between our evidence (what has been observed) and what science arrives at (claims about all observables (CE) or claims about all observables and unobservables (SR)). Why is it legitimate to infer from what we have observed in our spatiotemporally limited surroundings to everything observable but not to what is unobservable (though detectable with reliable instruments or calculable with reliable theories)? Our experience is limited in many ways, including lacking direct access to: medium-sized events in spatiotemporally remote regions, events involving very small or very large dimensions, very small or very large mass-energy, and so forth. Why should inductions to claims about the first be legitimate but not to claims about the others?

Third, CE’s epistemic policy is pragmatically self-defeating or incoherent. Suppose a scientific theory T tells us “A is unobservable by humans”. In order to use T to set our epistemic policy we must accept T; that is, believe what T tells us about observables, but we should be agnostic about what T tells us about unobservables, including whether A is observable or unobservable. But if we should be agnostic about A’s observability, then we do not know whether or not we should believe in As. A consistent constructive empiricist will have trouble letting science determine what is unobservable and using that determination to guide her epistemic policy—often she will not know what not to believe.

Finally, if we interpret the language of science literally (as van Fraassen does), then we ought to accept that we see tables if and only if we see collections of molecules subject to various kinds of forces. But then if we are willing to assert there are tables we should be willing to assert that there are collections of molecules (Friedman 1982; Wilson 1985).

c. The Argument from Empirically Equivalent Theories

As realists rely on IBE, antirealists rely on the argument from empirically equivalent theories (EET):

  1. If T and T’ are empirically equivalent, then any evidence E confirms/infirms T to degree n if and only if E confirms/infirms T’ to degree n.
  2. If (E confirms/infirms T to degree n if and only if E confirms/infirms T’ to degree n), then we have no reason to believe T rather than T’ or vice versa.
  3. For any T, there exists a distinct empirically equivalent T’.
  4. Thus, for any theory T, we have no reason to believe it rather than its empirically equivalent rivals.

The argument appears to be valid, but each of its premises can be challenged (Boyd 1973; Laudan and Leplin 1991). Premise 1 is under-specified. Any abstract, sufficiently general theory (for example, Newton’s theory of gravitation) has no empirical consequences on its own. Trivially, two such theories are empirically equivalent since each has no empirical consequences; so any evidence equally confirms/infirms each. But no realist will worry about this. In order to give Premise 1 bite, the theories must have empirical consequences, which they will have only with the help of auxiliary hypotheses, A (§4). But then Premise 1 becomes:

1A. If (T and A) and (T’ and A) are empirically equivalent, then any evidence E confirms/infirms T to degree n if and only if E confirms/infirms T’ to degree n.

Whether 1A is plausible depends on what A is. If A is any hypothesis which has been accepted to date, then 1A is false because current empirical indistinguishability does not entail perpetual empirical indistinguishability, since evidence and auxiliary hypotheses change over time as we discover new instruments, methods, and knowledge. But if A is any hypothesis whatsoever, then there is no reason to think that the antecedent of Premise 1A is true, and thus 1A is again a trivial, vacuous truth. Moreover, the connection between empirical equivalence (agreement about observables in the sense of §6a) and evidential support is questionable (Laudan and Leplin 1991). Premise 1 presupposes that all and only what a theory says or implies about observables is evidentially relevant to that theory. But this is false: Brownian motion, though not an empirical consequence of atomic theory, supported it. Thus T and T’ could be empirically equivalent, yet one could have better evidential support than the other; for example, T, but not T’, might be derivable from a more comprehensive theory that entails evidentially well-supported hypotheses.

Some IBE-realists resist Premise 2: T and T’ may be equally confirmed by the evidence, yet one of them may possess superior explanatory virtues (§5e) that make it the best explanation of the evidence and thus, by IBE, more entitled to our assent—especially if the other is a less natural, ad hoc variant of the “nice” theory. The success of this response depends on whether explanatorily attractive theories are more likely to be true—why should nature care that we prefer simpler, more coherent, more unified theories?—and on whether a convincing case can be made for the claim that we are evolutionarily equipped with cognitive abilities that tend to select theories that are more likely to be true because their explanatory virtues appeal to us (Churchland 1985).

The very strong, very general conclusion of EET, however, depends on the very strong, very general Premise 3, which, critics argue, is typically supported either by “toy” examples of theory-pairs from the history of physics, by contrived examples of theories, one of which is transformed from the other by a general algorithm (Kukla 1998), or by some tricks of formal logic or mathematics. None is likely to convince any realist (Musgrave 1985; Stanford 2001).

d. Constructive Empiricism, IBE, and Explanation

For van Fraassen, a theory’s explanatory virtues (simplicity, unity, convenience of expression, power) are pragmatic—a function of its relationship to its users. This implies that explanatory power is not a rock bottom virtue like consistency (Newton could decline to explain gravity, but he could not decline to be consistent) and does not confer likelihood of truth or empirical adequacy (Newton’s theory explained lots of phenomena but is neither true nor empirically adequate). The fact that a theory satisfies our pragmatic desiderata has no implications for its being true or empirically adequate, contrary to what IBE-realists maintain.

IBE is a rule guiding rational choice among rival hypotheses. But there is always the option of declining to choose, of remaining agnostic. To undercut this general option, van Fraassen argues, the realist must commit to some claim like: every regularity and coincidence must be explained. Van Fraassen challenges this alleged requirement. First, the quest for explanation has to stop somewhere; even “realist” explanations must bottom out in brute fundamental laws; so, why cannot an antirealist bottom out in brute phenomenological laws? Second, scientists do not consider themselves bound by a principle that demands that every correlation be explained. In quantum mechanics, for example, spin states of entangled particles are perfectly correlated, yet every reasonable explanation-candidate has failed, and scientists no longer insist that they must be explained, contrary to what realists allegedly require (Fine 1986). However, these arguments may be directed at a straw man, since no realist is likely to require that every regularity be explained. Musgrave (1985), for example, suggests that these arguments confuse realism (the view that science aims to explain the phenomena where possible) with essentialism (the view that science aims to find theories that are fundamentally self-explanatory): it is not antirealist to claim that Newton explained a host of phenomena in terms of gravity but declined to explain gravity itself.

Van Fraassen also denies that only realism can explain the phenomena. There are rival explanations that are compatible with CE, and some of them are more plausible than realism. In §5e we distinguished ground-level and meta-level uses of IBE and suggested that antirealist resistance to IBE might be more promising against the latter than the former. Recall the realists’ reasoning: there is a surprising phenomenon (our current scientific theories, scientific methodology, and the history of modern science are surprisingly successful) which cries out for explanation; the only explanation is that the theories are approximately true; thus, by IBE, realism. But there is a more mundane explanation: many very smart people construct our scientific theories and methods, throwing out the unsuccessful ones (which we tend to ignore (Magnus and Callender 2004)) and refining and keeping only the successful ones. A variant of this success-by-design-and-trial-and-error is explanation of success in Darwinian terms: just as the mouse’s running away from its enemy the cat is better explained in Darwinian terms (only flight-successful mice survive and pass their genes along) than in representational terms (the mouse “sees” that the cat is its enemy and therefore runs), so too the instrumental success of science is better explained in Darwinian terms (only the successful theories survive) than in realist terms (they are successful because they are approximately true). These rival antirealist explanations of success are controversial, however (Musgrave 1985). The success-by-design explanation does not seem right, since scientists often construct theories that make completely unexpected, novel predictions.

7. Historical Challenges to Scientific Realism

A range of arguments attempt to show that scientific realism is often supported by an implausible history of science. (In what follows, T* and T are successor and predecessor theories in a sequence of theories; for example, think of the sequence <Aristotelian physics, Medieval physics, Cartesian physics, Newtonian physics, (Newtonian + Maxwellian physics), Special Theory of Relativity (STR), (General Theory of Relativity (GTR) + Quantum Mechanics (QM)), …> as ordered under the relation T* succeeds T.) Both realists and empiricists think of science as being cumulative and progressive. For empiricists, cumulativeness requires at least that T* have more true (and perhaps fewer false) observational consequences than T. Since the content of a theory on logical positivists’ views is exhausted by its observational consequences, if T* has more true observational consequences than T, then T* is “more true than” T. However, SR-realists require more. Because of SR5, they are committed to a historical thesis: that science asymptotically converges on the truth. Because of their externalist semantics, they are committed to theses about reference: theoretical terms genuinely refer, reference is trans-theoretic, and reference is preserved in T-T* transitions (so that “electron” in Bohr’s earlier and later theories refers to the same object and the later theory provides a more adequate conception of that object). Finally, their meta-level appeals to IBE commit them to SR5 on the ground that it best explains both the instrumental success of our best theories and the increasing instrumental success of sequences of theories (where T* is more successful than T because T* is closer to the truth than T).

a. Kuhn’s Challenge

According to Kuhn (1970), the standard view of science as steadily cumulative (presupposed by both positivism and realism) rests on a myth that is inculcated by science education and fostered by Whiggish historiography of science. When the myth is deconstructed, we see science as historically unfolding through stable cycles of cumulativeness, punctuated by periods of crisis and revolution.

During periods of normal science, practitioners subscribe to a paradigm. They have the same background beliefs about: the world, its fundamental ontology, processes, and laws (statements that are not to be given up); correct mathematical and linguistic expression; scientific values, goals, and methods; scientifically relevant questions and problems; and experimental and mathematical techniques. Within a given paradigm P—for example, Newtonian physics—there is a relatively stable background: a world of Newtonian particles moving in space and time subject to Newtonian forces (like gravity) and obeying Newton’s laws. There are exemplary methods and techniques—for example, to solve a problem of motion, bring it under the equation, F = ma, which manifests itself across the board and is treated as counterexample-free. And there are shared values—for example, unified mathematical representation of phenomena—and problems (for example, the solution of the arbitrary n-body problem for a system of gravitationally attracting bodies or the resolution of the anomaly in the orbit of Uranus) that require further articulation of the theory. In normal science, cumulativeness occurs: the theory becomes extended to answer its own questions and cover its phenomena. (Kuhn thinks that clean views of history come from focusing too much on normal science.) But sooner or later anomalies crop up that the paradigm cannot handle (for example, the failure to bring electromagnetism, black body radiation, and Mercury’s orbit under the Newtonian scheme). There is a crisis that only a revolutionary new paradigm (for example, STR, QM, and GTR) can handle. Once in place, the new paradigm P* provides a radically new way of looking at the world.

Kuhn (1970) was interpreted (wrongly, but with some justice given his sometimes incautious language) as arguing for an extremely radical constructivist/relativist position: P and P* are incommensurable in the sense that they are so radically distinct that they cannot be compared; the P and P* scientists work “in different worlds”, “see different things”, use different maps (theories and conceptual schemes) and also have different rules for map-making (methods), different languages, and different goals and values. As a result, during the transition, scientists have to learn a new way of seeing and understanding phenomena; Kuhn likens the experience to a “gestalt switch” or “religious conversion”. There is no commonality (in ontology, methodology, observational base, or goals/values) that P and P* scientists can use to rationally adjudicate their disagreements. There is no paradigm-independent reason for preferring P* over P, since such reasons would have to appeal to something common (common observations, methods, or norms), and on this reading nothing is common. Even more strongly, there is no paradigm-independent, objective fact of the matter concerning which of them is correct. If this were true, then all standard theses about progress would be undermined. There is no referential or meaning continuity across paradigms; no sense can attach to theses like “T* is more true than T”, “T is a limiting case of T*”, or “T* preserves all T’s true observational consequences”, since such theses presuppose T-T* commensurability.

Critics have pointed out that this view is too extreme (McMullin 1991). The history of science shows more continuity and fewer radical revolutions than this account attributes to it. Scientists make rational choices between “paradigms” (for example, most scientists who were skeptical of atoms came to reasonably believe in them as a result of Perrin’s experiments). Many scientists work within two traditions without experiencing gestalt shifts (for example, 19th century energetics and molecular theories). T and T* advocates often argue, criticize each other, and rationally persuade each other that one of the two is incorrect. How could this be, if the radical interpretation of Kuhn were correct?

Kuhn clearly did not intend the radical reading, and in later writings (1970 Postscript, 1977) he distinguishes his views from such radical, subjectivist, and relativist interpretations. Paradigm transitions and incommensurability, he argues, are never as total as the radical interpretation assumes: enough background (history, instrumentation, and every-day and scientific language) is shared by P- and P*-adherents to underwrite good reasons they can employ to mount persuasive arguments. Moreover, he lists several properties any theory should have—accuracy (of description of experimental data), consistency (internal and with accepted background theories), scope (T should apply beyond original intended applications), fecundity (T should suggest new research strategies, questions, problems), and simplicity (T should organize complex phenomena in a simple tractable structure). Application of these criteria accounts for progress and theory choice. However, these are “soft” values that guide choices rather than “hard” rules that determine choices. Unlike rules, (i) they are individually imprecise and incomplete, and (ii) they can collectively conflict (and there is no a priori method to break ties or resolve conflicts). Moreover, Kuhn argues, an individual’s choice is guided by a mixture of objective (accuracy, and so forth) and subjective (individual preferences like cautiousness and risk-taking, and so forth) factors, the latter influencing her interpretation and weighing of the criteria. A cautious scientist may be unwilling to risk a high probability of being wrong for a small probability of being informative in novel ways, and vice versa for the risk-taker. In this way Kuhn (1977) offers a middle ground between theory choices being completely subjective and being objective (qua being determined by rules applied to evidence). 
This “softer” view of science, he argues, enables new theories to get off the ground: progress can be made only if there are values to allow rational discussion and argument but not hard rules that would pre-determine an answer (because then everyone would conform to the rule and not risk proposing new alternatives).

Kuhn has shown that evidence and reasons are sometimes incapable of deciding between P and P*. But a realist may concede that hard choices occur: at most one of P or P* is correct, and we may have to wait and see which, if either, pans out. Temporary gridlock need not amount to permanent undecidability: the lack of decisive reasons at a time does not imply that there will be no decisive reasons forever; when more evidence is acquired and its relevance better understood, convincing reasons usually emerge. Realists should concede these points; many in the 21st century do. But no SR-realist can accept the thesis, never abandoned by Kuhn, that there is no fact of the matter whether P or P* is correct.

b. Laudan’s Challenge: The Pessimistic Induction

Although it is widely agreed that our best theories are instrumentally successful and many T-T* sequences show increasing success, Laudan (1981) disputes that success and progress are to be explained in realist SR5-terms of increasing approach to the truth. The history of science, Laudan argues, shows that referential success is neither necessary nor sufficient for empirical success: not necessary because the central terms of many successful theories did not refer (19th century ether, caloric, and phlogiston theories, for example); not sufficient because the central terms of many failing theories did, by our lights, refer (18th century chemical atomism, Prout’s hypothesis for most of the 19th century, Wegener’s theory of continental drift in the first half of the 20th century, and so forth).

Moreover, realist notions of approximate truth and convergence-to-the-truth are problematic. Despite best efforts, no satisfactory metric has emerged that would characterize distance from the truth or the truth-distance between T and T* (Laudan 1981; Miller 1974; Niiniluoto 1987). For some T-T* sequences in mathematical physics, there are limit theorems whereby T can be derived as a special case of T* under appropriate limiting conditions. For example, special relativity passes asymptotically into Newtonian mechanics as (v/c)^2 approaches 0. Such theorems suggest that Newtonian mechanics yields close to correct answers for applications close to the relativistic limits (not too fast). In this way realists can appeal to them to argue that T* extends and improves upon T. However, for many T-T* sequences there are no analogous limit theorems: Lavoisier’s oxygen theory is a progressive successor of Priestley’s phlogiston theory, yet there is no neat mathematical relationship indicating that phlogiston theory is a limiting case of oxygen theory. Moreover, even for cases where T* approaches T as some parameter approaches a limit, it is controversial what to conclude. If reference is determined by meaning (§5b), then “mass_Newton” and “mass_Einstein” refer to different things, and the fact that there is a derivation of classical mass-facts from relativistic mass-facts under certain conditions does nothing to show that T* provides a more global, more accurate description of mass-facts than T (since they’re talking about different things); the limit theorems show at most that some structure of abstract relations but not semantic content gets preserved in the T-T* transition (§11a). 
But if reference is determined by causal-historical relations (§5c), then the references of some key terms of T get lost in the transition to T*—“ether” was a key referring term of classical physics, but there is no ether in special relativity; so how can classical physics capture part of the same facts that special relativity captures when all its claims about the ether are either plainly false or truth valueless?
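As a sketch of the kind of limit theorem mentioned above, the relativistic expressions for momentum and kinetic energy can be expanded in powers of (v/c)^2. The symbols γ, m, v, c, p, and E_k follow standard textbook usage and are supplied here for illustration; they are not taken from the text.

```latex
\gamma = \left(1 - \frac{v^2}{c^2}\right)^{-1/2}
       = 1 + \frac{1}{2}\left(\frac{v}{c}\right)^{2}
           + \frac{3}{8}\left(\frac{v}{c}\right)^{4} + \cdots

p = \gamma m v
  = m v\left[1 + \frac{1}{2}\left(\frac{v}{c}\right)^{2} + \cdots\right]
  \longrightarrow m v
  \quad \text{as } (v/c)^2 \to 0

E_k = (\gamma - 1)\, m c^2
    = \frac{1}{2} m v^2\left[1 + \frac{3}{4}\left(\frac{v}{c}\right)^{2} + \cdots\right]
    \longrightarrow \frac{1}{2} m v^2
    \quad \text{as } (v/c)^2 \to 0
```

The leading terms are the Newtonian formulas, and the corrections are of order (v/c)^2, which is why the Newtonian values are nearly correct at low speeds. The expansion establishes only a relation between formulas; whether it also establishes continuity of reference is the further question raised in the text.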

These are serious challenges to SR. On one hand, it is hard to shake the idea that theories are successful because they are “onto something”. Yes, we build them to be successful, but their scope and novel predictions generally greatly outstrip our initial intentions. Realists tend to see the history of science as supporting an optimistic meta-induction: since past theories were successful because they were approximately true and their core terms referred, so too current successful theories must be approximately true and their central terms refer. On the other hand, skeptics see the history of science as supporting a pessimistic meta-induction: since some (many, most) past successful theories turned out to be false and their core terms not to refer, so too current successful theories may (are likely to) turn out to be false and their key terms not to refer. Realists must be careful not to interpret history of science blindly (ignoring the successes of ether theories and the failures of early atomic theories, for example) or Whiggishly (begging questions by wrongly attributing to our predecessors our referential intentions—by assuming, for example, that Newton’s “gravity” referred to properties of the space-time metric).

8. Semantic Challenges to Scientific Realism

Realist truth and reference are word-world/thought-world correspondences (SR4), an intuitively plausible view with a respectable pedigree going back to Aristotle. Moreover, some IBE-realists argue that real correspondences are needed to explain the successful working of language and science: we use representations of our environment to perform tasks; our success depends on the representations causally “tracking” environmental information; truth is a causal-explanatory notion. Several philosophical positions challenge this idea.

a. Semantic Deflationism

Tarski showed how to define the concept is true-in-L (where L is a placeholder for some particular language). Treating “is true” as predicated of sentences in a formal language, he provided a definition of the concept that builds it up recursively from a primitive reference relation that is specified by a list correlating linguistic items syntactically categorized with extra-linguistic items semantically categorized. Thus, for example, a clause like “‘electron’ refers to electrons” would be on this list if the language were English. Although Tarski’s definition is technically sophisticated, the main points for our purposes are these. First, it satisfies an adequacy condition (referred to as Convention T): for every sentence P (of L), when P is run through the procedure specified by the definition, “P” is true (in L) if and only if P. Thus, for example, “Electrons exist” is true-in-English if and only if electrons exist, and so forth. Second, truth and reference are disquotational devices: because of the T-equivalences, to assert that “snow is white” is true (in English) is just to assert that snow is white; similarly, to assert that “snow” refers (in English) to some stuff is just to assert that the stuff is snow.
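The recursive shape of such a definition can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions: the toy language has names, one-place predicates, and truth-functional connectives but no quantifiers, so it skips the satisfaction relation that Tarski’s actual definition requires. The names `REFERENCE` and `true_in_L`, and the sample vocabulary, are hypothetical.

```python
# Primitive reference relation, given by a list (here, two dictionaries).
# All vocabulary below is invented for illustration.
REFERENCE = {
    "names": {"a": "electron-1", "b": "rose-1"},           # name -> object
    "predicates": {"Electron": {"electron-1"},              # predicate -> extension
                   "Red": {"rose-1"}},
}

def true_in_L(sentence):
    """Recursive truth definition for the toy language L.

    Sentences are nested tuples:
      ("atom", predicate, name), ("not", s), ("and", s1, s2), ("or", s1, s2)
    """
    op = sentence[0]
    if op == "atom":
        _, predicate, name = sentence
        # Base clause: truth bottoms out in the listed reference relation.
        return REFERENCE["names"][name] in REFERENCE["predicates"][predicate]
    if op == "not":
        return not true_in_L(sentence[1])
    if op == "and":
        return true_in_L(sentence[1]) and true_in_L(sentence[2])
    if op == "or":
        return true_in_L(sentence[1]) or true_in_L(sentence[2])
    raise ValueError(f"not a sentence of L: {sentence!r}")

a_is_electron = ("atom", "Electron", "a")
b_is_electron = ("atom", "Electron", "b")
print(true_in_L(a_is_electron))
print(true_in_L(("and", a_is_electron, ("not", b_is_electron))))
```

In this sketch the T-equivalences come out trivially: the sentence saying that a is an electron is true-in-L exactly when the object listed for “a” is in the extension listed for “Electron”. That triviality is what deflationists stress.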

Semantic deflationists (Fine 1996; Horwich 1990; Leeds 1995, 2007) argue that Tarski’s theory provides a complete account of truth and reference: truth and reference are not causal explanatory notions; they are merely disquotational devices that are uninformative though expressively indispensable—useful predicates that enable us to express certain claims (like “Everything Putnam said is true”) that would be otherwise inexpressible. So long as a truth theory satisfies Convention T, these things will be expressible, and a trivial list-like definition of reference (“P” refers to x iff x is P) will suffice to generate the T-sentences. As native speakers, we know, without empirical investigation, that “electron” refers to electrons just by having mastered the word “refers” in our language. Our beliefs about electrons could be mistaken, but not our belief that “electron” applies to electrons. In particular, we cannot coherently suppose that “electron” does not refer to electrons because this is but a step away from a formal contradiction—some electrons are not electrons. Deflationists argue that such “thin” concepts and trivial relations cannot bear the explanatory burdens that scientific realists expect of them.

Deflationism is a controversial position. Field, before he endorsed deflationism, argued that Tarski merely reduced truth to a list-like definition of reference, but such a definition is physicalistically unacceptable (Field 1972). Chemical valence was originally defined by a list pairing chemical elements with their valence numbers, but later this definition was unified in terms of the number of outer shell electrons in the element’s atoms. Field argued that reference should be similarly reduced to physical notions. While this seems an implausibly strong requirement, many philosophers think it obvious that the success of action depends on the truth of the actors’ beliefs: John’s success in finding rabbits in the upper field, they argue, depends on his rabbit-beliefs corresponding to the local rabbits (Liston 2005). Deflationists respond that John’s success is explained by there being rabbits there (no need to mention ‘true’). But deflated explanations become strained when the agent is not an English thinker. If the agent is Jean rather than John, the sentences Jean holds true (‘Des lapins habitent le champ supérieur’) must first be translated into sentences we hold true and then disquoted (a strategy known as extended disquotationalism), and it is difficult to see why Jean’s success has anything to do with his sentences translating into ours.

Deflationists reject SR4 and SR5, but this does not mean they cannot believe what our best scientific theories tell us: deflationists can and typically do accept SR3 as well as all the object-level inferences that science uses, including object-level IBE (Leeds 1995, 2007). It means only that deflationists reject the meta-level IBE deployed by realists (§5e)—such inferences must be rejected if truth is not an explanatory notion.

b. Pragmatist Truth Surrogates

Pragmatists question metaphysical realism (SR3): it presupposes a relation between our representations (to which we have access) and a mind-independent world (to which we lack access), and there cannot be such a relation, because mind-independent objects are in principle beyond our cognitive reach. Thus SR3 (and correspondence truth) is either vacuous or unintelligible. For them, word-world relations are between words and objects-as-conceived by us. If we cannot reach out to mind-independent objects, we must bring them into our linguistic and conceptual range.

Pragmatists also tend to supplement Tarski’s understanding of truth, like philosophers in a broadly idealist tradition (including Hume, Kant, the positivists, and Kuhn) who employ truth-surrogates that structure the “world” side of the correspondence relation in some way (impressions, sense data, phenomena, a structured given) that would render the correspondence intelligible. Depending on the kind of idealism adopted, “p is true” might be rendered “p is warrantedly assertible”, “p is derivable from theory Θ”, or “p is accepted in paradigm P”, all of the form “p is E” where E is some epistemic surrogate for “true”. We have already seen (§5d) how realists object to this move: it assigns to the concepts truth and reference the wrong properties (it makes them intra-theoretic rather than trans-theoretic) and thus cannot properly capture key features of practice. More generally, Putnam argues, truth cannot be identified with any epistemic notion E: for any revisable proposition p that satisfies E, we already know that p might not be true; so being E does not amount to being true. For example, that Venus has CO2 in its atmosphere is currently warrantedly assertible, but future investigation could lead us to discover that it is not true. Thus, Putnam thinks, truth is epistemically transcendent: it cannot be captured by any epistemic surrogate (Putnam 1978).

c. Putnam’s Internal Realism

In his SR period, Putnam held that only real word-world correspondences could capture the epistemic transcendence and causal explanatory features of truth. In the late 1970s Putnam came to doubt SR3, reversed his position, and proposed a new program, internal realism (IR) (Putnam 1981). IR has negative and positive components.

The main negative component rejects metaphysical realism (SR3) and the associated thesis that truth and reference are word-world correspondences (SR4). The primary argument for this rejection is Putnam’s model-theoretic argument (Merrill 1980; Putnam 1978, 1981). Take our language and total theory of the world. Suppose the intended reference scheme (which correlates our word uses with objects in the world) is that which satisfies all the constraints our best theory imposes. This supposition is problematic because those constraints would fix at best the truth conditions of every sentence of our language; they would not determine a unique assignment of referents for our terms. Proof: Assume there are n individuals in the world W, and our theory T is consistent. Model theory tells us that since T is consistent it has a model M of cardinality n; that is, all the sentences of T will be true-in-M. Now define a 1-1 mapping f from the domain of M, D(M), to the domain of W, D(W), and use f to define a reference relation R* between L(T) (the language of our theory) and objects in D(W) as follows: if x is an object in D(W) and P is a predicate of L(T), then P refers* to x if and only if P refers-in-M to f^{-1}(x). Then any sentence S will be true* (of W) if and only if S is true-in-M. Intuitively, truth* and reference* are not truth and reference but gerrymandered relations that mimic truth-in-M and refers-in-M, where M can be entirely arbitrary, provided it has enough objects in its domain. Unfortunately, anything we do to specify the correct reference scheme for our language and incorporate it into our total theory is subject to this permutation argument. One might object, for example, that a necessary condition for (real) reference is that P refer to x only if x causes P and P is not causally related to the objects it refers* to (Lewis 1984). 
But if we add this condition to our theory, then we can redeploy a permutation whereby “x causes* P (in W)” will mimic “f^{-1}(x) causes P (in M)”; and instead of failing to fix the real reference relation we will be failing to fix the real causal relations. This formal result is the basis of Putnam’s argument that even our best theory must fail to single out its intended model (reference scheme). The permutation move is so global that no matter what trick X one uses to distinguish reference from reference*, the argument will be redeployed so that if X relates to cats in a way that it does not to cats*, then X* (a permutation of X) will relate to cats* in the same sort of way, and there will be no way of singling out whether we’re referring to X or X*.
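The permutation construction can be made concrete with a small finite sketch in Python. Everything in it is a hypothetical toy: a three-object model M, a three-object “world” W, three predicates, and three sample sentences. It illustrates the formal point only and does not reconstruct Putnam’s general proof. For every 1-1 mapping f from D(M) onto D(W), it defines reference* by pushing M’s extensions along f, and it checks that the sentences keep the truth values they have in M.

```python
from itertools import permutations

# An arbitrary model M of a toy theory: its objects are just numbers.
# Every extension is a set of tuples (1-tuples for one-place predicates).
M_DOMAIN = [0, 1, 2]
M_EXT = {"Cat": {(0,)}, "Mat": {(1,)}, "On": {(0, 1)}}

# The "world" W, with the same number of individuals as M.
W_DOMAIN = ["tom", "rug", "cherry"]

# Sample sentences, written as truth conditions over a domain D and an
# assignment E of extensions to predicates.
SENTENCES = {
    "some cat is on some mat": lambda D, E: any(
        (x,) in E["Cat"] and (y,) in E["Mat"] and (x, y) in E["On"]
        for x in D for y in D),
    "every cat is a mat": lambda D, E: all(
        (x,) not in E["Cat"] or (x,) in E["Mat"] for x in D),
    "something is neither cat nor mat": lambda D, E: any(
        (x,) not in E["Cat"] and (x,) not in E["Mat"] for x in D),
}

def reference_star(f):
    """P refers* to x iff P refers-in-M to f^{-1}(x): push M's extensions along f."""
    return {pred: {tuple(f[a] for a in item) for item in ext}
            for pred, ext in M_EXT.items()}

def truth_values(domain, ext):
    """Evaluate every sample sentence relative to a domain and extensions."""
    return {s: holds(domain, ext) for s, holds in SENTENCES.items()}

truth_in_M = truth_values(M_DOMAIN, M_EXT)

schemes = []
for image in permutations(W_DOMAIN):
    f = dict(zip(M_DOMAIN, image))   # a 1-1 mapping from D(M) onto D(W)
    ext_star = reference_star(f)
    # Truth* of each sentence in W coincides with its truth-in-M.
    assert truth_values(W_DOMAIN, ext_star) == truth_in_M
    schemes.append(ext_star)

# Count how many genuinely different reference* schemes were produced.
distinct = {tuple(sorted((p, frozenset(e)) for p, e in s.items()))
            for s in schemes}
print(len(distinct), "distinct reference* schemes, all with the same truth values")
```

In this toy case the sentences’ truth values leave the choice among the resulting schemes entirely open: “Cat” can be made to refer* to tom, to the rug, or to the cherry without changing the truth value of any sentence.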

The positive component of internal realism replaces SR3 and SR4 with IR3 and IR4:

IR3 We can understand a determinate world only as a world containing objects and properties as it would be described in the ideal theory we would reach at the limit of human inquiry;

IR4 Theories are true, when they are, partly because their concepts correspond to objects and properties that the ideal theory carves out;

and it reinterprets references to truth in SR1, SR2, and SR5 in terms of IR3 and IR4.

IR3 replaces allegedly problematic, inaccessible mind-independent objects with unproblematic, accessible objects that would be produced by the conceptual scheme we would reach in the ideal theory, and IR4 relates our words to the world as it would be carved up according to the ideal theory. When truth, reference, objects, and properties are thus relativized to the ideal theory, then IR1, IR2, and IR5 are just IR counterparts of their SR analogs: we aim to give accounts that would be endorsed in the ideal theory; to accept a theory is to believe it approximates the ideal theory; science (trivially) progresses toward the ideal theory. Putnam believes he can avoid unintelligible correspondences to an inaccessible, God’s-eye view of the world yet still have a concept of truth that is explanatory and epistemically transcendent. While truth-in-the-ideal-limit is an epistemic concept (it is relativized to what humans can know), it transcends any particular epistemic context; so we can have the best reasons to believe that Venus has CO2 in its atmosphere though it may be false (for it may turn out not to be assertible in the ideal theory).

Objects and properties, according to IR3, are as much made as discovered. To many realists, this seems to be an extravagant solution to a non-problem (Field 1982): extravagant to claim we have a hand in making stars or dinosaurs; a non-problem, because many realists think the content of metaphysical realism (SR3) is just that there is a mind-independent world in the sense that stars and dinosaurs exist independently of what humans say, do, or think. The problem is not how to extend our epistemic and semantic grasp to objects separated from us by a metaphysical chasm; it is the more ordinary, scientific problem of how to extend our grasp from nearby middle-sized objects with moderate energies to objects that are very large, very small, very distant from us spatiotemporally, and so forth (Kitcher 2001; Liston 1985). Moreover, realists point out, true-in-the-ideal-theory falls short of true. We know that either string theory is true and the material universe is composed of tiny strings or this is not the case. But it is conceivable that no amount of human inquiry, even taken to the ideal limit, will decide which; so though one disjunct is true, neither may be assertible in the ideal limit. Consequently, internalist truth lacks the properties of truth. (It is noteworthy that Putnam recanted internalist truth in his last writing on these matters (Putnam 2015).)

Rorty is another pragmatist who rejects, in a far more radical manner than Putnam, the fundamental presuppositions of the realist-antirealist debate (Rorty 1980).

9. Law-Antirealism and Entity-Realism

Cartwright (1983) and Hacking (1983) represent a mix of antirealism about theoretical laws and realism about theoretical entities. The kind of account that Cartwright rejects has three main components. First is the facticity view of fundamental physical laws: adequate fundamental laws must be (approximately) true. The basic equations of Newton, Maxwell, Einstein (STR/GTR), quantum mechanics, relativistic quantum mechanics, and so forth, are typical examples of such laws. Second is the covering law (or DN) model of explanation (Hempel 1965, §3c): a correct explanation of a phenomenon or phenomenological law is a sound deduction of the explanandum from fundamental laws together with statements describing, for example, compositional details of the system, boundary and initial conditions, and so forth. The deduction renders the explanandum intelligible by showing it to be a special case of the general laws. Thus, for example, Galileo’s law of free fall is explained as a special case of Newtonian fundamental laws by its derivation from Newton’s gravitational theory plus background conditions close to the earth’s surface. Third is IBE: the success of DN-explanations in rendering large classes of phenomena intelligible can justify our inferring the truth of the covering laws. The fact that Galileo’s law, Kepler’s laws, the ideal gas laws, tidal phenomena, the behavior of macroscopic solids, liquids, and gases all find a deductive home under Newton’s laws provides warrant for belief in the facticity of Newton’s laws.

Cartwright rejects all three components. She begins by challenging the first two components: there is a trade-off between facticity and explanatory power. Newton’s law of gravitation, F_G = G m_1 m_2 / r_{12}^2, tells us what the gravitational force between two massive bodies is. Coulomb’s law, F_C = k q_1 q_2 / r_{12}^2, tells us what the electrostatic force between two charged bodies is. Each law gives the total force only for bodies where no other forces are acting. But most actual bodies are charged and massive and have other forces acting on them; thus the laws either are not factive (if read literally) or do not cover (if read as subject to the ceteris paribus modifier “provided no other forces are acting”). In physics, we explain by combining the forces: the actual force acting on a charged massive body is F_A = F_G + F_C, the vector-sum of the Newton and Coulomb forces, which determines the actual acceleration and path. Cartwright objects that (a) we lack general laws of interaction allowing us to add causal influences in this way, (b) there is no reason to think that we can get super-laws that will be true and cover, (c) in nature there is only the actual cause and resultant trajectory. But if the facticity and explanatory components clash in this way, the third component is in trouble also. Realists cannot appeal to IBE to justify belief in factive fundamental covering laws because good explanations that cover a host of phenomena rarely proceed from true (factive) laws. Consequently, the explanatory success of fundamental laws cannot be cited as evidence for their truth.
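A small numerical sketch may help fix the textbook force-composition picture that Cartwright is criticizing. It assumes a one-dimensional setup in which forces act along the line joining two bodies, with negative values meaning attraction. The function names, the rounded constants, and the electron-proton example are illustrative choices, not drawn from Cartwright.

```python
# Physical constants in SI units (rounded values).
G = 6.674e-11   # gravitational constant, N m^2 / kg^2
K = 8.988e9     # Coulomb constant, N m^2 / C^2

def gravitational_force(m1, m2, r):
    """Radial force from Newton's law alone; negative means attractive."""
    return -G * m1 * m2 / r**2

def coulomb_force(q1, q2, r):
    """Radial force from Coulomb's law alone; positive means repulsive."""
    return K * q1 * q2 / r**2

def actual_force(m1, m2, q1, q2, r):
    """F_A = F_G + F_C along the line joining the two bodies."""
    return gravitational_force(m1, m2, r) + coulomb_force(q1, q2, r)

# An electron and a proton separated by roughly the Bohr radius.
M_E, M_P = 9.109e-31, 1.673e-27     # masses in kg
Q_E, Q_P = -1.602e-19, 1.602e-19    # charges in C
R = 5.29e-11                        # separation in m

f_g = gravitational_force(M_E, M_P, R)
f_c = coulomb_force(Q_E, Q_P, R)
f_a = actual_force(M_E, M_P, Q_E, Q_P, R)
ratio = f_c / f_g

print(f"F_G = {f_g:.3e} N, F_C = {f_c:.3e} N, F_A = {f_a:.3e} N")
print(f"F_C / F_G = {ratio:.3e}")
```

Neither law taken alone gives the force on these bodies, since both are charged and massive; only the sum does. Here the Coulomb term exceeds the gravitational term by a factor on the order of 10^39. Cartwright’s objection concerns the status of the summation step itself, which the code simply assumes.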

Cartwright’s own account has three corresponding components. First, fundamental laws are non-factive: they describe idealized objects in abstract mathematical models, not natural systems. In nature there are no purely Newtonian gravitational systems or purely electromagnetic systems. These are mathematical idealizations. Only messy phenomenological laws (describing empirical regularities and fairly directly supported by experiment) truly describe natural systems. Second, we should replace the DN model of explanation with a simulacrum account: explanations confer intelligibility by fitting staged mathematical descriptions of the phenomena to an idealized mathematical model provided by the theory by means of modeling techniques that are generally “rigged” and typically ignore (as negligible) disturbing forces or mathematically incorporate them (often inconsistently). To explain a phenomenon is to fit it in a theory so that we can derive fairly simple analogs of the messy phenomenological laws that are true of it. Intelligibility, not truth, is the goal of theoretical explanation. Third, although we should reject IBE, we should embrace inference to the most likely cause (ILC). Whereas theoretical explanations allow acceptable alternatives and need not be true, causal explanations prohibit acceptable alternatives and require the cause’s existence. ILC, on Cartwright’s view, can justify belief in unobservables that are experimentally detectable as the causes of phenomena. Thus, for example, Perrin’s experiments showed that the most likely cause of Brownian motion was molecular collisions with the Brownian particles; Rutherford’s experiments showed that the most likely cause of the backward scattering of α-particles fired at gold foil was collisions with the nuclei of the gold atoms.

The laws of physics lie, Cartwright claims, and the hope of a true, unified, explanatory theory of physics is either based on a misunderstanding of physics practice or a vestige of 17th century metaphysical hankering for a neatly designed mechanical universe. The practice of physicists, she argues, indicates that we ought to be antirealists about fundamental laws and points instead to a messy, untidy universe that physicists cope with by constructing unified abstract stories (Cartwright 1999). Thus Cartwright is antirealist about fundamental laws: contrary to realists, she holds that they are not (even approximately) true; contrary to van Fraassen, she is not recommending agnosticism, for on her view we now know they are non-factive. On the other hand, also contrary to van Fraassen, she holds that scientific practice indicates that we should be realists about “unobservable” entities that are the most likely causes of the phenomena we investigate.

Critics complain that Cartwright confuses metaphysics and epistemology: even if we lack general laws of interaction, it does not follow that there are none. Cartwright replies that the unifying ideal of such super-laws is merely a dogma. However, practice seems Janus-faced here: the history of modern physics is one of disunity leading to unity leading to disunity, and so forth. Each time distinct fundamental laws resist combination, a new unifying theory emerges that combines them: electrodynamics and eventually Einstein’s theories succeeded in combining Newton and Coulomb forces. The quest for unity is a powerful force guiding progress in physics, and, while the ideal of a unified “theory of everything” continues to elude us, Cartwright’s examples hardly show that it is a vain quest. Moreover, Cartwright arguably conflates different kinds of laws: in classical settings, the fundamental laws are Newton’s laws of motion, and his F = ma is the super-law that combines Newton’s gravitational and Coulomb’s electrostatic laws (Wilson 1998).

Cartwright’s distinction between “theoretical” and “causal” explanations has also been criticized. Nothing about successful theoretical explanations, she claims, requires their truth, whereas successful causal explanations require the existence of the cause. To many this move seems fallacious—if “successful” means correct, then the truth of the former follows as much as the existence of the latter; if “successful” does not mean correct, then neither follows. Presumably, in the IBE context, “successful” does not entail truth, but similarly in the ILC context, “successful” does not entail existence: the most likely cause could turn out not to exist (for example, caloric flow or phlogiston escape) just as the best explanation could turn out to be false (caloric or phlogiston theory).

10. NOA: The Natural Ontological Attitude

Fine (1996, 1986) presented NOA, an influential response to the debates amounting to a complete rejection of their presuppositions. We generally trust what our senses tell us and take our everyday beliefs as true. We should similarly trust what scientists tell us: they can check what is going on behind the appearances using instruments that extend our senses and methods that extend ordinary methods. This is NOA: we should accept the certified results of science on a par with homely truths. Both realists and antirealists accept this core position, but each adds an unnecessary and flawed philosophical interpretation to it.

Realists add to the core position the redundant word “REALLY”: “electrons REALLY exist”. SR realists add substantive word-world correspondences, a policy that serves no useful purpose. The only correct notion of correspondence is the disquotational one: “P” refers to (or is true of) x if and only if x is P. Realist appeals to IBE are problematic for two reasons. First, they beg the question against antirealists, who ab initio question any connection between explanatory success and approximate truth. Moreover, there is no inferential principle that realists could employ and antirealists would accept. Straight induction will not work: we can induce from the observed to the unobserved, because the unobserved can be later observed to check the induction; but we cannot induce to unobservables, because there can be no such independent check (according to the antirealist). Second, IBE does not work without some logical connection between success and (approximate) truth. But the inference from success to (approximate) truth is either invalid if read as a deductive move (because many successful theories turned out to be false (§7b)), weak if read as an inductive move (because nearly all successful past theories turned out to be false), or circular if read as a primitive IBE move. The antirealist, by contrast, has a ready answer: if a scientific theory or method worked well in the past, tinker with it, and try it again. Finally, Fine argues, contrary to what realists often claim, realism blocks rather than promotes scientific progress. In the Einstein-Bohr methodological debates about the completeness of quantum mechanics, the realist Einstein saw QM as a degenerate theory, while the instrumentalist Bohr saw QM as a progressive theory. Subsequent history favored Bohr over Einstein.

However, antirealism is no better off. Empiricists attempt to set limits: we should believe only what science tells us about observables. Fine criticizes these limits for reasons given in §5a and §6b: the observable-unobservable distinction cannot be drawn in a manner that would motivate skepticism or agnosticism about unobservables but not about observables. We have standard ways of cross-checking to ensure that what we are “seeing” with an instrument or calculating with a theory is reliable even if not “directly” observable. Fine concludes that the checks that science itself uses should be the ones we appeal to when in doubt. Pragmatists and constructivists react to the inaccessible, unintelligible word-world correspondences posited by realists by pulling back and trying to reformulate the correspondences in terms of some accessible surrogate for truth and reference (§8). Fine reiterates the criticisms of §5d and §8: truth has properties that any epistemic truth-surrogate lacks.

Both realists and antirealists view science as a practice in need of a philosophical interpretation. In fact, science is a self-interpreting practice that needs no philosophical interpretation. It has local aims and goals, which are reconfigured as science progresses. Asking about the (global) aim of science is like asking about the meaning of life: it has no answer and needs none. NOA takes science on its own terms, a practice whose history and methods are rooted in, and are extensions of, everyday thinking (Miller 1987). NOA accepts ordinary scientific practices but rejects apriorist philosophical ideas like the realist’s God’s-Eye view and antirealist’s truth-surrogates.

Critics see NOA as a flight from, rather than a response to, the scientific realism question (Musgrave 1989). The core position, they argue, is difficult to characterize in a philosophically neutral manner that does not invite a natural line of philosophical questioning. Once one accepts that science delivers truths and explanations, it is natural to ask what that means, and realist and antirealist replies will naturally ensue—as they always have, since these interpretations are as old as philosophy itself. Moreover, it may be difficult to characterize NOA non-tendentiously: ground-level IBE and correspondence truth, for example, are arguably rooted in common sense and ought to be included in NOA; but then any antirealism that rejects them is incompatible with NOA.

11. The 21st Century Debates

Between 1990 and 2016 new versions of the debates, many focusing on Laudan’s PI (§7b), have emerged.

a. Structuralism

Structural Realism claims that: science aims to provide a literally true account only of the structure of the world (StR1); to accept a theory is to believe it approximates such an account (StR2); the world has a determinate and mind-independent structure (StR3); theories are literally true only if they correctly represent that structure (StR4); and the progress of science asymptotically approaches a correct representation of the world’s structure (StR5). (Here we replace each SR thesis in §5 with an analogous StR thesis.)

Structuralism comes from philosophy of mathematics. Consider the abstract structure <ω, o, ξ>, where ω is an infinite sequence of objects, o an initial object, and ξ a relation that well-orders the sequence. This structure is distinct from its many exemplifications: for example, the natural numbers ordered under successor, <0, 1, 2, 3, …>; the even natural numbers in their natural order, <0, 2, 4, 6, …>; and so forth. We can similarly consider the offices of the U.S. President, Vice-President, Speaker of the House, and so forth as a collection of objects defined by the structure of relations given in the U.S. Constitution, distinct from its particular exemplars at a given time: Bush, Cheney, Pelosi (January 2007), and Obama, Biden, Boehner (January 2011). Similarly, structuralists suggest, the structure of relations that obtain between scientific objects is distinct from the nature of those objects themselves. The structure of relations is typically expressed (at least in physics) by mathematical equations of the theory (Frigg and Votsis 2011). For example, Hooke’s law, F = -ks, describes a structure, the set of all pairs of reals <x, y> in R² such that y = -kx, which is distinct from any of its concrete exemplifications like the direct proportionality between the restoring force F for a stretched spring and its elongation s. If the world is a structured collection of objects (StR3), then StR1 says that science aims to describe only the structure of the objects but not their intrinsic natures.
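The claim that one abstract structure has many exemplifications can be checked mechanically on finite initial segments. The following is only an illustrative sketch: the helper names successor_pairs and is_isomorphism are hypothetical, and the truncation to six elements is an assumption made so that the check terminates. It tests whether doubling carries the successor ordering of the natural numbers exactly onto the natural ordering of the even numbers.

```python
def successor_pairs(seq):
    """Encode the ordering of seq as the set of (element, next element) pairs."""
    return set(zip(seq, seq[1:]))


def is_isomorphism(mapping, rel_a, rel_b):
    """Return True if mapping is injective and carries rel_a exactly onto rel_b."""
    if len(set(mapping.values())) != len(mapping):
        return False  # two objects share an image, so the map is not one-to-one
    image = {(mapping[x], mapping[y]) for (x, y) in rel_a}
    return image == rel_b


N = 6  # assumed cut-off; the structure in the text is infinite
naturals = list(range(N))           # first exemplification: 0, 1, 2, ...
evens = [2 * n for n in range(N)]   # second exemplification: 0, 2, 4, ...

double = {n: 2 * n for n in naturals}
same_structure = is_isomorphism(
    double, successor_pairs(naturals), successor_pairs(evens)
)
print(same_structure)
```

The two sequences consist of different objects, yet the check succeeds because only the pattern of relations is compared. The identity map, by contrast, would fail the same check, since it does not carry one ordering onto the other.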

Structuralism is not new: precursors include Poincaré and Duhem in the 19th century (§2c), Russell (1927), Ramseyfied-theory versions of logical positivism (§3b), Quine (§4), and Maxwell (1970). Russell claimed that we can directly know (by acquaintance) only our percepts, but we can indirectly know (by structural description) the mind-independent objects that give rise to them. This approach presupposes a problematic distinction between acquaintance and description and a problematic isomorphism between the percept and causal-entity structures. Worse, it runs afoul of a devastating critique by the mathematician M.H.A. Newman (1928), closely related to Putnam’s model-theoretic argument (§8c), and never satisfactorily answered by Russell. Newman argues that a fixed structure of percepts can be mapped 1-1 onto a host of different causal-entity structures provided there are enough objects in the latter; thus the structural knowledge that science allegedly delivers is trivial—it merely amounts to a claim that the world has a certain cardinality, the size of the percept-structure. (The Ramseyfied-theory approach encounters similar problems (Psillos 2001).)
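Newman’s point can be illustrated with a small sketch. The names used here (percept_relation, impose_structure) and the toy domains are hypothetical and are introduced only for illustration; the code is not drawn from Newman or Russell. Given a fixed relation on percepts, any domain with at least as many objects can be given a relation with exactly the same structure simply by transporting the relation along a one-to-one map.

```python
percepts = ["p1", "p2", "p3"]
percept_relation = {("p1", "p2"), ("p2", "p3")}  # toy percept structure


def impose_structure(domain):
    """Pair percepts with objects of domain and copy the relation across.

    Raises ValueError if the domain has too few objects, which is the
    only way the construction can fail.
    """
    if len(set(domain)) < len(percepts):
        raise ValueError("domain has too few objects")
    distinct = list(dict.fromkeys(domain))       # drop repeats, keep order
    bijection = dict(zip(percepts, distinct))    # one-to-one onto a subset
    relation = {(bijection[a], bijection[b]) for (a, b) in percept_relation}
    return bijection, relation


# Two quite different hypothetical "worlds" both receive the structure.
_, atoms_relation = impose_structure(["a", "b", "c"])
_, fields_relation = impose_structure(["x", "y", "z", "w"])
print(atoms_relation, fields_relation)
```

Nothing about the objects in either domain matters to the construction except how many of them there are, which is why the purely structural claim reduces to a claim about cardinality.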

Contemporary proponents, beginning with Worrall (1989), hold that structuralism steers a middle path between standard versions of scientific realism and antirealism. StR, they argue, provides the best of both worlds by acknowledging and reconciling the pull of both pessimistic and optimistic inductions on the history of science. Pessimistic inductions (PI) argue against SR (§7b): the ontology of our current best theories (quarks, for example) will likely be discarded just like that of past best theories (for example, ether). Optimistic inductions (like the NMA) argue for SR (§5d): because past successful theories must have been approximately true, current more successful theories must be closer to the truth. Structuralists respond that, though ontologies come and go, our grip on the underlying structure of the world steadily improves. Underlying ontology need not be (and is not) preserved in theory change, but the mathematical structure is both preserved and improved upon: Fresnel’s correct claims about the structure of light (as a wave phenomenon) were retained in later theories, while his incorrect claims about the nature of light (as a mechanical vibration in a mechanical medium, the ether) were later discarded. Structuralists can also resist the argument from empirically equivalent theories (§6c)—to the extent that the theories are structurally equivalent they would capture the same structural facts, which is all a theory needs to capture—and do so without embracing a particular realist ontology occupying the nodes of the structure.

But can the needed distinction between structure and nature be drawn and can structures be rendered intelligible without the ontology that gives them flesh (Psillos 1995, 1999, 2001)? Two possible StR answers are suggested.

First, there is epistemological structural realism (EStR), endorsed by Poincaré, Worrall, and logical positivists in the Ramseyfied-theory tradition: electrons are objects as Obama is an object, but, unlike Obama, science can never discover anything about electrons’ natures other than their structural relations. For EStR to be a realist position, it will not suffice to say: we can know only observable objects (like Obama) and their (observable) structural relations; we must be agnostic about unobservable objects and their relations. This is merely a CE version of structuralism, as van Fraassen points out (2006, 2008), and inherits many problems of CE (§6). To be a realist position, EStR has to presuppose that, in addition to the structure of the phenomena whose objects are knowable, there is a mind-independent, knowable “underlying” structure, whose objects are unknowable. But now one must distinguish Obama from electrons so that Obama’s nature is knowable but electrons’ natures are not; the problematic observable-unobservable distinction (§§5a, 6b) has returned.

Critics argue that there is no sharp, epistemologically significant distinction between form (structure) and content (nature) of the kind needed for EStR. First, our knowledge of the nature of electrons is bound up with our knowledge of their structural relations so that we come to know them together: saying what an electron is includes saying how it is structured; our knowledge of its nature forms a continuum with our knowledge of its structure. Second, EStR requires a variant of the NMA (restricted to retention of structure) to uphold StR5. But this requires that, in progressive theory-change, structure (retained and improved) is what explains increased empirical success. But structure alone (without auxiliary hypotheses describing non-structural features of the world) never suffices to derive new empirical content. Finally, critics object to structuralists’ interpretations of the history. Worrall, for example, argues that Fresnel’s structural claims about light (the mathematics) were retained, but not his commitments to a mechanical ether; his critics question whether Fresnel could have been “just” right about the structure of light-propagation and completely wrong about the nature of light.

Second, there is ontological structural realism (OStR), advocated by Ladyman and others (Ladyman and Ross 2007) and similar to Quine’s realism (§4). OStR bites the bullet: we can know only structure because only structure exists. Obama is no more an object than electrons are; each is itself a structure; more strongly, everything is structure. Some of the attraction of this strange metaphysical position comes from its promise to handle problems in quantum mechanics that are orthogonal to our debates. Its proponents argue that it can account, for example, for apparently indistinguishable particles in entangled quantum states. In the context of our debates, OStR is supposed to avoid the epistemological problems of EStR: qua objects understood as structural nodes, electrons are in principle no more unknowable (or knowable) than Obama or ordinary physical objects. However, it runs into its own metaphysical problems, since it threatens to lose touch with concrete reality altogether. Even if God created nothing concrete, it would still be a structural (mathematical) fact that neutrons and protons, if they exist, form an isospin doublet related by SU(2) symmetry. For this to be a concrete (physical) fact, God would have had to create some objects (nucleons with symmetrically related isospin states or some more fundamental objects that compose nucleons) to occupy the neutron- and proton-nodes of the SU(2) group-structure. Even if those objects had only structural properties, they would have to have one non-structural property: existence (van Fraassen 2006, 2008). So, not everything is structure; there is a distinction between empty mathematical structures and realized physical structures; OStR cannot capture that distinction.

b. Stanford’s New Induction

Kyle Stanford’s new induction provides the latest historical challenge to SR (Stanford 2001, 2006, 2015). Following Duhem (1991) Stanford poses what he calls the Problem of Unconceived Alternatives (PUA): for any fundamental domain of inquiry at any given time t there are alternative scientific hypotheses not entertained at t but which are consistent with (and even equally confirmed by) all the actual evidence available at t. PUA, were it true, would seem to create a serious underdetermination problem for SR: we opt for our current best confirmed theory, but there is a distinct alternative that is equally supported by all the evidence we possess, but which we currently lack the imagination to think of. (Two things about PUA are worth noting. First, it concerns the actual evidence we have at a time; it is not that the theory and the alternatives are underdetermined by all possible evidence; the underdetermination may be transient; future evidence may decide that the theory we have selected is not correct. Second, the unconceived alternative hypotheses are ordinary scientific hypotheses, not recherché philosophical hypotheses involving brains-in-vats, and so forth.)

Stanford argues that PUA is our general predicament. His New Induction on the history of science, he argues, shows that our epistemic situation is one of recurrent, transient underdetermination. Virtually all T-T* transitions in the past were affected by PUA: the earlier T-theorists selected T as the best supported theory of the available alternatives; they did not conceive of T* as an alternative; T* was conceived only later yet T* is typically better supported than T. At any given time, we could only conceive a limited set of hypotheses that were confirmed by all the evidence then available, yet subsequent inquiry revealed distinct alternatives that turned out to be equally or better confirmed by that evidence. We thus have good inductive reasons to believe we are now in the same predicament—our current best theories will be replaced by incompatible and currently unconceived successors that account for all the currently available evidence.

Stanford proposes a new instrumentalism. Like van Fraassen’s (§6), his instrumentalism is epistemic: it distinguishes claims we ought literally to believe from claims we ought only to accept as instrumentally reliable and argues that instrumental acceptance suffices to account for scientific practice. Unlike van Fraassen, Stanford bases his distinction, not on an observable-unobservable dichotomy, but on whether our access to a domain is based primarily on eliminative inference subject to PUA challenges: if it is, then we should adopt an instrumentalist stance; if it is not (as, for example, our access to the common sense world is not), then we may literally believe.

c. Selective Realism

Many debates in the early 21st century focus on historical inductions, especially on what representative basis would warrant an inductive extrapolation. Putnam and Boyd were aware that care was needed with the NMA and sometimes restricted their claims to mature theories so that we discount ab initio some theories on Laudan’s troublesome list—like the theory of crystalline spheres or of humoral medicine. Mature theories (with the credentials to warrant optimistic induction) must have passed a “take-off” point: there must be background beliefs that indicate their application boundaries and guide their theoretical development; their successes must be supported by converging but independent lines of inquiry and so forth. Moreover, many realists argue, a theory is suitable for optimistic induction only if it has yielded novel predictions; otherwise it could just have been rigged to fit the phenomena. Roughly, a prediction P (whether known or unexpected) is novel with respect to a theory T if no P-information is needed for the construction of T and no other available theory predicts P. Thus, for example, Newton’s prediction of tidal phenomena was novel because those phenomena were not used in (and not needed for) Newton’s construction of his theory and no other theory predicted the tides (Leplin 1997; Psillos 1999). Nevertheless, even thus restricted, the induction will not meet Laudan’s challenge, for that challenge includes an undermining argument (Stanford 2003a): many discarded yet empirically successful theories were mature and yielded novel predictions—for example, Newton’s theory, caloric theory, and Fresnel’s theory of light—so, if our current theories are correct, these theories were false.

More recent responses to these counterexamples attempt to steer a middle course between optimistic inductions like Putnam’s NMA (§5d) and pessimistic inductions like Laudan’s and Stanford’s (§§7b, 11b). These responses typically have a two-part normal form: (1) they concede to the pessimists that some parts of past empirically successful theories are discarded, yet (2) they argue with the optimists that some parts of past successful theories are retained, improved upon, and explain the successes of the old theories. Advocates of this “divide and conquer” strategy (Psillos 1999) try to have their cake and eat it too.

Variants of the strategy depend on how one separates those “good” features of past theories that are preserved, that explain empirical success, and that warrant optimistic induction from those “bad” features that are discarded. Structuralists, we saw, argue that structure (form), but not nature (content), is what is both preserved and responsible for success. Kitcher (1993) distinguishes a theory’s working and presuppositional posits. The term “light-wave” in Fresnel’s usage referred to light, no matter what its constitution is, in some contexts and to what satisfies the description “the oscillations of ethereal molecules” in other contexts. In the former contexts, “light-wave” referred to high frequency electromagnetic waves, a mode of reference that was doing explanatory and inferential work and was retained in later theories. In the latter contexts, “light-wave” referred to the ether (that is, nothing), a mode of reference that was presupposed yet empty, idle, and not retained in later theories.

Other variants rely on the causal theory of reference. Hardin and Rosenberg (1982) exploit the idea that one can successfully refer to X (by being suitably causally linked to X) while having (largely) false beliefs about X. Thus, Fresnel and Maxwell were referring to the electromagnetic field when they used the term “ether”, and, though they had many false beliefs about it (that it was a mechanical medium, for example), the electromagnetic field was causally responsible for their theories’ success and was retained in later theories. A big problem with this response is that referential continuity does not suffice for partial or approximate truth (Laudan 1984; Psillos 1999). Psillos (1999) employs causal descriptivism to deal with this problem: “ether” in 19th century theories refers to the electromagnetic field, since that (and only that) object has the properties (medium of light-propagation that is the repository of energy and transmits it locally) that are causally responsible for the relations between measurements we get when we perform optical experiments. By contrast, “phlogiston” does not refer since nothing has the properties that the phlogiston theorists mistakenly believed to be responsible for the body of information they had about oxidation of metals, and so forth. During theory change, the causal-theoretical descriptions of some terms are retained and thereby their references also; these are the essential parts of the theory that contribute to its success; but this is consistent with less central parts being completely wrong.

The latest twist to these divide and conquer strategies is Chakravartty’s doctrine of semirealism (Chakravartty 1998, 2007). Taking his cue from Hacking-Cartwright (§9), Chakravartty distinguishes detection and auxiliary properties. The former are causal properties of objects (and the structure of real relations between them) that are well-confirmed by experimental manipulation because they underwrite the causal interactions we and our instruments exploit in experimental set-ups; the latter are merely theoretical and inferential aids. The former are retained in later theories; the latter are not. Past theories that were on the right track were so because they mathematically coded in systematic ways the detection properties (as opposed to the idle auxiliary properties).

Any of these strategies must meet two further challenges, emphasized in (Stanford 2003a, 2003b). First, they must answer the undermining challenge (above) in a way that is not ad hoc, question-begging, or transparently Whiggish. Simply arguing (with Hardin and Rosenberg) for preservation of reference via preservation of causal role is too easy: do Aristotle’s natural place, Newton’s gravitational action, and Einstein’s space-time curvature all play the same causal role in explaining free-fall phenomena? And if we tighten the account by claiming that continuity requires retention of core causal descriptions (Psillos) or detection property clusters (Chakravartty), are we engaged in a self-serving enterprise? Are we using our own best theories to determine the core causal properties/descriptions and then “reading” those back into the past discarded theories?

Second, they must respond to the trust argument. Divide and conquer strategies argue that successful past theories were right about some things but wrong about others. But then we should expect our own theories to be right about some things and wrong about others. Though perhaps an advance, this does not provide us with a good reason to trust any particular part of our own theories, especially any particular assessment we make (from our vantage point) of the features of a past discarded theory that were responsible for its empirical success. We judge that X-s in a past theory were working posits (Kitcher), essentially contributing causes of success (Psillos), detection properties (Chakravartty), while Y-s in that theory were merely presuppositional posits, idle, or auxiliary properties. But the past theorists were generally unable to make these discriminations, so why do we think we can now make them in a reliable manner? Stanford argues that realists can avoid this problem only if they can provide prospectively applicable criteria of selective confirmation (criteria that past theorists could have used to distinguish the good from the bad in advance of future developments and that we could now use), but they did not have such criteria, nor do we.

12. References and Further Reading

  • Boyd, R. (1973), “Realism, Underdetermination and the Causal Theory of Evidence”, Nous 7, 1-12.
  • Boyd, R. (1983), “On the Current Status of the Issue of Scientific Realism”, Erkenntnis, 19, 45–90.
  • Carnap, R. (1936), “Testability and Meaning”, Philosophy of Science 3, 419-471.
  • Carnap, R. (1937), “Testability and Meaning–Continued”, Philosophy of Science 4, 1-40.
  • Carnap, R. (1939), “Foundations of Logic and Mathematics”, International Encyclopedia of Unified Science 1(3), Chicago: The University of Chicago Press.
  • Carnap, R. (1950), “Empiricism, Semantics and Ontology”, Revue Internationale de Philosophie 4, 20-40.
  • Carnap, R. (1956), “The Methodological Character of Theoretical Concepts”, in H. Feigl and M. Scriven (eds), Minnesota Studies in the Philosophy of Science I, Minneapolis: University of Minnesota Press.
  • Cartwright, N. (1983), How the Laws of Physics Lie. Oxford: Clarendon Press.
  • Cartwright, N. (1999), The Dappled World. Cambridge: Cambridge University Press.
  • Chakravartty, A. (1998), “Semirealism”, Studies in the History and Philosophy of Science 29 (3), 391-408.
  • Chakravartty, A. (2007), A Metaphysics for Scientific Realism: Knowing the Unobservable. Cambridge: Cambridge University Press.
  • Churchland, P. (1985), “The Ontological Status of Observables: In Praise of the Superempirical Virtues”, in Churchland and Hooker 1985.
  • Churchland, P. and C. Hooker (eds) (1985), Images of Science: Essays on Realism and Empiricism, (with a reply from Bas van Fraassen). Chicago: University of Chicago Press.
  • Duhem, P. (1991/1954/1906), The Aim and Structure of Physical Theory, trans. P. Wiener, intro. Jules Vuillemin, Princeton: Princeton University Press.
  • Field, H. (1972), “Tarski’s Theory of Truth”, Journal of Philosophy 69 (13), 347-375.
  • Field, H. (1982), “Realism and Relativism”, Journal of Philosophy 79 (10), 553-567.
  • Fine, A. (1996/1986), The Shaky Game. Chicago: University of Chicago Press.
  • Friedman, M. (1982), “Review of The Scientific Image”, Journal of Philosophy 79 (5), 274-283.
  • Friedman, M. (1999), Reconsidering Logical Positivism. Cambridge: Cambridge University Press.
  • Frigg, R. and I. Votsis. (2011), “Everything You Always Wanted to Know about Structuralism but Were Afraid to Ask”, European Journal for the Philosophy of Science 1, 227-276.
  • Hacking, I. (1983), Representing and Intervening. Cambridge: Cambridge University Press.
  • Hardin, C. and A. Rosenberg. (1982), “In Defense of Convergent Realism”, Philosophy of Science 49, 604-615.
  • Harman, G. (1965), “The Inference to the Best Explanation”, The Philosophical Review 74, 88–95.
  • Hempel, C. G. (1965), Aspects of Scientific Explanation. New York: Free Press.
  • Hertz, H. (1956), The Principles of Mechanics. New York: Dover.
  • Horwich, P. (1990), Truth. Oxford: Blackwell.
  • Kitcher, P. (1993), The Advancement of Science. Oxford: Oxford University Press.
  • Kitcher, P. (2001), “Real Realism: The Galilean Strategy”, The Philosophical Review 110 (2), 151-197.
  • Kuhn, T.S. (1970/1962), The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
  • Kuhn, T.S. (1977/1974), The Essential Tension. Chicago: University of Chicago Press.
  • Kukla, A. (1998), Studies in Scientific Realism. Oxford: Oxford University Press.
  • Ladyman, J. and D. Ross. (2007), Every Thing Must Go: Metaphysics Naturalized. Oxford: Oxford University Press.
  • Laudan, L. (1981), “A Confutation of Convergent Realism”, Philosophy of Science, 48, 19–48.
  • Laudan, L. (1984), “Realism without the Real”, Philosophy of Science, 51, 156-162.
  • Laudan, L. and J. Leplin. (1991), “Empirical Equivalence and Underdetermination”, Journal of Philosophy 88 (9), 449-472.
  • Leeds, S. (1995), “Truth, Correspondence, and Success”, Philosophical Studies 79 (1), 1-36.
  • Leeds, S. (2007), “Correspondence Truth and Scientific Realism”, Synthese 159, 1–21.
  • Leplin, J. (1997), A Novel Defence of Scientific Realism. Oxford: Oxford University Press.
  • Lewis, D. (1970), “How to Define Theoretical Terms”, Journal of Philosophy 67, 427-446.
  • Lewis, D. (1984), “Putnam’s Paradox”, Australasian Journal of Philosophy 62, 221-236.
  • Lipton, P. (2004/1991), Inference to the Best Explanation. London: Routledge.
  • Liston, M. (1985), “Is a God’s-Eye-View an Ideal Theory?”, Pacific Philosophical Quarterly 66.3-4, 355-376.
  • Liston, M. (2005), “Does ‘Rabbit’ refer to Rabbits?”, European Journal of Analytic Philosophy 1, 39-56.
  • Mach, E. (1893), The Science of Mechanics, trans. T. J. McCormack, 6th edition, La Salle: Open Court.
  • Magnus, P.D. and C. Callender. (2004), “Realist Ennui and the Base Rate Fallacy”, Philosophy of Science 71, 320–338.
  • Maxwell, G. (1962), “On the Ontological Status of Theoretical Entities”, in H. Feigl and G. Maxwell (eds.), Minnesota Studies in the Philosophy of Science III, Minneapolis: University of Minnesota Press.
  • Maxwell, G. (1970), “Structural Realism and the Meaning of Theoretical Terms”, in S. Winokur and M. Radner (eds.), Minnesota Studies in the Philosophy of Science IV, Minneapolis: University of Minnesota Press.
  • McMullin, E. (1991), “Rationality and Theory Change in Science”, in P. Horwich (ed.), Thomas Kuhn and the Nature of Science. Cambridge: MIT Press.
  • Merrill, G. H. (1980), “The Model-Theoretic Argument Against Realism”, Philosophy of Science 47, 69-81.
  • Miller, D. (1974), “Popper’s Qualitative Theory of Verisimilitude”, British Journal for the Philosophy of Science 25, 166–177.
  • Miller, R. (1987), Fact and Method. Princeton: Princeton University Press.
  • Musgrave, A. (1985), “Realism vs Constructive Empiricism”, in Churchland and Hooker 1985.
  • Musgrave, A. (1989), “Noa’s Ark–Fine for Realism”, Philosophical Quarterly 39, 383–398.
  • Newman, M. H. A. (1928), “Mr. Russell’s ‘Causal Theory of Perception’”, Mind 37, 137-148.
  • Niiniluoto, I. (1987), Truthlikeness. Dordrecht: Reidel.
  • Poincaré, H. (1913), The Foundations of Science. New York: The Science Press.
  • Psillos, S. (1995), “Is Structural Realism the Best of Both Worlds?”, Dialectica 49, 15-46.
  • Psillos, S. (1999), Scientific Realism: How Science Tracks Truth. London: Routledge.
  • Psillos, S. (2001), “Is Structural Realism Possible?”, Philosophy of Science 68, S13–S24.
  • Putnam, H. (1962), “What Theories Are Not”, in Putnam 1975c.
  • Putnam, H. (1975a), “Explanation and Reference”, in Putnam 1975d.
  • Putnam, H. (1975b), “The Meaning of ‘Meaning’”, in Putnam 1975d.
  • Putnam, H. (1975c), Philosophical Papers 1: Mathematics, Matter and Method. Cambridge: Cambridge University Press.
  • Putnam, H. (1975d), Philosophical Papers 2: Mind, Language and Reality. Cambridge: Cambridge University Press.
  • Putnam, H. (1978), Meaning and the Moral Sciences. London: Routledge.
  • Putnam, H. (1981), Reason, Truth and History. Cambridge: Cambridge University Press.
  • Putnam, H. (2015), “Naturalism, Realism, and Normativity”, Journal of the American Philosophical Association 1(2), 312-328.
  • Quine, W.V. (1955), “Posits and Reality”, in W. V. Quine, The Ways of Paradox and Other Essays. Cambridge: Harvard University Press (1976), 246-254.
  • Quine, W.V. (1969), “Epistemology Naturalized”, in W. V. Quine, Ontological Relativity and Other Essays. New York: Columbia University Press (1969): 69-90.
  • Rorty, R. (1980), Philosophy and the Mirror of Nature. Princeton: Princeton University Press.
  • Russell, B. (1927), The Analysis of Matter. London: Routledge, Kegan-Paul.
  • Stanford, P. K. (2001), “Refusing the Devil’s Bargain: What Kind of Underdetermination Should We Take Seriously?”, Philosophy of Science 68 (3), S1-S12.
  • Stanford, P.K. (2003a), “Pyrrhic Victories for Scientific Realism”, Journal of Philosophy 100 (11), 553-572.
  • Stanford, P.K. (2003b), “No Refuge for Realism: Selective Confirmation and the History of Science”, Philosophy of Science 70, 917-925.
  • Stanford, P.K. (2006), Exceeding our Grasp. Oxford: Oxford University Press.
  • Stanford, P.K. (2015), “‘Atoms Exist’ is Probably True, and Other Facts That Should Not Comfort Scientific Realists”, Journal of Philosophy 112 (8), 397-416.
  • van Fraassen, B. (1980), The Scientific Image. Oxford: Clarendon Press.
  • van Fraassen, B. (2006), “Structure: its Shadow and Substance”, British Journal for Philosophy of Science 57, 275-307.
  • van Fraassen, B. (2008), Scientific Representation. Oxford: Clarendon Press.
  • Wilson, M. (1982), “Predicate Meets Property”, Philosophical Review 91(4), 549-589.
  • Wilson, M. (1985), “What can Theory Tell us about Observation?”, in Churchland and Hooker 1985.
  • Wilson, M. (1998), “Mechanics, Classical”, in Edward Craig (ed.), The Routledge Encyclopedia of Philosophy Vol. 6, 251-259, London: Routledge.
  • Wilson, M. (2006), Wandering Significance. Oxford: Oxford University Press.
  • Worrall, J. (1989), “Structural Realism: The Best of Both Worlds?”, Dialectica 43, 99–124.

 

Author Information

Michael Liston
Email: mnliston@uwm.edu
University of Wisconsin-Milwaukee
U. S. A.

Stoicism

Stoicism originated as a Hellenistic philosophy, founded in Athens by Zeno of Citium (modern day Cyprus), c. 300 B.C.E. It was influenced by Socrates and the Cynics, and it engaged in vigorous debates with the Skeptics, the Academics, and the Epicureans. The name comes from the Stoa Poikile, or painted porch, an open market in Athens where the original Stoics used to meet and teach philosophy. Stoicism moved to Rome where it flourished during the period of the Empire, alternately being persecuted by Emperors who disliked it (for example, Vespasian and Domitian) and openly embraced by Emperors who attempted to live by it (most prominently Marcus Aurelius). It influenced Christianity, as well as a number of major philosophical figures throughout the ages (for example, Thomas More, Descartes, Spinoza), and in the early 21st century saw a revival as a practical philosophy associated with Cognitive Behavioral Therapy and similar approaches. Stoicism is a type of eudaimonic virtue ethics, asserting that the practice of virtue is both necessary and sufficient to achieve happiness (in the eudaimonic sense). However, the Stoics also recognized the existence of “indifferents” (to eudaimonia) that could nevertheless be preferred (for example, health, wealth, education) or dispreferred (for example, sickness, poverty, ignorance), because they had (respectively, positive or negative) planning value with respect to the ability to practice virtue. Stoicism was very much a philosophy meant to be applied to everyday living, focused on ethics (understood as the study of how to live one’s life), which was in turn informed by what the Stoics called “physics” (nowadays, a combination of natural science and metaphysics) and what they called “logic” (a combination of modern logic, epistemology, philosophy of language, and cognitive science).

Table of Contents

  1. Historical Background
    1. Philosophical Antecedents
    2. Greek Stoicism
    3. Roman Stoicism
    4. Debates with Other Hellenistic Schools
  2. The First Two Topoi
    1. “Logic”
    2. “Physics”
  3. The Third Topos: Ethics
  4. Apatheia and the Stoic Treatment of Emotions
  5. Stoicism after the Hellenistic Era
  6. Contemporary Stoicism
  7. Glossary
  8. References and Further Readings

1. Historical Background

Classically, scholars recognize three major phases of ancient Stoicism (Sedley 2003): the early Stoa, from Zeno of Citium (the founder of the school, c. 300 B.C.E.) to the third head of the school, Chrysippus; the middle Stoa, including Panaetius and Posidonius (late II and I century B.C.E.); and the Roman Imperial period, or late Stoa, with Seneca, Musonius Rufus, Epictetus and Marcus Aurelius (I through II century C.E.). Of course, Stoicism itself originated as a modification of previous schools of thought (Schofield 2003), and its influence extended well beyond the formal closing of the ancient philosophical schools by the Byzantine Emperor Justinian I in 529 C.E. (Verbeke 1983; Colish 1985; Osler 1991).

a. Philosophical Antecedents

Stoicism is a Hellenistic eudaimonic philosophy, which means that we can expect it to be influenced by its immediate predecessors and contemporaries, as well as to be in open critical dialogue with them. These include Socratic thinking, as it has come down to us mainly through the early Platonic dialogues; the Platonism of the Academic school, particularly in its Skeptical phase; Aristotelianism of the Peripatetic school; Cynicism; Skepticism; and Epicureanism. It is worth noting, in order to put things into context, that a quantitative study of extant records concerning known philosophers of the ancient Greco-Roman world (Goulet 2013) estimates that the leading schools of the time were, in descending order: Academics-Platonists (19%), Stoics (12%), Epicureans (8%), and Peripatetics-Aristotelians (6%).

Eudaimonia was the term that meant a life worth living, often translated nowadays as “happiness” in the broad sense, or more appropriately, flourishing. For the Greco-Romans this often involved—but was not necessarily entirely defined by—excellence at moral virtues. The idea is therefore closely related to that of virtue ethics, an approach most famously associated with Aristotle and his Nicomachean Ethics (Broadie & Rowe 2002), and revived in modern times by a number of philosophers, including Philippa Foot (2001) and Alasdair MacIntyre (1981/2013).

Stoicism is best understood in the context of the differences among some of the similar schools of the time. Socrates had argued—in the Euthydemus, for instance (McBrayer et al. 2010)—that virtue, and in particular the four cardinal virtues of wisdom, courage, justice and temperance, are the only good. Everything else is neither good nor bad in and of itself. By contrast, for Aristotle the virtues (of which he listed a whopping twelve) were necessary but not sufficient for eudaimonia. One also needed a certain degree of positive goods, such as health, wealth, education, and even a bit of good looks. In other words, Aristotle expounded the rather commonsensical notion that a flourishing life is part effort, because one can and ought to cultivate one’s character, and part luck, in the form of the physical and cultural conditions that affect and shape one’s life.

Contrast this with the rather extreme (even for the time) take of the Cynics, who not only thought that virtue was the only good, like Socrates, but that the additional goods that Aristotle was worried about were actually distractions and needed to be positively avoided. Cynics like Diogenes of Sinope were famous for their ascetic and, shall we say, rather eclectic lifestyle, as is epitomized by a story about him told by Diogenes Laertius (VI.37): “One day, observing a child drinking out of his hands, he cast away the cup from his wallet with the words, ‘A child has beaten me in plainness of living.’”

Diogenes and the boy without a cup

One way to think of this is that the Aristotelian approach comes across as a bit too aristocratic: if one does not have certain privileges in life, one cannot achieve eudaimonia. By contrast, the Cynics were preaching a rather extreme minimalist lifestyle, which is hard to practice for most human beings. What the Stoics tried to do, then, was to strike a balance in the middle, by endorsing the twin crucial ideas, on which I will elaborate later, that virtue is the only true good, in itself sufficient for eudaimonia regardless of one’s circumstances, but also that other things—like health, education, wealth—may be rationally preferred (Proēgmena) or “dispreferred” (Apoproēgmena), as in the case of sickness, ignorance, and poverty, as long as one did not confuse them for things with inherent value.

b. Greek Stoicism

The “Greek” phase of the Stoa covers the first and second periods, from the founding of the school by Zeno to the shifting of the center of gravity from Athens to Rome in the time of Posidonius in the I Century B.C.E., who became a friend of Cicero—not a Stoic himself, but one of our best indirect sources on early Stoicism. Stoicism was not just born, but flourished in Athens, even though most of its exponents originated from the Eastern Mediterranean: Zeno from Citium (modern Cyprus), Cleanthes from Assos (modern Western Turkey), and Chrysippus from Soli (modern Southern Turkey), among others. According to Sedley (2003), this pattern is simply a reflection of the dominant cultural dynamics of the time, affected as they were by the conquests of Alexander.

From the beginning Stoicism was squarely a “Socratic” philosophy, and the Stoics themselves did not mind such a label. Zeno began his studies under the Cynic Crates, and Cynicism always had a strong influence on Stoicism, all the way to the later writings of Epictetus. But Zeno also counted among his teachers Polemo, the head of the Academy, and Stilpo, of the Megarian school founded by Euclid of Megara, a pupil of Socrates. This is relevant because Zeno came to elaborate a philosophy that was both of clear Socratic inspiration (virtue is the Chief Good) and a compromise between Polemo’s and Stilpo’s positions, as the first one endorsed the idea that there are external goods—though they are of secondary importance—while the second one claimed that nothing external can be good or bad. That compromise consisted in the uniquely Stoic notion that external goods are of ethically neutral value, but are nonetheless the object of natural pursuit.

Zeno established the tripartite study of Stoic philosophy (see the three topoi) comprising ethics, physics and logic. The ethics was basically a moderate version of Cynicism; the physics was influenced by Plato’s Timaeus (Taran 1971) and encompassed a universe permeated by an active (that is, rational) and a passive principle, as well as a cosmic web of cause and effect; the logic included both what we today refer to as formal logic and epistemology, that is, a theory of knowledge, which for the Stoics was decidedly empiricist-naturalistic.

The Stoics after Zeno disagreed on a number of issues, often interpreting Zeno’s teachings differently. Perhaps the most important example is provided by the dispute between Cleanthes and Chrysippus about the unity of the virtues: Zeno had talked about each virtue in turn being a kind of wisdom, which Cleanthes interpreted in a strict unitary sense (that is, all virtues are one: wisdom), while Chrysippus understood it in a more pluralistic fashion (that is, each virtue is a “branch” of wisdom).

The early Stoics could also be stubbornly anti-empirical in their apologetics of Zeno’s writings, as when Chrysippus insisted on defending the idea that the heart, not the brain, is the seat of intelligence. This went against pretty conclusive anatomical evidence that was already available in the Hellenistic period, and earned the Stoics the scorn of Galen (for example, Tieleman 2002), though later Stoics did update their beliefs on the matter.

Despite this faux pas, Chrysippus was arguably the most influential Stoic thinker, responsible for an overhaul of the school, which had declined under the guidance of Cleanthes, a broad systematization of its teachings, and the introduction of a number of novel notions in logic—the aspect of Stoicism that has had the most technical philosophical impact in the long run. Famously, Diogenes Laertius (2015, VII.183) wrote that “But for Chrysippus, there had been no Porch.”

In the six decades following Chrysippus there were just two heads of the Stoa, Zeno of Tarsus (south-central Turkey) and Diogenes of Babylon, whose contributions were rather less significant than those of Chrysippus himself. We have to wait until 155 B.C.E. for the next impactful event, when the heads of the three major schools in Athens—the Stoics, the Academics and the Peripatetics—were sent by the city to Rome in order to help with diplomatic efforts. (It is interesting to note, as does Sedley (2003), that the fourth large school, the Epicurean one, was missing, following their stance of political non-involvement.) The philosophers in question, including the Stoic Diogenes of Babylon, made a huge impression on the Roman public with their public performances (and, apparently, an equally worrisome one on the Roman elite, thus beginning a long tradition of tension between philosophers and high-level politicians that characterized especially the post-Republican empire), paving the road for the later shift of philosophy from Athens to Rome, as well as other centers of learning, like Alexandria.

Beginning with Antipater of Tarsus, and then more obviously Panaetius (late II Century B.C.E.) and Posidonius (early I Century B.C.E.), the Stoics revisited their relationship with the Academy, especially in light of the above mentioned importance of the Timaeus for Stoic cosmology. Apparently, what particularly interested Posidonius was the fact that Plato’s main character in the dialogue is a Pythagorean, a school that Posidonius somewhat anachronistically managed to link to Stoicism.

It appears that the broader project pursued by both Panaetius and Posidonius was one of seeking common ground (Sedley 2003 uses the term “syncretism”) among Academicism, Aristotelianism and Stoicism itself, that is, the three branches of Socratic philosophy. This process seems to have been in part responsible for the further success of Stoicism once the major philosophers of the various schools moved from Athens to Rome, after the diaspora of 88-86 B.C.E.

c. Roman Stoicism

If the visit to Rome by the heads of various philosophical schools in 155 B.C.E. was crucial for bringing philosophy to the attention of the Romans, the political events of 88-86 B.C.E. changed the course of Western philosophy in general, and Stoicism in particular, for the remainder of antiquity.

At that time philosophers, particularly the Peripatetic Athenion and—surprisingly—the Epicurean Aristion, were politically in charge at Athens, and made the crucial mistake of siding with Mithridates against Rome (Bugh 1992). The defeat of the King of Pontus, and consequently of Athens, spelled disaster for the latter and led to a diaspora of philosophers throughout the Mediterranean.

To be fair, we have no evidence of the continuation of the Stoa as an actual school in Athens after Panaetius (who often absented himself to Rome anyway), and we know that Posidonius taught in Rhodes, not Athens. However, according to Sedley (2003), it was the events of 88-86 B.C.E. that finally and permanently moved the center of gravity of Stoicism away from its Greek cradle to Rome, Rhodes (where an Epicurean school also flourished), and Tarsus, where a Stoic was at one point chosen by Augustus to govern the city.

Most crucially, however, Stoicism became important in Rome during the fraught time of the transition between the late Republic and the Empire, with Cato the Younger eventually becoming a role model for later Stoics because of his political opposition to the “tyrant” Julius Caesar. Sedley highlights two Stoic philosophers of the late First Century B.C.E., Athenodorus of Tarsus and Arius Didymus, as precursors of one of the greatest and most controversial Stoic figures, Seneca. Both Athenodorus and Arius were personal counselors to the first emperor, Augustus, and Arius even wrote a letter of consolation to Livia, Augustus’ wife, addressing the death of her son, which Seneca later hailed as a reference work of emotional therapy, the sort of work he himself engaged in and became famous for.

Once we get to the Imperial period (Gill 2003), we see a decided shift away from the more theoretical aspects of Stoicism (the “physics” and “logic,” see below) and toward more practical treatments of the ethics. However, as Gill points out, this should not lead us to think that the vitality of Stoicism had taken a nose dive by then: we know of a number of new treatises produced by Stoic writers of that period, on everything ranging from ethics (Hierocles’ Elements of Ethics) to physics (Seneca’s Natural Questions), and the Summary of the Traditions of Greek Theology by Cornutus is one of a handful of complete Stoic treatises to survive from any period of the history of the school. Still, it is certainly the case that the best known Stoics of the time were either teachers like Musonius Rufus and Epictetus, or politically active, like Seneca and Marcus Aurelius, thus shaping our understanding of the period as a contrast to the foundational and more theoretical one of Zeno and Chrysippus.

Importantly, it is from the late Republic and Empire that we also get some of the best indirect sources on Stoicism, particularly several books by Cicero (2014; for example, Paradoxa Stoicorum, De Finibus Bonorum et Malorum, Tusculanae Quaestiones, De Fato, Cato Maior de Senectute, Laelius de Amicitia, and De Officiis) and Diogenes Laertius’ Lives of the Eminent Philosophers (Book VII, 2015). And this literature went on to influence later writers well after the decline of Stoicism, particularly Plotinus (205-270 C.E.) and even the 6th Century C.E. Neoplatonist Simplicius.

All of the above notwithstanding, what is most vital about Stoicism during the Roman Imperial period is also what arguably made the philosophy’s impact reverberate throughout the centuries, eventually leading to two revivals, the so-called Neostoicism of the Renaissance, and the current “modern Stoicism” movement to which I will turn at the end of this essay. The sources of such vitality were fundamentally two: on the one hand charismatic teachers like Musonius and Epictetus, and on the other hand influential political figures like Seneca and Marcus. Indeed, Musonius was, in a sense, both: not only was he a member of the Roman “knight” class, and the teacher of Epictetus, he was also politically active, openly criticizing the policies of both Nero and Vespasian, and getting exiled twice as a result. Others were not so lucky: Stoic philosophers suffered a series of persecutions from displeased emperors, which resulted in murders or exile for a number of them, especially during the reigns of Nero, Vespasian and Domitian. Seneca famously had to commit suicide on Nero’s orders, and Epictetus was exiled to Greece (where he established his school at Nicopolis) by Domitian.

It is also important to appreciate different “styles” of being Stoic among the major Roman figures. As Gill (2003) points out, Epictetus was rather strict, harking back to the Cynic model of quasi-asceticism (see, for instance, his “On Cynicism” in Discourses III.22). Musonius was a sometimes odd combination of “conservative” and “progressive” Stoic, advocating the importance of marriage and family, but also stating very clearly that women are just as capable of practicing virtue and philosophizing as men are, and moreover that it is hypocritical of men to consider their extramarital sexual activities differently from those of women! Seneca was not only more open to the pursuit of “preferred indifferents” (he was a wealthy Senator, but it seems unfair to accuse him of endorsing a simplistic self-serving philosophy: see the nuanced biographies by Romm 2014 and Wilson 2014), but explicitly stated that he was critical of some of the doctrines of the early Stoics, and that he was open to learn from other schools, including the Epicureans. Famously, Marcus Aurelius was open—one would almost want to say agnostic—about theology, at several points in the Meditations (1997) explicitly stating the two alternatives of “Providence” (Stoic doctrine) or “Atoms” (the Epicurean take), for instance: “Either there is a fatal necessity and invincible order, or a kind Providence, or a confusion without a purpose and without a director. If then there is an invincible necessity, why do you resist? But if there is a Providence that allows itself to be propitiated, make yourself worthy of the help of the divinity. But if there is a confusion without a governor, be content that in such a tempest you have yourself a certain ruling intelligence” (XII.14); or: “With respect to what may happen to you from without, consider that it happens either by chance or according to Providence, and you must neither blame chance nor accuse Providence” (XII.24). More is said about this specific topic in the section on Stoic metaphysics and teleology.

There is ample evidence, then, that Stoicism was alive and well during the Roman period, although the emphasis did shift—somewhat naturally, one might add—from laying down the fundamental ideas to refining them and putting them into practice, both in personal and social life.

d. Debates with Other Hellenistic Schools

One should understand the evolution of all Hellenistic schools of philosophy as being the result of continuous dialogue amongst themselves, a dialogue that often led to partial revisions of positions within any given school, or to the adoption of a modified notion borrowed from another school (Gill 2003). To have an idea of how this played out for Stoicism, let us briefly consider a few examples, related to the interactions between Stoicism and Epicureanism, Aristotelianism, and Platonism—without forgetting the direct influence that Cynicism had on the very birth of Stoicism and all the way to Epictetus.

Epictetus is pretty explicit about his—negative—opinions of the Epicureans, drawing as sharp a contrast as possible between the latter’s concern with pleasure and pain and the Stoic focus on virtue and integrity of character. For example, Discourses I.23 is entitled “Against Epicurus,” and begins: “[1] Even Epicurus realizes that we are social creatures by nature, but once he has identified our good with the shell, he cannot say anything inconsistent with that. [2] For he further insists—rightly—that we must not respect or approve anything that does not share in the nature of what is good.” “The shell” here is the body, a reference to the Epicureans’ insistence on pleasure and the absence of pain as what leads to ataraxia, or tranquillity of mind—a term interestingly different from the one preferred by the Stoics, apatheia, or lack of disturbing emotions, as shall be seen below.

A longer section, II.20, is entitled “Against the Epicureans and the Academics,” at the beginning of which Epictetus calls the bluff, in his mind, on the rivals’ theories, which he understands as clearly impractical and contrary to common sense: “[1] Even people who deny that statements can be valid or impressions clear [that is, the Skeptics] are obliged to make use of both. You might almost say that nothing proves the validity of a statement more than finding someone forced to use it while at the same time denying that it is sound.” Epictetus even goes so far as suggesting that Epicurus is incoherent, as he advises a life of retired tranquility away from society, and yet bothers to write books about it, thus showing himself to be concerned about the welfare of society after all: “[15] What urged him to get out of bed and write the things he wrote was, of course, the strongest element in a human being—nature—which subjected him to her will despite his loud resistance.”

Attacking the Skeptics among the Academics, Epictetus turns up the rhetoric significantly: “What a travesty! [28] What are you doing? You prove yourself wrong on a daily basis and still you won’t give up these idle efforts. When you eat, where do you bring your hand—to your mouth, or to your eye? What do you step into when you bathe? When did you ever mistake your saucepan for a dish, or your serving spoon for a skewer?” And he sees his invective as justified—in sure Stoic fashion—not on theoretical grounds, but on practical ones: “[35] We could give adulterers grounds for rationalizing their behavior; such arguments could provide pretexts to misappropriate state funds; a rebellious young man could be emboldened further to rebel against his parents. So what, according to you, is good or bad, virtuous or vicious—this or that?”

Even so, not all Stoics rejected either Academic or Epicurean ideas altogether. I have mentioned Marcus Aurelius’ relative “agnosticism” about Providence vs. Atoms (though he clearly preferred the first option, in line with standard Stoic teaching), and Seneca is often sympathetic to Epicurean views, though, as Gill (2003, note 58) comments, this is in the spirit of showing that even some of the rival school’s ideas are congruent with Stoic ones. He very clearly states, however, in Natural Questions: “I do not agree with [all] the views of our school” (2014, VII.22.1).

Cicero, in Book III of De Finibus, provides us with some glimpses of the disagreement between Stoics and Aristotelians, by way of his imaginary dialogue with Cato the Younger. At [41] he writes: “Carneades never ceased to contend that on the whole so-called ‘problem of good and evil,’ there was no disagreement as to facts between the Stoics and the Peripatetics, but only as to terms. For my part, however, nothing seems to me more manifest than that there is more of a real than a verbal difference of opinion between those philosophers on these points.” He continues: “The Peripatetics say that all the things which under their system are called goods contribute to happiness; whereas our school does not believe that total happiness comprises everything that deserves to have a certain amount of value attached to it,” referring to the different treatment of “external goods” between Aristotelians and Stoics.

There are well documented examples of Stoic opinions changing in direct response to challenges from other schools, for instance the modified position on determinism that was adopted by Philopator (80-140 C.E.), a result of criticism from both the Peripatetic and the Middle Platonist philosophers. We also have clear instances of Stoic ideas being incorporated by other schools, as in the case of Antiochus of Ascalon (130-69 B.C.E.), who introduced Stoic notions in his revision of Platonism, justifying the move by claiming that Zeno (and Aristotle, for that matter) developed ideas that were implicit in Plato (Gill 2003). Finally, Stoicism found its way into Christianity via Middle Platonism, at the least since Clement of Alexandria (150-215 C.E.).

2. The First Two Topoi

A fundamental aspect of Stoic philosophy is the twofold idea that ethics is central to the effort, and that the study of ethics is to be supported by two other fields of inquiry, what the Stoics called “logic” and “physics.” Together, these form the three topoi of Stoicism.

We will take a closer look at each topos in turn, but it is first important to see why and how they are connected. Stoicism was a practical philosophy, the chief goal of which was to help people live a eudaimonic life, which the Stoics identified with a life spent practicing the cardinal virtues (next section). Later in the Roman period the emphasis shifted somewhat to the achievement of apatheia, but this too was possible because of the practice of the topos of ethics.

This, in turn, was to be supported by the study of the other two topoi, “logic,” which was more expansive than the modern technical meaning of the term, including logic sensu stricto, but also a theory of knowledge (that is, epistemology), as well as cognitive science, and “physics,” by which the Stoics meant roughly what we would today identify as a combination of natural science and metaphysics (the latter including theology). Roughly, then, “logic” means the study of how to reason about the world, while “physics” means the study of that world.

Logic and Physics are related to Ethics because Stoicism is a thoroughly naturalistic philosophy. Even when the Stoics are talking about “God” or “soul,” they are referring to physical entities, respectively identified with the rational principle embedded in the universe itself and with whatever makes human rationality possible. Stoics often invoked creative imagery to explain the relationship among Physics, Logic and Ethics, as found in Diogenes Laertius (VII.39), for instance. Perhaps the most famous of such analogies is the one using an egg, where the shell is the Logic, the white the Ethics, and the red part the Physics. However, given how the three topoi were meant to relate to each other, this is probably misleading, possibly due to a misunderstanding of the biology of eggs (the Physics is supposed to be nurturing the Ethics, which means that the former should be the white and the latter the red part of the egg). The best simile in my mind is that of a garden: the fence is the Logic—defending the precious inside and defining its boundaries; the fertile soil is the Physics—providing the nutritive power by way of knowledge of the world; and the resulting fruits are the Ethics—the actual focal objective of Stoic teachings.

While the Stoics disagreed on the sequence in which the three topoi should be presented to students (that is, just like faculty in a modern university, they had contrasting opinions about the merits of different curricula!), the crucial point is that of a naturalistic philosophy where there is no sharp distinction between “is” and “ought,” as assumed in much modern moral philosophy, because what an agent ought to do (Ethics) is in fact closely informed by that agent’s knowledge of the workings of the world (Physics) as well as her capacity to reason correctly (Logic). This section describes the first two topoi and the next describes Ethics.

a. “Logic”

Stoics made important early contributions to both epistemology (Hankinson 2003) and logic proper (Bobzien 2003), and much has, deservedly, been written about both. While Stoics held that the Sage, who was something of an ideal figure, could achieve perfect knowledge of things, in practice they relied on a concept of cognitive progress, as well as moral progress, since both logic and physics are related to, and indeed function in the service of, ethics. They referred to this idea as prokopê (making progress), and they engaged in a long-running dispute with Academic Skeptics about just how defensible this notion actually is.

Unlike the Epicureans, Stoics did not maintain that all impressions are true, but rather that some of them were “cataleptic” (that is, leading to comprehension) and others were not. Diogenes Laertius explains the difference (VII.46): “the cataleptic, which [the Stoics] hold to be the criterion of matters, is that which comes from something existent and is in accordance with the existent thing itself, and has been stamped and imprinted; the non-cataleptic either comes from something non-existent, or if from something existent then not in accordance with the existent thing; and it is neither clear, nor distinct.”

So the Stoics did admit that one’s perception can be wrong, as in cases of hallucinations, or dreams, or other sources of phantasma (that is, impressions on the mind, the result of automatic—we would say unconscious—judgment), but also that proper training allows one to make progress in distinguishing cataleptic from non-cataleptic impressions (that is, impressions to which we may reasonably give or withhold assent). Chrysippus even suggested that it is important to absorb a number of impressions, since it is the accumulation of impressions that leads to concept-formation and to making progress. In this sense, the Stoic account of knowledge was eminently empiricist in nature, and—especially after relentless Skeptical critiques—relied on something akin to what moderns call inference to the best explanation (Lipton 2003), as in their conclusion that our skin must have holes based on the observation that we sweat.

It is important to realize that a cataleptic impression is not quite knowledge. The Stoics distinguished among opinion (weak, or false), apprehension (characterized by an intermediate epistemic value), and knowledge (which is based on firm impressions unalterable by reason). Giving assent to a cataleptic impression is a step on the way to actual knowledge, but the latter is more structured and stable than any single impression could be. In a sense, then, the Stoics held to a coherentist view of justification (for example, Angere 2007), and ultimately, like all ancients, to a correspondence theory of truth (for example, O’Connor 1975).

Hankinson (2003) comments on an interesting aspect of the dispute between Stoics and Academic Skeptics, concerning the epistemic warrant to be granted to cataleptic impressions. What, precisely, makes them “clear and distinct,” a Stoic terminology that clearly anticipates Descartes (who, obviously, was not an empiricist)? If clarity and distinctiveness are internal features of cataleptic impressions, then these are phenomenal features, and it is easy to come up with counterexamples where they do not seem to work (for instance, the common occurrence of mistaking one member of a pair of twins for the other one).

This is where we encounter one of the many episodes of growth of Stoic thought in response to external pressure. Cicero tells us (2014, in Academica II.77) that Zeno was aware that the same impression could derive from something that did or did not exist, so he modified his stance (as Diogenes Laertius reports: VII.50), adding the following clause: “of such a type as could not come from something non-existent.” Of course this does not solve the issue, but it builds on the Stoic metaphysical assumption that there cannot be two things that are exactly alike, as much as at times it may appear so to us. Frede (1983) advanced the further view that what makes a cataleptic impression clear and distinct is not any internal feature of that impression, but rather an external causal feature related to its origin. According to this account, then, Stoic epistemology is externalist (for example, Almeder 1995), rather than internalist (for example, Goldman 1980). Indeed, there is evidence that the Stoics became, again as a result of criticism from the Skeptics, reliabilists about knowledge (Goldman 1994). Athenaeus tells the story of Sphaerus, a student of Cleanthes and colleague of Chrysippus, who was shown at a banquet what turned out to be birds made of wax. After he reached to pick one up he was accused of having given assent to a false impression. To which he replied, rather cleverly but indicatively, that he had merely assented to the proposition that it was reasonable to think of the objects as actual birds, not to the stronger claim that they actually were birds.

When it comes to the area of Stoic “logic” that is closest to our, much narrower, conception of the field, the school made major contributions. Their system of syllogistics recognized that not all valid arguments are syllogisms and significantly differs from Aristotle’s, having more in common with modern-day relevance logic (Bobzien 2006). To simplify quite a bit (but see Bobzien 2003 for a somewhat in-depth treatment), Stoic syllogistics was built on five basic types of syllogisms, and complemented by four rules for arguments that could be deployed to reduce all other types of syllogisms to one of the basic five.

The broader Stoic approach to logic has been characterized as a type of propositional logic, anticipating aspects of Frege’s work (Beaney 1997). Stoic logic made a fundamental distinction between “sayables” and “assertibles.” The former are a broader category that includes assertibles as well as questions, imperatives, oaths, invocations and even curses. The assertibles then are self-complete sayables that we use to make statements. For instance, “If Zeno is in Athens then Zeno is in Greece” is a conditional composite assertible, constructed out of the individual simple assertibles “Zeno is in Athens” and “Zeno is in Greece.” A major difference between Stoic assertibles and Fregean propositions is that the truth or falsehood of assertibles can change with time: “Zeno is in Athens” may be true now but not tomorrow, and it may become true again next month. It is also important to note that truth or falsehood are properties of assertibles, and indeed that being either true or false is a necessary and sufficient condition for being an assertible (that is, one cannot assert, or make statements about, things that are neither true nor false).
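One way to make the time-dependence of assertibles concrete is a small sketch in Python. It is only an illustration under stated assumptions: the names Assertible, zeno_city, and conditional are hypothetical, the itinerary is invented to mirror the example above, and the conditional is given a simple truth-functional reading, which is a simplification rather than a claim about how the Stoics themselves analyzed conditionals.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption of this sketch: time is a day count starting from "now" (day 0).
Time = int


@dataclass(frozen=True)
class Assertible:
    """A self-complete sayable that is either true or false at any given time."""
    text: str
    holds_at: Callable[[Time], bool]

    def is_true(self, t: Time) -> bool:
        return self.holds_at(t)


def zeno_city(t: Time) -> str:
    """Invented itinerary: in Athens today and again on day 30, away otherwise."""
    return "Athens" if t in (0, 30) else "Citium"


# Hypothetical lookup used only to evaluate "Zeno is in Greece".
CITIES_IN_GREECE = {"Athens"}

in_athens = Assertible("Zeno is in Athens", lambda t: zeno_city(t) == "Athens")
in_greece = Assertible("Zeno is in Greece", lambda t: zeno_city(t) in CITIES_IN_GREECE)


def conditional(antecedent: Assertible, consequent: Assertible) -> Assertible:
    """Composite assertible, read truth-functionally (a simplifying assumption)."""
    return Assertible(
        f"If {antecedent.text} then {consequent.text}",
        lambda t: (not antecedent.is_true(t)) or consequent.is_true(t),
    )


rule = conditional(in_athens, in_greece)

# Print the truth value of the simple and the composite assertible on three days.
for day in (0, 1, 30):
    print(day, in_athens.is_true(day), rule.is_true(day))
```

In this toy model the simple assertible “Zeno is in Athens” changes its truth value from one day to the next, while the composite conditional holds on every day of the invented itinerary. Modeling an assertible as a function from times to truth values is a design choice of the sketch, meant to capture the contrast with timeless Fregean propositions.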

The Stoics were concerned with the validity of arguments, not with logical theorems or truths per se, which again is understandable in light of their interest in using logic to guard the fruits of their garden, the ethics. They also introduced modality into their logic, most importantly the modal properties of necessity, non-necessity, possibility, impossibility, plausibility and probability. This was a very modern and practically useful approach, as it directed attention to the fact that some assertibles induce assent even though they may be false, as well as to the observation that some assertibles have a higher likelihood of being true than not. Finally, the Stoics, and Chrysippus in particular, were sensitive to and attempted to provide an account of logical paradoxes such as the Liar and Sorites cases along lines that we today recognize as related to a semantics of vagueness (Tye 1994).

b. “Physics”

The Stoic topos of Physics includes what we today would classify as natural science (White 2003), metaphysics (Brunschwig 2003), and theology (Algra 2003). Let us briefly look at each in turn.

When it comes to natural science and cosmology, recall that the Stoics sought to “live according to nature,” which requires us to make our best efforts to understand nature. This also implies a very different view of natural science from the modern one: its study is not an end in itself, but is rather subordinate to the goal of helping us live a eudaimonic life.

Stoics thought that everything real, that is, everything that exists, is corporeal, including God and soul. They also recognized a category of incorporeals, which included things like the void, time, and the “sayables” (meanings, which played an important role in Stoic Logic). This may appear as a contradiction, given the staunchly materialist nature of Stoic philosophy, but is really no different from a modern philosophical naturalist who nonetheless grants that one can meaningfully talk about abstract concepts (“university,” “the number four”) which are grounded in materialism because they can only be thought of by corporeal beings such as ourselves.

They embraced what we might call a “vitalist” understanding of nature, which is permeated by two principles: an active one (identified with reason and God, referred to as the Logos) and a passive one (substance, matter). The active principle is un-generated and indestructible, while the passive one—which is identified with the four classical elements of water, fire, earth and air—is destroyed and recreated at every, eternally recurring, cosmic conflagration, a staple of Stoic cosmology. The cosmos itself is a living being, and its rational principle (Logos) is identified with aether, or the Stoic Fire (not to be confused with the elemental fire that is part of the passive principle). Consequently, God is immanent in the universe, and it is in fact identified with the creative cosmic Fire. This also means that the Stoics, unlike the Aristotelians, did not recognize the concept of a prime mover, nor of a Christian-type God outside of time and space, on the ground that something incorporeal cannot act on things, because it has no causal powers. From all of this, as White (2003) puts it, emerges a biological, rather than a mechanical picture of causation, which is significantly different from post-Cartesian and Newtonian mechanical philosophy.

Cosmic conflagrations, for the Stoics, repeat themselves in exactly the same manner, apparently because God/Nature laid out things in the best possible way the previous time around, and there is therefore no reason to change (though one would get the same outcome from an entirely deterministic causal model of the universe). It is interesting to muse about the fact that some modern cosmological models also predict either identical or varied recurring universes (Unger and Smolin 2014), but of course do away with the concept of Providence altogether. According to Eusebius (quoted by White), during the phase of cosmic conflagration, the creative Fire is “a sort of seed, which possesses the principles of all things and the causes of all things that have occurred, are occurring, and will occur—the interweaving and ordering of which is fate, knowledge, truth, and a certain inevitable and inescapable law of the things that exist.”

Cicero, in De Fato, lays out the Stoic theory of causality and actually equates fate with antecedent causes. Chrysippus had argued that there is no possibility of motion without causes, deducing that therefore everything has a cause. This concept of universal causality led the Stoics to accept divination as a branch of physics, not a superstition, as explained again by Cicero in De Divinatione, and this makes sense once one understands the Stoic view of the cosmos: predicting the future is not something that one does by going outside the laws of physics, but by intelligently exploiting such laws.

Metaphysically the Stoics were determinists (Frede 2003). Here is Cicero: “[the Stoics] say, that it is impossible, when all the circumstances surrounding both the cause and that of which it is a cause are the same, that things should not turn out a certain way on one occasion but that they should turn out that way on some other occasion” (De Fato, 199.22-25). The Stoics did have a concept of chance, but they thought of it (much like modern scientists) as a measure of human ignorance: random events are simply events whose causes are not understood by humans.

The consequences of Stoic physics for their ethics are clear, and are summarized again by Cicero, when he says that Chrysippus aimed at a middle position between what we today would call strict incompatibilism and libertarianism (Griffith 2013). White (2003) interestingly notes in this respect that—just like Spinoza—the Stoics shifted the emphasis from moral responsibility to moral worth and dignity.

In terms of fundamental ontology, the Stoics were anti-corpuscularian (unlike the pre-Socratic Atomists, and the Stoics’ chief rivals, the Epicureans), on the grounds that the idea of atoms violated their concept of a seamless unity of the cosmos. It is tempting to see this as in the same ballpark as modern quantum mechanical theories that see the entire universe as constituted of a single “wave function” (Ladyman and Ross 2009), but of course this would be an anachronistic interpretation.

3. The Third Topos: Ethics

Stoic Ethics was not just another theoretical subject, but an eminently practical one. Indeed, especially for the later Stoics, ethics—understood as the study of how to live one’s life—was the point of doing philosophy. It was no easy task: Epictetus famously said (in Discourses III.24.30): “The philosopher’s lecture room is a hospital: you ought not to walk out of it in a state of pleasure, but in pain—for you are not in good condition when you arrive!” The starting point for Epictetus was the famous dichotomy of control, as expressed at the very beginning of the Enchiridion: “We are responsible for some things, while there are others for which we cannot be held responsible” (also translated as “Some things are up to us, other things are not up to us”).

The early Stoics were somewhat more theoretical in their approach, with Zeno, Cleanthes and Chrysippus attempting to both systematize their doctrines and defend them from critiques from both Epicurean and especially Academic-Skeptic quarters. The early Stoa’s famous motto in ethics was “follow nature” (or “live according to nature”), by which they meant both the rational-providential aspect of the cosmos (see Physics above) and more specifically human nature, which they conceived as that of a social animal capable of bringing rational judgment to bear on problems posed by how to live one’s life. (It appears that Zeno’s original articulation of the principle was “live consistently” to which Cleanthes added the clarifying clause “with nature”: Schofield 2003.) Tightly related to this idea of following (human) nature was the Stoic concept of oikeiôsis, often translated as affinity, or appropriation. For the Stoics human beings have natural propensities to develop morally, propensities that begin as what we today would call instincts and can then be greatly refined with the onset of the age of reason at the childhood stage and beyond. It is interesting to note that this naturalistic account of the roots of virtuous/moral behavior is highly compatible with modern findings in both evolutionary and cognitive science (for example, Putnam and others 2014).

Specifically, we naturally: (i) behave in such a fashion as to advance our interests and goals (health, wealth, and so forth); (ii) identify with other people’s interests (initially our parents, then friends, then countrymen); (iii) figure out ways to practically navigate the vicissitudes of life. The Stoics related these propensities directly to the four cardinal virtues of temperance, courage, justice and practical wisdom. Temperance and courage are required to pursue our goals, justice is a natural extension of our concern for an ever-increasing circle of people, and practical wisdom (phronêsis) is what best allows us to deal with whatever happens.

Which brings us to the matter of how the virtues are related to each other. To begin with, the Stoics recognized the above-mentioned four cardinal virtues, but also a number of more specific ones within each major category (complete list in Sharpe 2014, derived from Stobaeus): for instance, practical wisdom included good judgment, discretion, resourcefulness; temperance could be broken down into propriety, sense of honor, self-control; courage was divided into perseverance, confidence, magnanimity; and justice comprised piety, kindness, sociability. Even so, they held to a view of virtue that is much more unitary than it may come across from this kind of list (Schofield 2003). The cardinal virtues are derived from Socrates, especially in Plato’s Republic, and so is a certain unifying way of considering the virtues. Justice can be conceptualized as practical wisdom applied to social living; courage as wisdom concerning endurance; and temperance as wisdom with regard to matters of choice. Chrysippus further elaborated this idea of pluralism within an underlying unity, making the virtues essentially inseparable, so that, say, one cannot be courageous and yet intemperate, in the Stoic sense of those words.

Hadot (1998) draws a series of parallels between the four virtues, the three topoi and what are referred to as the three Stoic disciplines: desire, action, and assent. The discipline of desire, sometimes referred to as Stoic acceptance, is derived from the study of physics, and in particular from the idea of universal cause and effect. It consists in training oneself to desire what the universe allows and not to pursue what it does not allow. A famous metaphor here, used by Epictetus, is that of a dog leashed to a cart: the dog can either fight the cart’s movement at every inch, thus hurting himself and ending up miserable; or he can decide to gingerly go along with the ride and enjoy the panorama. This is a version of what Nietzsche eventually called amor fati (love your fate), and that is encapsulated in Epictetus’ phrase “endure [what the universe throws your way] and renounce [what the universe does not allow]” (Fragments 10). Consequently, according to Hadot, the discipline of desire is linked to the virtues of courage (to follow the order of the cosmos) and temperance (to be able to control one’s desires).

The second discipline, of action, is also called Stoic “philanthropy” and is the most prosocial of the three disciplines. The basic idea is that human beings ought to develop their natural concern for others in a way that is congruent with the exercise of the virtue of justice. Here the area of study most directly connected to the discipline is that of ethics itself. A representative quote is perhaps the one found in Marcus Aurelius’ Meditations (VIII.59): “Men exist for the sake of one another. Teach them then or bear with them.” The first sentence is a statement of philanthropy (in the Stoic, not modern, sense), while the second one makes it clear that for the emperor this was a duty to be performed either by engaging other people positively or at the very least by suffering their non-virtuous behavior, if that is the case.

The last discipline is that of assent, referred to as Stoic “mindfulness” (not to be confused with the variety of Buddhist concepts by the same name, especially the Zen one). I will get back to the concept of assent in the next section, as it is related to the Stoic treatment of the (moral) psychology of emotions, but for now suffice it to say that the discipline concerns the need to make decisions about what to accept or reject of our experience of the world, that is, how to make proper judgments. It is therefore linked to the virtue of practical wisdom, as well as to the area of study of logic. If we had to summarize it in a single sentence, Seneca’s “bring the mind to bear upon your problems” (On Tranquility of Mind, X.4) may be appropriate.

As we have seen so far, Stoic ethics is concerned exclusively with the concept of virtue (and associated disciplines)—whether understood as a unitary thing with a number of facets or otherwise. In this the Stoics were akin to the Cynics and unlike the Peripatetics, who instead allowed that a number of other things are necessary for a eudaimonic life, including (some) wealth, health, education, and so forth. The Peripatetics would not have assented to the idea of a eudaimonic Sage on the rack, a classic Stoic concept.

However, Stoic ethics actually attempts to strike a balance between the asceticism of the Cynics and the somewhat elitist views of the Peripatetics. It does so through the introduction of the Stoic concept of preferred and dispreferred “indifferents” briefly mentioned at the beginning. This is found already in Zeno’s book on Ethics, which is now lost, but about which we know from Diogenes Laertius (VII.4). Zeno distinguished between indifferents that have value (axia) and those that have disvalue (apaxia). The first group included things like health, wealth and education, while the second group comprised things like sickness, poverty and ignorance. The move was a brilliant one: as I argued above, it allowed the Stoics to get the best of both the Cynic and the Peripatetic worlds. Yes, it is true that, if they don’t get in the way of practicing virtue, some indifferents are preferred; but they are called indifferents for a reason: they do not truly matter for the pursuit of the (moral) eudaimonic life. In other words, while it is undeniable that people naturally and rationally seek the preferred indifferents, it is also the case that one can be a person of moral integrity, achieving eudaimonia, regardless of one’s material circumstances.

There is much more to be said about Stoic ethics, of course, but before closing this introductory sketch let me comment on an issue that does not fail to come up, and which I have already briefly mentioned above: the connection between the undeniably teleological-providential views of the cosmos advanced by Stoic physics and the actual practice of Stoic ethics. The issue is this: given that the Stoics themselves insisted that the study of physics (and of logic) influences how we understand ethics, and given that they believed in the providential nature of the cosmos, does that mean that only people who accept the latter view can pursue eudaimonia? The generally accepted answer is no.

Gregory Vlastos (referred to in Schofield 2003) convincingly argued that what he called the “theocratic” principle does affect one’s conception of the relation between virtue and the order of the cosmos, specifically because it tells us that being virtuous is in agreement with such order. Crucially, however, Vlastos maintains that this does not change the content of virtue, nor does it affect one’s conception of eudaimonia. This is so because although the “physics” (which, remember, is a combination of natural sciences and metaphysics, and hence theology) does inform the ethics, it does so in what modern philosophers would call an underdetermined fashion: while ethics is not independent of physics (or logic), in the Stoic system, it also cannot be read directly off it. Stoic ethics is naturalistic, and thus very modern in nature, but it—to put it in rather anachronistic terms—does not simplistically erase Hume’s is/ought divide.

Vlastos’ position finds plenty of textual support from a number of Stoic sources, perhaps no more obviously so than in Marcus, as already reported. There are, however, other passages in the classical Stoic literature that do not lend themselves to a clear cut position on the matter, such as this one from Epictetus: “What does it matter to me […] whether the universe is composed of atoms or uncompounded substances, or of fire and earth? Is it not sufficient to know the true nature of good and evil, and the proper bounds of our desires and aversions, and also of our impulses to act and not to act; and by making use of these as rules to order the affairs of our life, to bid those things that are beyond us farewell? It may very well be that these latter things are not to be comprehended by the human mind, and even if one assumes that they are perfectly comprehensible, well what profit comes from comprehending them? And ought we not to say that those men trouble in vain who assign all this as necessary to the philosopher’s system of thought? […] What Nature is, and how she administers the universe, and whether she really exists or not, these are questions about which there is no need to go on to bother ourselves” (Fragments 1). Please remember that for the Stoics “nature” was synonymous with “god.”

Indeed, it is because of this and other passages that Ferraiolo (2015), for instance, concludes that: “metaphysical doctrines about the nature and existence of God, and a rationally governed cosmos, are rather cleanly separable from Stoic practical counsel, and its conductivity to a well-lived, eudaimonistic life. Stoicism may have developed within a worldview infused with presuppositions of a divinely-ordered universe … but the efficacy of Stoic counsel is not dependent upon creation, design, or any form of intelligent cosmological guidance.”

On balance, it seems fair to say that the ancient Stoics did believe in a (physical) god that they equated with the rational principle organizing the cosmos, and which was distributed throughout the universe in a way that can be construed as pantheistic. While it is the case that they maintained that an understanding of the cosmos informs the understanding of ethics, construed as the study of how to live one’s life, it can also be reasonably argued that Stoic metaphysics underdetermined—on the Stoics’ own conception—their ethics, thus leaving room for a “God or Atoms” position that may have developed as a concession to the criticisms of the Epicureans, who were atomists.

4. Apatheia and the Stoic Treatment of Emotions

The naturalistic system of ethics developed by the Stoics bridges what would later be referred to as the is/ought gap by way of a sophisticated account of human developmental moral psychology (Brennan 2003). This section focuses on a related, major difference between Stoics and Epicureans, beginning with the respective use of two key terms indicating a desirable state of mind according to the two schools, and continuing with a broader discussion of the Stoic classification of emotions (or “passions”).

As we have seen, Epictetus explains in a number of places where the Stoa differs from the Garden (for example, “Against Epicurus,” Discourses I.23), while Seneca tells his friend Lucilius that he happily borrows from Epicurus when it makes sense, as it is his “custom to cross even into the other camp, not as a deserter but as a spy” (Letter II, A beneficial reading program, in the new translation by Graver and Long 2015).

Recall that the Stoics thought the pivotal thing in life is virtue and its cultivation, while the Epicureans thought that the point was to seek moderate pleasure and especially avoid pain. Nonetheless, both schools thought that a crucial component of eudaimonia (the flourishing life) was something very similar, which the Stoics referred to as apatheia and the Epicureans as ataraxia. There are, however, some differences between the two concepts, especially in the way the two schools taught how one could achieve, or at the least approximate, the respective states of mind.

The IEP article on Epictetus defines the two terms in the following fashion:

apatheia: freedom from passion, a constituent of the eudaimôn life

ataraxia: imperturbability, literally “without trouble,” sometimes translated as “tranquillity”; a state of mind that is a constituent of the eudaimôn life

So, both apatheia and ataraxia are components of the eudaimonic life, and indeed, while the second term is usually associated with the Epicureans, both schools used it.

As far as the Stoics are concerned, however, it is good to remember that “passion” did not mean what we now mean by that term, and indeed it did not even exactly overlap with the term “emotion” in the modern sense of the word. That is why it is grossly incorrect to say that the Stoics aimed at a passionless life, or at the suppression of emotions. Rather, the Stoics divided the “passions” into unhealthy and healthy ones. The first group included pain, fear, craving, and pleasure. The second included “discretion,” “willing,” and “delight.” The latter were the opposites of the members of the first group, except for pain, which does not have a positive counterpart. Here is a summary diagram:

[Figure: A diagram of the Stoic passions]

For the Stoics, then, the “passions” are not automatic, instinctive reactions that we cannot avoid experiencing. Instead, they are the result of a judgment, giving “assent” to an “impression.” So even when you read a familiar word like “fear,” don’t think of the fight-or-flight response that is indeed unavoidable when we are suddenly presented with a possible danger. What the Stoics meant by “fear” was what comes after that: your considered opinion about what caused said instinctive reaction. The Stoics realized that we have automatic responses that are not under our control, and that is why they focused on what is under our control: the judgment rendered on the likely causes of our instinctive reactions, a judgment rendered by what Marcus Aurelius called the ruling faculty (in modern cognitive science terminology: the executive function of the brain).

The Stoic view of emotions finds very nice parallels in modern neuroscience. For instance, Joseph LeDoux (2015) makes the important, if often neglected, point that there is a difference between what neuroscientists mean by “emotion” and what psychologists mean. Neuroscientifically, fear, for example, is the result of a defense and reaction mechanism that is involuntary and nonconscious, and whose major neural correlate is the amygdala. But what psychologists refer to when they talk of “fear” is a more complex emotion, constructed in part of the basic defense and reaction mechanism, to which the conscious mind adds cognitive interpretation, something very similar to the Stoic concept. The two meanings are not in contradiction, but are rather complementary. The cognitive interpretation of the raw emotion of fear, then, is brought about by a combination of one’s memories, cultural upbringing, deliberative thinking, and so forth. The Stoics clearly referred to the psychological, not the neuroscientific meaning of emotion as “passions,” and LeDoux’s own research seems to support the Stoic account and the practicability of their discipline of assent, seen in the previous section.

Going back to the above diagram: pain is not the simple sensation of pain, but the failure to avoid something that we mistakenly judge bad. Similarly for the other pathê: fear is the irrational expectation of something bad or harmful; craving is the irrational striving for something mistakenly judged as good; and pleasure is the irrational elation over something that is actually not worth choosing. Contrariwise, the eupatheiai are the result of a rational aversion to vice and harmful things (discretion), a rational desire for virtue (willing), and a rational elation over virtue (delight). (It should be clear now why there is no such thing as a rational emotional pain.)

All of the above is why apatheia is best construed as equanimity in the face of what the world throws at us: if we apply reason to our experience, we will not be concerned with the things that do not matter, and we will correspondingly rejoice in the things that do matter.

There is another crucial difference between the two schools to be highlighted here: they get to apatheia/ataraxia by very different routes. The Epicureans sought ataraxia as a goal, achieved most of all through the avoidance of pain, which meant especially to withdraw from social and political life. It was good, for Epicurus, to cultivate your close friendships, but attempting to play a full role in the polis was a sure way to experience pain (physical or mental), and therefore it was to be avoided. For the Stoics, on the contrary, the goal was the exercise of virtue, which led them to embrace their social role. Marcus Aurelius, for instance, constantly writes in the Meditations that we need to get up in the morning and do the job of a human being, which he interprets to mean to be useful to society. Hierocles elaborated on the Cynic/Stoic concept of cosmopolitanism. The motto of the school was “follow nature,” by which it was meant, as we have seen, the human nature of a social animal capable of rational judgment. And of course one of the four virtues examined in the previous section is justice, and one of the three disciplines is that of action—both explicitly prosocial. Apatheia, then, was not a goal for Stoics, but an advantageous byproduct (a preferred indifferent, so to speak) of living the virtuous life.

5. Stoicism after the Hellenistic Era

As Long (2003) has remarked, Stoicism has had a pervasive, yet largely unacknowledged influence on Western philosophical thought throughout the Middle Ages, Renaissance, and into modern times. Among the philosophers that he lists as being directly or indirectly affected by Stoicism are Augustine, Thomas More, Descartes, Spinoza, Leibniz, Rousseau, Adam Smith, and Kant, to which we can easily add David Hume. During the Renaissance both Stoic books, particularly Epictetus’ Enchiridion and Seneca’s Letters, and books favorable to Stoicism, like Cicero’s De Officiis, were widely read.

Christianity was far more sympathetic to Stoicism than to its main rival, Epicureanism (and it also absorbed elements of Platonism in its “neo” form). The Epicurean emphasis on pleasure and the school’s metaphysics of cosmic chaos were prima facie incompatible with Christian theology. The case of Stoicism was more complex. On the one hand, the Stoic insistence on materialism and pantheism was criticized and rejected; on the other hand, the idea of the Logos could easily be adapted (if in a fashion that the Stoics themselves would not have recognized), and the emphasis on virtue was often seen as pretty much the best that people could manage before the coming of Christ.

This is why we find an interestingly mixed record of Christian attitudes toward Stoicism. Augustine initially wrote favorably about it, while later on he was more critical. Tertullian was positively inclined toward Stoicism, and versions of the Enchiridion were commonly used (with Paul replacing Socrates) in monasteries. Peter Abelard and John of Salisbury were influenced by Stoic ethics too, while Thomas Aquinas was critical, especially of an early attempt at reviving Stoicism made by David of Dinant at the beginning of the 13th century.

A major revival of Stoicism did eventually take place, during the Renaissance, largely because of the work of Justus Lipsius (1547-1606). He was a humanist and classical philologist who published critical editions of Seneca and Tacitus. His major work was De Constantia (1584), where he argued that Christians can draw on the resources of Stoicism during troubled times, while at the same time carefully pointing out aspects of Stoicism that are unacceptable for a Christian. Lipsius also drew on Epictetus, whose Enchiridion had first been translated into English a few years earlier. Other Neostoics included the French statesman Guillaume Du Vair, the churchman Pierre Charron, the Spanish author Francisco de Quevedo, and most importantly Michel de Montaigne, who wrote one of his essays in defense of Seneca.

The reception of Neostoicism was mixed. Even before Lipsius, Calvin had strongly criticized the “novi Stoici” for their revival of the idea of apatheia, and later critics included Pascal. In part in order to preempt such reactions, according to Sellars, one of the Neostoic texts began with the following cautious endorsement: “philosophie in generall is profitable unto a Christian man, if it be well and rightly used: but no kinde of philosophie is more profitable and neerer approaching unto Christianitie than the philosophie of the Stoicks.” Despite the interest in Stoicism displayed by other Renaissance figures, even outside of philosophy (for example, the poet Petrarch), Neostoicism never really became a movement, and its import largely rests on the impact of Lipsius’ writings, and perhaps on the influence of Montaigne.

Arguably the most important modern philosopher to be influenced by Stoicism is Spinoza, who was in fact accused by Leibniz of being a leader of the “sect” of the new Stoics, together with Descartes (Long 2003). There are indeed a number of striking similarities between the Stoic conception of the world and Spinoza’s. In both cases we have an all-pervasive God that is identified with Nature and with universal cause and effect. While it is true that the Stoic understanding of the cosmos was essentially dualistic (in contrast with Spinoza’s monism), the Stoic “active” and “passive” principles were nonetheless completely entwined, ultimately yielding an essentially unitary reality. Long points out, however, that a major difference was Spinoza’s concept of God’s infinite attributes and extension, in marked contrast to the finite (if eternal) God of the Stoics: “the upshot of both systems is a broadly similar conception of reality—monistic in its treatment of God as the ultimate cause of everything, dualistic in its two aspects of thought and extension, hierarchical in the different levels or modes of God’s attributes in particular beings, strictly determinist and physically active through and through.” He goes on to remark that the similarities are even more marked in terms of ethics, and “Spinoza’s ethics becomes transparently and profoundly Stoic.” That said, another major difference is that Spinoza did not believe in an underlying teleology to the world. For him Nature has no aim and God does not direct the cosmic drama. Indeed, as Long puts it: “If the Stoics had taken Spinoza’s route of denying divine providence, they would have avoided a battery of objections brought against them from antiquity onward.” In an important sense, perhaps, one can think of Spinoza as updating the Stoic system to modern times, a project that is currently seeing a number of concerted efforts.

Finally, there is also a connection between the Stoics and Kant, particularly in their shared concept of duty, which transcends the specific consequences of one’s action. But as Long again points out, the differences are also quite striking: while Kant arrived at his system by a priori reasoning, the Stoics were eminently naturalistic and empiricist at heart. This is a major distinction between a deontological system like Kant’s and a eudaemonistic one like the Stoic, and it is only with the recent resurgence of virtue ethics in contemporary philosophy (Foot 1978, 2001; MacIntyre 1981/2013; Nussbaum 1994) that the ground was laid for yet another revival of Stoicism as a practical moral philosophy.

6. Contemporary Stoicism

The 21st century is seeing yet another revival of virtue ethics in general and of Stoicism in particular. The already mentioned work by philosophers like Philippa Foot, Alasdair MacIntyre, and Martha Nussbaum, among others, has brought back virtue ethics as a viable alternative to the dominant Kantian-deontological and utilitarian-consequentialist approaches, so much so that a survey of professional philosophers by David Bourget and David Chalmers (2013) shows that deontology is (barely) the leading endorsed framework (26% of respondents), followed by consequentialism (24%) and not too far behind by virtue ethics (18%), with a scatter of other positions gathering less support. Of course ethics is not a popularity contest, but these numbers indicate the resurgence of virtue ethics in contemporary professional moral philosophy.

When it comes more specifically to Stoicism, new scholarly works and translations of classics, as well as biographies of prominent Stoics, keep appearing at a sustained rate. Examples include the superb Cambridge Companion to the Stoics (Inwood 2003), individual chapters of which have been cited throughout this entry; an essay on the concept of Stoic sagehood (Brouwer 2014); a volume on Epictetus (Long 2002); a contribution on Stoicism and emotion (Graver 2007); the first new translation of Seneca’s letters to Lucilius in a century (Graver and Long 2015); a new translation of Musonius Rufus (King 2011); a biography of Cato the Younger (Goodman 2012); one of Marcus Aurelius (McLynn 2009); and two of Seneca (Romm 2014 and Wilson 2014); and the list could continue.

In parallel with the above, Stoicism is, in some sense, returning to its roots as practical philosophy, as the ancient Stoics very clearly meant their system to be primarily of guidance for everyday life, not a theoretical exercise. Indeed, Epictetus in particular is very clear in his disdain for purely theoretical philosophy: “We know how to analyze arguments, and have the skill a person needs to evaluate competent logicians. But in life what do I do? What today I say is good tomorrow I will swear is bad. And the reason is that, compared to what I know about syllogisms, my knowledge and experience of life fall far behind” (Discourses, II.3.4-5). Or consider Marcus’ famous injunction: “No longer talk at all about the kind of man that a good man ought to be, but be such” (Meditations, X.16).

The Modern Stoicism movement traces its roots to Viktor Frankl’s logotherapy (Sahakian 1979), as well as to early versions of Cognitive Behavioral Therapy, for instance in the work of Albert Ellis (Robertson 2010). But Stoicism is a philosophy, not a therapy, and it is in the works of philosophers such as William Irvine (2008), John Sellars (2003), and Lawrence Becker (1997) that we find articulations of 21st century Stoicism, though the more self-help oriented contribution by CBT therapist Donald Robertson (2013) is also worthy of note. All of these authors attempt to distance the philosophical meaning of “Stoic” (even in a modern setting) from the common English word “stoic,” indicating someone who goes through life with a stiff upper lip, so to speak. While there are commonalities between “Stoic” and “stoic,” for instance the emphasis on endurance, the latter is a diminished version of the former, and the two should accordingly be kept distinct.

Perhaps the most comprehensive and scholarly attempt to update (as opposed to simply explain) Stoicism for modern audiences comes from Becker (1997), though a more accessible treatment is offered by Irvine (2008). One of Irvine’s major contributions is shifting from Epictetus’ famous dichotomy of control to a more reasonable trichotomy: some things are up to us (chiefly, our judgments and actions), some things are not up to us (major historical events, natural phenomena), but on a number of other things we have partial control. Irvine recasts the third category in terms of internalized goals, which makes more sense of the original dichotomy. Consider his example of playing a tennis match. The outcome of the game is under your partial control, in the sense that you can influence it; but it is also the result of variables that you cannot control, such as the skill of your opponent, the fairness of the referee, or even random gusts of wind interfering with the trajectory of the ball. Your goal, then, suggests Irvine, should not be to win the game—because that is not entirely within your control. Rather, it should be to play the best game you can, since that is within your control. By internalizing your goals you can therefore make good sense of even the original Epictetean dichotomy. As for the outcome, it should be accepted with equanimity.

Becker (1997) is more comprehensive and even includes a lengthy appendix in which he demonstrates that the formal calculus he deploys for his normative Stoic logic is consistent, suggesting also that it is complete. There are three important differences between his New Stoicism and the ancient variety: (i) Becker defends an interpretation of the inherent primacy of virtue in terms of maximization of one’s agency, and builds an argument to show that this is, indeed, the preferred goal of agents that are relevantly constituted like a normal human being; (ii) he interprets the Stoic dictum “follow nature” as “follow the facts” (that is, abide by whatever picture of the universe our best science allows), consistently with Stoic sources attesting to their respect for what we would today call scientific inquiry, as well as with an updated Stoic approach to epistemology; and (iii) Becker does away with the ancient Stoic teleonomic view of the cosmos, precisely because it is no longer supported by our best scientific understanding of things. This is also what leads him to make his argument for virtue-as-maximization-of-agency referred to in (i) above. Whether Becker’s (or Irvine’s, or anyone else’s) attempt will succeed or not remains to be seen in terms of further scholarship and the evolution of the popular movement.

That movement has grown significantly in the early 21st century, manifesting itself in a number of forms. There are a good number of high-quality blogs devoted to practical modern Stoicism. There is also a significant presence on social networks, for instance the Stoicism Group on Facebook.

7. Glossary

The Stoics were well known (some would say infamous) for having developed a rich technical vocabulary. Cicero, in book III of De Finibus, explicitly says that Zeno invented a number of new terms, and he feels that Latin is not a sufficiently sophisticated tongue to render all the subtleties of Greek thought. Below are some of the major Stoic terms and their meanings.

Andreia = courage, fortitude, one of the four Stoic cardinal virtues.

Apatheia = tranquility, overcoming disturbing desires and emotions.

Apoproēgmena = dispreferred indifferents, externals outside of virtue that, other things being equal, should be avoided.

Aretê = virtue, excellence at one’s function. For Becker this is equivalent to the perfection of agency.

Ataraxia = absence of fear, largely an Epicurean concept, but also adopted by the Stoics.

Dikaiosynê = justice, integrity, one of the Stoic cardinal virtues.

Eudaimonia = flourishing, by means of living an ethical life.

Eupatheiai = the healthy passions cultivated by the Sage.

Hormê = the discipline of action.

Kathēkon = appropriate, rational action, the thing one ought to do.

Logos = rational principle governing the universe.

Oikeiôsis = something properly yours, leading to Hierocles’ circle of expanding affection, Stoic cosmopolitanism.

Orexis = the discipline of desire.

Philanthrôpia = love of mankind, related to the concept of Oikeiôsis.

Phronêsis = practical wisdom, one of the Stoic cardinal virtues.

Proēgmena = preferred indifferents, externals outside of virtue that, other things being equal, can be pursued unless they compromise one’s virtue.

Propatheiai = involuntary emotional reactions, to which one has not yet given or withdrawn assent.

Prosochê = applying key ethical precepts to the present moment, mindfulness.

Sôphrosynê = self-discipline, temperance, one of the Stoic cardinal virtues.

Sunkatathesis = the discipline of assent.

 

8. References and Further Readings

  • Algra, K. (2003) Stoic theology. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Almeder, R. (1995) Externalism and justification. Philosophia 24:465-469.
  • Angere, S. (2007) The defeasible nature of coherentist justification. Synthese 157:321-335.
  • Beaney, M. (1997) The Frege Reader. Blackwell.
  • Becker, L.C. (1997) A New Stoicism. Princeton University Press.
  • Bobzien, S. (2003) Logic. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Bobzien, S. (2006) Ancient logic. Stanford Encyclopedia of Philosophy, http://plato.stanford.edu/entries/logic-ancient/ (accessed on 22 December 2015)
  • Bourget, D. and Chalmers, D.J. (2013) What do philosophers believe? Philosophical Studies 3:1-36.
  • Brennan, T. (2003) Stoic moral psychology. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Broadie, S. & Rowe, C. (eds.) (2002) Aristotle — Nicomachean Ethics. Oxford University Press.
  • Brouwer, R. (2014) The Stoic Sage: The Early Stoics on Wisdom, Sagehood and Socrates. Cambridge University Press.
  • Brunschwig, J. (2003) Stoic metaphysics. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Bugh, G.R. (1992) Athenion and Aristion of Athens. Phoenix 46:108-123.
  • Cicero, M.T. (2014) Complete Works. Delphi Classics.
  • Colish, M. (1985) The Stoic Tradition from Antiquity to the Early Middle Ages. E.J. Brill.
  • Diogenes Laertius (trans. by R.D. Hicks) (2015) Lives of the Eminent Philosophers. Delphi Classics.
  • Epictetus (trans. by R. Dobbin) (2008) Discourses and Selected Writings. Penguin.
  • Ferraiolo, W. (2015) God or Atoms: Stoic Counsel With or Without Zeus. International Journal of Applied Philosophy 29:199-205.
  • Foot, P. (1978) Virtues and Vices. Blackwell.
  • Foot, P. (2001) Natural Goodness. Clarendon Press.
  • Frede, M. (1983) Stoics and skeptics on clear and distinct impressions. In: M.F. Burnyeat (ed.) The Skeptical Tradition. University of California Press, pp. 65-93.
  • Frede, D. (2003) Stoic determinism. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Gill, C. (2003) The School in the Roman Imperial period. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Goldman, A. (1980) The internalist conception of justification. Midwest Studies in Philosophy 5:27-52.
  • Goldman, A. (1994) Naturalistic epistemology and reliabilism. Midwest Studies in Philosophy 19:301-320.
  • Goodman, R. (2012) Rome’s Last Citizen: The Life and Legacy of Cato, Mortal Enemy of Caesar. Thomas Dunne.
  • Goulet, R. (2013) Ancient philosophers: a first statistical survey. In: M. Chase, R.L. Clark, and M. McGhee (eds.) Philosophy as a Way of Life: Ancients and Moderns — Essays in Honor of Pierre Hadot. John Wiley & Sons.
  • Graver, M. (2007) Stoicism and Emotion. University of Chicago Press.
  • Graver, M. and Long, A.A. (translators) (2015) Letters on Ethics: To Lucilius. University of Chicago Press.
  • Griffith, M. (2013) Free Will: The Basics. Routledge.
  • Hadot, P. (1998) The Inner Citadel: the Meditations of Marcus Aurelius. Trans. by M. Chase, Cambridge University Press.
  • Hankinson, R.J. (2003) Stoic epistemology. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Inwood, B. (ed.) (2003) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Irvine, W.B. (2008) A Guide to the Good Life: The Ancient Art of Stoic Joy. Oxford University Press.
  • King, C. (2011) Musonius Rufus: Lectures and Sayings. CreateSpace.
  • Ladyman, J. and Ross, D. (2009) Every Thing Must Go: Metaphysics Naturalized. Oxford University Press.
  • LeDoux, J. (2015) Anxious: Using the Brain to Understand and Treat Fear and Anxiety. Viking.
  • Lipton, P. (2003) Inference to the Best Explanation. Routledge.
  • Long, A.A. (2002) Epictetus: A Stoic and Socratic Guide to Life. Oxford University Press.
  • Long, A.A. (2003) Stoicism in the philosophical tradition: Spinoza, Lipsius, Butler. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • MacIntyre, A. (1981/2013) After Virtue. A&C Black.
  • Marcus Aurelius (trans. by G. Long) (1997) Meditations. Dover.
  • McBrayer, G.A., Nichols, M.P., and Schaeffer, D. (2010) Euthydemus. Focus.
  • McLynn, F. (2009) Marcus Aurelius: A Life. Da Capo Press.
  • Nussbaum, M. (1994) The Therapy of Desire: Theory and Practice in Hellenistic Ethics. Princeton University Press.
  • O’Connor, D.J. (1975) The Correspondence Theory of Truth. Hutchinson.
  • Osler, M.J. (1991) Atoms, pneuma and tranquillity: Epicurean and Stoic Themes in European Thought. Cambridge University Press.
  • Putnam, H., Neiman, S., and Schloss, J.P. (eds.) (2014) Understanding Moral Sentiments: Darwinian Perspectives? Transaction Publishers.
  • Robertson, D. (2010) The Philosophy of Cognitive-behavioural Therapy (CBT): Stoic Philosophy as Rational and Cognitive Psychotherapy. Karnac Books.
  • Robertson, D. (2013) Stoicism and the Art of Happiness – Ancient Tips For Modern Challenges: Teach Yourself. Teach Yourself.
  • Romm, J. (2014) Dying Every Day: Seneca at the Court of Nero. Knopf.
  • Sahakian, W.S. (1979) Logotherapy’s Place in Philosophy. In: Logotherapy in Action. J. Fabry, R. Bulka, and W.S. Sahakian (eds.), foreword by Viktor Frankl. Jason Aronson.
  • Schofield, M. (2003) Stoic ethics. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Sedley, D. (2003) The School, from Zeno to Arius Didymus. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Sellars, J. (2003) The Art of Living: The Stoics on the Nature and Function of Philosophy. Ashgate.
  • Seneca, L.A. (2014) Complete Works of Seneca the Younger. Delphi Classics.
  • Sharpe, M. (2014) Stoic virtue ethics. In: S. van Hooft and N. Athanassoulis (eds.), The Handbook of Virtue Ethics. Acumen Publishing.
  • Taran, L. (1971) The creation myth in Plato’s Timaeus. In: J.P. Anton, G.L. Kustas and A. Preus (eds.) Essays in Ancient Greek Philosophy. State University of New York Press, pp. 372-407.
  • Tieleman, T. (2002) Galen on the seat of the intellect: anatomical experiment and philosophical tradition. In: C. Tuplin (ed.) Science and Mathematics in Ancient Greek Culture. Oxford University Press, pp. 256-273.
  • Tye, M. (1994) Sorites paradoxes and the semantics of vagueness. Philosophical Perspectives 8:189-206.
  • Unger, R.M. and Smolin, L. (2014) The Singular Universe and the Reality of Time: A Proposal in Natural Philosophy. Cambridge University Press.
  • Verbeke, G. (1983) The Presence of Stoicism in Medieval Thought. Catholic University of America Press.
  • White, M.J. (2003) Stoic natural philosophy. In: B. Inwood (ed.) The Cambridge Companion to the Stoics. Cambridge University Press.
  • Wilson, E. (2014) The Greatest Empire: A Life of Seneca. Oxford University Press.

 

Author Information

Massimo Pigliucci
Email: mpigliucci@ccny.cuny.edu
City University of New York
U. S. A.

Aesthetics in Continental Philosophy

Although aesthetics is a significant area of research in its own right in the analytic philosophical tradition, it frequently seems to be accorded less value than philosophy of language, logic, epistemology, metaphysics, and other areas of value theory such as ethics and political philosophy. Many of the most prominent analytic philosophers have not written on aesthetics at all. Matters stand very differently in continental philosophy, where aesthetics has been given an important place by nearly every major thinker and tradition. There are undoubtedly important extra-philosophical reasons for this (such as the importance of art in European education and tradition and the French model of the philosophe as philosopher-writer), but there are also clearly philosophical reasons. In the analytic tradition, meaning and truth are frequently thought to be exemplified by logic, science, and the formal structures of language, whereas in continental philosophy, art has often taken this role of exemplifying meaning and truth. As such, art becomes akin to a philosophical activity insofar as it is thought to produce meaning and truth, and aesthetics takes an important place because it is seen as a branch of philosophy which gives access to some of philosophy’s perennially central concerns. Moreover, while the analytic tradition tends to abstract aesthetic questions from other concerns, the continental tradition tends to think about the role of aesthetics in relation to epistemology and metaphysics, to emphasise art’s historical and social situatedness, and to ask questions concerning its role and value in culture, politics, and everyday life. However, and in further contrast to analytic aesthetics, there is no general consensus concerning central topics of debate in continental aesthetics. Instead, and following a method of organisation typical of continental philosophy, this area of aesthetics may be approached according to major traditions and thinkers.
This article gives a synoptic overview of these traditions and thinkers in the twentieth and twenty-first centuries. The ideas developed by each often remain highly distinctive, yet they have also influenced and reacted against each other (and these points of contact are marked within the article). Most of these developments have taken place in critical relation with modern and nineteenth-century aesthetics, especially as exemplified by the works of Immanuel Kant, G.W.F. Hegel, and Friedrich Nietzsche. Kant’s Critique of the Power of Judgement (1790) has been particularly important in shaping debates in later continental aesthetics, since on the one hand it stakes out aesthetics as a domain autonomous in relation to other areas of philosophical concern, such as epistemology and practical philosophy, and on the other it shows how this domain has relevance for other areas. (In Kant’s system, aesthetics provides a model for how judgement acts as a power that can unify the other branches of philosophical interest.)

Table of Contents

  1. The Position of Aesthetics in Continental Philosophy
  2. Phenomenology and Existentialism
    1. Introduction
    2. Heidegger
    3. Merleau-Ponty
  3. Hermeneutics
    1. Introduction
    2. Gadamer
  4. Psychoanalysis
    1. Introduction
    2. Freud
    3. Lacan
  5. Critical Theory
    1. Introduction
    2. Benjamin
    3. Adorno
  6. Poststructuralism
    1. Introduction
    2. Derrida
    3. Lyotard
    4. Deleuze
  7. Developments in the Early 21st Century
    1. Introduction
    2. Rancière
  8. References and Further Reading

1. The Position of Aesthetics in Continental Philosophy

The importance and scope of aesthetics in continental philosophy may be indicated at the outset by taking the relatively ‘canonical’ example of Heidegger’s reading of Nietzsche on art. While the specific views about art and aesthetics expressed in this reading do not extend their influence to all traditions and thinkers within continental philosophy, the example gives a good indication of the dominant role aesthetics frequently takes in such traditions. During his first lecture course on Nietzsche, ‘The Will to Power as Art’, Heidegger sets out five statements on art:

  1. Art is the most perspicuous and familiar configuration of will to power;
  2. Art must be grasped in terms of the artist;
  3. According to the expanded concept of artist, art is the basic occurrence of all beings; to the extent that they are, beings are self-creating, created;
  4. Art is the distinctive countermovement to nihilism;
  5. Art is worth more than ‘the truth’.

In addition to the above, and taking preeminent place as an expression of Nietzsche’s entire thinking on art, Heidegger adds the following major statement on art: art is the greatest stimulant of life.

These theses indicate that for (Heidegger’s) Nietzsche, art is far more than a pleasant diversion; it has profound ontological, cultural, political, and existential significance, and is even worth more than truth itself. Heidegger expands these theses as follows. Nietzsche’s ontology is that of the ‘will to power’, in which Being as a whole is understood in terms of shifting relations of confluent and conflictual forces, producing the creation and destruction of particular beings. Art is privileged both as an expression of will to power, and as a being which gives us special insight into the nature of Being as a whole as will to power.  The first statement suggests that of all types of being, art is that which is most clearly accessible to us in its essence. Moreover, art does not simply illuminate itself as a particular type of being, but illuminates Being as a whole. The second statement shifts the significance of art from its reception to its creation, and this shift opens up the ontological scope of aesthetics. Art, considered from the perspective of the artist, is then understood in terms of the creative act itself, which illuminates the way that beings in general are ‘brought forth’. Aesthetics, as meditation on art, may then be understood not simply as a consideration of beautiful things, but as ontology, the thinking of the essence of Being as a whole. The third statement tells us that, according to Nietzsche’s ontology, the will to power becomes visible as and with art, understood as paradigmatic for all creation, or productive ‘bringing forth’.

The fourth and fifth statements give art a practical dimension in Nietzsche’s philosophy. Heidegger insists that ‘truth’ in the fifth statement (and in all of Nietzsche’s philosophy) must be understood in a specifically Platonic sense as referring to the supposedly true supersensuous world of the Ideas, in contrast with the untrue sensuous world of mere appearances. For Nietzsche, the old values he associates with nihilism—the decadence of culture and the devaluation of life—are essentially grounded in this Platonic conception of truth through its dominance in the religion, morality, and philosophy of the Western tradition. Art then operates as a countermovement to nihilism on two essential points: first, as a sensuous thing, its very nature is to affirm the value of the sensuous world that nihilism denies; and second, as a paragon expression of will to power it helps us to understand what Nietzsche posits as the necessary grounding principle for the creation of new, non-nihilistic values (the will to power itself). Heidegger develops this reading of Nietzsche’s philosophy of art in order to critique it (see section 2. b. below). Nevertheless, in its clear emphasis on the ontological and practical roles of art, this reading indicates well the significance and scope that aesthetics (understood as philosophical reflection on art in general) has had for twentieth and twenty-first century continental philosophers.

2. Phenomenology and Existentialism

 a. Introduction

Phenomenology is a philosophical method which focuses on the close examination of phenomena, that which appears. In its contemporary sense, it was founded by Edmund Husserl in the early part of the twentieth century. Many philosophers influenced by Husserl’s work developed phenomenology in ways which contributed significantly to aesthetics, including Martin Heidegger, Roman Ingarden, Jean-Paul Sartre, Mikel Dufrenne, Maurice Merleau-Ponty, and Michel Henry. While developed in varying ways by each of these thinkers, in general phenomenology has offered an approach which displaces the traditional aesthetic categories of subject and object. Instead, phenomenology has focused on the examination of aesthetic experience and the work of art in terms of appearance and the conditions of appearance, thought as prior to the categorisation of ourselves and the world into subject and object. Moreover, phenomenological aesthetics has examined the ontology of the work of art as a special kind of thing which appears, with a distinctive character marking it out from the rest of appearances. Art has frequently been given a privileged status as affording us special insight into the way in which things in general come to appear, and how meaning as such is constituted. In this way, for many phenomenologists aesthetics has been closely connected with ontology, epistemology, and value theory.

Existentialism, while stemming from nineteenth century thinkers such as Søren Kierkegaard and Nietzsche, intersects with phenomenology in the thought of many of its prominent twentieth century exponents (for example, Heidegger, Sartre, de Beauvoir, and Merleau-Ponty). Existentialism focuses on the concrete existence of human life. It rejects the adequacy of traditional philosophical thought, which proceeds by way of abstract essences and general categories, to do justice to the lived experience of the individual. For existentialists, art—and, in particular, literature (many existentialist philosophers were also literary authors)—has an advantage over philosophy insofar as it is able to dramatise concrete experience through singular imaginary examples and ‘indirectly communicate’ existential truths. Art is also better able to evoke the irrational—such as sensation, affect, feeling, mood, and everyday, non-theoretical modes of thinking—which existentialists believe is necessary to do justice to the full range of human experience. Existentialism has typically emphasised human freedom, especially the freedom to create values, and art has been taken as a testament to, and an exemplary model for, such creative activity. I will develop some of these themes in further detail here by focusing on two of the most well-known and influential contributors to aesthetics in the tradition of existential phenomenology: Heidegger and Merleau-Ponty.

 b. Heidegger

Edmund Husserl’s most prominent student, Martin Heidegger, combined the method of phenomenology with a deep attention to the history of philosophy, especially that of ancient Greece, to forge one of the most influential philosophical bodies of work of the twentieth century. Heidegger turns his attention to art and aesthetics as part of his wider philosophical project, which seeks to uncover the meaning and truth of Being. Heidegger contends that the history of philosophy, and of western culture generally, has seen a decline with respect to Being, such that today Being has practically become nothing. (This is how he interprets Nietzsche’s thesis of nihilism, the negation of meaning and value as such.) Heidegger understands Being as the way particular beings (or ‘entities’) come to appear as what they are, with the meaning that they have, within particular historical epochs. For Heidegger, Being is a historical process, such that beings appear differently in different epochs. He specifies three main epochs, the ancient, medieval, and modern, to each of which corresponds a leading meaning of Being; that is, a main way in which beings are revealed. Heidegger’s reflections on art and aesthetics appear within his critique of the modern epoch and his attempts to retrieve a deeper understanding of Being from the near-oblivion into which he believes it has fallen.

Heidegger specifies that ‘aesthetics’ is itself a part of this modern tradition: a particular way of viewing art which first became explicit with Alexander Baumgarten’s Aesthetica (1750), then was quickly taken up in successive influential formulations, in particular by Kant (Critique of the Power of Judgement, 1790) and Hegel (Lectures on Fine Art, c. 1818–29). For Heidegger, the distinctive feature of the modern way of revealing beings is to give them the character of subject and object, as per Descartes’ philosophy. The aesthetic view of art follows suit by positioning the artwork as an object, ‘experienced’ by the one who takes in and appreciates it, positioned as subject. Heidegger wants to critique aesthetics as one aspect of modern philosophy, which he critiques in general, in order to allow a different view of art to emerge. According to Heidegger, the modern worldview covers over a more primordial relation of being-in-the-world in which we are immersed in and alongside other beings in the world. Heidegger’s own form of phenomenological philosophy aims to be attentive to beings as they appear in order to overcome the modern presupposition of thinking, which distributes things according to the subject/object divide before we truly encounter them, and to reveal the more primordial character of things which such presuppositions hide. For Heidegger, the modern tendency is particularly pernicious because it views beings according to a framework he calls Gestell, which determines them as resources available to be put to use (Bestand). Such a view distracts us from being attentive to the many other ways beings can be revealed and impoverishes the meaningfulness of the world we inhabit by reducing everything—including human beings themselves—to this narrow scope.

‘The Origin of the Work of Art’ is the most significant and well-known of Heidegger’s essays in which he tries to draw attention to the enigma of art and to sketch an alternative ontology of the artwork. Here, the artwork is described as the setting-to-work of truth. Truth must be understood not in its usual meaning as thought corresponding to facts in the world, but in Heidegger’s specific determination as aletheia, meaning disclosure, uncovering, or revealing. For Heidegger, truth means the way in which beings come to light as what they are and with the meaning they have. Art for him, therefore, has a privileged relation to truth because great art can be an occurrence of truth; that is, it has the power to reveal not just itself, but other beings in a particular way. He expresses this by saying that art ‘sets up a world’ and ‘sets forth the earth’, and that the artwork enacts the ‘strife’ which is the turbulent relation between world and earth. As in much of Heidegger’s writings, these terms are more suggestive than determinate, and remain open to contesting interpretations. They may, however, be roughly glossed as follows. ‘World’ is the network or system of interpretations in which beings appear as what they are and form an open but interrelated whole. The world, in short, is the set of shared meanings of a historical culture. It is the meaning of the artwork as interpreted in that culture, and the effect on cultural meaning in general that the work has. ‘Earth’, on the other hand, refers to the material and sensory thing that the work necessarily must be, and to its capacity to hold dimensions of potential meaning currently concealed but which might, in the future, come to be revealed. The earth is the inexhaustibility of the work, its irreducibility to interpretation by a critic or a culture.
For Heidegger, every artwork, at least if it is ‘great’, contains both these dimensions, which exist in a state of tension or strife: world opening up to reveal meaning, earth drawing back to keep meaning opaque. While these terms themselves may seem opaque, arguably they do successfully describe something of the mysterious character of artworks in their ability to simultaneously reward and thwart our desire to understand them. Heidegger also asserts in this essay that the essence of all art is poetry. ‘Poetry’ needs to be understood here in a specific sense, not simply as the pleasing combination of words, but insofar as Heidegger understands poetry as the essence of language, which again has a significant relation to truth. ‘Poetry’ in this sense designates the capacity of language to reveal beings and determine them as what they are in their specific character by naming them, and he asserts here that poetry has a privileged place in the system of the arts by virtue of being that art which most exemplifies the ontological power of all art to reveal. In sum, Heidegger’s phenomenological approach to art aims to subvert the aesthetic tradition’s determination of art as the object of aesthetic experience, and to uncover a deeper meaning in which art may be understood as a site of ontological revealing. This is consonant with his wider project of overcoming the metaphysical tradition, and modern philosophy more specifically, in order to rethink the meaning and truth of Being.

 c. Merleau-Ponty

Albeit in a different way, Merleau-Ponty also seeks to understand art in the context of an ontology which moves beyond the subject/object division characteristic of modern philosophy. This is most evident in his late essay ‘Eye and Mind’ (1960), while the earlier essay ‘Cézanne’s Doubt’ (1945) presents a more existentialist perspective. For Merleau-Ponty, it is as if painting were phenomenological research pursued by other means. In fact—and this contra Sartre—he holds art in a privileged position over philosophy and literature, and accords to painting a higher position at least than music, which for him remains too amorphous to properly render phenomenological insight. The most distinctive feature of Merleau-Ponty’s own brand of phenomenology is the emphasis he puts on the embodied nature of human existence only noted in passing by Heidegger. The body takes on primary importance for Merleau-Ponty as he sees it as the fundamental condition for the appearing of a world. Such appearing takes place through the body’s perceptual system. For Merleau-Ponty, science and philosophy have distorted both our idea of the body and the way perception works by freezing them in idealised third-person representations. The aim of his phenomenological ontology is to return us to a more primordial understanding of our first-person experience of the body as it is lived by us, and of perception as it reveals the world. He gives painting a privilege over science and philosophy because he sees the painter as performing a kind of natural epoche, the phenomenological ‘reduction’ which suspends our commonsense beliefs about things, while he or she tries to see how things genuinely appear to vision and to render what they see on canvas. Moreover, Merleau-Ponty emphasises the necessarily bodily activity of painting, claiming that it would be impossible for a pure mind to paint. 
This is because painting revels in the bodily conditions that abstract thought tends to idealise and cover, such as the position of the body in space from which any sight is necessarily seen (the viewpoint), the movement of the eye, prevailing lighting conditions affecting the quality of visual perception, and so on.

As well as science and philosophy in general, Merleau-Ponty’s critical targets are Descartes’ theory of vision in his Dioptrics (1637), and the ‘linear perspective’ method of constructing the space of painting developed in the Italian Renaissance, which employs precise geometric rules for creating the illusion of three-dimensional space on the two-dimensional surface of the painting. Neither is condemned outright by Merleau-Ponty. Rather, what he criticises in both is a partial truth which abstracts certain elements from our perceptual experience, and elevates them to the pretension of sole and exhaustive truth. Descartes’ theory of vision treats space as a homogeneous expanse equally accessible at all points; an abstraction useful for thought, but impossible for any actual body to experience. Linear perspective, for its part, has for centuries been treated as the formula of realism, and has been supposed to present things as they actually appear to the eye. Merleau-Ponty contests this, citing various abstractions and exclusions which operate on real perception in order to construct linear perspective. Two distinctive features on which he concentrates are depth and movement.

First, perspective presents the illusion of depth by varying the sizes of objects relative to ‘parallel’ lines which converge at a vanishing point. Because this method was presented as rendering the true nature of visual space, the theoreticians of the Renaissance had to deny the theorem of Euclid’s Geometry which states that parallel lines never converge. Second, Merleau-Ponty notes that static art such as photography, painting, and sculpture, no matter how supposedly realistic, falsifies reality by excluding time, and hence, motion. Following a suggestion made by Auguste Rodin, he asserts that the phenomenology of movement is best expressed by a paradoxical arrangement in which different aspects of the figure in motion, which would be visible at different times in real life, are presented simultaneously in the artwork. According to his analysis, the truth of movement is better expressed by (for example) Théodore Géricault’s anatomically incorrect painting of racing horses Epsom Derby (1821) than by the gaits of horses photographically captured by Étienne-Jules Marey. What the painter is able to capture, Merleau-Ponty asserts, is not the outside of the object of motion, but motion’s ‘secret cipher’: time rendered visible in an indirect, stylistic manner.

In general for Merleau-Ponty, it is a focus on the primary qualities (those which can be specified with exactness: quantified and rationally calculated, such as extension and form) which is responsible for the intellectual abstractions that distort our understanding of the body’s perception. The supposedly secondary qualities, especially colour, are what reveal more primordial truths about perception in painting.  For him, painting is capable of rendering visible the birth of perception, the way that fully-formed, recognisable objects emerge from a deeper, more primordial, inchoate visual field. This is evident in Paul Cézanne’s painting in the way that rather than beginning with lines which give form to objects and then adding colour, the reverse procedure is evident—dashes of graduated colour build up and give shape to forms. In this way, Merleau-Ponty sees Cézanne and artists in general as performing phenomenological work: they are able to reveal the conditions and processes of perception, which are usually covered over by our focus on the end product, or what is perceived.

Merleau-Ponty’s essay ‘Cézanne’s Doubt’ displays aspects of an existentialist aesthetic by giving attention to the relationship between the life of the artist and the meaning of the artist’s work. This takes place in conversation with Sigmund Freud’s psychoanalytic treatment of Leonardo da Vinci (see section 4. b. below), which is often taken to excessively reduce the meaning of the work to the artist’s psychopathology. Echoing positions worked out in The Phenomenology of Perception (1945) in response to Sartre’s views on radical human freedom, Merleau-Ponty develops a nuanced view of the relationship between freedom and concrete situation in human existence. He asserts that the meaning of the work takes its bearings from the artist’s life but cannot be wholly reduced to it, just as the free transcendence of the subject must take its bearings from the facts of the life which it is born into but to which it is not reducible. For Merleau-Ponty, the individual’s freedom is necessarily determined in relation to facts over which we have no choice (for example, Leonardo’s being abandoned by his father in early childhood), but how we respond to such facts retains an element of choice. At the same time, the artwork produced by an individual will be informed by facts about the artist’s life, yet the meaning of such works will nevertheless defy any attempt to reduce it to such facts. Thus, Merleau-Ponty develops a qualified defence of the psycho-biographical approach to art characteristic of psychoanalytical criticism. According to him, the doubt which plagued Cézanne’s life and work stems from a general feature of situated human existence (albeit one which he felt more keenly than most): our freedom to create meaning is attended by no guarantee that it will become meaningful in the fullest sense of which it is capable; that is, to be accepted in the eyes of others, to transform the world that others inhabit, and to contribute to the store of human culture. 
In these various ways, then, Merleau-Ponty’s reading of Cézanne exemplifies the concerns of an existentialist aesthetic: to see the meaning of the artwork in relation to the life of the individual artist and to see art itself as exemplifying general features of human existence.

Merleau-Ponty’s essay ‘Eye and Mind’ develops many of the themes above in the context of his late ontology of ‘the visible and the invisible’, which attempts to overcome the subject/object division by invoking new concepts such as ‘flesh’. Against the transparent lucidity of Cartesian consciousness, flesh invokes the thickness and density of the perceptual field: the ambiguous region of the mutual imbrication of the perceiving self and the perceived world in a space in which both overlap without clear dividing line or boundary, but also without either being able to be reduced to the other. The perceiver and perceived cross over and inhabit the same ambiguous space, but this space is non-homogeneous, and includes breaks, gaps, and undersides where the two fail to meet, or at least to do so in mutual accord. Merleau-Ponty gives the helpful example of seeing the world from the bottom of a pool of water. In this case, the water itself is like flesh, and the ‘distortions’ of our perception it introduces appear to be between subject and object insofar as they condition how we see objects beyond the pool. For Merleau-Ponty, our whole perceptual field, and with it our body’s being-in-the-world, has this in-between character in all cases, even when this is less evident. Although in this later work he abandons much of the language of ‘classical’ phenomenology, Merleau-Ponty continues the phenomenological project of bringing to light the conditions of appearances which themselves are usually not apparent. Art holds for him a privileged status in its capacity to do this. Merleau-Ponty’s ontology of painting differs markedly from Heidegger’s aesthetics by disputing the latter’s location of the essence of all art in poetry, and insisting that there is a difference between the kind of meaning apparent in language and that which emerges with the visual. (This difference is taken up and developed by Jean-François Lyotard—see section 6. c.)

3. Hermeneutics

 a. Introduction

Hermeneutics is a theory of interpretation and understanding which has roots in Biblical exegesis and developments in German romanticism, but which emerged as a significant branch of contemporary continental philosophy primarily through the work of one of Heidegger’s most prominent students: Hans-Georg Gadamer. Under Gadamer’s influence, hermeneutics has been developed in differing ways by various other leading proponents, such as Paul Ricoeur in France and Gianni Vattimo in Italy. However, it is Gadamer’s work which remains at the core of this tradition, and continues to be most influential in aesthetics.

 b. Gadamer

As a philosophical heir to Heidegger, Gadamer took up aspects of the phenomenological tradition but focused on further developing reflections on interpretation and understanding found in Heidegger’s Being and Time. The relevance of Gadamer’s hermeneutics for aesthetics can be understood to have two parts. First, in his 1960 magnum opus Truth and Method, art and aesthetic experience are used as an example to defend the kind of understanding appropriate to the human sciences from methodological scientism and to develop a general theory of hermeneutics. Second, in various later essays, Gadamer wrote more explicitly about the hermeneutic character of the experience of artworks and about specific literary and artistic works. Of these later essays, the most significant for developing a hermeneutic philosophy of art is ‘The Relevance of the Beautiful’ (1977).

Gadamer’s general approach to hermeneutics developed in Truth and Method seeks to defend the human sciences from reduction to the kind of methodology appropriate for the natural sciences. Rather than trying to gain objective knowledge about reality in the way natural sciences do, Gadamer argues that the human sciences aim at understanding works (theoretical or literary texts, artworks, and other cultural products), which themselves cannot be understood to be entirely separate from the one who seeks to understand. Rather, both are inscribed within a common horizon of tradition. Tradition (Überlieferung), he insists, must not be understood in the static sense of conserving what already exists but rather as a transmission through which works from a world different to the one we inhabit are passed down to us. Understanding then becomes something like a translation, in which the aim is a ‘fusion of horizons’ of the world of the work and the world we inhabit. Both ourselves, as interpreters, and the work interpreted are transformed by these acts of interpretation, so that we cannot speak of understanding in the human sciences on the model of a subject correctly representing external natural objects to itself. Tradition, for Gadamer, is a matter of constant change and transformation of understanding through acts of interpretation within a continuous overarching horizon.  In the first section of Truth and Method, aesthetic experience and the artwork stand as paradigm examples to show why understanding in the human sciences differs from explanation in the natural sciences, as propaedeutic to working out a general theory of hermeneutics.

Turning to the later essays and Gadamer’s more explicit ‘aesthetics’, we continue to see Heidegger’s influence as well as a number of significant differences. Like Heidegger, Gadamer seeks to challenge and overcome what has been called aesthetics in modern philosophy but for different reasons and with a different aim in mind. While Heidegger concerns himself primarily with the kind of monumental art that can open and sustain a world, Gadamer is interested in any and all kinds of aesthetic experiences, including more mundane ones. Gadamer objects to Kant’s thesis of the disinterested nature of aesthetic experience, insisting instead that such experience brings a cognitive content which connects artworks with our understanding of other features of the world, and other types of experience, including our ‘interests’. Artworks are encountered and understood as part and parcel of the general fabric of interpretations we weave as we encounter the wider world. It is this supposedly disinterested nature of aesthetic experience which Gadamer believes needs to be overcome in modern aesthetics in order to make way for a more genuine understanding of, and possibility for, encountering artworks. In this way, he asserts, modern philosophical aesthetics should be overcome by being absorbed into hermeneutics.

In ‘The Relevance of the Beautiful’, Gadamer applies his hermeneutic approach to illuminate the nature of the work of art. The problem he sets himself to understand is the nature of the artwork considered trans-historically, so that we may understand what ‘art’ means such that the same word refers to the works of the ancient world and to contemporary experimental arts, such as non-objective painting. He proposes that we may do this with reference to the anthropological basis of our experience of art, which he develops through three key ideas: play, symbol, and festival. Like Freud (see section 4. b. below), though without any direct reference, Gadamer links the phenomenon of art as a human activity to that of play. What is significant about play for Gadamer is that it is an intentional activity involving a mere repetition without any real purpose or goal. Applied to the work of art, the concept of play implies that there is no real separation between the work itself and the one who receives it: the work is constituted through a kind of playful activity of the receiver with the work. This activity constitutes the work; it brings the work’s different aspects together through the synthetic activity of interpretation and constitutes the unity of the work. (This understanding of the work again challenges the modern aesthetic tradition, which maintains a distinction between the subject and object in aesthetic experience.)

Gadamer uses the notion of the symbol to explain the way in which an artwork should be thought to be meaningful. As we have seen, Gadamer wants to underline the cognitive dimension of artworks: what they are about and the connections to our interests and involvements that they imply. At the same time, he objects to Hegel’s idealist account of artworks, which reduces them entirely to conceptual content. For Hegel, art has ‘ended’ because the conceptual content artworks were best able to express in sensuous form in classical times has been superseded by philosophy’s more perspicuous conceptual articulation. Here Gadamer again takes inspiration from Heidegger, and reiterates the latter’s idea that every revealing is also a concealing; every setting up of a world also sets forth an earth, so that every artwork always maintains something concealed within it which resists the current interpretation. Gadamer translates this into every encounter with a work of art, so that the artwork is imputed with an inexhaustible excess of meaning, gradually revealed through repeated engagements with the work (like a conversation), without the prospect that such meaning might ever be wholly revealed or exhausted.

The symbol, for Gadamer, then expresses the way that an artwork can have a meaning which is cognitive and quasi-linguistic yet excessive and inexhaustible. For Gadamer, language stands as a paragon for all experience of meaning, so that even apparently non-linguistic experiences such as encounters with artworks must be understood according to a linguistic model. Art speaks to us, it says something to us, and understanding a work of art is all about working out what it has to say. This means learning to listen to or read a work of art, which in turn means learning to understand the kind of language it speaks. He notes how both ancient and contemporary works can challenge us by appearing to speak a language we do not understand and how interpreting them requires a process of learning the appropriate language in order to approach a meaning which will, however, never be able to be summed up in a simple conceptual determination or linguistic formulation. In this manner, understanding a work of art will be an interminable affair involving repeated encounters and acts of interpretation.

Finally, the festival reveals something about the temporal character of the artwork and its affinity with human community. Like the experience of a festival or holiday, the artwork invites us into an experience of time which differs from the quantitative, calculative experience of time we have when we are engaged in work (and similar everyday activities). Gadamer calls this ‘fulfilled’ or ‘autonomous’ time; it is time which has a certain unity and cannot be dissolved into separate moments, and which stands apart—and stands us apart, as we experience it—from everyday concerns. It is here that we see the phenomenological character of Gadamer’s aesthetics: like Heidegger, he is concerned with the way that art invites us to ‘let things be’, to be open to the way they reveal themselves. It is also through festive experiences that community is formed by dissolving the usual hierarchical distances which divide citizens according to social roles.  Art, Gadamer proposes, can also be experienced in a way which does not appeal to any particular social class, but unites people in sharing the same kind of experience. While hermeneutics has sometimes been characterised as conservative because of its emphasis on tradition and community, it must be emphasised that Gadamer sees both in terms of openness and transformation. And while he reminds us of the trans-historical importance of great works such as Ancient Greek tragedy, it is notable that Gadamer also asserts the legitimacy and importance of contemporary experimental arts, of happenings and anti-art, and even of pop music.

4. Psychoanalysis

 a. Introduction

Psychoanalysis, which received more mainstream acceptance in continental Europe than in English-speaking countries, produced a body of theory which has become a major current feeding into continental philosophy, including its aesthetic reflections. Sigmund Freud, the father of psychoanalysis, extended his theoretical model of the human psyche beyond the clinical setting to develop many aspects of a general philosophical anthropology. One area he treated in a number of essays was art. In his writings, we find both some general reflections on the psychological significance of creative activity and the interpretation of a number of specific painters and writers, including Leonardo, Michelangelo, Fyodor Dostoevsky, and E.T.A. Hoffmann. Following Freud, other psychoanalysts, notably C.G. Jung, Melanie Klein, and Jacques Lacan, have included ruminations on art in their distinctive developments of psychoanalytic theory. Psychoanalysis has inspired many art critics to adopt its ideas and methods, and artists themselves have also been subject to its influence (most notably in the Surrealist movement). Aspects of psychoanalytic theory have been taken up by some continental philosophers, such as Julia Kristeva, Jean-François Lyotard, and Slavoj Žižek, in their own aesthetics and philosophies of art. Such philosophers have engaged deeply with the writings of psychoanalysts themselves, treating their ideas as philosophical theories, and we will do the same here, outlining some of the prominent contributions to aesthetics we find in the works of Freud and Lacan.

 b. Freud

Freud develops the outlines of a general approach to aesthetics that he calls variously ‘pathography’ or ‘psychobiography’. He finds the origins of the artist’s creative activity in children’s play, in which the child imaginatively creates a world of his or her own. He suggests that this creative activity then takes the form of phantasies in adulthood, where phantasies are understood as imaginative fulfilments of desires which remain unfulfilled in reality. Artists, he contends, are particularly neurotic people who are especially incapable of fulfilling their desires in reality and find a substitute sense of fulfilment by externalising their fantasies in works of art. However, artists are also especially gifted: their talents allow them to represent such desires in a way which makes them acceptable to others when the desires confronted directly in themselves would be repulsive (because of their brute, animalistic character—which is why they are often repressed). Artists’ own activities have a therapeutic value for them because artistic expression acts as a release for the pressures of desires which are unfulfilled and/or repressed. Moreover, Freud contends that the real reason we enjoy art is that it serves the same function for the aesthetic spectator. The formal or properly aesthetic qualities, he suggests, merely have an initial ‘incentive value’ to draw us to the work, while the real enjoyment comes from the release we feel by sharing with the artist the phantasy of fulfilling unfulfilled or repressed desires.

In general, the desires which artists represent are of two main kinds: ambitious (the desire for power, achievement, and security) and erotic (the desire for love and sexual pleasure). However, according to Freud, artists express their own unique phantasies with enough specificity that, with the help of biographical knowledge of the artist’s life, we may interpret artworks like symptoms (hence ‘pathography’, meaning ‘marks of illness’) in order to reconstruct a picture of the artist’s psychological life (hence ‘psychobiography’). Freud’s most famous example here comes from Leonardo. He sees in Leonardo’s painting The Virgin and Child with St. Anne (c. 1503) the outlined figure of a vulture in the Virgin’s clothing and uses this as a clue to a psychoanalytic interpretation which also draws on Leonardo’s diaries and on biographical accounts. He notes a passage in the diaries which seems to underline the significance of the vulture where Leonardo writes of a memory from early childhood in which a vulture repeatedly struck him on his open mouth with its tail. Interpreting this as a fellatio fantasy, and drawing it together with a number of other interpretive elements, Freud diagnoses Leonardo as a passive homosexual who did not actively pursue his homosexual desires but sublimated them into his work. This case has proven infamous because of Freud’s misinterpretation of the key word in Leonardo’s writings—he in fact refers to a kite, not a vulture—but the psychoanalytic approach does not stand or fall with a single example. The general approach illustrated here, that of psychobiography, has since been taken up and developed by numerous other writers in relation to many other examples. More significantly, and aside from the many criticisms to which psychoanalysis in general has been subject, this form of psychoanalytic aesthetics has been criticised for reducing the value of the work to the life of the artist (see section 2. c.) and for attempting psychoanalysis without the benefit of a living and present analysand.

 c. Lacan

Jacques Lacan is arguably the second most influential psychoanalyst, both in general and in aesthetics and art theory, after Freud. Lacan’s contribution to aesthetics is at once less and more ambitious than Freud’s: less insofar as he did not produce any lengthy psychobiographical studies of artists, yet more insofar as he moved beyond Freud’s hesitant tone where art was concerned to confidently pronounce on the motive of artistic creation and the power of visual art to fascinate. Moreover, his writings and seminars are peppered with examples from the arts used to explain psychoanalytic concepts, an approach which has proven influential on many other art writers and cultural theorists, who (questionably) assume that the reverse also holds true: that such psychoanalytic concepts tell us something about the arts that may be used to illustrate them. Lacan’s distinctive approach to psychoanalysis may be glossed as an attempt to update and formalise Freud’s teachings by applying the concepts and methods of structural linguistics (see section 6. a.). Freud’s own works lend themselves to this because of their extensive references to the importance of language in psychic functioning. Lacan’s approach is well indicated by his famous dictum: ‘the unconscious is structured like a language’.

While retaining Freud’s categories (the unconscious, id, ego, and so forth), Lacan added three ‘registers’ of psychological functioning to his model of the mind: the imaginary, the symbolic, and the real. The imaginary concerns thinking in images, the symbolic is the register of language and formal symbolisation but is also associated with the law and social custom, while the real designates what falls outside the limits of the imaginary and the symbolic. Together, the imaginary and symbolic constitute what we think of as ‘reality’, which contrasts with ‘the real’. The real has its origins in the experience of infantile life, before the imaginary and symbolic mental functions have developed, but it returns as a kind of surplus energy to make its presence felt in those registers throughout life. Two further key ideas are necessary to understand Lacan’s specific contributions to aesthetics: the famous ‘mirror stage’ and symbolic castration. Lacan postulates that at around 6–18 months of age, the child develops an awareness of their separation from the mother’s body, but experiences their own body as disorganised, lacking unity. This unity is provided by identification with other people (as in a mirror image) but at the cost of a fundamental split and lack: contra Descartes, the thinking subject is not a self-sufficient unity but gains an identity only through a fundamental identification with an other. This identification with an other structures our own desire, as we learn to desire what the other desires (by imitation) but also in relation to what the other desires (we ask ourselves: what does the other want from me, or want me to be?).

Lacan transforms Freud’s theories of the Oedipus complex and castration anxiety by suggesting that the Oedipus complex is in fact resolved through the accomplishment of a symbolic castration. This occurs as the child learns language and also learns to attenuate their desires in relation to the law and social custom. This is attended by a further alienation and feeling of lack as the unconscious idea of a pleasurable plenitude prior to both the alienation of the mirror stage and symbolic castration develops. Lacan postulates that things or objects can take on the role of what he calls the ‘objet petit a’ (the object designated with a little ‘a’, for autre, the French word for ‘other’). Human beings are principally motivated by a ‘fundamental fantasy’ which is the fantasy of fulfilling our lack through the objet petit a, which we unconsciously see as a lost object (symbolically, the phallus which is lost through castration), the regaining of which would make us whole. The central aim of Lacanian psychoanalysis is to help the analysand ‘traverse the fantasy’: to realise the fundamental fantasy for what it is (precisely that, an impossible fantasy), and to accept the inevitable necessity of symbolic castration.

With these basic points in place, we are in a position to understand some of the key ideas Lacan outlines in the text which has been his most influential in aesthetics, the section called ‘Of the Gaze as objet petit a’ in the transcript of his Seminar Book XI: The Four Fundamental Concepts of Psychoanalysis (delivered in 1964). Coincidentally, the first of these seminars was given in the same week that Merleau-Ponty’s posthumous work The Visible and the Invisible appeared, and Lacan takes the phenomenologist’s work as the basis from which he develops a psychoanalytic theory of art (see the section on Phenomenology and Existentialism above). At the same time, it must be understood that Lacan develops these ideas in a direction which, by focusing on the unconscious and a structuralist approach, seeks to provide an alternative account of the visual to that of existential phenomenology. The latter, for Lacan, does not adequately account for the decentred and split nature of subjectivity because it identifies it too strongly with the conscious ego. Lacan begins from Merleau-Ponty’s contention that the seeing subject is not the most primordial aspect of the visual, as such, and develops a deeper conception of the visible, and the invisible which conditions it, in terms of a distinction between ‘the eye’ and ‘the gaze’. ‘The eye’ designates the seeing subject, associated with the Cartesian analysis of vision, and the ideal point of the observer in single-point perspective painting. By contrast, ‘the gaze’ designates something in the visual field which escapes the eye’s ability to see it clearly and, in so doing, is the presence in the visual of something which escapes the Cartesian subject’s supposedly transparent self-consciousness and mastery of the objects it surveys. The skewed perspective in the visual field that the gaze suggests is famously exemplified by Lacan with Hans Holbein’s anamorphic painting The Ambassadors (1533). The skull at the bottom of the picture can only be seen clearly by viewing the rest of the painting from an extreme angle, illustrating both the lack of a single perspective which can master the visual field and the lack at the heart of the subject (exemplified by the skull as the symbol of death, the memento mori of the painting’s traditional interpretation).

The primordial model of this ‘gaze’ is the gaze of the mother, which evokes in the child questions that persist in our unconscious and shape our experience of vision: what does the (m)other want from me? What does she want me to be? As a seeing subject (‘the eye’), Lacan argues, my visual field is haunted by something which looks back at me, but which I cannot clearly see. This haunting of the visual field by an other which decentres my own point of vision is what Lacan understands by ‘the gaze’. As patterned on our mother’s look, the gaze is an example of the objet petit a, and it evokes in us the sense of a lack which might be filled by regaining the appropriate lost object. Lacan answers the question ‘What is a picture?’ by suggesting that a picture is something the artist produces in the hope of appeasing or pacifying the gaze: it is an attempt to give the mother what she wants in the hope of regaining the fantasised, lost, utopian, perfect relationship with her, and thus fulfilling our own lack. This then leads to Lacan’s pronouncement on the meaning of all painting (and by extension, all visual art): it is ‘a trap for the gaze’.

For Lacan, visual art has a particular psychological function which works specifically in the register of the imaginary: it acts as a ‘lure’ for desire, inviting us to fantasise about the overcoming of alienation and the regaining of the lost object. This is because it is not only the painter who seeks to fulfil his or her desire by giving the other what it wants with the picture but the picture itself, as an image of self-enclosed completeness, which satisfies the spectator’s desire to fulfil the fundamental fantasy. It is in this way, Lacan suggests, that the picture can have a taming, civilizing, and fascinating power. Some interpreters (such as Lyotard; see section 6. c.) have taken this to be Lacan’s last word on art, which leaves it at a level inferior to the symbolic and the possibility of traversing the fantasy. Others, however, have suggested that Lacan’s work is open to the reading that some visual art can work at the symbolic level by deconstructing the illusion of painting from within, showing how the supposed realism of the painting is a product of artifice. Desire is then prevented from fantasising about its own fulfilment in the supposed unity and wholeness of the image and is forced to confront the arbitrary constructions of the symbolic order. This seems to be suggested, for example, by Lacan’s reading of Diego Velázquez’s Las Meninas (1656) in his May 1966 seminar. Notably, this was a response to Michel Foucault’s examination of the same painting in his immediately popular and now classic 1966 work The Order of Things, where it is examined as a representation of how representation itself was understood in what the French call the Classical period. However it is interpreted, Lacan’s idea of the gaze makes a fascinating contribution to aesthetics by suggesting that our experience of the visual is not a simple given, reducible to geometric analysis, but is conditioned by our split subjectivity and the intrigues of our desire. 
His ideas have been particularly influential in film theory, where through various formulations they have dominated the field for several decades. A key development for film theory, and beyond it, is the way that Lacan’s ideas were taken up by Louis Althusser to develop a structuralist theory of ideology according to which our subjectivity is structured through its capture in the gaze of the big Other, the symbolic authority figure which conditions social reality in any given society.

5. Critical Theory

 a. Introduction

The tradition known as critical theory is associated with the Frankfurt School, more formally known as the Institute for Social Research (Institut für Sozialforschung), which was founded in Frankfurt in 1923, moved its base of operations to New York in 1934, then returned to Frankfurt in 1951. With its unique combination of sociology and philosophy, Critical Theory is arguably the most prominent strand of Western Marxism. A number of philosophers and cultural critics working in this tradition have made contributions to aesthetics that have been highly influential throughout continental philosophy and in wider aesthetic and art-critical contexts. These thinkers have been concerned with both the fate of art under the conditions of industrial capitalism and the potential of art to critique such conditions. These themes were treated by Walter Benjamin in his highly influential essay ‘The Work of Art in the Age of Mechanical Reproduction’ (1936), and by Theodor W. Adorno in his Aesthetic Theory (1970) and numerous shorter works.

 b. Benjamin

Benjamin’s essay hypothesises that the function of the artwork has changed as the conditions of production under industrial capitalism have changed. Following Marx but seeking to update his analyses, he suggests that it has taken some time for the changes at the superstructural (cultural) level to manifest the implications of the changes at the substructural (economic) level, which Marx analysed in the nineteenth century. The key factor in this change is the technique of mechanical reproduction. Benjamin concedes that reproducibility as such has always been a concern with art since antiquity and points to developments in the history of reproducibility such as the printing press and lithography. However, he classifies these techniques as forms of manual reproduction and asserts that with mechanical reproduction we see the development of something significantly new. The main artforms he has in mind, and discusses at length in the essay, are photography and film. With these, the process of the production of images itself is largely mechanical and reproduction can no longer be said to simply copy an original. Benjamin famously claims that what artworks previously had, which they lose through mechanical reproduction, is what he calls aura. The aura of a work is the unique ‘presence’ which the original exudes in occupying a distinct time and place.  It is what gives a work its authenticity and makes it possible to distinguish between an authentic original and a forgery: the authentic original has occupied a unique series of times and places, which constitutes its history. Benjamin rightly notes that with arts such as photography and film, it no longer makes sense to draw such distinctions: one reproduction from a photographic negative, for example, is no more or less authentic than another. Instead of existing as a unique object, the work of art in the age of mechanical reproduction now exists as a multiplicity of copies.

According to Benjamin, this change also has the effect of extracting the artwork from tradition. For him, the uniqueness of a work means that it is imbedded in a fabric of tradition. This traditional uniqueness is associated with the anthropological basis of artworks in ritual, and Benjamin uses Marx’s categories of use value and exchange value to suggest that ritual or cult value is the original use value of an artwork. While such value becomes secularised in the ‘cult of beauty’ that is modern aesthetics and the art world, something of the use value of the work persists in the emphasis on its authenticity. However, Benjamin argues that with the advent of mechanical reproduction, artworks are finally liberated from this cult value and instead take on an ‘exhibition value’. Copies are put into mass circulation, exhibited far more widely than would be possible with an authentic original. For Benjamin, this accords with a broader social and cultural phenomenon of ‘the mass’, a sense of the universal equivalence and exchangeability of all things in the social domain. Mechanical reproduction feeds the desire of the masses for things to be brought close, as distinct from the unique work of art which is always at a distance (even when one is ‘present’ to it in a gallery or other setting). According to Benjamin, the quantitative transformation of artworks demanded by the masses also leads to a qualitative transformation, as the nature and function of art comes to be understood according to the model of the arts of mechanical reproduction. Thus, art is transformed by its loss of aura.

Moreover, Benjamin asserts that human modes of perception are historically transformable, and the arts of mechanical reproduction are altering our perceptions of the world. Techniques in photography and film such as the close-up and slow motion are not simply reproducing what we previously knew of the world but introducing new perceptions and knowledges as they capture things entirely unknown to the naked eye. Benjamin suggests that as art loses its ritual or cult value it takes on a political value, and while the essay lacks a clear political programme or set of prescriptions, he asserts that the concepts it develops are useful for a revolutionary communist politics of art. For him, older aesthetic traditions based around the aura can be seen as culpable in their co-optation by fascist regimes, while the transformations wrought in the arts by industrial technologies open more promising possibilities for the politicization of art through the democratic communication of ideas.

 c. Adorno

While Adorno’s reflections on art and culture developed to some extent from a critical disagreement with Benjamin over the democratizing potentials of radio and film, this developed into a highly productive engagement. Adorno agrees with Benjamin that contemporary developments in society and the arts radically challenge modern and romantic aesthetics but has a far more pessimistic view of the mass media ‘culture industry’. For Adorno, popular culture (including most radio and film) is complicit with the contemporary social system of capitalist exploitation, which he analyses as a culmination of the logic of ‘instrumental rationality’ devolving from the Enlightenment. In this system, human beings are radically alienated from nature through the project of manipulation and control of the natural world, which has not resulted in the hoped-for emancipation of human beings, but in a ‘new barbarism’ in which we are psychologically dominated by the very system which was supposed to set us free. This system is one which determines everything according to a logic of rational specification and calculation, leaving little room for any way of understanding ourselves or relating to the world other than through what we understand to be instrumentally useful or productive.

The culture industry acts as an ideological support of this system, keeping people blind to the real conditions of their existence. However, Adorno saw more positive potentials in ‘genuine’ art, in particular experimental modernism. Through extensive writings on musicology, literature, and the visual arts, Adorno contributed one of the most important bodies of work in continental aesthetics. These reflections culminated in Adorno’s last book, Aesthetic Theory, completed but not finally edited by him.

Aesthetic Theory deliberately employs strategies which resist any easy summation of the work into simply stated concepts and theses. Adorno developed a paratactic style of writing on the model of atonal music, in which sentences clash with each other to dissonant effect, rather than developing a clear line of argument. Moreover, Adorno deploys his own ‘negative’ dialectical style of thinking, in which pairs of contrasting concepts ‘constellate’ around topics of discussion without resolving into static propositions and conclusions. These difficulties are far from arbitrary and are, in fact, highly motivated by Adorno’s own views on how critical thought can best resist the system in which it operates. This includes a demand that thought take time and be difficult, in contrast to the expectations of quick and easy consumption which dominate in the culture industry. Nevertheless, there are several key themes the work develops which are readily appreciable in the context of the broader tradition of aesthetics in continental philosophy we are outlining here.

First, Aesthetic Theory is a critical interrogation of the tradition of philosophical aesthetics itself, especially as exemplified by the works of Kant and Hegel. Adorno’s view is that many of the categories of traditional aesthetics are outmoded because of developments in both society and the arts. Nevertheless, he seeks to rethink such categories critically rather than simply abandon the legacy of the aesthetic tradition. To take one easily appreciable example, Adorno draws attention to the limitations of the aesthetic tradition’s focus on the beautiful in the face of the apparent ugliness and dissonance characteristic of much modernist art. Second, Adorno (somewhat infamously) asserts the autonomy of the artwork. (Here he follows Kant on the autonomy of aesthetic judgement but insists that this autonomy should also be ascribed to the art object itself.) This is an insistence that the aesthetic value of an artwork is independent of other values which might be ascribed to it, such as epistemological or ethical value. Unfortunately, this claim has often been (mis)interpreted to mean that artworks should be understood to be entirely unrelated to their social context or political value. Adorno’s view is more complex and, in fact, strongly asserts the relevance of cultural context and the political import of art.

As a third major point, then, for Adorno, artworks may be understood as ‘monads’ (a concept drawn from Gottfried Leibniz): while they are independent, self-enclosed entities, they are products of the social conditions in which they are created and mirror these social conditions within them. Following on from Marx’s framework of analysis, Adorno sees the conditions of contemporary capitalist society as fundamentally contradictory, and it is these contradictions which the artwork embodies. Adorno argues that the most politically relevant artworks are not ones with explicit political content, but those which best reflect the deeply conflicted conditions of contemporary culture (such as the atonal compositions of Arnold Schoenberg or the absurdist literature of Samuel Beckett). Such works have a ‘truth content’ but not one which could be stated in clear propositions with cognitive value. Moreover, the autonomy of art in fact gives it a function of political resistance: while the artwork, like everything else under the conditions of contemporary capitalism, has a commodity form, it also resists incorporation into the instrumentally rationalised system of production and consumption through its very uselessness. For Adorno, modernist experimental art is a privileged site of politics in the contemporary world, as it can both reflect and resist the difficulties and contradictions of contemporary existence better than explicit political discourse. Nevertheless, due to its opacity, art still needs a philosophical aesthetics to aid in its comprehension, and the complex arguments of Aesthetic Theory attempt to rework the concepts of the aesthetic tradition so that they become adequate to the task. While there was no outright dialogue between them, there are palpable and interesting resonances between Adorno’s aesthetic theory and the dominant modernist school of art theory which was developed in America by Clement Greenberg, Michael Fried, and others in the twentieth century. 
As we find in Adorno’s writings, this brand of aesthetic modernism also combined a concern with formalism, autonomy, and experimentation in the arts with a belief in the socially and politically critical relevance of such works.

6. Poststructuralism

 a. Introduction

Poststructuralism is the name given in the English-speaking world to a loose collection of influential French philosophers and theorists working in the wake of structuralism, a movement which itself deserves some mention for its impact on aesthetics in continental philosophy. Structuralism came to prominence in France in the nineteen-fifties and -sixties, rivalling and, to some extent, succeeding phenomenology and existentialism as a leading methodological approach in the human sciences. It applies some basic tenets of Ferdinand de Saussure’s structural linguistics to phenomena other than language, such as the unconscious (Lacan, as we have seen above in section 4. c.), myth and ritual (Claude Lévi-Strauss), and history (Michel Foucault). Most significantly for aesthetics, Roland Barthes applied structuralist principles to literary criticism, and developed Saussure’s suggestion of a ‘semiology’, a study of signs in general (broader than the study of linguistic signs alone), applying such an approach to various forms of art and culture. Simply put, structuralism views the meaningful content of any phenomenon as given in the structured relations between basic units (signs). This structure is taken to be hidden (or deep), and interpretation of an artwork or cultural product then becomes a matter of making the structure which informs it explicit. Because of its formalism and methodological rigour, structuralism was touted by its supporters as a more ‘scientific’ method for studying the phenomena of the human sciences (that is, ‘meaningful’ phenomena), and it swept through the French academy like a revolution.

To some extent, poststructuralism can be understood as a philosophical reaction to the excessive zeal for formal method that structuralism exhibited. Most poststructuralists continued to draw on the phenomenological tradition, as well as psychoanalytic theory, and adopted aspects of structuralism while critiquing others. In short, poststructuralists tend to argue that meaning is not reducible to static structures and cannot be uncovered using a formal method. Generalising greatly, we might say that poststructuralists insist upon the necessity of some element of indeterminacy (which accounts for the genesis of the structure) that operates within the structure to generate meaning, and that constitutes an instability which threatens the coherence of the structure and may disrupt it and cause it to change. Understood as an interplay between structure and the element of indeterminacy (often called ‘the event’), meaning cannot be uncovered using a formal method, and poststructuralists have had recourse to highly unorthodox, experimental modes of thinking and writing in theorising and demonstrating those aspects of meaning or effect they believe structuralism misses. Art and aesthetics have been significant topics for all poststructuralists because, as the philosophical tradition attests, aesthetic meaning or effect seems to be a paradigm case of a kind of meaning which is not ‘scientific’. I will summarise here some of the key ideas of the two poststructuralists who have been most influential in aesthetics (Jacques Derrida and Gilles Deleuze) as well as those of the philosopher in this tradition who has engaged most extensively with art, Jean-François Lyotard.

 b. Derrida

Derrida and his philosophy of deconstruction have had an enormous influence on literary criticism and some influence as well in the wider arts and aesthetic theory. Notoriously difficult to summarise, Derrida’s works may be approached for our purposes through the observation that he develops a quasi-transcendental theory of meaning, which has implications for how meaning is understood to operate in philosophy, literature, and the arts. In post-Kantian, contemporary continental philosophy, ‘transcendental’ refers to the ‘conditions of possibility’ for a thing. The ‘quasi’ in Derrida’s case notes that while traditional transcendental thinking posits a priori structures that are taken to be universal and necessary, Derrida follows Heidegger in positing that the way things become meaningful is a function of time and subject to temporal and historical change. Derrida’s ‘principle’ of meaning, which claims to capture something of these conditions for anything whatsoever being meaningful, is ‘arche-writing’. This term indicates that it is some of the key features or properties of writing, as it has been understood in the metaphysical tradition, which are quasi-transcendental conditions of the possibility of meaning, rather than writing as such. These features are indicated by Derrida’s well-known term différance, which indicates spatial differing and temporal deferring.

This idea of différance contests the principle of meaning which has, according to Derrida, dominated throughout the Western tradition, which he calls the ‘metaphysics of presence’. This theory proposes an origin or full presence of ‘pure’ meaning in an idea held in the mind, which is then progressively corrupted by being put into spoken, then written, discourse. This supposed corruption of meaning corresponds with the spatial and temporal differing and deferring which, Derrida contends, are in fact the conditions of anything being meaningful in the first place. According to Derrida, there is no possibility of a pure, simple, original meaningful presentation, and every apparently original presentation is always already a repetition or a re-presentation. His arguments are extremely complex but may be treated summarily by noting how they draw on the traditions of phenomenology and structural linguistics. In the Husserlian phenomenological tradition, which takes consciousness as the transcendental condition of meaning, Derrida reads Husserl to show that conscious experience requires a synthesis of different temporal moments, such that any ‘presence’ of something to consciousness is already subject to the passing of time, that is, temporal difference. From the structural linguistics of de Saussure, Derrida draws the idea that every linguistic meaning only functions because of the possibility of its reiteration, or what Derrida calls its ‘iterability’. Every linguistic usage draws from an already-existing store of linguistic meaning (the virtual structure of language as a whole), and in that sense is already a reiteration. Moreover, every use presupposes the possibility of the listener or reader reiterating the use in another context, because the very nature of linguistic competence (and thus the capacity to understand) depends upon the ability to use language in this appropriative and citational manner.

While Derrida has often been seen as collapsing the distinction between philosophy and literature, he is in fact drawn to the latter and deploys it to contaminate and complicate the former because of the differences he sees between them. While he seeks to deconstruct any simple opposition between philosophy and literature, such a deconstruction would not be possible without also insisting on the differences between them. Philosophy has traditionally set itself up in opposition to the ‘merely’ literary, claiming truth to be its own exclusive competence and categorising literature as belonging to the fictional or untrue. Philosophical texts have typically been tightly structured according to the metaphysics of presence, deployed in structures of binary oppositions which set up hierarchies of meaning, such as truth/falsity, essence/appearance, form/matter, presence/absence, and so on. By contrast, although Derrida sees all meaning and all texts as to some degree structured by the metaphysics of presence, he sees the virtue of literature (and especially the works of experimental writers such as Stéphane Mallarmé, James Joyce, or Antonin Artaud) as asserting and developing the ambiguities, contradictions, aporias, and playfulness of meaning that philosophical texts and modes of writing strive to suppress. Deconstruction, for Derrida, is a strategy of reading and writing which aims to identify and subvert the binary oppositions structuring a text, showing how the privileged term is in fact parasitic on the underprivileged one, and opening up the space for a play of meaning beyond simple oppositions by inventing concepts (such as the trace, différance, the hymen, and so on) which are ‘undecidable’ from the point of view of such oppositions. What Derrida finds in literature are such undecidables already in play to a much greater extent than in philosophical texts, and he affirms and reinscribes these in his own writings. 
Derrida strove to emulate literary modes of writing in his philosophical texts precisely in order to open them to a freer play of meaning. Through Derrida’s influential association with prominent literary critics such as Paul de Man and J. Hillis Miller, deconstruction became enormously popular in literary criticism from the nineteen-seventies to -nineties, often taking the form of a reductive methodology for exposing contradictions internal to a text, which Derrida himself would never have approved of.

When Derrida turns his attention to the visual arts, in texts such as The Truth in Painting, he develops concepts (such as the trait, the parergon, and the subjectile) which essentially follow the same differential logic as arche-writing. Derrida suspects any supposition of a pure presence of meaning in an image and works in various ways to complicate this, showing that images depend on an ambiguous play between concepts and categories such as the inside and outside of the frame, the visible and the invisible, word and image, single artwork and entire oeuvre, and so on. These playful movements are processes of spatial differing and temporal deferring, which work against the metaphysics of presence and underline a differential form of meaning in the visual which is similar to that which he sees operating in the written text. Moreover, Derrida also seeks to complicate any simple opposition between visual and textual meaning, seeing such an opposition as itself implying a metaphysics of presence. This complication is notably played out in his text on the Italian artist Valerio Adami in The Truth in Painting, where attention is given to the communication and interplay of meaning between Adami’s images and the text he places within, outside, and in transgression of the frame of his visual works.

 c. Lyotard

Lyotard’s Discourse, Figure (1971) stages a significant encounter between phenomenology, structuralism, and psychoanalysis, with the aim of doing justice to the aesthetic event, and in particular the visual. Lyotard insists—against structuralism, hermeneutics, and indeed much of the literature of art history and visual culture—that the visual has its own kind of meaning, which differs from and cannot be reduced to linguistic meaning. For him, it is wrong to say that a picture can be ‘read’. Instead, he tries to account for how art can leave us with the feeling of being ‘lost for words’. The first part of Discourse, Figure carefully compares the kind of meaning proper to perception, as developed in Merleau-Ponty’s phenomenology (see section 2. c.), with the kind of meaning operative in language according to structuralism. While it is clear that Lyotard thinks Merleau-Ponty gives a more adequate account of the kind of meaning specific to the visual, he also finds phenomenology ultimately inadequate. He argues that Merleau-Ponty’s notion of the ‘flesh’ remains a too-harmonious interface with the world at the level of conscious perception, and has recourse to psychoanalysis to try to find in the unconscious the source of radical creativity and sheer unexpectedness characteristic of avant-garde art.

Lyotard objects to Lacan’s structuralist reading of the unconscious, however, and believes that the latter’s interpretation of art as lodged in the register of the imaginary, acting as a lure for desire, is an affront to the grandeur of art (see section 4. c.). Employing a close reading of Freud, he develops an alternative view of the unconscious, which emphasises plastic transformations (rather than linguistic operations) of its contents. Lyotard also objects to much of Freud’s own explicit aesthetics, however, and argues that the meaning of an artwork is not to be found in the pathology of the artist. Instead, he develops Freud’s theory of the unconscious and desire, along with Merleau-Ponty’s phenomenology, to give a complex account of the artwork: it is neither simply the impression of conscious perceptions, nor the expression of unconscious desires (fantasies), but a mutual deconstruction of one by the other, which produces a new and unrecognisable form. He refers to this deconstructive element as the figural.

In Lyotard’s later work, he reconfigures the traditional aesthetic category of the sublime to account for and defend avant-garde art and the significance of the aesthetic in the contemporary world. Lyotard now follows Adorno in postulating a crisis of traditional aesthetics, both in relation to the conditions of (post)industrial capitalism and developments in the arts, and tries to update aesthetics in response (see section 5. c.). For Lyotard, there is a crisis of meaning on the level of perception in the contemporary world, because—following the analyses of Heidegger, Benjamin, Adorno, and others—scientific and technological developments, operating in tandem with capitalism, have mutated the perceptual bearings by which we coordinated ourselves in the world. Sciences and technologies have both extended our sensory capacities (seeing and hearing at a distance, through television and telephones, for example), and revealed a reality beyond our body’s capacities for sensory awareness (atoms, microbes, nebulae, and so on). According to Lyotard, these changes have meant that the basic forms of sensory experience—time and space—have been thrown into uncertainty.

Lyotard sees avant-garde art of the twentieth century as having pursued an analogous exploration of this crisis of perception. Traditionally, aesthetics has been concerned with the beautiful, understood in the arts as an ideal fit between the form and matter of a work. Lyotard sees avant-garde art, especially minimalism and abstraction, as moving away from a concern with ‘good form’ and towards an exploration of matter. Following Kant, ‘matter’ is something which defies rational calculation and specification: for example, colour in painting and timbre in music. While Kant himself only saw the sublime in art in depictions of sublime scenes in nature (mountains, storms at sea, and so forth), Lyotard suggests that the sublime is the aesthetic category appropriate to art which is less interested in exploring formal structure than in experimenting with matter, precisely because the sublime concerns feeling in relation to something ‘formless’. Lyotard characterises the sublime stake of art as ‘presenting the unpresentable’, because for him the aesthetic event is something which cannot be reduced to a ‘presentation’, understood in the Kantian sense as a ‘good form’ given to a sensation. Rather, art-events evoke thoughts and feelings in relation to works which surprise us and which we cannot make sense of on the level of either perception or concept: works that leave us feeling moved but lost for words. In his later works, the sublime is the aesthetic category which Lyotard thinks best names this feeling. However, he also seeks to update this category in relation to the way it was understood in romantic or modern aesthetics. While in such aesthetics it had invoked the Idea of the absolute through a nostalgic feeling of loss for something transcendent which is missing, Lyotard posits a postmodern, immanent sublime in which the absolute is nothing other than the formless work itself.

 d. Deleuze

Both in his writings with Félix Guattari and on his own, Gilles Deleuze made important and influential contributions to the philosophy of film, painting, literature, and music. Many of his reflections on aesthetic issues are summarised in his last book with Guattari, What Is Philosophy?, where they are accompanied by a criticism of the phenomenological approach to aesthetics and Merleau-Ponty’s notion of the ‘flesh’ in particular. As is characteristic of all of his work, Deleuze sees the level of perception with which phenomenologists are preoccupied as insufficiently deep to provide a full account of reality. It is on this level that he and Guattari situate, for example, Mikel Dufrenne’s a priori of aesthetic experience, and Merleau-Ponty’s flesh (see section 2. c.). Deleuze and Guattari delve deeper to give an account of art and aesthetic experience grounded in a metaphysical description of reality, where ‘sensation’ becomes the key aesthetic issue. Sensation is posited in a register prior to the distinction between subject and object, and consists of two main types: percepts and affects. Understood in this specific sense, they are perceptions and feelings considered independently of the lived experience which reveals them and raised to the level of independent metaphysical existence. For Deleuze and Guattari, a work of art is a ‘being of sensation’, a compound of percepts and affects, which is a ‘monument’ that preserves the sensation in and as the material from which the work is made. While for them the artist undoubtedly has a role in creating the work and the spectator or auditor a role in appreciating it, the emphasis is on the independent ontological status of the work as embodying that aspect of being which is sensation. They associate Merleau-Ponty’s flesh with the lived experience which reveals sensation, but insist on two further, deeper, and necessary conditions for sensation: the ‘house’, and ‘cosmic forces’. (While terms such as these may appear strange in the context of philosophical discourse, these and others are inspired by the writings of artists and other non-philosophers, and their use indicates a characteristic poststructuralist desire to think art with artists and art itself, rather than construct an independent theory about it, on the model of traditional aesthetics.)

Briefly glossed, the ‘house’ is the structure that gives sensations some consistency, such as the frames of paintings or the walls of architectural constructions but also more abstract principles of composition. ‘Cosmic forces’ are the basic physical and metaphysical forces constituting the real. Deleuze and Guattari list gravity, heaviness, rotation, the vortex, explosion, expansion, germination, and time. The main point here is that Deleuze and Guattari want to connect the activity of art with things usually considered extraneous to art and, indeed, with the universe as a whole. One notable way this placement of art within a broad metaphysical view plays out is the claim that animals can be artists through their exploitation of the expressive qualities of materials in marking territory, attracting mates, and so on. However, Deleuze and Guattari also insist that art must be considered to be a form of thinking which thinks with sensations, just as philosophy thinks with concepts and science thinks with functions. Art thinks against the common opinions, doxa, or clichés of our perceptions and feelings and adds new varieties of sensation to the world. This insistence gives art a legitimacy equal to that of philosophy and science, again indicating the importance accorded to the aesthetic which is characteristic of continental philosophy.

7. Developments in the Early 21st Century

 a. Introduction

Contemporary continental philosophy continues to see contributions to aesthetics which build on all of the traditions discussed above. Some twentieth-century contributions to continental aesthetics, such as Adorno’s Aesthetic Theory or Lyotard’s extensive writings on the arts, still await much-needed interpretation and discussion before the potential of their influence can be made manifest. In addition, contributing to the broad, pluralist landscape of aesthetics in continental philosophy, most of the more prominent continental philosophers in the early 21st century have written on the arts, including figures such as Slavoj Žižek, Alain Badiou, Giorgio Agamben, Jean-Luc Nancy, Michel Serres, Peter Sloterdijk, and Bernard Stiegler. Perhaps the most notable of these, however, is Jacques Rancière, whose distinctive work in aesthetics during the first fifteen years of the 21st century has revivified thinking on the relations between art and politics. We may therefore take Rancière’s work as an indicative example of early 21st century developments in continental aesthetics, while keeping in mind that this is just one of many important developments.

 b. Rancière

Rancière has become known for the idea of the ‘distribution of the sensible’, which suggests that systems of inclusion and exclusion, and of political relationships generally, don’t only operate on the conceptual or cognitive level, but on the sensory level. The idea of the distribution of the sensible captures the way the world is divided up according to sensations and the political implications of this. Rancière suggests that within communities there is a dimension of the sensible that is held in common by all members, allowing a common participation in the community as such, but that this is subdivided into parts, dividing members according to different areas of participation and non-participation. The distribution of the sensible concerns the circulation of words and images, the demarcation of spaces and times, and the forms of activity. It concerns the way that certain things are held to be meaningful or even self-evident to sense perception, while others are dismissed as meaningless noise. The different ways that what is held to be meaningful on a sensorial level in various contexts then affects what can meaningfully be thought, said, made, or done in those contexts. According to Rancière, social inequalities are in large part a result of this sensible distribution. A key implication of this idea is that art can be understood to be directly political on the level of the sensible (rather than indirectly, as simply representing ideas about social and political issues). Rancière’s politics is one of a non-utopian ideal of democratic emancipation, understood as the constant process of intervening in the current order to broaden spaces of participation and to open potentials of inclusion and participation where these are closed to parts of the community through the existing distribution of the sensible. Art can play an important political role by intervening in the existing order of distributions and helping to redistribute the sensible.

Rancière has also made a notable contribution to aesthetics in contesting the category of ‘modernism’, which has dominated much of the discourse around art history and aesthetic theory in the early 21st century. According to Rancière, modernism attempts to impose a single meaning and a single historical narrative on the course of developments in the arts, a course which he sees as more complex, involving multiple meanings and temporalities. Moreover, modernism abstracts developments in the arts from other social and cultural forms of collective experience, which, on the contrary, Rancière sees as co-determining them. In place of categories which organise artistic developments according to a simple linear historical progression, Rancière proposes three ‘regimes’ of the arts. These regimes operate to some degree in a historically periodising fashion—as different regimes have predominated in different historical periods—but they also complicate and cut across such periodisation. This is because they are not most fundamentally historical categories but, rather, ways that art is thought to operate or be significant, which can function in any historical period. Significantly, more than one regime of art can be operative at a single time. These regimes of art are 1. the ethical regime of images; 2. the representative (or poetic) regime of art; and 3. the aesthetic regime of art.

The ethical regime of images predominated in Ancient Greece and is exemplified by Plato’s discussion of images. Art does not emerge as a category here. Images are understood in relation to their effect on the ethos, or mode of behaviour, of members of the community, and they are interrogated according to their origin and their end, function, or purpose. In this regime, there are images which are thought to be truer or falser and to have a beneficial or detrimental effect on the ethical community. The representative regime of art predominated from the Renaissance to the nineteenth century. Here the idea of not just art but of a system of the arts emerged. Arts were thought of in terms of poetics; that is, sets of rules which determine the different forms of expression and arrange them in a hierarchy, and which also determine which forms of expression (arts, genres) are suitable for particular types of content. It is called the representative regime because this system of categorisation of the arts is organised around the key idea of representation, or mimesis, understood as a fit between form of expression and type of content. Finally, the aesthetic regime of the arts roughly corresponds with the experimentations more usually categorised with terms such as ‘modernism’ or ‘the avant-garde’. With this regime, the idea of art as something truly unique and singular emerges. However, this singularity is involved in a paradox, insofar as the rules for governing the arts characteristic of the representative regime also break down. In the aesthetic regime, art is asserted as a special kind of activity but, since anything can now count as art, there are no longer any criteria for distinguishing it from other forms of activity or production. While the aesthetic regime predominates in the contemporary world with its highly pluralist art scene, Rancière insists that all three regimes are still operative today to some degree.

8. References and Further Reading

  • Adorno, Theodor W. Aesthetic Theory. Edited by Gretel Adorno and Rolf Tiedemann, translated by Robert Hullot-Kentor. London and New York: Continuum, 1997.
    • Adorno’s major work on aesthetics.
  • Benjamin, Walter. ‘The Work of Art in the Age of Mechanical Reproduction’. In Illuminations, edited by Hannah Arendt, translated by Harry Zorn, 211–44. London: Pimlico, 1999.
    • The best known of the several versions of this highly influential essay, in which Benjamin develops the concept of artistic ‘aura’.
  • Cazeaux, Clive, ed. The Continental Aesthetics Reader, 2nd ed. London and New York: Routledge, 2011.
    • A collection of classic readings in aesthetics across the major traditions in continental philosophy, accompanied by insightful introductory essays.
  • Deleuze, Gilles, and Félix Guattari. What Is Philosophy? Translated by Hugh Tomlinson and Graham Burchell. London: Verso, 1994.
    • The chapter ‘Percept, Affect, and Concept’ condenses many aspects of Deleuze’s more extensive treatments of painting, film, and literature, and positions art in relation to philosophy and science.
  • Derrida, Jacques. Acts of Literature. Edited by Derek Attridge. London and New York: Routledge, 1992.
    • An edited collection of some of Derrida’s most important writings on literary topics, including essays on Mallarmé, Joyce, Kafka, Ponge, and Celan, and an interview with Derrida on literature.
  • Derrida, Jacques. The Truth in Painting. Translated by Geoff Bennington and Ian McLeod. Chicago and London: University of Chicago Press, 1987.
    • Derrida’s most well-known application of deconstructive strategies to aesthetic topics beyond literature. Contains the essay on Valerio Adami, ‘+R (Into the Bargain)’.
  • Freud, Sigmund. Writings on Art and Literature. Stanford: Stanford University Press, 1997.
    • A selection of Freud’s writings on aesthetic topics collected from James Strachey’s Standard Edition, which however does not include the two important texts listed below.
  • Freud, Sigmund. ‘Creative Writers and Day-Dreaming’. In Jensen’s ‘Gradiva’ and Other Works. Vol. 9 of The Standard Edition of the Complete Psychological Works of Sigmund Freud, edited by James Strachey, 141–54. London: Hogarth Press, 1959.
  • Freud, Sigmund. ‘Leonardo da Vinci and a Memory of His Childhood’. In Five Lectures on Psycho-Analysis, Leonardo da Vinci, and Other Works. Vol. 11 of The Standard Edition of the Complete Psychological Works of Sigmund Freud, edited by James Strachey, 63–138. London: Hogarth Press, 1957.
  • Gadamer, Hans-Georg. Truth and Method, 2nd revised ed. Translated by Joel Weinsheimer and Donald G. Marshall. London: Bloomsbury, 2013.
  • Gadamer, Hans-Georg. The Relevance of the Beautiful and Other Essays. Edited by Robert Bernasconi, translated by Nicholas Walker. Cambridge: Cambridge University Press, 1986.
  • Heidegger, Martin. Nietzsche: Volumes One and Two. Translated by David Farrell Krell. San Francisco: Harper Collins, 1991.
    • Volume one, ‘The Will to Power as Art’, presents Nietzsche’s view of art as holding a privileged ontological status and a value higher than truth.
  • Heidegger, Martin. ‘The Origin of the Work of Art’. In Off the Beaten Track, edited and translated by Julian Young and Kenneth Haynes, 1–56. Cambridge: Cambridge University Press, 2002.
    • Heidegger’s most extensive, significant, and well-known contribution to aesthetics.
  • Kearney, Richard and David Rasmussen, eds. Continental Aesthetics: Romanticism to Postmodernism: An Anthology. London: Wiley-Blackwell, 2001.
    • A useful collection of classic readings, though unaccompanied by any guiding text.
  • Lacan, Jacques. The Seminar of Jacques Lacan, Book XI: The Four Fundamental Concepts of Psychoanalysis. Edited by Jacques-Alain Miller, translated by Alan Sheridan. New York and London: W.W. Norton & Company, 1978.
    • Transcripts of Lacan’s seminar delivered in 1964 and first published in French in 1973. Contains the most influential of Lacan’s work relating to aesthetics, ‘Of the Gaze as Object petit a.’ A problematic translation, but still the only one available.
  • Lyotard, Jean-François. Discourse, Figure. Translated by Antony Hudek and Mary Lyton. Minneapolis: University of Minnesota Press, 2011.
    • The definitive statement of Lyotard’s early aesthetics, which stages an encounter between structuralist, phenomenological, and psychoanalytic approaches.
  • Lyotard, Jean-François. Writings on Contemporary Art and Artists. Edited by Herman Parret. Leuven: Leuven University Press, 2013.
    • Extensive, though still not exhaustive, bilingual (French and English) collection of Lyotard’s writings on aesthetic topics.
  • Merleau-Ponty, Maurice. The Merleau-Ponty Aesthetics Reader. Edited by Michael B. Smith. Evanston, Illinois: Northwestern University Press, 1993.
    • Contains the essays ‘Cézanne’s Doubt’, ‘Indirect Language and the Voices of Silence’, and ‘Eye and Mind’, along with introductory essays on each and a collection of critical essays on Merleau-Ponty’s philosophy of art.
  • Merleau-Ponty, Maurice. The Visible and the Invisible. Edited by Claude Lefort, translated by Alphonso Lingis. Evanston: Northwestern University Press, 1968.
    • Merleau-Ponty’s final, unfinished book. Contains the chapter ‘The Intertwining—The Chiasm’, which outlines the ontology developed in relation to painting in ‘Eye and Mind.’
  • Rancière, Jacques. Dissensus: On Politics and Aesthetics. Edited and translated by Steven Corcoran. London: Bloomsbury, 2010.
    • A collection of some of Rancière’s most important essays on the relationship of politics and aesthetics.
  • Rancière, Jacques. The Politics of Aesthetics. Edited and translated by Gabriel Rockhill. London: Bloomsbury, 2013.
    • A brief, accessible introduction to some of Rancière’s most important ideas in aesthetics, such as the distribution of the sensible, the critique of ‘modernism’ as an aesthetic category, and the three regimes of art.
  • Rancière, Jacques. Aisthesis: Scenes from the Aesthetic Regime of Art. Translated by Zakir Paul. London: Verso, 2013.
    • Rancière’s most complete and definitive work on aesthetics to date.

 

Author Information

Ashley Woodward
Email: a.z.woodward@dundee.ac.uk
University of Dundee
United Kingdom

Epistemology and Relativism

Epistemology is, roughly, the philosophical theory of knowledge, its nature and scope. What is the status of epistemological claims? Relativists regard the status of (at least some kinds of) epistemological claims as, in some way, relative—that is to say, the truths which (some kinds of) epistemological claims aspire to are relative truths. Self-described relativists vary, sometimes dramatically, in how they think about relative truth and what a commitment to it involves. Section 1 outlines some of these key differences and distinguishes, broadly, between two kinds of approaches to epistemic relativism. Proposals under the description of traditional epistemic relativism are the focus of Sections 2-4. These are: (i) arguments that appeal in some way to the Pyrrhonian problematic; (ii) arguments that appeal to apparently irreconcilable disagreements (for example, as in the famous dispute between Galileo and Bellarmine); and (iii) arguments that appeal to the alleged incommensurability of epistemic systems or frameworks. New (semantic) epistemic relativism, a linguistically motivated form of epistemic relativism defended with the most sophistication by John MacFarlane (for example, 2014), is the focus of Sections 5-6. According to MacFarlane’s brand of epistemic relativism, whether a given knowledge-ascribing sentence is true depends on the epistemic standards at play in what he calls the context of assessment, which is the context in which the knowledge ascription (for example, ‘Galileo knows the earth revolves around the sun’) is being assessed for truth or falsity. Because the very same knowledge ascription can be assessed for truth or falsity from indefinitely many perspectives, knowledge-ascribing sentences do not get their truth values absolutely, but only relatively. The article concludes by canvassing some of the potential ramifications this more contemporary form of epistemic relativism has for projects in mainstream epistemology.

Table of Contents

  1. Relativism in Epistemology: Two Approaches
  2. Traditional Arguments for Epistemic Relativism: The Pyrrhonian Argument
  3. Traditional Arguments for Epistemic Relativism: Non-Neutrality
  4. Traditional Arguments for Epistemic Relativism: Incommensurability and Circularity
  5. New (Semantic) Epistemic Relativism: Assessment-Sensitive Semantics for ‘Knows’
  6. New (Semantic) Epistemic Relativism: Issues and Implications in Epistemology
  7. References and Further Reading

1. Relativism in Epistemology: Two Approaches

“Relativism” is notoriously difficult to define. There are however some core insights about relativism that are more or less embraced across the board amongst self-described relativists. One such insight is negative, framed in terms of what relativists are characteristically united in denying. Take for example the following epistemological claims:

  a. Copernicus’s belief that the earth revolves around the sun is justified.
  b. Edmund does not know that the man who will get the job has ten coins in his pocket.
  c. Knowledge is not factorable into component parts.
  d. Beliefs formed on the basis of direct observation are better justified than beliefs formed on the basis of drug-induced wishful thinking.

Relativists of all stripes typically deny at least one—if not all—of the following: that the truth of claims like (a)-(d) is applicable to all times and frameworks; that they are objective (for example, at most trivially dependent on our judgments or beliefs) and monistic (for example, in the sense that competing claims are mutually exclusive) (see Baghramian and Carter (2015)). In some cases—a notable example here is Richard Rorty (1979)—philosophers have been labelled relativists primarily on the basis of their distinctive denial(s) of such claims about the status of these kinds of judgments.

Moreover, along with denying the sorts of claims characteristic of metaepistemological realism (for example, Cuneo 2007: Ch 3), the epistemic relativist is also committed to denying the metaepistemological analogues of non-relativist positions that are familiar territory in contemporary metaethics.

For example, contra epistemic error theory (for example, Olson 2009), which insists that claims like (a)-(d), which attribute epistemic properties, are categorically false, the epistemic relativist maintains that some such claims are true—albeit true in a way that is in some interesting sense ‘relative’. Likewise, contra the epistemic expressivist (for example, Chrisman 2007; Gibbard 1990; Field 1998), who insists that claims like (a)-(d) are expressions of attitude, the relativist is a cognitivist. Accordingly, the relativist maintains that (a)-(d) are truth-apt, while adding that the truth-aptness is not to be thought of as the realist thinks of it; expressions like (a)-(d) are relatively truth-apt in that the truths they aspire to are relative truths. (We consider shortly what this might involve, as the point is highly controversial amongst relativists.)

Another core insight about relativism, generally construed, is co-variance (for example, Baghramian 2004, 2014; Swoyer 2014). Co-variance is the idea that some object, x, depends on some underlying, independent variable, y, such that, in some suitably specified sense, change in the latter results in a change in the former. In embracing relativism about some class of truths, one thereby embraces some kind of co-variance claim. For example, a cultural relativist about epistemic justification tells us that the truth of claims such as (a) and (b) varies with local cultural norms, and in doing so holds that a change in cultural norms brings about a change in what counts as knowing, justifiably believing, and so forth.

Beyond these mostly uncontroversial ingredients of a relativist proposal—or necessary conditions for being a relativist—the matter of what is sufficient for a view to count as a relativist view is controversial. One influential approach to characterizing relativism has been put forward by Paul Boghossian (2006a). As Boghossian sees things, we can attribute to the epistemic relativist the following package of three claims: epistemic non-absolutism, epistemic relationism and epistemic pluralism.

Epistemic Relativism (Boghossian’s Formulation)

  A. There are no absolute facts about what belief a particular item of information justifies. (Epistemic non-absolutism)
  B. If a person, S’s, epistemic judgments are to have any prospect of being true, we must not construe his utterances of the form “E justifies belief B” as expressing the claim E justifies belief B but rather as expressing the claim: According to the epistemic system C, that I, S, accept, information E justifies belief B. (Epistemic relationism)
  C. There are many fundamentally different, genuinely alternative epistemic systems, but no facts by virtue of which one of these systems is more correct than any of the others. (Epistemic pluralism)

Boghossian’s model is often called the replacement model for formulating epistemic relativism. This is largely due to the inclusion of claim (B), the epistemic relationism thesis. In attributing relationism to the epistemic relativist, Boghossian (2006a: 84) regards the relativist as effectively endorsing the replacement of unqualified epistemic claims with explicitly relational ones. As he puts it:

[…] the relativist urges, we must reform our talk so that we no longer speak simply about what is justified by the evidence, but only about what is justified by the evidence according to the particular epistemic system that we happen to accept, noting, all the while, that there are no facts by virtue of which our particular system is more correct than any of the others.

One of the central moves Boghossian makes against the epistemic relativist in his monograph Fear of Knowledge is to argue that epistemic relativism—formulated as such—is ultimately an incoherent position. In response, some critics—notably Martin Kusch (2010)—have replied that epistemic relativism, formulated in accordance with the replacement model, is not incoherent for the reasons Boghossian suggests—or, at least, in Kusch’s case, that there is a version of this view that is defensible.

A comparatively deeper issue, however, and one that is prior to whether the replacement model leads to incoherence, is whether the inclusion of the relationist clause is an apt way of representing the relativist’s view. Though Boghossian and Kusch disagree on the matter of whether epistemic relativism formulated within the replacement model is tractable, both think that the framework is capable of characterising the epistemic relativist’s core position.

But this point is highly controversial. Crispin Wright (2008: 383), for instance, says of Boghossian’s inclusion of the relationist clause in formulating epistemic relativism:

We can envision an epistemic relativist feeling very distant from this characterisation and of its implicit perception of the situation.

Wright’s complaint, in the main, is that insisting on the relationist clause is tantamount to insisting that the only way the relativist (who must reject absolute facts about what justifies what) can make sense of how claims of the form ‘S is justified in believing X’ are true (at all) is by construing their content in an explicitly relational way, so that the explicitly relational truths (for example, ‘S is justified in believing X, according to system A’) are themselves candidates for absolute truth.

But this, Wright says:

[…] is just to fail to take seriously the thesis that claims such as [sic … S is justified in believing X] can indeed be true or false, albeit, only relatively so. (Ibid., 383, my italics).

Wright’s complaint, as quoted in this passage, gestures to what is probably the most substantial divide in the contemporary landscape in relation to epistemic relativism. There are really two important and connected ideas that need unpacking here. The first has to do with charity, and the second has to do with inclusiveness.

Regarding charity: to the extent that one insists that epistemic relationism is an indispensable component of epistemic relativism, one is de facto excluding (by viewing as tacitly unintelligible) the thought that non-explicitly relational claims (for example, ‘S is justified in believing p’) can be true or false, albeit only relatively so. And so if it turns out that this excluded possibility is a viable one, then the attribution to the relativist of the relationist clause is not a suitably charitable way of formulating the relativist’s position.

New (semantic) relativists—whose motivations draw from analytic philosophy of language—regard this excluded possibility as not only viable, but moreover, the only legitimate way to capture a philosophically interesting kind of relativist position. The rationale for thinking this way has been articulated most notably by John MacFarlane (for example, 2007, 2011, 2014). MacFarlane’s work over the past decade has stressed that simply relativizing propositional truth to what seem like exotic parameters (that is, parameters other than worlds and times, such as judges, perspectives, or standards, including epistemic standards) is not in itself ‘enough to make one a relativist about truth in the most philosophically interesting sense’. This is because such relativization is compatible with truth absolutism, and MacFarlane’s position is that philosophically interesting relativism must part ways with the absolutist.

Consider, for example, that the epistemic contextualist (for example Cohen 1988; DeRose 1992, 2009) insists that whether ‘S knows that p’ is true can shift with different standards at play in different contexts in which the sentence ‘S knows that p’ is used. This is because, for the contextualist, my utterance of “Keith knows the bank is open” can express different propositions depending on the context in which I use this sentence. If I use the sentence in a context in which it doesn’t matter to me whether Keith knows the bank is open, what I’ve asserted can be true even if the very same sentence would come out false if uttered in a context in which it is extremely important to me that the bank is open; and for the contextualist, this is so even if all other epistemically relevant features of Keith’s situation (for example what evidence Keith has for thinking the bank is open) are held fixed across these contexts of use. When knowledge ‘is relative to an epistemic standard’ in the way that the contextualist relativizes knowledge to an epistemic standard, it remains the case that a particular occurrence of ‘knows’ used in a particular context gets its truth value absolutely. A philosophically interesting relativist, as MacFarlane sees it, denies this. The line, according to MacFarlane, between the (genuine) relativist and the non-relativist is best understood as a line ‘between views that allow truth to vary with the context of assessment and those that do not’ (2014, vi). A context of assessment is a possible situation in which a use of a sentence might be assessed, where the agent of the context is the assessor of the use of a sentence. This view is described in more detail in Section 5.

This brings us to the point about inclusiveness. From the perspective of a new (semantic) relativist like MacFarlane, the kind of position described by Boghossian as epistemic relativism is not really an interesting relativist position. Boghossian’s epistemic relativist, modelled on Gilbert Harman’s (1975) moral relativism, is (by MacFarlane’s lights) best understood as a version of contextualism (see MacFarlane (2014: 33, fn. 5)). After all, (à la epistemic relationism) the explicitly relational claims which Boghossian regards the relativist as in the market for putting forward as true are candidates for absolute truth.

This article does not attempt to adjudicate which kind of approach to thinking about relativism, more generally, is the right one. Rather, the article is divided into two main parts: (i) arguments for epistemic relativism which do not give a context of assessment a significant semantic role (Sections 2-4), which are termed traditional arguments for epistemic relativism, and (ii) arguments that do, which are termed new (semantic) epistemic relativism (Sections 5-6). The former kinds of arguments are not primarily motivated by considerations to do with how we use language, whereas the latter kind of argument strategy (the focus of Sections 5-6) is.

2. Traditional Arguments for Epistemic Relativism: The Pyrrhonian Argument

One influential argument strategy under the banner of epistemic relativism takes as a starting point a famous philosophical puzzle traditionally associated with Pyrrhonian skepticism, that is to say, the Pyrrhonian problematic. The most famous version of the puzzle, the ‘regress’ version of the problematic, goes as follows (the simple presentation here is owed to John Greco (2013, 179)). Suppose you claim to know that p is true but you are asked to provide a good reason for p. If it is granted that good reasons (for example, the sort of reasons good enough to epistemically justify a belief) are non-arbitrary reasons, reasons that we have good reason to believe, then a regress threatens. The idea is that, at least with the above assumptions in place, it looks as though knowledge as well as epistemic justification require an infinite number of good reasons. But it seems that this is something we do not have, and thus, as the puzzle goes, it looks like we do not know or justifiably believe anything. With reference to this puzzle, the sceptic effectively places the onus on her non-sceptical adversary to reject one or more of the assumptions underwriting the puzzle. Foundationalism, coherentism and infinitism are typically distinguished from one another with reference to which assumption(s) is rejected.

Against this background, Howard Sankey (2010; 2011; 2012) has argued, in a series of papers, that the Pyrrhonian problematic offers the tools to capture the most compelling argument strategy available to the epistemic relativist; in one place, he writes that the ancient Pyrrhonian argument “constitutes the foundation for contemporary epistemic relativism” (Sankey 2012, 184, my italics).

Sankey’s argument comes primarily in two parts: a negative part and a positive part. Before outlining the negative part, some terminology is helpful. Sankey (2013: 3) defines epistemic relativism in a restricted way: as a view about epistemic norms, where he defines an epistemic norm as ‘a criterion or rule that may be employed to justify a belief’. Epistemic relativism is then defined as the thesis that there are no epistemic norms over and above the variable epistemic norms operative in different (local) cultural settings or contexts, where these local contexts are defined as always including at least a system of beliefs and a set of norms (Sankey 2012, 187). For Sankey’s relativist, whether a belief is justified, or counts as knowledge, depends on epistemic norms, and so, given that different epistemic norms can operate in different contexts, the same belief might be rational/justified/knowledge relative to one context, and not to another.

Sankey’s ‘negative’ argument on behalf of the relativist appeals to the Pyrrhonian puzzle to generate the intermediate conclusion that all epistemic norms are on equal standing; his positive argument moves from the equal standing claim established by the negative argument to the conclusion that epistemic relativism (as he has defined it) is true. The negative argument can be summarized as follows: Take an epistemic norm, N1. Question: how is N1 to be justified? With reference to the Pyrrhonian puzzle, the options don’t look very promising. One option is to justify N1 by appealing to a further epistemic norm N2. Another option is to justify N1 by appealing to N1. Sankey says neither of these options satisfactorily justifies N1; the former generates an infinite regress, the latter is viciously circular. Now: take any other epistemic norms, N3, N4 … Nn. By running through this same line of thinking with any of N3, N4 … Nn in an attempt to justify any of these norms, we end up in the same place. That is, each of N1 and N3, N4 … Nn is equally lacking in justification. From here, Sankey’s positive move (for example see Sankey 2011 §3, esp. pp. 564-566) on behalf of the relativist goes as follows:

If no norm is better justified than any other, all norms have equal standing. Since it is not possible to provide an ultimate grounding for any set of norms, the only possible form of justification is justification on the basis of a set of operative norms. Thus, the norms operative within a particular context provide justification for beliefs formed within that context. Those who occupy a different context in which different norms are operative are justified by the norms which apply in that context… the relativist is now in a position to claim that epistemic justification is relative to locally operative norms.

Sankey himself, not a relativist, attempts a naturalistically motivated overriding strategy in response to the argument, one which grants the relativistic challenge as legitimate and then attempts to meet the challenge (2010). Carter (2016) and Seidel (2013) by contrast have proposed undercutting responses which call into question whether the relativist can viably use the argument strategy which Sankey regards as the epistemic relativist’s strongest play. Carter (2016, Ch. 3) challenges the first (negative) part of the argument by noting that the intermediate conclusion (that all norms are equally justified) is one the would-be relativist is entitled to only if it is already granted that foundationalism, coherentism and infinitism are all unsuccessful. But Sankey’s relativist offers no positive case for this, and instead takes it for granted.

Carter (2016) and Markus Seidel (2013, 137) have both expressed worries that, even if the first part of the argument were granted (and so, even if it were granted that the Pyrrhonian strategy is effective in establishing that all epistemic norms are on equal epistemic standing), it’s not clear how relativism is to be motivated over scepticism. As Seidel puts it, Sankey’s relativist actually travels so far down the road with the sceptic that the relativist is “at pains to provide us with reasons [for the relativist to] part company” (137). That is: once it has been claimed that all norms are equally unjustified (no norm is more justified than any other in any way), it is not apparent, as Seidel observes, how locally credible epistemic norms are supposed to have any positive epistemic status, positive status the relativist wants to preserve when insisting that epistemic norms aspire to relative justification.

For an alternative perspective on how relativism might be better motivated than scepticism, generally speaking, see Michael Williams (for example, 1991; 2001), who defends an anti-sceptical form of relativism (though he rejects this label), specifically a Wittgensteinian-inspired brand of contextualism (compare DeRose 1992), as an alternative to both scepticism and metaepistemological realism.

3. Traditional Arguments for Epistemic Relativism: Non-Neutrality

Another kind of argument for traditional epistemic relativism is what Harvey Siegel (2011: 205) has termed the non-neutrality argument. A much-discussed reference point for this argument strategy is Rorty’s (1979) discussion of the famous dispute between Galileo and Cardinal Bellarmine about Copernican heliocentrism. In short, Galileo and Cardinal Bellarmine could not agree about the truth of Copernican heliocentrism, but even more, they also could not agree about what evidential standards were even relevant to settling the matter. Galileo had argued for the Copernican picture on the basis of telescopic evidence. Cardinal Bellarmine dismissed Galileo’s suggestion that Earth revolves around the sun as heretical, by appeal to Scripture. From these disparate starting points, Rorty noted, it looked as though neither was in a position to appeal to neutral ground in the service of rational adjudication—each was operating within a different “grid which determines what sorts of evidence there could be for statements about the movements of the planets” (Rorty 1979: 330-331).

Siegel (2011: 205-206) captures, with reference to this case, the relativist’s reasoning as follows:

The relativist here claims that there can be no non-relative resolution of the dispute concerning the existence of the moons, precisely because there is no neutral, non-question-begging way to resolve the dispute concerning the standards. Any proposed meta-standard that favors regarding naked eye observation, Scripture, or the writings of Aristotle as the relevant standard by which to evaluate “the moons exist” will be judged by Galileo as unfairly favoring his opponents since he thinks he has good reasons to reject the epistemic authority of all these proposed standards; likewise, any proposed metastandard that favors Galileo’s preferred standard, telescopic observation, will be judged to be unfair by his opponents, who claim to have good reasons to reject that proposed standard. In this way, the absence of neutral (meta-) standards seems to make the case for relativism.

The pro-relativist argument that is motivated by the Galileo/Bellarmine dispute, which Siegel (2011: 206) calls “No Neutrality, Therefore Relativism”, as represented in Siegel’s passage, can be pared down to the following argument:

“No Neutrality, Therefore Relativism”

  1. There can be a non-relative resolution of the dispute concerning the existence of the moons, only if there is an appropriately neutral meta-norm available.
  2. In the context of the dispute between Galileo and Bellarmine, no such meta-norm is available.
  3. Therefore, it is not the case that there can be a non-relative resolution of the dispute concerning the existence of the moons.
  4. Therefore, epistemic relativism is true.

As stated, the argument is not valid. In order to make the argument valid, a further ‘bridge’ premise (or premises) would be needed to get from (3), the intermediate conclusion that there can be no non-relative resolution of the dispute concerning the moons [or some similar such dispute], to the final conclusion (4) that epistemic relativism is true.
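The point about validity can be illustrated with a small truth-table check. What follows is only a sketch under stated assumptions: the propositional letters R, N and E, the helper is_valid, and the particular bridge premise ("if there is no non-relative resolution, then relativism is true") are illustrative choices, not part of Siegel’s or anyone else’s presentation of the argument.

```python
from itertools import product

def is_valid(premises, conclusion, n_vars):
    """Valid iff no truth assignment makes all premises true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# Hypothetical propositional letters (assumptions made for this sketch):
#   R: a non-relative resolution of the dispute is possible
#   N: an appropriately neutral meta-norm is available
#   E: epistemic relativism is true
premise_1 = lambda R, N, E: (not R) or N   # (1) R only if N
premise_2 = lambda R, N, E: not N          # (2) no neutral meta-norm
step_3 = lambda R, N, E: not R             # (3) no non-relative resolution
conclusion_4 = lambda R, N, E: E           # (4) relativism is true
bridge = lambda R, N, E: R or E            # assumed bridge: if not R, then E

steps_1_to_3_valid = is_valid([premise_1, premise_2], step_3, 3)
as_stated_valid = is_valid([premise_1, premise_2], conclusion_4, 3)
with_bridge_valid = is_valid([premise_1, premise_2, bridge], conclusion_4, 3)

print(steps_1_to_3_valid)  # True: (3) follows from (1) and (2)
print(as_stated_valid)     # False: (4) does not follow without a bridge
print(with_bridge_valid)   # True: adding the bridge closes the gap
```

The check only shows that some bridge premise is formally required; whether any such premise is plausible is the philosophical question taken up next.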

What are the prospects of ‘bridging’ (3) and (4)? The viability of a no-neutrality therefore relativism-style argument rests importantly on this question. Steven Hales (2014) defends a version of the no-neutrality therefore relativism argument which attempts to bridge the gap (between (3) and (4)) via a process of elimination. Hales argues, with reference to a case involving a similarly deadlocked dispute concerning the nature of the human soul (by interlocutors who adhere to analytic philosophy of mind and the Catechism, respectively), that, from their irreconcilable position, the salient options for resolving the dispute are: (i) keep arguing until capitulation, (ii) compromise, (iii) locate an ambiguity or contextual factors, (iv) accept scepticism, or (v) adopt relativism (Hales 2014: 63). Relativism is defended by Hales as the most satisfactory option.

Carter (2015, Ch. 4) has criticised this strategy. For one thing, appealing to relativism’s success as a disagreement-resolution strategy doesn’t obviously help move one from (3) to (4). For example, even if both parties can easily resolve their disagreement by adopting the belief that relativism is true, relativism might just as well be false. More generally, that interlocutors’ accepting something X is efficacious in resolving a dispute is not satisfactory grounds for thinking X is true or even probably true. Furthermore, Hales’ process of elimination strategy dismisses skepticism out of hand as “throwing in the towel.” However, this just reinvites the issue of why relativism should be (in the face of the no-neutrality, therefore relativism argument) regarded as motivated over skepticism. As with Sankey’s redeployment of the Pyrrhonian argument considered in Section 2, it is not clear how this is so.

It is worth noting that the no-neutrality therefore relativism argument is but one way philosophers have attempted to motivate relativism by pointing to disagreements. Another route is to appeal to what Max Kölbel (2003) calls “faultless disagreements” (for example, apparently genuine disagreements in some discretionary area of discourse where it seems neither party to the disagreement has made a mistake). These faultless disagreement strategies which appeal to disagreements to motivate relativism, and the neutrality-based strategy considered in this section, are only superficially similar. Unlike the no-neutrality, therefore relativism argument, faultless-disagreement arguments simply do not regard properties of any particular disagreement (for example, the disagreement between Bellarmine and Galileo) as in the market for establishing epistemic relativism. Faultless disagreement-style arguments reason from semantic and pragmatic evidence about disagreement patterns, much more generally, to the conclusion that a relativist semantics (in certain domains where we find such disagreements) best explains our practices of attributing certain terms. This kind of argument is discussed in more detail in Section 5, as it is an argument strategy used by new (semantic) epistemic relativists.

4. Traditional Arguments for Epistemic Relativism: Incommensurability and Circularity

A third kind of argument which has motivated versions of epistemic relativism appeals to incommensurability and epistemic circularity. The idea is that, upon confronting radically different epistemic systems (for example, radically different Kuhnian paradigms, Wittgensteinian framework propositions or individuals who employ what Ian Hacking (1982) calls alien ‘styles of reasoning’) we are called upon to justify not just ordinary beliefs as we usually do, but rather the very epistemic system (that is, the set of epistemic principles or rules) within which our epistemic evaluations are made. However, once we begin to attempt to justify our own epistemic system, epistemic circularity threatens. Michael Williams (2007: 3-4) expresses the idea on behalf of the relativist as follows:

In determining whether a belief—any belief—is justified, we always rely, implicitly or explicitly, on an epistemic framework: some standards or procedures that separate justified from unjustified convictions. But what about the claims embodied in the framework itself: are they justified? In answering this question, we inevitably apply our own epistemic framework. So, assuming that our framework is coherent and does not undermine itself, the best we can hope for is a justification that is epistemically circular, employing our epistemic framework in support of itself. Since this procedure can be followed by anyone, whatever his epistemic framework, all such frameworks, provided they are coherent, are equally defensible (or indefensible).

There are really two ‘key moves’ in this line of thinking. The first key move contends that, in the face of epistemic systems radically different from our own, our attempt to justify our own epistemic system will lead to epistemic circularity. The second key move adverts to the claim that all attempts to justify epistemic systems result in epistemic circularity, and from this claim draws the relativist-friendly conclusion that all epistemic systems are equally defensible, or on a par.

The first move, stated more carefully, seems to be that, when an individual S is in a position where S is trying to justify S’s own epistemic framework or system, X, by attempting to justify the claims that comprise the system (x1 … xn), then: (i) S must (inevitably) apply that system (X); and (ii) the application, by S, of a system X to justify the claims (x1 … xn) of that very system, X, is sufficient for leaving S’s epistemic justification for the claims of X (x1 … xn) circular.

From here, it is helpful to note three central issues which are relevant to the success of this kind of ‘pro-relativist’ strategy, in so far as the kind of epistemic circularity that is supposed to materialise via the application of a system in its own defence is itself of a sort that will leave all epistemic systems equally defensible. The first two issues concern the first key move and the third concerns the second key move.

Firstly, note that it seems in principle possible to pre-empt epistemic circularity altogether by simply rejecting that the justification of S’s epistemic framework depends on S’s ability to non-circularly justify that framework. Consider, for example, the line an externalist reliabilist might take. The process reliabilist (for example, Goldman 1979) might say that the epistemic principles constituting S’s epistemic system (X) are justified simply provided they are reliable and regardless of whether one can successfully justify or know that they are reliable. Compare here the reliabilist’s commitment to basic knowledge, that is to say, the view that S can know p even though S has no antecedent knowledge that the process R that produced S’s belief is reliable. Likewise, as this idea goes at greater generality, the reliabilist is in a position to submit that any positive epistemic status possessed by the belief that our own epistemic principles are correct does not depend on any antecedent facts about our appreciation that they have this status. The reliabilist thus attempts to undercut the circularity objection by mooting it.

Two salient replies to this line of reasoning have to do with assertion and bootstrapping, respectively. Regarding assertion: as Mikkel Gerken (2012, 379) has suggested, although some conversational contexts are ones where “S may assert something although S is unable to provide any reason for it”, other contexts may not be permissive in this way. Discursive contexts are on Gerken’s view ones where “interlocutors share a presupposition that an asserter must be able to back up unqualified assertions by reasons… and in which being a cooperative speaker involves being sensitive to reasons for and against what is asserted” (2012, 379). Gerken’s position is that, in such contexts, epistemically appropriate assertion must be discursively justified, where discursive justification is something S possesses only if S is able to articulate some epistemic reasons for believing that p. But if this is right, then there is a case to make that while an externalist line such as the one sketched above cuts epistemic circularity off at the pass, it does so in a way that would effectively leave one in no position to claim (in the face of a challenge from an interlocutor with a radically different epistemic system) to know that one’s own system is correct.

A second salient kind of reply to the externalist move is to suggest, in short, that even if (with reference to the Williams passage quoted above) it looks as though epistemic circularity materialises only once one uses the epistemic principles constituting one’s own epistemic system in the service of justifying it, this might be misleading. The idea here is that if one attempts to cut this kind of epistemic circularity off at the pass, by opting for the reliabilist move sketched above, then one at the same time (at least, potentially) encounters what is allegedly another malignant form of epistemic circularity in the form of bootstrapping (for example, Vogel 2000), that is to say, that one would be in a position to acquire track-record evidence, via the deliverances of applying one’s own epistemic principles, that the application of one’s own epistemic principles is reliable. This point, in conjunction with the previous point about assertion, suggests that the kind of circularity problem Williams intimates can’t be simply circumvented by ‘going externalist’ without also incurring some further challenges.

So the viability of an attempt to block epistemic circularity ex ante by “going externalist” was the first of three issues to highlight relevant to the viability of the kind of argument strategy Williams describes. The second issue concerns the nature of the epistemic circularity in question, which, on this line of argument, is said to materialise when one attempts to justify one’s epistemic system by appealing to it. Consider that there are in fact two very different ways in which one might apply an epistemic principle or rule in the service of justifying one’s epistemic system (where, again, the epistemic system is understood as a set of epistemic principles).

Firstly, one might apply a principle by simply following it (for example as when one might follow an inference rule in the service of justifying that inference rule or perhaps justifying the epistemic system of which the inference rule is a part). See Boghossian (2001). However, just as a judge might apply a rule (consider, the rule that ‘one must drive only with a license’) not by following the rule but by invoking its authority (for example McCallum 1966), one might also apply one’s own epistemic principle or principles not by following them but by invoking their authority. For example, one might attempt to justify inference to the best explanation (IBE) by invoking the authority of the wider system of epistemic principles within which IBE belongs: Western Science.

The overarching point here is that the kind of epistemic circularity that materialises as a function of one’s appealing to one’s epistemic system in the service of justifying it can take on different shapes—with different kinds of premise-conclusion dependence relations. Accordingly, an argument that attempts to move from epistemic circularity to relativism must thus be appropriately sensitive to these different shapes epistemic circularity can potentially take on when one applies one’s own epistemic system in the service of justifying it. This is because it is not obvious that all such shapes are equally epistemically objectionable. (For discussion on this point, see Pryor 2004 and Wright 2007).

The third issue to raise concerns the second ‘key move’ in the sequence Williams describes: the move that is supposed to get us from circularity to relativism. Even on the assumption that the kind of epistemically circular justification one is left with for one’s own epistemic principles (and more generally, one’s epistemic system) places all epistemic principles on an ‘equal footing’, this equal-footing result is compatible with scepticism as well as relativism. An argument successfully establishes epistemic relativism from the position described only if it provides a non-arbitrary reason to embrace relativism over scepticism.

5. New (Semantic) Epistemic Relativism: Assessment-Sensitive Semantics for ‘Knows’

One recurring objection-type to traditional arguments for epistemic relativism (of the sort surveyed in §§2-4) is that these arguments face a shared difficulty when it comes to showing why, in light of the philosophical considerations adverted to, relativism is at the end of the day a more attractive option than skepticism. New (semantic) epistemic relativism doesn’t face this kind of challenge. This is because new (semantic) relativism (hereafter, new relativism) is motivated on the basis of very different kinds of philosophical considerations than the argument strategies considered in §§2-4.

The present section is organised as follows: two preliminary points about new relativism are first noted, and then MacFarlane’s most substantial (2014) argument for an assessment-sensitive semantics for “knows” is outlined; it is an argument that depends on two key premises, and MacFarlane’s rationale for defending these premises is discussed in some depth. Note that while there are other ways of motivating semantic relativism that do not appeal explicitly to ‘contexts of assessment’ (for example, Richard 2004; Egan 2007), which is MacFarlane’s distinctive terminology, I am in what follows focusing on MacFarlane’s presentation, as it is the most developed.

That said, the first preliminary point to note concerns the relationship between epistemic contextualism and relativism. As was noted in Section 1, epistemic contextualism is, by MacFarlane’s lights, not on the interesting side of the line between absolutism and relativism. The point to stress here is that while the contextualist can, no less than the relativist, recognize a ‘standards’ parameter (and in this respect can allow the extension of “knows” to vary with standards), for the contextualist, its value will be supplied by the context of use, whereas the relativist (proper) takes it to be supplied completely independently of the context of use, by the context of assessment.
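The structural difference between the two views can be made concrete with a toy model. This is a minimal sketch, not MacFarlane’s formal semantics: the classes Context and KnowledgeClaim, the integer scale for standards and epistemic positions, and the two evaluation functions are all hypothetical simplifications introduced only to show where the value of the standards parameter comes from on each view.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    """A context, reduced to the epistemic standard in force there (hypothetical scale)."""
    label: str
    standard: int

@dataclass(frozen=True)
class KnowledgeClaim:
    """A use of 'S knows that p', with the strength of S's epistemic position."""
    sentence: str
    epistemic_position: int
    context_of_use: Context

def contextualist_truth(claim: KnowledgeClaim, context_of_assessment: Context) -> bool:
    # Standards come from the context of use; the assessor's context plays no role.
    return claim.epistemic_position >= claim.context_of_use.standard

def relativist_truth(claim: KnowledgeClaim, context_of_assessment: Context) -> bool:
    # Standards come from the context of assessment, independently of the context of use.
    return claim.epistemic_position >= context_of_assessment.standard

low = Context("low standards", standard=1)
high = Context("high standards", standard=3)
claim = KnowledgeClaim("Keith knows the bank is open", epistemic_position=2, context_of_use=low)

contextualist_verdicts = [contextualist_truth(claim, c) for c in (low, high)]
relativist_verdicts = [relativist_truth(claim, c) for c in (low, high)]
print(contextualist_verdicts)  # [True, True]
print(relativist_verdicts)     # [True, False]
```

On the contextualist evaluation, a single use of the sentence receives the same verdict whoever assesses it; on the relativist evaluation, the verdict on that same use shifts with the assessor’s standards.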

The second preliminary remark concerns the rationale for embracing a MacFarlane-style relativist semantics for “knows”, which should be understood as differing from the kind of rationale we find in Lewis’s (1980) and Kaplan’s (1989) foundational work in semantics, according to which sentence truth was relativized to familiar parameters such as worlds, times and locations. The important point here is that while Lewis’s and Kaplan’s reasons for “proliferating” parameters were primarily based on considerations to do with intensional operators, the more contemporary reasons (for example, as appealed to by MacFarlane and other ‘new relativists’) for adding a standards parameter (that is, in the context of assessment) are often to do with respecting linguistic use data, for example disagreement data (see, for example, Baghramian and Carter 2015). For example, those who endorse truth-relativism about predicates of personal taste (for example Lasersohn 2005; Kölbel 2003; MacFarlane 2014) take a truth-relativist semantics to better explain our patterns of using terms like “tasty” than do competing contextualist, sensitive invariantist and insensitive invariantist semantics. Accordingly, defending new relativism typically involves, for some area of discourse D, a philosophical comparison of costs and benefits of different competing semantic approaches to the relevant D expressions, replete with a case for thinking that the truth-relativist approach, all things considered, performs the best. A familiar such claimed advantage by a MacFarlane-style truth-relativist is that the kind of ‘subjectivity’ (for example standards-dependence) the contextualist claims the traditional invariantist cannot explain can be captured by the relativist without (or so the relativist tells us) “losing disagreement”, where losing disagreement is a stock objection to contextualism in areas where disagreements appear genuine.

In three different places, MacFarlane (2005, 2009, 2014) has argued that knowledge attributions of the form “S knows that p” are assessment-sensitive. The focus of his presentation has varied across these three defenses of the view, but one core strand of thought resurfaces each time.

For convenience, we can call this core strand MacFarlane’s “master argument” for an assessment-sensitive semantics for knowledge attributions.

Master Argument for Assessment-Sensitive Semantics for Knowledge Attributions

(1) Standard invariantism, contextualism and SSI all have advantages and weaknesses.

(2) Relativism preserves the advantages while avoiding the disadvantages.

(3) Therefore, prima facie, we should be relativists about knowledge attributions.

The remainder of this section attempts to show why MacFarlane thinks that premises (1) and (2) of the master argument are true, and thus why he thinks we should embrace a relativist treatment of “knows”. The discussion to this end draws primarily from MacFarlane’s latest presentation of his relativist treatment of “knows”, one which gives the notion of relevant alternatives a central place.

Question: Why should we think (1) is true? As MacFarlane sees things, each of the three standard views of the semantics of knowledge-attributions—standard invariantism, contextualism and subject-sensitive invariantism (SSI)—has a grain of truth to it, as well as an “Achilles heel: a residuum of facts about our use of knowledge attributions that it can explain only with special pleading” (2005, 197).

His latest way of making this point relies on a kind of sceptical “conundrum”, one which arises in light of our ordinary practices of attributing knowledge, and which he uses as a frame of reference for magnifying what he regards as the salient weaknesses of the three standard views.

MacFarlane’s Conundrum: If you ask me whether I know that I have two dollars in my pocket, I will say that I do. I remember getting two dollar bills this morning as change for my breakfast; I would have stuffed them into my pocket, and I haven’t bought anything else since. On the other hand, if you ask me whether I know that my pockets have not been picked in the last few hours, I will say that I do not. Pickpockets are stealthy; one doesn’t always notice them. But how can I know that I have two dollars in my pocket if I don’t know that my pockets haven’t been picked? After all, if my pockets were picked, then I don’t have two dollars in my pocket. It is tempting to concede that I don’t know that I have two dollars in my pocket. And this capitulation seems harmless enough. All I have to do to gain the knowledge I thought I had is check my pockets. But we can play the same game again. I see the bills I received this morning. They are right there in my pocket. But can I rule out the possibility that they are counterfeits? Surely not. I don’t have the special skills that are needed to tell counterfeit from genuine bills. How, then, can I know that I have two dollars in my pocket? After all, if the bills are counterfeit, then I don’t have two dollars in my pocket (2014: 177).

MacFarlane articulates the form of the conundrum-argument as follows:

(i) p obviously entails q. [premise]

(ii) If a knows that p, then a could come to know that q without further empirical investigation. [i, Closure]

(iii) a does not know that q and could not come to know that q without further empirical investigation. [premise]

(iv) Hence a does not know that p. [ii, iii, modus tollens]
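The logical shape of the conundrum-argument can be set out schematically. The notation below is an illustrative assumption rather than MacFarlane's own: K_a p abbreviates "a knows that p", C_a q abbreviates "a could come to know that q without further empirical investigation", and the subscripted arrow marks obvious entailment. Note that the modus tollens step uses only the second conjunct of (iii).

```latex
% Requires the amsmath package.
\begin{align*}
&\text{(i)}   && p \Rightarrow_{\mathrm{obv}} q        && \text{premise}\\
&\text{(ii)}  && K_a\, p \rightarrow C_a\, q           && \text{from (i), Closure}\\
&\text{(iii)} && \neg K_a\, q \wedge \neg C_a\, q      && \text{premise}\\
&\text{(iv)}  && \neg K_a\, p                          && \text{from (ii), (iii), modus tollens}
\end{align*}
```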

Standard (insensitive) invariantism, the view that the epistemic standards that must be met for “S knows p” to be true are not (in any way) context sensitive, faces two central problems, by MacFarlane’s lights. Both problems are familiar. Firstly, standard invariantism has trouble making sense of the variability of our willingness to attribute knowledge. Secondly, standard invariantism seems stuck with an unhappy choice among three options: embracing scepticism (if the invariantist simply accepts (iv)), embracing dogmatism (if the invariantist tries to avoid the sceptical conclusion (iv) by rejecting (iii)), or rejecting the closure principle which licenses the move from (i) to (ii), that is to say, the principle that (as MacFarlane states it) ‘if a knows that p, and p obviously entails q, then a could come to know q without further empirical investigation’ (2014, 177).

By contrast, contextualism offers a way to avoid each of these problems facing standard invariantism. Unlike the invariantist, whose position is in tension with data about the variability of our willingness to attribute knowledge, the contextualist has an explanation to offer for this variability: namely, our willingness to attribute knowledge varies across contexts because what is meant by “knows” is sensitive to the context in which it is used. As MacFarlane writes, “on the most natural form of this view, ‘knowing’ that p requires being able to rule out contextually relevant alternatives to p. Which alternatives are relevant depends on the context”. For instance, and with reference to MacFarlane’s Conundrum, when I’m first asked whether I know p (that I have two dollars in my pocket), ‘knowing’ that p requires only that I be able to rule out very basic alternatives (for example, that I have already spent the two dollars); I needn’t also be able to rule out that my pockets have been picked to count as ‘knowing’ (Ibid., p. 177). But when someone asks me whether my pockets have been picked, ‘knowing’ requires ruling out this alternative, and if I can’t, then the standard required for ‘knowing’ in this context is not met. Contextualism not only makes sense of the variability of our willingness to attribute knowledge, but also avoids the unpalatable dilemma facing standard invariantism: reject closure or embrace scepticism or dogmatism. As the standard line goes, contextualists needn’t be tarred as sceptics or dogmatists because they can in fact preserve closure, at least within any one context of use. So contextualism is looking pretty good.

However, although treating “knows” like “tall” (where the meaning of “knows” depends on the context in which it is being used) offers a nice escape route (vis-à-vis MacFarlane’s Conundrum), there are other respects in which treating “knows” like “tall” raises new problems. For example, an apparent disagreement between A and B about whether Michael Jordan is tall is quickly revealed to be no disagreement at all when it is clear to both parties that A means “tall for a given person” and B means “tall for an NBA player”. However, as MacFarlane notes, things are different with “know”. He writes:

If I say “I know that I have two dollars in my pocket,” and you later say, “You didn’t know that you had two dollars in your pocket, because you couldn’t rule out the possibility that the bills were counterfeit,” I will naturally take your claim to be a challenge to my own, which I will consider myself obliged either to defend or to withdraw. It does not seem an option for me to say, as the contextualist account would suggest I should: “Yes, you’re right, I didn’t know. Still, what I said was true, and I stick by it. I only meant that I could rule out the alternatives that were relevant then.” Similarly, the skeptic regards herself as disagreeing with ordinary knowledge claims—otherwise skepticism would not be very interesting. But if the contextualist is right, this is just a confusion (Ibid., p. 181; compare, Vogel 1990).

And here is where the special pleading comes in. The contextualist can attempt to say that our taking each other to agree or disagree in the relevant kinds of cases is just a mistake of some sort. But, as MacFarlane sees it, this is a double-edged sword: the more speaker error the contextualist must posit to explain the way we use “knows”, the less the contextualist can rely on the way we use “knows” to support contextualism. While contextualism does better than standard invariantism in that it avoids the dilemma raised for standard invariantism, standard invariantism makes better sense of disagreement.

By contrast with insensitive invariantism and contextualism, subject-sensitive invariantism (‘SSI’) might have the best offer to make yet. According to SSI, whether my utterance of “Archie knows that his car is in the parking lot” is true does depend on context, though in a different sense than it does for the contextualist: rather than depending on what alternatives I (the utterer of the sentence) can rule out (for example, whether or not I know there are no thieves lurking nearby), what matters on SSI is whether Archie, the subject of the knowledge attribution, can rule out the alternatives relevant to his practical environment. This proposal has some advantages. For one thing, the ‘SSIist’ looks well-positioned to make sense of disagreement, given that ‘knows’ is not being treated like ‘tall’. Further, the SSIist, unlike the insensitive invariantist, can make sense of variability in willingness to attribute knowledge. For SSI, the special pleading comes in with temporal and modal embedding.

The alleged problem (see, for example, Blome-Tillmann 2009) for SSIists is this: temporal and modal operators shift the circumstances of evaluation in such a way that, if SSI is true, we should expect that (in cases of temporal and modal embeddings of “know”) knowledge attributions will track whether the subject can rule out alternatives relevant in the subject’s practical environment in the (temporally or modally shifted) circumstance of evaluation. But this prediction doesn’t seem to pan out, as speakers are inclined to regard the same alternatives as relevant when evaluating non-embedded and embedded uses of “know”.

As MacFarlane sees it, I will not be inclined to say either of the following, which the SSIist predicts I should be willing to say:

Temporal embedding: I know that I had two dollars in my pocket after breakfast, but I didn’t know it this morning, when the possibility of counterfeits was relevant to my practical deliberations—even though I believed it then on the same grounds that I do now.

Modal embedding: I know that I have two dollars in my pocket, but if the possibility of counterfeiting were relevant to my practical situation, I would not know this—even if I believed it on the same grounds as now.

The moral of the story—though see Stanley (2016) for a reply on behalf of the SSIist—is supposed to be that, while each of the three leading competitor views does better than others in some respects, none of these views can make sense of our willingness to attribute knowledge without some sort of Achilles heel. And that is more or less MacFarlane’s defense of (1) in the master argument.

What about premise (2)? Premise (2) of the master argument, recall, says that:

(2) Relativism preserves the advantages while avoiding the disadvantages.

Toward the end of defending (2), MacFarlane suggests that what we want is a semantics for knowledge attributions that satisfies the following three key desiderata, desiderata such that (as he takes himself to have established in defending (1)) none of the three leading contender views can satisfy all of them:

Alternative-variation: It would explain how the alternatives one must rule out to count as knowing vary with context (otherwise, the view faces the dilemma facing insensitive invariantism, with respect to MacFarlane’s conundrum).

Alternative variation independent of context of use: the alternatives one must rule out to count as knowing must not vary with context of use (otherwise: disagreement cannot be preserved, a la contextualism).

Alternative variation independent of the subject’s circumstances: the alternatives one must rule out to count as knowing must not vary with circumstances of the subject to whom knowledge is ascribed (otherwise: temporal and modal embeddings cannot be made sense of, a la SSI).

Here is where the relativist is said to come to the rescue. The first step is to preserve alternative variation by taking the relevant alternatives to be determined by the context of assessment. As MacFarlane puts it:

The resulting view would agree with contextualism in its predictions about when speakers can attribute knowledge, since when one is considering whether to make a claim, one is assessing it from one’s current context of use. So it would explain the variability data as ably as contextualism does, and offer the same way of rescuing closure from the challenge posed by the conundrum. But it would differ from contextualism in its predictions about truth assessments of knowledge claims made by other speakers, and about when knowledge claims made earlier must be retracted. Moreover … it would vindicate our judgments about disagreement between knowledge claims across contexts (MacFarlane 2014, 188).

What about the temporal and modal embedding problem that faced SSI? Relativism, he argues, dodges this because a parameter for a set of contextually relevant alternatives is added to the index as a parameter distinct from world and time indices such that shifting the world and time indices (for example as when ‘knows’ is temporally or modally embedded) does not involve shifting also the relevant alternatives parameter (Ibid., 188).

[…] the relation “knows” expresses does not vary with the context—there is just a single knowing relation—but the extension of that relation varies across relevant alternatives. As a result, it makes sense to ask about the extension of “knows” only relative to both a context of use (which fixes the world and time) and a context of assessment (which fixes the relevant alternatives) (Ibid., 189).
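The two-context picture in the passage above can be illustrated with a small Python sketch. It is only a toy model under loud assumptions: the names ContextOfUse, ContextOfAssessment, knows and shift_time are hypothetical, the alternatives are plain strings, and the truth condition is reduced to the ruling-out requirement alone (truth of p, belief and so on are ignored). It is not MacFarlane's formal semantics. What it shows is that the context of use fixes world and time, the context of assessment fixes the relevant alternatives, and shifting the time index leaves the relevant alternatives untouched.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class ContextOfUse:
    """Fixes the world and time at which the attribution is evaluated."""
    world: str
    time: str


@dataclass(frozen=True)
class ContextOfAssessment:
    """Fixes which alternatives count as relevant (assumed to be plain labels)."""
    relevant_alternatives: FrozenSet[str]


# Hypothetical record of what the subject can rule out at each (world, time).
RuledOut = Dict[Tuple[str, str], FrozenSet[str]]


def knows(ruled_out: RuledOut, use: ContextOfUse,
          assessment: ContextOfAssessment) -> bool:
    """Toy truth condition: every alternative relevant at the context of
    assessment is ruled out by the subject at the world and time of use."""
    can_rule_out = ruled_out.get((use.world, use.time), frozenset())
    return assessment.relevant_alternatives <= can_rule_out


def shift_time(use: ContextOfUse, new_time: str) -> ContextOfUse:
    """Temporal embedding shifts only the time index, never the alternatives."""
    return ContextOfUse(use.world, new_time)


# What the subject can rule out before and after checking the pocket.
ruled_out: RuledOut = {
    ("actual", "morning"): frozenset({"already spent"}),
    ("actual", "noon"): frozenset({"already spent", "pockets picked"}),
}

noon = ContextOfUse("actual", "noon")
everyday = ContextOfAssessment(frozenset({"already spent"}))
sceptical = ContextOfAssessment(
    frozenset({"already spent", "pockets picked", "counterfeit bills"})
)

print(knows(ruled_out, noon, everyday))    # True
print(knows(ruled_out, noon, sceptical))   # False
# Shifting the time of use keeps the assessor's relevant alternatives fixed.
print(knows(ruled_out, shift_time(noon, "morning"), sceptical))  # False
```

In this sketch the same attribution, made at the same context of use, comes out true as assessed from the everyday context and false as assessed from the sceptical one, while the temporally shifted case is still judged against the assessor's alternatives rather than those salient to the subject at the earlier time.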

MacFarlane takes the view he has proposed to be one that escapes the sceptical conundrum while threading the needle so as to avoid the disagreement problem that faces contextualists and the temporal and modal embedding problem that faces SSI. At this stage, we can see why MacFarlane thinks his view has all the advantages and none of the disadvantages. This concludes the presentation of MacFarlane’s defense of premise (2) of the master argument. And from (1) and (2) it follows that “knows” gets a relativist treatment.

6. New (Semantic) Epistemic Relativism: Issues and Implications in Epistemology

Is MacFarlane’s argument sound? Interestingly, this is relatively new terrain. The above line of argument is from 2014, so there has yet to be substantial criticism in the literature of this new form of relativism. See, however, Carter (2016, Ch. 7) for criticisms of MacFarlane’s (2014) view to the effect that the view generates the wrong results in cases of environmental epistemic luck and normative defeaters.

In this section, however, the focus is on implications in epistemology for embracing an assessment-sensitive semantics for “knows.” MacFarlane concludes his 2009 defense of an assessment-sensitive semantics for “knows” with a section entitled “Questions for the Relativist.” One question he asks, in light of his recommendation to extend a truth-relativist semantics for “knows” is: “are there other expressions for which a relativist treatment is needed? How does know relate to them?” (MacFarlane 2009: 16). A more specific version of this question is: if “know” gets a truth-relativist semantics, then since knowledge relates intimately with other epistemic concepts, do any other epistemic concepts need a relativist treatment? This is an important question and one which has obvious implications for the wider shape new epistemic relativism would take.

In tracing out epistemological ramifications of a relativist treatment of ‘knows’ in epistemology, it is helpful to begin with especially tight conceptual connections (between knowledge and other epistemic standings) and move outward from there. This section takes as a starting point two such connections: namely, connections between propositional knowledge and (i) evidence; and (ii) knowledge-how (for a more detailed discussion, see Carter 2017).

Firstly, evidence. Consider, as an example case, Williamson’s (2000) knowledge-evidence equivalence: E=K. Suppose, for reductio, that E=K, and further, that the truth-conditions for E are not assessment sensitive, but the truth-conditions for K are. The resulting position would be at best untenable and at worst contradictory. While of course Williamson’s view is controversial, it seems that if Williamson is right that our evidence is what we know, and thus that S’s evidence includes E if, and only if, S knows E, then one who embraces a relativist semantics for (propositional) knowledge ascriptions should be willing to embrace the view that evidence ascriptions are assessment-sensitive.

Of course, E=K is a controversial position. The above point however was meant to illustrate one very straightforward sense in which a commitment to giving a relativist treatment to “knows” would have a straightforward implication in epistemological theory.

Let us move from a straightforward equivalence thesis (as was E=K) to a reductivist thesis. We needn’t look further than the most standard contemporary version of intellectualism about knowledge-how. Reductivist versions of intellectualism (compare Bengson & Moffett (2011)) insist that knowing how to do something is just a species of propositional knowledge (Stanley 2010, 207). As Stanley puts it:

[…] you know how to ride a bicycle if and only if you know in what way you could ride a bicycle. But you know in what way you could ride a bicycle if and only if you possess some propositional knowledge, viz. knowing, of a certain way w which is a way in which you could ride a bicycle, that w is a way in which you could ride a bicycle (Ibid., 209).

Like Williamson’s E=K thesis, Stanley’s reduction of knowledge-how to a kind of knowledge-that is also controversial, though very much a live and increasingly popular view in contemporary epistemology. Suppose, for reductio, that knowing how to do something is (a la Stanley) just a kind of propositional knowledge, and further, that the truth-conditions for knowing how to do something (for example, as in the case of attributions of the form “Hannah knows how to ride a bike”) are not assessment sensitive, but the truth-conditions for propositional knowledge are, such that “Hannah knows p” is assessment-sensitive, where p is a proposition specifying of a way w which is a way in which Hannah could ride a bicycle, that w is a way in which Hannah could ride a bicycle. Again, the resulting position would be at best untenable and at worst contradictory.

What the foregoing brief consideration of evidence and knowledge-how indicates is that, at least for those with certain substantive commitments in epistemology where epistemic standings other than knowledge are either identified with or in some way reduced to (a kind of) propositional knowledge, an extension of an assessment-sensitive semantics to these standings as well looks potentially unavoidable. One interesting future direction of research will be to trace out the implications of a relativist semantics for “knows” even further, by moving outward to epistemic standings with (perhaps) looser but not insignificant conceptual connections to knowledge, such as justification, rationality, understanding and intellectual virtue. See Carter (2014; 2015, Ch. 8) for some discussion here. A further complementary direction for future research will be to consider how other notions besides “knows” for which a relativist semantics has been proposed might have implications in epistemology. A natural candidate expression here is “ought” (for example, Kolodny and MacFarlane 2010; MacFarlane 2014, Ch. 11). In short, if the moral “ought” gets a relativist treatment, it is hard to see how the epistemic “ought” would not likewise. However, if the epistemic “ought” is relative, then this has ramifications for epistemic normativity more generally. For example, if whether one ought to believe something is a relative matter, then plausibly, whether one is justified in believing something is a relative matter. Likewise, if epistemic oughts are relative, then presumably so will be the epistemic norms which generate epistemic oughts.

A relativist treatment of “knows” also stands to have interesting implications for epistemologists concerned with how the kind of function the concept of knowledge plays might potentially inform our theory of knowledge. A flourishing contemporary research program within mainstream epistemology, one which Robin McKenna (2013) has called the “functional turn” in epistemology, takes as a starting point that “a successful analysis of knowledge must also fit with an account of the distinctive function or social role that the concept plays in our community […] Call this the ‘functional turn’ in epistemology” (McKenna 2013: 335-336). Participants in the functional turn in epistemology appeal to practical explications of the concept of knowledge, on the basis of which they identify a function, where that function is regarded as generating an ex ante constraint on an analysis of knowledge (or a semantics of knowledge attributions). Henderson (2009; 2011), McKenna (2013; 2014), Pritchard (2012) and Hannon (2013; 2014; 2015) have for instance defended views about the concept of knowledge (or knowledge ascriptions) inspired by Craig’s (1990) favoured account of the function of knowledge as identifying good informants. By contrast, Kappel (2010), Kelp (2011) and Rysiew identify closure of inquiry as the relevant function and regard this rather than Craig’s tracking-good-informants function as generative of an ex ante constraint for theorizing about knowledge and its truth-conditions. For Krista Lawlor (2013) the relevant function is identified (a la Austin) as that of providing assurance.

Can “knows”, given a relativist treatment, potentially play (any of) these widely identified functional roles, that is, of identifying reliable informants, marking the closure of inquiry or providing assurance? This is an open question for future research.

Finally, and much more generally, semantic (new) relativism about “knows” raises some interesting metaepistemological issues. Mainstream epistemologists, by and large, take for granted within epistemological theory that the explanandum under the description of “knowledge” is not relative. If the ordinary concept of knowledge, however, requires a relativist treatment, then this presses the complicated issue of whether the ordinary concept of knowledge and the concept of interest to epistemologists are the same, and (even more generally) just how knowledge attributions should inform the theory of knowledge.

7. References and Further Reading

  • Baghramian, Maria. Relativism. London: Routledge, 2004.
  • Baghramian, Maria. The Many Faces of Relativism. London: Routledge, 2014.
  • Baghramian, Maria and Carter, J. Adam. “Relativism.” Stanford Encyclopedia of Philosophy, 2015. http://plato.stanford.edu/entries/relativism/
  • Blome-Tillmann, Michael. “Contextualism, Subject-Sensitive Invariantism, and the Interaction of ‘Knowledge’-Ascriptions with Modal and Temporal Operators.” Philosophy and Phenomenological Research (2009): 315-331.
  • Boghossian, Paul. “How are Objective Epistemic Reasons Possible?” Philosophical Studies 106, no. 1 (2001): 1-40.
  • Boghossian, Paul. “Epistemic Relativism.” The Routledge Companion to Epistemology, 2011. doi:10.4324/9780203839065.ch8.
  • Boghossian, Paul. Fear of Knowledge: against Relativism and Constructivism. Oxford: Clarendon Press, 2006.
  • Carter, J. Adam. Metaepistemology and Relativism. Palgrave Macmillan, 2016.
  • Carter, J. Adam. “Disagreement, Relativism and Doxastic Revision.” Erkenntnis 79, no. S1 (February 2013): 155–72. doi:10.1007/s10670-013-9450-7.
  • Carter, J. Adam. “Relativism, Knowledge and Understanding.” Episteme 11, no. 01 (April 2013): 35–52. doi:10.1017/epi.2013.45.
  • Carter, J. Adam. “Epistemological Implications of Relativism.” In J.J. Ichikawa (ed.) Routledge Handbook of Contextualism, 2017, London: Routledge.
  • Chrisman, Matthew. “From Epistemic Contextualism to Epistemic Expressivism.” Philosophical Studies 135, no. 2 (2006): 225–54. doi:10.1007/s11098-005-2012-3.
  • Cohen, Stewart. “How to Be a Fallibilist.” Philosophical Perspectives 2 (1988): 91. doi:10.2307/2214070.
  • Craig, Edward. Knowledge and the State of Nature: An Essay in Conceptual Synthesis. Oxford University Press, 1990.
  • Cuneo, Terence. The Normative Web: An Argument for Moral Realism. Oxford: Oxford University Press, 2007.
  • DeRose, Keith. The Case for Contextualism. Oxford: Oxford Univ. Press, 2009.
  • DeRose, Keith. “Contextualism and Knowledge Attributions.” Philosophy and Phenomenological Research 52, no. 4 (1992): 913. doi:10.2307/2107917.
  • Derrida, Jacques. Of Grammatology. Baltimore: Johns Hopkins University Press, 1976.
  • Egan, Andy. “Epistemic Modals, Relativism and Assertion.” Philosophical Studies 133, no. 1 (2007): 1-22.
  • Gerken, M. “Discursive justification and skepticism.” Synthese, (2012). 189 (2), 373-394.
  • Gibbard, Allan. Wise Choices, Apt Feelings: A Theory of Normative Judgment. Cambridge, MA: Harvard University Press, 1990.
  • Greco, John. “Reflective Knowledge and the Pyrrhonian Problematic.” Virtuous Thoughts: The Philosophy of Ernest Sosa, 2013, 179–91. doi:10.1007/978-94-007-5934-3_10.
  • Hacking, Ian. ‘Language, Truth and Reason.’ In Rationality and Relativism, 48–66. (1982).
  • Hales, Steven D. “Motivations for Relativism as a Solution to Disagreements.” Philosophy 89, no. 01 (September 2013): 63–82. doi:10.1017/s003181911300051x.
  • Hales, Steven D. Relativism and the Foundations of Philosophy. Cambridge, MA: MIT Press, 2006.
  • Harman, Gilbert. “Moral Relativism Defended.” The Philosophical Review 84, no. 1 (1975): 3. doi:10.2307/2184078.
  • Kaplan, David (1977). “Demonstratives.” In Joseph Almog, John Perry & Howard Wettstein (eds.), Themes from Kaplan. Oxford University Press 481-563.
  • Kölbel, Max. “Faultless Disagreement.” Proceedings of the Aristotelian Society 104, no. 1 (2004): 53–73. doi:10.1111/j.0066-7373.2004.00081.x.
  • Kolodny, Niko, and John MacFarlane. “Ifs and Oughts.” The Journal of philosophy 107, no. 3 (2010): 115-143.
  • Lammenranta, Markus. “The Pyrrhonian Problematic.” Oxford Handbooks Online, 2008. doi:10.1093/oxfordhb/9780195183214.003.0002.
  • Lasersohn, Peter. “Context Dependence, Disagreement, and Predicates of Personal Taste.” Linguistics and Philosophy 28, no. 6 (2005): 643–686.
  • Lewis, David. “Index, Context, and Content.” In Stig Kanger & Sven Öhman (eds.), Philosophy and Grammar: Reidel (1980): 79-100.
  • MacFarlane, John. Assessment Sensitivity: Relative Truth and Its Applications. Oxford: Oxford University Press, 2014.
  • MacFarlane, John. “The Assessment Sensitivity of Knowledge Attributions.” Oxford Studies in Epistemology 1 (2005): 197–233.
  • MacFarlane, John. “Relativism and Knowledge Attributions.” The Routledge Companion to Epistemology, 2011. doi:10.4324/9780203839065.ch49.
  • MacFarlane, John. “Making Sense of Relative Truth.” Proceedings of the Aristotelian Society 105, no. 1 (2005): 305–23. doi:10.1111/j.0066-7373.2004.00116.x.
  • McKenna, Robin. “Knowledge Ascriptions, Social Roles and Semantics.” Episteme 10, no. 4 (2013): 335–350.
  • Meiland, Jack and Michael Krausz. (eds). Relativism, Cognitive and Moral. Notre Dame, Indiana: University of Notre Dame Press, 1982.
  • Olson, Jonas. “Error Theory and Reasons for Belief.” Reasons for Belief, 2011, 75–93. doi:10.1017/cbo9780511977206.006.
  • Pryor, James. “What’s Wrong with Moore’s Argument?” Philosophical Issues 14, no. 1 (2004): 349-378.
  • Richard, Mark. “Contextualism and Relativism.” Philosophical Studies 119, no. 1 (2004): 215-242.
  • Rorty, Richard. Philosophy and the Mirror of Nature. Princeton: Princeton University Press, 1979.
  • Sankey, Howard. “Scepticism, Relativism and the Argument from the Criterion.” Studies In History and Philosophy of Science Part A 43, no. 1 (2012): 182–90. doi:10.1016/j.shpsa.2011.12.026.
  • Sankey, Howard. “Witchcraft, Relativism and the Problem of the Criterion.” Erkenntnis 72, no. 1 (2009): 1–16. doi:10.1007/s10670-009-9193-7.
  • Seidel, Markus. Epistemic Relativism: A Constructive Critique. Palgrave MacMillan, 2014.
  • Seidel, Markus. “Scylla and Charybdis of the Epistemic Relativist: Why the Epistemic Relativist Still Cannot Use the Sceptic’s Strategy.” Studies in History and Philosophy of Science Part A 44, no. 1 (2013): 145–49. doi:10.1016/j.shpsa.2012.10.004.
  • Seidel, Markus. “Why The Epistemic Relativist Cannot Use the Sceptic’s Strategy. A Comment on Sankey.” Studies In History and Philosophy of Science Part A 44, no. 1 (2013): 134–39. doi:10.1016/j.shpsa.2012.06.004.
  • Stanley, Jason. “On a Case for Truth Relativism.” Philosophy and Phenomenological Research 92.1, 2016: 179-188
  • Williams, Michael. “Why (Wittgensteinian) Contextualism Is Not Relativism.” Episteme 4, no. 01 (2007): 93–114. doi:10.3366/epi.2007.4.1.93.
  • Williamson, Timothy. Knowledge and its Limits. Oxford: Oxford University Press, 2000.
  • Vogel, Jonathan. “Reliabilism Leveled.” The Journal of Philosophy 97 (2000): 602-623.
  • Vogel, Jonathan. “Are there Counterexamples to the Closure Principle?”. In Michael David Roth & Glenn Ross (eds.), Doubting: Contemporary Perspectives on Skepticism. Dordrecht: Kluwer (1990): 13-29.
  • Williams, Michael. Unnatural Doubts: Epistemological Realism and the Basis of Skepticism, Cambridge, MA: Blackwell, 1991.
  • Williams, Michael. Problems of Knowledge, Oxford and New York: Oxford University Press, 2001.
  • Wright, Crispin. “Fear of Relativism?” Philosophical Studies 141, no. 3 (2008): 379–90. doi:10.1007/s11098-008-9280-7.
  • Wright, Crispin. “The Perils of Dogmatism.” In Nuccetelli & Seay (eds.), Themes from G. E. Moore: New Essays in Epistemology. Oxford University Press
  • Wright, Crispin. “New Age Relativism and Epistemic Possibility: The Question of Evidence.” Philosophical Issues 17, no. 1 (2007): 262–83.

 

Author Information

J. Adam Carter
Email: jadamcarter@gmail.com
University of Edinburgh
United Kingdom

Tu Weiming (1940—)

Tu Weiming (pinyin: Du Weiming) is one of the most famous Chinese Confucian thinkers of the 20th and 21st centuries. As a prominent member of the third generation of “New Confucians,” Tu stressed the significance of religiosity within Confucianism. Inspired by his teacher Mou Zongsan as well as his decades of study and teaching at Princeton University, the University of California, and Harvard University, Tu aimed to renovate and enhance Confucianism through an encounter with Western (in particular American) social theory and Christian theology. His writings about Confucianism have served as critical links between Western philosophy and religious studies and the world of modern Confucian thought. Tu asserted that Confucianism can learn something from Western modernity without losing recognition of its own heritage. By engaging in such “civilizational dialogue,” Tu hoped that different religions and cultures can learn from each other in order to develop a global ethic. From Tu’s perspective, the Confucian ideas of ren (“humaneness” or “benevolence”) and what he calls “anthropocosmic unity” can make powerful contributions to the resolution of issues facing the contemporary world.

While Tu’s particular presentation of Confucian thought has proven to be both intelligible and popular among Westerners, his use of Western religious concepts and terminology to describe Confucianism has also generated controversy in the Chinese Confucian world. In particular, the cultural hybridity and explicit spirituality that are key elements of Tu’s Confucianism have been criticized by some other contemporary Chinese Confucian thinkers, who—like modern Chinese philosophy in general—have been more influenced by nationalism and secularism than Tu. Nonetheless, Tu’s influence on contemporary Confucian philosophy cannot be overestimated, especially where its reception in the West is concerned.

Table of Contents

  1. Biography
  2. Confucianism as Religious Humanism
    1. Continuity of Being
    2. Anthropocosmic Unity
  3. Selfhood as Creative Transformation
    1. Ren
    2. Li
    3. The Junzi or Profound Person
    4. Fiduciary Community and Filial Piety
    5. Embodied Knowing
      1. Cheng
      2. Fourfold Human Nature
  4. Confucianism and Modernity
    1. Critique of Confucian Tradition
    2. Critique of Western Enlightenment Thought
    3. Civilizational Dialogue
  5. Influence and Criticism
  6. References and Further Reading

1. Biography

Tu was born to well-educated parents in Kunming, Yunnan Province, China in 1940. He has described the nanny who helped to care for him as uneducated, yet expressive of Confucian values in her daily words and actions. Thus, although Tu did not study the Confucian classics during his childhood, he was brought up immersed within a Confucian cultural environment. In 1949, he moved with his family to Taiwan and studied at Taipei Municipal Jianguo High School. At that time, the Taiwan government was promoting national moral education, which included a heavy emphasis on Confucianism—a subject about which the young Tu became enthusiastic. Among his teachers was Zhou Wenjie, a student of the “New Confucian” philosopher Mou Zongsan. After completing high school, Tu enrolled in Taiwan’s Tunghai University, where he studied directly with Mou as well as Mou’s fellow “New Confucian” thinker, Xu Fuguan. Tu’s undergraduate studies with Mou and Xu led to his being awarded a Harvard-Yenching Institute scholarship to study at Harvard University in the United States. Here, he completed courses taught by luminaries of Western social thought, such as the sociologists Talcott Parsons and Robert N. Bellah, and the historian of religions Wilfred Cantwell Smith, earning both an M.A. (1963) and a Ph.D. (1968) in East Asian studies. Beginning in 1967, Tu served as a faculty member in a series of prestigious U.S. universities, initially at Princeton University and later at the University of California at Berkeley, from which he went on to become a professor at Harvard University (1981-2010). As of 2016, Tu held two academic positions, serving as both the Chair Professor of Humanities and Founding Director of the Institute for Advanced Humanistic Studies at Beijing University and as Research Professor and Senior Fellow of the Asia Center at Harvard University.

2. Confucianism as Religious Humanism

Tu’s understanding of selfhood is intertwined with his understanding of religiosity. Indeed, one of the distinctive features of Tu’s Confucianism is his emphasis on the religiousness of Confucianism. Many Confucian scholars stress that Confucianism is a cultural tradition or a philosophy rather than a religion. However, Tu insists that Confucianism also involves religiosity. Unlike other religions, Confucianism is not an institutional religion; however, like other religions, Confucianism has its ultimate concern, which is the creative transformation of the self. According to Tu, ethics (norms for behavior), aesthetics (theory of value), and religiosity (commitment to engagement with an ultimate concern) are inseparable from one another for Confucianism, and for Chinese traditional thought in general. Thus, for Tu, being religious is equivalent to learning to be fully human. It is an infinite developmental process, and the ultimate ideal is to achieve the oneness of Heaven (or Tian, the cosmic source of ethical and aesthetic values for Confucians) and humanity (in Chinese, Tianren heyi). This transformational ideal of self-cultivation is at the heart of Tu’s understanding of Confucianism as a form of religious humanism.

According to Tu, Confucian self-cultivation is a long and strenuous process; it is also a process of ceaseless creative self-transformation. What is important is one’s conscious determination to undertake it. In a key early work, Centrality and Commonality: an Essay on Confucian Religiousness (1989; later republished in a bilingual Chinese/English edition as Zhongyong dongjian [An Insight into Zhongyong], 2008), Tu highlights the distinction between “learning for the sake of the self” and “learning for the sake of others.” From a Confucian perspective, one’s motivation for self-cultivation should be for the sake of the self, that is, it should be seen as an intrinsic good, not undertaken for reasons that are purely external to oneself. As Tu states, “A decision to turn our attention inward to come to terms with our inner self, the true self, is the precondition for embarking on the spiritual journey of ultimate self-transformation.” Learning only for the sake of others is an inauthentic motivation, because it is driven by consideration of others’ opinions, rather than a genuine desire to cultivate oneself. However, Tu does not argue that Confucianism aims at cultivating a private ego. Rather, for Tu, Confucianism emphasizes, critically, the cultivation of a true self. Here, Tu sees himself as making a claim similar to that of the classical Confucian thinker Mencius (Mengzi), for whom the aim of knowing the true self is to recognize the inner “great body” (dati), which signifies the true self that can form a unity with Heaven, Earth and numerous other things. Thus, the ultimate aim of self-cultivation is the unity of Heaven and humans.

One important approach to recognizing the great body is to identify one’s sense of commiseration, that is, the sense of sympathy and empathy, as the unique characteristic of human nature distinguishing us from animals. To recognize the great body, as Tu understands this, also means to establish the will (lizhi), to make decisions to act in accordance with the great body. We must deliberately transcend the temptation of learning for the sake of others in order to learn for ourselves. The willingness to do this would help us to access the inexhaustible inner resources of self-transformation and to achieve a state of being truly “self-possessed” (zide), to use Mencius’ term, that is, the authentic way of learning to be human.

Drawing on his study of Mencius and the Neo-Confucians, Tu argues that Confucianism encourages “human perfectibility through self-effort.” This means that the source of self-actualization lies in humans themselves, rather than in the mediation of some supernatural agent. It assumes that everyone possesses sufficient internal resources for ultimate self-transformation. This assumption is further based on a Chinese cosmology that, in his Confucian Thought: Selfhood as Creative Transformation (1985), Tu calls the “continuity of being.”

a. Continuity of Being

Tu believes that the aim of unity of Heaven and humans is perceivable and realizable through self-effort because of what Confucians assume is a kind of continuity between the self, others, and Heaven and Earth, in which humans, by nature, share a certain reality of Heaven. Therefore, we are obligated to cultivate ourselves to actualize this moral ideal. Thus, Tu argues that Chinese cosmology does not reject the idea of a Creator, but that Confucianism, unlike Christianity, does reject the dichotomy of creator and creature. For Confucianism, it is inconceivable that man can be alienated from Heaven in any essential way. For this reason, Tu’s Confucian religiosity possesses no equivalent to the Christian idea of original sin and divine grace.

According to Tu, within Chinese culture the cosmos is perceived as an unfolding of all-embracing continuous creativity in which all modalities are organically connected. It consists of a dynamic energy field in which all parts of the cosmos mutually interact in a “spontaneously self-generating life process.” While the nature of the cosmos is impersonal, it is not inhuman. It is impartial to all modalities of being, and rejects any kind of anthropocentrism. While human beings are part of this, our minds and consciousness make us unique in that we are able to probe “the transcendental anchorage of our nature,” and to achieve “sympathetic accord with the myriad things in nature.” This sense of continuity of being also gives rise, within Chinese culture, to a sense of deep awe towards nature and Heaven and an aspiration to sustain harmony with nature. Thus, Tu argues, self-knowledge is both a necessary and sufficient condition of knowing the Way (Dao), that is, the way of actualization of authentic human nature. As humans are endowed by Heaven with the “centrality” of the universe, the most refined quality of the universe, Heaven sends humans on the mission of transforming the cosmos into its actualization.

It is important to note that here Tu is taking liberties with the English translation of the Chinese phrase zhongyong, the title of one of the “Four Books” of classical Confucianism, often rendered as “doctrine of the mean” in scholarship prior to Tu’s work on the text. Instead, Tu understands zhong as “centrality” and yong as “commonality.”

“Centrality” denotes the most refined and irreducible quality inherent in human beings. Therefore, the way that comes from “centrality” cannot be separated from us. It is an ontological condition of being, in which one’s mind is unperturbed by external forces. According to ancient Chinese thought, human beings have embodied the centrality of Heaven and Earth. Therefore, human beings are united with Heaven and Earth by inherent centrality. Centrality is a state of the self in which emotions are not yet aroused. When our emotions are aroused to the due measure, it is called harmony. Thus, while centrality is the great foundation of the universe, harmony is an appropriate, unfolding expression of the inner self. Tu reminds us that although the great foundation is inherent in human beings, it does not guarantee that everyone can attain the harmonious state. It is a heavy burden and a long road to attain centrality and harmony; one needs the determination of self-cultivation. By “commonality,” Tu means activities that are ordinary and common, such as eating and walking, which are deployed to describe the Way in Zhongyong. Thus, Tu emphasizes, human beings must accept their embeddedness in the world as the condition of self-transcendence. As the Way (Dao) is inseparable from our ordinary life, we must realize our humanity through our ordinary daily existence. It is tantamount to fulfilling our Heavenly-ordained mission.

b. Anthropocosmic Unity

For Tu, the Heaven-human relationship is not simply that of creator and creature, but “one of mutual fidelity; and the only way for man to know Heaven is to penetrate deeply into his own ground of being.” Thus, Tu argues, the ideal of the Heaven-human relationship is neither theocentric (God-centered or, in Confucian terms, Heaven-centered) nor anthropocentric (human-centered), but “anthropocosmic unity.” By this, Tu means that through a continuous interaction between the two, “the human way necessitates a transcendent anchorage for the existence of man and an immanent confirmation for the course of Heaven.” Because of the mutuality of Heaven and humans, the Confucian Heaven-human relationship is also a kind of “transcendent as immanent.”

From the above analysis, we can see that Confucian religiousness involves three interrelated dimensions: (1) the self-transformation of the person, (2) the communal act of community and (3) the dialogical response to the transcendent. Confucianism aims at forming a kind of organismic unity between humans and Heaven. Thus, Tu stresses that Confucianism is a kind of “inclusive humanism” with “an anthropocosmic idea.” Unlike secular humanism, Tu’s Confucian inclusive humanism is very much concerned with the transcendent. Indeed, Tu claims that if Confucianism wants to be continuously developed it must develop its spiritual tradition in a religious form, so that it can continuously contribute to society. This is because Tu considers religiosity as that which provides human beings with a sense of deep awe towards Heaven and nature; without this sense of transcendence, human life would be shallow. Moreover, for Tu, although human beings by nature are earthbound, they have an aspiration to transcend themselves and to join with Heaven.

Given Tu’s ideas about the “continuity of being” that properly leads to the development of “anthropocosmic unity,” we can see why Tu’s thesis of “learning for the sake of the self” is not a kind of ego-centrism. As there is continuity of being between the self, the others and the cosmos, learning for the sake of the self can be said to be also for the sake of others, as well as the whole world. Furthermore, in the following section, it is explained that Tu does not reject the motivation of self-cultivation driven by our sincere feeling towards our family members. Thus, what Tu really wants to reject seems to be the motivation of “learning for the recognition from others” rather than “learning for the sake of others.”

3. Selfhood as Creative Transformation

While Tu’s Confucianism emphasizes learning for the sake of the self, it does not mean that one can learn to be fully human separated from the community. Tu asserts that people generally establish a “fruitful communication with the transcendent through communal participation.” Apart from communal participation, humans must develop a “constant dialogical relationship with Heaven” and are transformed through “a faithful dialogical response to the transcendent” in self-cultivation. For Tu, implicit in Confucian thought is a “covenant” with Heaven, for our moral duty is to achieve the highest human aspirations of forming a “trinity with Heaven and Earth” through self-realization and community-perfecting. Thus, on Tu’s account, Confucian self-cultivation is a gradual process of combining all levels of the community in the process of self-transformation, from the family to the neighborhood, clan, race, nation, world, and finally to the universe and the cosmos.

Following the work of his teacher, Bellah, Tu argues that one of the perennial human problems in modernity is individualism. In order to respond to individualism, Tu investigates the possibility of “a new vision of the self which is rooted in the reality of a shared life together with other human beings and inseparable from the truth of transcendence” (CT, 8) and his answer is “Confucian selfhood as creative transformation” (CT, 7). Tu stresses that the main concern of Confucianism is how to become a sage or how to become fully realized as an authentic human being. Confucianism believes that man is perfectible by self-effort and one has to achieve self-realization through self-cultivation.

a. Ren

Tu understands the classical Confucian term ren (benevolence or humaneness) as the highest human achievement of self-cultivation and the fullest manifestation of humanity. A man of humanity (renren), who embodies love in his daily life, represents the most authentic realization of human beings from Tu’s Confucian perspective. However, in actual practice, there is a gradual process of extension of love, and ren is most exemplified in our caring toward our relatives (qinqin). This is further discussed in the following section. For Tu, ren is primarily concerned with the self, although human relations are also crucial to it. Tu conceives ren as a concept of personal morality, as a principle of the inward process of self-fulfilment. It is not simply a personal virtue, but also a metaphysical moral mind that is, at the same time, equivalent to the cosmic mind. Thus, ren is both the moral and metaphysical foundation of self-cultivation. While we are earthbound and limited, we can also participate in the communal and divine enterprise of self-transformation, that is, to enlarge one’s humanity so that humanity as a whole, shared by every human being, will be enriched. The driving force of such anthropological and cosmological assumptions enables Confucianism to function as an ethico-religious system in Chinese society despite its lack of institutional religious character.

b. Li

According to Tu, li (ritual) is the externalization of ren in a specific context. Li implies the existence of human relationships. For Tu as a Confucian, the self and sociality are not separable. Society is conceived as an extended self. Confucian self-transformation must be manifested in the context of sociality, and li points to a concrete manifestation by which one enters into proper relations with others. Tu conceives li as a dynamic process of humanization. It is a process of individual development from (1) cultivating personal life to (2) regulating familial relations, and then (3) ordering social affairs, before finally (4) bringing peace to the world. It assumes forms of integration between personality, family, state, and the world. Thus, for Tu, Confucian self-cultivation is a gradual process of inclusion. Without li, there would be a lack of a concrete manifestation of ren. However, without ren as the internal basis, li could become coercive and distort human nature. While ren denotes metaphysical reality, li means the standard of this world. Thus, Tu argues, there is a creative tension between the Confucian concepts of ren and li, and it is important to seek the balance between ren and li in a dynamic process.

c. The Junzi or Profound Person

Just as with the phrase zhongyong, in another case Tu uses license in translating a classical Confucian term. Junzi, which typically is rendered as “gentleman” or “superior person” in other English-language scholarship, becomes “profound person” in Tu’s usage of the term. This translation shows Tu’s particular concern regarding the inner nature of morally superior persons. Basically, junzi is the paradigmatic model of Confucian personality. For Tu, the self-knowledge of the profound person is deeper than that of average people. As the Way (Dao) is inherent in human nature, the actualization of the Way very much depends on our self-knowledge. While the Way is deeply rooted in Heaven-endowed human nature, it is manifested in ordinary life. Thus, the profound person must be sensitive to one’s interiority through which true humanity is manifested as the Way. However, following Confucian tradition, Tu claims that only a few have the inner strength to fully actualize what is inherent in them.

To be a profound person means that one can be (1) sensitive to the outside world, (2) at ease with oneself and (3) courageous. The important way of self-cultivation to achieve self-knowledge is “vigilant solitariness” (shendu) or “self-watchfulness when alone,” that is, a kind of continuous vigilance on one’s own. It is “a process toward an ever-deepening subjectivity.” On the one hand, through continuous critical self-examination, one is sensitive to the subtle manifestation of one’s inner feelings. On the other hand, by such inner penetration of the self, one is able to reach the reality underlying common humanity and to realize the true nature of human-relatedness. Tu argues that in practising vigilant solitariness, one can hear one’s authentic self as expressing the quality of Heaven-ordained nature. As our innermost core is common to all human beings, our self-understanding of Heaven-ordained nature would help us to know the “great foundation” (daben) of the cosmos.

The uniqueness of the profound person lies in how he integrates the Way with his daily life; therefore, the way of a profound person is also understood as the common way. However, such commonness cannot be defined in terms of an abstract formula by which everyone can learn to become a profound person. Thus, while the Way is common, it is also dynamic in nature. As Tu states, “its meaning can never be fully comprehended and its potential never exhausted. There is always something ‘hidden’ in its commonness.” In the face of constantly changing social situations, a profound person can still be at peace with himself. This is because he “rectifies himself and seeks nothing from others,” and can recognize the possibilities in his fate and bring himself into harmony with the situation. Thus, it is a kind of creative transformation not only of the self, but also of the world in the spirit of Zhongyong. Therefore, in practising vigilant solitariness, one must also be courageous in order to be truthful to oneself and to resist the temptation of seeking recognition from others as a motivation for self-cultivation. One must possess an inner strength to continue the long and strenuous task of self-cultivation at one’s own pace, undisturbed by the changing environment.

d. Fiduciary Community and Filial Piety

As the Way (Dao) starts with human-relatedness, the idea of community is also an important concern for Confucianism. For Tu, the ideal Confucian society is conceived as “a fiduciary community based on mutual trust.” Thus, the goal of politics is not simply the establishment of law and social order, but also the development of a fiduciary community through moral persuasion by rectifying the ruler’s moral character as a moral leader.

The self, in the process of creative transformation, embodies the network of continuous expanding and deepening human relationships. According to Tu, the network is “a series of concentric circles” which involves different kinds of structural limitations, such as gender, race, and social background. However, through self-cultivation, the structural limitations of each circle can be transformed into an instrument of self-transcendence, and we can extend ourselves continuously to achieve unity with Heaven and Earth and other numerous things. When we have achieved unity with the most generalized commonality, we also reconfirm the centrality of the self. Tu states, “This broadening and deepening of the self can be characterized, in Mencian terminology, as the manifestation of the ‘great self’ and the concomitant dissolution of the ‘small self’.”

In Confucianism, filial piety is considered the prime virtue of the community underlying the anthropocosmic vision. Like most other Confucians, Tu defines filial piety in terms of “transmission and continuity.” He understands exercising filial piety as honoring one’s parents by transmitting the wisdom and values they exemplified and continuing their unfinished tasks. The underlying idea is that children owe their development to the tradition, that is, the origin of their existence; thus they are responsible for transmitting the wisdom of the old and honoring the forefathers in their ancestral line. As Tu points out,

The centrality of filial piety in Confucian ethics is predicated on the belief that human beings become aware of themselves by responding naturally to the loving care of those around them. Such a reciprocal response, laden with rich symbolic significance for the transmission and continuity of humanity, is seen by the Confucians as the way to provide a solid basis for personal growth: filial piety and brotherly love are roots (pen) of humanity. (C&C, 140)

Tu states that Confucians reject the idea of identity formation through isolation, because our sincere feeling towards our family members can provide motivation for our self-transformation. When we have nurtured our mind/heart to be capable of regulating our family, we will then open ourselves and transcend our egocentrism. However, familialism may degenerate into nepotism and may become a kind of structural limitation. Thus, for Tu, we have to establish meaningful relationships with people outside our family and transcend the nepotism of familialism. Furthermore, in order to avoid being a narrow-minded nepotist, ren must go with righteousness (yi), that is, a sense of judgment and honoring the worthy. Thus, li is not only the externalization of ren; it also signifies a structure of yi realized in the context of human relations. It is primarily concerned with establishing an authentic way of human-relatedness.

Indeed, Tu is thoroughly Confucian insofar as he considers filial piety to be the foundation of political virtue. Traditionally, a filial son is perceived as likely to be vigilant about his personal conduct, diligent about family affairs, compassionate in regard to social obligations and, therefore, qualified for political assignments. Historically, filial sons often ended up becoming loyal ministers. According to Tu, Confucianism often uses family as a metaphor for country and the world, addressing the emperor as son of Heaven, and the magistrate as the “father-mother official,” because Confucianism perceives a kind of transcendent vision implicated by familial nomenclature. It implies that the self is not egoistic; rather, the enrichment of the self comes through cultivating one’s relationship with the family. The emphasis on family is not equivalent to nepotism; rather, it is concerned with a developmental process towards organismic unity with the country and the world. Thus, Tu claims that Confucianism perceives self-cultivation and regulation of the family as the root, and governing the country and peace of the world as the branches. And “the dichotomy of root and branch conveys the sense of a dynamic transformation from self to family, to community, to state, and to the world as a whole” (C&C, 148).

Rituals and ceremonies (li), as manifestations of the ethico-religiosity of community, are also important elements of moral education and social solidarity. In particular, through ancestral worship, the old are respected and the dead are honored for their past contributions; the yearning of the living for their forefathers also establishes their communal identity. Indeed, with the anthropocosmic vision, we are not only responsible to our ancestors, but also to future generations, for they will also transmit our values and continue our tasks.

It is well known that Confucianism is very much concerned with the five cardinal human relationships. According to Tu, reciprocity (shu) is the fundamental principle of the five cardinal human relationships and of human-environmental relationships. Reciprocity helps us to harmonize social relationships, interact sympathetically with nature and establish a dialogical relationship with Heaven. “Through reciprocity, humanity becomes interfused with the cosmic transformation and thus, as a co-creator, forms a trinity with Heaven and Earth. Humanity, in this perspective, stands as the filial son and daughter of the cosmos” (C&C, 134).

Another important virtue is reverence toward Heaven. Tu states that filial piety and reverence toward Heaven are “parallel principles in the Confucian anthropocosmic worldview.” Neither simply serves political purposes; both fundamentally bear a cosmological concern, that is, a concern of “bringing peace and harmony to the universe.” Thus, these two virtues attempt to establish “a pattern of mutual dependence and organismic unity” between Heaven and humans.

e. Embodied Knowing

For Tu, moral reasoning is a kind of “embodied knowing” (tizhi). Following Neo-Confucianism, Tu argues that Confucian ontology rejects Kantian metaphysics which assumes the objectivity of the moral will, excluding any emotional dimension. Tu’s Confucianism also rejects the study of humanity by means of the scientific method. Following Zhang Zai, Tu argues for the distinction between moral knowledge and empirical knowledge. While empirical knowledge derives ideas from our detached observations, moral knowledge cannot be grasped in a disengaged way. Rather, moral knowledge is a kind of bodily experiential knowledge based on reflection on one’s bodily practice and experience. Crucial to Tu’s understanding of embodied knowing are his interpretations of two key classical Confucian concepts, cheng (sincerity) and the fourfold division of human nature into body (shen), heart/mind (xin), soul (ling) and spirit (shen).

i. Cheng

Unlike Christianity, Tu points out, Confucian thought does not regard revelation as the foundation of perceiving the moral order, but rather looks to “our common experience.” In order to perceive this ultimate reality and to attain our true humanity, people must learn to be cheng, that is, to be sincere, true, and real, to our common experience. According to Zhongyong, cheng leads to enlightenment (ming). Cheng also allows us to fully realize ourselves and to understand that Heaven is inherent in human nature. Tu stresses that the possibility of being cheng is not because of divine grace; rather it is based on the idea of heavenly endowed human nature, by which the identification of human nature with reality of Heaven becomes possible. According to Tu, cheng not only means sincerity, but also authenticity. To be cheng does not only signify what a person should be in an ultimate sense, but also signifies a process of actualizing the ultimate reality in ordinary life. The program of self-cultivation that leads to the development of cheng is “a process toward an ever-deepening subjectivity” which involves “a deep penetration into one’s own ground of existence” and includes and embraces others as “an integral part of one’s quest for self-realization.” For this reason, says Tu, Confucianism gives enormous value to the practice of vigilant solitariness.

Ontologically speaking, the expression of the moral subject is a priori true and sincere. However, in reality, if we do not maintain the exercise of self-cultivation, we cannot be cheng enough, and the moral knowledge of the subject would finally be exhausted. This is because our moral knowledge and action are intertwined. Our bodily experiential knowledge can lead to self-transformation, that is, the enhancement of moral knowledge and the renovation of one’s disposition. Thus, for Confucianism, embodied knowing must necessarily lead to the enhancement of moral practices. The highest human achievement by moral self-cultivation is to become an authentic moral person, which is what Tu calls “renren,” that is, a man who embodies ren; such embodiment must have a concrete manifestation by the observance of rites (li).

The ultimate concern of self-cultivation is not simply about the self, but also to manifest our humanity. As human nature shares the ultimate reality of Heaven, human nature is potentially a manifestation of the reality of Heaven. The manifestation of authentic humanity implies that human beings, as co-creators, participate in the creative process of the cosmos. Tu stresses that such a process is not creation ex nihilo (out of nothing). Rather, humans are “capable of assisting the transforming and nourishing process of heaven and earth.” In short, the creative transformation of the sage is derived from human inner nature by focusing on our common human experience. As Tu states, “The way of the sage therefore is centered on the commonality of human nature.”

Moreover, Tu’s vision of Confucian embodied knowing and self-cultivation is to be realized through social practice in a complicated social network. The aim of Confucian self-cultivation is not only to establish oneself, but also to establish others. Thus, our moral knowledge is not simply derived through our own self-understanding, but also by knowing others. Knowing others is not through a kind of disengaged introspection, but rather through participating in a network of mutual trust, knowing others’ dispositions and characters through dialogue and interaction with them. Thus, embodied knowing is also a kind of empathic sensual perception; it rejects the objectification of others, things or humans. And it can accommodate and integrate everything in the world, letting all these things become something that is non-objectified in our minds. Finally, as there is continuity between the being of Heaven and that of humans, embodied knowing is also related to the framework of unity and harmony between humans and Heaven (Tianren heyi). Everyone can be connected to Heaven by embodied knowing of one’s own nature.

ii. Fourfold Human Nature

Tu’s idea of embodied knowing is based on his fourfold division of human nature, perhaps better understood as four levels of subjectivity. According to Tu, the foundation of Confucian morality is an embodied person with sensitivity and emotion. It includes body (shen), heart/mind (xin), soul (ling) and spirit (shen), that is, four different levels. Our embodied knowing is an experiential knowledge based on the integration of these four different levels.

Unlike self-cultivation in some religions, which is concerned only with nurturing the human soul, Confucian self-cultivation very much concerns the human body. Tu reminds us that Confucian self-cultivation literally means “nourishing the body” (xiushen). As our body includes five sense organs, Confucian teaching traditionally emphasizes nourishing our bodily senses through the Six Arts (liuyi), the six ancient disciplines of ritual, music, archery, charioteering, calligraphy, and mathematics, which were foundational to the classical Confucian curriculum. The aim is to aestheticize human life through the practice of rituals (li) and music (yue), to cultivate one’s disposition, to facilitate one’s thinking and emotion-controlling, and finally to achieve one’s embodiment of virtues (yishen tizhi).

While our body contains sense organs, our heart/mind is a rational faculty integrating our different sensual experiences. Here, Tu is building on the Neo-Confucian interpretation of Mencius’ anthropology as the study of heart/mind (xin). Our heart/mind includes spiritual resources endowed by Heaven as the defining feature of being human. It also provides theoretical and practical foundations of our self-cultivation. The spiritual resources are four germinations (siduan) and the power of the will for self-realization. Four germinations are four kinds of universal predispositions. They are shown in our sense of (1) commiseration, (2) shame, (3) reverence, and (4) rightness and wrongness. By cultivating these four germinations, we can acquire four Confucian virtues: benevolence (ren), righteousness (yi), observance of rites (li), and wisdom (zhi). By doing our best with our mind, we can extend our virtuous sense not only towards another person; it also “flows abroad, above and beneath, like that of Heaven and Earth” (Mencius 7A:13). The power of will is what Mencius calls “vast, flowing qi” (haoranzhiqi). This power is great and strong; it can be forever latent, but is never totally lost. If one can nourish this power with uprightness, “it will fill the space between Heaven and earth” (Mencius 2A:2). Thus, profound persons should focus on tapping their own internal energy in the process of realizing humanity.

Apart from body and mind, there are also soul and spirit in human beings. Tu claims that the soul is the extension of the mind, a kind of awareness in the existential situation. Spirit is the state of transcendence, the ultimate goal of self-cultivation. The relation between soul and spirit is just like that of body and mind, where the former is concrete, definite, and with fixed shape, and the latter is indefinite and hardly traceable. Tu argues that we can find Confucian stages of development of self-transcendence in the Mencius. The stage from goodness to realness belongs to the ascending level from body to mind. Beauty to greatness is at the rise from the mind to the soul. Sage-spirit is about the ascending status from the soul to the spirit. Thus, one’s self-elevating process inevitably involves one’s embodiment through ritual practice, in which one can learn about one’s mind, and then be aware of one’s soul, and finally rise to the level of spirit. By such Confucian self-cultivation, we can experience a kind of union with the cosmos; therefore, the embodiment of virtue and a ritual-musical cultivated harmonious world can finally be ratified. Here and throughout Tu’s presentation of Confucianism, the role of religiosity in the Confucian humanistic project remains central.

4. Confucianism and Modernity

In Confucian China and its Modern Fate (1958), Joseph R. Levenson argues that due to the degeneration of the feudal society which nourished Confucianism, Confucianism inevitably faded into the background of modernity. However, Tu disagrees with Levenson and criticizes his analysis for failing to distinguish between “Confucian China” and “Confucian tradition.” For Tu, “Confucian China” denotes the politicized Confucian ideology of traditional Chinese feudal society. By “Confucian tradition,” in contrast, Tu means the main spirit of Confucian Chinese culture by which Chinese people traditionally have governed their lives. To a certain extent, Confucian tradition is responsible for the problem of Confucian China. However, Tu believes that as long as we can eliminate the adverse influences of Confucian China, the Confucian tradition can be revived.

In order to refute Levenson’s claims for Confucianism’s demise, Tu explores the possibility of the “third epoch” of Confucianism. Tu considers Confucianism from earliest times through the Han dynasty (202 B.C.E.-220 C.E.) as the first epoch, Neo-Confucianism from the Song (960-1279 C.E.) through the Ming (1368-1644 C.E.) dynasties as the second epoch, and Confucian thought from the May Fourth New Cultural Movement (1919 C.E.) to the present as the third epoch. Historically, the development of Confucianism occurred through dialogue and debate with other traditions in China. For Tu, the challenges to Confucianism in its third epoch come not only from Western science, democracy, psychology, and religiosity, but also from the more universal question of how Confucianism as an inclusive humanism can address the perennial human problems of the world–that is, how Confucianism can be a new philosophical anthropology for humanity as a whole. Tu is not unique in formulating contemporary Confucianism’s intellectual and spiritual situation in such terms, as similar formulations can be found in the work of his teacher, Mou, as well as the writings of Mou’s associates, Tang Junyi and Xu Fuguan. Where Tu distinguishes himself from other “New Confucians” is in his insistence that contemporary Confucians must not only (1) reflect on the past problems of traditional Confucianism, but also (2) communicate with different civilizations so that Confucian thinkers can benefit from such dialogues and thus contribute to the global world.

a. Critique of Confucian Tradition

Regarding the controversies about traditional Confucianism as it existed prior to modernity, Tu thinks that one crucial problem was its political integration with despotic regimes. He calls this kind of politicized Confucianism “Confucian China,” by which he means a conservative political ideology used by the powerful to control and oppress people, including the internalized oppression whereby Chinese minds became accustomed to rejecting any proposals for cultural change or reform. Tu’s critique thus is very much in sympathy with the May Fourth New Cultural Movement, but he argues that some May Fourth intellectuals went too far in their criticisms of Confucianism by uncritically embracing Western Enlightenment thought, which had problems of its own. Pointing out that the West has its vices and China has its virtues, culturally and intellectually speaking, Tu suggests that Chinese interaction with Western culture should be grounded in a deep understanding of the Chinese cultural heritage and a realistic awareness of the pitfalls of Western-style modernity. Finally, Tu insists that there can be different kinds of modernity; Western modernity is not the only way, especially for China.

b. Critique of Western Enlightenment Thought

Tu greatly appreciates the values and institutions of Western modernity brought to us by the Enlightenment, such as reason, freedom, equality, individuality, rule of law, democracy, science, and capitalism. He thinks that China can learn from these in its modernization. However, Tu criticizes the Enlightenment for inducing scientism, anthropocentrism, and individualism. Anthropocentrism and scientism bring to us a kind of disenchantment and finally lead to the exploitation of nature and destruction of the environment. Anthropocentrism, with the rise of capitalism, also leads to the marketization of the economy, politics, academia, and education, and, what Tu deems worse, the marketization of religions. Capitalist society has been dominated by instrumental reason which, for Tu, is disembodied, unsympathetic, and even cruel. Individualism has also caused the prevalence of egocentrism and the disintegration of community. Thus, the Enlightenment has finally precipitated antagonism among humans and antagonism between humans and nature. In these respects, Tu thinks that Confucian values, such as sympathy, distributive justice, a strong sense of communal responsibility, propriety, a sense of collectivism, and an emphasis on self-restraint, should also be universalized so that they can contribute to the modern world.

c. Civilizational Dialogue

In response to the many Western social theorists who regard Western modernity as the future trend of human development, Tu points out the fallacy of Eurocentric chauvinism present in their work. Inspired by the existentialist philosopher Karl Jaspers’ concept of the “Axial Age”—whereby modern civilizations have been shaped by different cultural and intellectual traditions that arose between the 700s and the 200s B.C.E., including Confucianism—Tu argues that the rise of East Asian industrialization has dispelled the myth of Westernization as the single model of modernization. In the case of Singapore, for example, Tu cites how Confucian virtues—such as frugality, industriousness, self-discipline, loyalty, and active participation in collective welfare—have proven to be constitutive of the success of the ethnically Chinese city-state. In Tu’s view, Singapore—and by extension all of industrialized East Asia—could not have proceeded to modernity by a route other than that of its own (Chinese) cultural traditions, and cannot be judged by the standards of other cultural traditions. Thus, for Tu, the development of modernity is a pluralistic phenomenon that proceeds by different cultural pathways.

Moreover, Tu identifies the twenty-first century cultural-historical moment as a kind of new “Axial Age,” in which conditions of cultural and religious pluralism can foster constructive dialogue between traditions and civilizations. Throughout this conversation, each civilization should open itself to learning from others, on the one hand, while at the same time maintaining a strong sense of self-recognition so that it will develop, rather than dissipate, as a result of dialogue. Tu’s hope is that such civilizational dialogue will aid in the search for a global ethic. In Way, Learning, and Politics: Essays on the Confucian Intellectual (1993, hereafter abbreviated as WLP), Tu writes: “If the well-being of humanity is its central concern, Confucian humanism in the third epoch cannot afford to be confined to East Asian culture. A global perspective is needed to universalize its concerns. Confucians can benefit from dialogue with Jewish, Christian, and Islamic theologians, with Buddhists, with Marxists, and with Freudian and post-Freudian psychologists.” (WLP, 158-9).

This emphasis on inter-cultural, inter-religious, inter-disciplinary, and inter-civilizational dialogue is one of the defining features of Tu’s Confucianism. It also represents a response to the theory of the “clash of civilizations” proposed by Samuel P. Huntington. Huntington argues that international conflict in the post-Cold War era was primarily caused by conflicts between people’s cultural and religious identities. Huntington anticipates that the conflict between the West and what he calls Confucian and Islamic civilizations will be the greatest conflict in the future. Tu does not deny the existence of a civilizational clash; nevertheless, Tu criticizes Huntington’s civilizational clash as one-sided and static in its assumptions about the structure of civilizations. In contrast, Tu finds that the development of civilizations is dynamic and mutually influential, and he argues that the trend of global development should be the promotion of civilizational dialogue rather than the anticipation of civilizational conflict. From Tu’s Confucian perspective, this posture reflects a distinctively Chinese faith that true harmony may be achieved by respecting differences, as stated in Lunyu (Analects) 13:23: “The superior man is conciliatory but does not identify himself with others.” Such civilizational dialogue is crucial for the future development of Confucianism as Tu envisions it, just as Confucianism in the past developed out of dialogue with Buddhism in China. If contemporary Confucians want their tradition to continue to grow, argues Tu, then they must face the challenge of Western civilization. Tu goes so far as to say that, without such civilizational dialogue, there can be no future for Confucianism as a living tradition.

But the value of civilizational dialogue does not lie only in its benefits to the Confucian tradition. In Tu’s mind, Confucianism is not simply for people in China or East Asia, but also for the whole world. Tu thinks that the third epoch of Confucianism must respond to four challenges from the West: (1) the spirit of scientific inquiry, (2) democracy, (3) Western religiosity and its sense of transcendence, and (4) the Freudian psychological exploration of human nature. Thus, Confucianism must open itself, reach out to the world, and learn from the West about the ideas of freedom, equality, science, democracy, human rights, and the rule of law, while rejecting the Western tendencies toward either radical individualism or radical collectivism. Confucianism must reject its own past elements of authoritarianism, hierarchalism, and androcentrism (male-centeredness) while conserving the value to be found in traditional Confucian aesthetics, morality, and religiosity. Through such civilizational dialogue, Tu believes that Confucianism can both renew itself and become a valuable resource for the world.

5. Influence and Criticism

One of the most influential contributions of Tu’s Confucianism is his ongoing dialogue with non-Confucian religions and social theories. Tu’s many years of living in the United States have made such dialogue possible and have enabled him to become the foremost spokesperson for Confucian thought in the West. However, his prolonged expatriate status also renders him vulnerable to criticisms from Confucian thinkers in China and elsewhere outside of the West.

While Tu’s particular presentation of Confucian thought has proven to be both intelligible and popular among Westerners, his use of Western religious concepts and terminology to describe Confucianism also has generated controversy in the Chinese Confucian world. In particular, the cultural hybridity and explicit spirituality that are key elements of Tu’s Confucianism have been criticized by some other contemporary Chinese Confucian thinkers, who—like modern Chinese philosophy in general—have been more influenced by nationalism and secularism than a diaspora thinker such as Tu. For example, some Chinese Confucian thinkers have pointed out that the idea of a covenant with Heaven can scarcely be found in Confucian texts, even in implicit terms. Others have questioned whether Confucianism truly possesses a concept of dialogue between human beings and Heaven, given that Confucius is recorded as having taught that Heaven does not say anything (Analects 17:19). Finally, Tu’s philosophical anthropology—particularly his distinction between soul (ling) and spirit (shen), which is not always clear in his writings—has come under fire from some other Confucian critics.

Despite such controversies, however, Tu’s enormous impact and legacy as a modernizer of Confucian thought and champion of Confucian engagement with non-Confucian traditions must be acknowledged.

6. References and Further Reading

  • Hung, Andrew T. W. “Tu Wei-Ming and Charles Taylor on Embodied Moral Reasoning.” Philosophy, Culture, and Traditions 3 (2013): 199-216.
  • Huntington, Samuel P. “The Clash of Civilizations?” Foreign Affairs 72/3 (Summer 1993): 22-49.
  • Levenson, Joseph R. Confucian China and its Modern Fate: The Problem of Intellectual Continuity, Volume 1. Berkeley: University of California Press, 1958.
  • Tu, Wei-ming. Neo-Confucian Thought in Action: Wang Yang-ming’s Youth (1472-1509). Berkeley: University of California Press, 1976.
  • Tu, Wei-ming. Humanity and Self-Cultivation: Essays in Confucian Thought. Berkeley: Asian Humanities Press, 1978.
  • Tu, Wei-ming. Confucian Ethics Today: The Singapore Challenge. Singapore: Federal Publications, 1984.
  • Tu, Wei-ming. Confucian Thought: Selfhood as Creative Transformation. Albany: State University of New York Press, 1985.
  • Tu, Wei-ming. Centrality and Commonality: An Essay on Confucian Religiousness. Albany: State University of New York Press, 1989. Later published as Zhongyong dongjian (An Insight into Zhongyong), bilingual (Chinese and English) edition, trans. Duan Dezhi (Beijing: People’s Publishing House, 2008).
  • Tu, Wei-ming. Way, Learning, and Politics: Essays on the Confucian Intellectual. Albany: State University of New York Press, 1993.
  • Tu, Wei-ming, and Mary Evelyn Tucker, eds. Confucian Spirituality. 2 vols. New York: The Crossroad Publishing Company, 2003-04.

 

Author Information

Hung Tsz Wan Andrew
Email: ccandrew@hkcc-polyu.edu.hk
Hong Kong Community College, The Hong Kong Polytechnic University
China

Thick Concepts

A term expresses a thick concept if it expresses a specific evaluative concept that is also substantially descriptive. It is a matter of debate how this rough account should be unpacked, but examples can help to convey the basic idea. Thick concepts are often illustrated with virtue concepts like courageous and generous, action concepts like murder and betray, epistemic concepts like dogmatic and wise, and aesthetic concepts like gaudy and brilliant. These concepts seem to be evaluative, unlike purely descriptive concepts such as red and water. But they also seem different from general evaluative concepts. In particular, thick concepts are typically contrasted with thin concepts like good, wrong, permissible, and ought, which are general evaluative concepts that do not seem substantially descriptive. When Jane says that Max is good, she appears to be evaluating him without providing much description, if any. Thick concepts, on the other hand, are evaluative and substantially descriptive at the same time. For instance, when Max says that Jane is courageous, he seems to be doing two things: evaluating her positively and describing her as willing to face risk. Because of their descriptiveness, thick concepts are especially good candidates for evaluative concepts that pick out properties in the world. Thus they provide an avenue for thinking about ethical claims as being about the world in the same way as descriptive claims.

Thick concepts became a focal point in ethics during the second half of the twentieth century. At that time, discussions of thick concepts began to emerge in response to certain disagreements about thin concepts. For example, in twentieth-century ethics, consequentialists and deontologists hotly debated various accounts of good and right. It was also claimed by non-cognitivists and error-theorists that these thin concepts do not correspond to any properties in the world. Dissatisfaction with these viewpoints prompted many ethicists to consider the implications of thick concepts. The notion of a thick concept was thought to provide insight into meta-ethical questions such as whether there is a fact-value distinction, whether there are ethical truths, and, if there are such truths, whether these truths are objective. Some ethicists also theorized about the role that thick concepts can play in normative ethics, such as in virtue theory. By the beginning of the twenty-first century, the interest in thick concepts had spread to other philosophical disciplines such as epistemology, aesthetics, metaphysics, moral psychology, and the philosophy of law.

Nevertheless, the emerging interest in thick concepts has sparked debates over many questions: How exactly are thick concepts evaluative? How do they combine evaluation and description? How are thick concepts related to thin concepts? And do thick concepts have the sort of significance commonly attributed to them? This article surveys various attempts at answering these questions.

Table of Contents

  1. Background and Preliminaries
  2. Significance of Thick Concepts
    1. Foot’s Argument against the Is-Ought Gap
    2. McDowell’s Disentangling Argument
    3. Williams on Ethical Truth
    4. Thick Concepts in Normative Ethics
  3. How Do Thick Concepts Combine Evaluation and Description?
    1. Reductive Views
    2. Non-Reductive Views
  4. How Do Thick and Thin Differ?
    1. In Kind: Williams’ View
    2. Only in Degree: The Continuum View
    3. In Kind: Hare’s View
  5. Are Thick Terms Truth-Conditionally Evaluative?
    1. Pragmatic View
    2. Semantic View
  6. Broader Applications
  7. References and Further Reading

1. Background and Preliminaries

Bernard Williams first introduced the phrase ‘thick concept’ in his 1985 book, Ethics and the Limits of Philosophy. Williams used this phrase to classify a number of ethical concepts that are plausibly controlled by the facts, such as treachery, brutality, and courage. But his use of the phrase was assimilated from Clifford Geertz’ notion of a thick description—an anthropologist’s tool for describing “a multiplicity of complex conceptual structures, many of them superimposed upon or knotted into one another” (1973:). Incidentally, Geertz borrowed the phrase ‘thick description’ from Gilbert Ryle, who took thick description to be a way of categorizing actions and personality traits by reference to intentions, desires, and beliefs (1971). Although Geertz’ and Ryle’s notions of thick description influenced Williams’ terminology, their notions did not necessarily involve evaluation. By contrast, Williams’ notion of a thick concept is bound up with both evaluation and description. Or, in Williams’ terms, thick concepts are both “action-guiding” and “guided by the world”. They are action-guiding in that they typically indicate the presence of reasons for action, and they are world-guided in that their correct application depends on how the world is (1985: 128, 140-41).

Although the phrase ‘thick concept’ first appeared in Williams’ Ethics and the Limits of Philosophy, there was a distinction between thick and thin that predated Williams’ 1985 book. In R.M. Hare’s The Language of Morals, published in 1952, Hare distinguished between primarily evaluative words and secondarily evaluative words (121-2). Hare later identified the former with thin terms, and the latter with thick terms (1997:54). So, the idea of a thick term was present in ethics well before Williams’ terminology.

Hare’s distinction between thick and thin is explicitly about words, and it makes no mention of concepts.  But, in general, the literature on the thick speaks about both thick concepts and thick terms.  Very roughly, concepts are on the level of propositions and meanings (broadly construed), whereas terms are the linguistic entities used to express these items.  In this entry, expressions with single-quotes, for example ‘chaste’, will be used to designate terms.  Italicized expressions, for example chaste, will be used to designate concepts.  Thick concepts, then, can approximately be seen as the meanings of thick terms.

Thick and thin terms are two distinct subclasses of the evaluative. However, readers should exercise caution when encountering the phrase ‘thin term’, since some theorists allow wholly descriptive terms like ‘red’, ‘grass’, and ‘green’ to count as thin (for example, Elgin 2008: 372). Their usage of ‘thin’ diverges from prevailing philosophical jargon, where the thin is seen as a subclass of the evaluative. This article uses the prevailing jargon: on this way of speaking, there are no wholly descriptive thin terms.

It is typically claimed that thick terms are in some sense evaluative and descriptive, but what do these notions mean? The evaluative and the descriptive are normally meant to distinguish between two classes of terms. Descriptive terms can be illustrated with words like ‘red’, ‘solid’, ‘small’, ‘tall’, ‘water’, ‘cat’, and ‘hydrogen’. Paradigmatic evaluative terms include thin terms like ‘good’, ‘bad’, and ‘best’, as well as normative words like ‘ought’, ‘should’, ‘right’, and ‘wrong’. Although some theorists deny that there is any substantive difference between the descriptive and the evaluative (for example, Jackson 1998:120), on the face of it there is a difference. Various attempts have been made to account for this putative difference. Two general approaches are relevant for our purposes.

One approach stems from traditional non-cognitivism. On this view, descriptive terms express beliefs and are capable of picking out properties and facts. But paradigmatic evaluative terms, such as ‘good’ and ‘right’, have neither of these features. These terms do not express beliefs and are incapable of picking out properties and facts. Instead, the function of an evaluative term is to express and induce attitudes, or to commend, condemn, and instruct. Basically, for traditional non-cognitivism, descriptive expressions are capable of representing properties and facts, whereas evaluative ones express attitudes or imperatives that cannot represent properties or facts. Since this version of the distinction denies that evaluations can be factual, we can call it the strong distinction.

The strong distinction is also known as the fact/value distinction. Thick terms are often seen as a problem for this distinction because they seem both descriptive and evaluative. Indeed, Williams holds that the world-guidedness of thick concepts “is enough to refute the simplest oppositions of fact and value” (1985:150).

There is also a weak distinction between description and evaluation, which is neutral on the question of whether evaluations can be factual. Proponents of this weak distinction may agree with the strong distinction regarding the primary function of evaluative terms. For example, they might agree that evaluative terms function to express and induce attitudes, or to commend, condemn, and instruct. However, they do not rule out the possibility that evaluations are also factual. What then distinguishes the evaluative from the descriptive? Simply put, descriptive terms are all the other predicates within a language—that is, descriptive terms just are non-evaluative. Since this distinction allows that evaluations can be factual, we can call it the weak distinction.

Thick terms are seen as significant because they straddle the above distinctions—they have something in common with both the evaluative and the descriptive. Consequently, thick terms raise interesting questions about whether there is value in the world and whether value claims can be inferred from factual ones. The main arguments and views in this vicinity come from Philippa Foot, John McDowell, and Bernard Williams, and are discussed next.

2. Significance of Thick Concepts

a. Foot’s Argument against the Is-Ought Gap

David Hume is often interpreted as holding that one cannot derive an ‘ought’ from an ‘is’, or more generally, that one cannot derive an evaluative statement from a purely descriptive statement. This view has some intuitive appeal. The basic thought is that evaluative statements can condemn, commend, and instruct, whereas descriptive statements can do none of these things. So any inference from purely descriptive premises to an evaluative conclusion would involve a conclusion with content nowhere expressed in its premises. Hence, the inference must be invalid. If we do have a valid inference to an evaluative conclusion, then the premises must somehow involve evaluative content, perhaps covertly. This claim that there are no conceptually valid inferences from purely descriptive premises to evaluative conclusions is known as the “is-ought gap”.

The is-ought gap may seem plausible when the evaluative conclusion employs thin terms like ‘right’ and ‘wrong’. Philippa Foot rejects the is-ought gap by focusing instead on an evaluative conclusion that employs a thick term: ‘rude.’ She points out that ‘rude’ should count as evaluative, because it seems to express an attitude, or to condemn, much like ‘bad’ and ‘wrong’. But, according to Foot, this evaluation can be derived from a description. Consider the description D1: that x causes offence by indicating a lack of respect. Can one accept D1 as true, but deny that x is rude? Foot thinks this denial would be inconsistent. If she is right, then a thick evaluative claim—that x is rude—can be derived from a descriptive claim (Foot 1958).

Foot’s argument is primarily aimed at non-cognitivists, like Hare.  But Hare replies by considering an analogous inference involving a racial slur.  To demonstrate Hare’s point, consider a racial slur like ‘gringo’, ‘kraut’, or ‘honky’.  Most of us disagree with the attitude of contempt that is expressed by the slur ‘kraut’.  But, according to Hare, an analogous inference would logically require us to accept that attitude—that is, to despise Germans.  And this is absurd.  Consider the descriptive claim D1*: that x is a native of Germany.  Is it logically consistent for one to accept D1* as true but deny that x is a kraut?  If the denial in Foot’s example is inconsistent, then, according to Hare’s thinking, it should also be inconsistent in this example.  So, by Foot’s reasoning, D1* should entail ‘x is a kraut’.  Moreover, ‘kraut’ is a term of contempt, which means that this conclusion entails that one must despise x.  By the transitivity of entailment, it follows that an acceptance of D1* requires one to despise x, which is an unintuitive result.  According to Hare, the two inferences are “identical in form.”  So, there must be something wrong with both inferences (1963:188).

Where do the above inferences go wrong? Hare holds that people who reject the attitude associated with ‘kraut’ will substitute it with an evaluatively-neutral expression, such as ‘German’, which does not commit them to the attitude of contempt. So, people who accept D1* are not required to use the evaluative word ‘kraut’ in expressing the conclusion of the inference. Analogous points hold for ‘rude’. Hare concedes that we rarely have evaluatively-neutral expressions corresponding to paradigmatic thick terms, but he thinks such expressions are at least possible. After all, we could use ‘rude’ with a certain tone of voice or with scare-quotes around it, thereby indicating that we mean it in a purely descriptive sense (1963:188-89).

Hare’s response to Foot assumes that slurs are evaluative in the same way as thick terms. This, however, has been taken by some to be unintuitive. But Hare could modify his reply: instead of employing slurs he could use thick terms like ‘chaste’, ‘blasphemous’, ‘perverse’, and ‘lewd’, which are often called “objectionable thick terms”. Objectionable thick terms are terms that embody values that ought to be rejected. It is, of course, a matter of debate whether these thick terms really are objectionable. But Hare could run his argument by using thick terms that are commonly regarded as objectionable. Such terms seem to be evaluative in much the same way as ‘rude’. And, much like slurs, there are many people who reject the values embodied by such terms; these people are consequently reluctant to use the term in question. Notice that arguments like Foot’s would require such people to accept the values embodied by the thick terms they regard as objectionable, and this seems equally implausible. So, Hare’s basic reply need not assume any fundamental similarity between thick terms and slurs.

However, it is unclear that Hare has shown what he needs to show—that the relevant is-ought inferences are invalid. In particular, he has not shown that it’s possible for D1 to be true while ‘x is rude’ is false, or for D1* to be true while ‘x is a kraut’ is false. The mere fact that reluctant speakers will substitute the evaluative conclusions with neutral ones does not show that the evaluative conclusions are false. For instance, you may hate the word ‘prune’ and prefer to substitute it with ‘dried plum’, but that doesn’t mean it’s false that the thing in question is a prune (Foot 1958:509).

Hare could claim that the is-ought gap only exists on the level of concepts or propositions, not on the level of terms or sentences. This makes a difference because Hare holds that the evaluations of thick terms are detachable in the sense that there could be evaluatively-neutral expressions that are propositionally equivalent to sentences involving thick terms. Detachability is the upshot of Hare’s view that ‘German’ can be substituted for ‘kraut’. If the evaluations of thick terms are detachable, then they only attach to the terms, but are not entailed by the propositions expressed by such terms. So, there is no breach of the is-ought gap on the level of propositions. From D1*, one can infer the proposition that x is German. And this proposition can also be expressed by using the term ‘kraut’. But one cannot infer a negative evaluation from this proposition; the negative evaluation is only inferable from uses of the term ‘kraut’, not from the proposition expressed by such uses.

Hare’s view that the evaluations of thick terms are detachable has led to debates over how exactly thick terms are evaluative. For example, is the evaluation merely pragmatically associated with the term in a way that would make it detachable? Or are these evaluations part of its truth-conditions? These debates are discussed in section 5.

b. McDowell’s Disentangling Argument

Even if Foot is right that there are descriptions that are sufficient for the correct application of thick terms, it need not be the case that these descriptions are necessary. John McDowell’s Disentangling Argument is believed to show that there could not be a description that is both necessary and sufficient for the correct application of a thick term. In this argument McDowell is primarily arguing against non-cognitivists, such as Hare, who accept the strong distinction between description and evaluation.

Before diving into the argument, recall that thick terms seem to straddle the strong distinction. For example, the claim that OJ committed murder seems to aim at stating a fact, which is a feature of descriptive claims (on the strong distinction). But this claim also seems evaluative; by calling it murder, rather than killing, we seem to be evaluating OJ’s action negatively. Thus, thick terms such as ‘murder’ call into doubt the strong distinction because they seem to be both descriptive and evaluative. This issue does not present any obvious challenge to those who accept only the weak distinction, since a term’s being evaluative on the weak distinction does not preclude it from being fact-stating. How do proponents of the strong distinction meet this challenge?

A.J. Ayer is one non-cognitivist who holds that thick terms like ‘hideous’, ‘beautiful’, and ‘virtue’ are solely on the evaluative side of the strong distinction—they are purely non-factual, evaluative concepts (1946:108-13). Ayer’s view is counterintuitive, and, if generalized, would oddly entail that there is no fact as to whether OJ committed murder.

Most non-cognitivists disagree with Ayer, and claim that thick terms have hybrid meanings that contain two different kinds of content: a descriptive content and an evaluative content. This kind of view is called a Reductive View because it reduces the meaning of a thick term to a descriptive content along with a more basic evaluative content (for example, a thin concept).

McDowell’s Disentangling Argument targets a specific kind of Reductive View, one that is coupled with the strong distinction between description and evaluation. We may thus call his target “the Strong Reductive View”. McDowell assumes that Strong Reductive Views must hold that the thick concept’s descriptive content completely determines the thick concept’s extension. It does so by identifying a property that completely determines what does and does not fall within the thick concept’s extension. The evaluative content plays no role in determining what property the thick concept picks out, but is instead an attitudinal or prescriptive tag that explains the concept’s evaluative perspective.

McDowell’s argument against Strong Reductive Views invites us to consider the epistemic position of an outsider who does not share the evaluative perspective associated with a given thick term. Consider, for example, someone who fails to understand the sexual mores associated with ‘chaste’. Will this person be able to anticipate what this term applies to in new cases? Initially, one might think this is possible: is this not what anthropologists are trained to do? Williams, a proponent of McDowell’s argument, says that an anthropologist must at least “grasp imaginatively” the evaluative point of ‘chaste’ (1985: 142). She must imagine that she accepts the evaluative point of this term, at least for the purposes of anticipating its usage. Even this might be a problem for Strong Reductive Views. If a Strong Reductive View is correct, then there would be no need for an outsider to grasp the evaluative content of ‘chaste’, even imaginatively. After all, the descriptive content is supposedly what drives the extension of ‘chaste’, which means that an unsympathetic outsider could master its extension just by grasping the descriptive content and observing that it applies to all and only the features that the insiders call ‘chaste’. So, the Strong Reductive View seems to predict that an unsympathetic outsider could anticipate the insider’s usage of ‘chaste’. Many find this implausible.

McDowell’s argument has two premises: (1) If a Strong Reductive View is true of ‘chaste’, then an unsympathetic outsider could master the extension of ‘chaste’ (that is, she could know what things ‘chaste’ would apply to in new cases) without having any grasp of its evaluative content. But, (2) an unsympathetic outsider surely could not achieve this—she could not anticipate its usage if she stands completely outside the evaluative perspective of those who employ the concept. Therefore, the Strong Reductive View is not true of ‘chaste’ (1981:201-3). This sort of argument could be advanced with respect to any thick term and perhaps even thin ones.

The Disentangling Argument is sometimes thought to be a distinctive problem for all Reductive Views, not just Strong Reductive Views. But this is a mistake. Consider Reductive Views that accept only a weak distinction between description and evaluation: call such views “Weak Reductive Views”. Weak Reductive Views can allow that the evaluative content of ‘chaste’ picks out a property, and can therefore allow that this evaluative content plays a role in determining the extension of ‘chaste’. For example, if morally good is the evaluative content associated with ‘chaste’, and morally good picks out a property, then morally good can also play a role in determining the extension of ‘chaste’. This means that its extension need not be completely determined by its descriptive content. In this case, an unsympathetic outsider would be a very strange person—that is, someone who does not accept the evaluative point of morally good, even imaginatively. But such a person does not seem impossible.

To be sure, Weak Reductive Views could be vulnerable to the Disentangling Argument if they accept an additional claim, namely, that chaste is coextensive with a descriptive concept that is perhaps not encoded within the content of chaste. Consider an analogy: it is plausible that water and H2O are coextensive, even though neither concept is encoded within the other. If Weak Reductive Views hold that this situation is true of chaste and some descriptive concept D, which is not encoded in the content of chaste, then the Disentangling Argument could be run against these views. This type of Weak Reductive View predicts that an outsider could master the extension of ‘chaste’ just by grasping D and observing that insiders apply ‘chaste’ to all and only things that are D. Thus, the Disentangling Argument could be run against Weak Reductive Views if they accept the additional claim that chaste is coextensive with a descriptive concept.

However, the same problem also arises for Non-Reductive Views that accept this additional claim. Non-Reductive Views hold that thick concepts cannot be divided into distinct contents (more on this in section 3). And, strictly speaking, Non-Reductive Views are compatible with the additional claim just mentioned, namely that chaste is coextensive with a descriptive concept. If Non-Reductivists accept this additional claim (which would be uncharacteristic, though not inconsistent), then the combined view would also be vulnerable to the Disentangling Argument.

So, the Disentangling Argument can be used to target any view, Reductive or Non-Reductive, that holds thick concepts to be coextensive with descriptive concepts. It is thus a mistake to think the Disentangling Argument is a problem for all and only Reductive Views. The reason McDowell’s argument targets Strong Reductive Views is that these views appear apt to accept the problematic claim that thick concepts are coextensive with descriptive concepts.

Most opponents of the Disentangling Argument reject premise (1), by showing that Strong Reductive Views can allow that an unsympathetic outsider could not master the extension of thick terms. This approach is discussed in section 3a. But Hare takes a different approach. He accepts the Strong Reductive View but rejects premise (2). Recall Hare’s way of arguing that there could be a descriptive concept that is extensionally equivalent to a thick concept. One can express this descriptive concept by muting the thick term’s evaluative content in one of two ways: either by using the thick term with a certain tone of voice or by placing scare-quotes around the term. Suppose that these methods successfully show that there is a purely descriptive concept—call it des-chaste—which is coextensive with chaste. In this case, an outsider could employ des-chaste to track the insider’s usage of ‘chaste’.

One might object that Hare’s two methods of uncovering des-chaste reveal that this concept cannot be grasped without already grasping chaste. So, the interpreter in question would not be a genuine outsider. However, although Hare’s methods of uncovering des-chaste require a grasp of chaste, there is no automatic reason to assume that there could not be another method of uncovering des-chaste without grasping chaste—for example, by learning des-chaste independently of any encounter with insiders or their value system.

Would the outsider’s grasp of des-chaste help her anticipate the insiders’ use of ‘chaste’ in new cases? Hare thinks so. According to Hare, the outsider could anticipate their use in new cases because she could observe similarities between the old cases and the new cases, and infer based on those similarities that ‘chaste’ would or would not apply in new cases (1997: 61). Of course, McDowell and his followers would not be convinced by this claim, since they hold that the similarities between such cases are evaluative. In other words, they accept what is known as “the shapelessness hypothesis”: that the extensions of thick terms are only unified by evaluative similarity relations.

The fundamental disagreement between Hare and McDowell concerns whether the shapelessness hypothesis is true. Is there any reason to accept shapelessness? This hypothesis is sometimes supported by the fact that it can explain why premise (2) of the Disentangling Argument is plausible—that is, it can explain why an unsympathetic outsider could not master the extension of ‘chaste’ (Roberts 2013: 680). Of course, this idea won’t convince someone like Hare who rejects premise (2). Something more should be said. In section 5b, further support for shapelessness is discussed.

It is worth noting that there is a common thread running through Hare’s replies to both Foot and McDowell. In both replies Hare claims that there could be a descriptive expression that is extensionally equivalent to a thick term. Without this claim he could not hold that des-chaste is coextensive with chaste. He also could not escape Foot’s objection to the is-ought gap by claiming that the evaluations of thick terms are detachable.

c. Williams on Ethical Truth

If successful, McDowell’s argument would show that there could not be a wholly descriptive expression coextensive with a thick term. Furthermore, if we assume that utterances involving thick terms are sometimes true, McDowell’s argument might show that there are evaluative facts—facts that can only be characterized in evaluative terms. But this is too quick. After all, sentences involving thick terms might only be true in a minimalist sense. On the minimalist theory of truth, to say that ‘lying is dishonest’ is true is equivalent to saying simply that lying is dishonest. This is all that can be significantly said about the truth of this sentence. Since nothing more can be said, its mere truth does not entail the existence of a fact to which the sentence corresponds. So, even if McDowell’s argument succeeds, the truth of sentences involving thick terms does not guarantee that there are facts that can only be characterized in evaluative terms.

To get a fuller picture of how the truth of such sentences might support the existence of evaluative facts, we must turn to Bernard Williams. According to Williams, utterances involving thick terms show promise of being more than just minimally true, whereas utterances involving thin terms do not. The main difference, according to Williams, is that thick terms bear a close connection to the concept of knowledge and to the notion of a helpful advisor.

Consider the connection between knowledge and thick concepts. There is a precedent for thinking that certain epistemic difficulties arise for thin concepts but not thick ones. For example, how exactly can one come to know that lying is sometimes wrong? This plausible truth, which involves a thin concept, seems to be neither analytic nor a posteriori. Some ethicists have thus held that it is synthetic a priori and is knowable by a special faculty of the mind, such as moral intuition. But many ethicists find this view implausible and have instead turned to thick concepts for an account of ethical knowledge. It may seem more plausible that we can know a posteriori that a thick concept applies, for example, that a certain action is cowardly. According to Mark Platts, we can know such truths “by looking and seeing,” without any special faculty, such as moral intuition (1988: 285).

Williams agrees that thick ethical knowledge is more feasible than thin ethical knowledge. His reasons, however, are different from Platts’. Williams holds that the concept of knowledge is associated with the notion of a helpful informant or advisor, and that there are only such advisors with regard to the application of thick concepts, not thin ones. According to Williams, a helpful advisor is someone who is better than others at seeing that a certain outcome, policy, or action falls under a concept. And Williams holds that there are helpful advisors with regard to thick concepts. For example, the advice that a certain action would be cowardly “can offer the person who is being advised a genuine discovery” (1993: 217). Are there helpful advisors with regard to thin concepts? Not according to Williams—“not many people are going to say ‘Well, I didn’t understand the professor’s argument for his conclusion that abortion is wrong, but since he is qualified in the subject, abortion probably is wrong’” (1995: 235). Thus, according to Williams, utterances involving thick terms show promise of being more than minimally true, given that thick terms have this association with knowledge and helpful informants.

Even though Williams holds that utterances involving thick terms can be more than minimally true, he does not think these utterances can be objectively true, that is, true independently of particular perspectives. To illustrate this, Williams asks us to compare ethics with science. Although there are disagreements in science, there is at least some chance of scientists converging on a perspective-free account of the world, and this convergence would be best explained by the correctness of that account. But Williams thinks our ethical opinions stand no chance of converging on an account of how the world really is independently of particular perspectives; at any rate, if they do converge, this will not be because these opinions have tracked how the world is independently of perspective (1985: 135-6). So, on Williams’ view, the truth of ethical opinions is dependent upon perspective, and hence these opinions are not objectively true.

How exactly is the truth of utterances involving thick terms dependent upon perspective? Williams illustrates his view by asking us to envision a hyper-traditional society which is maximally homogeneous and minimally reflective. Williams holds that ethical reflection primarily employs thin concepts, and that this hyper-traditional society is unreflective because it only employs thick concepts. According to Williams, their utterances involving thick terms can be true in their language L, which is distinct from our language since L does not express thin concepts whereas our language does. Williams thinks it is undeniable that the thick concepts expressed in L need not be expressible in our language, which means that we may be unable to use our language to assert or deny what insiders say with their thick terms. Of course, it is possible for a sympathetic outsider, such as an anthropologist, to understand and speak L. But, according to Williams, the outsider cannot formulate an equivalent utterance in his own language because “the expressive powers of his own language are different from those of the native language precisely in the respect that the native language contains an ethical concept which his doesn’t” (1995: 239).

To explain Williams’ view further, we can borrow an example from Allan Gibbard (1992). Imagine that gopa is a positive thick concept expressible in L but not expressible in our language. Although a reflective outsider cannot assert that x is gopa in her own language, she can likely reject the proposition that x is good, which involves a thin concept. And if the local’s thick concept gopa entails good, then the outsider could reject the insider’s statement as false by denying that x is good. So, it looks like the insider’s statement can be assessed as false from an outside perspective. However, Williams does not accept that the insider’s concept gopa entails good. A judgment involving a thin concept, such as good, “is essentially the product of reflection” which comes about “when someone stands back from the practices of the society and its use of the concepts and asks… whether these are good ways in which to assess actions…” (1985:146). But this hyper-traditional society is unreflective, which means they do not employ the thin concept good. So, according to Williams, there’s no reason to assume that their concept gopa entails good, which means the outsider’s denial that x is good poses no clear threat to the truth of ‘x is gopa’.

On Williams’ view, if a person from the hyper-traditional society has knowledge that x is gopa, but later reflects and draws the conclusion that x is good, this reflection may unseat his previous knowledge by making it so that this person no longer possesses the traditional concept gopa (1995: 238). In this way, “reflection can destroy knowledge,” because the one who reflects may thereby cease to possess their traditional thick concepts (1985: 148).

Williams has here outlined a possibility in which utterances involving thick terms could be true in a way that is dependent upon perspective—in particular, the perspective of a person who speaks a certain ethical language, such as L. Opposition to Williams comes from at least two fronts.

First, McDowell (1998) and Hilary Putnam (1990) have both objected to Williams’ conception of science as providing a perspective-free account of the world. They hold that science is perspective-dependent. Although this objection would destroy Williams’ contrast between science and ethics, it would not mean that ethical truth is perspective-free, but only that science and ethics are both perspective-dependent, which leaves ethics in good company.

A second source of opposition is known as Thin Centralism—the view that thin concepts are conceptually prior to, and independent of, thick concepts. If good is conceptually prior to gopa, then the locals cannot grasp gopa without also grasping good. This would mean that Williams is wrong to claim that the locals may lose the concept gopa when they draw the reflective inference from x is gopa to x is good. Furthermore, the outsider who denies that x is good is required, by way of this inference, to deny that x is gopa. So, the truth or falsity of ‘x is gopa’ would not depend on what is knowable solely from the local’s perspective, contrary to Williams’ view. It may depend partly on whether x is good, which may be discernible from the outsider’s perspective.

Williams rejects Thin Centralism, though he does not give any arguments against it (1995: 234). He is plausibly a Thick Centralist, holding that thick concepts are conceptually prior to, and independent of, thin concepts. That is, one cannot grasp a thin concept without grasping some thick concept or other, but not vice versa. This view can be understood by way of a color analogy. The concept color is a very general concept that, according to Susan Hurley, cannot be understood independently of specific color concepts, such as red, green, etc. (1989: 16). And according to Thick Centralism, thin concepts like good cannot be grasped independently of specific thick concepts like courageous, kind, and so on.

It might be true that the grasp of color requires the grasp of some specific color concept (for example, red), but is the opposite also true? Does the grasp of red require a grasp of color? If so, then the color analogy would actually support what is known as the No-Priority View—thick and thin concepts are conceptually interdependent with neither one being prior to the other (Dancy 2013). It is worth noting that the No-Priority View is not available to Williams, since this view would mean that the local’s grasp of gopa requires the grasp of a thin concept, and this presents the same problem that Thin Centralism presents for Williams’ view.

d. Thick Concepts in Normative Ethics

It is often urged that ethicists should stop focusing as much on thin concepts and should expand or shift attention towards the thick (Anscombe 1958; Williams 1985). As a result, there has been much attention paid to thick concepts within meta-ethics, primarily regarding the issues discussed above. Have thick concepts also played a substantive role in normative ethics? They have to some extent. Normative ethics is partly concerned with the question of what kind of person one should be. And the virtue and vice concepts, which are paradigmatic thick concepts, have played a significant role in these discussions.

However, normative ethics is also concerned with the question of how one should act, and in this context it is common to focus on thin concepts, like right, wrong, and good. Of course, there are some thick concepts, such as just and equitable, that figure into these discussions, but it is not immediately clear why it would matter whether these concepts are thick, rather than thin or purely descriptive. There is at least one attempt at giving thick concepts a substantive role in a theory of how to act. This comes from Rosalind Hursthouse’s virtue theory of right action.

Virtue theory is sometimes criticized for being unable to provide a theory of right action. The mere fact that virtues are character traits of persons does not mean that virtue concepts cannot be applied to actions. Actions can also be honest, courageous, patient, and so on. The problem is that these characterizations of action do not clearly tell us anything about rightness, which would be a major flaw of a normative ethical theory.

Hursthouse meets this criticism by providing a theory of right action in terms of virtue. She holds that an action is right just in case it is what a virtuous agent would characteristically do in the circumstances (1999: 28). The virtuous agent is one who has the virtuous character traits and exercises them. And a virtue is a character trait that a human being needs to flourish or live well. These particular virtues must be enumerated, but the list typically includes paradigmatic thick terms, such as ‘courage’, ‘honesty’, ‘patience’, ‘generosity’, and so on. Hursthouse explicitly claims that the virtue terms are thick (1996: 27). Does it matter for her view whether the virtue terms are thick? Hursthouse’s theory faces an objection, and it is in response to this objection that it might matter.

The objection alleges that the virtue theory of right action cannot provide clear action-guidance, whereas rival normative theories, such as deontology and utilitarianism, can provide clear action-guidance by generating rules, such as “Don’t lie” or “Maximize happiness.” According to this objection, the virtue theory of right action can only generate a very unhelpful rule: “Do what a virtuous person would do.” This rule is not likely to provide action-guidance. If you are a fully virtuous person, you will already know what to do and so would not require the rule. If you are less than fully virtuous, you may have no idea what a virtuous person would do in the circumstances, especially if you don’t know of anyone who is fully virtuous (indeed, such a person might be purely hypothetical). So, according to this objection, the virtue theory of right action cannot provide action-guidance.

In response, Hursthouse points out that every virtue generates positive instruction on how to act—do what is honest, charitable, generous, and so on. And every vice generates a prohibition—do not do what is dishonest, uncharitable, mean, and so on (1999: 36). So, one can get action-guidance without reflecting on what a hypothetical virtuous agent would do in the circumstances. According to Hursthouse, “the agent may employ her concepts of the virtues and vices directly, rather than imagining what some hypothetical exemplar would do” (1991: 227). For example, the agent may reason “I must not tell this lie, since it would be dishonest.” And since dishonesty is a vice, which no virtuous person would have, this agent will be directed towards right action.

Thus, it’s important for Hursthouse’s view that the virtue concepts are at least action-guiding. After all, imagine that the virtue concepts were wholly descriptive concepts of character traits, like slow, calm, or quiet. These descriptive concepts would not generate any prohibitions or positive instruction.

Does it matter whether the virtue concepts are thick rather than thin concepts? Hursthouse does not speak directly to this question, though she does claim that, if we are unclear on what to do in a circumstance, we can seek advice from people who are morally better than ourselves (1999: 35). And, here, Williams’ point about helpful advisors might be useful. If the virtue concepts were thin, then on Williams’ view there would be no helpful advisor with regard to whether the virtue concepts apply. But such advice is possible if the virtue concepts are thick. In short, it is important for Hursthouse that the virtue concepts are action-guiding. And, if Williams is right, it may also matter whether the virtue concepts are thick.

One potential challenge to Hursthouse’s reply might contest the traditional list of virtues, and claim that there is no reason to think this list, when properly enumerated, will contain thick action-guiding concepts. For example, why should we think that courageous will be on the list of virtues rather than a similar concept that rarely generates positive instruction (for example, gutsy)? In considering this objection, readers are advised to consult Hursthouse’s approach to enumerating the virtues (1999: Ch. 8).

Another potential challenge may come from Thin Centralism. Suppose that right is conceptually prior to, and independent of, courageous. In this case, it might be argued that the positive instruction generated by courage (for example, “Do what is courageous”) is wholly due to the action-guidingness of right. The latter is precisely what we wanted to explain, which means that Hursthouse’s reply might be uninformative. However, Hursthouse does not account for particular virtue concepts in terms of right. Furthermore, even if Thin Centralism is true, it could still be claimed that some other thin concept, such as good, is conceptually prior to thick virtue concepts. So, Hursthouse’s account cannot be deemed uninformative merely on the basis of Thin Centralism.

3. How Do Thick Concepts Combine Evaluation and Description?

Thin Centralists typically accept Reductive Views of the thick, which aim to analyze the meanings of thick terms by citing more fundamental concepts (for example, thin concepts and descriptive concepts). Proponents of these Reductive Views often aim at escaping the Disentangling Argument. In particular, they aim to reject premise (1) of that argument by showing that Reductive Views can consistently claim that an outsider could not grasp the extension of a thick term. This strategy proceeds by providing different versions of the Reductive View, which shall be discussed below.

It is worth noting that Reductive Views are typically neutral on whether the weak or strong distinction ought to be accepted. They also tend to be neutral on whether cognitivism or non-cognitivism is true. To be sure, Reductivism is often associated with non-cognitivists, like Hare, but there are some traditional cognitivists, like Henry Sidgwick and G.E. Moore, who hold Reductive Views of the thick (Hurka 2011: 7).

Those who reject Thin Centralism and accept the Disentangling Argument normally accept Non-Reductive Views, holding that the meanings of thick terms are evaluative and descriptive in some sense, though they cannot be divided into distinct contents. The basic disagreement between Reductive and Non-Reductive Views is on whether thick concepts are fundamental evaluative concepts or are complexes built up from more fundamental concepts (for example, thin concepts). These two approaches are compared in the following sections.

a. Reductive Views

In general, Reductive Views understand the meaning of a thick term as the combination of a descriptive content with an evaluative content. Different Reductive Views can be distinguished based on how they specify this general account. There are three main types of Reductive Views: (i) some views specify the sort of descriptive content within the analysis; (ii) some views specify the relation between evaluative and descriptive contents; and (iii) other views specify what the evaluative content is. There are also various ways of combining (i)-(iii).

Consider type (i) first. Daniel Elstein and Thomas Hurka provide two patterns of analysis that explain the descriptive content of a thick term in two different ways. On their first pattern of analysis, the descriptive content of a thick term is not fully specified within the meaning of the thick term. The meaning of the thick term may only specify that there are some good-making descriptive properties of a general type, without specifying exactly what these good-making properties are. For example, on their view, ‘x is just’ means ‘x is good, and there are properties XYZ (not specified) that distributions have as distributions, such that x has XYZ and XYZ make any distribution that has them good’. Elstein and Hurka hold that this kind of Reductive View is not a Strong Reductive View, because the thick concept does not have a fully specified descriptive content that determines the thick concept’s extension. Still, their view is available to non-cognitivists who accept the strong distinction. Most importantly, Elstein and Hurka believe their view allows non-cognitivists to claim that an outsider could not grasp the extension of ‘just’. Grasping that extension requires determining which properties of the general type are the good-making ones, and doing this requires evaluative judgments that the outsider is not equipped to make (2009: 521-2).

Elstein and Hurka’s second pattern of analysis involves an additional evaluation, which is embedded within the descriptive content. Many virtue and vice concepts are supposed to fit into this second pattern of analysis. For example, on their view, ‘an act x is courageous’ means roughly ‘x is good, and x involves an agent’s accepting risk of harm for himself for the sake of goods greater than the evil of that harm, where this property makes any act that has it good’ (2009: 527). The reference to goods is an embedded evaluation, and it is impossible to determine the extension of ‘courageous’ without determining what can count as goods—but determining this requires an evaluation which the outsider is not equipped to make (2009: 526).

Stephen Burton offers an account of type (ii) by clarifying the relationship between descriptive and evaluative contents of a thick concept. A simple way of expressing the relationship between a thick term’s evaluative and descriptive contents is as follows: ‘x is D and therefore x is E’, where D is a description and E is an evaluation that follows from that description. The trouble is that this simple formula entails that D is coextensive with the thick term itself, and this makes the simple formula vulnerable to the Disentangling Argument. So, Burton modifies the account so that the thick term is not coextensive with D. Burton proposes that a thick term’s meaning can be analyzed as follows: ‘x is E in virtue of some particular instance of D’. For example, ‘courageous’ means ‘(pro tanto) good in virtue of some particular instance of sticking to one’s guns despite great personal risk’. Here, the thick term only groups together those cases in which a thing is E in virtue of some particular instance of D. But D does not entail E, and so is not coextensive with the thick term. Thus, an outsider’s ability to track D will not be enough for her to track the insider’s use of the thick term. But what does it mean for E to depend upon a particular instance of D? For Burton, this means that E “depends on the various different characteristics and contexts” of D, and so D alone is not sufficient for E. Various different characteristics and contexts, which are not encoded in the meaning of the thick term, also need to obtain (1992: 31).

Now consider a view of type (iii). Most Reductive Views hold that thick concepts inherit their evaluative-ness from a constituent thin concept. However, Christine Tappolet proposes that they are instead evaluative on account of specific affective concepts, like admirable, pleasant, desirable, and amusing. These concepts are not thin concepts, but Tappolet holds that they are the basic evaluative constituents of thick concepts, like courageous and generous. For example, Tappolet’s analysis of courageous goes like this: ‘x is courageous’ means ‘x is D and x is admirable in virtue of this particular instance of D’, where D is a description. Essentially, Tappolet accepts Burton’s account of the relation between descriptive and evaluative contents, but modifies the account so that it incorporates affective concepts instead of thin concepts. In doing so, she parts company with other Reductivists by rejecting Thin Centralism. She rejects Thin Centralism because she holds that understanding a thin concept, such as good, requires an understanding of certain specific concepts such as pleasant and admirable (2004: 216).

An objection may arise: affective concepts are also thick concepts, but they do not fit into Tappolet’s analyses of thick concepts. This is because one affective concept, such as admirable, cannot be defined in terms of another, such as pleasant. How then should we account for these affective concepts? Tappolet’s answer is that affective concepts are to be treated differently from other thick concepts, like courageous. In particular, she treats positive affective concepts as determinates of the determinable good, and she holds that determinates cannot be analyzed in terms of their determinables. Roughly, the determinable/determinate relation is a relation of general concepts to more specific ones, where the general determinables are common to each specific determinate, but there is nothing distinguishing the determinates from each other except for the determinates themselves—for example, the only thing that distinguishes red from other colors is redness itself.

Edward Harcourt and Alan Thomas (2013) have pointed to a tension between Tappolet’s treatment of affective concepts and her treatment of other thick concepts. What reason is there to think courageous is analyzable but not admirable? Tappolet holds that admirable is unanalyzable because there is no way of stating the relevant descriptive content associated with admirable (2004: 217). In response, Harcourt and Thomas claim that this is just as much a problem for her analyses of other thick concepts. For example, it is far from clear what should be substituted for ‘D’ within Tappolet’s analysis of courageous. This objection leads Harcourt and Thomas to a Non-Reductive View, according to which all thick concepts are treated as determinates of thin concepts like good and bad (2013: 25-9).

One problem is that there is reason to think that both parties to this dispute are mistaken in claiming that affective concepts cannot be analyzed. There is a simple Reductive account of the meaning of ‘admirable’, which is not represented by any of the above views—‘admirable’ just means ‘worthy of admiration’. Similar accounts can be given for other affective concepts. If this simple analysis is correct, then Tappolet and Harcourt and Thomas are mistaken about the unanalyzability of thick affective concepts.

Some of the analyses provided above may not withstand potential counterexamples. But it is worth pointing out that our inability to state an adequate analysis for a given thick term does not show that its meaning is unanalyzable. Analyses can only be attempted by using a language, and it is possible that our language’s vocabulary does not contain the expressions needed for providing an adequate analysis of the thick term’s meaning. Reductive Views are only committed to the view that the meanings of thick terms involve appropriately related evaluative and descriptive contents; they are not committed to there being any actual language that can express these contents in a way that counts as a satisfactory analysis.

What then is the point in providing these patterns of analysis? The point is to illustrate the general ways in which descriptive and evaluative contents can be combined within the meanings of thick terms. Typically, Reductive Views only commit to the possibility of there being a certain general type of analysis and do not commit to the particular details of their sample analyses (for example, Elstein and Hurka, 2009: 531).

Are there any advantages to Reductivism about the thick? According to Hurka, Reductivism allows cognitivists to explain the difference between virtues and their cognate vices (2011: 7). For example, both courage and foolhardiness involve a willingness to face risk for a cause. What then differentiates courage from foolhardiness? It is plausible that courage requires that the cause be good enough to justify the risk, whereas foolhardiness does not require this. This explanation appeals to a thin concept—good—that many Reductivists are perfectly willing to cite as a constituent of courage. However, there is nothing forbidding Non-Reductivists from also claiming that courage requires a good enough cause, provided they do not take this content to be a constituent of courage. So, this may be no clear advantage for Reductivism.

Another potential advantage is that Reductivism allows us to explain a wide variety of evaluative concepts by recognizing only a few basic ones, such as ought or good. Moreover, if a successful analysis can be achieved, then Non-Reductivists are committed to positing two meanings where Reductivists can posit only one. For example, if the meaning of ‘admirable’ can be analyzed with ‘worthy of admiration’, then Reductivists can claim that the meanings of these two expressions are identical, whereas Non-Reductivists must hold that these meanings are distinct. Lastly, Reductive Views can explain how a thick term is both evaluative and descriptive, since the evaluative-ness of a thick term’s meaning is inherited from a constituent content that is paradigmatically evaluative (for example, a thin concept); and the descriptiveness of its meaning is inherited from a constituent descriptive content. In the next section, we shall examine whether Non-Reductivists can provide a comparable explanation.

b. Non-Reductive Views

Non-Reductive Views hold that the meanings of thick terms are both descriptive and evaluative, although these features are not due to constituent contents within the meanings of thick terms. In slogan form, thick concepts are irreducibly thick. For example, the thick term ‘brutal’ expresses a sui generis evaluative concept, which is not a combination of bad or wrong along with some descriptive content. As noted, Reductive Views explain how these meanings are both evaluative and descriptive in terms of constituent contents. The challenge is for Non-Reductive Views to explain how the meanings of thick terms are both descriptive and evaluative without appealing to constituent contents.

This challenge should be weakened in light of the fact that our notions of the descriptive and the evaluative are theoretically-loaded. Non-Reductive theorists do not accept the strong distinction between description and evaluation, because they hold that thick terms are both evaluative and capable of picking out properties. The strong distinction precludes this possibility, unless the content of the thick term is built up from constituents, which Non-Reductivists reject. Non-Reductivists typically accept some version of the weak distinction, but the present challenge cannot be framed in terms of this distinction. On the weak distinction, the descriptive is identical to the non-evaluative. This means that Non-Reductivists are being asked to explain how the meanings of thick terms are both evaluative and not evaluative, which is plainly contradictory. How then are we to understand the challenge faced by Non-Reductive Views?

The challenge can be framed in a two-fold way: (I) Non-Reductive Views need to explain what the meanings of thick terms have in common with the meanings of thin terms—this would explain the evaluative-ness of the thick term’s meaning. And (II) they also need to explain what the meanings of thick terms have in common with the meanings of paradigmatic descriptive terms—this would explain the descriptiveness of the thick term’s meaning.

Starting with (I), Jonathan Dancy holds that both thick and thin terms express concepts that have “practical relevance,” a feature that descriptive concepts lack. To see what he means, consider how thick and thin concepts differ from descriptive concepts like water. The latter can make a practical difference in some circumstances: water may be something to seek when stranded in a desert. But in this case, we must explain the practical relevance of water by citing other properties in the particular situation, such as being thirsty, in a desert, and so forth. By contrast, there is nothing to be explained when a thick or thin concept makes a practical difference, since their practical relevance “is to be expected.” For example, it is expected that courage is something to aspire to and admire, and this does not require explanation by citing other concepts. Dancy expands upon this by claiming that competence with a thick concept requires not only an ability to determine when the concept applies, but also an ability to determine what practical relevance its application has in the circumstances. Competence with a descriptive concept requires only the former, not the latter (2013: 56).

At this point, Reductive theorists may emphasize a potential benefit of their view—they have a simple explanation for why competence with a thick concept requires an ability to determine its practical relevance. In particular, competence with a thick concept requires an ability to determine its practical relevance because its constituent thin concept is practically relevant. But Dancy and other Non-Reductivists cannot appeal to this explanation. How then can they explain the practical relevance of thick concepts?

Conceptual competence can surely be explained without appealing to constituent concepts, otherwise competence with a simple concept would be inexplicable. One potential explanation, which does not appeal to constituent concepts, comes from Harcourt and Thomas (2013: 24-7). Harcourt and Thomas hold that thick concepts are related to good and bad analogously to how red is related to colored. On their view, colored is not a constituent of red, since there are no other concepts that can be combined with colored to yield red. Instead, red is a determinate of the determinable color. Similarly, the thin concept bad is not a constituent of the thick concept brutal—according to Harcourt and Thomas, there is no other concept that can be combined with bad to yield brutal. Instead, brutal is a determinate of the determinable bad. Moreover, given that brutal is a determinate of bad, it can be claimed that the practical relevance of brutal is inherited from the practical relevance of bad, even though the latter is not a constituent of the former.

Debbie Roberts provides another explanation of what the meanings of thick terms have in common with thin terms. Many ethicists claim that thick and thin terms express and induce attitudes, or condemn, commend, and instruct. Roberts takes a different approach. On her view, a concept is evaluative in virtue of ascribing an evaluative property. A concept ascribes a property if and only if the real definition of the property it refers to is given by the content of that concept. What then is an evaluative property? According to Roberts, a property P is evaluative if (i) P is intrinsically linked to human concerns and purposes; (ii) there are various lower-level properties that can each make it the case that P is instantiated (that is, P is multiply-realizable); but (iii) these lower-level properties do not necessitate that P is instantiated (that is, other features must also obtain). Roberts holds that both thick and thin concepts ascribe properties that satisfy (i)-(iii) (2013).

One potential problem is that there might be some paradigmatically descriptive properties that satisfy (i)-(iii).  Consider a particular mental state with moral content, such as the belief that lying is wrong.  The property of being in this state is intrinsically linked to human concerns and purposes, since it is a moral belief.  And if belief-states are multiply realizable, then this property will satisfy (ii) as well.  And finally, if there are lower-level brain states that make it the case that someone has this belief, without necessitating it, then (iii) will be satisfied as well.  Thus, certain mental properties may satisfy (i)-(iii), even though they seem descriptive. Roberts could reply by holding that the above-mentioned moral belief is not linked to human concerns and purposes in the right sort of way.

Turning to (II): What do the meanings of thick terms have in common with paradigmatic descriptive terms? Recall that a key point about paradigmatic descriptive terms is that these terms are capable of representing properties. Non-Reductive theorists can point out that thick terms also seem capable of representing properties. This, in fact, was the fundamental motivation for focusing on thick terms to begin with. And nearly all ethicists (except for Ayer) would agree that this is true. It plainly seems true that ‘courage’ is capable of picking out a property, and in this way ‘courage’ shares something in common with paradigmatic descriptive terms like ‘red’ and ‘water’.

Another key point about descriptive terms is that they are intuitively different from thin terms like ‘wrong’ and ‘good’. Indeed, a central motivation for classifying terms as descriptive is to exclude thin terms like ‘good’ and ‘wrong’ from paradigmatically descriptive expressions. How then do thick terms share this feature with the descriptive—that of being different from thin terms? There are two general answers that Non-Reductivists provide. On one approach, thick and thin differ in kind. On the other, thick and thin differ only in degree but not in kind. These general approaches are discussed in the next section. Reductivist theories are also discussed under each approach.

4. How Do Thick and Thin Differ?

a. In Kind: Williams’ View

Williams is a Non-Reductive theorist who holds that thick and thin differ in kind. On his view, thick terms are both world-guided and action-guiding. For Williams, a world-guided term is one whose usage is “controlled by the facts”—that is, there are conditions for its correct application and competent users can largely agree that it does or does not apply in new situations. An action-guiding term is one that is “characteristically related to reasons for action” (1985: 140-1). For Williams, thick terms are both world-guided and action-guiding, whereas thin terms are action-guiding but “do not display world-guidedness” (1985: 152).

There are some potential problems for Williams’ distinction. First, Williams’ claim that thin terms “do not display world-guidedness” seems to commit him to something controversial—namely, that non-cognitivism is true of thin terms. Other Non-Reductivists accept Williams’ characterization of thick terms, but hold that thin terms are also world-guided and action-guiding (Dancy 2013: 56). If they are right, then Williams’ distinction between thick and thin is compromised.

Nevertheless, there is a straightforward way of distinguishing between thick and thin, which does not assume non-cognitivism about thin terms. On this view, thin terms express wholly evaluative concepts, whereas thick terms express concepts that are partly evaluative and partly descriptive. This straightforward distinction gives us a difference in kind between thick and thin. The trouble is that it too appears to be theoretically loaded (much like Williams’ distinction). This straightforward distinction presupposes a Reductive View, since it holds that thick concepts are built up from evaluative and descriptive components. Another potential problem is that it is not clear whether thin concepts are wholly evaluative. For example, it looks as though the thin concept ought implies the descriptive concept can, assuming the ought-implies-can principle (Väyrynen 2013: 7). Of course, as Dancy points out, the mere fact that one concept entails another does not mean that the latter is a constituent of the former—cow entails not-a-horse, but neither is a constituent of the other (2013: 49).

A second potential problem for Williams’ view, and the straightforward view just mentioned, comes from Samuel Scheffler. Scheffler points out that there are many evaluative terms that are hard to classify as either thick or thin. Consider ‘just’, ‘fair’, ‘impartial’, ‘rights’, ‘autonomy’, and ‘consent’. Upon reflecting on such concepts, Scheffler suggests that world-guidedness is a matter of degree and that a division of ethical concepts into thick and thin is a “considerable oversimplification” (1987: 417-8).

In a later essay, Williams replies to Scheffler by agreeing that thickness comes in degrees, and that “there is an important class of concepts that lie between the thick and the thin” (1995: 234). This reply, however, does not entail that Williams must reject his earlier view. Assume that thickness and thinness each come in degrees, and that thick and thin do not exhaust all evaluative concepts. These two claims do not entail that the difference between thick and thin is merely a matter of degree. This can be seen via analogy: belief that P and disbelief that P are exclusive categories that each come in degrees, and which do not exhaust all doxastic states since suspension of judgment is also possible. But the difference between belief that P and disbelief that P is not merely a matter of degree. These states are different in kind, assuming the former is about the affirmative proposition P while the latter is about the negation ¬P. Similarly, thick and thin could also differ in kind, even if they are exclusive degree categories that do not exhaust all evaluative concepts. Thus, Scheffler’s considerations and Williams’ concessions do not entail that Williams’ earlier view is false.

b. Only in Degree: The Continuum View

Still, many theorists have seized upon Scheffler’s point and have claimed that thick and thin differ only in degree, not in kind. Some consider this to be the standard view (Väyrynen 2008: 391). On this view, thin and thick lie on opposite ends of a continuum of evaluative concepts, with no sharp dividing line between them. For example, good and bad might lie on one end of the continuum, with kind, compassionate, and cruel on the other end. There are at least two gradable notions that can serve to distinguish the ends of this continuum—degrees of specificity or amounts of descriptive content. Greater specificity, or greater amounts of descriptive content, provides a thicker concept with a narrower range of application. Non-Reductive theorists typically focus on the greater specificity of thick terms. Reductive theorists can choose either path; indeed, they can explain the greater specificity of a thick concept in terms of how much descriptive content it has as a constituent. In general, a concept must have enough specificity, or enough descriptive content, for it to reside on the thicker end of the continuum.

Support for the Continuum View may come from several considerations. First, consider that some thin concepts have narrower ranges of application than other thin concepts. For example, good can apply to actions, people, food, cars, and so on, whereas right cannot apply to all these things. This may suggest that there are degrees of thinness. Second, as already noted, some thin concepts have descriptive entailments: for instance, the thin concept ought entails the descriptive concept can. Even if can is not a constituent of ought, this entailment at least narrows down the range of application for ought, which could bring it closer to the thick end of the spectrum, even if it is still fairly thin. Third, there seems to be a vague area between thick and thin. For example, it is not clear whether just has enough specificity or enough descriptive content for it to count as thick, but it is also hard to classify this concept as thin. So, perhaps just is a borderline case between thick and thin.

Given these considerations, one may be tempted to hold that thick and thin do not differ in kind. But the above considerations do not strictly entail this. Again, analogous considerations hold for both belief and disbelief—some beliefs have narrower ranges of application than other beliefs (for example, rabbits cannot have complex mathematical beliefs though they can have perceptual beliefs). There is also a vague area between belief and disbelief, yet these two doxastic states differ in kind. So, these considerations only seem to support the Continuum View if there is no way of drawing a distinction in kind. But Hare has provided a distinction in kind that has largely escaped notice.

c. In Kind: Hare’s View

Hare is a Reductivist who holds that thick and thin are distinct in kind, not merely in degree. He holds that thick terms have both descriptive and evaluative meanings associated with them. Interestingly, Hare holds that this is also true of thin terms. Thus, for Hare, thin terms are not wholly evaluative, contrary to the straightforward view mentioned in 4a.

What then is the difference between thick and thin? The difference has to do with the relationship that the two meanings bear to the term in question. A thin term is one whose evaluative meaning is “more firmly attached” to it than its descriptive meaning. And a thick term is one whose descriptive meaning is “more firmly attached” than its evaluative meaning (1963: 24-5). Although Hare agrees that being firmly attached is “only a matter of probability and degree” (1989: 125), this does not mean that the distinction between thick and thin is only a matter of degree. Indeed, Hare’s phrase “more firmly attached” actually marks out a difference in kind. Consider an analogy: a child who is more firmly attached to her mother than to her father is different in kind from a child who is more firmly attached to her father than to her mother. Both are different in kind from a child who is equally attached to both parents. So, the language that Hare uses actually suggests three possible categories of evaluative terms—thick, thin, and neither. Although Hare never mentions the third category, it is at least a potential category for Scheffler’s examples of the neither thick nor thin.

What does Hare mean by “more firmly attached”? For Hare, the more firmly attached meaning is the one that is less likely to change when language users alter their usage of the term. For example, it is less likely that ‘right’ will eventually be used to evaluate actions negatively (or neutrally) than that it will be used to describe lying, promise-breaking, killing, torture, and so forth. The reason is that, if we start using ‘right’ to evaluate actions negatively (or neutrally), there is a great chance that we will be misunderstood or accused of misusing the word. In this sense, the evaluative meaning of ‘right’ is more firmly attached than its descriptive meaning. But just the opposite is the case for thick terms like ‘generous’. If we start using ‘generous’ to evaluate actions negatively, we will not be misunderstood (for example, Ebenezer Scrooge could use ‘generous’ negatively and we would still understand him). Yet, if we start using ‘generous’ to describe selfish acts, for example, then we will be misunderstood or accused of misusing the term. In this sense, the descriptive meaning of ‘generous’ is more firmly attached than its evaluative meaning (1989: 125).

Hare frames his distinction in terms of descriptive and evaluative meanings, which assumes a Reductive View. But his distinction and thought experiment can be formulated without assuming a Reductive View. Rather than talking about descriptive and evaluative meanings, we could instead speak of two different speech acts—describing and evaluating—that are commonly performed through ordinary uses of the terms. Hare’s thought experiment can be formulated by changing the speech acts that we typically perform with the term. For example, although we ordinarily use ‘generous’ to perform a speech act of positive evaluation, a speaker who uses it to evaluate negatively would still be understood.

At the outset it was said that thick concepts are evaluative concepts that are substantially descriptive. Thin concepts, by contrast, are not substantially descriptive. Exactly what ‘substantially descriptive’ means can now be clarified, depending on which of the above three views is accepted. On Williams’ view, being substantially descriptive is a matter of being world-guided. On the Continuum View, being substantially descriptive is a matter of having enough specificity or enough descriptive content. On Hare’s view, being substantially descriptive is a matter of having a descriptive meaning that is more firmly attached than its evaluative meaning.

5. Are Thick Terms Truth-Conditionally Evaluative?

The putative significance of the thick depends upon a crucial assumption about how thick terms are evaluative. Several of the arguments and hypotheses discussed in 2.a-c assume that thick terms are evaluative as a matter of truth-conditions—that is, the conditions that must obtain for utterances involving thick terms to express true propositions.

To see how this assumption is made, first recall Foot’s argument. If ‘x is rude’ were not evaluative as a matter of truth-conditions, then its truth would not require anything evaluative, and there would not be anything evaluative following from the purely descriptive claim that x causes offense by indicating lack of respect. Hare’s response, that the evaluation of ‘rude’ is detachable, is a denial of the assumption that ‘rude’ is evaluative in its truth-conditions. Next, consider McDowell’s premise (2) of the Disentangling Argument. It is often assumed that the only reason an outsider could not master the extension of ‘chaste’ must be that the truth-conditions associated with ‘chaste’ incorporate something evaluative, which the outsider cannot track. Moreover, the shapelessness hypothesis states that the extensions of thick terms are only unified by evaluative similarity relations. This suggests that something evaluative must obtain for utterances involving thick terms to express true propositions.

Nevertheless, it is controversial that thick terms are evaluative as a matter of truth-conditions. Generally, ethicists agree that thick terms are somehow associated with evaluative contents, but not all agree that these contents are part of the truth-conditions of utterances involving thick terms. How else can a thick term be associated with evaluative content, if not by way of truth-conditions?

Our use of language can communicate lots of information that is not part of the truth-conditions of what we say. In each of the following cases, a speaker B communicates a proposition that is not part of the truth-conditions of B’s utterance. In this first example, the proposition is communicated by way of presupposition:

B: “I don’t regret going to the party.”

Presupposition: that B went to the party.

Plausibly, B’s utterance could express a true proposition even if its presupposition is false: one way to have no regrets about going to a party is by simply not going. This presupposition can plausibly be a part of the background of the conversation at hand, but not part of the truth-conditions of B’s utterance.

Now consider a slightly different example, involving a phone conversation between A and B. In this case, B communicates a proposition by way of conversational implicature:

A: “Is Bob there?”

B: “He’s in the shower.”

Conversational Implicature: that Bob cannot talk on the phone right now.

This proposition is not part of the truth-conditions of B’s utterance—it is obviously possible that Bob can talk on the phone while in the shower. Instead, this proposition is inferred from B’s utterance by relying on conversational maxims and observations from context (for example, that A and B are having a phone conversation, and that B would not provide irrelevant information about Bob’s showering unless he is trying to convey that Bob cannot talk).

Now consider a third example, where B communicates a proposition by way of conventional implicature:

B: “Sue is British but brave.”

Conventional Implicature: that Sue’s bravery is unexpected given that she is British.

The proposition communicated in this example is not part of the truth-conditions of B’s utterance. One way to see this is by comparing B’s utterance with “Sue is British and brave.” These two utterances would seem to be true in all the same circumstances. But the latter does not communicate the implicature in question. This implicature is detachable, in the sense that a truth-conditionally equivalent statement need not have the implicature in question. Although Hare does not mention conventional implicature, his view about the detachability of a thick term’s evaluation could be explained in terms of conventional implicature.

In short, there are many ways to communicate information without it being part of the truth-conditions of the utterance. There are three widely-discussed pragmatic mechanisms—presupposition, conversational implicature, and conventional implicature—and there are others as well. Some hold that utterances involving thick terms do not convey evaluations that are part of their truth-conditions, but instead convey them via some pragmatic mechanism. This view is known as the Pragmatic View. It should be noted that some proponents of the Pragmatic View write as though their view entails that there are no thick concepts (Blackburn 1992). In other words, if the only evaluation associated with courage is pragmatically associated with it, then these philosophers will say that courage is not really a thick concept. Still, others who accept the Pragmatic View are happy to talk of these concepts as thick (Väyrynen 2013). This article does so as well.

The traditional view, however, is that thick terms are evaluative as a matter of their truth-conditions; this view is known as the Semantic View. The following two sections discuss the Pragmatic and Semantic Views, respectively.

a. Pragmatic View

Sometimes the Pragmatic View is supported by the idea that thick terms are variable in what evaluations they express. Typically, a given thick term conveys a particular evaluation that is either positive or negative, but not both. It is natural to assume that the term conveys this evaluation (whichever it is) in all assertive contexts. But it turns out that many paradigmatic thick terms can be used to evaluate something negatively in some contexts while positively in others. There are two ways of illustrating this variability.

The first involves combining a thick term with comparative constructions, such as ‘too’ or ‘not…enough’. For example, ‘lewd’ is typically negative, but it appears to convey something positive in the following quote: “this year’s carnival was not lewd enough” (Blackburn 1992: 296). Similarly, ‘tidy’ is typically positive, but can be used to convey something negative if one is criticized as “too tidy” (Hare 1952: 121). These considerations may be taken to show that thick terms are not evaluative as a matter of truth-conditions—if these thick terms have an evaluation as part of their truth-conditions, one might think ‘not lewd enough’ and ‘too tidy’ should be semantically awkward, but they are not.

Opponents of this argument may claim that the atypical evaluation in each case can be explained solely by reference to ‘too’ and ‘not…enough’, without claiming that ‘lewd’ and ‘tidy’ express an atypical evaluation. Consider that ‘too F’ and ‘not F enough’ are evaluative even when F is a wholly non-evaluative expression; for example, one might say that a color sample is too red, which seems to characterize the sample negatively. Here, the negative evaluation is solely because of ‘too’, and so it should be no surprise if this word generates a negative evaluation when combined with ‘tidy’. Furthermore, there are contexts where F is clearly seen as a positive quality even when it is combined with ‘too’. Borrowing an example from Väyrynen, a military commander could count a soldier as too courageous to waste on a simple mission, and instead select him for a more formidable mission where his courage would be needed (2011: 7). Thus, it appears the atypical evaluation can be attributed to the modifiers ‘too’ and ‘not…enough’ rather than the thick term itself.

The second sort of example pertains to utterances that convey an atypical evaluation without employing a comparative construction. For example, even though ‘cruel’ typically conveys a negative evaluation, it might be that the cruelty of an action was “just what made it such fun” (Hare 1981: 73). Or, even though ‘frugal’ typically conveys a positive evaluation, a person could be condemned as frugal if his “main job is dispensing hospitality” (Blackburn 1992: 286).

A worry associated with examples of this second sort is that the atypical evaluation can be explained in ways that are consistent with the Semantic View. For instance, it might be claimed that the examples involve non-literal uses of the thick term, or that they only convey the alternative evaluation by way of speaker meaning, and not word meaning. Alternatively, one could hold that thick terms are context sensitive, and that there are several different evaluations conveyed by the thick term depending on the context of utterance. In this case, the thick term would be evaluative as a matter of truth-conditions—it is just that those truth-conditions incorporate different evaluations in different contexts (Väyrynen 2011: 8-14).

Another argument for the Pragmatic View comes from Pekka Väyrynen (2013), who focuses on objectionable thick terms. Recall that objectionable thick terms embody values that ought to be rejected. Potential examples include ‘lewd’, ‘perverse’, ‘blasphemous’, and ‘chaste’. The last of these terms seems to embody the view that a certain kind of sexual restraint is praiseworthy. Those who reject this view regard ‘chaste’ as objectionable—one can refer to such individuals as chastity-objectors. Chastity-objectors tend to exhibit interesting linguistic behavior. They would obviously be reluctant to assert that, say, John is chaste; but they are also reluctant to utter non-affirmative sentences like the following:

(a) John is not chaste.

(b) Is John chaste?

(c) Possibly, John is chaste.

(d) If John is chaste, then so is Mary.

None of these utterances imply that the truth-conditions of ‘chaste’ are satisfied. So, if chastity-objectors are reluctant to utter (a-d), this leads us to expect that the evaluation projects outside of the truth-conditions of ‘chaste’. It is worth noting that this argument is not restricted to examples like ‘chaste’, ‘perverse’, ‘lewd’, and ‘blasphemous’. According to Väyrynen, virtually any thick term could be regarded as objectionable, at least in principle, which means that his argument should extend even to examples like ‘courageous’ and ‘murder’.

In addition to arguing against the Semantic View, proponents of the Pragmatic View need to explain what pragmatic mechanism is responsible for the evaluations of thick terms. The three pragmatic mechanisms cited above can provide potential explanations, but Väyrynen rejects these explanations in favor of an alternative view. He proposes that the evaluative implications of paradigmatic thick terms are “not-at-issue” in normal contexts. Roughly, an implication is at-issue if it is part of the main point of the conversation at hand, and it is not-at-issue if it is part of the background (2013: ch. 5). Väyrynen takes this pragmatic view to be “superior to its rivals by standard methodological principles” from the philosophy of language and linguistics (2013: 10).

Proponents of the Semantic View can provide at least two lines of response to Väyrynen’s argument involving objectionable thick terms. For the first, it is important to note that Väyrynen explains the objector’s reluctance by holding that (a-d) all project the same evaluation beyond the truth-conditions of ‘chaste’. But one might hold that there is no single evaluation projected by all of (a-d)—instead, there are at least two different claims implied throughout (a-d). For example, just as ‘not happy’ conversationally implicates ‘unhappy’, it is equally plausible that ‘not chaste’ conversationally implicates ‘unchaste’. Since chastity-objectors clearly do not want to imply that John is unchaste, they are reluctant to assert (a). Moreover, (b-d) conversationally imply (or assert) that John might be chaste. If chastity-objectors believe it is impossible for anyone to be chaste, then they will be reluctant to assert (b-d). This piecemeal approach calls into doubt Väyrynen’s claim that a single evaluation projects beyond the truth-conditions of ‘chaste’. Moreover, these ways of explaining the reluctance of chastity-objectors are perfectly consistent with the Semantic View (Kyle 2013a: 13-19).

For a second response, it can be pointed out that even Väyrynen’s preferred explanation is consistent with the Semantic View (Kyle 2015). The mere fact that an evaluation projects outside of the truth-conditions of ‘chaste’ does not entail that there is no evaluation within those truth-conditions. For example, it is possible that ‘John is chaste’ conveys two evaluations, one that is part of its truth-conditions, and another that projects outside of them. This possibility can be illustrated with the affirmative sentence ‘It is good that Sue is moral’, which has an evaluative content within its truth-conditions and also projects one outside those truth-conditions. Consider the corresponding non-affirmative sentences:

(a′) It is not good that Sue is moral.

(b′) Is it good that Sue is moral?

(c′) Possibly, it’s good that Sue is moral.

(d′) If it is good that Sue is moral, then we should applaud her.

An evaluative content—that Sue is moral—is implied by each of (a′-d′), as well as the affirmative sentence. So this evaluation projects much like the evaluation that Väyrynen thinks is projected by ‘chaste’. But none of this precludes the affirmative statement from having a different evaluation as part of its truth-conditions, namely the evaluation associated with ‘good’. Of course, this doubling of evaluation will have no purchase unless there is reason to think there is an evaluation within the truth-conditions of ‘chaste.’

b. Semantic View

What reason is there to think the evaluations of thick terms might be part of their truth-conditions? One potential reason stems from considering additional linguistic data. Notice that the following claim seems highly awkward:

(e) Sue is generous and not good in any way.

Similar statements can be provided using negative thick terms and ‘not bad in any way’. The Semantic View provides a straightforward explanation of the awkwardness of (e). This view can claim that (e) is a contradiction, assuming goodness-in-a-way is a part of the truth-conditions associated with ‘generous’ (Kyle 2013a). This, of course, is only one potential explanation—there may be other ways of explaining the awkwardness of (e), for example, by claiming that ‘generous’ presupposes or conventionally implicates an evaluation that incorporates goodness-in-a-way. Just as before, the issue must be decided by figuring out which view is the best explanation of this and other linguistic data. The matter is up for debate.

Still, one might object that the Semantic View is ill-suited to explain the oddity of (e), since this view mistakenly predicts that (e) would sound odd in every context, yet there are some unusual contexts in which (e) would not sound odd. For example, imagine Ebenezer Scrooge uttering (e) in a context where generosity is seen as a bad thing. (e) might not seem awkward in this context. However, it is a mistake to think that the Semantic View predicts that (e) is awkward in all contexts. Consider that its second conjunct involves a quantifier expression—‘any’—and quantifiers are notoriously context-sensitive. In one context, it might be true to say ‘O.J. Simpson is not good in any way’; but in other contexts, where being a good athlete is relevant to discussion, an utterance of the same sentence could be false. Similarly, the second conjunct in (e) is true or false relative to context. And it is only in contexts where generosity is a relevant way of being good that (e) should sound contradictory. In contexts where generosity is not a relevant way of being good—such as Scrooge’s context—(e) should not sound contradictory. In those contexts, the first part of (e) could be true while the second part is false (Kyle 2013a).

It’s worth emphasizing that the linguistic data about thick terms does not strictly entail the Semantic View, or the Pragmatic View. Rather, the proponents of such views only claim that their respective view is part of the best explanation of a wide body of linguistic data involving thick terms. This matter has only been explored in recent years, with proponents on each side (Kyle 2013a; Väyrynen 2013).

Another way of supporting the Semantic View stems from the shapelessness hypothesis, that the extensions of thick terms are only unified by evaluative similarity relations. If the shapelessness hypothesis is true, then the truth-conditions associated with thick terms must be at least partly evaluative. But what reason is there to accept the shapelessness hypothesis?

The main support for shapelessness comes from the idea that thick terms seem to “outrun” any descriptive characterizations we can give to the items in their extensions (Kirchin 2010). Consider the various types of action that can be considered kind—shoveling snow for a neighbor, giving chocolate to a child, adopting a stray cat, standing up to someone’s bully, paying a compliment to a friend, and so on. Furthermore, there are some actions that would be considered kind in some circumstances even though the opposite action would be considered kind in other circumstances—for example, telling the truth is sometimes kind but so is telling a white lie. Can these various actions be descriptively classified in a way that allows us to correctly characterize kind actions in new cases? The descriptive classification might be a long disjunction of unrelated features, a shapeless classification that would be unhelpful in confronting new cases.

One way of opposing this argument is to show that the various actions mentioned above can be unified under a shapely descriptive classification. For example, each of the above actions seems to benefit others by treating them as ends in themselves. This shapely classification helps us classify at least some new cases of kind action (for example, giving food to a homeless person). And benefit can be understood in purely descriptive terms—for example, as the increasing of happiness. So, it is not obvious that shapelessness is actually supported by the outrunning data given above, although other data could perhaps be provided.

Suppose that thick terms do outrun our ability to provide wholly descriptive characterizations of their extensions. Väyrynen argues that this does not support the Semantic View, because our inability to give a descriptive classification for a thick term can be explained even if the Semantic View is false. Consider that, for some terms T, the extension of T cannot be unified under relations that are expressible in independently intelligible T-free terms (that is, terms that can be understood independently of T). For example, it might be that the extension of ‘pain’ across all sentient animals cannot be unified without employing the word ‘pain’ itself. Now return to the question of what explains our inability to give a wholly descriptive characterization of the extension of ‘kind’. It might be that the only way to characterize its extension is by employing the word ‘kind’ itself. But this would not be a descriptive classification, if the Pragmatic View were true of ‘kind’. It is just that the evaluative-ness of ‘kind’ would be explained by a pragmatic mechanism, rather than its truth-conditions. So, if ‘kind’ is a term like T, then one could not give a wholly descriptive classification of the extension of ‘kind’, even if the Semantic View is false (2013: 193-201).

6. Broader Applications

Thick concepts have been an interest primarily among ethicists, although these concepts have made an entrance into discussions in other areas, such as aesthetics, metaphysics, philosophy of law, moral psychology, and epistemology. This section focuses mainly on epistemology’s recent discussions on thick concepts, since these discussions have been the most extensive (outside of ethics). But the discussions from the first four areas shall be briefly summarized.

In aesthetics, there is much discussion about thick aesthetic concepts, like gaudy, elegant, delicate, and brilliant. However, many of these discussions are centered on the question of whether there are any thick aesthetic concepts at all. In this context, it is often assumed that an aesthetic concept is not thick if it is only pragmatically associated with evaluative content (recall that this assumption is sometimes made in ethics as well). So, the discussions over whether there are any thick aesthetic concepts often mirror the discussions in ethics on whether thick concepts are only pragmatically evaluative, or whether they are evaluative as a matter of truth-conditions (Bronzon 2009; Zangwill 2001). It is worth noting that Burton’s Reductive View (discussed in 3a) is explicitly aimed at accounting for thick aesthetic concepts, as well as ethical ones.

In metaphysics, Gideon Yaffe criticizes two competing views on the nature of freedom of will—one that equates freedom of will with self-expression and one that equates it with self-transcendence. Yaffe then holds that the debates between these approaches have proceeded from the (at least implicit) assumption that freedom of will is a descriptive concept. He argues that there are facts about freedom of will that are best explained if freedom of will is instead assumed to be thick. According to Yaffe, the descriptive content of this concept would correspond to the features that make the agent either self-expressive or self-transcendent. But, according to Yaffe, this is not enough to account for freedom of will. We must also determine whether “the agent has choices that come about through worthwhile processes, processes possessing a certain kind of value” (2000: 219-20). And it is this kind of value that corresponds to the evaluative content of freedom of will.

In the philosophy of law, David Enoch and Kevin Toh (2013) point out that legal statements often straddle the divide between the descriptive and the evaluative. They put forth the hypothesis that many legal statements express thick concepts. Potential examples of such thick concepts may include crime, constitutional, inheritance, and infringement, though Enoch and Toh focus primarily on the concept legal, which they argue is a thick concept. The descriptive content of legal consists in its representation of certain social facts, and its evaluative content is a kind of endorsement. They do not assert that the thickness of legal can by itself settle debates over the nature of law. But they hold that its classification as thick can situate these debates within a broader philosophical context analogous to that of ethics, and can introduce new options for thinking about the nature of law.

Thick concepts also play a role in Gabriel Abend’s critique of current moral psychology and neuroscience. Abend argues that moral psychologists and neuroscientists unwarrantedly restrict their research to thin ethical concepts and ignore thick ones. In particular, these scientists attempt to understand the psychological or neural bases for moral judgments, and they do so by testing various subjects’ judgments involving thin concepts like right and permissible. But judgments involving thick ethical concepts, like cruel and courageous, have scarcely been featured in these experiments. This is no small oversight, given “that thick concepts appear in some or much of people’s moral lives” (2011: 150). Abend also argues that this problem cannot be fixed merely by expanding psychological and neurological research to include thick ethical judgments, because thick concepts “challenge the conception of a hardwired and universal moral capacity in a way that thin concepts do not” (2011: 145-6). In advancing this last point, Abend relies on the Disentangling Argument and the shapelessness hypothesis, as well as the claim that thick concepts presuppose institutional and cultural facts that do not hold universally.

Outside of ethics, the most extensive discussions on thick concepts occur in epistemology. In 2008, a special issue of the journal Philosophical Papers (vol. 37 no. 3) was devoted to thick concepts in epistemology. Examples of thick epistemic concepts include concepts like intellectual curiosity, truthfulness, open-mindedness, and dogmatic; these are contrasted with thin epistemic concepts, which are typically illustrated with concepts like justification, rationality, and knowledge. The editors of this issue hold that “traditional epistemology has tended towards using the thin concepts in theorizing,” but “these thin epistemic concepts are far less prevalent in everyday discourse than the thick epistemic” (2008: 342). The overarching question of this collection is whether epistemology would benefit from substantive investigations of thick epistemic concepts.

One way of addressing this question is simply to do a substantive investigation of a particular thick epistemic concept, and to show that epistemological theories are enhanced by the investigation. Two contributors to the Philosophical Papers collection take this approach. Catherine Elgin focuses on the concept trustworthy. She argues that trustworthiness does not reduce to justified or reliable true belief, but can help to explain why justified or reliable true beliefs are valuable (2008: 371-87). Harvey Siegel considers whether education is an epistemic virtue concept, and whether it makes sense to classify it as thick. Siegel is skeptical about the helpfulness of the thick/thin distinction, but, to the extent that this distinction is viable, he maintains that education is “more thick than thin.” He also seeks to clarify the relationship between education and virtue epistemology (2008: 467).

The other contributors focus on general issues concerning thick epistemic concepts. Heather Battaly’s contribution seeks to address an objection advanced by Simon Blackburn against Non-Reductive Views. According to Blackburn, Non-Reductive Views mistakenly imply that the differences in how we respond to, say, lewdness would not count as genuine disagreements, because the disputants would be employing different concepts and therefore talking past one another. For example, a person who thinks the carnival was not lewd enough might be employing a different concept from someone who disvalues lewdness, because there would be “no detachable description, no ‘semantic anchor,’ that they can share” (1992: 297-99). (This is an expanded version of the variability objection discussed in section 5a.) Battaly responds by arguing that certain thick epistemic concepts, such as open-minded, are subject to combinatorial vagueness—these concepts have several independent conditions of application, but there is no sharp distinction between the conditions that are necessary or sufficient and those conditions that are neither. Battaly holds that disputants can share the same vague concept, while disagreeing over which conditions are necessary and/or sufficient. Battaly maintains that this allows them to have genuine disagreements about whether the concept refers to an epistemic virtue. According to Battaly, this at least shows that virtue epistemologists can disagree about the epistemic virtues without talking past one another (2008: 435-54).

Väyrynen’s contribution focuses on whether thick and thin epistemic concepts can be distinguished in ways comparable to thick and thin ethical concepts, and on whether a focus on thick epistemic concepts can lead to a preferable epistemology. Regarding the first issue, he argues that the way thick and thin concepts are typically distinguished in ethics provides no straightforward distinction between thick and thin epistemic concepts (2008: 390-95). Regarding the second, he argues that neither semantics nor substantive epistemological theory provides a basis for assigning thick epistemic concepts theoretical priority over thin epistemic concepts (2008: 395-408). Väyrynen concludes by claiming that we so far lack good reasons for taking a theoretical turn to a thick epistemology.

Despite Väyrynen’s final conclusion, the Philosophical Papers collection contains an explicit defense of the view that epistemology should expand its focus to thick epistemic concepts. Bernard Williams pushed for a similar expansion in the ethical sphere. And the comparison here brings up an important question, which is the main question of Alan Thomas’ contribution: Can Williams’ treatment of thick ethical concepts be applied analogously in the epistemic sphere? Recall that Williams holds that utterances involving thick ethical concepts cannot be objectively true. Thus, if epistemologists want to model their theory of thick epistemic concepts after Williams’ view in ethics, it appears they will not be able to claim that there are objectively true claims involving thick epistemic concepts. Thomas, however, points out that Williams’ non-objectivism in ethics “is based on the assumption that there are a variety of social worlds, structured by plural sets of thick ethical concepts.” But Williams’ view of thick epistemic concepts, such as truthfulness, allows for the possibility of “only one epistemic world.” According to Thomas, truthfulness is “such a central need of human life that it can be abstractly modeled in a way that… [is] culturally invariant” (2008: 368).

Guy Axtell and J. Adam Carter focus on outlining a positive account of how thick epistemic concepts could play a central role in epistemological theory. The account begins by claiming that epistemic value should be a central focus in epistemology, and that not all epistemic values can be reduced to the value of truth (or to some other single epistemic good). Other values, such as open-mindedness, “can be useful in articulating our epistemic aims,” even if they cannot be so reduced (2008: 418). Axtell and Carter also reject Thin Centralism in the epistemic sphere—the view that “general concepts like ‘justified’ and ‘ought’ are logically prior to and independent of specific reason-giving thick epistemic concepts of virtue and vice” (2008: 418). In the epistemic sphere, ‘justified’ and ‘ought’ are primarily used to evaluate beliefs, but the rejection of Thin Centralism allows that there could be fundamental ways of evaluating agents with virtue and vice concepts, and that these evaluations may not be reducible to belief evaluations. The above tenets open up the possibility of what Axtell and Carter call a “second-wave of virtue epistemology.” This contrasts with the “first-wave” which takes thick epistemic concepts to be theoretically important primarily because they play a role in analyzing knowledge. On the second wave of virtue epistemology, thick epistemic concepts are “a subject for research in their own right, apart from whatever role they might have in explaining knowledge” (2008: 427).

Many authors in the Philosophical Papers collection take knowledge to be a paradigmatic example of a thin epistemic concept (Battaly 2008: 435; Axtell and Carter 2008: 427; Thomas 2008: 363; Väyrynen 2008: 392; Kotzee and Wanderer 2008: 339). It is not immediately clear whether their arguments rely substantively on this assumption, but the assumption has been contested in a separate context. Brent Kyle argues that knowledge is actually a thick concept. According to Kyle, knowledge is best accounted for as a close relation between a descriptive content—true belief—and an evaluative content—justification. If successful, this argument would establish that traditional epistemology has already focused on at least one thick concept—namely knowledge. But Kyle’s main goal is not to defend the traditional focus of epistemological theories. Instead, he aims to argue that the thickness of knowledge can explain why the Gettier Problem arises. He does so by arguing that the Gettier Problem is a specific instance of a general problem about analyzing thick concepts. It is worth noting that his argument takes no stand on whether thick concepts can be analyzed, or on whether the Gettier Problem is resolvable (Kyle 2013b).

Generally speaking, thick concepts have become a source of optimism for many philosophers who find traditional research within normative disciplines to be myopic, stagnant, or misdirected. Nevertheless, it is still a matter of debate whether a plausible theory of thick concepts actually has the implications typically hoped for. In particular, the literature on thick concepts still contains lively debates regarding fundamental issues such as the Disentangling Argument, the shapelessness hypothesis, non-reductivism, and the Semantic View. And if proponents of the significance of thick concepts make assumptions regarding these controversial issues, then their views will be met with significant opposition, at least until these issues are resolved. But, on the flip side, if opposing theorists account for all normativity with thin concepts, and take these concepts to be non-factual, they too will meet significant opposition. The recent debates about thick concepts are largely responsible for this. Ultimately, whatever approach one takes to these fundamental issues, it is clear that theories of value and normativity cannot be complete unless they give some attention to the thick.

7. References and Further Reading

  • Abend, G.  2011, “Thick Concepts and the Moral Brain,” European Journal of Sociology 52, 143-72.
  • Anscombe, E.  1958, “Modern Moral Philosophy,” Philosophy 33, 1-19.
  • Axtell, G. and A. Carter, 2008, “Just the Right Thickness: A Defense of Second-Wave Virtue Epistemology,” Philosophical Papers 37, 413–434.
  • Ayer, A.J.  1946, Language, Truth, and Logic, Dover, New York.
  • Battaly, H.  2008, “Metaethics Meets Virtue Epistemology: Salvaging Disagreement about the Epistemically Thick,” Philosophical Papers 37, 435–454.
  • Blackburn, S.  1992, “Through Thick and Thin,” Proceedings of the Aristotelian Society, suppl. vol. 66, 285-99.
  • Bronzon, R.  2009, “Thick Aesthetic Concepts,” The Journal of Aesthetics and Art Criticism 67:2, 191-99.
  • Burton, S.  1992, “‘Thick’ Concepts Revisited,” Analysis 52, 28-32.
  • Dancy, J.  2013, “Practical Concepts,” in S. Kirchin (ed.) Thick Concepts, Oxford University Press, Oxford.
  • Dancy, J.  1995, “In Defense of Thick Concepts,” Midwest Studies in Philosophy 20, 263-79.
  • Elgin, C.  2008, “Trustworthiness,” Philosophical Papers 37, 371-87.
  • Elstein, D. and T. Hurka, 2009, “From Thick to Thin: Two Moral Reduction Plans,” The Canadian Journal of Philosophy 39, 515-36.
  • Enoch, D. and K. Toh, 2013, “Legal as a Thick Concept,” in W.J. Waluchow and S. Sciaraffa (eds.), Philosophical Foundations of the Nature of Law, Oxford University Press, Oxford.
  • Foot, P.  1958, “Moral Arguments,” Mind 67, 502-13.
  • Geertz, C.  1973, “Thick Description: Toward an Interpretive Theory of Culture,” The Interpretation of Cultures: Selected Essays, Basic Books, New York, 3-30.
  • Gibbard, A.  1992, “Thick Concepts and Warrant for Feelings,” Proceedings of the Aristotelian Society, suppl. vol. 66, 267-83.
  • Harcourt, E. and A. Thomas, 2013, “Thick Concepts, Analysis, and Reductionism,” in S. Kirchin (ed.) Thick Concepts, Oxford University Press, Oxford.
  • Hare, R.M.  1952, The Language of Morals, Oxford University Press, Oxford.
  • Hare, R.M.  1963, Freedom and Reason, Oxford University Press, Oxford.
  • Hare, R.M.  1989, Essays in Ethical Theory, Oxford University Press, Oxford.
  • Hare, R.M.  1997, Sorting Out Ethics, Clarendon Press, Oxford.
  • Hurka, T.  2011, “Common Themes from Sidgwick to Ewing,” in T. Hurka (ed.) Underivative Duty, Oxford University Press, Oxford, 6-25.
  • Hurley, S.  1989, Natural Reasons, Oxford: Oxford University Press.
  • Hursthouse, R. 1991, “Virtue Theory and Abortion,” Philosophy and Public Affairs 20, 223-46.
  • Hursthouse, R.  1996, “Normative Virtue Ethics,” in Roger Crisp (ed.) How Should One Live?  Oxford University Press, Oxford, 19-36.
  • Hursthouse, R. 1999, On Virtue Ethics, Oxford University Press, Oxford.
  • Jackson, F.  1998, From Metaphysics to Ethics: A Defence of Conceptual Analysis, Oxford University Press, Oxford.
  • Kirchin, S.  2010, “The Shapelessness Hypothesis,” Philosophers’ Imprint 10, 1-28.
  • Kotzee, B. and J. Wanderer, 2008, “Introduction: A Thicker Epistemology?” Philosophical Papers 37, 337-343.
  • Kyle, B.  2013a, “How Are Thick Terms Evaluative?” Philosophers’ Imprint 13, 1-20.
  • Kyle, B.  2013b, “Knowledge as a Thick Concept: Explaining Why the Gettier Problem Arises,” Philosophical Studies 165, 1-27.
  • Kyle, B. 2015, “Review of ‘The Lewd, the Rude, and the Nasty: A Study of Thick Concepts in Ethics’ by Pekka Väyrynen,” The Philosophical Quarterly 65, 576-82.
  • McDowell, J.  1981, “Non-Cognitivism and Rule-Following,” in S. Holtzman and C. Leich (eds.) Wittgenstein: To Follow a Rule, Routledge, London.
  • McDowell, J. 1998, “Aesthetic Value, Objectivity, and the Fabric of the World,” in Mind, Value, and Reality, Harvard University Press, Cambridge, Mass.
  • Platts, M.  1988, “Moral Reality,” in G. Sayre-McCord (ed.) Essays on Moral Realism, Cornell University Press, Ithaca, NY.
  • Putnam, H. 1990, “Objectivity and the Science/Ethics Distinction,” in J. Conant (ed.) Realism with a Human Face, Harvard University Press, Cambridge, Mass.
  • Roberts, D.  2011, “Shapelessness and the Thick,” Ethics 121, 489-520.
  • Roberts, D.  2013, “It’s Evaluation, Only Thicker,” in S. Kirchin (ed.) Thick Concepts, Oxford University Press, Oxford.
  • Ryle, G.  1971, “The Thinking of Thoughts: What is ‘Le Penseur’ Doing?” in Collected Papers 2, Routledge, London, 480-83.
  • Scheffler, S.  1987, “Morality through Thick and Thin: A Critical Notice of Ethics and the Limits of Philosophy,” Philosophical Review 96, 411-34.
  • Siegel, H.  2008, “Is ‘Education’ a Thick Epistemic Concept?”  Philosophical Papers 37, 455-469.
  • Tappolet, C.  2004, “Through Thick and Thin: Good and Its Determinates,” Dialectica 58, 207-21.
  • Thomas, A.  2008, “The Genealogy of Epistemic Virtue Concepts,” Philosophical Papers 37, 345–369.
  • Väyrynen, P.  2008, “Slim Epistemology with a Thick Skin,” Philosophical Papers 37, 389–412.
  • Väyrynen, P. 2011, “Thick Concepts and Variability,” Philosophers’ Imprint 11, 1-17.
  • Väyrynen, P.  2013, The Lewd, the Rude, and the Nasty: A Study of Thick Concepts in Ethics, Oxford University Press, Oxford.
  • Williams, B.  1985, Ethics and the Limits of Philosophy, Harvard University Press, Cambridge, Mass.
  • Williams, B.  1993, “Who Needs Ethical Knowledge?” in A Griffiths, Royal Institute of Philosophy, suppl. vol. 35, 213-22.
  • Williams, B.  1995, “Truth in Ethics,” Ratio 8(3), 227-42.
  • Yaffe, G.  2000, “Free Will and Agency at Its Best,” Philosophical Perspectives 14, 203-29.
  • Zangwill, N.  2001, “The Beautiful, the Dainty, and the Dumpy,” in Metaphysics of Beauty, Cornell University Press, Ithaca, NY.

 

Author Information

Brent G. Kyle
Email: brent.kyle@usafa.edu
United States Air Force Academy
U. S. A.
